At the beginning of 2020, the HIFIS Software team launched a survey exploring how researchers within the Helmholtz Association approach software development. This blog post examines how familiar research groups are with different tools and techniques, how often they use them, and what these results mean for the consulting that HIFIS offers.
In software development, there are many good practices that help developers produce code more quickly, with more confidence in the results, and in a form that a wider group of people can read. These might include common coding styles, pair programming, and continuous integration/deployment. However, not all good practices make sense all of the time, and some may not be particularly relevant to research software engineering.
In the 2020 survey, we gave respondents a list of 17 practices and, for each one, asked them whether they used the technique regularly, sometimes, or never, or whether they had not heard of it. For each technique, we could then calculate the percentage of respondents who were familiar with it, the percentage who had actually used it, and the percentage who used it regularly. The following graph shows the result of this analysis.
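To make the calculation concrete, here is a small Python sketch of how those three percentages can be derived for a single technique. The responses below are invented for illustration and are not the actual survey data:

```python
from collections import Counter

# Hypothetical answers for one technique; the real survey data differs.
responses = ["regularly", "sometimes", "never", "not heard of",
             "regularly", "sometimes", "sometimes", "never"]

counts = Counter(responses)
total = sum(counts.values())

# Familiar = everyone except those who had not heard of the technique.
familiar = (total - counts["not heard of"]) / total * 100
# Used = answered "regularly" or "sometimes".
used = (counts["regularly"] + counts["sometimes"]) / total * 100
# Regular usage = answered "regularly" only.
regular = counts["regularly"] / total * 100

print(f"familiar: {familiar:.1f}%, used: {used:.1f}%, regular: {regular:.1f}%")
```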
A plot of familiarity, usage, and regular usage for different development practices.
The first and most obvious result, at a broad level, is that respondents are generally familiar with the different techniques and practices suggested. In fact, for every technique listed, a majority of the people who answered the question were familiar with it. This suggests that respondents are generally already aware of development practices that could help them.
The second interesting result is the lack of a clear relationship between familiarity and regular usage. We might expect that the better known a technique is, the more regularly it will be used. However, this is clearly not the case for many of the techniques in our dataset.
This could be because many of these techniques are not equally applicable. Over 90% of respondents were aware of code reviews as a development technique for improving code quality, but just over a quarter of them regularly put that technique into practice. However, code reviews require multiple developers, and we know that many of the participants in this survey work alone, or with at most one other person, making code reviews fairly impractical.
Alternatively, this could be because the techniques mentioned are not all equally easy to apply. Applying a common code style is a simple change that can often be made with little effort by running a tool like autopep8 or Black across the codebase. However, setting up continuous integration (CI) for a project can be more complicated - it often involves writing code in a different language, and will generally require an understanding of the way different CI platforms work.
From the perspective of the Consulting team, this second set of techniques may give us insights into where we can support developers in Helmholtz best. These are techniques that often have significant benefits when they are implemented (being able to automatically compile code, run tests, and validate formatting) but require more knowledge to implement (the precise syntax that GitLab or GitHub use to define a CI job). In these cases, we would be able to step in and provide initial support to get a team started, giving them the tools and understanding that they need to go further on their own.
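To illustrate the kind of knowledge involved, a minimal GitLab CI configuration might look something like the following. The job names, Docker image, and tools here are illustrative choices, not a prescription:

```yaml
# .gitlab-ci.yml — a minimal sketch of two CI jobs.
check-style:
  image: python:3.9
  script:
    - pip install black
    - black --check .   # fail the job if formatting is off

run-tests:
  image: python:3.9
  script:
    - pip install pytest
    - pytest            # run the project's test suite
```

Even this small example shows why CI can feel like a hurdle: it is written in YAML rather than the project's own language, and the available keywords differ between platforms such as GitLab and GitHub.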
As part of the survey, we also asked about participants’ usages and familiarity with different testing techniques, using the same form of question as above. Similar analysis as before yields the following graph:
A plot of familiarity, usage, and regular usage for different testing practices.
Here we see similar results to those above: for each technique, a majority of the people who answered the question were familiar with it, yet familiarity again does not map neatly onto regular usage.
The previous explanations also seem to apply here. For example, performance testing is generally only required once performance becomes an issue. Likewise, usability testing rarely applies to research scripts, where what a piece of software does generally matters more than how comfortable it is to use.
Similarly, some of these techniques are complicated to set up, and may require assistance and support to implement. For example, static code analysis can have a lot of value in preventing bugs from appearing by identifying common programming mistakes. Tools that can support this include mypy which provides static types for Python, or linters such as Pylint which offer a variety of relevant hints. Setting these tools up can be complicated for initial projects, as a lot of code may already exist in the codebase that is not compliant with these tools. This code will then need to be corrected, or workarounds will need to be found for code where a cleanup would be too much effort for too little gain.
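As a small sketch of what static analysis buys, the function below carries type annotations that mypy can check. The function itself is an invented example, not code from the survey:

```python
def mean(values: list[float]) -> float:
    """Average of a list of numbers; the annotations let mypy check every caller."""
    return sum(values) / len(values)

# Running `mypy` over this file would flag a call like mean("not a list")
# at check time, before the code is ever executed.
result = mean([1.0, 2.0, 3.0])
print(result)
```

In an existing codebase, adding annotations like these incrementally is usually the point where outside support helps most.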
There are a few limitations with this analysis. First and foremost, the survey had a relatively small number of participants, and many of them did not complete it fully. In addition, we cannot be sure how well the survey was targeted, so the participants may represent certain demographics within the Helmholtz Association more than others. For example, the respondents' familiarity with many of the techniques discussed here indicates a relatively high level of competency, so these results may not be representative of researchers who are less comfortable with coding.
There are also limitations in terms of the conclusions that we can draw from this. I have suggested two reasons (lack of applicability and difficulty of application) for the variance in familiarity and regular usage between different techniques, but for each individual technique it is not possible to determine which reason is stronger, or indeed whether these are the most common reasons at all. To address this, we would need further analysis to validate our assumptions (difficult with a sample this small, where cross-analysis is impractical) or further surveys.
The questions asked produced some interesting results, which may help us to identify ways that the consulting team can better support researchers at the Helmholtz Association. Specifically, the result that familiarity with a technique does not necessarily imply use of a technique suggests that consultations may be a valuable way of supporting researchers who know what they want to achieve, but do not have the time or ability to achieve it.
There are limitations to this survey, which unfortunately affect our ability to draw more detailed conclusions, but the experiences drawn from creating and analysing this dataset will play a role in creating the 2021 survey.