At the beginning of 2020, the HIFIS Software team initiated a software survey targeting employees of the whole Helmholtz Association, in which 467 participants could be considered for the analysis. The figure below depicts how strongly the different Helmholtz research fields are represented in this survey.
With the results of the survey, we want to understand how we as HIFIS Software Services can best support your everyday life as a research software developer. In this blog post, we examine the results from a technology perspective. On the one hand, we give an overview of the status quo of the participants' software engineering process; on the other hand, we try to identify specific measures.
Version Control
One of the basic requirements for developing sustainable and high-quality research software is the use of a version control system (VCS). Several competing systems exist: distributed version control systems like Git or Mercurial, and centralized version control systems like SVN. In accordance with the trends shown in the analysis done by Stack Overflow, we expected Git to be the most popular tool within Helmholtz.
Trend of Stack Overflow questions per month. Created via Stack Overflow Trends on 2020-10-15.
The figure below shows how the participants answered the multiple-choice question about which VCSs they use.
A diagram similar to the one above has already been evaluated in a related blog post on results from the survey analysis. Here, we would only like to draw conclusions from a technological point of view. Only roughly 10% of the participants claim that they do not use a VCS while developing their research software. These results indicate a high awareness among the participants that the use of version control systems is an important aspect of sustainable software development.
To unravel this a bit more, we identified a trend in the figure below: the use of VCSs increases the more widely research software developers share their source code, in terms of categories like within their research group, research organization, research field or even the general public. Hence, there might be a relationship between how broadly code is shared and the use of VCSs. If this trend holds true, it illustrates that version control systems are indeed mandatory tools for collaborating with other developers.
The responses to the survey are then grouped into the six Helmholtz research fields:
- Aeronautics, Space and Transport
- Energy
- Earth and Environment
- Health
- Matter
- Key Technologies
In the research field Aeronautics, Space and Transport, SVN seems to be more widespread than in other research fields, while the portion of developers who do not use version control is the lowest among all research fields. On the one hand, taken together with the Stack Overflow data on VCS questions over time introduced earlier, this most probably indicates that a significant number of comparatively old repositories use SVN and that this research field might have a longer tradition of using VCSs. On the other hand, it shows that the use of VCSs is more prevalent in this research field today than in the other Helmholtz research fields.
From the data it is also possible to compare the usage of version control systems with the team size participants usually develop software in. The result is shown in the figure below:
The share of participants who claim not to use any kind of version control clearly decreases with increasing team size, which suggests a relationship between team size and the use of VCSs. One reason for the increasing use of VCSs with growing team size might be that VCSs make collaboration more comfortable and that researchers are aware of this fact. Whether the use of VCSs has already become a de facto standard in research software will be investigated further (e.g. in our next survey).
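The comparison between team size and VCS use can be sketched as a simple grouped share. This is a minimal Python sketch, not the analysis code used for the survey: the helper name `no_vcs_share_by_team_size`, the team-size labels and all answers are hypothetical and made up for illustration.

```python
from collections import defaultdict


def no_vcs_share_by_team_size(answers):
    """Map each team-size category to the percentage of participants without a VCS.

    `answers` is a list of (team_size, uses_vcs) tuples, one per participant.
    """
    totals = defaultdict(int)
    without_vcs = defaultdict(int)
    for team_size, uses_vcs in answers:
        totals[team_size] += 1
        if not uses_vcs:
            without_vcs[team_size] += 1
    return {size: 100 * without_vcs[size] / totals[size] for size in totals}


# Hypothetical answers (NOT real survey data); the category labels are assumptions.
answers = (
    [("alone", False)] * 1 + [("alone", True)] * 3
    + [("2-5", False)] * 1 + [("2-5", True)] * 9
    + [(">5", True)] * 4
)

shares = no_vcs_share_by_team_size(answers)
for size, share in shares.items():
    print(f"team size {size}: {share:.0f}% without version control")
```

With real data, each category should also report its number of participants, since a share computed from a handful of answers is not very meaningful.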
On the other hand, 20% of the participants who claim to develop software mostly on their own state that they do not use version control at all. This is something we as HIFIS Software Services would like to see change in the future. For us, it is important to make people aware that using version control is a mandatory requirement for software development projects of any scale. This requires us to make the entry hurdle to using version control systems as low as possible, which means that every software developer in Helmholtz must have access to a suitable and easy-to-use infrastructure. Therefore, HIFIS Software Services will offer a GitLab instance that is usable by every employee of the Helmholtz Association free of charge.
Software Development Platforms
Using version control systems can be considered the entry point to a world of platforms that build even more around this basic requirement. Even though you can typically use a version control system completely locally, it really starts paying off when you combine version control with online platforms like GitLab, GitHub or Bitbucket. This not only opens up your project for collaboration but also gives you access to a whole ecosystem of other extremely useful tools like issue tracking, merge requests, CI/CD or code reviews. This is why we were also eager to know which software development platforms the participants use in their everyday life.
The results show that the most widely used platforms among the participants are GitHub.com and self-hosted GitLab instances, followed by GitLab.com. About 54% of the participants claim to use GitHub.com, 49% use self-hosted GitLab instances and about 25% state that they use GitLab.com. About 13% claim not to use any of the platforms. This value is in a similar range to the share of participants who stated that they do not use version control systems.
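The shares above add up to more than 100%, which is what one would expect if participants could select several platforms. Here is a minimal Python sketch of how such per-option shares can be computed. The helper name `option_shares` and the four example answers are hypothetical and not taken from the survey data.

```python
from collections import Counter


def option_shares(responses):
    """Return the percentage of participants who selected each option.

    `responses` holds one set of selected options per participant, so a
    participant who ticks several options counts once for each of them.
    """
    counts = Counter(option for selected in responses for option in selected)
    total = len(responses)
    return {option: 100 * n / total for option, n in counts.items()}


# Hypothetical answers from four participants (NOT real survey data).
responses = [
    {"GitHub.com", "Self-hosted GitLab"},
    {"GitHub.com"},
    {"Self-hosted GitLab", "GitLab.com"},
    {"None"},
]

shares = option_shares(responses)
for option, share in sorted(shares.items()):
    print(f"{option}: {share:.0f}%")
```

Each share is relative to the number of participants, not to the number of ticks, so the individual values remain comparable even though their sum exceeds 100%.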
Continuous Integration
Continuous Integration (CI) refers to the practice of merging code changes into a shared mainline several times a day. A typical workflow incorporates the automatic building of the software, the automatic execution of unit tests and, finally, the automatic deployment of artifacts, e.g. the documentation or compiled binaries. The last step is also referred to as Continuous Deployment (CD). Multiple tools support this kind of software development process. Some of the tools available at the time of this survey were GitLab CI, Jenkins, Travis CI and CircleCI.
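The build, test and deploy sequence described above can be illustrated with a small sketch. Real CI services such as GitLab CI or Jenkins define these stages in their own configuration files rather than in Python; the function names below (`run_pipeline`, `build_software`, `run_unit_tests`, `deploy_artifacts`) are hypothetical stand-ins that only show the ordering and the stop-on-failure behaviour.

```python
def run_pipeline(stages):
    """Run stages in order and stop at the first failure.

    `stages` is a list of (name, callable) pairs; each callable returns True
    on success. Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, stage in stages:
        if not stage():
            print(f"stage '{name}' failed, skipping remaining stages")
            break
        completed.append(name)
    return completed


# Hypothetical stand-ins for real build, test and deployment steps.
def build_software():
    return True


def run_unit_tests():
    return 1 + 1 == 2


def deploy_artifacts():
    return True


result = run_pipeline([
    ("build", build_software),
    ("test", run_unit_tests),
    ("deploy", deploy_artifacts),
])

# A failing test stage prevents the deployment from running.
failing = run_pipeline([
    ("build", build_software),
    ("test", lambda: False),
    ("deploy", deploy_artifacts),
])
print(result, failing)
```

The second call shows the quality gate that makes CI useful: when the tests fail, nothing is deployed.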
The results of the survey show a pretty diverse situation for the usage of CI services by the participants.
53% of the participants claim not to use CI services at all. Among the participants who declared that they use CI services, the most commonly used technologies were GitLab CI (29%), Jenkins (16%) and Travis CI (13%). Since many Helmholtz centers host their own GitLab instances, which also makes GitLab CI available, we expected GitLab CI to be the most popular tool among the participants of the survey. Jenkins can also be self-hosted and is thus popular and available in different centers as well. Given the popularity of GitHub, especially for Open Source projects, it is not surprising that Travis CI is also widely chosen according to the survey responses. At the time the survey was created, GitHub Actions was not yet widely available. This explains why this service does not show up in the list of chosen tools.
We as HIFIS Software Services would like to see a rise in the overall usage of CI/CD in the daily software development process. It offers the chance to automate repetitive tasks and introduces automated quality checks for code changes before they are merged into the mainline. Therefore, we want to ensure that every Helmholtz researcher, regardless of their affiliation, has seamless access to general-purpose resources for CI/CD. This is why the provided GitLab instance will be equipped with scalable resources for CI/CD. With this offer, in combination with proper education, training and consultation, we hope to see a rise in the general usage of automation technologies in research software engineering.