DESOSA 2022

Audacity - Quality and Evolution

As a large open-source project, Audacity relies on contributions from a large collaborative community. In this paper, we provide a retrospective on the quality and evolution of Audacity, investigating its contribution pipeline, its overall quality culture, and the technical debt that has accumulated.

Software Quality Processes

Audacity uses two primary development tools to ensure that proper contributions are made in a maintainable format.

Forking

To contribute to the Audacity codebase, developers first need to fork the GitHub repository. Forking creates a copy of the original repository in which one can freely alter the code without affecting the original project. After making some changes, a developer can contribute them using a pull request (PR). Pull requests need to be reviewed by other developers, who decide whether to accept or reject the request.

Pull Requests

Audacity’s pull requests come with a predefined checklist template, which ensures that a contributor provides the information needed to judge the quality of the contribution and to maintain a coherent standard. Beyond a title naming the issue addressed, the first information that must be provided is a description of the observed issue and the method chosen to resolve it. This information provides the context in which the pull request can be evaluated. Following this, a checklist covers the following items [1]:

  • I signed CLA: Ensures that the contributor understands the license agreement.
  • Title: Ensures that no confusion is caused from an inaccurate title.
  • Extensive Changes: Enforces the maintainability principle that large changes are broken down into reviewable commits.
  • Commit Messages: Enforces that every commit comes with a proper description that matches the alteration to the code.
  • Unnecessary Changes: Ensures that only PR-related changes were made, so as to avoid breaking the code.

Additionally, an optional check-box lets the contributor indicate whether each commit compiles on their machine without undesirable changes.
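
As an illustration of how such a checklist could be verified automatically, the following minimal sketch scans a PR description for ticked Markdown task-list items. The item labels are taken from the list above, not from Audacity's actual template, and the function name `unchecked_items` is hypothetical; nothing here is part of Audacity's tooling.

```python
import re
from typing import List

# Labels paraphrased from the list above (assumption: the wording of the
# real template may differ).
REQUIRED_ITEMS = [
    "I signed CLA",
    "Title",
    "Extensive Changes",
    "Commit Messages",
    "Unnecessary Changes",
]

# Matches Markdown task-list lines such as "- [x] I signed CLA".
CHECKBOX = re.compile(r"^\s*[-*]\s*\[(?P<mark>[ xX])\]\s*(?P<label>.+?)\s*$")


def unchecked_items(pr_body: str) -> List[str]:
    """Return the required items that are missing or not ticked in a PR body."""
    ticked = set()
    for line in pr_body.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group("mark").lower() == "x":
            ticked.add(match.group("label"))
    return [item for item in REQUIRED_ITEMS if item not in ticked]


example_body = """Fixes a crash when exporting.
- [x] I signed CLA
- [x] Title
- [ ] Extensive Changes
- [x] Commit Messages
"""
print(unchecked_items(example_body))
# prints ['Extensive Changes', 'Unnecessary Changes']
```

A reviewer would still need to judge whether each ticked claim is actually true; the sketch only shows that the template makes the required information machine-checkable.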

Coding

Audacity enforces clear coding standards to ensure a maintainable project. The Audacity website provides a comprehensive list of comment and coding standards. Below is a non-exhaustive list of conventions imposed on a contributor [2]:

  • Naming convention
  • When/Where to comment
  • Tabularization
  • Do’s and Don’ts
  • Header file conventions

Continuous Integration Processes

When a PR is made, alongside the requirements of clear commit messages and the PR checklist, continuous integration (CI) is applied to make the merging process proceed smoothly. CI first ensures that the proposed code changes can be merged without conflicts. If no merge conflicts occur, the codebase with the proposed changes is built and several tests are executed to ensure that key functionalities remain working. Besides CI for building the project, Audacity also has CI for packaging the codebase into, for example, a tarball. As Audacity supports multiple operating systems, the application is fully built and tested on each supported platform. If any errors occur on a specific platform, they are immediately reported.

In general, Audacity’s CI process contains four key components:

  • GitHub Actions to define the workflow
  • Building the project on different platforms
  • Testing key features
  • Packaging the codebase

GitHub Actions

Audacity uses GitHub Actions [3] for its CI needs. The Audacity project has a predefined workflow (YAML) file that lists all building, testing, and deployment steps for each platform. As previously mentioned, these steps are roughly the same on every platform, except that they are run once for each supported operating system.

GitHub Actions can be configured to trigger whenever an event occurs in the repository. In the case of Audacity, the workflow is triggered whenever a pull request is opened. Audacity specifies a job for each platform, which means that for each platform a virtual machine is created where the project is built and the tests are run.
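
The per-platform structure described above can be modelled with a small sketch. This is not Audacity's workflow file: the platform names, the stage names, and the `run_pipeline` function are assumptions made for illustration, and the merge-conflict check is assumed to have passed before any job starts.

```python
from typing import Callable, Dict, List

# Stage order follows the prose above: build, then test, then package.
STAGES = ["build", "test", "package"]

# Illustrative platform names only (assumption); the real workflow file
# defines its own set of jobs.
PLATFORMS = ["linux", "windows", "macos"]

# A stage runner receives (platform, stage) and returns True on success.
StageRunner = Callable[[str, str], bool]


def run_pipeline(run_stage: StageRunner, platforms: List[str]) -> Dict[str, str]:
    """Run every stage once per platform, stopping a job at its first failure.

    Returns a mapping from platform to "passed" or "failed at <stage>".
    """
    report = {}
    for platform in platforms:
        report[platform] = "passed"
        for stage in STAGES:
            if not run_stage(platform, stage):
                report[platform] = f"failed at {stage}"
                break  # later stages depend on the earlier ones
    return report


def fake_runner(platform: str, stage: str) -> bool:
    """Stand-in for real build tooling: only the Windows tests fail."""
    return not (platform == "windows" and stage == "test")


print(run_pipeline(fake_runner, PLATFORMS))
# prints {'linux': 'passed', 'windows': 'failed at test', 'macos': 'passed'}
```

The point of the model is that each platform gets an independent job, so a failure on one platform is reported without hiding the results of the others.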

Rigorous Test Processing and Coverage

During CI, Audacity tests some key features and runs a set of unit tests, but automated testing does not go beyond this. Audacity does, however, employ manual testing for certain changes such as bugfixes [4]. These manual tests are not part of the CI cycle and are instead performed by the community.

Automated testing

The main components that Audacity tests are audio-related functionalities: the CI pipeline checks whether the recording of audio processing elements still functions even when that action is disrupted by abnormal behavior. Audacity also tests ‘BlockFile’ functionality, a feature which ensures audio is recorded in blocks to simplify appending and deleting audio within tracks. Besides these, Audacity also employs unit tests for smaller units. However, these are not nearly sufficient to test the whole system.

Testing fixed bugs

Bugfixes often cannot be tested unless a specific job is written for them. Such jobs, however, are usually too specific to be part of the main CI cycle. Whenever a bug is fixed, the fix must be tested by multiple developers and is then marked with a label. Bugs are tracked as issues that can be subscribed to; subscribers receive a message about changes or comments, which allows anyone interested to test the fixes themselves. Once the bugfix has been properly investigated, the regular PR cycle may be followed to get the fix into the main codebase.

Unit testing

Audacity employs unit testing using a framework called Catch2 [5]. This library is used to test small units, such as the correctness of values.

Hotspot Components from the Past and Future

Hotspot components are components involved in many code changes. To visualize the quality of Audacity, we ran CodeScene [6] directly on Audacity’s GitHub repository. Based on the commit history, the top hotspot components are Built-in effects and AudioIO. This can be seen in the figure below, where AudioIO.cpp and effects.cpp are the main files that change frequently over time. The former is where the Audacity application communicates with lower-level components to access and manipulate audio for recording and playback. The latter is the file where all audio processing is done. This trend can be explained by the fact that a developer needs to change both files to modify or add features to Audacity. Further analysis of the code in both files is discussed in the following section.

Figure: Hotspot components of Audacity with more than 8 commits, identified using CodeScene
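
The notion of a hotspot as a frequently changed file can be sketched in a few lines. This is a simplified illustration, not CodeScene's algorithm: it only counts how many commits touched each file, using the same "more than 8 commits" cut-off as the figure. The input format (one list of paths per commit, as could be gathered from version-control history) and the example paths are assumptions.

```python
from collections import Counter
from typing import Dict, List


def count_file_changes(commits: List[List[str]]) -> Counter:
    """Count how many commits touched each file.

    `commits` holds one list of changed file paths per commit.
    """
    counts: Counter = Counter()
    for changed_paths in commits:
        # A set guards against a path being listed twice within one commit.
        for path in set(changed_paths):
            counts[path] += 1
    return counts


def hotspots(counts: Counter, min_commits: int = 8) -> Dict[str, int]:
    """Keep files changed in more than `min_commits` commits, most-changed first."""
    return {path: n for path, n in counts.most_common() if n > min_commits}


# Made-up history for illustration; the paths are hypothetical.
history = [
    ["src/AudioIO.cpp", "src/effects.cpp"],
    ["src/AudioIO.cpp"],
    ["README.md"],
]
print(hotspots(count_file_changes(history), min_commits=1))
# prints {'src/AudioIO.cpp': 2}
```

Change frequency alone says nothing about how hard a file is to change, which is why the next section combines it with code health metrics.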

To find future hotspots, we analyzed the documentation and the discussions on the developers’ Discord server. The Audacity team has no plans to change its way of working, having previously chosen to abort a total refactor [7]. Therefore, the main hotspots will remain AudioIO.cpp and effects.cpp for now. The focus for the near future is on plug-ins, so more changes may be coming in the code responsible for plug-ins [8].

Code quality

Many metrics can be used to analyze code quality, such as cyclomatic complexity [9] and lines of code (LOC) per file [10], which can be combined into a code health score. In the analysis of the entire codebase, the Built-in effects component scored the lowest of all elements of Audacity’s architecture. The figure below shows the score changing over the past few months; it remains dangerously low, at around 3.2. The main issues identified are Brain Classes [11], Brain Methods [12] and Bumpy Roads [13], which means that several classes and methods are responsible for too many things, and that multiple chunks of nested conditional logic exist within the same function in different parts of the code.

Figure: Code analysis of the files that make up the Built-in effects component

An additional analysis was performed on both hotspot files identified in the previous section, using different metrics. As shown in the comparison below, both files score poorly on several metrics, such as containing multiple methods with a cyclomatic complexity (CC) over 20 and exceeding the threshold of 4 levels of nesting used for the C++ language [14].

Figure: Comparison of code health metrics of effects.cpp and AudioIO.cpp
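
To make the two metrics above concrete, the sketch below approximates them for a single function given as text. It is a rough illustration under stated assumptions, not the method used by CodeScene: cyclomatic complexity is estimated as one plus the number of decision points found by a regular expression, nesting depth is estimated from braces, and comments, string literals and preprocessor directives are not handled. Both function names are hypothetical.

```python
import re

# Keywords and operators that each add one decision point in this approximation.
DECISION_PATTERN = re.compile(r"\b(?:if|for|while|case|catch)\b|&&|\|\||\?")


def approx_cyclomatic_complexity(function_source: str) -> int:
    """Approximate CC as 1 + the number of decision points in one function."""
    return 1 + len(DECISION_PATTERN.findall(function_source))


def max_nesting_depth(function_source: str) -> int:
    """Return the deepest brace nesting inside the function's outer braces."""
    depth = deepest = 0
    for char in function_source:
        if char == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif char == "}":
            depth -= 1
    # Subtract one so the function's own braces do not count as nesting.
    return max(deepest - 1, 0)


# Made-up C++ function used only as input text.
sample = """
void Process(int n) {
    for (int i = 0; i < n; ++i) {
        if (i % 2 == 0 && i > 2) {
            Handle(i);
        }
    }
}
"""
print(approx_cyclomatic_complexity(sample))  # prints 4
print(max_nesting_depth(sample))             # prints 2
```

Under this approximation, a method with a CC over 20 contains at least twenty branching points, which gives a sense of how much logic the flagged methods concentrate in one place.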

This analysis shows that Audacity has difficulty maintaining good code quality in its main files. A refactoring of these files is long overdue. A detailed discussion about the previous failed refactoring attempts is presented in the Technical Debt section.

Quality culture

The Audacity community welcomes new ideas and contributions, and provides contribution and coding guidelines for Audacity [15]. While useful, these standards must also be upheld in practice. To assess the extent to which these guidelines are respected and maintained by the community, we browsed through Audacity’s repository and analyzed the discussions on PRs and issues.

Out of the 707 open issues and 79 active PRs, we analyzed 10 of each (shown in the table below) that, in our opinion, illustrate the quality culture in Audacity’s community.

Issue    Status    PR       Status
#2634    Open      #2677    Merged
#2597    Open      #2647    Merged
#2595    Open      #2635    Merged
#2590    Open      #2594    Merged
#1512    Open      #2297    Merged
#1481    Open      #2680    Open
#2308    Open      #2561    Open
#2149    Closed    #2517    Draft
#2374    Closed    #1940    Merged
#2314    Closed    #2212    Closed

The general trend we found after analyzing these PRs and issues is that the community is open to discussion and makes decisions as a group. Moreover, all contributions undergo two stages of informal checks. First, the developers decide whether the proposed contribution is beneficial for the software (#1481). Second, the developers test the proposed code on their machines and provide feedback if they think something in the code requires alteration (#2561). It is also worth noting that communication with the developers is swift and to the point (#2635).

Another interesting observation is that the community has set up a system of labels to tag issues and PRs with descriptive information. Some of these labels are dedicated to the quality of the code and product, such as "do we want this?", which prompts other developers and contributors to give their opinions about the contribution.

This shows that Audacity’s community strives to ensure that only valued features enter the source code. However, this scrutiny is not applied to every contribution in the repository: of the 722 merged PRs, 298 (about 41%) received no comments or discussion from other developers. Therefore, a more rigorous system to check and review incoming contributions would be beneficial for a project such as Audacity.

Technical Debt

Previously, the Audacity community attempted to refactor the codebase by extracting Audacity’s audio processing functions into the LibAudacity library [7]. The principal idea behind the refactoring was to allow a GUI-less library to be embedded within other programs and to be available for scripting. LibAudacity was later replaced by Mezzo, which also aimed to provide a cleaner separation between audio processing code and GUI [16]. Finally, after both mostly unsuccessful attempts, the Audacity core developers concluded that it was more productive to work directly on Audacity, and that the recent scripting features already provide an incremental way of separating the GUI from the audio processing functions. As a direct consequence, the current state of Audacity carries a fair amount of technical debt.

The main design flaw, as previously mentioned, is the lack of separation between the audio processing functions and the GUI. This difficulty stems from how easy it is to stitch code together quickly in order to finalize the release of a feature and maintain the interest of the professional user-base. On top of this, as can be seen in our contribution PRs, numerous developers find it easier to communicate through code comments than to hold proper discussions, which makes it even harder to understand how the code can properly be split.
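
What such a separation means in practice can be shown with a deliberately small sketch. It is written in Python for brevity and does not reflect Audacity's actual C++ code; the `amplify` function and the `AmplifyDialog` class are hypothetical names invented for this example.

```python
from typing import List


def amplify(samples: List[float], gain: float) -> List[float]:
    """Pure audio processing: scale samples and clip them to [-1.0, 1.0].

    No GUI code is involved, so the function can be scripted, embedded
    in other programs, or unit tested on its own.
    """
    return [max(-1.0, min(1.0, sample * gain)) for sample in samples]


class AmplifyDialog:
    """Hypothetical GUI layer: collects user input and delegates the work."""

    def __init__(self, gain_text: str) -> None:
        # Stands in for the contents of a text field in a real dialog.
        self.gain_text = gain_text

    def on_apply(self, samples: List[float]) -> List[float]:
        gain = float(self.gain_text)   # UI concern: parse the user's input
        return amplify(samples, gain)  # processing concern lives elsewhere


print(amplify([0.5, -0.8], 2.0))              # prints [1.0, -1.0]
print(AmplifyDialog("0.5").on_apply([0.4]))   # prints [0.2]
```

When the processing logic instead lives inside the dialog's event handler, it cannot be reused or tested without constructing the GUI, which is the kind of entanglement the refactoring attempts tried to remove.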

Although the Audacity developers maintain that it is more productive to work directly on Audacity, wiping the slate clean by overhauling the system into more detachable components would result in cleaner, more maintainable code than the current codebase. This would not only improve the contribution pipeline but also make it easier for new contributors to participate. At the current stage, Audacity is usable and a slow split between audio processing and GUI is becoming visible, but many tangled components keep the code in a fair share of debt.

References


  1. https://wiki.audacityteam.org/wiki/CodingStandards
  2. https://wiki.audacityteam.org/wiki/Quality
  3. https://docs.github.com/en/actions/quickstart
  4. https://wiki.audacityteam.org/wiki/QA_Procedures_overview
  5. https://github.com/catchorg/Catch2
  6. https://codescene.com/
  7. https://wiki.audacityteam.org/wiki/LibAudacity
  8. https://wiki.audacityteam.org/wiki/Roadmap#The_Release_Schedule
  9. https://en.wikipedia.org/wiki/Cyclomatic_complexity
  10. https://en.wikipedia.org/wiki/Source_lines_of_code
  11. https://docs.embold.io/brain-class/
  12. https://www.simpleorientedarchitecture.com/how-to-identify-brain-method-using-ndepend/
  13. https://codescene.com/blog/bumpy-road-code-complexity-in-context/
  14. https://pmd.sourceforge.io/pmd-4.3/rules/codesize.html
  15. https://wiki.audacityteam.org/wiki/Developer_Guide
  16. https://wiki.audacityteam.org/wiki/Mezzo

Audacity
Authors
Haoran Xia
Nafie El Coudi El Amrani
Maxmillan Ries
Cristian-Mihai Rosiu