DESOSA 2022

Backstage - Quality and evolution

Quality and evolution

We have already discussed the vision behind Backstage and the architecture that makes up the Backstage product. In this essay we discuss how Backstage safeguards its quality and how the project is evolving.

Key attributes

The key attributes of Backstage can be divided into two categories: external qualities and internal qualities. We will first discuss the system's external quality attributes and afterwards its internal attributes.

External attributes

The external attributes that we recognize are modularity, adaptability and usability.

Modularity

The first external attribute that can be assigned to Backstage is its modularity. This is due to its extensive support for including plugins in any Backstage installation [1]. With these plugins the functionality of Backstage can be extended to solve a wide range of problems. That is why Backstage can be considered a successful example of a modular system.

Adaptability

The second external quality that can be attributed to Backstage is its adaptability. Backstage can be adapted to serve a broad range of organizations, from small-scale startups to large enterprises [2]. In addition, the companies that have deployed Backstage are diverse: they range from e-commerce companies to large SaaS companies. Backstage is able to adapt to each of these situations.

Usability

The third attribute that can be assigned to Backstage is its usability. Evidence for this is the wide-scale adoption of Backstage by companies: engineering teams in a wide range of situations have chosen to adopt Backstage, which suggests that it is highly usable.

Internal attributes

The two most important internal attributes are maintainability and deployability.

Maintainability

Backstage provides extensive documentation for contributors. Good documentation is important for large open source projects, as it helps to ensure code quality. Backstage does this very well, which is why its documentation is used as a reference several times in this essay.

Deployability

Backstage can also be considered excellent in its deployability. This essay expands on this topic later (see chapter ‘Continuous integration’), where it is described that Backstage releases a new build daily. That means new versions of Backstage are deployed on a very regular basis.

Overall Quality Process

Backstage does a few key things to maintain a good overall quality process. First, the project relies on excellent documentation to create an efficient, well-documented engineering process. Second, it relies on a very active Discord channel for communication among contributors. Together, these two factors create a good overall quality process.

Continuous integration

The continuous integration (CI) process of Backstage is extensive. As the frequency of updates is high, an extensive CI process is important for Backstage. Whenever a merge request is made, the CI process runs three types of pipelines.

The first type of pipeline is used to provide information about the merge request:

  • Developer Certificate of Origin (DCO) verifies that CONTRIBUTING.md is being followed: it checks that every commit in a pull request is signed off by the author
  • Automate area labels uses labeler@v3 to assign labels to the merge request depending on which files have been changed in the merge request
  • Automate review labels uses github-script@v5 to assign the label ‘awaiting-review’ if the merge request was created by anybody other than the maintainers of Backstage
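
To make the DCO check above concrete, here is a minimal TypeScript sketch of the idea. It is not the implementation of the DCO check that Backstage actually runs. It assumes that a commit counts as signed off when its message contains a Signed-off-by trailer that matches the commit author's name and email; the Commit interface and both function names are hypothetical.

```typescript
// One commit as seen by this sketch; the field names are hypothetical.
interface Commit {
  sha: string;
  authorName: string;
  authorEmail: string;
  message: string;
}

// Matches a trailer line such as "Signed-off-by: Jane Doe <jane@example.com>".
const SIGN_OFF = /^Signed-off-by: (.+) <(.+)>$/;

// Returns true when the message carries a sign-off trailer for the commit author.
function isSignedOff(commit: Commit): boolean {
  return commit.message.split("\n").some((line) => {
    const match = SIGN_OFF.exec(line.trim());
    if (match === null) return false;
    const signerName = match[1] ?? "";
    const signerEmail = match[2] ?? "";
    return (
      signerName === commit.authorName &&
      signerEmail.toLowerCase() === commit.authorEmail.toLowerCase()
    );
  });
}

// Returns the SHAs of all commits in a pull request that lack a valid sign-off.
function unsignedCommits(commits: Commit[]): string[] {
  return commits.filter((c) => !isSignedOff(c)).map((c) => c.sha);
}
```

A pull request would pass this sketch of the check only if unsignedCommits returns an empty list. A sign-off by somebody other than the commit author is treated as missing.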

The second type consists of pipelines that verify changes made to the repo using third-party tools or tools created specifically for Backstage:

  • Verify Storybook uses an automated Chromatic workflow to verify that all Storybook UI components in the repo can be compiled
  • Verify Docs Quality verifies that new documentation is included, such as changesets, which indicate which plugins require a new release to be published
  • Verify CodeQL scans the code for vulnerabilities that could be exploited if not fixed
  • Code scanning result outputs the results obtained from CodeQL in a human-readable way

The last type of pipeline in the continuous integration process consists of the end-to-end tests and checks executed in Node. The jobs that test Node processes are executed on two versions of Node, namely 14.x and 16.x. This choice was made because every even-numbered major version of Node becomes a long-term support (LTS) release. Node 14.x was the previous LTS and Node 16.x is the current LTS [3].

  • CI: This pipeline is the main and most extensive pipeline for continuous integration. It runs the following:

    • Prettier to check if any code has not been correctly formatted.
    • yarn-lock-check to verify all dependency sources are from the Node Package Manager site. This is done in an attempt to prevent any malicious versions of dependencies from contaminating Backstage.
    • Custom tool to check if the default config is still valid.
    • ESLint to look for problematic patterns and enforce code conventions.
    • TypeScript compile in order to perform type-checking and verify correct declarations.
    • Custom scripts located at backstage/scripts/, which verify that API references are correctly published when Backstage is run, that the correct changesets have been included, that all type dependencies can be found, and that every URL in the documentation is relative for internal references and absolute for external references where needed.
    • Lerna to test all changed packages and verify that they still compile successfully.
  • E2E Test Techdocs: TechDocs is Spotify’s homegrown docs-like-code solution built directly into Backstage. This means engineers write their documentation in Markdown files which live together with their code. The continuous integration pipeline compiles the TechDocs server on the current long-term support (LTS) version of Ubuntu. The pipeline then runs a few checks on the TechDocs server, verifying whether it can generate and serve documentation for both Backstage and MkDocs as well as show help text.

  • E2E Test Linux/Windows: This pipeline runs a full end-to-end test of Backstage on both Ubuntu LTS and windows-2019 for both versions of Node. To do this it sets up a Chrome browser, compiles the Backstage application and runs all tests that have been written. See the next chapter for further details on these tests.
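
The yarn-lock-check step in the list above lends itself to a short illustration. The following TypeScript sketch shows one way such a check could work; it is not the script Backstage uses. It assumes a Yarn v1 style lockfile in which every dependency has a resolved "<url>" line, and the allow-list of registry hosts as well as the function name are assumptions made for this example.

```typescript
// Hosts treated as trusted in this sketch; the real allow-list is an assumption.
const ALLOWED_HOSTS = ["registry.npmjs.org", "registry.yarnpkg.com"];

// Scans every `resolved "<url>"` line of a Yarn v1 style lockfile and
// returns the URLs whose host is not on the allow-list.
function untrustedSources(
  lockfile: string,
  allowed: string[] = ALLOWED_HOSTS,
): string[] {
  const resolvedLine = /^\s*resolved "([^"]+)"/;
  const hostOfUrl = /^https?:\/\/([^/]+)\//;
  const offenders: string[] = [];
  for (const line of lockfile.split("\n")) {
    const match = resolvedLine.exec(line);
    if (match === null) continue; // not a `resolved` line
    const url = match[1] ?? "";
    const host = hostOfUrl.exec(url)?.[1] ?? "";
    // URLs without a recognizable host are reported as well.
    if (allowed.indexOf(host) === -1) offenders.push(url);
  }
  return offenders;
}
```

A CI job built on this idea would fail whenever untrustedSources returns a non-empty list, so that a dependency pointing at an unexpected server is caught before the merge.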

The remaining jobs are not run as part of the merge request pipelines; they are triggered separately and are used to manage the main repository. Examples are a tool that checks for stale issues and one that automatically publishes the Backstage website whenever a merge containing changes to the website has been accepted. There is also a nightly job that uses changeset files to check for changes in plugins and packages and publishes new versions of them on the Node Package Manager site when needed.
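
To illustrate how changeset files can drive such a release job, the sketch below parses a changeset and applies the requested version bump. It assumes the usual Changesets layout, a front matter block between two --- lines that maps package names to major, minor or patch, followed by a description. The function names are hypothetical, and the real tooling handles far more cases (dependencies between packages, pre-releases, changelogs).

```typescript
type Bump = "major" | "minor" | "patch";

// Parses the front matter of a changeset file into { packageName: bump }.
function parseChangeset(text: string): Record<string, Bump> {
  const bumps: Record<string, Bump> = {};
  // parts[0] precedes the first '---' (normally empty), parts[1] is the front matter.
  const parts = text.split(/^---[ \t]*$/m);
  const frontMatter = parts.length >= 3 ? (parts[1] ?? "") : "";
  const entry = /^\s*['"]?([^'"]+?)['"]?\s*:\s*(major|minor|patch)\s*$/;
  for (const line of frontMatter.split("\n")) {
    const match = entry.exec(line);
    if (match !== null) bumps[match[1] ?? ""] = match[2] as Bump;
  }
  return bumps;
}

// Applies a semantic-versioning bump to a plain "x.y.z" version string.
function bumpVersion(version: string, bump: Bump): string {
  const [major = 0, minor = 0, patch = 0] = version.split(".").map(Number);
  if (bump === "major") return `${major + 1}.0.0`;
  if (bump === "minor") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}
```

For a changeset that lists '@backstage/plugin-catalog': patch, a package currently at version 0.9.3 would be published as 0.9.4 in this sketch; packages without a changeset are left untouched.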

Test processes and coverage

Backstage does not have a requirement for test coverage; it leaves the decision on whether tests are required for a certain piece of code up to the developer. Since Backstage is split into plugins and packages, tests can be written for either of these and are automatically added to test runs. By convention, tests are placed in a file named *.test.ts, where * denotes the name of the file under test. Tests are run using Jest. At the time of writing, code coverage for Backstage is at 77%; the lowest it has been is 46%. Drops in coverage are accepted, as can be seen in the graph shown below.
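
The naming convention can be made concrete with a small TypeScript sketch that maps a source file to its conventional test file and lists the sources that have none. This is purely illustrative: as noted above, Backstage does not enforce such a rule, and the function names are hypothetical.

```typescript
// Returns the conventional test file for a TypeScript source file,
// e.g. "src/foo.ts" becomes "src/foo.test.ts".
function testFileFor(sourcePath: string): string {
  return sourcePath.replace(/\.ts$/, ".test.ts");
}

// Returns true for files that are themselves tests.
function isTestFile(path: string): boolean {
  return /\.test\.ts$/.test(path);
}

// Lists the .ts source files in `paths` that have no sibling *.test.ts file.
function sourcesWithoutTests(paths: string[]): string[] {
  const sources = paths.filter((p) => /\.ts$/.test(p) && !isTestFile(p));
  return sources.filter((p) => paths.indexOf(testFileFor(p)) === -1);
}
```

Given the files src/a.ts, src/a.test.ts and src/b.ts, this sketch reports only src/b.ts as untested.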

Figure: Coverage chart of the Backstage GitHub repository [4]

Hotspot components

The repository can be divided into two components: the plugins, which are the tools Backstage provides, and the packages, which contain most of the functionality of Backstage itself. This distinction has to be made when considering where the hotspots in the code are. To analyze hotspots within these components, CodeScene [5] is used.
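
As a rough illustration of what a hotspot is, the TypeScript sketch below counts how often each component was touched, given one path per changed file per commit (for example collected from git log --name-only). This is only a change-frequency proxy under our own assumptions; CodeScene's analysis is more elaborate, and the function names and the choice to group by the first two path segments are hypothetical.

```typescript
// Counts changes per component, where a component is identified by the
// first `depth` path segments, e.g. "plugins/catalog".
function changeCounts(
  changedPaths: string[],
  depth: number = 2,
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const path of changedPaths) {
    const component = path.split("/").slice(0, depth).join("/");
    counts[component] = (counts[component] ?? 0) + 1;
  }
  return counts;
}

// Returns the `n` most frequently changed components, ties broken by name.
function topHotspots(counts: Record<string, number>, n: number): string[] {
  return Object.keys(counts)
    .sort((a, b) => (counts[b] ?? 0) - (counts[a] ?? 0) || a.localeCompare(b))
    .slice(0, n);
}
```

Components that keep appearing at the top of such a ranking are the ones where most development effort is spent, which is the sense in which the term hotspot is used in this chapter.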

Let us first look at the packages. On the package side we see some hotspots that indicate an evolution throughout development. The first one is app/, but this package only serves as an example and provides no functionality. The next logical spot is core-components, which contains the visual React components.

Figure: Hotspots in packages

A more interesting case is the plugin side of the monorepo. This part can be heavily expanded upon and will most likely grow with more plugins in the future. However, existing plugins get updated frequently as well. The plugins with the most changes are those revolving around Catalog, TechDocs and Scaffolding. These plugins form an essential part of the Backstage experience, and it is to be expected that they see more changes than other, more standalone plugins. Evidently, these parts of the code were also a focus for the release of Backstage 1.0.

When looking at the future roadmap, which we described in a previous essay, it becomes clear that most future work will be done in existing plugins or that new features will be built as plugins themselves.

Figure: Hotspots in plugins

Code quality

To analyze the code quality of the various components, CodeScene [5] has again been used. Looking at the code health, the various components score well. This might be because of the modularity of the codebase, where every plugin is a project of its own. This seems to keep the quality maintainable, given that a plugin does not conflict with or depend on other components and changes within most plugins are infrequent.

Figure: Unhealthy components in plugins

In fact, the few unhealthier components are also the ones classified as hotspots earlier. They suffer from methods that became more complex over the course of development, which will make them harder to maintain in the future if not fixed. Furthermore, as previously mentioned, these components are vital for the release of Backstage 1.0, which is due in March 2022. It seems that in order to get it shipped, the team prefers to set overall quality aside for now. As authors of this essay, we argue that future work should fix these issues before they become a problem.

Figure: Decrease in code quality in last 3 months for hotspots

Quality culture contributions

The nice thing about Backstage is that it is not only created by a big software company, Spotify, but that it is also adopted and maintained by a lot of other big companies. Just to name a few: Bol.com, Netflix, Fiverr, Zalando and American Airlines. Looking at the commits, merge requests and issues, it is clear that all these companies actively contribute to Backstage as a product. For them this is a symbiotic relationship: they can easily contribute to the platform, customizing it to their needs, and the platform gets better as a whole because of it.

Issues can have several different tags assigned to them, and tags are used on every issue. There are tags for the type of issue, tags for the part of Backstage the issue refers to, a tag good first issue, and even a tag rfc, which requests a discussion about an architectural decision. This last label, rfc, is a good indicator of the level of collaboration on architecturally significant decisions. When users or the core developers have a major architectural decision to make, an rfc-labelled issue is created. Let’s take a snapshot of the 10 most recent rfc issues created more than a month ago:

Figure: 10 Most recent rfc issues older than a month

Here we can see that many of these issues have a good number of comments, and that several rfc issues are created every month. The rfc issues are all formatted to include the need, the proposal, any alternatives to the proposal and the risks involved. Every one of these issues has a thorough explanation, and the required format forces the proposer to put thought into the rfc up front. Furthermore, most of these issues have a very insightful comment chain that already discusses code changes and the implications of the proposal.
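
The fixed rfc structure can be expressed as a simple check. The TypeScript sketch below reports which of the four parts are missing from an issue body. It assumes that each part appears as a Markdown heading; the exact heading texts (Need, Proposal, Alternatives, Risks) and the function name are our own assumptions, not taken from the Backstage issue template.

```typescript
// Section titles this sketch looks for; the exact wording is an assumption.
const REQUIRED_SECTIONS = ["Need", "Proposal", "Alternatives", "Risks"];

// Returns the required sections that do not appear as a Markdown heading
// (a line starting with one or more '#') in the issue body.
function missingRfcSections(
  issueBody: string,
  required: string[] = REQUIRED_SECTIONS,
): string[] {
  const headings = issueBody
    .split("\n")
    .map((line) => /^#+\s*(.+?)\s*$/.exec(line))
    .filter((m): m is RegExpExecArray => m !== null)
    .map((m) => (m[1] ?? "").toLowerCase());
  return required.filter(
    (section) => headings.indexOf(section.toLowerCase()) === -1,
  );
}
```

An issue body that contains headings for the need, the proposal and the risks but not for alternatives would be reported as missing only the Alternatives section.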

Technical analysis of the codebase

As the whole system is highly modularized into plugins and the initial architecture is very well thought out, not much architectural technical debt is present. We did run the source code through a tool called DeepSource. This tool detects problems with performance, security, styling, documentation, anti-patterns and so-called “bug risks”. As is to be expected from such a big codebase, it found a lot of issues, 16,800 to be precise. Luckily, most of these were minor and nothing to worry about.

Figure: DeepSource coverage report

We did find some inconsistencies, and a pull request fixing them is pending approval by the maintainers. What stood out was the lack of code comments: DeepSource reported a documentation coverage of 24.8%, and looking at individual source files confirms this. Not much functionality is explained using comments.

References


  1. Backstage Plugins [Online]. Available: https://backstage.io/docs/plugins/. [Accessed: 20-Mar-2022].

  2. Backstage FAQ [Online]. Available: https://backstage.io/docs/FAQ. [Accessed: 20-Mar-2022].

  3. Node - Previous Releases [Online]. Available: https://nodejs.org/en/download/releases/. [Accessed: 21-Mar-2022].

  4. Backstage Coverage Chart [Online]. Available: https://app.codecov.io/gh/backstage/backstage. [Accessed: 21-Mar-2022].

  5. CodeScene [Online]. Available: https://codescene.io. [Accessed: 20-Mar-2022].

Backstage
Authors
Aaron van Diepen
Daan Groenewegen
Philip Groet
Thijs Verreck