Most of us probably remember the advertisements: ‘more than 3 billion devices run Java!’, and since the time when these ads were common, the Java ecosystem has continued evolving. It is hard to overstate the influence of Java: it is used extensively in server development, it is a key pillar in Android, and its object-oriented approach has influenced the development of many other languages. At the end of 2018, there were an estimated 7.6 million Java developers1. Every one of these 7.6 million developers has their own preferences, their own habits, and their own style.
Organisations developing in Java invest in making sure that, even given these personal differences, teams of developers produce coherent, working code. The functional aspects of code can be validated with tests - if not by the fact that the entire project has broken down on Friday at 15:30, again. In these functional terms, developers also tend to be united: the code should work, and do so efficiently and securely. These requirements tend to be - and rightly so - fairly non-negotiable.
In non-functional terms, there is much more to have heated debates about - online, in team meetings, code review sessions and at the coffee machine. One area where this is notable is code style. Below are two examples of the same Java ‘hello world’ program, written with different code styles.

    public class HelloWorldApp
    {
        // Main method for this application.
        // Args are unused
        public static void main (String args[])
        {
            var hello="Hello World!";
            /* Prints the string to the console. */
            System.out.println (hello);
        }
    }

    public class HW {
        /**
         * Main method for this application.
         * @param a unused
         */
        public static void main(String[] a) {
            String h = "Hello World!";
            System.out.println(h); // Prints the string to the console
        }
    }

A ‘coding convention’ or ‘code style’ is a set of rules about the format of the code. The exact rules to follow are up for debate; the important thing is that it is a coherent set. Rules can be about any part of the code style. Often, a coding convention includes rules about indentation and other whitespace, the capitalization, style and spelling of class, variable and function names, and the use and style of comments.
Checkstyle2 is one of many tools that can be used to enforce a code style. It includes out-of-the-box, opinionated style configurations, but also allows users to configure their own style by enabling or disabling rules, or writing their own. In addition, Checkstyle can check code for common anti-patterns, and in doing so improve code maintainability and extensibility as well as limit bugs.
Why a Coherent Style is Important
It is clear that catching bugs and improving maintainability are desiderata. But why is it desirable to enforce a single code style? The two examples from earlier are functionally identical: why do we care about code style at all?
Firstly, code that has a uniform code style is easier to read. Adapting to a new code style takes time and energy. If a code style is consistently enforced, developers will know what to expect, and where things should be. Readable code is also naturally more maintainable. Many code styles also include rules specifically targeting maintainability - for example, a rule limiting the number of methods allowed in a class. Additionally, having a consistent code style keeps diffs clean. If, within a team, developers do not agree on whether to put the { on the same line or a new line, the diffs will be littered with formatting changes whenever a stubborn developer decides to “fix” this issue. Apart from leaving a cleaner git history, avoiding this also prevents unnecessary merge conflicts.
What Checkstyle Can Do
Checkstyle is a static analysis tool: in contrast to things like tests, it evaluates code without running it. Checkstyle targets three main problems: class design problems, method design problems, and code layout problems. The main idea of the first two is to detect possible flaws in the way classes and methods are designed. Even when such a flaw is not immediately obvious to the programmer, Checkstyle can detect it using simple systematic checks - for example, the check HideUtilityClassConstructor verifies that utility classes do not have a public constructor, so that they are not instantiated by accident. The third problem is keeping the code layout and formatting consistent across a project, which, as discussed above, is a desirable property.
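As an illustration of the kind of class a check like HideUtilityClassConstructor is concerned with, here is a minimal sketch of a utility class that cannot be instantiated from outside. The names `TextUtil` and `isBlank` are hypothetical and chosen only for this example.

```java
/** Hypothetical utility class, used only for illustration. */
final class TextUtil {

    // Private constructor: code outside this class cannot call `new TextUtil()`.
    private TextUtil() {
        throw new AssertionError("utility class, not meant to be instantiated");
    }

    /** Returns true if the text is null or contains only whitespace. */
    static boolean isBlank(String text) {
        return text == null || text.trim().isEmpty();
    }
}
```

Callers use the class purely through its static methods, for example `TextUtil.isBlank(" ")`. Without the explicit private constructor, Java would generate a default constructor and the class could be instantiated even though that serves no purpose.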
An important restriction in the design is that Checkstyle only checks a single file at a time and will thus not find issues that only become visible when looking at the bigger picture. This is an intentional design choice: it keeps the checks simple and the code maintainable, while remaining sufficiently powerful.
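To give a feeling for how simple a check can be when it only needs the contents of one file, here is a toy sketch of a line-length check. It is not Checkstyle's implementation and does not use Checkstyle's API; the class `LineLengthSketch` and its `check` method are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy single-file check; NOT Checkstyle's implementation or API. */
final class LineLengthSketch {

    private LineLengthSketch() {
    }

    /** Returns the 1-based numbers of all lines longer than maxLength characters. */
    static List<Integer> check(String fileContents, int maxLength) {
        List<Integer> violations = new ArrayList<>();
        // Limit -1 keeps trailing empty lines so that line numbers stay accurate.
        String[] lines = fileContents.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].length() > maxLength) {
                violations.add(i + 1);
            }
        }
        return violations;
    }
}
```

The method needs nothing but the text of the file under inspection, which is what makes this kind of check easy to write and to reason about. A question such as "is this method ever called from another class?" cannot be answered this way, because it requires knowledge of other files.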
Checkstyle includes two coherent sets of rules: Google’s style3 and Sun’s style4. These coding styles are directly applicable to general Java applications. However, it is important that if a developer or an organisation desires it, they can customize these rulesets, or even build one from scratch. Therefore, Checkstyle allows developers to easily configure their code style rules, as well as to overhaul their code style rules entirely.
Checkstyle is quite mature software. No big changes to its capabilities are planned. However, as changes to Java are introduced, Checkstyle needs to be updated to keep supporting the language. In addition, even in mature software, bugs can still be found and improvements can be made. These should be fixed or implemented relatively quickly.
Checkstyle and the World
We have identified several stakeholders for the Checkstyle project. First, there are the direct stakeholders. The developers, contributors, and users are the obvious ones. Developers and contributors benefit from Checkstyle being maintainable. Users benefit from the continued development of Checkstyle provided by the contributors. We have also identified developers of tools that integrate Checkstyle, such as the IntelliJ plugin5, and companies that donate to the project or propose bug bounties6.
For indirect stakeholders, users of software created with Checkstyle spring to mind. Because the code is more maintainable, technical debt is likely to be lower, and users of such software benefit from a faster development cycle. Developers who will work in the future on projects that use Checkstyle are also indirect stakeholders. They benefit from the project having used Checkstyle in the past, as the code they are working with is more readable for them.
Checkstyle is not the only tool that is capable of doing static analysis on code. On the contrary, there are many tools that perform similar tasks. For Java, notable projects are PMD, Spotless and SpotBugs, but static analysis for bugs and code style is not limited to Java. Notable projects for other languages include Prettier for JavaScript, and Flake8, pep8 and Pylint for Python. There are too many to provide an exhaustive list. Note, however, that these tools are not necessarily competitors, and often work well in conjunction with each other - we will see later in this text how Checkstyle itself uses many similar tools to enable even more extensive checking.
Ethical Considerations
Checkstyle is a tool that is relied upon by many. The IntelliJ IDEA plugin alone has 3.5 million downloads5. Since Checkstyle is only used during development, and not in deployed systems, the chance of a security vulnerability in Checkstyle affecting a deployed application is small. However, a crash or major bug could slow down development for many major Java frameworks and applications. If Checkstyle were to give false positives due to a bug, developers would spend time fixing non-issues, or the CI/CD of a project would stop working, both leading to a slowdown in development.
False negatives could lead to more technical debt in the future, as improper code could be deployed without Checkstyle noticing its flaws. While most bugs in Checkstyle would be minor annoyances for developers, some Checkstyle rules do have more critical functions. For example, the rule “FallThrough” checks whether each case in a switch statement ends with a break statement7. A missing break is a bug that can cause major issues and may not be easy to spot.
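To make the problem concrete, here is a minimal sketch of the kind of bug a rule like FallThrough is meant to catch. The class `FallThroughDemo` and its methods are hypothetical names invented for this example.

```java
/** Hypothetical example of a missing break in a switch statement. */
final class FallThroughDemo {

    private FallThroughDemo() {
    }

    /** Buggy version: the "warning" case has no break. */
    static int severityBuggy(String level) {
        int severity = 0;
        switch (level) {
            case "warning":
                severity = 1;
                // No break here: execution continues into the "error" case.
            case "error":
                severity = 2;
                break;
            default:
                severity = 0;
                break;
        }
        return severity;
    }

    /** Fixed version: every case ends with a break. */
    static int severityFixed(String level) {
        int severity = 0;
        switch (level) {
            case "warning":
                severity = 1;
                break;
            case "error":
                severity = 2;
                break;
            default:
                severity = 0;
                break;
        }
        return severity;
    }
}
```

Both versions compile without errors, but `severityBuggy("warning")` returns 2 instead of 1, because the value assigned in the first case is overwritten by the second. The difference between the two methods is a single line, which is why this mistake is easy to miss in a review.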
Checking Checkstyle
In order to be taken seriously, a project that checks code style needs to have a good code style and needs to be well maintained.
Firstly, pull requests are reviewed thoroughly. Furthermore, Checkstyle releases are only published approximately once per month8, meaning that if a pull request with a major bug is accepted, it will not automatically be deployed to all Java projects around the world that rely on it. Through this careful development cycle, Checkstyle is able to reduce the risks of releasing harmful code.
Checkstyle checks itself: during development, Checkstyle is run on the Checkstyle codebase to validate its code style. However, this approach is vulnerable to false negatives: a bug that makes a check miss a violation would also hide that violation in Checkstyle's own code. Checkstyle therefore runs an impressive number of other tools to make sure its codebase is up to date, up to specifications and up to standards.
| Tool | Purpose |
|---|---|
| Checkstyle | Static analysis |
| SonarQube | Static analysis |
| PMD | Static analysis |
| SpotBugs | Static analysis |
| XML plugin | Verify XML file format |
| Modernizer | Disallow legacy API calls |
| Forbidden API checker | Disallow specific API calls |
| JUnit | Testing |
| Failsafe | Integration testing |
| Pitest | Mutation testing |
| JaCoCo | Code coverage by tests |
All of these tools are executed automatically as part of continuous integration (CI). To ensure compatibility, the CI pipeline is run on Ubuntu, macOS, and Windows. Furthermore, various checks are executed on different JDK versions. The JaCoCo configuration requires 100% line coverage.
What stands out is that for most plugins, exceptions to rules are mentioned and clearly documented. If some files are excluded from a rule, there is a brief comment explaining why this is the case (e.g. the files are generated and shouldn’t be touched, or there is a specific reason why the default rules do not work on a specific file).
Before it is merged, any change to the codebase is tested on the codebase of an open source project. A Checkstyle tester project exists for this purpose9. It generates reports from before and after the potential change, so that reviewers can verify that no unintentional changes in behaviour occurred. Using GitHub Actions, such a report can be generated automatically. This report is shared in the pull request as part of the reviewing process.
From all of the above, it seems clear that the Checkstyle team values all projects that make use of their tool, and they take as many precautions as possible to prevent unintentional side effects.
[1] https://www.zdnet.com/article/programming-languages-python-developers-now-outnumber-java-ones/
[4] https://www.oracle.com/java/technologies/javase/codeconventions-contents.html
[5] https://plugins.jetbrains.com/plugin/1065-Checkstyle-idea
[7] https://Checkstyle.sourceforge.io/config_coding.html#FallThrough
[9] https://github.com/checkstyle/contribution/tree/master/checkstyle-tester#checkstyle-tester