DESOSA 2022

PMD - Product Vision

Putting PMD into Context

Imagine you are implementing an algorithm for an application in Java. The application runs, and you are satisfied with the output because it is exactly what you need. The application becomes popular on the Google Play store, and you decide to add functionality to attract more users. Some new colleagues are hired to work on the same project, but you find that they are slow to become familiar with the source code. You think this is normal, since the project consists of thousands of lines of code, and sometimes you also need some time to recall what a piece of code does. However, some new code written by your colleagues catches your attention. Though the file has more than three hundred lines of code, its documentation tells you immediately what each method is trying to achieve. The required parameters and outputs are explicit even though you have not run the code yet. You quickly realize that good documentation and coding style can improve how efficiently the team works.

You decide to make some improvements. However, it takes time to research better coding practices, and even more time to apply them to your project. Checking the code by hand and bringing it up to a higher standard is both time-consuming and exhausting. Luckily, you find a tool called PMD, a static source code analyzer. It finds common programming flaws such as too many nested if statements, unused variables, empty catch blocks, and so forth. An example of a bad coding habit is shown in the figure below: too many nested “if” statements.

Figure: Many nested if statements
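
To make the figure concrete, here is a minimal Java sketch of the pattern and one possible refactoring with guard clauses. The class and method names (NestedIfExample, accessNested, accessFlat) are invented for illustration; they are not taken from PMD or from the application described above.

```java
final class NestedIfExample {
    // Deeply nested version: every extra condition adds another level of indentation.
    static String accessNested(String user, boolean active, int age) {
        if (user != null) {
            if (!user.isEmpty()) {
                if (active) {
                    if (age >= 18) {
                        return "granted";
                    }
                }
            }
        }
        return "denied";
    }

    // Same behaviour, rewritten with guard clauses so the nesting depth stays at one.
    static String accessFlat(String user, boolean active, int age) {
        if (user == null || user.isEmpty()) {
            return "denied";
        }
        if (!active) {
            return "denied";
        }
        if (age < 18) {
            return "denied";
        }
        return "granted";
    }

    public static void main(String[] args) {
        System.out.println(accessNested("ada", true, 30));
        System.out.println(accessFlat("ada", true, 30));
    }
}
```

Both methods return the same result for every input; only the second one keeps each condition readable on its own.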

Overall, there are several scenarios for this tool: preventing security vulnerabilities (e.g., checking password complexity), preventing bad coding style, replacing lengthy and unclear code, and removing performance bottlenecks. All scenarios share the same ultimate goal: enforcing coding standards.

End-user mental model

We analyze PMD using a two-part end-user mental model.

The first part, titled “What the System Is”, discusses how users perceive PMD. PMD’s end users are programmers who need their code style validated. In terms of user interaction and system functionality, end users expect PMD to quickly check code style using predefined rules in their familiar working environment. It can be used from the command-line interface or from a common IDE. End users can install and use PMD on both Windows and Linux/Unix; they only need a Java JRE and a zip file archiver. PMD parses source files into abstract syntax trees (ASTs) using JavaCC and Antlr, then executes rules against them to discover violations. Rules can be written in Java or as XPath queries.
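
The parse-then-check idea can be sketched without any PMD classes. The example below is a toy model, not PMD's API: Node, findDeepIfs, and the node kinds "Method" and "If" are hypothetical names, and the tree is built by hand rather than by a JavaCC or Antlr parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ToyAstRule {
    /** A minimal tree node: a kind such as "If" or "Method", a source line, and children. */
    static final class Node {
        final String kind;
        final int line;
        final List<Node> children;

        Node(String kind, int line, Node... children) {
            this.kind = kind;
            this.line = line;
            this.children = Arrays.asList(children);
        }
    }

    /** Returns the line of every "If" node whose nesting depth is at least maxDepth. */
    static List<Integer> findDeepIfs(Node root, int maxDepth) {
        List<Integer> violations = new ArrayList<>();
        visit(root, 0, maxDepth, violations);
        return violations;
    }

    // Depth-first walk that tracks how many "If" nodes enclose the current node.
    private static void visit(Node node, int ifDepth, int maxDepth, List<Integer> out) {
        boolean isIf = node.kind.equals("If");
        int depth = isIf ? ifDepth + 1 : ifDepth;
        if (isIf && depth >= maxDepth) {
            out.add(node.line);
        }
        for (Node child : node.children) {
            visit(child, depth, maxDepth, out);
        }
    }

    public static void main(String[] args) {
        // Hand-built tree: three nested ifs (lines 2 to 4) and one top-level if (line 8).
        Node tree = new Node("Method", 1,
                new Node("If", 2,
                        new Node("If", 3,
                                new Node("If", 4))),
                new Node("If", 8));
        System.out.println(findDeepIfs(tree, 3)); // prints [4]
    }
}
```

The rule itself never looks at source text; it only inspects the tree, which is why the same rule logic can be reused for any file that parses into the same node kinds.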

The second part, “What the System Does”, is about how end users interact with PMD and thereby explains its functionality. Users can use PMD to check for possible problems in security, error-prone code, multithreading, design, coding style, documentation, and so on. Users can see the warning messages in the IDE and export them as HTML or CSV files. So far, PMD is mostly focused on Java and Apex, but it also supports a variety of other languages, including VM, XML, and XSL. PMD also includes CPD, a copy-paste detector. CPD finds duplicate code in C/C++, C#, Go, Java, JavaScript, and other languages.
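
As a rough illustration of what a copy-paste detector looks for, the following toy sketch reports windows of identical consecutive lines. It is a line-based simplification written for this text and makes no claim about how CPD is implemented; ToyDuplicateFinder and findDuplicates are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ToyDuplicateFinder {
    /**
     * Returns pairs {firstStart, laterStart} of 0-based line indices where the same
     * block of `window` consecutive lines (ignoring leading and trailing whitespace)
     * occurs again later in the input.
     */
    static List<int[]> findDuplicates(List<String> lines, int window) {
        Map<String, Integer> firstSeen = new HashMap<>();
        List<int[]> duplicates = new ArrayList<>();
        for (int start = 0; start + window <= lines.size(); start++) {
            // Build a key from the trimmed lines of the current window.
            StringBuilder key = new StringBuilder();
            for (int i = start; i < start + window; i++) {
                key.append(lines.get(i).trim()).append('\n');
            }
            Integer earlier = firstSeen.putIfAbsent(key.toString(), start);
            if (earlier != null) {
                duplicates.add(new int[] {earlier, start});
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "int a = read();",
                "a = a * 2;",
                "print(a);",
                "log();",
                "  int a = read();",
                "  a = a * 2;",
                "  print(a);");
        for (int[] pair : findDuplicates(lines, 3)) {
            System.out.println("block at line " + pair[0] + " repeats at line " + pair[1]);
        }
    }
}
```

A real detector would typically compare tokens rather than raw lines, so that renamed variables or reformatted code can still be matched; the sketch only shows the basic idea of indexing fixed-size windows.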

Main capabilities

In the previous sections we have discussed what PMD is used for. Now we will look into the main capabilities of PMD.

The user can run PMD either via the command line or via an IDE plugin. With the default settings, PMD generates a text report. If desired, PMD can instead generate the report in CSV, HTML, JSON, or XML. A sample run of PMD with the text format is shown below [1].

Figure: Sample Output

As seen in the example, the user provides PMD with a “ruleset”: an XML configuration file in which the user describes the collection of rules that PMD needs to execute in a run. PMD comes with built-in rulesets, but encourages users to configure their own rulesets to suit their own standards. Users can also configure the individual rules within these rulesets. The rules are divided into the following eight categories [2]:

  • Best practices
  • Code style
  • Design issues
  • Documentation
  • Error Prone
  • Multithreading issues
  • Performance issues
  • Security flaws

Based on the rules in the rulesets, PMD generates a code analysis report with the found violations. Finally, PMD also has a built-in tool to find duplicate code, called CPD (copy-paste detector).
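
The relationship between categories, rules, a custom ruleset, and the resulting report can be modelled in a few lines of Java. This is a conceptual sketch only: ToyRuleset, Rule, Violation, select, toText, and toCsv are hypothetical names, a real ruleset is an XML file rather than a Java list, and the text and CSV layouts are invented and do not reproduce PMD's real report formats. The three rule names are borrowed from PMD's Java rules purely as labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class ToyRuleset {
    /** The eight rule categories listed above. */
    enum Category {
        BEST_PRACTICES, CODE_STYLE, DESIGN, DOCUMENTATION,
        ERROR_PRONE, MULTITHREADING, PERFORMANCE, SECURITY
    }

    /** Hypothetical rule descriptor: a name plus the category it belongs to. */
    static final class Rule {
        final String name;
        final Category category;

        Rule(String name, Category category) {
            this.name = name;
            this.category = category;
        }
    }

    /** Hypothetical violation: where a rule fired and why. */
    static final class Violation {
        final String file;
        final int line;
        final Rule rule;
        final String message;

        Violation(String file, int line, Rule rule, String message) {
            this.file = file;
            this.line = line;
            this.rule = rule;
            this.message = message;
        }
    }

    /** Builds a custom "ruleset" by keeping only the rules from the chosen categories. */
    static List<Rule> select(List<Rule> available, Set<Category> wanted) {
        List<Rule> selected = new ArrayList<>();
        for (Rule rule : available) {
            if (wanted.contains(rule.category)) {
                selected.add(rule);
            }
        }
        return selected;
    }

    /** Renders one violation as a single human-readable line (invented layout). */
    static String toText(Violation v) {
        return v.file + ":" + v.line + ": " + v.rule.name + ": " + v.message;
    }

    /** Renders one violation as a CSV row with quoted text fields (invented layout). */
    static String toCsv(Violation v) {
        return String.join(",",
                quote(v.file), String.valueOf(v.line), quote(v.rule.name), quote(v.message));
    }

    // Wraps a field in double quotes and doubles any embedded quotes.
    private static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        List<Rule> available = Arrays.asList(
                new Rule("UnusedLocalVariable", Category.BEST_PRACTICES),
                new Rule("AvoidDeeplyNestedIfStmts", Category.DESIGN),
                new Rule("EmptyCatchBlock", Category.ERROR_PRONE));
        List<Rule> ruleset = select(available, EnumSet.of(Category.DESIGN, Category.ERROR_PRONE));
        Violation v = new Violation("Foo.java", 12, ruleset.get(0), "Deeply nested if statements");
        System.out.println(toText(v));
        System.out.println(toCsv(v));
    }
}
```

Keeping the violation data separate from its rendering is what makes it cheap to offer several report formats for the same analysis run.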

Context

The context of a system describes the elements, boundaries, interconnections, interactions, and environment of operation that define it [3]. The context of PMD is visualized in the figure below.

Figure: The context PMD operates in

  • Contributors: PMD is free to use and does not have any paid plans; instead, it relies on volunteers and donations. People who are willing to contribute financially can do so via GitHub or Open Collective. The development of PMD relies on developers who voluntarily contribute to the system. PMD is open to every developer who takes the time to contribute, as long as the contributor agrees to abide by the project’s terms.
  • Maintainers: The members of PMD are also the maintainers of the system. This means that each change to the system has to be reviewed by one of these members. The members of PMD are: Andreas Dangel, Juan Martín Sotuyo Dodero, Clément Fournier, Romain Pelisse, and Robert Sösemann.

  • End Users: The end users of PMD are developers who strive for clean and optimal code, without errors or design issues. These are mainly Java and Apex developers.

  • Requirements: To use PMD, a Java runtime is required, such as Java JRE, OpenJDK, or AdoptOpenJDK. Furthermore, since PMD is distributed as a zip archive, a zip archiver is also needed.
  • Integrations: PMD provides code analysis and copy-paste detection for many programming languages. It is recommended to run PMD via the command line. However, there are also IDE plugins for users who prefer to use PMD in their favorite IDE, and plugins for the build tools Maven and Gradle.

  • Competitors: PMD is not the only code analysis tool for the aforementioned programming languages. Other examples are Coverity, which focuses on security and quality defects and is free to use for open source projects, and FindBugs, an open source tool that uses static analysis to find bugs.

Stakeholders

As an open source software project, PMD sees steady improvement and attracts new contributors. The four categories of possible stakeholders of PMD are listed below.

  • PMD provides its static code analysis service to its End Users. Their stake is the time and loyalty they invest in this code checking tool. User expectations need to be met and should be considered when designing the software, through User Stories, Use Cases, and the End-User Cognitive Model.
  • The Management Team is responsible for the operations and structural design of the software. It maintains the project documentation and public relations to polish the product profile and attract new users. It also manages the code repositories and decides which contributions from developers are merged.
  • Customers, as distinct from end users, are the people who rely on PMD to benefit their business. For example, the Apex language team at Salesforce.com advertises the Apex module of PMD. Any further contribution to this module benefits Apex, as it gives them a better code checking tool.
  • Developers/Contributors are the main driving force behind the project's growth. They make PMD more powerful and are always appreciated by PMD. Developers are added as contributors via All Contributors.

Key quality attributes

The key quality attributes can be depicted in three dimensions. A circle map of these attributes is shown in the figure below.

Figure: Key Quality Attributes

We will now explain the key quality attributes in a bit more detail.

  • External Attributes:
    • Functionality is the most important quality of PMD. It covers correctness and completeness. Correctness requires the code analysis to offer effective feedback rather than false positives. Completeness requires as many errors or flaws as possible to be found.
    • Flexibility is a user-friendly quality. It allows users to run code checking in different settings, for example, checking only a subset of errors.
    • Serviceability allows users to report bugs to PMD and receive feedback to improve their experience.
  • Internal Attributes:
    • Feasibility allows developers to implement new functions and deliver them on time.
    • Modularity allows PMD to support different languages without their modules interfering with each other.
    • Portability considers the compatibility and performance on different operating systems.
    • Maintainability allows developers to easily improve PMD without frequent refactoring.
  • Meta Attributes:
    • Measurability provides standards for PMD developers and testers to evaluate their software.
    • Testability allows the developer team to monitor the performance of PMD, for example, through unit testing and acceptance testing.
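
One common way to put numbers on the correctness and completeness aspects of Functionality is to treat them as precision and recall. This mapping is an interpretation added here, not a metric taken from the PMD project; the class AnalysisQuality and the counts in main are made up for illustration.

```java
final class AnalysisQuality {
    /**
     * Share of reported violations that are real problems.
     * A low value means many false positives (poor correctness).
     */
    static double precision(int truePositives, int falsePositives) {
        int reported = truePositives + falsePositives;
        return reported == 0 ? 1.0 : (double) truePositives / reported;
    }

    /**
     * Share of real problems that were actually reported.
     * A low value means many missed flaws (poor completeness).
     */
    static double recall(int truePositives, int falseNegatives) {
        int actual = truePositives + falseNegatives;
        return actual == 0 ? 1.0 : (double) truePositives / actual;
    }

    public static void main(String[] args) {
        // Made-up counts: 40 correct reports, 10 false alarms, 20 missed flaws.
        System.out.println("precision = " + precision(40, 10));
        System.out.println("recall    = " + recall(40, 20));
    }
}
```

With these example counts, precision is 40 / 50 = 0.8 and recall is 40 / 60, roughly 0.67, which shows that an analyzer can be mostly right about what it reports while still missing a third of the actual flaws.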

PMD Project Roadmap

PMD’s value stems from being embedded into other projects in order to maintain quality over a project’s life cycle. As such, its own quality is heavily dependent on its external environment. As programming languages, notions of best practice, and project tools are created, updated, or phased out, PMD must be modified to reflect this. The exact steps for keeping PMD up to date with these external factors are reflected in the PMD project roadmap, namely its task tracker and the planned additions for the next major PMD update: PMD 7.

PMD 7 includes a variety of additions that reflect external changes: support for XPath 2.0, extended support for parsers written using Antlr, an update to JUnit 5, new languages (such as Kotlin), and a slew of updates to Java parsing and code rules to keep up with Java releases.

As PMD is also a project in itself, there are also planned changes to improve the PMD core. Perhaps most importantly: an extensive rework of the PMD API is planned for PMD 7, with limited backwards compatibility. Since PMD usage is typically facilitated through this API, introducing breaking API changes that will require re-integration for projects that use PMD is a major change. The given justification is that the PMD API has become bloated over years of development, and does not sufficiently separate API and implementation.

Ethical Concerns

It is no new phenomenon that the creation of a product is a subject of moral complexity. The resources involved in production, functionality, and users all factor into the discussion of ethical concerns. For PMD, ethical considerations stem from its purpose as a judgement system and its status as an open source project.

PMD functions by judging input based on subjective factors (best practice rules). In order for such a system to be ethical, it must fairly judge without bias for any group, be that a particular project, language, or something else. While the consequences of bias in PMD are relatively minor, it is conceivable that if two otherwise equal groups used PMD, but PMD more frequently produced false positives for one than the other to an extreme degree, it could create an unfair advantage.

Beyond PMD’s functionality, there is its status as open source. First, this means that the PMD project cannot take any actions to prevent its use in the development of unethical software. Second, it has been maintained for an extended duration (15 years), with contributors changing over time. Just as PMD must be updated to take into account new tools and programming languages, so too must it be updated for evolving ethics. A prime example of this is the PMD logo:

Figure: Old PMD Logo

While likely intended as a joke by the initial developers before PMD had achieved support and usage by large organizations, the imagery of a gun and the associated negative connotations can be seen as glorifying violence. Beyond political correctness, it is clear that this logo is not at all aligned with the ideals of the PMD project. This has led to an open call for a new logo and a public vote to decide on the result:

Figure: New PMD Logo

While the logo update is a very obvious example, it is emblematic of an ideal at the core of the PMD project: continuous self-reflection is key to long-term improvement.

References


  1. PMD. (2022). PMD 6.43.0 Documentation. Retrieved March 3, 2022, from https://pmd.github.io/pmd-6.43.0/

  2. PMD. (2020). PMD 6.43.0 Documentation. Retrieved April 2, 2022, from https://pmd.github.io/pmd-6.43.0/pmd_userdocs_making_rulesets.html

  3. Ron Ross, Michael McEvilley, and Janet Carrier Oren. (2016). Systems Security Engineering. Retrieved March 3, 2022.

PMD
Authors
Jerrit Eickhoff
Jason Qiu
Bailey Tjiong
Liang Zhang