DESOSA 2022

Essay 2

Architectural style used in Ghidra

Ghidra is a very large project with many features, so we focus on one of them: the Function Graph. The Function Graph displays the functions recovered from the binaries as code blocks in the GUI. The main architectural style used in this feature is the model-view-controller (MVC) architecture. This style is commonly used for developing user interfaces and divides the related program logic into three interconnected elements: the model, the view and the controller. The user interacts with the controller, which in turn manipulates the model. The model then updates the view, and the updated view is what the user sees.

That the Function Graph uses the model-view-controller architecture is visible in Ghidra’s code. The directory ghidra/Ghidra/Features/FunctionGraph/src/main/java/ghidra/app/plugin/core/functiongraph, which is the main directory for the Function Graph, contains a folder called mvc with Java classes named FGController.java (FG standing for FunctionGraph), FGModel.java and FGView.java. These correspond to the three components of the model-view-controller style.

Ghidra’s functiongraph folder which shows that it uses the MVC architectural style
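The interaction between the three MVC roles can be sketched in a few lines of Java. This is a minimal, self-contained illustration of the pattern and not Ghidra's code: the class names GraphModel, GraphView and GraphController are hypothetical and only mirror the roles played by FGModel, FGView and FGController.

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point: the "user" talks only to the controller.
class MvcSketch {
    public static void main(String[] args) {
        GraphModel model = new GraphModel();
        GraphView view = new GraphView(model);
        GraphController controller = new GraphController(model);
        controller.userAddsBlock("entry");
        controller.userAddsBlock("loop");
        System.out.println(view.getRendered());
    }
}

// Hypothetical model: holds the code blocks and notifies listeners on change.
class GraphModel {
    private final List<String> blocks = new ArrayList<>();
    private final List<Runnable> listeners = new ArrayList<>();

    void addListener(Runnable listener) {
        listeners.add(listener);
    }

    void addBlock(String name) {
        blocks.add(name);
        listeners.forEach(Runnable::run); // the model updates the view
    }

    List<String> getBlocks() {
        return new ArrayList<>(blocks);
    }
}

// Hypothetical view: re-renders itself whenever the model changes.
class GraphView {
    private final GraphModel model;
    private String rendered = "";

    GraphView(GraphModel model) {
        this.model = model;
        model.addListener(this::refresh);
    }

    private void refresh() {
        rendered = String.join(" -> ", model.getBlocks());
    }

    String getRendered() {
        return rendered;
    }
}

// Hypothetical controller: translates user actions into model changes.
class GraphController {
    private final GraphModel model;

    GraphController(GraphModel model) {
        this.model = model;
    }

    void userAddsBlock(String name) {
        model.addBlock(name);
    }
}
```

The view never receives calls from the controller directly; it only reacts to model changes, which is the dependency direction described in the paragraph above.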

Main execution environment of Ghidra (Containers view)

Ghidra is mostly written in Java, which makes it portable across platforms (Windows, Mac, and Linux). To deploy Ghidra, one must first install JDK 11 on the system that will run it. The Ghidra developers provide a download link for the official pre-built release. After downloading and extracting the release file, one can launch Ghidra with ./ghidraRun, or with ghidraRun.bat on Windows, as described in the Ghidra installation guide.

Unlike programs that ship with a traditional installer, Ghidra does not need to be installed. It is self-contained in the Ghidra folder that was extracted from the release file. This makes Ghidra very easy to uninstall: users simply delete the folder. There are also various unofficial Ghidra container images for those who would like to run Ghidra with, for example, Docker or Kubernetes.

Ghidra can be run in Graphical User Interface (GUI) mode, which is launched as described above. This is the traditional way for a single user to work with Ghidra. Ghidra can also be used collaboratively by multiple users through Ghidra Server. Each user launches and works on their own local copy of a Ghidra project, and changes can eventually be pushed to a common repository that contains all commits. Instead of the GUI mode, one can also run Ghidra in Headless (Batch) mode from the command line. Finally, Ghidra can be used in Single Jar Mode. The difference from the regular GUI mode is that Ghidra is packed into a single compressed jar file instead of a folder structure, at the expense of configuration options. This makes it easier to use from the command line for headless operation, or as a library in another Java application, as described in the Ghidra installation guide.

The most traditional way of using Ghidra: via the GUI

Components view: Structural decomposition into components

At a high level, the Ghidra codebase is organized into three main components: Framework, Features, and Processors.

The Framework component is responsible for the GUI and storage. Some of its main sub-components are the database (DB), Software Modeling, which represents a program, and Docking, which handles docking windows. The Features component consists of plugins and scripts that are used in software reverse engineering (SRE). It takes care of features like the Function Graph, Decompiler and Byte Viewer. The Processors component consists of definitions of processor architectures.

The main components of the Ghidra code base

Connectors view: Main types of connectors used between components / containers

Due to the enormous scale of the codebase and the lack of documentation, it turned out to be nearly impossible to see how all of these components interact. Smaller sub-components like the Function Graph depend on classes outside their own module, which makes it very hard to see how everything ultimately works together without guidance from documentation.

To illustrate the issue with an example: the Program component is part of the SoftwareModeling component, which in turn is part of the Framework component. Program itself consists of the sub-components shown in the figure below, each of which is made up of tens of Java classes.

The sub-components of the program component

Development view

We look at the different modules that Ghidra is composed of. The framework consists of the base GUI and storage. It is called a framework because it could also be used for purposes other than Software Reverse Engineering (SRE).

Furthermore, Ghidra comes with a debug mode that aids the development of its features. Not only does the debugger allow you to debug scripts, but it also allows you to inspect and debug any Ghidra component.

Another important part of the system is Ghidra extensions. These are essentially optional components that allow users to extend Ghidra according to their needs. We will now go over some of the different types of extensions. Analyzers extend the code analysis of programs. Filesystems provide support for archive files, which for example allows one to import a ZIP file without having to unzip it first. Plugins extend Ghidra in various ways by accessing the GUI and the event notification system. The GUI consists of three main windows, and these windows can talk to each other through plugins by producing and consuming events.
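The event-based communication between windows can be pictured with a small publish/subscribe sketch. This is an illustration under stated assumptions, not Ghidra's plugin API: ToolEventBus, LocationChanged, ListingWindow and GraphWindow are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Demo entry point: a click in one window moves the other window.
class EventSketch {
    public static void main(String[] args) {
        ToolEventBus bus = new ToolEventBus();
        ListingWindow listing = new ListingWindow(bus);
        GraphWindow graph = new GraphWindow(bus);
        listing.userClicks(0x401000L);
        System.out.println(Long.toHexString(graph.getShownAddress()));
    }
}

// Hypothetical event: "the cursor moved to this address".
class LocationChanged {
    final long address;

    LocationChanged(long address) {
        this.address = address;
    }
}

// Hypothetical bus that delivers each event to the consumers of its type.
class ToolEventBus {
    private final Map<Class<?>, List<Consumer<Object>>> consumers = new HashMap<>();

    <T> void register(Class<T> type, Consumer<T> consumer) {
        consumers.computeIfAbsent(type, k -> new ArrayList<>())
                 .add(event -> consumer.accept(type.cast(event)));
    }

    void fire(Object event) {
        for (Consumer<Object> c : consumers.getOrDefault(event.getClass(), List.of())) {
            c.accept(event);
        }
    }
}

// Producer: fires an event when the user clicks an address.
class ListingWindow {
    private final ToolEventBus bus;

    ListingWindow(ToolEventBus bus) {
        this.bus = bus;
    }

    void userClicks(long address) {
        bus.fire(new LocationChanged(address));
    }
}

// Consumer: follows the location announced by other windows.
class GraphWindow {
    private long shownAddress = -1;

    GraphWindow(ToolEventBus bus) {
        bus.register(LocationChanged.class, e -> shownAddress = e.address);
    }

    long getShownAddress() {
        return shownAddress;
    }
}
```

Neither window holds a reference to the other; both know only the bus and the event type, which is what makes such components optional and replaceable.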

Run-time view

To get an idea of the run-time behaviour of Ghidra, we inspect the following key scenario: a user writes their own script to analyse code. The user first creates a project and imports all the binaries they want to look at. In the Listing View, instructions can be added to the imported binaries. The Function Graph is another view of the same information, broken up by control flow instead of a linear flow of addresses. The decompiler takes the assembly code and decompiles it into a C-like representation. The Script Manager allows a user to automate certain tasks in the analysis of binaries.

The Script Manager window lists paths of various existing scripts that can be used and adapted according to one’s needs. To create a script, one creates a class that inherits from the GhidraScript class. GhidraScript declares an abstract run() method that each subclass must implement; it serves as the main function of the script.

To start applying transformations to the code under analysis, the script first needs to obtain the state from GhidraScript that represents the current cursor location. Once it has the address that corresponds to the current location, it can loop over the instructions and apply transformations to them.

A sample script that patches each byte with a NOP assembly opcode
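The structure of such a script (a subclass that does its work in run(), starting from the current cursor address) can be sketched without Ghidra on the classpath. The following is a simplified stand-in and not the Ghidra API: MiniScript plays the role of GhidraScript, a plain byte array plays the role of program memory, and 0x90 is assumed as the NOP opcode because that is its x86 encoding.

```java
// Demo entry point: patch everything from offset 1 onwards with NOPs.
class NopScriptSketch {
    public static void main(String[] args) {
        byte[] memory = {0x55, 0x48, (byte) 0x89, (byte) 0xE5, (byte) 0xC3};
        new NopPatchScript().execute(memory, 1);
        for (byte b : memory) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
    }
}

// Hypothetical stand-in for GhidraScript: it supplies the state and calls run().
abstract class MiniScript {
    protected byte[] memory;      // stand-in for the program's bytes
    protected int currentAddress; // stand-in for the cursor location

    final void execute(byte[] memory, int cursor) {
        this.memory = memory;
        this.currentAddress = cursor;
        run();
    }

    // Each concrete script must implement its own entry point.
    protected abstract void run();
}

// Concrete script: overwrites every byte from the cursor to the end with a NOP.
class NopPatchScript extends MiniScript {
    static final byte X86_NOP = (byte) 0x90;

    @Override
    protected void run() {
        for (int addr = currentAddress; addr < memory.length; addr++) {
            memory[addr] = X86_NOP;
        }
    }
}
```

The base class owns the setup and the subclass owns only run(), which is why a new script needs no knowledge of how it is launched.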

Architecture for Key Quality Attributes

As mentioned before, the key quality attributes are mainly configurability, education of non-experts, and remote collaboration.

Configurability is addressed with plugins. Ghidra provides plugin support, which allows developers to write their own plugins. In this way, users can extend the functionality of Ghidra based on the Ghidra plugin skeleton and contribute to the Ghidra community. The release version of Ghidra itself comes with many features in the form of plugins, and users can also download and install third-party Ghidra plugins from the Internet, such as ret-sync, gdbghidra and OOAnalyzer.

Education of non-experts is supported by the GUI and the documentation. Unlike some software applications that only support command line usage, Ghidra provides users with an informative GUI. Additionally, the maintainers of the Ghidra project have written a nearly 300-page Ghidra user guide that covers almost everything on how to use Ghidra. Because of this, Ghidra can continue to attract new users and form an open-source software community with thousands of members.

To facilitate collaboration, one of Ghidra’s greatest features, a corresponding architecture is needed. To this end, Ghidra’s architecture includes Ghidra Server, which can be deployed on localhost or on an independent server. The process of multi-user collaboration is similar to git: each user obtains a copy of the project from the Ghidra Server for independent work, and afterwards performs operations similar to commit, push and merge.
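The git-like workflow can be pictured with a toy model. This is not the Ghidra Server protocol: SharedRepository, WorkingCopy and the version counter are hypothetical, and the merge step is reduced to a version check that rejects a check-in made from an outdated copy.

```java
// Demo entry point: two users check out the same version and check in one after another.
class CheckInSketch {
    public static void main(String[] args) {
        SharedRepository repo = new SharedRepository("initial analysis");
        WorkingCopy alice = repo.checkOut();
        WorkingCopy bob = repo.checkOut();
        alice.content = "alice renamed a function";
        bob.content = "bob added a comment";
        System.out.println(repo.checkIn(alice)); // accepted: based on the latest version
        System.out.println(repo.checkIn(bob));   // rejected: bob's copy is outdated
    }
}

// Hypothetical local copy of the shared project.
class WorkingCopy {
    final int baseVersion; // repository version this copy was taken from
    String content;

    WorkingCopy(int baseVersion, String content) {
        this.baseVersion = baseVersion;
        this.content = content;
    }
}

// Hypothetical server-side repository holding the latest version.
class SharedRepository {
    private int version = 0;
    private String content;

    SharedRepository(String initialContent) {
        this.content = initialContent;
    }

    synchronized WorkingCopy checkOut() {
        return new WorkingCopy(version, content);
    }

    // Accepts the copy only if nobody else checked in since it was taken.
    synchronized boolean checkIn(WorkingCopy copy) {
        if (copy.baseVersion != version) {
            return false; // caller must update (merge) and try again
        }
        content = copy.content;
        version++;
        return true;
    }

    synchronized int getVersion() {
        return version;
    }

    synchronized String getContent() {
        return content;
    }
}
```

In this sketch the second user would check out again, reapply the change and retry; a real system would offer a merge instead of a plain rejection.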

Although the architecture described above is sufficient for its purpose and may appeal to users, some trade-offs exist. The plugin-based system might bring maintenance issues due to its complexity. For some professionals, the GUI may not feel as powerful as a console. A Ghidra Server deployed locally may also affect system performance. However, these trade-offs are acceptable given the impressive variety of unique features that Ghidra has to offer.

API Design Principles Applied

Several API design principles allow for more effective understanding, maintenance, and extension.

Clear inheritance structure and naming

  • Good API documentation. Ghidra is open source (released under the Apache License 2.0), and its developers (NSA) provide not only the source code of the project but also detailed documentation for Ghidra users, including Javadoc that describes all APIs. For each API, the documentation records the classes, interfaces, and methods of the package in which it is located.

  • Consistent and clear naming. Classes and interfaces follow good naming conventions, and users won’t find incomprehensible abbreviations and acronyms in the code. This also makes it clear which parent classes are inherited and which interfaces are implemented.

  • Effective decomposition. In Ghidra’s API design, the system is decomposed in a plugin-based, fine-grained way. The methods are simple and each serves a single purpose, and more than half of them are at most 10 lines long. As a result, the code is loosely coupled and easy to extend.

As a result, the developers of Ghidra provide good resources and high-quality code for beginners and users looking to improve their code.

Ghidra
Authors
Hakan Ilbas
Yingkai Song
Lola Dekhuijzen
Johannes Ijpma