DESOSA 2022

Essay 1

Ghidra’s goal

Ghidra is a free and open-source software reverse engineering suite developed by the National Security Agency (NSA). It provides a set of full-featured, high-end analysis tools that let users analyze compiled code, which is not readable by humans in its raw form. Ghidra makes binaries human-readable by disassembling and decompiling them; it can also assemble code, among many other capabilities. It supports a wide variety of processor instruction sets and executable formats, so it can be used for tasks ranging from analyzing ransomware that runs on Windows to inspecting firmware dumped from an Arduino board.

Ghidra is especially relevant because we live in an increasingly digital world. As digitalization grows, so does the potential damage from cyberattacks such as malware and ransomware, and recent years have seen a large number of malware attacks. Inspecting malware binaries is hard without a disassembler or decompiler, since binaries cannot easily be read by humans.

That is where Ghidra becomes very useful. Ghidra recovers an approximation of the original software design, which helps security teams understand how a piece of malware works. Once a security team knows how the malware works, it can remove the malware from all affected systems. Furthermore, the analysis can help anti-virus software detect future malware that performs similar instructions. Ghidra’s static analysis is also useful because it inspects the binary without executing it, so the malware does not get the chance to infect the analyst’s own system.

Domain concepts of Ghidra

Ghidra’s main domain is reverse engineering. Given a binary, it decompiles the machine code into C-like pseudocode. The result approximates the original source code and describes the same behavior, although details such as the original variable names and comments are generally not recovered. Reverse engineering software is useful for a range of purposes. The most important one is malware analysis, since we live in a very digital world in which it is very easy to get infected with malware.

Ghidra’s second domain is its support for plugins and scripting. Users can write their own plugins and scripts for Ghidra in Java or Python (via Jython). This makes Ghidra more extensible, since users can add functionality themselves, such as support for additional processor instruction sets. Scripts can also make the lives of security analysts easier through automation, for example by automatically naming all of the unidentified functions.
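
As a rough illustration of the kind of automation mentioned above, the sketch below renames functions that still carry an auto-generated name. It is plain Python over a hypothetical in-memory model: the `Function` class, the `suggest_name` heuristic, and the sample data are all assumptions made for this example, not Ghidra's scripting API. A real script would operate on the functions of the loaded program instead. The `FUN_` prefix mirrors the default names Ghidra gives to functions without symbols.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    """Hypothetical stand-in for a function found in a binary."""
    address: int
    name: str
    strings: list = field(default_factory=list)  # string literals the function references


def is_unidentified(func):
    # Assumption: unnamed functions carry a default "FUN_<address>" name.
    return func.name.startswith("FUN_")


def suggest_name(func):
    """Toy heuristic: derive a name from the first referenced string, if any."""
    if not func.strings:
        return None
    # Replace every non-alphanumeric character with a space, then split into words.
    words = "".join(c if c.isalnum() else " " for c in func.strings[0]).split()
    if not words:
        return None
    return "auto_" + "_".join(w.lower() for w in words[:3])


def rename_unidentified(functions):
    """Rename each unidentified function for which the heuristic has a suggestion."""
    renamed = {}
    for func in functions:
        if not is_unidentified(func):
            continue
        new_name = suggest_name(func)
        if new_name is not None:
            renamed[func.address] = (func.name, new_name)
            func.name = new_name
    return renamed


# Made-up sample data for the example.
functions = [
    Function(0x401000, "main"),
    Function(0x401050, "FUN_00401050", ["Encrypting file: %s"]),
    Function(0x4010A0, "FUN_004010a0"),
]
changes = rename_unidentified(functions)
for address, (old, new) in changes.items():
    print(f"{address:#x}: {old} -> {new}")
```

Functions that already have a meaningful name are left alone, and functions for which the heuristic has no suggestion keep their default name so that the analyst can still spot them.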

The third domain of Ghidra is its open-source nature. Since cyber security is a global issue, making Ghidra open-source and free for everyone helps combat it, and it encourages security analysts worldwide to improve the tool. Being free also makes decompilation and reverse engineering accessible to anyone, whereas Ghidra’s competitors mostly rely on yearly subscriptions that can be very expensive.

Main capabilities of Ghidra

Since Ghidra is a software reverse engineering (SRE) tool, its main features are a disassembler, a decompiler, and a debugger. The disassembler turns the binary into assembly, and the decompiler turns it into a C-like language.

Ghidra’s main interface showing the disassembler in the center and the decompiler on the right
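
To make the disassembly step concrete, here is a minimal sketch of a table-driven disassembler for a made-up two-byte instruction set. The opcodes, mnemonics, and register names are invented for this example and do not correspond to any real processor or to Ghidra's internals; the point is only to show how raw bytes map to readable assembly.

```python
# Hypothetical instruction set: every instruction is 2 bytes, opcode then operand byte.
OPCODES = {
    0x01: ("LOAD", "reg_imm"),  # operand: high nibble = register, low nibble = immediate
    0x02: ("ADD", "reg_reg"),   # operand: high nibble = destination, low nibble = source
    0x03: ("JMP", "addr"),      # operand: absolute byte offset
    0xFF: ("HALT", "none"),     # operand byte is ignored
}


def format_operand(kind, operand):
    """Render the operand byte according to the instruction's operand kind."""
    hi, lo = operand >> 4, operand & 0x0F
    if kind == "reg_imm":
        return f"r{hi}, {lo}"
    if kind == "reg_reg":
        return f"r{hi}, r{lo}"
    if kind == "addr":
        return f"0x{operand:02x}"
    return ""


def disassemble(code):
    """Return one line of assembly text per 2-byte instruction.

    A trailing odd byte, if any, is ignored.
    """
    lines = []
    for offset in range(0, len(code) - 1, 2):
        opcode, operand = code[offset], code[offset + 1]
        if opcode in OPCODES:
            mnemonic, kind = OPCODES[opcode]
            text = f"{mnemonic} {format_operand(kind, operand)}".rstrip()
        else:
            # Unknown opcode: emit the raw bytes instead of guessing.
            text = f".byte 0x{opcode:02x}, 0x{operand:02x}"
        lines.append(f"{offset:04x}: {text}")
    return lines


program = bytes([0x01, 0x15, 0x01, 0x27, 0x02, 0x12, 0xFF, 0x00])
listing = disassemble(program)
print("\n".join(listing))
```

Real instruction sets have variable-length encodings and far richer operand forms, but the idea is the same: a description of the encoding drives the translation from bytes to mnemonics.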

To aid the engineer, Ghidra offers several views, including a Hex Editor, Defined Strings, Defined Data Types, a Symbol Tree in which all symbols are grouped by namespace, and a Function Call Graph showing incoming and outgoing calls.

Ghidra’s Function Call Graph. Image obtained from [1].

Another important part of Ghidra is its rich set of search features, which make navigating the disassembled code easier.

Ghidra was also designed with scalability, extensibility, and collaboration in mind. It supports sharing projects (groups of binaries) and offers version control and version tracking. Extensibility is addressed by allowing engineers to write their own plugins and scripts.

Engineers can also add support for additional processor architectures using SLEIGH, Ghidra’s processor specification language, which is derived from SLED (Specification Language for Encoding and Decoding).

Current and future context

Ghidra has had an interesting journey. It was originally developed as an in-house tool for the NSA, in answer to the need for an all-in-one tool that supports disassembly, decompilation, and debugging, combined with collaboration features for dealing with increasingly large binaries. Over time Ghidra was adapted to handle firmware bundles and to work with more than one binary at a time. Ghidra was declassified in 2014, and between then and its release in 2019 the NSA worked on getting it ready to be open-sourced.

With the public release, the user and developer base extended from NSA employees to the public. Since the tool is now used by the public, it competes with other SRE tools such as IDA.

Stakeholders

Ghidra is an SRE tool aimed first and foremost at internal use by the NSA, which makes the NSA the main stakeholder. The system is vital to the NSA’s workflow for dealing with malware and investigating threats. While it is an open-source project available to everyone, its design is still mainly driven by the needs of the NSA and its research. Apart from its own research, the NSA has another use for Ghidra as an open-source effort: by letting the cyber security community use the software for free, it hopes to inspire young developers to become familiar with SRE practices and potentially apply for jobs at the NSA in the future.

Because of this educational purpose, software developers who are still learning are an important second stakeholder of Ghidra. For the system to benefit them, it needs to be well documented, so that even a complete beginner can become familiar with the system and with SRE practices.

Key Quality Attributes

To decide which key quality attributes Ghidra must meet, let us consider the main incentives behind the system. One of them is to create an SRE framework that can be used on a large scale and in collaborative settings. This makes performance and scalability key attributes: the system must still perform well enough under a high workload. Moreover, network storage needs to be available when multiple users work together on one project.

Another important quality attribute of Ghidra is its configurability. Ghidra is a library of plugins, each providing some specific functionality. The system is used both in learning environments and in larger SRE efforts. Because users can easily turn individual plugins on or off, the system can be adapted to fit different work environments.
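
The configurability described above can be pictured as a tool that only exposes the features of the plugins currently enabled. The following sketch is a simplified model under that assumption; the `Plugin` and `Tool` classes and the plugin names are hypothetical and are not taken from Ghidra's code base.

```python
class Plugin:
    """Hypothetical plugin: a name plus the single feature it provides."""

    def __init__(self, name, feature):
        self.name = name
        self.feature = feature


class Tool:
    """Toy model of a tool whose feature set is the sum of its enabled plugins."""

    def __init__(self, available):
        self._available = {p.name: p for p in available}
        self._enabled = set()

    def enable(self, name):
        if name not in self._available:
            raise KeyError(f"unknown plugin: {name}")
        self._enabled.add(name)

    def disable(self, name):
        # Disabling a plugin that is not enabled is a no-op.
        self._enabled.discard(name)

    def features(self):
        """Return the features of all enabled plugins in sorted order."""
        return sorted(self._available[n].feature for n in self._enabled)


# Made-up plugin names for the example.
available = [
    Plugin("DisassemblerView", "disassembly listing"),
    Plugin("DecompilerView", "C-like pseudocode"),
    Plugin("SharedProject", "multi-user projects"),
]

# A classroom setup might enable only the basic views ...
classroom = Tool(available)
classroom.enable("DisassemblerView")
classroom.enable("DecompilerView")

# ... while a team setup also turns on collaboration.
team = Tool(available)
for plugin in available:
    team.enable(plugin.name)

print(classroom.features())
print(team.features())
```

The same set of available plugins yields two differently shaped tools, which is the property that lets one code base serve both learning environments and larger SRE efforts.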

An important use case and motivation for Ghidra is teaching aspiring software engineers. The idea is that by taking programs apart, engineers gain insight into how the programs work. Consequently, the system must be easily usable by non-experts. To operate effectively as a learning tool, the user interface needs to be intuitive, and a well-documented guide through the system is essential.

Product Roadmap

Not applicable, since the Ghidra project maintainers have not published a roadmap.

Change History

Ghidra was officially released to the public in 2019 with version 9.0. At the time of writing, the current version is 10.1.2. The major changes are listed below:

  • 9.0: Ghidra was released to the public for the first time.
  • 9.1: Improvements to data types, system calls, processor specifications, iOS DYLD and Mach-O format support, Ghidra Server, import, the decompiler, and languages, plus bug fixes.
  • 9.2: Improvements to open-source-based graphing, the Java-based universal PDB reader/analyzer/loader, dynamic modules (OSGi model for scripting), the decompiler, performance, function identification, symbol demangling, processor models and specifications, and the dynamic analysis framework (debugger), plus bug fixes.
  • 10.0: Improvements to the debugger, user-defined compiler specification extensions, prototype class recovery from RTTI, the PDB symbol server, saved analysis options configuration, graphs, structure/union changes, Gradle, new processors, and the binary exporter, plus bug fixes.
  • 10.1: Improvements to distribution, the debugger, the decompiler, data types, Mach-O binary import, Android, performance, processors, and DWARF, plus bug fixes.

Ethical Considerations

Debates over whether reverse engineering is ethical have been going on since the technique’s inception, whether in software engineering or in the automotive, chemical, entertainment, or electronics industries. Some, led by manufacturers of engineered products, argue that reverse engineering is unethical because it infringes on a company’s intellectual property. If products that are only minor modifications of existing ones reach the market, or if producing replicas through reverse engineering becomes socially accepted, the possible consequence is a decline in innovation and invention. Why would companies invest a lot of time and money in research and development when competitors using reverse engineering only need to spend a small amount of money analyzing and reproducing existing products?

Proponents of reverse engineering argue that the technique is legal and ethical in most cases. Their main arguments are as follows. First, reverse engineering can promote interoperability, for example by enabling supporting products for existing software systems, which does not harm the rights and interests of the original company. Second, it can be used to check the safety of products, to ensure that software does not perform harmful or illegal activities. Third, during software development, reverse engineering can be used to analyze and evaluate the performance of a product, thereby improving the quality of the software system.

In the case of Ghidra, users need to ensure that they do not use the tool to perform illegal operations on software products, such as copying the functionality of the software under inspection for commercial purposes, or any other illegal activity.

Ghidra
Authors
Hakan Ilbas
Yingkai Song
Lola Dekhuijzen
Johannes Ijpma