DESOSA 2022

Podman - under the hood


Essay 2: the system’s architecture

In our earlier essay, we gave an overview of the product vision and the context in which Podman is used. This essay dives deeper into the system’s architecture and its design choices. We investigate what these decisions mean for the project based on various system views [1][2] and evaluate where improvements could be achieved. For readers who are unfamiliar with the world of containers, our first essay contains a list of key definitions.

Container view

Podman ships in two flavours of programs, podman and podman-remote, shown in the figure below. Podman comprises a set of containers (in the C4 sense of separately runnable units) that include the local Command Line Interface (CLI), the podman-remote client and the Application Programming Interface (API) server. The CLI includes all Podman commands needed to fully orchestrate pods and containers. The API server provides an interface for remote connections, allowing maintainers to send Podman commands remotely. This removes the need for a developer to log in to the server environment directly, which can be a security risk.

To connect to the server, the client must first exchange SSH keys with it; only then can the client program connect to the remote socket. The podman-remote program contains only the remote client functionality, providing a lightweight alternative for users who only access a remote server.

The API introduces an extra level of overhead: for every new feature or command, the API also requires adjustment. As a result, the CLI and the API can sometimes be out of sync.

Lastly, podman-compose is a Python implementation of a higher-level orchestration tool targeting multi-container compositions, like docker-compose, which it is compatible with. It calls Podman commands to reuse the functionality and adds extra layers for networking and storing volumes.

Figure: Podman’s container view.

Component view

Podman is a container orchestration tool that provides an interface to build, run, and manage containers and pods. Podman does not provide all this functionality on its own but instead relies on other tools that follow the Open Container Initiative (OCI) standards and act as standalone applications. We show an overview of these tools below.

Figure: Podman’s component view.

Podman uses the Buildah tool to build container images. Buildah focuses on providing an interactive CLI with which a user can build an image step by step. However, it also offers an automatic build interface for Dockerfiles and OCI Containerfiles.

To run the built containers in the kernel without using a daemon, Podman uses an OCI runtime tool, such as runc, which Docker also uses, or crun, a faster alternative with a lower memory footprint.

Skopeo is a tool for inspecting container images and registries, which Podman uses to retrieve information about its containers and images. Internally, the conmon tool is used to monitor the containers’ output during runtime.

The image and storage libraries hold the functionality for image orchestration and container storage management. Networking stack configurability is added in the newest release using the Netavark and Aardvark libraries.

The remaining features are implemented in libpod, the main Podman library. It orchestrates the aforementioned tools and provides additional functionality such as pods.

The modularity of the components lets users choose the tools they prefer. This is valuable, as configurability is highly desirable for developers. However, it adds complexity to the project as a whole in terms of dependencies and compatibility.

Podman as a tool

To understand how Podman’s architecture helps realise its vision, one must consider its standalone value - how can Podman fulfil its independent role in the larger environment, and how does it communicate with other components?

Connecting in the bigger picture

Podman must support several types of connections to users and external systems to provide the necessary security and versatility. At a macro level, this means that Podman must act as an interfacing tunnel: a user sends a command to the CLI, and Podman runs the related command on the host system.

Perhaps this requirement’s most crucial architectural aspect is how Podman uses runc to access its containers. While tools like Docker opt for a client-server model with a central daemon communicating with client containers, Podman instead connects to runc through a Fork/Exec model, relying on the Linux kernel’s audit framework for improved security and lighter-weight operation [3][4]. The connections between Podman, runc, and the host machine are visualised in the figure below [5].

Figure: Podman’s connectors to the usage environment.

This choice for a daemonless architecture adds some complexity, because workarounds are required to offer full compatibility with Docker. These are necessary because a daemon provides shared state and additional functionality, both of which the daemonless design gives up.
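The Fork/Exec model can be illustrated with a minimal sketch. This is not Podman’s code (Podman is written in Go and starts the runtime through conmon); the helper name `fork_exec` is hypothetical, and the child command merely stands in for an OCI runtime such as runc. The sketch assumes a Unix-like host, since it relies on `os.fork`.

```python
import os
import sys


def fork_exec(argv):
    """Fork a child, replace it with `argv`, and return the child's exit code.

    Hypothetical helper: the child stands in for an OCI runtime. No daemon is
    involved; the parent is the only process that tracks the child.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed; 127 is the conventional "command not found" code.
            os._exit(127)
    # Parent: block until the child terminates, then decode its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Killed by a signal: follow the shell convention of 128 + signal number.
    return 128 + os.WTERMSIG(status)


if __name__ == "__main__":
    # The "runtime" here is just a Python one-liner that exits with code 3.
    print(fork_exec([sys.executable, "-c", "raise SystemExit(3)"]))
```

Because the parent launches the child directly, the child inherits the identity of the user who started it, and no long-running privileged service has to hold shared state on its behalf.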

Connections at the lower levels

To illustrate the design choices inside Podman’s architecture, we present a typical use case scenario: a user wanting to use Podman by itself to run a container. The user simply runs a command in the terminal, and the underlying complexity is hidden behind various connections.

First, the user sends a command to the terminal or a remote server. Once arrived and authorised, their request is honoured by parsing the command line arguments and converting them to an internal representation or by handling the API request. Then, the request is forwarded to the part of the application that provides concrete container functionality.

Connections between code modules are established through language-level means such as method invocations or thread synchronisation. External dependencies also handle critical parts of the computation, such as storing the container image and logging. Though they are independent, these dependencies are connected to the various code modules through imports and method invocations.

Finally, once the command execution is finished, Podman returns a response, informing the user of the outcome of their request. The figure below provides a visualisation of this workflow.

Figure: Podman’s internal connectors.
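The workflow described in this section, in which a CLI invocation and an API request are both converted to one internal representation before reaching the container functionality, could be sketched as follows. All names here (`RunRequest`, `parse_cli`, `parse_api`, `run_container`, the program name `podman-sketch`, and the JSON keys) are hypothetical illustrations, not Podman’s actual types or payload format.

```python
import argparse
from dataclasses import dataclass, field


@dataclass
class RunRequest:
    """Hypothetical internal representation of a 'run' request."""
    image: str
    command: list = field(default_factory=list)
    detach: bool = False


def parse_cli(argv):
    """Convert command-line arguments into the internal representation."""
    parser = argparse.ArgumentParser(prog="podman-sketch")
    sub = parser.add_subparsers(dest="verb", required=True)
    run = sub.add_parser("run")
    run.add_argument("-d", "--detach", action="store_true")
    run.add_argument("image")
    # Everything after the image name is the command to run in the container.
    run.add_argument("command", nargs=argparse.REMAINDER)
    args = parser.parse_args(argv)
    return RunRequest(image=args.image, command=args.command, detach=args.detach)


def parse_api(payload):
    """Convert a decoded API payload (assumed keys) into the same representation."""
    return RunRequest(
        image=payload["image"],
        command=list(payload.get("command", [])),
        detach=bool(payload.get("detach", False)),
    )


def run_container(request):
    """Stand-in for the engine layer; a real engine would start a container."""
    mode = "detached" if request.detach else "attached"
    return f"{request.image} {mode} {' '.join(request.command)}".strip()


if __name__ == "__main__":
    print(run_container(parse_cli(["run", "-d", "alpine", "echo", "hi"])))
```

Both entry points produce the same `RunRequest`, so the code behind them does not need to know whether the request arrived from the terminal or over the network.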

Runtime view

Figure: Podman’s runtime view.

To explain what happens during the runtime of Podman, we consider the same use case as before: Podman running a container.

Podman first looks up and stores the information (UID/GID) of the host user who runs the podman run command, and generates a mount namespace for mounting the container storage. Then Podman pulls every layer of the image from the registry using the image library. Using the storage library, Podman decompresses and stores each downloaded layer in the predefined order and creates an overlay on top of its previous layers in a directory on the host system [6]. After that, Podman creates a new container and adds it to the database.
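The effect of stacking layers in order can be sketched with plain dictionaries standing in for directories. This is a simplified model, not the storage library’s implementation: the function name `apply_layers` and the sample paths are hypothetical, and the only convention borrowed from the OCI image format is that a file whose name starts with `.wh.` (a whiteout) marks a deletion of the correspondingly named entry in the lower layers.

```python
WHITEOUT_PREFIX = ".wh."  # OCI layers mark deletions with this file-name prefix


def apply_layers(layers):
    """Merge layers (lowest first) into a single view mapping path -> content."""
    merged = {}
    for layer in layers:
        # First pass: whiteouts hide entries that came from lower layers.
        for path in layer:
            directory, _, name = path.rpartition("/")
            if not name.startswith(WHITEOUT_PREFIX):
                continue
            target = name[len(WHITEOUT_PREFIX):]
            hidden = f"{directory}/{target}" if directory else target
            for existing in list(merged):
                # Remove the entry itself and, for directories, its children.
                if existing == hidden or existing.startswith(hidden + "/"):
                    del merged[existing]
        # Second pass: regular files in this layer shadow lower ones.
        for path, content in layer.items():
            if not path.rpartition("/")[2].startswith(WHITEOUT_PREFIX):
                merged[path] = content
    return merged


if __name__ == "__main__":
    base = {"etc/os-release": "alpine", "bin/sh": "busybox"}
    app = {"app/main.py": "print('hi')", "etc/.wh.os-release": ""}
    patch = {"bin/sh": "bash"}
    for path, content in sorted(apply_layers([base, app, patch]).items()):
        print(path, "->", content)
```

The order of the list matters: a later layer can replace or hide what an earlier layer provided, which is why the layers must be applied in the predefined order.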

After the container is created, Podman sets up the host network and simulates a Virtual Private Network (VPN) for the container. Next, Podman starts conmon, which launches the OCI runtime, which creates the process of the user’s command in the host kernel, and the process runs until it exits. During the life span of this process, conmon keeps streaming the STDOUT of the process and saves it to a log file for podman logs. After the process exits, the kernel sends a signal to conmon to indicate the process has ended. Finally, Podman waits for the exit of conmon, gets the exit code from the container, and then exits with that code.
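The monitoring role described above (stream the process output to a log file, then report the exit code) can be approximated in a few lines. This is only a sketch of the idea: conmon itself is a separate C program, and the function name `monitor`, the log file name, and the child command are hypothetical stand-ins.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def monitor(argv, log_path):
    """Run `argv`, copy its stdout to `log_path` line by line, return its exit code."""
    with open(log_path, "w", encoding="utf-8") as log:
        with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as child:
            # Stream output as it is produced, so the log is usable
            # while the process is still running.
            for line in child.stdout:
                log.write(line)
                log.flush()
            # The loop ends when the child closes stdout; collect its exit code.
            return child.wait()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_file = Path(tmp) / "ctr.log"
        # The "container process" is a Python one-liner that prints and exits.
        code = monitor([sys.executable, "-c", "print('hello')"], log_file)
        print("exit code:", code)
        print("log:", log_file.read_text(encoding="utf-8"), end="")
```

A caller that exits with the returned code mirrors the last step of the walk-through, where Podman exits with the container’s exit code.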

Podman uses different practices to run containers ‘rootless’ [7]. For example, mounting the overlay filesystem using the kernel driver (overlayfs) as a rootless user is currently not allowed, so Podman uses the containers/fuse-overlayfs library as a workaround. Podman also uses the slirp4netns library to enable unprivileged networking [8].

Development view

Contributors are free to self-select and report new detailed issues to the repository. When looking at the issues on GitHub, most of the effort is on fixing bugs in the current version of Podman (v4.0), adding new features and staying compatible with Docker functionality.

The development process follows a strict set of rules when contributing to the project through pull requests (PRs) [9]. Podman also has a sizeable automatic CI process that requires tests for the proposed changes and automatically updates dependencies. When contributors only add to the documentation or no tests are necessary, the pipeline can bypass checking.

Folder structure

Figure: Podman’s internal folder structure.

The Podman source code is divided into ten main directories. The figure above highlights the most critical ones. The first aspect to stand out is the clear separation of the CLI code and the ‘core’ functionalities. The cmd directory contains the implementations of the Podman commands, which use the Cobra library to build a Unix-like CLI.

As for the core functionalities, there are two main directories: libpod and pkg. libpod handles the runtime and interacts with Podman’s dependencies at a lower level, for example through calls to crun. pkg contains the functions called directly by the CLI and the REST API.

In addition, the project has the dependencies and vendor directories for external dependencies. The former contains scripts used for the internal analysis of Podman, such as storage analysis, while the latter contains fixed versions of the vendored dependencies. There are also the utils and cni directories, which contain internal analysis scripts and the implementation for the Container Network Interface (CNI) networking standard, respectively. Lastly, note that not all test files live in the test directory: unit tests are usually placed next to the file being tested. The test directory does contain the E2E, API, and system tests for Podman.

Podman API

Podman exposes a REST API, described with Swagger as its Interface Description Language. It lives in the pkg directory and communicates with the runtime. It contains a Docker-compatible API with seven resources and a libpod API with ten resources [10]. It enables Mac and Windows platforms to call the Podman service on remote Linux platforms [11]. The key design principles that the API is based on are standard compliance, consistency, and ease of understanding. These three principles help the API evolve alongside the (standard) CLI commands in a manner that is maintainable and easy to understand for the users.
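As an illustration of how a client might talk to the libpod part of this API, the sketch below sends a stateless request over a Unix domain socket using only the Python standard library. The versioned `/v4.0.0/libpod/...` path layout follows the API reference cited above [10]; the helper names (`UnixHTTPConnection`, `libpod_path`, `list_containers`) are hypothetical, and the socket path is an assumption that depends on how the Podman service was started on the host.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that uses a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        # The host name is a placeholder; routing happens via the socket path.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def libpod_path(resource, api_version="4.0.0"):
    """Build a versioned libpod endpoint path such as /v4.0.0/libpod/info."""
    return f"/v{api_version}/libpod/{resource.strip('/')}"


def list_containers(socket_path):
    """Issue one self-contained GET request and return (status, decoded JSON)."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", libpod_path("containers/json"))
        response = conn.getresponse()
        return response.status, json.loads(response.read())
    finally:
        conn.close()


if __name__ == "__main__":
    # Only prints the path; calling list_containers needs a running service.
    print(libpod_path("containers/json"))
```

Each call opens its own connection and carries everything the server needs, which matches the stateless request style the API aims for. Calling `list_containers` requires a running Podman API service listening on the given socket.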

The API documentation provides examples of usage scenarios, and has logical and consistent names for resources and query parameters [10]. The downside is that it is tricky to find the function handlers used for specific endpoints, as all handler files are placed in just two directories. There is no transparent mapping between the endpoints’ paths and the folder structure. Additionally, the developers could improve the error messages to the same level as the CLI versions.

The system has been designed to be reliable and available as it tries to handle errors gracefully and output responses quickly. Furthermore, since the environment handles scaling (as pods need to be updated and containers spun up or down), the system is also scalable. On top of that, it has an authentication module so that only specific users can perform libpod operations. It complies with REST standards as the requests are stateless.

Conclusion

Podman is built upon an ambitious vision of how container management should be done. With challenging goals like secure and lightweight container orchestration come equally demanding architectural challenges, which we observe on various levels of the project. These challenges are addressed through Podman’s main architectural style, which is based on individual processes that communicate through the Fork/Exec model rather than the more cumbersome and less flexible daemon-based approach. This style also introduces trade-offs between security and performance in certain applications, which we explore further in our final essay.

Podman offers flexible options for users to choose from, like podman-remote, podman-compose, or the standard podman installation. Zooming in, the incorporation of tools like Buildah and Skopeo makes it easy for the various components of the Podman ecosystem to focus on a specific aspect, thus ensuring a clear purpose and direction for each project. This is all glued together by the connectors that enable Podman to deliver complex features and a performant runtime experience through a standardised API. The project is continuously growing under the supervision of Red Hat developers and with the help of the open-source community, who together built a robust yet accessible development process. The extent to which the implementation of the architectural decomposition achieves the system’s quality and evolutionary targets is explained in the third essay.

References


  1. The C4 model for visualising software architecture. Retrieved March 9, 2022, from https://c4model.com

  2. arc42 Documentation. Retrieved March 9, 2022, from https://docs.arc42.org/home/

  3. Podman: A more secure way to run containers (2018). Retrieved March 8, 2022, from https://opensource.com/article/18/10/podman-more-secure-way-run-containers

  4. Audit framework (2022). Retrieved March 8, 2022, from https://wiki.archlinux.org/title/Audit_framework

  5. Why Red Hat is investing in CRI-O and Podman (2019). Retrieved March 1, 2022, from https://www.redhat.com/en/blog/why-red-hat-investing-cri-o-and-podman

  6. Heon, M., Walsh, D., & Scrivano, G. (2020, February 27). Retrieved March 10, 2022, from https://www.redhat.com/sysadmin/behind-scenes-podman

  7. Heon, M. (2019, September 11). https://www.redhat.com/sysadmin/rootless-podman

  8. README.md. Retrieved March 10, 2022, from https://github.com/rootless-containers/slirp4netns

  9. Heon, M. (2017, November 1). Podman/contributing.md. GitHub. Retrieved March 8, 2022, from https://github.com/containers/podman/blob/main/CONTRIBUTING.md

  10. Podman Community. (n.d.). Provides an API for the Libpod library (4.0.0). Reference. Retrieved March 10, 2022, from https://docs.podman.io/en/latest/_static/api.html

  11. Podman site. (2022). Retrieved February 16, 2022, from https://podman.io/whatis.html/

Podman
Authors
Calin Georgescu
Xueyuan Chen
Rover van der Noort
Krzysztof Baran