MetaMask - Scalability
In this final essay on the MetaMask wallet extension, we dive into the scalability concerns regarding the performance and usability of the app. For a short introduction to the project itself, we refer you to essay 1. In this essay, we first identify MetaMask’s scalability challenges and dimensions, after which we describe our attempt to find out how scalability can be improved along one specific dimension.
Scalability Challenges
Since MetaMask is a client-only app that connects directly to a decentralized ledger, there is no ‘backend’ within the MetaMask container that could cause scalability issues. In theory, as long as the blockchain itself remains scalable, operations between the wallet and the blockchain should remain scalable as well. Therefore, we have to approach the scalability of MetaMask from angles other than increasing the load on the blockchain.
Build time and scalability in dependencies
During our time familiarizing ourselves with the project, we noticed some issues that could be considered scalability issues. First of all, the build time of the project is frustratingly long: we measured it on an M1 MacBook Air and found that it took about 20 minutes. This seriously impacts developer efficiency, which in turn slows down product development as a whole. We found that the long build time is the result of a recent feature called LavaMoat, which protects MetaMask from so-called supply-chain attacks that can be launched from malicious dependencies. MetaMask uses over 250 dependencies, which explains the need for LavaMoat. On a personal note, we think that this high number of dependencies is a symptom of the fact that MetaMask has too much on its plate (see essay 3), which forced the implicit architectural decision to accept a larger attack surface. After all, dependencies replace auxiliary code that developers would otherwise have to write, which increases developer productivity and in turn lets MetaMask keep up better with external developments. Usually, high-security applications like MetaMask would instead try to do as much as possible internally in order to minimize the attack surface for these supply-chain attacks.
Other dimensions
Since MetaMask is client-only, we look for additional scalability challenges in the components that can grow in number within one container. To find plausible scenarios, we examined GitHub issues with the label “degraded performance”. Currently, there are eight issues with this label. A recent example is #13243, whose reporter experiences large delays while opening their MetaMask account. We note that this user has 48 wallets, with an additional 5 imported wallets. Other issues report that the application is very slow and that MetaMask causes 100% CPU usage. The reporting users note that besides having a high number of accounts, they have also done a lot of transactions (2000+) over time. Hence, we distill these performance issues into two scalability dimensions, namely:
- the number of transactions done in the past, and
- the number of wallets.
For the former scalability dimension, there have been several pull requests to accommodate users with large transaction histories, including #8363 and #9010. Although these pull requests largely solved the lag that users with many transactions were facing, their author notes that they are only temporary solutions, which is why the referenced issues are still open. He also notes that these issues should be permanently resolved with “yet-to-be-specified improvements to our background <=> UI communication” (see #8991). Since this scalability dimension has already been investigated thoroughly, we will instead focus on the scalability issues caused by a large number of wallets.
Quantitative scalability analysis
MetaMask allows users to add as many wallets as they like. For our scalability analysis, we therefore wanted to quantify what happens when a user has a number of wallets similar to that described in the aforementioned issues. We specifically look at the use case of adding an account.
In our experiment, we made slight modifications to the end-to-end (e2e) tests that MetaMask uses in its continuous integration pipeline. Instead of creating one account, the test now attempts to add 1000 accounts, and the time taken by each individual add is logged. The results below are quite shocking: adding the 80th wallet takes over 20 seconds. This is unacceptable from a user experience perspective, as the screen is frozen for that period of time. In the book ‘Visualization Analysis and Design’, Munzner (2015, p. 137) argues that simple tasks should never take longer than 10 seconds in order to keep user irritation from increasing.
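The measurement idea can be sketched in a few lines. This is a minimal Python illustration, not the actual e2e test, which drives the browser extension. The names `time_account_adds` and `first_slow_add` are hypothetical, and `add_account` is an assumed stand-in for whatever step adds one account in the test.

```python
import time
from typing import Callable, List, Optional


def time_account_adds(add_account: Callable[[int], None], count: int) -> List[float]:
    """Call add_account(i) for i = 1..count and return how long each call took, in seconds."""
    durations = []
    for index in range(1, count + 1):
        start = time.perf_counter()
        add_account(index)  # hypothetical stand-in for the e2e "add account" step
        durations.append(time.perf_counter() - start)
    return durations


def first_slow_add(durations: List[float], limit_seconds: float) -> Optional[int]:
    """Return the 1-based index of the first add slower than limit_seconds, or None."""
    for index, seconds in enumerate(durations, start=1):
        if seconds > limit_seconds:
            return index
    return None
```

Calling `first_slow_add(durations, 10.0)` on the logged durations would then report the first account for which the 10-second limit mentioned above is exceeded.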
The time complexity of adding n accounts becomes O(n^2), as you can see in the figures below.
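One way to check a growth claim like this against logged timings is to fit a straight line on a log-log scale: if the total time for n accounts grows like n^p, the slope of log(total time) against log(n) approximates p. The sketch below is a minimal pure-Python version of that check. The function names are hypothetical and it is not the analysis script used for the figures.

```python
import math
from typing import List, Sequence


def cumulative(durations: Sequence[float]) -> List[float]:
    """Turn per-add durations into the total time needed to add the first k accounts."""
    totals, running = [], 0.0
    for seconds in durations:
        running += seconds
        totals.append(running)
    return totals


def growth_exponent(sizes: Sequence[float], totals: Sequence[float]) -> float:
    """Least-squares slope of log(total) against log(size).

    A slope near 1 suggests linear total time, a slope near 2 quadratic total time.
    """
    xs = [math.log(size) for size in sizes]
    ys = [math.log(total) for total in totals]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    variance = sum((x - mean_x) ** 2 for x in xs)
    if variance == 0:
        raise ValueError("need at least two distinct sizes")
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return covariance / variance
```

If every add took the same time, the fitted exponent would be 1. An exponent close to 2 corresponds to per-add times that keep growing with the number of existing accounts.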
Responsible architectural decisions
Our initial hypothesis was that this O(n^2) complexity had to do with the scur-bip32 HDKey implementation, where the deriveChild(n) function is called to get the nth child of the keyring. After isolating this function call, however, we concluded that it runs in O(1).
Of course, from an engineering perspective, the addWallet() operation should take just as long for wallet number 1 as for wallet number 1000, especially now that we have ruled out the HD Keyring dependency.
After carefully inspecting and experimenting with the code, we found that the bottleneck does not lie in adding the account to the HD keyring. In fact, we ruled out all other functions in the logical control flow of the program by isolating each one. In the end, we were unable to find the exact lines of code that cause this behaviour.
Cause of our unsuccessful endeavour
Although the code for adding an account looks simple, a plethora of objects are subscribed to other objects and events, and these connections are not easily found at runtime. To illustrate, the MetaMask constructor hotspot we discussed in the previous essay has over a thousand lines of code, all dedicated to constructing objects and subscribing them to others. From issues like #13153 we know that MetaMask does not handle a large number of accounts well, and since the veteran developers have known about these issues for a long time, we can reasonably assume that the root cause is a lot more complex than we initially expected.
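To make concrete why subscriptions can hide this kind of cost, consider the toy model below. It is purely hypothetical: `ObservableStore`, `WorkCounter` and `add_accounts` are invented for illustration, they are not MetaMask code, and we are not claiming this is the root cause. The point is only that the add loop itself looks cheap, while a subscriber that walks over all accounts on every update turns n adds into roughly n^2/2 units of work.

```python
from typing import Callable, Dict, List


class ObservableStore:
    """Toy store that notifies every subscriber whenever its state is replaced."""

    def __init__(self) -> None:
        self._state: Dict[str, List[str]] = {"accounts": []}
        self._subscribers: List[Callable[[Dict[str, List[str]]], None]] = []

    def subscribe(self, callback: Callable[[Dict[str, List[str]]], None]) -> None:
        self._subscribers.append(callback)

    def update(self, accounts: List[str]) -> None:
        self._state = {"accounts": list(accounts)}
        for callback in self._subscribers:
            callback(self._state)


class WorkCounter:
    """Subscriber that touches every account on each update and counts those touches."""

    def __init__(self) -> None:
        self.units = 0

    def __call__(self, state: Dict[str, List[str]]) -> None:
        for _account in state["accounts"]:
            self.units += 1


def add_accounts(store: ObservableStore, count: int) -> None:
    """Add accounts one by one; each add triggers one store update."""
    accounts: List[str] = []
    for index in range(count):
        accounts.append(f"account-{index}")
        store.update(accounts)
```

With one `WorkCounter` subscribed, adding 100 accounts results in 1 + 2 + ... + 100 = 5050 units of subscriber work, even though nothing in `add_accounts` hints at that growth.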
Lessons learned
Ironically, this unsuccessful drudgery of digging around and testing has given us a clue about how indirect architectural decisions affect MetaMask’s scalability. First of all, the architectural decision to use these subscriptions makes it quite difficult to debug the system’s behaviour on certain events. However, it does enhance code quality by decreasing coupling, which is why we do not propose to change it. Instead, we are convinced that the true culprit is the lack of scalability tests. As noted earlier, we had to modify an e2e test to add multiple accounts, because no existing e2e test covered that case. We have seen that this scenario is entirely plausible, and the lack of proper e2e tests that accommodate users with multiple accounts is therefore the reason that performance issues crept into the codebase.
Architectural Changes
First and foremost, we therefore propose adding e2e tests with a large number of accounts (and other extreme quantities, like a large number of transactions) for the basic use cases of MetaMask. Given the CI pipeline discussed in essay 3, pull requests that cause performance issues would then be caught before the code enters the codebase. As a result, we expect these e2e tests to (implicitly) greatly decrease performance issues.
Furthermore, we can look specifically at the scenario we have analysed. Besides improving the efficiency of adding an account, MetaMask can at least make some quick and easy changes to improve the user experience. This can be done in various ways:
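The core assertion of such a scalability test could look roughly like the sketch below. It assumes the test already has a list of per-add durations in seconds, however they were collected. The function name, the window size and the allowed ratio are all hypothetical choices that would need tuning against real CI noise.

```python
from statistics import median
from typing import Sequence


def per_add_time_is_flat(
    durations: Sequence[float], window: int = 10, max_ratio: float = 3.0
) -> bool:
    """Check that late adds are not much slower than early adds.

    Compares the median duration of the last `window` adds with the median
    of the first `window` adds. Medians are used so that a single noisy
    measurement does not fail the check.
    """
    if len(durations) < 2 * window:
        raise ValueError("need at least 2 * window measurements")
    early = median(durations[:window])
    late = median(durations[-window:])
    return late <= max_ratio * early
```

A CI test would add a large number of accounts, collect the durations, and fail the build when this check returns False. Comparing a ratio rather than an absolute time keeps the check less sensitive to how fast the CI machine happens to be.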
- Add a loading indicator. As our results show, the loading time is quite predictable.
- Create the account in the background. This would allow the UI to remain responsive and let the user continue to perform other actions. However, we note that this is a complex solution, and the effort needed for this temporary fix might exceed the time needed to find the root cause of the inefficiencies.
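The two options above can be combined: start the slow operation in the background and show the loading indicator while it is pending. The sketch below illustrates the pattern in Python with a worker thread. It is not how a browser extension would implement it, and `AccountCreator` and `create_account` are hypothetical names introduced only for this example.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Optional


class AccountCreator:
    """Runs a slow account-creation callable off the calling thread."""

    def __init__(self, create_account: Callable[[], str]) -> None:
        self._create_account = create_account  # hypothetical slow operation
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._pending: Optional[Future] = None

    def start(self) -> None:
        """Kick off account creation without blocking the caller."""
        self._pending = self._executor.submit(self._create_account)

    def is_loading(self) -> bool:
        """True while creation is in progress; a UI would show its indicator then."""
        return self._pending is not None and not self._pending.done()

    def result(self) -> str:
        """Block until creation has finished and return the new account."""
        if self._pending is None:
            raise RuntimeError("start() has not been called")
        return self._pending.result()

    def close(self) -> None:
        """Release the worker thread."""
        self._executor.shutdown()
```

The caller stays free to handle other actions between `start()` and `result()`, polling `is_loading()` to decide whether the indicator should be visible.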
Besides this, we feel that users should be warned about the performance issues that will arise when they create this many wallets.
We believe these changes to be rather straightforward, so we will refrain from visualizing these changes in diagram notation. Fortunately, you can find an overview of the architecture in essay 2.
Analysis
To reiterate, we propose more e2e tests and, for performance issues that cannot be fixed quickly, improvements to the user experience such as loading indicators or background processing. Naturally, e2e tests will prevent further performance issues, because PRs that decrease performance are more likely to be caught in the continuous integration pipeline. Of course, the user experience changes do not address the identified scalability issues, but they do make fixing the inefficiencies in the code less pressing, because their impact on usability is mitigated.
Conclusion
This concludes our final essay on MetaMask. We first identified a scalability issue in the number of dependencies, which, through a necessary supply-chain firewall, has notably increased the build time of the extension. We then identified two problematic scalability dimensions that cause performance issues for users. Next, we looked at a specific use case, adding an account, and tried to find the root cause of the problem. Although we could not pinpoint the exact code causing the issues, we did realize that MetaMask could have prevented this with suitable e2e tests. Until these issues are fixed, we propose several UX improvements to mitigate their impact.