Amazon’s EC2 M6g instances, powered by ARM-based AWS Graviton2 processors, have been available for a while. Amazon claims that M6g delivers up to 40% better price performance than the current-generation M5 instances. A straight hourly price comparison of m6g.2xlarge against m5n.2xlarge in the Tokyo region shows a reduction of about 35%.
(0.612 - 0.396) / 0.612 => 0.3529
So it’s really confusing how the 40% better price performance is calculated, and this comparison doesn’t even account for the potential performance gap between a single ARM core and a normal x86 core. Still, I decided to run some real tests; there might be miracles.
I tested with the Mixin Kernel running as an archive node. Mixin Kernel is a distributed ledger, and a node performs many cryptographic verifications, which makes it a pretty good CPU performance test: the node uses all cores for parallel computation and a lot of RAM for caching. The node’s persistence layer is built on BadgerDB, a key-value database that relies heavily on the properties of SSDs to speed up reads and writes.
I spun up an M6g instance with 8 vCPUs and 30 GB of memory. The processor flags are simple. Good.
There is no default ARM build in the Mixin releases, so I needed to build it myself on the server. Golang is king: I downloaded the Go arm64 build, installed build-essential because the zstd package uses cgo, and then built mixin successfully with no hurdles. Then I started the mixin kernel to sync the full graph, and for a direct comparison I also launched a fresh m5n.2xlarge instance.
The result was very frustrating: after 3 days, the ARM instance had lagged behind by about 50% and would never catch up with the newly finalized transactions. I haven’t had time to profile the exact performance bottlenecks yet. I also launched m6g.4xlarge instances to test, and none of them ever synced up.
That changed recently, when I updated the mixin code to the latest master, which includes some performance-related commits. I was shocked that an m6g.2xlarge is capable of syncing millions of transactions in 20 hours and then stays in sync with the graph. I made the same update on an m5n.2xlarge server; it still performs better than the ARM server, but only slightly, finishing about 30 minutes ahead when syncing about 3 million transactions.
Of course that’s not 40% better price performance, but it’s somewhat competitive.