Amazon Stock: The Graviton3 Is Here (NASDAQ:AMZN)

AWS re:Invent 2021

Noah Berger

I wrote about Amazon’s (NASDAQ:AMZN) coming Graviton3 CPU 1 year and 9 months ago. The Graviton3 was announced somewhat later than I expected (November 2021, versus my July 2021 estimate). As of May 23, the Graviton3 has entered commercial deployment.

Let’s see what observations we can make about this new step by Amazon in deploying its own silicon.

First, remember, Amazon had already been greatly expanding the service coverage on Graviton2 instances. The Graviton2 was already quite competitive by itself. I, however, expected a large jump in performance for the Graviton3, since it would be based on a 2-generation ARM IP step versus the Graviton2.

Also, one should understand that all this analysis is from the perspective of a cloud user (buying cloud services from Amazon). The Graviton3 isn’t available outside of Amazon’s AWS, though its existence does give us an idea of what might happen with other third-party server CPUs using equivalent ARM IP.

A Reason For Optimism

ARM, when presenting its Neoverse V1 IP, on which Graviton3 is based, claimed up to 48% performance gains. This was in line with my own predictions.

Also, ARM showed how it expected its coming IP to perform and what challenges it would address:

ARM Server CPU Roadmap (ARM)

Again, I should reinforce, like I did in my Graviton2 article, that there’s something massively favoring ARM-based cloud offerings:

  • Amazon, like other cloud providers, prices its offerings per virtual core. A virtual core on an ARM instance is a physical core. A virtual core on an x86 instance is a thread in a multi-threaded core (there are two threads per physical core).
  • Due to this, an x86 instance will always tend to underperform an ARM instance unless the other thread happens to be idle. Under optimal circumstances, hyperthreading can improve a core’s throughput by 35-50% when handling 2 threads (versus the same 2 threads on a core with hyperthreading turned off). Hence, a physical ARM core with just 75% of the single-threaded performance of an x86 core would already tend to match the per-thread performance of a fully loaded x86 core.
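To make the arithmetic behind that 75% figure explicit, here is a minimal sketch. It assumes a simplified model in which two busy threads split a hyperthreaded core's total throughput evenly; the function name is purely illustrative.

```python
def per_thread_throughput(smt_gain: float) -> float:
    """Throughput of each of two busy threads sharing one hyperthreaded core,
    relative to a single thread running alone on that core (= 1.0).

    Assumption: the core's total throughput rises by `smt_gain` when both
    threads are busy, and the two threads split that total evenly.
    """
    return (1.0 + smt_gain) / 2.0


# The 35-50% range is the hyperthreading gain quoted in the text.
for gain in (0.35, 0.50):
    share = per_thread_throughput(gain)
    print(f"Hyperthreading gain {gain:.0%}: each vCPU gets {share:.3f} of a full core")
```

Under this model, even the best-case 50% gain leaves each x86 virtual core with 0.75 of a full core, which is where the 75% break-even figure for a physical ARM core comes from. At a 35% gain, the x86 virtual core falls further behind.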

This was already true for the Graviton2, based on the ARM Neoverse N1 core. Of course, Intel (INTC) and AMD (AMD) didn’t sit still, and new Intel and AMD-based instances were again putting pressure on the Graviton2 from a performance perspective. Hence, it was all the more important to see how the Graviton3 would respond. But we could be optimistic, because the promised performance jump was quite large.

The Performance Jump

In its press release declaring commercial availability for Graviton3-powered instances, Amazon makes several performance gain claims:

  • New C7g instances powered by next generation AWS Graviton3 processors provide up to 25% better performance for compute-intensive applications over current generation C6g instances
  • Compared to previous generation AWS Graviton2 processors, AWS Graviton3 processors deliver up to 2x faster performance for cryptographic workloads
  • Up to 3x faster performance for machine learning inference
  • Nearly 2x higher floating point performance for scientific, machine learning, and media encoding workloads.
  • AWS Graviton3 processors are also more energy efficient, using up to 60% less energy for the same performance than comparable EC2 instances.
  • C7g instances are the first in the cloud to feature the latest DDR5 memory, which provides 50% higher memory bandwidth than AWS Graviton2-based instances to improve the performance of memory-intensive scientific applications like computational fluid dynamics, geoscientific simulations, and seismic processing.
  • C7g instances also deliver 20% higher networking bandwidth than C6g instances for network intensive applications like network load balancing and data analytics.

Overall, I had expected a 50-80% performance gain: 50% if Amazon kept the same frequency level (optimizing for efficiency), or 80% if it used all the frequency headroom. Clearly, Amazon went for efficiency, quite probably because the performance gains were already large enough to let it optimize for operating costs.

Indeed, in a thorough benchmarking exercise, and using a geometric mean of many different benchmarks, Phoronix.com came to a roughly 42% performance uplift from equivalent Graviton2-powered instances:

When taking the geometric mean across all the raw performance benchmarks carried out on both the Graviton2 and Graviton3 instances, the c7g.4xlarge came out to being about 42% faster than the c6g.4xlarge instance type.

Thus, the Graviton3 seems to be on point. This evolutionary step taken by Amazon with its own silicon didn’t disappoint.
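For readers unfamiliar with why benchmark suites are usually summarized with a geometric mean, the sketch below compares it with a plain arithmetic average. The speedup values are hypothetical and chosen only for illustration; they are not Phoronix's measurements.

```python
from statistics import geometric_mean, mean

# HYPOTHETICAL per-benchmark speedups, expressed as ratios (1.40 = 40% faster).
# Illustrative values only, not Phoronix's data.
speedups = [1.20, 1.35, 1.40, 1.50, 1.65]


def summarize(ratios):
    """Return both averages so they can be compared side by side."""
    return {"geometric": geometric_mean(ratios), "arithmetic": mean(ratios)}


baseline = summarize(speedups)
# Add one extreme (also hypothetical) result to see how each average reacts.
with_outlier = summarize(speedups + [6.00])

for label, s in (("baseline", baseline), ("with outlier", with_outlier)):
    print(f"{label:>12}: geometric {s['geometric']:.2f}x, arithmetic {s['arithmetic']:.2f}x")
```

The geometric mean moves less than the arithmetic mean when a single extreme result is added, which is why it is the usual choice for averaging performance ratios. It is still not immune to outliers.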

However, my previous (now old) Graviton2 comparison to Intel and AMD-powered solutions was based on older server technology. So how competitive is the Graviton3 versus the current higher-powered Intel and AMD solutions available at Amazon?

Phoronix provided an answer to this as well. A few days after comparing the Graviton3 to the previous Graviton2, it ran a second exercise, pitting the Graviton3 against equivalent Amazon x86 instances:

  • c6a.4xlarge – The AMD EPYC 7003 “Milan” instance type powered by an AMD EPYC 7R13 processor. The c6a.4xlarge instance was priced on-demand at $0.612 USD per hour.
  • c6i.4xlarge – The Intel Xeon Scalable “Ice Lake” instance type using a Xeon Platinum 8375C processor. The c6i.4xlarge instance type was priced on-demand at $0.68 USD per hour.

That is, these are the most modern x86 instances available and are directly comparable to the one using Graviton3:

  • c7g.4xlarge – The new Graviton3 instance type with Neoverse-V1 cores. The c7g.4xlarge on-demand pricing is currently at $0.58 USD per hour.

We can also already see the different pricing, with Graviton3 instances at a 15% discount to Intel instances, and a 5% discount to AMD instances.

So, what did Phoronix observe when benchmarking these instances? It found that:

  • The Graviton3 instance won most often. In nearly half of all the benchmarks it was the fastest of the three, even though it was also the cheapest.

Graviton3 compared to x86: number of wins (Phoronix.com)

  • Using a geometric mean of all tests, the Graviton3 was also fastest. Of note, the previous conclusion is the more important one: it tells us that the Graviton3 instance will be the best choice across more types of jobs. This second observation is more affected by outliers (when Intel wins, it often wins by a lot, for instance when AVX512 is used).

Graviton3 compared to x86: geometric mean performance comparison (Phoronix.com)

Most importantly, these are observations on pure performance alone. Since the Graviton3 instances are roughly 5% to 15% cheaper than the x86 instances, it could still make sense to use the Graviton3 even where it shows a slight performance deficit.
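How large a performance deficit would still leave the Graviton3 ahead on cost? The sketch below works out the break-even point from the on-demand prices quoted above. It assumes the only thing that matters is work done per dollar, and the function name is illustrative.

```python
# On-demand prices in USD per hour, as quoted in the Phoronix comparison above.
GRAVITON3_PRICE = 0.58
X86_PRICES = {"c6a.4xlarge (AMD)": 0.612, "c6i.4xlarge (Intel)": 0.68}


def break_even_performance(graviton_price: float, x86_price: float) -> float:
    """Fraction of the x86 instance's performance at which the Graviton3
    instance delivers the same work per dollar (assumes cost is purely hourly)."""
    return graviton_price / x86_price


for name, price in X86_PRICES.items():
    ratio = break_even_performance(GRAVITON3_PRICE, price)
    print(f"vs {name}: break-even at {ratio:.1%} of x86 performance "
          f"(up to {1 - ratio:.1%} slower)")
```

At these prices, the Graviton3 instance could be about 5% slower than the AMD instance, or nearly 15% slower than the Intel instance, and still match it on work per dollar.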

Conclusion

Once again, the Graviton3 puts Amazon’s own ARM-powered instances ahead of its x86 offerings. This happened even though Amazon seems to have favored efficiency over outright performance (just like it did with Graviton2).

Amazon talks about using up to 60% less energy for the same performance. This is good for Amazon, but of course for someone renting Graviton3 instances, only the actual price matters. In any case, the price remains lower than for x86 instances, so we can gather that Amazon is passing through at least some of the energy savings, even while providing much improved performance.

It is likely that if Graviton3 continues gaining share within Amazon workloads, and equivalent offerings do the same at Google (GOOGL) (GOOG), Microsoft (MSFT) and other hyperscalers, this will result in lower demand for Intel and AMD server chips. This is part of an ongoing risk for Intel and AMD.

For Amazon, having Graviton3 not only allows it to deploy a potentially lower-cost, higher-performance solution, but also lets it decouple its cloud economics from those of simply buying hardware from Intel and AMD like everyone else. Graviton3 also gives Amazon more bargaining power when dealing directly with Intel and AMD, potentially lowering its capex for a given amount of computing from these providers. That is a further advantage over other cloud providers that don’t yet have a Graviton3 equivalent.

Finally, though, both Intel and AMD seem close to fielding a new chip generation for the server room: Intel with Sapphire Rapids, and AMD with its Zen 4-based Genoa EPYC CPU series.

Intel’s Sapphire Rapids seems set to bring a large gain in single-threaded performance, which would look good in the comparisons we just saw. AMD’s Genoa would mostly shine on core count, while its IPC gains look limited (and thus wouldn’t completely cure the deficit seen versus the Graviton3).

It should be noted that AMD instances are starting to be quite a bit cheaper than Intel instances because of the large core count advantages. AMD’s Genoa might well bring Amazon’s AMD-powered instances close to Graviton3 pricing, which could itself lead Amazon to make the Graviton3 instances cheaper.

In any case, the Graviton3 is yet another step towards problems for Intel and AMD in the server room. Also worth noting, the Graviton3 is again based on ARM technology that is already 2 generations old. We should soon see an ARM announcement for the next step in server-class cores. That said, ARM has been a bit slow with its performance gains of late.
