Dissecting Intel's EPYC Benchmarks: Performance Through the Lens of Competitive Analysis
by Johan De Gelas & Ian Cutress on November 28, 2017 9:00 AM EST
Although the AMD EPYC is definitely a worthy contender in the server space, AMD's technical marketing of the new CPU has been surprisingly absent, as the company has not published any real server benchmarks. The only benchmarks published were SPEC CPU and Stream, with AMD preferring that its partners and third parties promote performance. And, as our long-time readers know, while the SPEC CPU benchmarks have their merits and many people value them, they are a very poor proxy for most server workloads.
At every launch, we expect companies to offer an element of competitive analysis, often to show how their platform is as good as or better than the rest. At the launch of Intel's latest Xeon-SP platform, analysis against EPYC was limited to a high level, as EPYC systems were not as freely available as expected. AMD was able to run comparisons against Broadwell-E at the time of the EPYC announcement because it was out and available. Intel wasn't able to do the same with EPYC because AMD was still several months away from moving it out of a cloud-only ramp-up program. This is partly the effect of AMD's server market implementation and announcement roadmap, although it didn't stop Intel from hypothesising about the performance deficits in ways that caught the attention of a number of online media.
Throughout all of this, AMD could not resist continuing to tell the world that the "EPYC SoC Sets World Records on SPEC CPU Benchmarks". In the highly profitable field that is server hardware, this could not be left unanswered by Intel, which responded that the Intel Xeon Scalable has great "momentum" with no fewer than 110 performance records to date.
Jumping to the present time, in order to prove Xeon-SP dominance over the competition, Intel's data center engineering group has been able to obtain a few EPYC systems and has started benchmarking. This benchmarking, along with justifications of third-party verification, was distributed to the small set of Xeon-SP launch reviewers as a guide, to follow up on that high-level discussion some time ago. The Intel benchmarking document we received had a good amount of detail, however, and the conference call we had relating to it was filled with some good technical tidbits.
Our own benchmarks showed that the EPYC was a very attractive alternative in some workloads (Java applications), while the superior mesh architecture makes Intel's Xeon the best choice in others (databases, for example).
A Side Note About SPEC
A number of these records were achieved through SPEC. As mentioned above, SPEC is a handy tool for comparing the absolute best tweaked peak performance of the underlying hardware, or for analysing a system close to the metal because the code base is so well known, but those results have trouble transferring directly to the real world. A lot of the time, the software within a system will only vaguely know what hardware it is running on, especially if that system is virtualised. Sending AVX-512 instructions down the pipe is one thing, but a SPEC compilation can be tweaked to make sure that cache locality is maintained, whereas in the real world that might not be possible. SPEC says a lot about the system, but ultimately most buyers of these high-end systems are probing real-world workloads on development kits to see what their performance (and subsequent scale-out performance) might be.
For the purposes of this discussion, we have glossed over Intel's reported (and verified over at SPEC.org) results.
Pricing Up A System For Comparison
Professionals and the enterprise market will mention, and quite rightly, that Intel has been charging some heavy premiums with the latest generation, with some analysts mentioning a multiple jump up in pricing even for large customers, making it clear that the Xeon enterprise CPU line is Intel's bread and butter. Intel's top-end Xeon Platinum 8180 should give the latest EPYC CPU plenty of trouble thanks to its 28 Skylake-SP cores running at 2.5 to 3.8 GHz. However, its massive price tag ($10009 for the standard version, $13011 for the high-memory model) left Intel's benchmarking team no choice but to also throw in the much more modest Xeon Platinum 8160 (24 cores at 2.1 - 3.7 GHz, $4702) as well as the Xeon Gold 6148 (20 cores at 2.4 - 3.7 GHz, $3072).
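| |Xeon Platinum 8180|Xeon Platinum 8160|Xeon Gold 6148|EPYC 7601|
|---|---|---|---|---|
|Release Date|Early Q3, 2017|Early Q3, 2017|Early Q3, 2017|Late Q2, 2017*|
|Microarchitecture|Skylake-SP with AVX-512|Skylake-SP with AVX-512|Skylake-SP with AVX-512|Zen|
|Process Node|Intel 14nm (14+)|Intel 14nm (14+)|Intel 14nm (14+)|GloFo 14nm|
|Cores / Threads|28 / 56|24 / 48|20 / 40|32 / 64|
|Base Frequency|2.5 GHz|2.1 GHz|2.4 GHz|2.2 GHz|
|Turbo|3.8 GHz|3.7 GHz|3.7 GHz|3.2 GHz|
|L2 Cache|28 MB|24 MB|20 MB|16 MB|
|L3 Cache|38.5 MB|33.0 MB|27.5 MB|64 MB|
|TDP|205 W|150 W|150 W|180 W|
|PCIe Lanes|48 (Technically 64 w/ Omni-Path Versions)|48 (Technically 64 w/ Omni-Path Versions)|48 (Technically 64 w/ Omni-Path Versions)|128|
|DRAM|6-channel DDR4|6-channel DDR4|6-channel DDR4|8-channel DDR4|
|Max Memory|768 GB|768 GB|768 GB|2048 GB|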
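To put the list prices quoted above next to the core counts in the table, a rough price-per-core figure can be worked out as in the sketch below. This is only an illustration, not a metric Intel or AMD published: the `XEONS` dictionary and the `price_per_core` and `per_core_table` helpers are names made up for this example, the prices are the standard (non high-memory) list prices quoted in the text, and EPYC is left out because the text above does not quote its price. Price per core also ignores clock speed, memory configuration and platform cost, so it is at best a first-order view.

```python
# Illustrative sketch: list price per core for the three Xeons discussed above.
# Prices (USD) and core counts are the ones quoted in the article; the data
# structure and helper names are hypothetical, chosen for this example only.
XEONS = {
    "Xeon Platinum 8180": {"price_usd": 10009, "cores": 28},
    "Xeon Platinum 8160": {"price_usd": 4702, "cores": 24},
    "Xeon Gold 6148": {"price_usd": 3072, "cores": 20},
}


def price_per_core(price_usd: float, cores: int) -> float:
    """Return the list price divided by the core count (USD per core)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return price_usd / cores


def per_core_table(cpus: dict) -> dict:
    """Map each CPU name to its price per core, rounded to cents."""
    return {
        name: round(price_per_core(spec["price_usd"], spec["cores"]), 2)
        for name, spec in cpus.items()
    }


if __name__ == "__main__":
    for name, usd in per_core_table(XEONS).items():
        print(f"{name}: ${usd:.2f} per core")
```

Even on this crude measure the top bin carries a steep premium: the 8180 costs well over twice as much per core as the Gold 6148, which helps explain why the cheaper parts were included in the comparison.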
As a result of this pricing, one of the major hurdles for Intel in any comparison will be performance per dollar. In order to demonstrate that systems can be equivalent, Intel offered up this comparison from a single retailer. Ideally Intel should have offered multiple configuration options for this comparison, given that a single retailer can aim for different margins on different sets of products (or have different levels of partnership/ecosystem with the manufacturers).
Even then, price parity could only be reached by giving the Intel system less DRAM. Luckily this was the best way to configure the Intel-based system anyway. We can only guess how much the benchmarking engineers swore at the people who set the price tags: "this could have been so much easier...". All joking aside, the document we received had a good amount of detail, and similar to how we looked into AMD's benchmarking numbers at their launch, we investigated Intel's newest benchmark numbers as well.
Johan Steyn - Monday, December 18, 2017
I am so glad people are realising AnandTech's rubbish, probably led by Ian who wrote that terrible Threadripper review. Maybe he will realise it as more complain. It all depends on how much Intel is paying him...
mapesdhs - Wednesday, November 29, 2017
ANSYS is one of those cases where having massive RAM really matters. I doubt if any site would bother speccing out a system properly for that. One ANSYS user told me he didn't care about the CPU, just wanted 1TB RAM, and that was over a decade ago.
rtho782 - Tuesday, November 28, 2017
> Xeon Platinum 8160 (24 cores at 2.1 - 3.7 GHz, $4702k)
$4,702,000? Intel really have bumped up their pricing!!
bmf614 - Tuesday, November 28, 2017
The price tag discussion really needs to include software licensing as well. Windows Datacenter and SQL Server on a machine with 64 cores will cost more than the hardware itself. This is the reason that the Xeon 5122 exists.
bmf614 - Tuesday, November 28, 2017
Also, isn't it kind of silly to invest in a server platform with limited PCIe performance when faster and faster storage and networking are becoming commonplace?
Polacott - Tuesday, November 28, 2017
It really seems that AMD has crushed Intel this time. Also, Charlie has some interesting points about security (has this topic even been analyzed here? https://www.semiaccurate.com/2017/11/20/epyc-arriv... )
Software WILL be tuned for Epyc, so a safe bet will not be getting Xeon but Epyc, for sure.
And power consumption and heat are really important, as they are an interesting part of datacenter maintenance costs.
I really don't get how the article ends up in this conclusion.
Johan Steyn - Monday, December 18, 2017
Intel's financial support helps them reach this conclusion. Very sad.
ZolaIII - Tuesday, November 28, 2017
As usual, Intel cheated. Clients won't use either their proprietary compiler or their software, but the GNU ones. Now let me show you a difference:
Other than that, this is boring, as ARM NUMA-based server chips are coming with some backup from good old veterans when it comes to supercomputing, and this time around Intel won't even have a compiler advantage to brag about.
Now this is the real news, and melancholic news for me, as it brings back memories of how it all started. And guess what? We are back there at the start again.
toyotabedzrock - Tuesday, November 28, 2017
Linux 4.15 has code to increase EPYC performance and enable the memory encryption features. 4.16 will have the code to enable the virtual machine memory encryption.
duploxxx - Friday, December 1, 2017
Thanks for sharing the article, Johan. As usual, those are the ones I will always read.
Interesting to get feedback from Intel on benchmark comparisons; this tells how scared they really are of the competition. There is no way around it: I've been to many OEM and large vendor events lately, and one thing is for sure, the blue team was caught with their pants down and there is definitely interest from IT in this new competitor.
Now talking a bit under the hood, having had both systems from beta stages.
I am sure Intel will be more than happy to tell you if they were running the systems with jitter control. Of course they won't tell the world about this and its related performance issues.
Second, will they also share with the world that their so-called AVX enhancements have major clock speed disadvantages for the whole socket? Really nice in virtual environments :)
Third, the turbo boosting that is nowhere near the claimed values when running virtualization?
Yes, the benchmarking results are nice, but they don't reflect real-world reality; they are based on synthetic benches. The real world gets way less turbo boost due to core hot spots and their correlated TDP.
There are reasons why large OEMs have not yet introduced EPYC solutions: they are still optimizing BIOS and microcode, as they want to bring a solid performing platform. The early tests from Intel show why.
Even the shared VMware bench can be debated with no shared version info, as 6.5u1 has got major updates to the hypervisor with optimizations for EPYC.
Sure, DB benches are an Intel advantage; there is no magic to it looking at the die configurations, there are trade-offs. But this is ONLY when the DBs are bigger than a certain number of dies, so we are talking here about 16+ cores from the 32 cores/socket systems, for example; anything lower will actually have more memory bandwidth than the Intel part. So how reliable are these benchmarks for day-to-day production? Not all are running the huge sizes. And those who do should not just compare based on the synthetic benches provided but do real-life testing.
Ain't it nice that a small company brings a new CPU line and already Intel needs to select their top bin parts as a counterpart to show benchmarks to be better. There are 44 other bins available in the Intel portfolio; you can probably already start guessing how well they really fare against their competitor....