Intel Discloses Multi-Generation Xeon Scalable Roadmap: New E-Core Only Xeons in 2024
by Dr. Ian Cutress on February 17, 2022 5:30 PM EST

It’s no secret that Intel’s enterprise processor platform has been stretched in recent generations. Compared to the competition, Intel is chasing its multi-die strategy while relying on a manufacturing platform that hasn’t offered the best in the market. That being said, Intel is quoting more shipments of its latest Xeon products in December than AMD shipped in all of 2021, and the company is launching the next-generation Sapphire Rapids Xeon Scalable platform later in 2022. What comes after Sapphire Rapids has been kept somewhat under wraps, with minor leaks here and there, but today Intel is lifting the lid on that roadmap.
State of Play Today
Currently in the market is Intel’s Ice Lake 3rd Generation Xeon Scalable platform, built on Intel’s 10nm process node with up to 40 Sunny Cove cores. The die is large, around 660 mm2, and in our benchmarks we saw a sizeable generational uplift in performance compared to the 2nd Generation Xeon offering. The response to Ice Lake Xeon has been mixed, given the competition in the market, but Intel has forged ahead by leveraging a more complete platform coupled with FPGAs, memory, storage, networking, and its unique accelerator offerings. Datacenter revenues, depending on the quarter you look at, are either up or down based on how customers are digesting their current processor inventories (as stated by CEO Pat Gelsinger).
That being said, Intel has put a large amount of effort into discussing its 4th Generation Xeon Scalable platform, Sapphire Rapids. For example, we already know that it will be using >1600 mm2 of silicon for the highest core count solutions, with four tiles connected with Intel’s embedded bridge technology. The chip will have eight 64-bit memory channels of DDR5, support for PCIe 5.0, as well as most of the CXL 1.1 specification. New matrix extensions also come into play, along with data streaming accelerators and QuickAssist technology, all built on the latest P-core designs currently present in the Alder Lake desktop platform, albeit optimized for datacenter use (which typically means AVX-512 support and bigger caches). We already know that versions of Sapphire Rapids will be available with HBM memory, and the first customer for those chips will be the Aurora supercomputer at Argonne National Labs, coupled with the new Ponte Vecchio high-performance compute accelerator.
The launch of Sapphire Rapids is significantly later than originally envisioned several years ago, but we expect to see the hardware widely available during 2022, built on Intel 7 process node technology.
Next Generation Xeon Scalable
Looking beyond Sapphire Rapids, Intel is finally putting materials out in public to showcase what is coming up on the roadmap. After Sapphire Rapids, we will have a platform-compatible Emerald Rapids Xeon Scalable product, also built on Intel 7, in 2023. Given the naming conventions, Emerald Rapids is likely to be the 5th Generation.
Emerald Rapids (EMR), as with some other platform updates, is expected to capture the low-hanging fruit from the Sapphire Rapids design to improve performance, as well as pick up updates on the manufacturing side. Platform compatibility means Emerald Rapids will have the same support when it comes to PCIe lanes, CPU-to-CPU connectivity, DRAM, CXL, and other IO features. We’re likely to see updated accelerators too. Exactly what the silicon will look like, however, is still an unknown. As we’re still early in Intel’s tiled product portfolio, there’s a good chance it will be similar to Sapphire Rapids, but it could equally be something new, such as what Intel has planned for the generation after.
After Emerald Rapids is where Intel’s roadmap takes on a new highway. We’re going to see a diversification in Intel’s strategy on a number of levels.
Starting at the top is Granite Rapids (GNR), built entirely of Intel’s performance cores, on an Intel 3 process node for launch in 2024. Previously Granite Rapids had been on roadmaps as an Intel 4 node product; however, Intel has stated to us that the progression of the technology, as well as the timeline of when it will come into play, makes it better to put Granite on the Intel 3 node. Intel 3 is meant to be Intel’s second-generation EUV node after Intel 4, and we expect the design rules to be very similar between the two, so we suspect it’s not that much of a jump from one to the other.
Granite Rapids will be a tiled architecture, just as before, but it will also feature a bifurcated strategy in its tiles: it will have separate IO tiles and separate core tiles, rather than a unified design like Sapphire Rapids. Intel hasn’t disclosed how they will be connected, but the idea here is that the IO tile(s) can contain all the memory channels, PCIe lanes, and other functionality while the core tiles can be focused purely on performance. Yes, it sounds like what Intel’s competition is doing today, but ultimately it’s the right thing to do.
Granite Rapids will share a platform with Intel’s new product line, which starts with Sierra Forest (SRF), also on Intel 3. This new product line will be built from datacenter-optimized E-cores, which we’re familiar with from Intel’s current Alder Lake consumer portfolio. The E-cores in Sierra Forest will be a later generation than the Gracemont E-cores we have today, but the idea here is to provide a product that focuses more on core density rather than outright core performance. This allows the cores to run at lower voltages and scale out in parallel, assuming the memory bandwidth and interconnect can keep up.
Sierra Forest will be using the same IO die as Granite Rapids. The two will share a platform – we assume in this instance this means they will be socket compatible – so we expect to see the same DDR and PCIe configurations for both. If Intel’s numbering scheme continues, GNR and SRF will be Xeon Scalable 6th Generation products. Intel stated to us in our briefing that the product portfolio currently offered by Ice Lake Xeon products will be covered and extended by a mix of GNR and SRF Xeons based on customer requirements. Both GNR and SRF are expected to have full global availability when launched.
The density-focused, E-core-only Sierra Forest will inevitably be compared to AMD’s equivalent, Bergamo, built on Zen 4c cores; AMD might have a Zen 5 equivalent by the time SRF comes to market.
I asked Intel whether the move to GNR+SRF on one unified platform means the generation after will be a unique platform, or whether it will retain the two-generation platform longevity that customers like. I was told that it would be ideal to maintain platform compatibility across the generations, although as these are planned out, it depends on timing and where new technologies need to be integrated. The earliest industry estimates (beyond CPU) for PCIe 6.0 are in the 2026 timeframe, and DDR6 is more like 2029, so unless there are more memory channels to add, it’s likely we’re going to see platform parity between 6th and 7th Gen Xeon.
My other question to Intel was about hybrid CPU designs: if Intel is now going to make P-core tiles and E-core tiles, what’s stopping a combined product with both? Intel stated that their customers prefer uni-core designs in this market, as the needs from customer to customer differ. If one customer prefers an 80/20 split of P-cores to E-cores, there’s another customer that prefers a 20/80 split. Having a wide array of products for each different ratio doesn’t make sense, and customers already investigating this are finding that the software works better with a homogeneous arrangement, split at the system level rather than the socket level. So we’re not likely to see hybrid Xeons any time soon. (Ian: Which is a good thing.)
I did ask about the unified IO die: giving the P-core-only and E-core-only Xeons the same number of memory channels and I/O lanes might not be optimal for either scenario. Intel didn’t really have a good answer here, aside from the fact that building both into the same platform helps customers spread non-recurring development costs across both CPUs, regardless of which one they use. I didn’t ask at the time, but we could see the door open to more Xeon-D-like scenarios with different IO configurations for smaller deployments, but we’re talking products that are 2-3+ years away at this point.
Xeon Scalable Generations
Date | AnandTech | Codename | Abbr. | Max Cores | Node | Socket
Q3 2017 | 1st | Skylake | SKL | 28 | 14nm | LGA 3647
Q2 2019 | 2nd | Cascade Lake | CLX | 28 | 14nm | LGA 3647
Q2 2020 | 3rd | Cooper Lake | CPL | 28 | 14nm | LGA 4189
Q2 2021 | 3rd | Ice Lake | ICL | 40 | 10nm | LGA 4189
2022 | 4th | Sapphire Rapids | SPR | * | Intel 7 | LGA 4677
2023 | 5th | Emerald Rapids | EMR | ? | Intel 7 | **
2024 | 6th | Granite Rapids | GNR | ? | Intel 3 | ?
2024 | 6th | Sierra Forest | SRF | ? | Intel 3 | ?
>2024 | 7th | Next-Gen P | ? | ? | ? | ?
>2024 | 7th | Next-Gen E | ? | ? | ? | ?
* Estimate is 56 cores
** Estimate is LGA 4677
For both Granite Rapids and Sierra Forest, Intel is already working with key ‘definition customers’ for microarchitecture and platform development, testing, and deployment. More details to come, especially as we move through Sapphire and Emerald Rapids during this year and next.
144 Comments
Mike Bruzzone - Sunday, February 20, 2022 - link
abit, thanks for your thermal observations. I trust 3D can be an inexpensive database and modeling space, not just a game toy. On the thermals I follow you and we shall see; I acknowledge your thoughts and we can pick this conversation up when there are knowns. On the package area, that it is not TR package area for heat transfer out is my concern; the package solution was not entirely thought out. Lid off the 5900 engineering sample held by Dr. Su for all to see, yeah, because in the lab it's being sprayed with freon or whatever is relied on today from a spray can. On power efficiency features, good thought. 6 nm, V5x shrink? Maybe.
Where does the $45 come from? TSMC cost : price of adding a 32 MiB SRAM slice to a 7 nm 5800X.
The SRAM slice is 36 mm2; $5.40 for fabrication, x2 operating cost to dice and test; markup on marginal cost = marginal revenue = price = $10.80, now + $10.80 package and final test = $21.60 finished good, then x2 markup = $43.20. It's the markups that add to price, and AMD is not an Apple with recurring-revenue products.
Note Vermeer across full-run grade SKUs priced to AMD: the 2-CCX dodeca and hexadeca are expensive to produce, and across all core grade SKUs the average TSMC price to AMD is $163 for V5x, essentially x2 Intel design production cost on TSMC 7 nm fabrication in parity with iSF10/7, but there is still packaging and finished goods markup; marginal cost $81.71 + marginal revenue $81.71 = $163.43, now + $45 thereabouts for 3D = $208 to AMD. Otherwise TSMC can produce a one-CCX V5x for around $39 hard cost, but then add OpEx and markup at x2 to AMD. At V5x end run, on average octa + i/o might have fallen to $140 to $154, but TSMC increases pricing. Matisse 3x full run averaged $131.32 but now Vermeer 5x is $163.43, +24%, and maybe AMD took a couple points more too.
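To make the chain of markups above easier to follow, here is a minimal sketch of the commenter's arithmetic in Python. Every input figure is the commenter's own estimate rather than a confirmed TSMC or AMD number, and the variable names are purely illustrative.

```python
# Cost-stacking arithmetic as described above; all figures are the
# commenter's estimates, not confirmed TSMC or AMD numbers.

sram_fab_cost = 5.40                      # fab cost of the 36 mm^2 / 32 MiB SRAM slice
diced_and_tested = sram_fab_cost * 2      # x2 operating cost to dice and test -> $10.80
finished_good = diced_and_tested + 10.80  # + packaging and final test -> $21.60
sram_price_to_amd = finished_good * 2     # x2 markup on the finished good -> $43.20

vermeer_marginal_cost = 81.71             # stated average across the full Vermeer run
vermeer_price_to_amd = vermeer_marginal_cost * 2                # cost + equal markup -> ~$163.4
vermeer_3d_estimate = vermeer_price_to_amd + sram_price_to_amd  # + ~$43-45 for 3D -> ~$207-208

print(f"SRAM slice price to AMD:   ${sram_price_to_amd:.2f}")
print(f"Vermeer price to AMD:      ${vermeer_price_to_amd:.2f}")
print(f"Vermeer + 3D cache (est.): ${vermeer_3d_estimate:.2f}")
```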
If I'm within 10%, that's good enough for government work. However, I will point out that a materials engineer who costs on the molecular weight of inputs will laugh at some of my tried and true Intel micro production economic methods.
Mike Bruzzone FTC Docket 9341 Intel consent order monitor. Docket 9288 and 9341 federal attorneys enlisted discovery aid and micro production economics. Former Cyrix, ARM, NexGen, AMD, IDT Centaur employee or consultant.
_abit - Sunday, February 20, 2022 - link
It is a mystery then how amd managed to catch up with intel's margins with all them inflated production costs ;)

Mike Bruzzone - Sunday, February 20, 2022 - link
abit, AMD's large-cache applications performance strategy is about the price premium AMD always seeks to make, sustaining AMD gross while incorporating, by necessity, TSMC foundry markup. AMD needs to stay 1 to 1.5 nodes ahead of Intel to charge a high (say a premium) price for a component area performance advantage, while continuing to make up TSMC margin within AMD's price. That's why AMD has to be selling Genoa as of q3 2021: with iSF10/7 reaching cost parity with TSMC 7, AMD had to move to the next node. On a design production basis that includes OpEx, Intel cost is 54% less than TSMC price to AMD, and on a manufacturing cost basis 77% less than TSMC price to AMD for desktop components, albeit Intel's cost range is similar to TSMC's at the same 7 nm process node.
mb
Mike Bruzzone - Sunday, February 20, 2022 - link
abit, AMD has not caught up with Intel's margins yet; q4 at 50% and 53.6% respectively, but getting closer. Intel got caught in q3 and q4 having to bundle a lot of minimally at-cost (maybe some freebie too) product into sales deals to clean out Xeon and Core inventory that was rotting on Intel shelves. Intel offed it to OEM dealers and channels. The Core line only earned Intel variable cost in q3 and q4. And Intel burping slack surplus product caught up to AMD in q4, where AMD consumer product was also 'forced' down to earning only its variable cost coverage. Only Xeon and Epyc margin paid ahead in q4. mb

_abit - Tuesday, February 22, 2022 - link
You are pulling stuff out your rear end. Amd has the advantage of not having to bother about process development. TSMC has a lot more customers, clients and output, so its lead in process RD is understandable. All intel has to do to be in ruins is fail another node. Their targets for 10nm were unrealistic, and it took intel years to get over that. Well, their targets for 7nm are just as unrealistic, so intel has a big chance to stumble once more by trying to get ahead of itself. Renaming things doesn't change anything.
Furthermore, intel is heavily leaning on advanced packaging technologies, where amd is more conservative. Even if intel does its own manufacturing at 100%, that's still a lot more additional cost and yield issues. It is yet another huge risk of messing things up.
Last and certainly not least, in desperation that it struggles to make a good cpu uarch, intel is aimlessly spraying new product plans in all directions. Rather than focusing on its weakness, intel is desperately trying to shove the market a bunch of stuff nobody asked about, as if that's a substitute to what intel's missing. Diverging efforts will only dilute and diminish intel's ability to produce quality where it actually matters. Intel is not listening to what clients are calling for, instead is pushing to shove them with stuff that makes no sense.
Sorry to break it to you, but intel is not out of the woods yet. It may be another couple of years before it produces a strong enterprise cpu. Amd's sole limitation was production capacity, and having surpassed intel by market cap puts amd in a position to book a lot more capacity, in the absence of an adequate enterprise solution from intel, and with higher output, amd may well double its server market share annually and capture 50% of it.
Mike Bruzzone - Wednesday, February 23, 2022 - link
"Amd has the advantage of not having to bother about process development"; yes"TSMC has a lot more customers, clients and output, so its lead in process RD is understandable"; TSMC has mass, leverage and a lot of legacy equipment cash cow, much more than Intel.
"Intel targets for 10nm were unrealistic", agreed, but Intel was also ripped off for $350 billion in hard cost losses over the last decade that could have funded R&D, PE&C and Intel early retired capable employees. Ultimately Otelinni regime sabotaged Intel and the Board participated in that sabotage and theft from the Entity.
"Their targets for 7nm are just as unrealistic" SF10/7x is in parity with TSMC 7 cost wise and TSMC has shown 5 nm is an incremental node on design process and tools. Could Intel screw it up? We'll know soon but 5 is incremental. Gelsinger so said 5 nodes in 4 years on tick tock is 2.5 nodes. But is 5 nodes on Rocks doubling of cost every node presents hurdles and potential deterrents.
"Furthermore, intel is heavily leaning on advanced packaging technologies" Intel is ahead in SIP and wants to sell package as a service between TSMC Chandler and Intel Rio Rancho. Ultimately this is about defining next generation of automated chip pack and materials handling and who best to do that SO Intel gets its way, TSMC. Front end has to mesh with backend production and materials handling wise.
"Intel is aimlessly spraying new product plans in all directions" when hasn't Intel but after Rocket, Cooper and Ice I suspect Intel has rethought its customer advance audits and I know Intel is cost optimizing which means getting the product right and eliminating waste; unnecessary product and surplus. On Intel division investments I have no comment on no or to little data.
I think Intel is out of the woods but has not won the race.
"Amd's sole limitation was production capacity", agree. 676,817 wafers in 2021 produced 211,207,983 good die both octa ccx and APU plus 54,786,409 i/o for 118,917,697 that in the end I tallied upwards of 119 M which is a corporate record.
https://seekingalpha.com/instablog/5030701-mike-br...
"Amd may well double its server market share annually and capture 50% of it"
On Epyc high-volume determination, AMD saw a near doubling in production volume from 2020 to 2021, minimally plus 82.6%, and corporations plan on round numbers, so AMD's aim was to double Epyc production in 2021 and it likely did.
From a production standpoint, Epyc Milan and Xeon Ice Lake are at run end, with Genoa in risk production since q3 and Sapphire Rapids in risk production (because Intel never waits) at xx% whole product; production volume here is as stated for the last two quarters on financial reconciliation.
Commercial production share
AMD = 17.09%
Intel = 82.91%
Adding AMD Rome to Milan v Ice Lake only back to market share on channel
AMD = 80.13%
Intel = 18.86%
Adding Intel Cascade Lake, establishing that Intel still maintains monopoly share of the commercial server and workstation market:
AMD = 3%
Intel = 97%
Adding AMD Naples and Intel Skylake
AMD = 1.81%
Intel = 98.19%
Adding Intel Broadwell v4
AMD = 1.00%
Intel = 99.00%
Adding Haswell v3
AMD = 0.55%
Intel = 99.45%
Pursuant to Mercury Research's AMD server share at 10.7%, the closest I can get is AMD Milan + Rome + Naples v Intel Ice and Cascade Lake Gold/Silver refresh;
AMD = 15.92%
Intel = 84.08%
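As an aside on the method here: each step is simply AMD units divided by the combined AMD-plus-Intel units for whichever product lines are included, so the percentage swings with the comparison base. A minimal sketch with entirely hypothetical unit counts (the commenter's underlying volumes aren't given) illustrates the calculation:

```python
# Share-by-comparison-base method used above. Unit volumes are hypothetical
# placeholders for illustration only, not the commenter's actual data.

amd_units = {"Milan": 800, "Rome": 700, "Naples": 100}
intel_units = {"Ice Lake": 1500, "Cascade Lake": 6000, "Skylake": 7000}

def amd_share(amd_lines, intel_lines):
    """AMD percentage of combined AMD + Intel units for the selected product lines."""
    amd = sum(amd_units[line] for line in amd_lines)
    intel = sum(intel_units[line] for line in intel_lines)
    return 100.0 * amd / (amd + intel)

# Widening (or narrowing) which generations are counted changes the share.
print(f"Milan v Ice Lake only:   {amd_share(['Milan'], ['Ice Lake']):.2f}%")
print(f"+ Rome and Cascade Lake: {amd_share(['Milan', 'Rome'], ['Ice Lake', 'Cascade Lake']):.2f}%")
print(f"+ Naples and Skylake:    {amd_share(['Milan', 'Rome', 'Naples'], ['Ice Lake', 'Cascade Lake', 'Skylake']):.2f}%")
```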
mb
gescom - Thursday, February 24, 2022 - link
"AMD = 15.92%Intel = 84.08%"
Ok, same old same old, what about new systems shipped 2020, 2021? I think you'd see a radically different % picture.
JayNor - Saturday, February 19, 2022 - link
Raja says Intel is shipping SPR in q1 and sampling SPR-HBM, and add DSA, tiled matrix operations, bfloat16, pcie5, cxl, ddr5 ... so when will AMD match all those features?

Qasar - Saturday, February 19, 2022 - link
jaynor, and you actually believe what intel says? all things considered, that track record for the last few years sucks. IF you weren't an intel shill, then you would believe this when it is actually out.

schujj07 - Sunday, February 20, 2022 - link
Well they have 1 more month to make Q1. So far I haven't seen any product announcements for SPR.