Arm Unveils Client CPU Performance Roadmap Through 2020 - Taking Intel Head On
by Andrei Frumusanu on August 16, 2018 9:05 AM EST
Today’s announcement is an oddball one for Arm, as we see its first-ever public forward-looking CPU IP roadmap, detailing performance and power projections for the next two generations through to 2020.
Back in May we extensively covered Arm’s next generation Cortex A76 CPU IP and how it’s meant to be a game-changer in terms of providing one of the biggest generational performance jumps in the company’s recent history. The narrative in particular focused on how the A76 now brought real competition and viable alternatives to the x86 market and in particular how it would be able to offer performance equivalent to Intel’s best mobile offerings, at much lower power.
Arm sees always-connected devices with 5G connectivity as a prime opportunity for a shift in the laptop market. Qualcomm’s recent Snapdragon 835 and Snapdragon 850 platforms were the first attempts in trying to establish this new slice for Arm-based PCs.
Today’s roadmap now publicly discloses the codenames of the next two generations of CPU cores following the A76 – Deimos and Hercules. Both future cores are based on the new A76 micro-architecture and will introduce respective evolutionary refinements and incremental updates for the Austin cores.
The A76 is a 2018 product, and we should be hearing more about the first commercial 7nm devices towards the end of the year and in the coming months. Deimos is its 2019 successor, aiming at more widespread 7nm adoption. Hercules is said to be the next iteration of the microarchitecture for 2020 products and the first 5nm implementations. This is as far as Arm is willing to project into the future for today’s disclosure, as the Sophia team is working on the next big microarchitecture push, which I suspect will be the successor to Hercules in 2021.
Part of today’s announcement is Arm’s reiteration of the performance and power goals of the A76 against competing platforms from Intel. The measurement metric today was the performance of a SPECint2006 Speed run under Linux, compiled with GCC7. The power metrics represent the whole SoC “TDP”, meaning CPU, interconnect and memory controllers: essentially the active platform power, much in the same way we’ve been representing smartphone power in recent mobile deep-dive articles.
Here a Cortex A76 based system running at up to 3GHz is said to match the single-thread performance of an Intel Core i5-7300U running at its maximum 3.5GHz turbo operating speed, all while doing it within a TDP of less than 5W, versus “15W” for the Intel system. I’m not too happy with the power presentation done here by Arm as we kind of have an apples-and-oranges comparison; the Arm estimates here are meant to represent actual power consumption under the single-threaded SPEC workload while the Intel figures are the official TDP figures of the SKU – which obviously don’t directly apply to this scenario.
We didn’t have internal data to verify Arm’s claims as of the publishing of this article, but the 15W Intel figure is naturally on the high side, given that this is just the official TDP representing multi-threaded workloads. A very quick test of CB15 ST power as reported by MSR registers on a 7200U at 3.1GHz measured 9.3W package+DRAM power, while an 8250U at 3.35GHz came in at 11W. I haven’t correlated SPEC power on x86 to date, but I’m expecting it on average to be less than CB15. Even if the 15W figure for the 7300U is correct, and I’m expecting something more in the range of 9-11W, Arm might be using one of Intel’s notably less efficient performance points when doing the comparison for these SKUs. Of course this doesn’t invalidate the data, as efficiency for the A76 at those frequencies would also not be optimal; it’s just something to keep in mind.
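To put rough numbers on how much the choice of Intel power figure matters, here is a minimal Python sketch. It assumes, as Arm claims, that single-thread performance is equal, so the performance-per-watt advantage reduces to a simple power ratio. Arm's "less than 5W" is treated as an upper bound, so each ratio is a floor under that assumption. The 7200U and 8250U figures are the quick CB15 measurements quoted above, taken on different SKUs and a different workload than Arm's SPEC comparison, so this is only illustrative. The function and variable names are made up for this example.

```python
# Illustrative only: equal performance is assumed (Arm's claim), so the
# perf/W advantage is just Intel power divided by Arm power.

ARM_POWER_CEILING_W = 5.0  # Arm's "less than 5W" claim, used as an upper bound

# Candidate Intel power figures taken from the article text
intel_power_w = {
    "official TDP (Arm's slide)": 15.0,
    "7200U CB15 ST, package+DRAM": 9.3,
    "8250U CB15 ST, package+DRAM": 11.0,
}


def efficiency_advantage(intel_w, arm_w=ARM_POWER_CEILING_W):
    """Return the perf/W ratio in Arm's favour at equal performance."""
    return intel_w / arm_w


for label, watts in intel_power_w.items():
    print(f"{label}: {efficiency_advantage(watts):.2f}x")
```

With the official TDP the advantage works out to 3x, while the measured figures bring it closer to 2x, which is the gap the apples-and-oranges caveat is about.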
It’s also interesting to see Arm scale back on the performance comparison, as they’re using a 3GHz A76 as the comparison data-point. This is in contrast to the 3.3GHz maximum 5W performance point presented during TechDay. I had tried to estimate the A76’s power in mobile form-factors based on the different metrics Arm disclosed and arrived at an estimated 2.3W at 3GHz. Naturally Arm says “less than 5W” and they could be erring on the safe side of not over-promising, but if it had been *that* much lower, as in my estimate, we would have maybe seen even more aggressive marketing figures. In the end, until we get the first A76 devices in our hands, we won’t know for sure what the exact figures will be and at which point on the efficiency curve Arm’s projected 3GHz performance figures will end up.
The last notable slide covers the performance projections for Deimos and Hercules. Here Arm is taking a direct stab at Intel’s lack of significant progress over the last few years and reiterating its confidence in its ability to sustain high CAGR (compound annual growth rate) performance figures for the next generations.
Again, at TechDay we quoted figures of 20-25%, while today’s announcement contained a more conservative figure of “>=15%”, likely better representing a seemingly larger 20% projected boost for Deimos as well as what seems to be a 10% gain for the 5nm follow-up Hercules. Taking into account the relative positioning of the data-points in this chart, I did some quick correlation and it matches my initial estimated performance figure for a 3GHz A76 of around ~26 SPECint2006. Deimos and Hercules would come in at figures of ~31 and ~34 points.
Finally, today’s announcement is a marketing exercise attempting to emphasise Arm’s performance and power commitments over the next few generations, trying to showcase that it has the strategy and technology in place to make the Arm laptop market a real growth opportunity. If and how this pans out is something that we won’t find out at least until later on in the year, with the first actual A76 based large form-factor designs not being a thing until at least sometime in 2019. We’re eagerly awaiting the first A76 based mobile designs in the months to come and to have a first hands-on evaluation of the new microarchitecture family.
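The arithmetic behind those estimates can be sketched in a few lines of Python. The ~26 point baseline and the 20% and 10% gains are the approximate values read off the chart as described above, not official Arm figures, and the function names are invented for this illustration.

```python
# Sketch of the compounding behind the ~31 and ~34 point estimates.
# All inputs are approximate readings from Arm's chart, not official numbers.

BASELINE_A76_SCORE = 26.0  # estimated SPECint2006 score of a 3GHz A76

# (codename, estimated generational gain over the previous core)
GENERATIONAL_GAINS = [("Deimos", 0.20), ("Hercules", 0.10)]


def project_scores(baseline, generational_gains):
    """Apply each generational gain on top of the previous generation."""
    scores = {}
    current = baseline
    for name, gain in generational_gains:
        current *= 1.0 + gain
        scores[name] = current
    return scores


def cagr(start, end, years):
    """Compound annual growth rate between two scores over a number of years."""
    return (end / start) ** (1.0 / years) - 1.0


scores = project_scores(BASELINE_A76_SCORE, GENERATIONAL_GAINS)
for name, score in scores.items():
    print(f"{name}: ~{score:.1f}")

two_year_cagr = cagr(BASELINE_A76_SCORE, scores["Hercules"], years=2)
print(f"Implied CAGR over two generations: {two_year_cagr:.1%}")
```

Under these assumptions the two gains compound to roughly 31 and 34 points, and the implied two-year CAGR lands at about 15%, in the neighbourhood of the “>=15%” figure on the slide.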
ZolaIII - Thursday, August 16, 2018
It's meant to be at around 3W at 3GHz on 7nm. It should be around 5W on 10nm, which equals 2x performance per W on a similar manufacturing process.
sharath.naik - Thursday, August 16, 2018
My Intel i5 8250U hits 3.4 GHz at 7 watts on a single core, and hits 2.9 GHz at around 5W. So I'm not sure Intel is worried by these false claims. Also, Intel at 2 GHz is likely to be faster than Arm at 3 GHz, RISC vs CISC. Intel is in no danger from Arm in the desktop space, where the power limit is not the top concern.
Andrei Frumusanu - Friday, August 17, 2018
7W sounds like you're just looking at the IA cores metric - you need to account for everything else for the apples-to-apples comparison.
eastcoast_pete - Thursday, August 16, 2018
Thanks Andrei!
Two observations: 1. Thanks ARM for finally admitting that the performance of the A73 core is either the same or BELOW that of the A72! For proof, see their comparo slide to Intel's i5u series reproduced above in the article. Also, that's how ARM gets the 2.5x increase in (projected!) compute power, would be less from the A72 or 75.
2. Right now, all that is projected or extrapolated; again, the A72/73 experience means a major discount is in order. Sure, the next big ARM cores will be better than current ones, but 2.5x as powerful is very optimistic. Maybe ARM should take the "underpromise, then overdeliver" route for once; looks a lot better. Also, those big ARM cores will have to throttle quickly after short bursts when used in handsets, just like Apple's chips, just to keep thermals in check.
A remaining issue for Windows-on-ARM is that the total performance of ARM's designs lives by multicore/multithread-aware software; single-core/single-thread applications are a significant problem in the Windows market to date.
Regardless, I welcome any new serious challenger to Chipzilla - if nothing else, it keeps them honest and on their toes. Zen and Zen+ from AMD worked like prune juice - look at Intel running to get their own higher performance/better value chips out there in a hurry after years of technological constipation! Go AMD!
Lastly, I agree with others here that the first ARM-derived processor that probably will play a significant role in the laptop/ultraportable space will be from Apple, and therefore not likely to run Windows natively. Apple already has high performing wide and deep core designs, and I expect the next generation A##X chip to be the first to show up in the new MacBook Airs and similar. More battery capacity and thermal headroom in laptops means one can run bigger cores faster for longer than in phones.
ZolaIII - Thursday, August 16, 2018
A73 is actually a tad slower than A72, but it's also 33% more power efficient, which gave it around 15% performance headroom at the same power limit.
If A76 is 66~70% faster than A73 MHz per MHz while using 2x power, that is a great accomplishment on its own. We had A72s on 28 nm planar at 2GHz; A76 should eat only 25% more power, while 7nm in comparison uses only 25% of the power compared to 28 nm. So you probably won't see 3GHz ones in smartphone SoCs, but you should still see a 70~80% performance improvement from process and architecture improvements combined, which is more than enough to justify an upgrade for users. If it's half the size & half the TDP compared to an x86 part, it's more than good enough for servers. Add to that the opportunity to license & make your own SoCs, and it becomes a choice which Intel simply cannot match, as that (ARM A76 SoC) makes it much, much cheaper. ARM is making a POP-ready license for A76s on Samsung's second gen EUV 7nm FinFET, which will bring A76 production cost down to the level of an A73 on 14~16 nm FinFETs. TSMC will follow along with others.
eastcoast_pete - Thursday, August 16, 2018
This may sound surprising, but I mostly agree with you! Yes, if the A76 ends up being 70% faster per MHz using 2x the power of A73, that would be an accomplishment, and a very meaningful one. I just wish ARM would not use the "2.5x faster" eye catcher and just go with something like "70% more compute power per MHz at 2x the power envelope" or similar. This is not just about ARM: I really dislike it when chip designs are being promoted like snake oil. Make realistic promises, then, if possible, beat them. I personally saw the A73 as a step back or sideways from the A72, although it made some sense for the mobile space. In that area, I actually think Apple's approach is a good one for mobile (and I don't like iOS, and don't own any iPhone or iAnything): high performance for short, intense bursts, throttled but okay performance for sustained workloads. That makes the most sense to me in the power and thermal confines of a handheld device. Now, in a server, processors have to sustain 70-80% of the burst speed for hours or days on end, so those designs have to be different.
ZolaIII - Thursday, August 16, 2018
Well, everyone likes to pump MHz to insane levels for a given process and silicon, including Intel and Apple. The sane sustainable limit for FinFET structured ones is at 2~2.2 GHz regardless of manufacturing node, & we see that only on the top server SKUs these days. Let's stick to the TSMC projections & try to explain some things. TSMC says that their first gen 7nm FinFET DUV will reduce power usage by 60% compared to their own 14~16 nm FinFET. As the A76 is a four instructions per clock wide design compared to the two instructions wide A73, it's logical to assume that the A76 will be twice the size, if not even more (but I doubt that). That puts us at the figure that an A76 core on 7nm will actually use 80% of the power of the A73 on 14~16 nm, or 33% of an A72 (which is three instructions wide) on 28 nm planar, all MHz per MHz. What it looks like ARM did remarkably well is raising the efficiency of instructions processed per cycle, but we need to see actual silicon to judge that.
Apple is actually the last vendor (as everyone else did it before them) to introduce a real heterogeneous CPU topology and scheduling (aka Big.Little), and it's still not so good. The only thing I agree with in recent Apple SoC design (A11) is that two big cores are enough for mobile, the same as two little ones would be enough to retain the advantage of Big.Little on a less power-restrained system (aka laptop, desktop, server). What we need right now for mobile is an A73 successor made for DynamIQ to replace, or better said take the place of, the small A55 cores, which would raise the bar in user experience a lot, as the small ones are actually the workers while the big ones are there only for a jump start.
Wilco1 - Thursday, August 16, 2018
Cortex-A73 absolutely isn't slower than Cortex-A72. It has 11% higher IPC: https://www.anandtech.com/show/11088/hisilicon-kir...
As ZolaIII mentioned, it also reaches higher frequencies and can sustain those for far longer due to better efficiency.
HStewart - Thursday, August 16, 2018
From "Wendy's" famous commercial "Where is the beef"
Or more precisely, how are these performance claims backed up? Keep in mind they are talking about future ARM CPUs and comparing only to older Intel CPUs. This is obviously a PR move by ARM.
Also, it really seems like the title totally ignores AMD, and it should read "Taking x86 CPUs head on"; or maybe AMD is not seen as a competitor.
ZolaIII - Thursday, August 16, 2018
AMD is not seen as a leading competitor, as Zen+ is still slower than 7th gen Intel. Don't worry, AMD will jump ship in time, which is one thing Intel can't do.