Ask the Experts: ARM's Cortex A53 Lead Architect, Peter Greenhalgh
by Anand Lal Shimpi on December 10, 2013 9:00 AM EST
Given yesterday's announcement of the Cortex A53 based Snapdragon 410, our latest Ask the Experts installment couldn't be better timed. Peter Greenhalgh, lead architect of the Cortex A53, has agreed to spend some time with us and directly answer any burning questions you might have about ARM.
Peter has worked in ARM's processor division for 13 years and worked on the Cortex R4, Cortex A8 and Cortex A5 (as well as the ARM1176JZF-S and ARM1136JF-S). He was lead architect of the Cortex A7 and ARM's big.LITTLE technology as well.
Later this month I'll be doing a live discussion with Peter via Google Hangouts, but you guys get first crack at him. If you have any questions about Cortex A7, Cortex A53, big.LITTLE or pretty much anything else ARM related, fire away in the comments below. Peter will be answering your questions personally over the next week.
Please help make Peter feel at home here on AnandTech by impressing him with your questions. Do a good job here and I might be able to even convince him to give away some ARM powered goodies...
Fergy - Tuesday, December 10, 2013
Have OEMs spoken about plans for 'real' laptops with ARM CPUs? Like the Intel and AMD laptops.
iwod - Tuesday, December 10, 2013
1. MIPS - Opinions on it against ARMv8?
2. I Quote
"There is nothing worse than scrambled bytes on a network. All Intel implementations and the vast majority of ARM implementations are little endian. The vast majority of Power Architecture implementations are big endian. Mark says MIPS is split about half and half – network infrastructure implementations are usually big endian, consumer electronics implementations are usually little endian. The inference is: if you have a large pile of big endian networking infrastructure code, you’ll be looking at either MIPS or Power. "
How true is that? And if true, does ARM have any bigger plans to tackle this problem? Obviously there are huge opportunities now that SDN is exploding.
3. Thoughts on the current integration of IP (ARM), implementer (Apple/Qualcomm) and fab (TSMC)? Especially on the speed of execution. Previously it would take years for any IP maker to go from announcement to something on the market. We are now seeing Apple come in much sooner, and Qualcomm is also well ahead of ARM's projected schedule for 64-bit SoCs in terms of shipment date.
4. Thoughts on Apple's implementation of ARMv8?
5. Thoughts on Economy of Scale in Fab and Nodes. Post 16/14nm and 450mm wafers. Development Cost etc. How would that impact ARM?
6. By having a Pure ARMv8 implementation and Not supporting the older ARMv7. How much, in terms of % transistor does it save?
7. What technical hurdles do you see for ARM in the near future?
Peter Greenhalgh - Wednesday, December 11, 2013
Hi iwod,
Addressing question-2, all ARM architecture and processor implementations support big and little endian data. There is an operating system visible bit that can be changed dynamically during execution.
On question-6, certainly an AArch64 only implementation would save a few transistors compared to an ARMv8 implementation supporting both AArch32 and AArch64. However, it is probably not as much as you think, and it is very dependent on the microarchitecture, since the proportion of decode (or AArch32 specific gates) will be smaller in a wide OOO design than in an in-order design. For now, code compatibility with the huge number of applications written for Cortex-A5, Cortex-A7, Cortex-A9, etc. is more important.
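To make the endianness point from question 2 concrete, here is a minimal sketch of the "scrambled bytes" problem the quote describes. It uses Python's standard struct module and an arbitrary example value (0x0A0B0C0D, chosen only for illustration). It does not show how an ARM core switches its endianness bit; it only shows what happens when the same four bytes are read in the wrong byte order.

```python
import struct
import sys

# Arbitrary 32-bit example value (an assumption for illustration only).
value = 0x0A0B0C0D

# Serialize the same integer in both byte orders.
big = struct.pack(">I", value)     # big endian: most significant byte first
little = struct.pack("<I", value)  # little endian: least significant byte first

print(big.hex())     # 0a0b0c0d
print(little.hex())  # 0d0c0b0a

# Reading big endian bytes as if they were little endian scrambles the value.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0xd0c0b0a

# Byte order of the machine running this script ("little" or "big").
print(sys.byteorder)
```

Code that exchanges data with big endian peers typically states the byte order explicitly, as with the ">" prefix above, rather than relying on the host's native order.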
ltcommanderdata - Tuesday, December 10, 2013
Next gen consoles have been noted for their use of SoCs, especially in the context of hUMA. Of course, SoCs have long been the standard in the mobile space. What is the current state of hUMA-like functionality between the CPU and the GPU in mobile? And what can and/or will be done in the future to improve this, both within ARM's family of products (ARM CPU + ARM GPU) and working with third-parties (ARM CPU + any other GPU)?
Intel has adopted a cache model where each core has small pools of private, fast L1 and L2 cache and sharing/integration between cores and even the GPU happens in a larger, slower L3 cache. ARM's designs favour a private, fast L1 with sharing happening on the level of the L2 cache. What are the advantages/disadvantages between these design choices in terms of performance, power, die area, and scalability/flexibility?
Intel and AMD are busy expanding the width of their SIMD instruction set to 256-bits and beyond. Are 256-bit vectors relevant to mobile and NEON or are the use cases not there in mobile and/or the power/die area not worth it?
On the topic of ISA extensions to accelerate common functionality what other opportunities are out there? ARMv8 is adding acceleration for cryptography. Could acceleration for image processing, face recognition or voice recognition be useful or are those best left for specific chips outside the CPU?
ciplogic - Wednesday, December 11, 2013
* What are the latencies in CPU cycles for the CPU caches? Is it possible in the future to create a design that uses a shared L3 cache?
* How many general purpose CPU registers are in Cortex-A53 compared with its predecessors?
* Can we expect Cortex-A53 to be part of netbooks in the years to come? What about micro-servers?
Peter Greenhalgh - Sunday, December 15, 2013
Hi Ciplogic,
While not yet in mobile, ARM already produces solutions with L3 caches, such as our CCN-504 and CCN-508 products, which Cortex-A53 (and Cortex-A57) can be connected to.
Since Cortex-A53 is an in-order, non-renamed processor, the number of integer general purpose registers in AArch64 is 31, the same as specified by the architecture.
name99 - Wednesday, December 11, 2013
How closely does a company like ARM follow academic ideas, and how long does it take to move those ideas into silicon?
- right now the king of academic branch prediction appears to be TAGE. Is ARM looking at changing its branch predictor to TAGE, and if so would we expect that to debut in 2015? 2017?
- there have been some very interesting ideas for improving memory performance through having LLC and Memory Controller know about each other. For example Virtual Write Queue attempts to substantially reduce the cost of writing out data, while another scheme has predictors for when various ranks will be idle long enough that writes to them should be attempted, and a third scheme has prefetch requests prioritized to match ranks that are least busy. Once again, how long before we expect this sort of tech in ARM CPUs?
- in a handwaving fashion, for a high end CPU, I think it's fair to say that the single biggest cause of slowdowns is memory latency, which everyone knows; but the second biggest cause of slowdowns is the less well known problem of fetch bandwidth, specifically frequent taken branches, coupled with a four-wide fetch from a SINGLE cache line, and edge effects that result in many of those fetches being less than four wide. The heavy duty solution for this is a trace cache, a somewhat weaker solution is a loop buffer. Does ARM plan to introduce either of these? (Surely they are not going to allow the fact that Intel completely bollixed their version of a trace cache destroy what is conceptually a good idea, especially if you just use it as a small loop driven augmentation of a regular I-cache, rather than trying to have it replace the I-cache?)
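The fetch bandwidth argument above can be sketched with a little arithmetic. This is a toy model, not a description of any ARM or Intel front end: it assumes a 64-byte cache line holding sixteen 4-byte instructions, a four-wide fetch that cannot cross a line boundary, and a fetch group that ends at the first taken branch. The function name and parameters are hypothetical.

```python
def fetched_per_cycle(start_slot, taken_branch_slot=None, width=4, line_slots=16):
    """Instructions delivered by one fetch starting at start_slot in a cache line.

    Toy model: slots are instruction positions within a single cache line
    (assumed 64-byte line, 4-byte instructions, so 16 slots).
    """
    # Edge effect: the fetch cannot run past the end of the cache line.
    limit = min(width, line_slots - start_slot)
    # A taken branch is the last useful instruction in the fetch group.
    if taken_branch_slot is not None and taken_branch_slot >= start_slot:
        limit = min(limit, taken_branch_slot - start_slot + 1)
    return limit


print(fetched_per_cycle(0))                       # full-width fetch
print(fetched_per_cycle(14))                      # cut short by the line edge
print(fetched_per_cycle(4, taken_branch_slot=6))  # cut short by a taken branch
```

Under these assumptions, a branch target that lands near the end of a line, or a taken branch early in the group, leaves the fetch less than four wide. That is the loss a trace cache or loop buffer is meant to recover.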
secretmanofagent - Wednesday, December 11, 2013
Getting away from the technical questions, I'm interested in these two.
ARM has been used in many different devices. What do you consider the most innovative use of what you designed, possibly something that was outside of how you originally envisioned it being used?
As a creator, which devices made you look at what you created and feel the most pride?
Peter Greenhalgh - Wednesday, December 11, 2013
Hi,
I'd suggest all of us who work for ARM are proud that the vast majority of mobile devices use ARM technology!
Some of the biggest innovations with ARM devices are coming in the Internet of Things (IoT) space, which isn't as technically complex from a processor perspective as high-end mobile or server, but is a space that will certainly affect our everyday lives.
NeBlackCat - Wednesday, December 11, 2013
> Do a good job here and I might be able to even convince him to give away some ARM powered goodies
What's his favourite type of sausage?
Since, as any halfwit should be able to work out from the above spelling, I've got bugger all chance of being eligible for giveaways.