43 Comments
Mr.AMD - Monday, August 1, 2016 - link
Truly good news for AMD Vega....
tsk2k - Monday, August 1, 2016 - link
Still not coming before 2017
ImSpartacus - Monday, August 1, 2016 - link
You're probably right. We could see little Vega in late 2016, but I doubt we'd see both Vegas until 2017
nandnandnand - Monday, August 1, 2016 - link
Who is arguing that Vega is coming in 2016? Polaris and Vega in the same year would be dumb.
ddriver - Monday, August 1, 2016 - link
Releasing midrange and highend in the same year would be dumb? That's pretty dumb ;)
bigboxes - Tuesday, August 2, 2016 - link
Can we just stop with the pissing matches?
ImSpartacus - Monday, August 1, 2016 - link
Just like gp102, gp104 and gp106 dropping in the same six month period? Hell, there's still time for gp107 for a clean sweep.
3ogdy - Monday, August 1, 2016 - link
Well, yeah that would be dumb. Especially considering they're not addressing the same markets, price-points and customers. Did I mention the all-new and shiny RX-480 gets beaten by 2013's 3.5GB GTX 970? AMD? Until Zen shows up, it feels like YawnMD.
Qasar - Monday, August 1, 2016 - link
According to Anand's own bench: http://www.anandtech.com/bench/product/1748?vs=174...
I wouldn't really call that a beating. They trade blows back and forth, and AnandTech puts the 480 (8 gigs of RAM) overall better than the 970, but the 4 gig version is slower, so then you would be correct... but not compared to the 8 gig version of the RX 480.
RussianSensation - Monday, August 1, 2016 - link
970 loses to the RX 480. https://www.youtube.com/watch?v=-h6GiFqPYN8
Once DX12/Vulkan games are added, and future games that will use > 3.5GB of VRAM, RX 480 4-8GB is a far superior graphics card to the 970.
But the part that's flawed in your entire analysis is that RX 480 is the R9 380 successor and RX 490 will be the R9 390/390X successor. Therefore, your comparison of RX 480 to 970 is comparing GPUs from entirely different classes. Seems you have been reading too much [H]. RX 480 is a GTX 1060 competitor, and since the 1070 replaced the 970, AMD hasn't released the 1070's competitor yet (that should be the RX 490).
Alexvrb - Monday, August 1, 2016 - link
Not a big fan of KB himself but Brent Justice's review of the 1060 on [H] was pretty decent. It was even-handed not only in terms of numbers but the language of the review and the conclusion.
Roland00Address - Tuesday, August 2, 2016 - link
>Did I mention the all-new and shiny RX-480 gets beaten by 2013's 3.5GB GTX 970
AnandTech has the GTX 970 launch being September 2014
http://www.anandtech.com/show/8526/nvidia-geforce-...
Thus your point about the gtx970s and their launch date is very much hyperbole
Flunk - Monday, August 1, 2016 - link
There are a lot of nutcases out there.
ImSpartacus - Monday, August 1, 2016 - link
Yeah, big Vega could take three of those 1.6GT/s stacks and yield the same capacity with 25ish% more bandwidth than the Titan X.
And then little Vega could use two stacks, but that would have 25ish% more bandwidth than the 1080.
I'm wondering if amd will underclock it a little bit. They generally need more bandwidth than Nvidia, but not THAT much more.
extide - Monday, August 1, 2016 - link
Little Vega: 2x 4GB stacks, 8GB, 2048-bit, 408-512GB/sec
Big Vega: 4x 4GB stacks, 16GB, 4096-bit, 816-1024GB/sec
Seems like you could build a nice little GPU with 2 of those stacks...
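For reference, the bandwidth figures traded in the comments above can be checked with a little arithmetic. The sketch below is not from any commenter. It assumes a 1024-bit interface per HBM2 stack and per-pin rates of 1.6 and 2.0 GT/s, and for comparison it assumes 10 Gbps GDDR5X on a 256-bit bus (GTX 1080) and on a 384-bit bus (Titan X). The function and constant names are made up for illustration.

```python
# Peak memory bandwidth for the configurations discussed in this thread.
# Assumptions (not stated by the commenters): 1024 data pins per HBM2 stack,
# 10 Gbps GDDR5X on the GTX 1080 (256-bit) and the Titan X (384-bit).

HBM2_BITS_PER_STACK = 1024


def hbm2_bandwidth_gbs(stacks, rate_gtps):
    """Peak bandwidth in GB/s for `stacks` HBM2 stacks at `rate_gtps` GT/s per pin."""
    return stacks * HBM2_BITS_PER_STACK * rate_gtps / 8


def gddr_bandwidth_gbs(bus_bits, rate_gbps):
    """Peak bandwidth in GB/s for a GDDR bus of `bus_bits` at `rate_gbps` per pin."""
    return bus_bits * rate_gbps / 8


def percent_more(a, b):
    """How much larger a is than b, in percent."""
    return (a / b - 1) * 100


gtx_1080 = gddr_bandwidth_gbs(256, 10.0)
titan_x = gddr_bandwidth_gbs(384, 10.0)

for stacks in (2, 3, 4):
    low = hbm2_bandwidth_gbs(stacks, 1.6)
    high = hbm2_bandwidth_gbs(stacks, 2.0)
    print(f"{stacks} stacks: {low:.1f}-{high:.1f} GB/s")

print(f"2 stacks @ 1.6 GT/s vs GTX 1080: "
      f"{percent_more(hbm2_bandwidth_gbs(2, 1.6), gtx_1080):.0f}% more")
print(f"3 stacks @ 1.6 GT/s vs Titan X:  "
      f"{percent_more(hbm2_bandwidth_gbs(3, 1.6), titan_x):.0f}% more")
```

Under those assumptions two stacks land at 409.6 to 512 GB/s and four stacks at 819.2 to 1024 GB/s, which is where the rounded 408-512 and 816-1024 figures come from, and both "25ish%" comparisons work out to about 28%.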
ImSpartacus - Monday, August 1, 2016 - link
Yeah, I think that's why little Vega exists. It's definitely looking like it'll have two stacks and compete with gp104.
And looking at bandwidth, it looks like AMD could use the 1.6GT/s variants and still have buckets of bandwidth to spare.
Regarding big Vega, do you think big Vega will be twice (or nearly twice) as large as little Vega?
I would've assumed that big Vega would be more like 1.5x little Vega (just like gp102 is 1.5x gp104), but that would require 3 stacks of hbm and that doesn't appear to be a possible configuration. That's a shame as it would be pretty much perfect (same capacity as non-clamshell 8Gb gp102), but still copious amounts of bandwidth.
RaichuPls - Monday, August 1, 2016 - link
1060 review? RX480 deep dive?
gijames1225 - Monday, August 1, 2016 - link
Hopefully this paves the way to upper-end AMD Zen APUs having some HBM on-die for the graphics side of things.
haukionkannel - Monday, August 1, 2016 - link
The really interesting thing is how much these memories cost compared to GDDR5 and GDDR5X. If it is substantially more expensive, as one could guess, then the end product is not very affordable either.
But good to know that the memory is on schedule, and that means that we can see products based on it in the next year.
stardude82 - Monday, August 1, 2016 - link
I think you mean on-package rather than on-die. Intel Iris Pro graphics and eDRAM operate off-die on-package.
wumpus - Tuesday, August 9, 2016 - link
I'd be even more partial to an "APUless" Zen with HBM for general CPU work. The APU only makes sense if they can get *anybody* on the heterogeneous programming train, and that still isn't happening (while PhysX does fine), or you are using a stripped down device (presumably mobile and Chrome OS level low).
Mostly the idea is that if you have plenty of cores (especially for a Zen server, but that's dreaming) you will have more pressure than the [on die] caches can handle, and thus more bandwidth than an ordinary DDR4 bus can handle. Hopefully HBM2 to the rescue.
While I'm really dreaming, instead of HBM misses going to DDR4, use 3D XPoint for main memory (like Intel won't block Micron from selling it for that. But given time...). 64GB of "main storage" that is part virtual RAM and part disk cache would really upset the memory hierarchy.
- further notes:
Unless things have changed, calling HBM2 "on die" is really deceptive. It *is* on die, but on an adapter die that both the HBM RAM and GPU[/APU/CPU] dice are bonded to.
The fact that the stuff is called "high bandwidth memory" and not "low latency memory" doesn't mean it makes a bad cache. It might need long cache lines or otherwise multiple accesses at once, but that is to be expected on a multi-core CPU. And while an L1/L2/L3 might be all about low latency, the point of your L4 is about reducing bandwidth to "main memory" (DRAM/3D XPoint/SLC?).
LarsBars - Monday, August 1, 2016 - link
Could AMD theoretically make 8GB versions of Fiji using HBM2 in 2016? Similar to the way that GP104's memory controller can do GDDR5 and GDDR5X, could we feasibly see an RX 490 with Fiji and 8GB HBM2? I think one of the Fury lineup's biggest drawbacks was memory capacity. Just curious.
DanNeely - Monday, August 1, 2016 - link
In theory yes; but doing so would require major amounts of design work and revalidation in manufacturing. Unless they started it a number of months ago, by the time they finish they'll be right on top of the planned Vega launch dates. The ideal time to have done this would've been when Fiji was first being designed a few years ago.
LarsBars - Monday, August 1, 2016 - link
Ahh gotcha. Thanks for the explanation. Bring on Vega!
Eden-K121D - Tuesday, August 2, 2016 - link
The HBM1 standard was updated to support 2GB stacks, so yes, a 4*2 GB configuration is possible
haukionkannel - Tuesday, August 2, 2016 - link
Also, HBM2 is more expensive than HBM1, and Fiji is not very badly memory starved. So it's better to increase performance with Vega and add more memory for future games. But it should be possible. I'm not sure that it would even be difficult: if there are as many memory stacks as in the normal Fury, increasing the height of the memory stacks could be as easy as a drop-in replacement.
ChefJeff789 - Monday, August 1, 2016 - link
I would love to see AMD create a Zen APU with a couple of stacks of HBM2. As to the market this would target, I'm not totally certain, but I've long wanted to see Intel or AMD create a compelling product that integrates everything into a single package much like a console does already. If the accompanying GPU is good, something like this would replace every set-top box I own along with my desktop/Plex server
utroz - Monday, August 1, 2016 - link
I can say 100% AMD is going to have Zen APUs with HBM2 and has had this planned for a long time.
Laststop311 - Monday, August 1, 2016 - link
Zen APUs with GPUs large enough for 1080p 60 fps at high settings will take a huge bite out of the discrete GPU market, which AMD is already losing, so they really have nothing to lose by having the APU cannibalize the discrete market. Nvidia has far more to lose from that.
tarqsharq - Tuesday, August 2, 2016 - link
2GB of HBM, not even HBM2, would be amazing for an APU.
ChefJeff789 - Tuesday, August 2, 2016 - link
I actually think this is one of the only ways AMD can make a convincing comeback. I doubt I would consider buying anything else if it was a solid implementation. One chip, one cooling solution. Throw it into a solid motherboard and add an SSD. Nvidia and Intel have nothing to compete with something like that and it could finally force them to compete at the low end. My hope is a ~470 level Polaris on-die with Zen. We'll see next year
haukionkannel - Tuesday, August 2, 2016 - link
I also believe that HBM with Zen could be really good for very small HTPCs. It would be slower than Intel in office programs, but much faster in games. So when a discrete graphics card isn't needed, it could be really useful. Of course a bigger computer with a discrete GPU would be much faster, but what do people do with an HTPC? Watch videos, play party games, surf the web occasionally. Zen with HBM could be perfect for that.
Drumsticks - Tuesday, August 2, 2016 - link
That would be one massive CPU die - if AMD made a GPU on the SoC that was fully enabled with 32 CUs (470), it would surely be at least 160mm^2. Add in the CPU die, and HBM, and there'd be something far bigger than what we're used to seeing (in consumer parts). Given the size of AM4, would they actually be able to do that?
abufrejoval - Tuesday, August 2, 2016 - link
Is that finally the dawn of a new type of PC?
IMHO APUs never made much sense as long as their graphics performance was so limited by the ordinary DDR3/DDR4 DRAM bandwidth (except perhaps if AMD had also been using tile-based rendering, which is under parallel discussion around here). And a discrete GPU just made 60% of the die space obsolete and HSA far less efficient.
Putting the GPU on the CPU die is all wrong; you’ve got to do the reverse, adding CPU and the rest of the SoC on the GPU.
The PS4 chip shows the right path by having all RAM be GDDR5.
HBM2 could make that technically feasible and even economical: At 8GB/stack we're talking 32GB per SoC and if that's not enough you'll go SMP/SLI/Stereo for VR or HPC.
And once your die stacking is truly mature, HBM should actually be more economical than ordinary DRAM because you can “outsource” a lot of the amplification work to the base chip.
Perhaps even some logic?
Depending on the level of Integration available on Zen variants I can easily see single die carrier PCs with little more than some PHYs and connectors on the so called motherboard.
Discrete-DRAM-be-damned, the entire "motherboard" would just fit below a nice large and quietly blowing cooler: The NUC would be a Nano!
There could still be variants with PCIe or perhaps just PCIe backplanes for the "motherboard" for those going SMP or wanting to add some odd PCIe add-in, but with a couple of USB 3.1 ports most desktops should be fine.
I guess essentially the desktop PC would have just caught up with the mobile variant but suddenly everything under my desk seems like a dinosaur.
Raniz - Tuesday, August 2, 2016 - link
Main problem with a SoC PC is that you lose the possibility to do partial upgrades. My 5 year old Sandy Bridge i7 can still keep up today and 16GB of RAM is still plenty enough, but that GTX 560 is really weak.
An APU with integrated memory would be fairly expensive and you have to replace it completely once it's time to upgrade the graphics card even though the memory and CPU parts of it may still be enough. It's wasteful and costly.
haukionkannel - Tuesday, August 2, 2016 - link
Of course, if we are talking about a computer big enough that you can replace parts, it is better to have a discrete GPU, but if we are talking about small hand-sized computers, then there just aren't any alternatives.
abufrejoval - Tuesday, August 2, 2016 - link
The HBM story is much about flexibility vs. fixed function/allocation. You pay for the flexibility of external DRAM in latency and with lots of extra energy.
Ultimately there must be balance between memory capacity and CPU power, GPU power and memory bandwidth. That's why I'd still want to be able to put multiple SoCs together to create an SMP system where I want VR and/or higher resolution: It would give more of everything.
And someone might still take the last gen APU off your hands, should you believe an upgrade is required.
Roland00Address - Tuesday, August 2, 2016 - link
What is the power consumption of HBM2 compared to DDR4? Compared to GDDR5?
supdawgwtfd - Tuesday, August 2, 2016 - link
Lower. It's one of the reasons AMD created it.
DanNeely - Tuesday, August 2, 2016 - link
I'm not sure where it stands vs standard DDR; but HBM is significantly more power efficient than GDDR. OTOH each new generation of GPU pushes memory harder and pushes up the power consumption again. IIRC the bars from this Nvidia slide represent the RAM used with successive generations of GPUs; and while HBM reduced power enough to delay it by a few generations there's still a looming power crisis threatening future generations of GPUs. My suspicion is that the unspecified next generation memory on the AMD slide about the upcoming Navi GPU will probably be an HBM2 successor redesigned to be more power efficient within the memory dies themselves (HBM's savings come from massively reducing data bus power).
http://www.3dcenter.org/image/view/9634/_original
Roland00Address - Tuesday, August 2, 2016 - link
Thank you. I am still curious about DDR4 power usage, since that is what you would need to compare against if you are considering an APU with HBM2 vs. an APU with DDR4.
DanNeely - Tuesday, August 2, 2016 - link
2 years ago it looks like DDR4 was ~1.5W per 4GB (fairly consistent between 4 and 8GB DIMMs and across several manufacturers and data rates). I think we've had at least one DRAM process shrink since then so current numbers are probably a good bit lower.
http://www.tomshardware.com/reviews/intel-core-i7-...
abufrejoval - Friday, August 5, 2016 - link
I've found it much harder to find information on this than I thought: You can see "40% less" quotes everywhere, but absolute values are hard to find.
I guess one of the issues is that DRAM power consumption actually may not be constant: During reading and writing it will most likely use more and evidently there are also energy savings possible, because otherwise suspend to RAM wouldn't make sense.
The other day I plugged 128GB of DDR4 (8 DIMMs) into my latest Xeon and was shocked when I ran across a line in my CPUID HWmonitor on energy consumption:
While the 12 core Xeon E5-2680v3 went to around 10 Watts on idle, the DRAM was listed as using 60 Watts on idle and 120 Watts during Prime95.
The 4GHz E3-1280v3 with 32GB of DDR3 DRAM (4 modules) right next to it would go to something like 4 Watts on idle for the RAM, a figure much more in line with my expectations.
I don't actually know how and with what level of exactness the figures are measured, but they could well be true and reflect distinct behavior by the memory controllers on the CPUs.
Even if it's a "Xeon" the latter system is essentially a desktop developed out of a mobile blueprint while the first is clearly a server chip. And while even servers support power saving features these days, they may not be as aggressive about it.
Both use ECC RAM but unbuffered.
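The DRAM power numbers quoted in this sub-thread are easier to compare once they are normalized to watts per gigabyte. The sketch below is only a rough illustration: it uses just the figures the commenters reported (which, as noted above, are software readouts of unknown accuracy), and the labels and helper names are invented for the example.

```python
# Normalize the DRAM power figures quoted in the thread to watts per GB.
# All inputs are the commenters' own numbers; none are measurements of mine.

def watts_per_gb(watts, capacity_gb):
    """Power per gigabyte for a reported total power and capacity."""
    return watts / capacity_gb


def scale_to_capacity(watts, capacity_gb, target_gb):
    """Linear extrapolation of a reading to another capacity (a crude assumption)."""
    return watts_per_gb(watts, capacity_gb) * target_gb


readings = [
    ("DDR4 estimate, ~1.5 W per 4 GB", 1.5, 4),
    ("DDR4 128 GB on Xeon E5-2680v3, idle", 60.0, 128),
    ("DDR4 128 GB on Xeon E5-2680v3, Prime95", 120.0, 128),
    ("DDR3 32 GB on E3-1280v3, idle", 4.0, 32),
]

for label, watts, capacity_gb in readings:
    print(f"{label:<42} {watts_per_gb(watts, capacity_gb):.3f} W/GB")

# What the ~1.5 W per 4 GB estimate would imply for a 128 GB system.
print(f"1.5 W per 4 GB scaled to 128 GB: {scale_to_capacity(1.5, 4, 128):.0f} W")
```

Scaled linearly, the ~1.5 W per 4 GB estimate implies about 48 W for 128 GB, so the 60 W idle readout on the E5 system is at least in the same range, while the 4 W idle figure for 32 GB of DDR3 is roughly a third of that per gigabyte.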