FP64 Performance and Separating Radeon VII from Radeon Instinct MI50

One of the interesting and amusing consequences of the Radeon VII launch is that for the first time in quite a while, AMD has needed to seriously think about how they’re going to differentiate their consumer products from their workstation/server products. While AMD has continued to offer workstation and server hardware via the Radeon Pro and Radeon Instinct series, the Vega 20 GPU is AMD’s first real server-grade GPU in far too long. So, while those products were largely differentiated by the software features added to their underlying consumer-grade GPUs, Radeon VII brings some new features that aren’t strictly necessary for consumers.

It may sound like a trivial matter – clearly AMD should just leave everything enabled – but as the company is trying to push into the higher margin server business, prosumer products like the Radeon VII are in fact a tricky proposition. AMD needs to lock away enough of the server functionality of the Vega 20 GPU that they aren’t selling the equivalent of a Radeon Instinct MI50 for a fraction of the price. On the other hand, it’s in their interest to expose some of these features in order to make the Radeon VII a valuable card in its own right (one that can justify a $699 price tag), and to give developers a taste of what AMD’s server hardware can do.

Case in point is the matter of FP64 performance. As we noted in our look at the Vega 20 GPU, Vega 20’s FP64 performance is very fast: it’s one-half the FP32 rate, or 6.9 TFLOPS. This is one of the premium features of Vega 20, and since Radeon VII was first announced back at CES, the company has been struggling a bit to decide how much of that performance to actually make available to the Radeon VII. At the time of its announcement, we were told that the Radeon VII would have unrestricted (1/2) FP64 performance, only to later be told that it would be 1/8. Now, with the actual launch of the card upon us, AMD has made their decision: they’ve split it down the middle and are doing a 1/4 rate.

Looking to clear things up, AMD put out a statement:

The Radeon VII graphics card was created for gamers and creators, enthusiasts and early adopters. Given the broader market Radeon VII is targeting, we were considering different levels of FP64 performance. We previously communicated that Radeon VII provides 0.88 TFLOPS (DP=1/16 SP). However based on customer interest and feedback we wanted to let you know that we have decided to increase double precision compute performance to 3.46 TFLOPS (DP=1/4 SP).

If you looked at FP64 performance in your testing, you may have seen this performance increase as the VBIOS and press drivers we shared with reviewers were pre-release test drivers that had these values already set. In addition, we have updated other numbers to reflect the achievable peak frequency in calculating Radeon VII performance as noted in the [charts].

The end result is that while the Radeon VII won’t be as fast as the MI60/MI50 when it comes to FP64 compute, AMD is going to offer the next best thing, just one step down from those cards.
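
To see where these figures come from, the theoretical peak can be worked out from the shader count and clock. The following is a minimal Python sketch, not AMD's own methodology. It assumes each stream processor retires one fused multiply-add (2 FLOPs) per clock, and a peak engine clock of roughly 1800MHz, which is what the 13.8 TFLOPS single precision figure implies for 3840 stream processors. The function name `peak_tflops` is purely illustrative.

```python
def peak_tflops(stream_processors: int, clock_mhz: float, rate: float = 1.0) -> float:
    """Theoretical peak TFLOPS, assuming one FMA (2 FLOPs) per stream processor per clock.

    `rate` scales the FP32 result, e.g. 0.25 for a 1/4-rate FP64 mode.
    """
    fp32_tflops = stream_processors * 2 * clock_mhz * 1e6 / 1e12
    return fp32_tflops * rate


STREAM_PROCESSORS = 3840   # 60 CUs x 64 stream processors per CU
PEAK_CLOCK_MHZ = 1800      # assumption: implied by the 13.8 TFLOPS FP32 figure

print(f"FP32: {peak_tflops(STREAM_PROCESSORS, PEAK_CLOCK_MHZ):.2f} TFLOPS")
for label, rate in [("1/2", 1 / 2), ("1/4", 1 / 4), ("1/16", 1 / 16)]:
    fp64 = peak_tflops(STREAM_PROCESSORS, PEAK_CLOCK_MHZ, rate)
    print(f"FP64 at {label} rate: {fp64:.2f} TFLOPS")
```

Under those assumptions the 1/4 rate works out to about 3.46 TFLOPS, in line with AMD's revised number, while the full 1/2 rate lands at about 6.9 TFLOPS.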

At 3.5 TFLOPS of theoretical FP64 performance, the Radeon VII is in a league of its own for the price. There simply aren’t any other current-generation cards priced below $2000 that even attempt to address the matter. All of NVIDIA’s GeForce cards and all of AMD’s other Radeon cards straight-up lack the necessary hardware for fast FP64. The next closest competitor to the Radeon VII in this regard is NVIDIA’s Titan V, at more than 4x the price.

It’s admittedly a bit of a niche market, especially when so much of the broader industry focus is on AI and neural network performance. But there are nonetheless going to be some very happy data scientists out there, especially among academics.

AMD Server Accelerator Specification Comparison

| | Radeon VII | Radeon Instinct MI50 | Radeon Instinct MI25 | FirePro S9170 |
| Stream Processors | 3840 (60 CUs) | 3840 (60 CUs) | 4096 (64 CUs) | 2816 (44 CUs) |
| ROPs | 64 | 64 | 64 | 64 |
| Base Clock | 1450MHz | 1450MHz | 1400MHz | - |
| Boost Clock | 1750MHz | 1746MHz | 1500MHz | 930MHz |
| Memory Clock | 2.0Gbps HBM2 | 2.0Gbps HBM2 | 1.89Gbps HBM2 | 5Gbps GDDR5 |
| Memory Bus Width | 4096-bit | 4096-bit | 2048-bit | 512-bit |
| Half Precision | 27.6 TFLOPS | 26.8 TFLOPS | 24.6 TFLOPS | 5.2 TFLOPS |
| Single Precision | 13.8 TFLOPS | 13.4 TFLOPS | 12.3 TFLOPS | 5.2 TFLOPS |
| Double Precision | 3.5 TFLOPS (1/4 rate) | 6.7 TFLOPS (1/2 rate) | 0.77 TFLOPS (1/16 rate) | 2.6 TFLOPS (1/2 rate) |
| DL Performance | ? | 53.6 TFLOPS | 12.3 TFLOPS | 5.2 TFLOPS |
| VRAM | 16GB | 16GB | 16GB | 32GB |
| ECC | No | Yes (full-chip) | Yes (DRAM) | Yes (DRAM) |
| Bus Interface | PCIe Gen 3 | PCIe Gen 4 | PCIe Gen 3 | PCIe Gen 3 |
| TDP | 300W | 300W | 300W | 275W |
| GPU | Vega 20 | Vega 20 | Vega 10 | Hawaii |
| Architecture | Vega (GCN 5) | Vega (GCN 5) | Vega (GCN 5) | GCN 2 |
| Manufacturing Process | TSMC 7nm | TSMC 7nm | GloFo 14nm | TSMC 28nm |
| Launch Date | 02/07/2019 | 09/2018 | 06/2017 | 07/2015 |
| Launch Price (MSRP) | $699 | - | - | $3999 |

Speaking of AI, it should be noted that machine learning performance is another area where AMD is throttling the card. Unfortunately, more details aren’t available at this time. But given the unique needs of the ML market, I wouldn’t be surprised to find that INT8/INT4 performance is held back a bit on the Radeon VII. Or for that matter certain FP16 dot products.

Also on the chopping block is full-chip ECC support. Thanks to the innate functionality of HBM2, all Vega cards already have free ECC for their DRAM. However Vega 20 takes this one step further with ECC protection for its internal caches, and this is something that the Radeon VII doesn’t get access to.

Finally, Radeon VII also cuts back a bit on Vega 20’s off-chip I/O features. Though AMD hasn’t made a big deal of it up to now, Vega 20 is actually their first PCI-Express 4.0-capable GPU, and this functionality is enabled on the Radeon Instinct cards. However for Radeon VII, this isn’t being enabled, and the card is being limited to PCIe 3.0 speeds (so future Zen 2 buyers won’t quite have a PCIe 4.0 card to pair with their new CPU). Similarly, the external Infinity Fabric links for multi-GPU support have been disabled, so the Radeon VII will only be a solo act.
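
To put the PCIe limitation in rough numbers, here is a small Python sketch of per-direction link bandwidth. It assumes a full x16 link and uses the standard per-lane signaling rates (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0) with 128b/130b encoding. The helper name `pcie_bandwidth_gbs` is hypothetical, and real-world throughput will be lower once protocol overhead is accounted for.

```python
def pcie_bandwidth_gbs(gt_per_s: float, lanes: int = 16) -> float:
    """Approximate one-direction bandwidth in GB/s for a 128b/130b-encoded PCIe link."""
    encoding_efficiency = 128 / 130          # 128 payload bits per 130 bits on the wire
    return gt_per_s * encoding_efficiency * lanes / 8   # bits -> bytes


gen3 = pcie_bandwidth_gbs(8.0)     # PCIe 3.0: 8 GT/s per lane
gen4 = pcie_bandwidth_gbs(16.0)    # PCIe 4.0: 16 GT/s per lane

print(f"PCIe 3.0 x16: {gen3:.2f} GB/s per direction")
print(f"PCIe 4.0 x16: {gen4:.2f} GB/s per direction")
```

In other words, the Radeon VII gives up roughly half of the host link bandwidth that the same silicon offers on the Radeon Instinct cards.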

On the whole, there’s nothing very surprising about AMD’s choices here, especially given Radeon VII’s target market and target price. But these are notable exclusions that are going to matter to certain users. And if those exclusions don’t drive such users towards a Radeon Instinct, then they’re sure to drive them towards the inevitable Vega 20-powered Radeon Pro.




  • peevee - Tuesday, February 12, 2019 - link

    "that the card operates at a less-than-native FP64 rate"

    The chip is capable of 2 times higher f64 performance. Marketoids must die.
  • FreckledTrout - Thursday, February 7, 2019 - link

    Performance wise it did better than I expected. This card is pretty loud and runs a bit hot for my tastes. Nice review. Where are the 8K and 16K tests :)
  • IGTrading - Thursday, February 7, 2019 - link

    When drivers mature, AMD Radeon VII will beat the GF 2080.

    Just like Radeon Fury X beats the GF 980 and Radeon Vega 64 beats the GF 1080.

    When drivers mature and nVIDIA's blatant sabotage against its older cards (and AMD's cards) gets mitigated, the long time owner of the card will enjoy better performance.

    Unfortunately, on the power side, nVIDIA still has the edge, but I'm confident that those 16 GB of VRAM will really show their worth in the following year.
  • cfenton - Thursday, February 7, 2019 - link

    I'd rather have a card that performs better today than one that might perform better in two or three years. By that point, I'll already be looking at new cards.

    This card is very impressive for anyone who needs FP64 compute and lots of VRAM, but it's a tough sell if you primarily want it for games.
  • Benjiwenji - Thursday, February 7, 2019 - link

    AMD cards have traditionally aged much better than Nvidia. Gamers Nexus just re-benchmarked the 290x from 2013 on modern games and found it comparable to the 980, 1060, and 580.

    The GTX 980 came late 2014 with a $550USD tag, now struggles on 1440p.

    Not to mention that you can get a lot out of AMD cards if you're willing to tinker. My 56, which I got from Microcenter in Nov 2017 for $330 (total steal), now performs at 1080 level after a BIOS flash + OC.
  • eddman - Friday, February 8, 2019 - link

    What are you talking about? GTX 980 still performs as it should at 1440.

  • Icehawk - Friday, February 8, 2019 - link

    My 970 does just fine too, I can play 1440p maxed or near maxed in everything - 4k in older/simpler games too (ie, Overwatch). I was planning on a new card this gen for 4k but pricing is just too high for the gains, going to hold off one more round...
  • Gastec - Tuesday, February 12, 2019 - link

    That's because, as the legend has it, Nvidia is or was in the past gimping their older generation cards via drivers.
  • kostaaspyrkas - Sunday, February 10, 2019 - link

    in same frame rates nvidia gameplay gives me a sense of choppiness...amd radeon more fluid gameplay...
  • yasamoka - Thursday, February 7, 2019 - link

    This wishful in-denial conjecture needs to stop.

    1) AMD Radeon VII is based on the Vega architecture which has been on the platform since June 2017. It's been about 17 months. The drivers had more than enough time to mature. It's obvious that in certain cases there are clear bottlenecks (e.g. GTA V), but this seems to be the fundamental nature of AMD's drivers when it comes to DX11 performance in some games that perform a lot of draw calls. Holding out for improvements here isn't going to please you much.

    2) The Radeon Fury X was meant to go against the GTX 980Ti, not the GTX 980. The Fury, being slightly under the Fury X, would easily cover the GTX 980 performance bracket. The Fury X still doesn't beat the GTX 980Ti, particularly due to its limited VRAM where it even falls back in performance compared to the RX480 8GB and its siblings (RX580, RX590).

    3) There is no evidence of Nvidia's sabotage against any of its older cards when it comes to performance, and frankly your dig against GameWorks "sabotaging" AMD's cards performance is laughable when the same features, when enabled, also kill performance on Nvidia's own cards. PhysX has been open-source for 3 years and has now moved on to its 4th iteration, being used almost universally now in game engines. How's that for vendor lockdown?

    4) 16GB of VRAM will not even begin to show their worth in the next year. Wishful thinking, or more like licking up all the bad decisions AMD tends to make when it comes to product differentiation between their compute and gaming cards. It's baffling at this point that they still didn't learn to diverge their product lines and establish separate architectures in order to optimize power draw and bill of materials on the gaming card by reducing architectural features that are unneeded for gaming. 16GB are unneeded, 1TB/s of bandwidth is unneeded, HBM is expensive and unneeded. The RTX 2080 is averaging higher scores with half the bandwidth, half the VRAM capacity, and GDDR6.

    The money is in the gaming market and the professional market. The prosumer market is a sliver in comparison. Look at what Nvidia do, they release a mere handful of mascots every generation, all similar to one another (the Titan series), to take care of that sliver. You'd think they'd have a bigger portfolio if it were such a lucrative market? Meanwhile, on the gaming end, entire lineups. On the professional end, entire lineups (Quadro, Tesla).

    Get real.
