If you live by the workstation, you die by the performance. When it comes to processing data, throughput is key: the more a user can do, the more projects are accomplished, and the more contracts can be completed. This means that workstation users are often compute bound, and like to throw resources at the problem, be it cores, memory, storage, or graphics acceleration. AMD’s latest foray into the mix is its second generation Threadripper product, also known as Threadripper 2, which breaks the old limit on cores and pricing: the 2990WX gives 32 cores and 64 threads for only $1799. There is also the 2950X, with 16 cores and 32 threads, for a new low of $899. We tested them both.

The AMD Threadripper 2990WX 32-Core and 2950X 16-Core Review

Ever since AMD launched its first generation Ryzen product, with eight cores up against Intel’s four cores in the mainstream, the discussion has been all about how many cores make sense. The answer to this question is entirely workload dependent: how many users have a single workload in mind, and how many will use a variety of tools simultaneously? The workstation market encompasses a wide range of distinct power users, and despite the need for speed, there is rarely a one-size-fits-all solution.

AMD’s first generation of Threadripper, launched in 2017, introduced 16-core processors to the masses. Core counts this high were previously only available on server platforms, and these new parts were priced very competitively against 10-core offerings. AMD had essentially used its server platform, with a few tweaks, to attack a competitive landscape where halo products are seen as king-of-the-hill.

Intel’s own workstation products, previously sold under names such as E5-2687W and reliant on dual-socket server platforms, were literally that: servers. After launching its latest high-end desktop platform, with up to 18 cores, Intel subsequently launched the Xeon W-series, which replaced the E5-W parts from the previous generation. Again, these offered up to 18 cores for ~$2500, but required special chipsets and motherboards.

Today AMD is officially putting its second generation of Threadripper on sale. These new parts attack the market two-fold. The first attack is the improved Zen+ microarchitecture, giving a 3% IPC increase in core performance, combined with a move to 12nm that drives up frequencies and reduces power. The second attack is core count: while AMD will be replacing the 12 and 16 core processors with new Zen+ models at higher frequencies, AMD also has 24 and 32 core processors for up to $1799. When comparing 32 cores at $1799 against 18 cores at $2500, it seems like a slam dunk, right?
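To put a number on that comparison, here is a minimal sketch of the price-per-core arithmetic. It uses only the two price and core-count pairs quoted above; the ~$2500 Intel figure is approximate, and the function name `price_per_core` is ours for illustration.

```python
# Price-per-core comparison using only the figures quoted in the text.
# The Intel price is the approximate (~$2500) figure, so treat the result
# as a rough guide rather than an exact ratio.

def price_per_core(price_usd: float, cores: int) -> float:
    """Return the list price divided by the physical core count."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return price_usd / cores


amd = price_per_core(1799, 32)    # 32-core part at $1799
intel = price_per_core(2500, 18)  # 18-core part at ~$2500

print(f"AMD 32-core:   ${amd:.2f} per core")
print(f"Intel 18-core: ${intel:.2f} per core")
print(f"Ratio (Intel / AMD): {intel / amd:.2f}x")
```

That works out to roughly $56 per core against roughly $139 per core. Price per core says nothing about how well those cores are fed, which is where the rest of this review comes in.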

How AMD Enabled 32 Cores

The first generation server processor line from AMD, called EPYC, uses four silicon dies of eight cores each to reach the full 32-core product. These parts also had eight memory channels and 128 lanes of PCIe 3.0 to play with. In order to make the first generation Threadripper processors, AMD disabled two of those silicon dies, giving only 16 cores, four memory channels, and 60 lanes of PCIe. The end product was aimed at consumers, not server customers.

For 32 cores, AMD takes the same 32-core EPYC silicon, but upgrades it to Zen+ on 12nm for a higher frequency and lower power. However, to make it socket compatible with the first generation, it is slightly neutered: we have to go back to four memory channels and 60 lanes of PCIe. AMD wants users to think of this as an upgraded first generation product, with more cores, rather than a cut enterprise part. The easy explanation is to do with product segmentation, a tactic both companies have used over time to offer a range of products.

As a result, one way of envisioning the new second generation 32-core and 24-core products is as bi-modal: half the chip has access to the full resources, similar to the first generation product, while the other half of the chip doubles the compute resources but sees additional memory and PCIe latency compared to the first half. Any user who is entirely compute bound, and not memory or PCIe bound, will find that AMD has a product for them.

In our review, we’ll see that this bi-modal performance difference can have a significant effect, both good and bad, and is very workload dependent.

AMD’s New Product Stack

The official announcement last week showed that AMD is coming to market with four second generation Threadripper processors. Two of these will directly replace the first generation product: the 16-core 2950X will replace the 16-core 1950X, and the 12-core 2920X will replace the 12-core 1920X. These two new processors will not be bi-modal as explained above, as only two of the four silicon dies on the package are active (the 16-core is an 8+0+8+0 configuration, the 12-core is 6+0+6+0). Sitting at the bottom of the stack will be the first generation 8-core (4+0+4+0) 1900X, which also offers quad-channel memory and 60 PCIe lanes.

2017                          2018
-           -        $1799    TR 2990WX
-           -        $1299    TR 2970WX
TR 1950X    $999     $899     TR 2950X
TR 1920X    $799     $649     TR 2920X
TR 1900X    $549     -        -

The two new processors are the 32-core 2990WX and the 24-core 2970WX. They enable four cores per complex (8+8+8+8) and three cores per complex (6+6+6+6) respectively, and both are subject to the bi-modal memory and PCIe arrangement described above. The naming changes to WX, presumably for ‘Workstation eXtreme’, which puts the product in the same marketing line as the Radeon Pro WX family.
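The die layouts above can be modelled in a few lines. The sketch below parses each layout string into per-die core counts and estimates how many cores sit on a die with its own memory and PCIe. One assumption is ours rather than AMD's: that the directly attached dies are positions 0 and 2, inferred from the X-series layouts where only those two dies are active. The `Package` class and its property names are hypothetical.

```python
from dataclasses import dataclass

# Assumption: dies 0 and 2 are the ones wired to memory and PCIe, inferred
# from the X-series layouts (8+0+8+0, 6+0+6+0) where only those are active.
DIRECT_ATTACHED_DIES = (0, 2)
THREADS_PER_CORE = 2  # SMT, matching the cores/threads figures in the text


@dataclass(frozen=True)
class Package:
    name: str
    layout: str  # active cores per die, e.g. "8+0+8+0"

    @property
    def cores_per_die(self):
        return [int(n) for n in self.layout.split("+")]

    @property
    def cores(self):
        return sum(self.cores_per_die)

    @property
    def threads(self):
        return self.cores * THREADS_PER_CORE

    @property
    def direct_cores(self):
        """Cores on dies assumed to have their own memory and PCIe."""
        return sum(self.cores_per_die[i] for i in DIRECT_ATTACHED_DIES)

    @property
    def is_bimodal(self):
        """True when some cores must reach memory through another die."""
        return self.direct_cores < self.cores


PACKAGES = [
    Package("TR 2990WX", "8+8+8+8"),
    Package("TR 2970WX", "6+6+6+6"),
    Package("TR 2950X", "8+0+8+0"),
    Package("TR 2920X", "6+0+6+0"),
    Package("TR 1900X", "4+0+4+0"),
]

for p in PACKAGES:
    mode = "bi-modal" if p.is_bimodal else "uniform"
    print(f"{p.name:<10} {p.layout:<8} {p.cores:>2}C/{p.threads:>2}T  "
          f"{p.direct_cores:>2} cores direct-attached  ({mode})")
```

Under that assumption, the two WX parts come out with exactly half their cores directly attached, while every X-series part is uniform.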

AMD SKUs        Cores/    Base/Turbo   L3      DRAM      PCIe   TDP     SRP
                Threads   (GHz)                (1DPC)    lanes
TR 2990WX       32/64     3.0/4.2      64 MB   4x2933    60     250 W   $1799
TR 2970WX       24/48     3.0/4.2      64 MB   4x2933    60     250 W   $1299
TR 2950X        16/32     3.5/4.4      32 MB   4x2933    60     180 W   $899
TR 2920X        12/24     3.5/4.3      32 MB   4x2933    60     180 W   $649
Ryzen 7 2700X   8/16      3.7/4.3      16 MB   2x2933    16     105 W   $329
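One way to read the SKU table above is as a power budget per core. The sketch below divides each rated TDP by the core count. This is a naive upper bound and not a measurement: TDP is a package-level rating, and part of it goes to the uncore rather than the cores. The helper name `watts_per_core` is ours.

```python
# (name, cores, TDP in watts) copied from the SKU table.
SKUS = [
    ("TR 2990WX", 32, 250),
    ("TR 2970WX", 24, 250),
    ("TR 2950X", 16, 180),
    ("TR 2920X", 12, 180),
    ("Ryzen 7 2700X", 8, 105),
]


def watts_per_core(tdp_w: float, cores: int) -> float:
    """Naive per-core share of the package TDP (ignores uncore power)."""
    return tdp_w / cores


for name, cores, tdp in SKUS:
    print(f"{name:<14} {watts_per_core(tdp, cores):6.2f} W per core")
```

On these figures the 2990WX has the smallest per-core share of the rated TDP in the stack, which is consistent with its lower 3.0 GHz base frequency.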

The AMD Ryzen Threadripper 2990WX is the new halo product, with 32 cores and 64 threads coming in with a base frequency of 3.0 GHz and a top turbo frequency of 4.2 GHz. The idle frequency of this processor is 2.0 GHz, and when installed we saw 2.0 GHz on any core without work – it almost becomes the dominating frequency if the CPU isn’t constantly loaded. The 2990WX will be available from today and retail for $1799.

The other member of the WX series is the 2970WX, which disables one core per complex for a total of 24 cores. With similar frequencies to the 2990WX, and the same TDP, PCIe lanes, and memory support, this processor will be launched in October at the $1299 price point. With fewer cores being loaded, one might expect this processor to turbo more often than the bigger 32-core part.

For the X-series, the TR 2950X is our 16-core replacement, taking full advantage of the better frequencies that the new 12nm process can give: a base frequency of 3.5 GHz and a turbo of 4.4 GHz puts the previous generation processor to shame. In fact, the 2950X is set to be the joint highest clocked AMD Ryzen product. With that bump also comes a price drop: instead of $999 users can now get a 16-core processor for $899. The 2950X is due out at the end of the month, on August 31st.

Bringing up the rear is the 2920X, replacing the 1920X with a similar trade-off to the other parts. As with the 2950X, the frequencies are nice and high compared to last year, with a base frequency of 3.5 GHz and a turbo of 4.3 GHz. This is all within a thermal design power (TDP) of 180 W. AMD told us that the TDP ratings for Threadripper 2, in general, were fairly conservative, so it will be interesting to see how they hold up. The 2920X is also out in October, going for $649 retail.

In This Review

  1. AMD’s New Product Stack [this page]
  2. Core to Core to Core: Design Trade Offs
  3. Precision Boost 2, Precision Boost Overdrive
  4. Feed Me: Infinity Fabric Requires 6x Power
  5. Test Setup and Comparison Points
  6. Our New Testing Suite for 2018 and 2019
  7. HEDT Benchmarks: System Tests
  8. HEDT Benchmarks: Rendering Tests
  9. HEDT Benchmarks: Office Tests
  10. HEDT Benchmarks: Encoding Tests
  11. HEDT Benchmarks: Web and Legacy Tests
  12. Overclocking: 4.0 GHz for 500W
  13. Thermal Comparisons: Remember to Remove the CPU Cooler Plastic!
  14. Going Up Against EPYC: Frequency vs Memory Channels
  15. Conclusions: Not All Cores Are Made Equal

171 Comments

  • T1beriu - Monday, August 13, 2018 - link

    > We confirmed this with AMD, but for the most part the scheduler will load up the cores that are directly attached to memory first, before using the other cores. [...]

    It seems that Tomshardware says the opposite:

    >AMD continues working with Microsoft to route threads to the die with direct-attached memory first, and then spill remaining threads over to the compute dies. Unfortunately, the scheduler currently treats all dies as equal, operating in Round Robin mode. [...] According to AMD, Microsoft has not committed to a timeline for updating its scheduler.
  • Ian Cutress - Monday, August 13, 2018 - link

    Yeah, Paul and I were discussing this. It is a round robin mode, but it's weighted based on available resources, thermal performance, proximity of busy threads, etc.
  • JoeyJoJo123 - Monday, August 13, 2018 - link

    Maybe just user error, but all the article pages between Test Setup and Comparison Results to Going up Against Epyc, just have the text "Still writing...". I'm unsure if the article is actually still being written and was supposed to be published in this partial manner or if possible something was lost between writing and upload.

    In any case, kind of crazy how the infinity fabric is consuming so much power. The cores look super-efficient, but if the uncore can get efficiency improvements, that can help the Zen architecture stay even more efficient under load. Intel's uncore consumes a fraction of the wattage, but doesn't scale as well for multiple threads.
  • Ian Cutress - Monday, August 13, 2018 - link

    Still being written. See my comment at the top. Unfortunately travel back and forth from UK to SF bit me over the weekend and I lost a couple of days testing, along with having to take a full benchmark set up with me to SF to test in the hotel room.
  • JoeyJoJo123 - Monday, August 13, 2018 - link

    I understand, take your rest. You don't need to reply to me, I actually saw the reason after I posted.
  • compilerdev2 - Monday, August 13, 2018 - link

    Hi Ian,
    I have some questions about the Chromium compilation benchmark, since I was hoping to get the 2990WX for compiling large C++ apps. What version of Chromium is used? Is the compiler being used Clang-CL or Visual C++? Is the build in debug or release (optimized) mode? If it's release mode with Visual C++, does it use LTCG? (link-time code generation, the equivalent of LTO of gcc/clang). For example, if the build is Visual C++ LTCG, the entire code optimization, code generation and linking is by default limited to 4 threads. Thanks!
  • Ian Cutress - Monday, August 13, 2018 - link

    It's the standard Windows walkthrough available online. So we use a build of Chrome 62 (it was relevant when we pulled), VC++, build in release. It's done in the command line via ninja, and yes it does use LTCG.

    Destructions are here. They might be updated a little from when I wrote the benchmark. Our test is automated to keep consistency.

    https://chromium.googlesource.com/chromium/src/+/m...
  • compilerdev2 - Monday, August 13, 2018 - link

    With LTCG those strange results make sense - it's spending a lot of time on just 4 threads - actually majority of the time is on one thread for the Chromium case, it hits some current limitations of the VC++ compiler regarding CPU/memory usage that makes scaling worse for Chromium (but not for smaller programs or with non-LTCG builds). Increasing the number of threads from the default of 4 is possible, but will not help here. The frontend (parsing) work is well parallelized by Ninja, it's probably the reason why the Threadrippers do end up ahead of the faster single-core Intel CPUs. It would be interesting to see the benchmarks without LTCG, or even better, more compilation benchmarks, since these CPUs are really great for C/C++/Rust programmers.
  • Nexus-7 - Monday, August 13, 2018 - link

    Cool write-up on the uncore power usage! I especially enjoyed that part of the article.
  • johnny_boy - Monday, August 13, 2018 - link

    The Phoronix articles are more telling for the sort of workloads a 64 thread count would be used for.
