Ryzen 5000 and Zen 3: Benchmarking Performance Measures for CAD
21 Jan, 2021 By: Alex Herrera
Herrera on Hardware: This exercise provides some performance data for comparison, plus insights about examining a CPU’s clock rates.
After a long journey back to a leadership position in high-performance CPUs, AMD now presents the biggest threat to Intel’s workstation dominance in years. As covered in last month’s column, the dual introductions of the Zen 3 microarchitecture and Ryzen 5000 line represent the linchpin in AMD’s closing of the gap with the longtime market leader. The combination finally delivered on that one last metric that Intel had been using to hold off the resurgent challenger: single-thread (1T) performance.
CPUs are anything but one-size-fits-all in terms of technology, applications, and marketing. Zen-derived CPUs have caught and — at least for now — surpassed Intel in core count (and core count per dollar), making them particularly attractive in applications that rely most heavily on multithread-capable workloads. CAD is certainly one of those application spaces, but it also happens to be one that still often demands the best single-thread (1T) performance available to boot. And unlike delivering on higher core counts, achieving parity with Intel on that latter metric has proven elusive to AMD. But with the launch of four Ryzen 5000 CPUs — codenamed Vermeer, and the first to leverage the third-generation Zen 3 microarchitecture — matching or even surpassing Intel on 1T performance was finally within reach.
This month, I’ll follow up on Zen 3’s 1T promise with some benchmarking of workloads common in CAD processing.
Ryzen 5000 Testing: How Does Zen 3 Perform for CAD-Relevant Workloads?
AMD describes Zen 3 as the “most comprehensive design overhaul of the Zen era,” touting a blanket 19% higher instructions per cycle (IPC) over Zen 2. And with single-thread performance tending to track IPC (supporting memory and I/O allowing), Vermeer offered the most promise ever to meet or exceed the best Intel can manage in 1T rates. It’s a promise I hoped to test, with a range of CAD-centric benchmarks focused on single-thread workloads. And thanks to Boxx Technologies, I was able to do just that.
The name Boxx has been mentioned in this column several times in the past, and for good reason. When it comes to exploring emerging, cutting-edge workstation components and design approaches, Boxx is consistently at the forefront. The company knows it can’t compete with the likes of Dell, HP, and Lenovo on the basis of price, so there’s no point in building machines with the same specs as that trio’s wares. Instead, Boxx is always on the lookout to differentiate with unique features and no-compromise performance, and adopting AMD CPUs has been one way it’s done so, particularly in the age of Zen.
Boxx Technologies’ Apexx A3 Denali workstation, built on AMD Ryzen CPUs.
Delivered in a compact, liquid-cooled package, Boxx’s Apexx A3 workstation was built for Ryzen and showcases the Ryzen 5000 portfolio, including the two of particular interest in professional-caliber computing: the 8-core (8C) Ryzen 7 5800X and the 12-core (12C) Ryzen 9 5900X. The two represent the fastest nominal clock rates available, at 3.8 GHz (4.7-GHz boost rate) and 3.7 GHz (4.8-GHz boost), respectively. Boxx delivered the 5800X in the Apexx A3, and AMD kindly offered a 5900X to swap in as well. The CPUs were amply complemented by memory, GPU, and storage selected for maximum performance, so as to ensure the CPUs would be the bottleneck in testing (or at least as much as possible).
For a comparison to the previous Zen 2 microarchitecture, I have a previous set of benchmark results on a 12-core Threadripper Pro 3945WX. While the complementary components were not identical, the 3945WX can provide a meaningful, albeit not perfect, reference point to assess the performance gains of Zen 3 over its predecessor.
The test systems were built on two AMD Ryzen 5000 series CPUs, based on Zen 3, with a Threadripper Pro CPU built around Zen 2 serving as the reference point.
For workstation system testing, I typically employ SPECworkstation, an independent benchmark that contains CAD workloads common to AEC, design, engineering, and manufacturing workflows. That wasn’t possible for 1T evaluation, however, as the benchmark currently lacks the means to constrain processing to a certain number of threads. Instead, I employed PassMark’s PerformanceTest 10.0 and Cinebench. For the former, beyond the standard overall single-thread test offered, I hand-picked three workloads deemed most relevant to CAD (physics, floating point, and compression), while Cinebench added a workload typical in render processing.
The Zen 3 generation Ryzen 9 5900X ended up outperforming the Zen 2 Threadripper Pro 3945WX by anywhere from 19% to 63%. Averaging the measured tests yields an overall speedup of 39%. The chart below shows the relative performance across tested workloads, normalized to the Threadripper Pro 3945WX.
Single-thread (1T) benchmark results for AMD Ryzen 9 5900X with Zen 3 (in red) versus AMD Ryzen Threadripper Pro 3945WX (in blue).
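For readers who want to reproduce this kind of comparison with their own scores, here is a minimal sketch of normalizing results to a baseline system and averaging the gains. It assumes higher scores are better and that the overall figure is a simple arithmetic mean of the per-workload ratios (a geometric mean is a common alternative). The function names and the two sample workloads are hypothetical, and the numbers are placeholders rather than the measurements behind the chart.

```python
from statistics import mean


def normalize_to_baseline(candidate, baseline):
    """Return each workload's candidate score divided by the baseline score.

    Assumes higher scores are better and both dicts share the same keys.
    """
    return {name: candidate[name] / baseline[name] for name in baseline}


def average_speedup_pct(relative):
    """Arithmetic mean of the per-workload ratios, expressed as a % gain."""
    return (mean(relative.values()) - 1.0) * 100.0


# Hypothetical placeholder scores, NOT the measured results from this article.
baseline_scores = {"workload_a": 100.0, "workload_b": 200.0}
candidate_scores = {"workload_a": 120.0, "workload_b": 300.0}

relative = normalize_to_baseline(candidate_scores, baseline_scores)
print(relative)
print(f"average speedup: {average_speedup_pct(relative):.1f}%")
```

With the baseline pinned at 1.0 for every workload, each bar in a chart like the one above reads directly as a speedup factor.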
That 39% average speed increase over the Zen 2 Threadripper Pro 3945WX certainly would support AMD’s claims for significant and better-than-usual generation-to-generation improvement. In fact, it’s roughly double the company’s broad-brushed 19% promised gain. However, as a caveat, the comparison to the 3945WX may not represent a fully fair fight. The Ryzen 9 5900X system from Boxx is liquid-cooled, while the Threadripper Pro system was not. Liquid cooling improves thermal dissipation and therefore will, all else being equal, help the CPU maintain higher clock rates. Now, it’s quite possible the 5900X could maintain boost-level clock rates during 1T processing without the assistance of liquid cooling, but it’s not something this exercise could verify. Still, the magnitude of the 5900X’s benchmark edge is dramatic enough to support the conclusion that yes, Zen 3 represents a potent step forward in 1T throughput.
An Illuminating Single-Thread Exercise: Base vs. Boost for Single-Thread Workloads
Beyond assessing the performance potential for Ryzen 5000 and Zen 3 for CAD, this exercise provided an opportunity to assess the impact of a CPU’s two commonly specified clock frequencies: the base frequency and the boost (or turbo) frequency. The former is the guaranteed minimum sustainable clock rate, while the latter represents how much faster one or more cores can be driven if and as power and thermal constraints allow. Both AMD and Intel build in quite a bit of fine-grained intelligence to help figure out exactly when clocks can be cranked up, and for how long.
It seemed a scenario perfectly set up to compare the relative impact of base versus boost frequencies. The 5800X is 0.1 GHz higher in base frequency, while the 5900X is 0.1 GHz higher in boost. Which would perform better on long-duration, heavy-duty single-thread execution? Also considered, for context, is the same Threadripper Pro 3945WX, which has both a higher base and a lower boost frequency than either of the other two (though given its different Zen 2 microarchitecture, its clock rates are less meaningful to directly compare).
Contrasting specs for the Ryzen 7 5800X and Ryzen 9 5900X, especially interesting with respect to base and boost frequencies. Data source: AMD.
But first, let’s consider which type of workload is more likely to impart the conditions under which the CPU will remain throttled back more toward the base frequency than allowed to charge forward at boost rates. Sensibly, it turns out that forcing clocks to be reined in is a lot easier with multithread testing. With all or most cores in heavy use all or most of the time, creating the power and thermal stress necessary to cause the CPU to throttle back frequency to the base figure is far more likely.
But with single-thread execution stressing just one core, even heavily, thermal stress can be better spread across the processor. With other cores fully or mostly idle, they not only aren’t contributing additional thermal output, but also help dissipate that heat across the larger silicon and package mass. And that raises the very valid question of whether the boost rate, rather than the base rate, will typically have the greater impact on 1T performance.
I attempted to put as much 1T stress on the CPUs as possible, running PassMark “very long” test options many times in sequence with no intervening idle, in the hopes I could push demand high enough to throttle back runtime clock rates toward base frequencies. In the end, however, I was not successful. How do I know I wasn’t? Well, it turned out to be glaringly obvious, as a repeat of the performance summary chart above, with the 5800X included, clearly illustrates. In every case, the 5900X exceeded the 5800X by about 2%, matching the former SKU’s edge in boost frequency. Had clocks fallen back toward base, the 5800X, with its higher base frequency, would have been expected to lead instead.
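The arithmetic behind that inference is simple enough to sketch. The snippet below uses only the base and boost clocks quoted earlier for the two SKUs, and it assumes (as a simplification) that 1T throughput on the same microarchitecture scales linearly with the sustained clock rate. The helper name `clock_edge_pct` is hypothetical.

```python
# Clock specs in GHz as quoted in this article (source: AMD).
SPECS = {
    "5800X": {"base": 3.8, "boost": 4.7},
    "5900X": {"base": 3.7, "boost": 4.8},
}


def clock_edge_pct(a, b, kind):
    """Percentage by which SKU `a` leads SKU `b` on the given clock spec."""
    return (SPECS[a][kind] / SPECS[b][kind] - 1.0) * 100.0


# Assumption: performance tracks clock rate linearly within one architecture.
boost_edge = clock_edge_pct("5900X", "5800X", "boost")
base_edge = clock_edge_pct("5900X", "5800X", "base")
print(f"5900X vs 5800X at boost: {boost_edge:+.1f}%")
print(f"5900X vs 5800X at base:  {base_edge:+.1f}%")
```

Under that assumption, a 5900X running at boost should lead by about 2.1%, while one held to base should trail by about 2.6%. A measured lead of around 2% is therefore consistent with both chips sustaining boost clocks throughout the 1T runs.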
It’s also worth noting that, given Vermeer’s ability to sustain boost clock rates during 1T processing, it’s likely neither the 5800X nor the 5900X will show the best 1T benchmark performance a Ryzen 5000 CPU can manage. Presumably then, that title would fall to the 5950X, the 16C member of the Ryzen 5000 family. While its higher core count dictates its lower base frequency of 3.4 GHz, it offers the highest boost rate in the family, at 4.9 GHz. I did not have the commensurately more expensive 5950X available to test, but there is plenty of third-party data on the web indicating it does indeed set the high-water mark in 1T testing.
Some Multithread Data, While We're at It
My primary interest in digging into Ryzen 5000 was to get a handle on its 1T potential, particularly in the context of CAD. But since I was at it, and with reasonable comparisons available from both Threadripper Pro and differing SKUs, why not see what a Zen 3-powered Ryzen 5000 could manage for hefty, highly parallel workloads in AEC, design, and engineering? For this task, I could go back and rely on the tried-and-true SPECworkstation 3.0.4, running the full battery of CPU-focused tests. Results were mixed, as one might expect, but overall the 12C 5900X surpassed the 3945WX by about 11% on average.
It’s worth noting an important caveat, though: these results reflect not only Vermeer’s aptitude but also the added and likely significant performance boost that the A3 Denali’s liquid cooling provides over a conventional air-cooled CPU and chassis. That is, while much of the Ryzen 9 5900X’s gains should be attributable to Zen 3 improvements, the Boxx workstation should allow for higher runtime clock rates (somewhere between base and boost) than an air-cooled system like the Threadripper Pro machine.
Intel Is Still a Moving Target
Of course, Intel’s not standing still while AMD pushes forward. Yes, Zen 3 and Vermeer certainly do look to have closed the gap in 1T performance, or even pushed past Intel’s current storefront Core i7/i9 products, but AMD’s rival is expected to launch its Rocket Lake generation of desktop CPUs by the end of Q1. Intel has set the expectation for a “double-digit percentage IPC performance improvement,” a degree that I’ve noted represents a more typical gen-over-gen gain than what Zen 3 has accomplished.
With the product unlaunched, we see no formal benchmark results or claims. However, the CPU-watching website notebookcheck.net claims to have aggregated some preliminary CPU-Z single-thread benchmark results that show a top-end Rocket Lake-S nudging out the Ryzen 9 5950X (which, as noted above, is likely the 1T scoring champ among the first Vermeer SKUs). While the next few months will bear out that conclusion (or not), let’s assume that general description is valid, that a top-end Rocket Lake will just nudge out a top-end Vermeer. That does not refute the reasonable conclusion that AMD has, generally speaking, finally caught up to Intel on 1T processing.
AMD and Intel Competing on Relatively Equal Footing across CAD Workloads — Finally!
It’s been a long road back to across-the-board high-performance computing for AMD. The last domino to fall was single-thread performance, a goal that now appears checked off with Zen 3 and supporting technologies. We’re now likely in an era of relative CPU parity, where one might have a modest edge on the other temporarily, only to be displaced a quarter later by the rival’s next generation, the type of leapfrogging common among two similarly positioned rivals. Though to be fair, if one of the two today does have a consistent substantive edge, it’s not Intel but AMD, with respect to multicore abilities and performance.
An evenly matched Intel/AMD duopoly should benefit users more than the de facto Intel monopoly we’ve had in workstations for over a decade (since AMD’s previous Opteron line of CPUs faded from workstation use in the 2000s). And that tight competition should mean CAD hardware buyers will reap the benefits, with neck-and-neck vendors yielding the best possible hardware at the lowest possible prices.
Further Exploration of the Benefits of Base vs. Boost for CAD
As touched on above, carrying out this exercise prompted another, one to further explore an issue that’s of particular relevance to the CAD community: the interplay of base clocks, boost clocks, and core counts when running both 1T and MT workloads. As noted many times in the pages of this column, the group of CAD users spending all or most computing time with 1T-predominant interactive 3D modeling will typically be better served by favoring frequency over core counts, while another group immersed in frequent MT workloads like rendering and simulation should probably focus more on core count than clock rate, or at least balance the metrics.
But this exercise also suggested a further conclusion, a rule of thumb CAD pros might consider in comparing CPU models for their next workstations: that the former group is likely also better served by focusing on the boost frequency over the base frequency, given that even a heavily loaded single core will often sustain those higher boost levels. Furthermore, the converse argument would then suggest the latter group is likely better served by focusing on the base frequency over the boost, given that long-duration heavy loading of all CPU cores is much more likely to require the CPU to throttle frequencies back to (or closer to) the base.
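As a rough illustration of that rule of thumb, the sketch below ranks the three Ryzen 5000 SKUs discussed in this column by a different proxy for each user group. The proxies are my own simplifying assumptions and not a performance model: boost clock alone for 1T work, and core count multiplied by base clock for sustained MT work. The `Cpu` class and `rank` function are hypothetical names, and the specs are those quoted earlier (source: AMD).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    cores: int
    base_ghz: float
    boost_ghz: float


# Specs as quoted in this article (source: AMD).
CANDIDATES = [
    Cpu("Ryzen 7 5800X", 8, 3.8, 4.7),
    Cpu("Ryzen 9 5900X", 12, 3.7, 4.8),
    Cpu("Ryzen 9 5950X", 16, 3.4, 4.9),
]


def rank(cpus, workload):
    """Order CPUs, best first, by a crude proxy for the given workload type.

    "1t": boost clock only (a single loaded core tends to hold boost).
    "mt": cores x base clock (all-core load tends to pull clocks toward base).
    Both proxies are assumptions and only make sense within one architecture.
    """
    if workload == "1t":
        key = lambda c: c.boost_ghz
    elif workload == "mt":
        key = lambda c: c.cores * c.base_ghz
    else:
        raise ValueError(f"unknown workload type: {workload!r}")
    return sorted(cpus, key=key, reverse=True)


for kind in ("1t", "mt"):
    print(kind, [c.name for c in rank(CANDIDATES, kind)])
```

Within this particular family the two orderings happen to coincide, since the SKU with the most cores also carries the highest boost clock. The proxies only pull apart when comparing parts that trade core count against frequency.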
Those premises appear supported by both logic and empirical data, but they are worth a deeper examination. And that’s something I plan to examine for CAD, as it is relevant both to those who spend the vast majority of time drafting and modeling and to those occupied with more extensive compute tasks like simulation and rendering.