Intel Launches Sapphire Rapids Xeon, Benefiting High-Performance CAD Workstations, Part 2
30 Mar, 2023 By: Alex Herrera
Herrera on Hardware: Putting HP’s new Z8 Fury to the test and showing how Intel’s new Sapphire Rapids Xeon levels the playing field once again for high-end workstations.
Last time, we discussed Intel’s new high-end Xeon W CPUs designed for high-performance CAD workstations; see Part 1 to catch up on this new technology. Continuing on, we dig further into HP’s new Z8 Fury build and performance.
Introducing the HP Z8 Fury
The previous discussion about which workstation form factors can support how much power and performance may have you wondering what machine I was actually able to test out with a 350 W Xeon W9-3495X. Well, thanks to HP, I could do so on a new beast of a Premium 1S workstation, the HP Z8 Fury.
HP’s first single-socket Z8, the Z8 Fury, supporting up to 56C Xeon W9-3495X.
Bear in mind, the CPU is far from the only watt-guzzling component a top end workstation like the Z8 Fury accommodates, as all supporting system hardware needs to scale at least somewhat commensurately to ensure balanced performance. No sensible workstation design would pair a 56C CPU with 16- or 32GB of memory, for example, but rather more like 128GB and beyond. The same goes, in most cases, with the GPU and storage capabilities. Consider the maximum specifications of CPU, GPU, and memory for the Z8 Fury along with the specifications of my review system.
Specifications for HP’s Z8 Xeon Fury. Data source: HP.
Add up the power demands of that maximum possible configuration, and the Z8 Fury can draw up to 30 amps at 120 V out of the wall. (Think about that for a second, because I had to, given my office’s local circuit maxes out at a typical 15 A). More impressive than drawing all those amps, though, is that it can manage to cool all the thermal dissipation resulting from that power, and it can do it without drowning you in jet engine-level decibels. Watch for more details about the HP Z8 Fury in my upcoming May column where I'll take advantage of its capabilities once again to check out another high-performance component, Nvidia’s new top-of-the-line RTX 6000 Ada Generation GPU, a beast in its own right, capable of pushing rendering to dramatic new levels.
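To put the wall-power figures above in perspective, here is a minimal back-of-the-envelope sketch. It assumes simple P = V × I arithmetic (ignoring power factor, supply efficiency, and continuous-load derating), and the variable names are illustrative only. The 30 A, 15 A, 120 V, and 350 W values come from the text; everything else is an assumption.

```python
# Rough wall-power arithmetic for the figures quoted in the article.
# Assumption: power = volts * amps, with no power-factor or derating adjustments.

LINE_VOLTS = 120        # nominal outlet voltage cited in the article
MAX_DRAW_AMPS = 30      # maximum draw cited for a fully configured Z8 Fury
OFFICE_CIRCUIT_AMPS = 15  # typical office circuit cited in the article
CPU_RATED_WATTS = 350   # rated power of the Xeon W9-3495X cited in the article


def watts(volts: float, amps: float) -> float:
    """Return electrical power in watts for a given voltage and current."""
    return volts * amps


max_draw_watts = watts(LINE_VOLTS, MAX_DRAW_AMPS)          # 3600 W
office_limit_watts = watts(LINE_VOLTS, OFFICE_CIRCUIT_AMPS)  # 1800 W

# How many typical 15 A circuits' worth of power the maximum configuration needs.
circuits_needed = max_draw_watts / office_limit_watts

# Share of the maximum wall draw attributable to the CPU's rated power alone.
cpu_share = CPU_RATED_WATTS / max_draw_watts

print(f"Maximum draw: {max_draw_watts:.0f} W")
print(f"15 A circuit limit: {office_limit_watts:.0f} W")
print(f"Circuits' worth of power: {circuits_needed:.1f}")
print(f"CPU share of maximum draw: {cpu_share:.1%}")
```

Under these assumptions, a maxed-out configuration would need twice what a typical 15 A circuit supplies, and the CPU's rated power would account for under a tenth of that ceiling, which is consistent with the point that the CPU is far from the only power-hungry component.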
1T Performance Poses a Lower Bar, One Cleared Just as Easily by Many — and Far More Economical — CPUs and Workstations
As the SPECworkstation benchmark results attest, workstation CPU lines like Threadripper PRO 5000 WX and Xeon W-3400 are built specifically to tackle hefty, long-duration multi-threaded (MT) workloads. Those whose workflows include lots of complex simulations and high-quality rendering will look to higher-core-count CPUs like these to accelerate their computing iterations and raise productivity. Still, there will always be some single-thread (1T) work to be done, like parametric modeling, and it sure would be nice if that pricier, many-core CPU you invested in could deliver just as big a boost in 1T throughput. But that’s just not going to be the case, for reasons again tied back to power and thermal constraints and architectural evolution.
The bulk of generation-to-generation CPU performance gains today come from scaling core counts, which directly yields better MT performance. But while 1T performance will also reap some gain via an enhanced core microarchitecture, it’s again dependent on clock rates. And in the age of Turbo Boost clocking (think of it as automatic, varying-duration overclocking to the extent the system can tolerate the power and heat), a less expensive CPU with few cores can deliver 1T performance comparable to that of a high-end CPU with many cores, and sometimes modestly but materially higher.
Given all that, how would an MT-processing beast like the W9-3495X or AMD’s high-core-count Threadripper PRO SKUs fare in 1T benchmarking compared to mainstream Core brand CPUs, both desktop and mobile? To get an idea, I ran both the popular PerformanceTest 10 and Cinebench R20 rendering benchmarks in single-thread mode. The scores reveal the differences aren’t necessarily dramatic but they are material, with a top-end 16C desktop Core i9-12900K outperforming the W9-3495X by about 15%, while even an entry-level mobile workstation with a 28 W Core P-series CPU essentially matched the W9-3495X.
Don’t expect massive upscaling in multi-thread capabilities to carry over to 1T performance (Source: Jon Peddie Research).
(Check out this two-part series for more on the seemingly subtle but highly impactful differences between base clocks and boost clocks, in particular in the context of 1T performance.)
Sapphire Rapids will Put the Xeon Workstation House Back in Order, but the AMD Challenge Remains
Intel must be breathing a sigh of relief, right along with the workstation OEMs that have long counted on Xeon W and Xeon Scalable to power their premium single-socket (1S) workstations and all dual-socket (2S) models. Those combined segments of the fixed workstation market have seen volume cut in half in just a few short years, and the lack of fresh silicon available to differentiate the upper-tier models is one big reason. With both AMD and Intel now offering compelling high-core-count CPUs delivering dramatic gains in multi-threading performance, it’s a segment that should see a rebound in user interest and sales.
But it’s also a segment that highlights the differences in appropriate hardware for different workflow demands. Given a technical understanding of the very different constraints on power (and thermals) dictated by 1T versus MT processing, it’s not a surprise that mainstream Core CPUs, even a 28 W Core i7-1280P that ships in a minimalist sub-entry mobile workstation, can deliver 1T performance on par with a top-end fixed workstation CPU. While it may not be surprising, however, it’s a conclusion that has dramatic implications for what type of workstation is most appropriate for your CAD workflow.
If what you want to accomplish with your project and your workflow involves more complex, high-level-of-detail renderings, simulations, and computational analyses, all of which can leverage more threads and cores, investing in a fixed workstation with more cores makes sense, potentially all the way up to the 56C and 64C machines we looked at here. Yes, you’ll still get in the neighborhood of the best 1T performance available for those non-critical-path workloads, but that’s obviously not going to be the reason to buy it. Conversely, of course, if the vast majority of what you want to accomplish is limited to 1T computation (think modeling and interactive 3D graphics), you’ll be as well served with an Entry 1S fixed workstation or even a mobile model.
Worth repeating, however, is the specific wording above — “what you want to accomplish” — because the extent of your computing and visualization tasks in the past may not be enough to stay competitive moving forward. Competing successfully in markets like automotive, aerospace, and other product design and manufacturing, as well as architecture, engineering, and construction, means continually doing more to both win and please customers. Expectations of project deliverables continue to expand in scope and complexity, so workflows and the hardware to implement them can’t remain static either.
For most in CAD, something like the Xeon W9-3495X or the Threadripper PRO 5995WX is overkill. Still, for those who do demand as much, or perhaps just a notch below, a premium fixed workstation powered by a 16+ core Xeon or Threadripper PRO will be a machine to consider. And now, with Intel back on track in the premium end of the market, those buyers will find many options to choose from in the most competitive marketplace in decades.
In my upcoming May column, I'll cover more details of the HP Z8 Fury and take advantage of its capabilities to check out another high-performance component, Nvidia's new top-of-the-line RTX 6000 Ada Generation GPU.