Two New CPUs Will Open More Options for the CAD Workstation Platform
17 Sep, 2020 By: Alex Herrera
Herrera on Hardware: Lenovo builds around AMD’s Threadripper PRO processor, while Intel ships the first fruits of Xe graphics and SuperFin technology.
The typical workstation serving CAD professionals sees a few significant upgrades every year, with some being more impactful than others. Sometimes it’s a new central processing unit (CPU) that plugs into an existing socket, other times it’s a graphics processing unit (GPU) successor based on a new architectural generation, and occasionally it’s a complete overhaul, with a ground-up redesign incorporating a completely new platform and components.
With respect to CPUs, we’ve seen Intel release its 10th generation Core processors targeting fixed machines (mobile-tailored 10th generation versions were released in the second half of 2019) and, just this month, the first 11th Gen Core products targeting mobiles. But for once, all the action wasn’t limited to the Intel universe: AMD not only launched its first professional-specific workstation processor in over a decade, but saw it picked up by a top-tier workstation supplier to boot.
Intel’s Tiger Lake Leverages SuperFin Technology
Let’s start by taking a look at the mobile-focused part. Perhaps it’s never been more appropriate to talk mobile workstation CPUs first, as 2020 has seen all laptops — commercial, consumer, and CAD-tailored mobile workstations — surge in sales, measured not only relative to shipments of fixed/desktop models but in absolute terms as well. In the third quarter of 2020, for example, mobile workstation sales climbed by 20% year-over-year, while fixed workstations declined 15%, according to Jon Peddie Research. The reason is obvious: The push to adopt work-from-home and remote schooling setups clearly favors the mobile platform.
Intel’s first release to market under the 11th Generation Core i7/i5/i3-11xxG branding, code-named “Tiger Lake,” is notable for two engineering advancements: one in process technology and the other in graphics architecture. Over the past four decades, the vast majority of generation-to-generation improvements in the semiconductor industry have come courtesy of Moore’s Law, the principle laid down by Intel co-founder Gordon Moore, which essentially observed that silicon density grows (and cost per transistor therefore falls) at a geometric rate, with density doubling steadily over time. It’s been such a powerful tool that, directly or indirectly, one could argue it’s the foundation of our entire technology ecosystem today. Where would we be if transistors required as much space, power, and expense as they did back in the 1970s?
As Moore’s Law butts up against atomic and quantum physics limits and progress slows, vendors are focusing more and more on architectural and packaging improvements to provide some of the incremental improvements process shrinks alone would achieve. Still, pushing down the Moore’s Law path of silicon shrinks remains the primary means for all chip vendors to create compelling new products promising better performance, lower cost, decreased power usage, or some combination of the three.
For decades, in fact, it had been Intel’s most formidable weapon. That has changed, however: by this point, most are aware of Intel’s struggles over the past few years in moving its process road map forward. Its 10-nanometer (nm) node was painfully late to production, and most recently the company had to announce it was pushing out its 7-nm process to the latter half of 2021.
Given that, the company’s announcement of its SuperFin technology — which combines a few significant achievements in transistor and interconnect structures — couldn’t have come at a better time. Intel promises SuperFin can achieve much of what a Moore’s Law process step would, yet it retains the same core 10-nm process density. SuperFin may not improve density, but it still directly serves two of the ultimate goals — performance and/or power — evidence of the fact that process dimensions alone do not singularly equate to superiority, especially when netted out to a single number, like the “10” in 10 nm.
By virtue of enhanced transistor and interconnect electrical characteristics, SuperFin substantially improves the performance/power characteristics of Intel’s 10-nm process. Skipping the esoterics of SuperFin technology, in the end it yields a significantly more efficient transistor, which can be leveraged either as a higher-performing transistor at the same voltage/power or as a lower-voltage/power transistor at the same performance. That spread in usage is common in process advancements, and helps the technology be exploited across different product priorities and applications, from low-power mobile to max-performance desktop or datacenter.
The SuperFin transistor: enhancements add up to higher performance and/or improved power efficiency. Image source: Intel.
SuperFin will provide some relief to a company that’s going to have to ride its 10-nm process for significantly longer than planned. How much does SuperFin help? Well, Intel positions it as responsible for the “largest intranode performance delta in our history” and “comparable to a full-node transition.” From another perspective, 11th Generation Tiger Lake on the 10-nm SuperFin process manages a base frequency up to 3.0 GHz at 28 W (thermal design power), while 10th Generation Ice Lake managed up to 2.3 GHz on the pre-SuperFin 10-nm process with nearly the same microarchitecture. So, all else being equal, SuperFin translates to significant improvements in both single- and multi-thread throughput.
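To put a rough number on that comparison, the short Python sketch below computes the relative uplift between the two base frequencies quoted above. The function and constant names are illustrative, and base frequency is only a proxy: real-world throughput also depends on boost behavior, memory, and workload, so treat the result as a back-of-the-envelope figure rather than a benchmark.

```python
# Base frequencies as quoted in the article, in GHz.
TIGER_LAKE_BASE_GHZ = 3.0  # 11th Gen Core, 10-nm SuperFin
ICE_LAKE_BASE_GHZ = 2.3    # 10th Gen Core, pre-SuperFin 10 nm


def relative_uplift(new: float, old: float) -> float:
    """Return the fractional gain of `new` over `old` (0.25 means +25%)."""
    return new / old - 1.0


uplift = relative_uplift(TIGER_LAKE_BASE_GHZ, ICE_LAKE_BASE_GHZ)
print(f"Base-frequency uplift: {uplift:.1%}")
```

On those two figures alone, the step works out to roughly a 30% higher base clock.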
Bear in mind that while SuperFin-enabled CPUs today come in this first handful of Core i7/i5/i3-11xxG parts for mainstream mobile platforms, the technology will most certainly spawn derivatives at higher power (45 W and up), appropriate for not only high-performance mobile workstations but fixed workstations too. And one would expect commensurate gains at those power envelopes as well.
Debut of Xe Graphics
The word has long been out that Intel was positioning new graphics technology to succeed the GenN line of integrated GPUs bearing the Intel HD and Iris brands, now a decade-plus old, and to support its desktop and mobile CPU lines. The wait is over, as Intel reaffirmed its commitment to deliver a revamped, more competitive GPU architecture in 2020 and gave it a name: Xe.
First shipping in Tiger Lake with the Iris Xe brand, Xe represents a major step forward in Intel’s commitment to provide competitive integrated graphics acceleration in its processors and systems-on-a-chip (SoCs, which are CPUs with higher integration of peripheral componentry, particularly suited to mobile and small form factors). But Intel’s plans don’t end with integrated Xe solutions that succeed the GenN line of graphics the company has developed over the past decade. The company has made no secret of the fact that it plans to deploy Xe across a range of applications, including those that demand the performance of a discrete implementation. (It’s worth noting that Intel’s designs on the discrete graphics market have a long history, mostly of failures, extending over 30 years, but that’s a story for another time.)
How does Xe performance look in its first, Tiger Lake incarnation? First consider that a tile is composed of 50% more raw computational resources than the Gen 11 graphics in 10th Gen Core Ice Lake. Furthermore, those tiles can be aggregated in scaled implementations of (at least) one, two, or four, supporting a range of performance and power targets. Factor that out, and Xe looks to scale up by anywhere from 50% to 500% in peak performance. As such, it’s an architecture that will spawn at least four microarchitectures, each deployed to different applications and markets: XeHPC for HPC applications, the similar (or at least closely related) XeHP for datacenter and AI, XeLP for integrated and entry graphics, and XeHPG for mid-range and enthusiast segments. Clearly, discrete implementations (whether packaged separately or not) will factor heavily into Xe’s product rollout moving forward.
Xe is the graphics architecture that will spawn multiple product incarnations. Image source: Intel.
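The 50%-to-500% range quoted above follows from simple multiplication, sketched below in Python. The sketch assumes peak performance scales linearly with both per-tile resources and tile count, which is an idealization for illustration rather than anything Intel has published; the names are hypothetical.

```python
# Figures from the article: one Xe tile has 50% more raw compute
# resources than Gen 11 graphics, and tiles come in groups of 1, 2, or 4.
XE_TILE_RESOURCE_FACTOR = 1.5
TILE_COUNTS = (1, 2, 4)


def peak_gain_over_gen11(tiles: int) -> float:
    """Fractional peak gain over Gen 11, ASSUMING perfectly linear scaling."""
    return XE_TILE_RESOURCE_FACTOR * tiles - 1.0


gains = {tiles: peak_gain_over_gen11(tiles) for tiles in TILE_COUNTS}
for tiles, gain in gains.items():
    print(f"{tiles} tile(s): +{gain:.0%} peak vs. Gen 11")
```

One tile yields the +50% low end and four tiles the +500% high end; real products will land below those ceilings once power, memory bandwidth, and inter-tile overheads are taken into account.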
AMD CPU Finally Back in Top-Tier Workstations: Threadripper PRO and the Lenovo ThinkStation P620
This column has paid a fair amount of attention to AMD’s role as a CPU provider for workstations over the past several years, based not on its current market share — which is virtually nonexistent — but on the potential of its Zen processor technology to re-enter and find success in the market. Calling Zen’s presence “virtually nonexistent” is not meant as a knock on the worthy vendors like Puget Systems, Boxx, and Velocity Micro that have created workstations around AMD CPUs; rather, it is a reflection of how difficult it is for any unproven vendor to crack the portfolio of the big-volume Tier 1 suppliers, today represented by the trio of Dell, HP, and Lenovo (DHL). These three represent roughly 90% of the market supply for workstations — hence AMD’s lack of any material contribution to CPU shipments in the market. And more than any other, those three OEMs are looking for the confidence that a partner can deliver long-term reliability and longevity in the market, something that AMD had to prove over time.
And to its credit, AMD executed. By mid-2020, we’d witnessed several generations of predictable and compelling Zen CPUs, and it was looking like AMD was successfully checking off the reliability requirement that a top-tier vendor would need satisfied before feeling comfortable jumping on board. AMD’s first three generations of Ryzen, Threadripper, and EPYC (third coming shortly) made a compelling case for Zen’s place on CAD workstation platforms, based on its aptitude for both high-performance single-thread computation and, more so, multi-thread computation.
Finally, the wait is over, as at least one of the top-tier trio clearly agrees the time is now right to introduce a workstation built around a Zen technology CPU. Lenovo’s new ThinkStation P620 workstation is based on Threadripper PRO, a chip that bears the name of the CPU line for high–core count enthusiasts, but from an implementation standpoint it is actually based on its close sibling, the EPYC CPU code-named Rome. Sandwiched between Lenovo’s Xeon W single-socket (1S) ThinkStation P520 and the dual-socket (2S) Xeon Scalable P720, the P620 offers between 12 and 64 Zen 2 cores paired with 8 memory channels, targeting the upper-end — albeit not boutique — workstation tier as well as suggesting a 1S alternative to existing 2S platforms.
Threadripper PRO a Worthy Culmination to AMD’s Path Back to Workstation CPU Relevance
The initial four Threadripper PRO SKUs in the 3900WX processor family range from the 4.0-GHz (base frequency), 12-core Threadripper PRO 3945WX all the way up to the 64-core 3995WX.
The first generation of AMD Ryzen Threadripper PRO 3900WX workstation CPUs. Data source: AMD.
Unlike Tiger Lake, AMD’s Threadripper PRO was able to exploit the next-denser processing node, TSMC’s 7-nm silicon manufacturing process, allowing engineers to roughly double transistor density to drive up throughput with the next-generation Zen 2 microarchitecture. (That said, it bears repeating that, as noted above, processes cannot completely and fairly be compared by that one number, for example the 10 and the 7.) With Zen 2, the improvements are many, but two carry most of the weight, especially where high-demand professional computing is concerned: around 15% higher instructions per cycle (IPC) than the first-generation Zen core, and an impressive quadrupling of the peak floating point throughput rate.
To top it off, AMD’s approach to chiplet integration that it calls the Infinity Architecture (covered previously in “Chiplet Architectures Emerge as One Arrow in Industry Quiver of Technologies Extending Compute Performance”) allows the Threadripper PRO to claim several industry workstation firsts: the first 8+ core workstation CPU to break 4.0-GHz base frequency (the 3945WX), the first to offer more than 28 cores in a single socket, the first to offer 8-channel memory in a single socket (more channels equals more bandwidth), the first to offer PCI Express Gen 4 speeds, and the first to offer more than 56 cores, including dual-socket platforms.
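The “more channels equals more bandwidth” point can be made concrete with a little arithmetic. The Python sketch below computes theoretical peak memory bandwidth from channel count and transfer rate, using DDR4-3200 (the memory speed this article cites for the ThinkStation P620) and the standard 64-bit (8-byte) DDR4 data path per channel. The 4- and 6-channel rows are hypothetical comparison points assumed to run at the same speed, not figures for any specific competing product, and sustained bandwidth in practice falls short of these peaks.

```python
BYTES_PER_TRANSFER = 8  # 64-bit DDR4 data path per channel (ECC bits excluded)


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for a memory configuration."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


DDR4_3200_MTS = 3200  # mega-transfers per second

# 8 channels matches Threadripper PRO; 4 and 6 are hypothetical comparisons.
bandwidth = {ch: peak_bandwidth_gbs(ch, DDR4_3200_MTS) for ch in (4, 6, 8)}
for channels, gbs in bandwidth.items():
    print(f"{channels} channels of DDR4-3200: {gbs:.1f} GB/s peak")
```

At the same memory speed, peak bandwidth scales directly with channel count, so an 8-channel socket has twice the theoretical ceiling of a 4-channel one.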
Lenovo’s ThinkStation P620 complements any of the four 3900WX SKUs with up to 1 TB of DDR4-3200 ECC memory, 20 TB of storage on eight drives, and four high-end Nvidia Quadro RTX GPUs (e.g., RTX 4000) or two ultra-high-end Quadros (RTX 5000 and above). Sales of the ThinkStation P620 are slated to ramp by the end of September.
The ThinkStation P620 is the first and only top-tier workstation built around AMD’s Threadripper PRO. Image source: Lenovo.
CAD Workstation Professionals Among the Ultimate Beneficiaries
Ultimately, of course, technology advancements for the sake of technology are of little interest to CAD professionals. The ultimate litmus test of value is how they might improve productivity, specifically in how much less time is required to run your existing workflow, or how much more work you can potentially pack into the same amount of time. For the 11th Generation Tiger Lake, the answer is pretty straightforward. Thanks to SuperFin, Intel appears to have been able to put a meaningful performance gap between this new 11th Generation Core and the 10th.
For AMD’s Threadripper PRO, the benefits go beyond performance metrics. Yes, it certainly looks like it can provide a substantial throughput advantage over the current norms for multi-threaded workloads. And the bonus is that users don’t have to make big sacrifices in single-thread workloads to get that multi-thread performance. That alone makes a compelling argument for a new product.
But specific products aside, workstation OEMs and CAD users stand to benefit in the long term if AMD CPUs can successfully re-engage in the workstation market. A revitalized and competitive AMD will improve the health of both OEMs and customers, even if they were to continue to buy Intel-based workstations. The last thing an OEM wants is to be beholden to a single supplier (it doesn’t make for effective negotiating or clear product differentiation). With a single supplier, the customer has fewer choices as well, and quite possibly has to pay more as a result of the de facto monopoly.
For a clear illustration of that dynamic, harken back to 2004, when AMD introduced the Opteron processor, primarily targeting servers but finding an excellent fit in high-performance workstations as well. With Opteron, AMD managed to give OEMs and customers another product option, which alone was good. But the more lasting impact of Opteron was to secure far more competitive workstation platforms from Intel. The right part delivered at the right time, the first Opteron part (code-named Hammer) was an x86 implementation that combined the best of conventional 32-bit x86 with 64-bit address extensions, along with a highly scalable direct memory attach scheme that made Intel's front-side bus architecture appear antiquated.
The emergence of Opteron not only caused Intel to adopt both similar 64-bit extensions and direct memory attach for its Core and Xeon lines, it put the nail in the coffin of Itanium, which was at that time the architecture Intel planned to succeed x86. Intel’s subsequent CPUs proved far more effective, ironically helping to knock AMD’s Opteron line back out of workstations, as successors to Hammer proved less than stellar. Ultimately, the Intel-based workstation products that professionals bought post-Opteron were far superior to what they would have been had Hammer never been. That dynamic, at the very least, is why the industry and its customers (save Intel, of course) should be pleased to see AMD back in the workstation CPU game.
Flipping that argument around in the context of the launch of Xe graphics, one could hope for similar progress from Intel as a supplier of high-performance graphics hardware. In the case of workstation GPUs, Nvidia dominates (with AMD around to apply some pressure), but having another competitive vendor in the mix should only help.
Competition is good, with end users the ultimate beneficiaries. And CAD workstation buyers should only see their options — be they measured on price, performance, or both — improve as a result.
Author’s disclosure: I worked with AMD to support the rollout of Threadripper PRO this past summer, but had not formally engaged with the company prior on CPUs for workstations.