The Misconception of Professional Computing and CPU Core Counts
17 Feb, 2022 By: Alex Herrera
Herrera on Hardware: Especially in CAD applications, lower core count CPUs still dominate the market.
It feels like I’m coming across the headline more often than ever before. Paraphrasing, the headline either states or implies that high core count CPUs (in today’s context, roughly 10 cores or more) are dominating sales in the workstation market. On the surface, it’s an easy conclusion drawn from an accepted narrative. “High performance” CPUs are high core count CPUs. And since workstations are designed to deliver high performance, high core count CPUs would logically be in high demand for both workstations and the top consumers of those workstations, CAD users.
The reality does not fit the headline, however, nor the simplistic narrative. The truth is that the median CPU among all workstations shipped in 2021 had only four cores. That’s worth repeating: in 2021, fifteen years after the introduction of the first quad-core processor, in the most (arguably) demanding segment in client-side computing, the typical machine still shipped with a quad-core CPU. Data on usage can’t be ascertained with precision, but I’d speculate that quad-core has been dominating sales in machines used for CAD at a higher rate than in other workstation-relevant spaces. Only now, on the back of Intel’s 11th Generation Core (Rocket Lake, covered here) and 12th Generation Core (Alder Lake, covered here), are 6- and 8-core CPUs beginning to take over volume leadership. 2022 should finally be the year that the median moves up from the quad-core we’ve been stuck at for years.
The misconception is quite understandable because, on the surface, the narrative would appear to hold water. And indeed, in some corners of the CAD world that rely on performance for heavily threaded, parallel workloads, the premises leading to this conclusion are valid. The issue is that those corners represent a minority, greatly outnumbered by the masses who focus their hardware choices on price/performance for mainstream CAD applications.
For Most CAD Workflows, Multi-thread Performance Still Takes a Back Seat
To be clear, yes, professional computing spaces like CAD are more likely to value and leverage as many cores as can be had, certainly a lot more than, for example, typical office applications and workloads. But no, it’s not the most common case for the mainstream in those segments, where the most indispensable, time-critical tasks tend to leverage only a single thread of execution (1T), or perhaps just a few. Consider the universal, time-dominating iterative cycle of model, visualize, model, etc. Neither parametric modeling nor interactive 3D graphics processing effectively leverages a multitude of cores. (It’s worth emphasizing I’m referring to visualization via 3D graphics, not rendering, the latter of which can quite effectively harness many CPU cores. For a refresh on the significant differences in 3D graphics versus rendering, check out this previous primer.)
The Unavoidable Frequency Tradeoff
OK, but even if you’re one of the many whose computing productivity is still heavily dependent on 1T performance, what’s wrong with having more cores for the times you could take advantage of them, for tasks better suited to parallel execution like physics and engineering simulations and CPU rendering? Unfortunately, it can hurt, because securing more cores isn’t free. It will cost you in two respects: dollars and frequency.
Consider the inverse relationship between core counts and frequencies this column has discussed in the past. As chips grow in size, two problems grow with them: the more power consumption is concentrated on a single piece of monolithic silicon, the more difficult it is to dissipate the resulting heat, and the bigger the die, the more difficult it is to maintain adequate signal integrity. Both challenges tend to tamp down the minimum guaranteed and sustainable clock rate (the base rate). I charted the base frequencies for a popular mainstream workstation CPU line, Intel’s Xeon W-2200 family, selecting the SKUs with the highest base frequency at each core count. The decline is clear: base frequency drops from 4.1 GHz at 4C to 3.0 GHz at 18C, so as core counts rise, the minimum guaranteed operating frequency (the base GHz) falls.
The tradeoff presents another argument to forgo the higher core counts, assuming your primary interest is maximizing GHz to guarantee the best possible 1T performance.
The inverse relationship of core count and base frequency in Intel’s Xeon W-2200 family. Image source: Intel.
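To put rough numbers on that tradeoff, the short Python sketch below uses only the two endpoints quoted above (4.1 GHz at 4C and 3.0 GHz at 18C). It assumes, as a deliberate simplification, that 1T speed scales linearly with base clock and that a perfectly parallel workload scales linearly with core count; neither holds exactly in practice, and turbo behavior is ignored. The function names are illustrative, not from any library.

```python
# Endpoints quoted in the text for the Xeon W-2200 family: (cores, base GHz).
LOW_CORE = (4, 4.1)
HIGH_CORE = (18, 3.0)


def single_thread_ratio(a, b):
    """Relative 1T speed of part b vs. part a, assuming speed tracks base GHz."""
    return b[1] / a[1]


def aggregate_ratio(a, b):
    """Relative all-core throughput of b vs. a, assuming perfect core scaling."""
    return (b[0] * b[1]) / (a[0] * a[1])


if __name__ == "__main__":
    st = single_thread_ratio(LOW_CORE, HIGH_CORE)
    mt = aggregate_ratio(LOW_CORE, HIGH_CORE)
    print(f"Guaranteed 1T speed of the 18C part: {st:.2f}x the 4C part")
    print(f"Ideal all-core throughput of the 18C part: {mt:.2f}x the 4C part")
```

Under those assumptions, the 18C part guarantees roughly 73% of the 4C part's 1T speed in exchange for about 3.3 times the ideal all-core throughput, a good trade only if the workload can actually keep 18 cores busy.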
A Gray Area: Turbo Frequencies
At the risk of complicating the simple conclusion that a higher core count CPU’s lower base frequency equates to lower 1T performance, it’s worth remembering the impact of modern CPUs’ turbo clock rates, most notably AMD Turbo Core and Intel Turbo Boost Technology. Turbo rates allow a CPU to make use of available electrical and thermal headroom to drive clocks up, but only for a temporary period while that headroom allows.
As discussed here, for sensible engineering reasons, unlike base clocks, turbo rates can actually rise with core counts, albeit not universally and to varying degrees. And it’s that variance that complicates matters, particularly in assessing the 1T performance penalty associated with higher core counts. It is possible that a higher core count CPU can sustain higher turbo rates when only one core is being heavily taxed, and if the higher core count CPU has a higher turbo clock spec, 1T performance will benefit. This is often the case in short-term execution, common in simple benchmarking, but it’s not possible to draw a definitive conclusion on how much your workflow can rely on turbo rates for non-stop 1T processing. (And for heavy multi-threading (MT), you can pretty much disregard the turbo rates, as any significant thermal or electrical headroom to allow sustained turbo frequencies is unlikely.) Ultimately, while higher turbo rates can, ironically, provide an extra incentive to consider higher core counts for 1T processing, when it comes to guaranteed performance, users can only rely on base rates.
Multi-Core Was the Right — Really the Only — Viable Direction to Take
So, if the majority of high-demand workflows are still beholden to 1T performance, and if increasing core counts may not help (and could even hurt) in that respect, why is the CPU industry adamantly pushing down the path of incessant increases in core counts? There are several answers, the first of which has to do with whether we are talking about clients or servers. The reality check discussed above — tempering the conception that professional workflows are ideal for the deployment of many-core processors — applies to clients, not servers. Servers can take full advantage of high core count CPUs, simply because leveraging more cores does not have to rely on multi-thread applications, as is the case for clients. Servers host multiple users and processes, so N cores on a server CPU could theoretically serve N clients, each running 1T workloads.
Still, one can argue that the specific CPU products targeting servers aren’t the same as those targeting clients, so why not build massively-core’d CPUs for servers and fewer-core’d CPUs with massive 1T performance for clients? Well, to some degree that does happen, as CPU product lines for servers, like Intel’s Xeon Scalable and AMD’s EPYC, do sport higher core counts than lines targeting clients.
But the most overarching answer as to why the market for client-focused CPUs continues to focus on higher core counts is straightforward: it’s the only rational way to dramatically increase the overall theoretical performance that can be extracted from the same silicon area as the previous generation. Remember that multi-core architectures haven’t been the norm for all that long. For the bulk of the CPU’s evolution, engineers and architects focused heavily on speeding a single thread of execution. For a while, advancements in transistor switching speed allowed clock rates (from kHz to MHz to GHz) to steadily climb, with each jump in process technology automatically yielding performance improvements. In addition, focus shifted more and more to superscalar techniques, which essentially try to extract fine-grained instruction-level parallelism from a serial thread of execution. With superscalar architectures, a CPU will try to load and execute as many instructions as possible from the same thread and process at the same time, using ever-more complex techniques to wring out every last drop of IPC (instructions per cycle).
The industry rode superscalar refinements and higher Hz for years, but ran into two major roadblocks: thermal limits and diminishing performance returns. With the low-hanging fruit of superscalar techniques long picked, architectures were getting incredibly complex, yet yielding much more modest generation-to-generation returns. And relying on ever-higher clock frequencies to keep performance climbing was pushing thermal output to levels beyond the means to cool the chips.
By contrast, keeping the same or modestly enhanced core at the same rough frequency, and instantiating twice as many on silicon with the denser, next-generation process, doesn’t exacerbate the power problem yet doubles the theoretical maximum throughput. The tradeoff? That gain is only realized if two or more threads are running concurrently, which is not the case in 1T processing. The move to multi-core was the best practical path forward, but not one without the tradeoff of accepting lower generation-to-generation gains in 1T performance.
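One common way to quantify that tradeoff is Amdahl's law, which this column does not invoke by name; it is brought in here purely as an illustration. The sketch below computes the ideal speedup over a single core when only a fraction of the work can run in parallel. The workload labels and parallel fractions are hypothetical, chosen only to show the shape of the curve, not measured from any CAD application.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup over one core when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions, for illustration only.
    workloads = [
        ("mostly serial modeling", 0.10),
        ("mixed workflow", 0.50),
        ("CPU rendering", 0.95),
    ]
    for label, fraction in workloads:
        s4 = amdahl_speedup(fraction, 4)
        s8 = amdahl_speedup(fraction, 8)
        print(f"{label}: 4 cores -> {s4:.2f}x, 8 cores -> {s8:.2f}x")
```

The point the sketch makes is the one in the text: doubling the core count approaches a doubling of throughput only when nearly all of the work is parallel, and it yields almost nothing for a workload dominated by a single thread.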
The Unavoidable Price Tradeoff
Though perhaps not as in tune with the inverse relationship between core count and base frequency, most CAD buyers have been acutely aware of the undeniable relationship between core count and price. Below I chart typical retail prices (sampled online on the same day) for three popular CPU families serving CAD applications. The incremental cost of additional cores is significant moving up the lines, particularly beyond 8C in the range shown.
No, more cores are never free (charting approximate retail pricing for three current workstation-caliber CPU families, sampled online).
The Prevailing Mainstream Wisdom for CAD Buyers: Err to the Higher GHz, Lower Price, and Live Without Max Core Counts
Considering the tradeoffs in price and frequency, the reason lower core count CPUs have continued to reign as the volume leader becomes clear. Why should I buy more cores, when they cost significantly more, and I spend most of my time on tasks that don’t take advantage of them? Better to stay in the vicinity of the market’s sweet spot of price/performance, err towards higher GHz, and take whatever number of cores comes by default.
That understandable buying strategy is supported not only by the fact that quad-core dominated for such a long period, but also by the fact that the market’s median CPU is just now making the move up to 6 and 8 cores. That is, many folks are buying 6C and 8C CPUs not necessarily because they want 50% to 100% more multi-thread processing power, but because the sweet spot in the market is naturally rising to the 6C/8C neighborhood. Consider Intel’s 12th Generation Core family built on Alder Lake (covered last month here), whose highest performing SKUs now come with 12 to 16 cores by default. Ultimately then, even those focused primarily or exclusively on the best 1T performance per dollar will still be driving up that median core count.
A Caveat: Multi-Core and Superscalar Architectures Won’t Be Enough Moving Forward — Neither Will Sticking with Few-Core CPUs
To emphasize an earlier point, putting to rest the misconception that high core count CPUs are the norm in professional, workstation-caliber applications does not imply they don’t have a place. They most certainly do, and for those with the budget and a must-have need for performance on highly parallel workloads, 32 or even 64 cores will likely represent the shopping focus. Rather, it’s the bulk of the user base that is responsible for actual CPU buying behavior running counter to perception.
All this raises the question of whether the mainstream CAD community’s buying habits will continue to lean in favor of better GHz, higher 1T performance, and fewer dollars, or begin to place more value on higher core counts moving forward. While it wouldn’t be prudent to forecast a decline in the reliance on 1T processing for CAD, I would speculate that workflows will take on more advanced and useful tasks that do leverage MT. I think of tasks like complex lighting simulation and rendering for AEC, machine learning (for things like generative design), and more frequent use of fine-grained engineering simulations earlier in the design cycle. Ultimately, though, the mid-point of most markets is, and always will be, heavily sensitive to price, where saving dollars always means sacrificing something. And especially when mainstream CPUs will continue to rise in core count gradually regardless (as in 2022, pushing to 6C and 8C), for many, pushing for substantially higher counts is likely to remain that something sacrificed.