With RDNA2, AMD Takes Its Turn with a New GPU Generation
13 Nov, 2020 By: Alex Herrera
Herrera on Hardware: The impressive introduction in gaming-focused products foreshadows the next generation of professional graphics products for CAD.
In September, the big news on the visual computing front for professionals was NVIDIA’s Ampere. The GPU leader released its first workstation-class product built from Ampere and geared to applications ranging from CAD to digital media creation and more. (See "NVIDIA Unveils Ampere-Generation RTX A6000 and A40, Affirming Evolving — and Aggressive — Path for its GPUs.")
NVIDIA dominates the space for GPUs in the CAD market, but it isn’t the only vendor of consequence. AMD trails in presence among professional graphics markets, but it has by no means given up the fight, pumping out its new generations of GPU technology at a regular cadence as well.
And if recent revelations are any indication, AMD may have its best shot in more than a decade to take meaningful share in the market for professional graphics hardware. In late October, the company unveiled its first product based on the new-but-anticipated RDNA2 architecture, close on the heels of the Ampere release and foreshadowing what should come soon in competitive professional-class GPU products for CAD users.
The Current Radeon Pro GPUs Based on RDNA Architecture
Introduced in the summer of 2019 under the Radeon gaming brand, Navi was the first incarnation of AMD’s current GPU architecture, RDNA, now driving the company’s Radeon Pro brand GPUs for professional visual computing. Today, Navi/RDNA forms the silicon foundation for the mid-range Radeon Pro W5500 and W5700 GPUs, both launched earlier this year. (It’s worth adding that the “mid-range” descriptor reflects my tracking of the market, though AMD refers to both cards as “high-end.”)
At the top level, RDNA does not look dramatically different from the company’s preceding architecture, Graphics Core Next (GCN), as both incorporate a similar array of Compute Units (CUs) backed by cache, memory controllers, and graphics-specific functional units (e.g., rasterizer, geometry processor, and render back-ends).
The basic foundation of RDNA and Navi-based Radeon Pro: an array of Compute Units with supporting hardware. Image source: AMD.
RDNA2 Architectural Advancements and Potential Payoff for CAD
As the name suggests, RDNA2 doesn’t represent a major departure from the company’s fundamental architectural approach with RDNA; think of it more as an enhancement or kicker to the architecture. However, where such descriptions often imply minor improvements in performance and functionality, that’s not the case with RDNA2 — at least not based on what AMD’s shown with the first physical implementation of RDNA2, the gaming-oriented Radeon 6000 GPUs.
From a silicon standpoint, the first RDNA2 chip, nicknamed “Big Navi,” integrates 26.8 billion transistors on the same 7-nm silicon process technology as the original Navi. What’s new in the RDNA2 architecture? At its unveiling, the company highlighted several relatively nebulous improvements: “high-performance Compute Units,” “revolutionary Infinity Cache,” “breakthrough high-speed design,” and “advanced features” … on the surface, not a whole lot of detail.
Based on the collateral the company provided at Big Navi’s introduction, though, I’d push to the forefront the following most concrete advancements for Big Navi and RDNA2, the combination of which does support the company’s fuzzier top-level claims.
- More Compute Units (the atomic processing engines in the architecture).
- Compute Unit microarchitectural improvements in both throughput and ray-tracing acceleration.
- Addition of a big, high-performance Infinity Cache.
First off, “breakthrough high-speed design” likely and primarily refers to the engineering that went into a 30% increase in clock rates relative to first-gen Navi, achieving in excess of 2 GHz internally. That combination of more CUs at higher clock rates yields what AMD claims is, on average, a 2X gaming performance increase. (Of course, such metrics always have to be taken with a grain of salt, as performance will always depend on software, datatypes, and application and user behavior as well as peak hardware rates.)
Keeping Up with the Joneses … AMD’s First Dedicated Hardware Ray-Tracing Acceleration
Whether or not you’re a believer in the aggressive move NVIDIA took in deploying GPU hardware to accelerate ray tracing with 2018–19’s Turing architecture (and reaffirmed with 2020’s Ampere), it’s a move that forced AMD to take notice. To me, NVIDIA was preaching to the choir: I’ve long held that the GPU industry needed to start focusing on accelerating rendering (for which ray tracing in some form is the most commonly implemented approach), rather than on traditional raster-based 3D graphics only.
Rendering is based in the physical reality of our world: lighting, materials, and human eye/brain receptivity. In contrast, 3D graphics has historically been (to varying degrees) an excellent compromise, approximating reality while maintaining real-time interactive performance. With the incredible throughput available in high-density silicon today, combined with the diminishing returns of proceeding exclusively down the path of refining traditional 3D graphics, I have argued we’re ready to push down the gradual path of that transition. For CAD users, that eventually (emphasize “eventually,” as we’re not talking overnight by any stretch) will mean interactively developing with more photorealistic ray tracing and relying less on traditional 3D graphics.
Still, I’d argue that whether AMD or the 3D visual computing ecosystem embraces the same rationale is moot. NVIDIA drives the GPU market with an impressive and successful record of pushing the GPU on new paths, unilaterally if necessary. Simply put, AMD had to provide some ray-tracing silicon oomph in its next-generation architectural release. And it has.
The most notable manifestation is the addition of one Ray Accelerator (RA) per CU. Similar in purpose to NVIDIA’s RT Core (in Turing and Ampere), the RA handles the more awkward and compute-intensive calculations for intersecting rays and surfaces, a critical performance path in the overall rendering workload. To expose the RA functionality to software, AMD is leaning on Microsoft’s DirectX 12 Ultimate, which makes ray tracing available to independent software vendors (ISVs) for integration in applications.
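To give a feel for the kind of ray/surface intersection math that gets offloaded to dedicated hardware, here is a minimal sketch of a ray-triangle test using the well-known Möller–Trumbore algorithm. This is a textbook illustration only: AMD has not detailed the RA’s internals here, and the function and helper names below are my own assumptions, not any vendor’s API.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle_hit(origin: Vec3, direction: Vec3,
                     v0: Vec3, v1: Vec3, v2: Vec3,
                     eps: float = 1e-9) -> Optional[float]:
    """Return the distance t along the ray to the hit point, or None on a miss.

    Hypothetical helper (Moller-Trumbore); not AMD's hardware implementation.
    """
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin

# A ray pointing straight down at a triangle lying in the z = 0 plane.
triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle_hit((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle)
miss = ray_triangle_hit((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle)
```

A rendered frame can require an enormous number of tests like this one, which is why moving the intersection work into fixed-function hardware matters for performance.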
For more on the rationale behind a shift toward hardware-based ray tracing, check out these previous columns: "What Does NVIDIA’s Ray Tracing News Mean for the CAD Market?" and "Herrera on Hardware: GPU Technology Conference."
From an implementation standpoint, the most significant step forward in RDNA2 appears to be the Infinity Cache. Based on the cache architecture built for the company’s Zen 3 CPU microarchitecture, the Infinity Cache delivers 128 MB of on-chip memory. That ample capacity, operating at fast internal rates, allowed AMD engineers to drop the memory interface from today’s typical 384 bits down to 256 bits. The narrower interface reduces the peak bandwidth available to external memory by about 33%, but the addition of the on-chip Infinity Cache allows Big Navi to double the overall effective memory bandwidth of a conventional, external-only 384-bit GDDR6 interface.
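The interface arithmetic is easy to check. The sketch below computes peak external bandwidth for both bus widths, assuming a per-pin GDDR6 data rate of 16 Gbps; that rate is my assumption for illustration and is not a figure quoted above. The "effective" number simply applies the 2X claim to the 384-bit baseline.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak external memory bandwidth in GB/s: pins x per-pin rate / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8.0

# Assumed per-pin GDDR6 data rate (illustrative; not stated in the column).
DATA_RATE_GBPS = 16.0

wide = peak_bandwidth_gb_s(384, DATA_RATE_GBPS)    # conventional 384-bit interface
narrow = peak_bandwidth_gb_s(256, DATA_RATE_GBPS)  # Big Navi's 256-bit interface
reduction = 1.0 - narrow / wide                    # fraction of peak external bandwidth given up

# The claimed effective bandwidth with Infinity Cache: double the 384-bit baseline.
claimed_effective = 2.0 * wide

print(f"384-bit peak: {wide:.0f} GB/s")
print(f"256-bit peak: {narrow:.0f} GB/s ({reduction:.0%} lower)")
print(f"Claimed effective with Infinity Cache: {claimed_effective:.0f} GB/s")
```

Note that the 33% reduction depends only on the ratio of bus widths, so it holds for any per-pin data rate; only the absolute GB/s figures depend on the assumed 16 Gbps.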
Why not just leverage the benefits of more cache and keep the wider, higher-bandwidth interface? In one word, power. Big Navi, along with all high-performance GPUs today, bumps up against practical power limits, both in terms of supplying the power and dissipating the heat resulting from burning all those watts. Vendors usually don’t want to exceed the range of 250–300 W, and the more bits/wires running around between chips (either intra-package or across the printed circuit board), the higher the power. Even with the narrower interface, Big Navi hits 300 W, so clearly the Infinity Cache was a favored solution both for performance and power reduction, but retaining the wider external interface would have been too much. By increasing performance while holding or cutting power, AMD claims a 54% improvement in Big Navi’s performance/watt, as compared to first-generation RDNA.
Big Navi’s Infinity Cache enabled AMD to double the bandwidth of a conventional 384-bit GDDR6 interface, while cutting a bit of power. Image source: AMD.
Anticipating a Big Navi Radeon Pro GPU for Professionals Soon
No, AMD doesn’t have a workstation-class product to compete with NVIDIA’s first Ampere-generation GPU today. But the Big Navi silicon that powers the Radeon 6000 series will in all likelihood be tapped to bring RDNA2 to CAD professionals — and probably soon. The company hasn’t made any announcements, but based on past rollouts, I’d expect to see a Big Navi–powered Radeon Pro by the end of Q1’21.
And when it turns up, I’ll be reviewing its promise for CAD use. Based on its raw hardware metrics, as well as gains in real-world gaming performance, I think we’ve got a good chance of seeing the most competitive AMD professional GPU in years.