Allocate Your Workstation Budget According to Your Workload

3 Sep, 2014 By: Alex Shows

Uncertain about how best to spend your hardware dollars? A careful evaluation of the type of work to be done — and these guidelines — will provide the answers.

To compare CPUs more precisely, look at four frequencies: the low-frequency mode (LFM), the high-frequency mode (HFM), the minimum Turbo frequency (all cores loaded), and the maximum Turbo frequency (one core loaded). Compare LFM if you want more power efficiency at idle; ask yourself how important it is that the CPU consumes as little power as possible when it isn’t doing any work. Compare HFM if the CPU doesn’t support Turbo. Compare the minimum Turbo frequency if the CPU will spend most of its time running computational workloads, which load all cores. And finally, compare the maximum Turbo frequency if the CPU will spend most of its time running interactive workloads, which tend to load a single core.
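
The selection rule above can be written down as a small lookup. The following is a minimal sketch, not a vendor tool: the `CpuFrequencies` type, the workload labels, and the two example CPUs (with made-up frequencies) are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CpuFrequencies:
    """Frequencies in GHz. Turbo fields are None if the CPU has no Turbo."""
    name: str
    lfm_ghz: float
    hfm_ghz: float
    min_turbo_ghz: Optional[float] = None  # all cores loaded
    max_turbo_ghz: Optional[float] = None  # one core loaded


def frequency_to_compare(cpu: CpuFrequencies, workload: str) -> float:
    """Return the frequency that matters most for the given workload."""
    if workload == "idle":
        return cpu.lfm_ghz
    if workload not in ("computational", "interactive"):
        raise ValueError(f"unknown workload: {workload!r}")
    # Without Turbo, HFM is the only loaded frequency available to compare.
    if cpu.min_turbo_ghz is None or cpu.max_turbo_ghz is None:
        return cpu.hfm_ghz
    if workload == "computational":
        return cpu.min_turbo_ghz
    return cpu.max_turbo_ghz


# Hypothetical parts; the numbers are illustrative, not real specifications.
many_cores = CpuFrequencies("hypothetical 12-core", 1.2, 2.7, 3.0, 3.5)
fewer_cores = CpuFrequencies("hypothetical 4-core", 1.2, 3.7, 3.8, 4.0)

for workload in ("computational", "interactive"):
    best = max((many_cores, fewer_cores),
               key=lambda cpu: frequency_to_compare(cpu, workload))
    print(workload, "->", best.name)
```

Frequency is only one input. For computational work, core count also matters, so treat the result as one factor in the comparison, not a verdict.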

Whether you maximize CPU core count, maximize frequency, or find a compromise between the two, always seek the latest CPU microarchitecture and generation. Newer CPU generations typically come with either a process shrink (smaller transistors) or a new architecture. Newer architectures often bring greater performance at the same frequency, and the benefits of this extend beyond just CPU performance. Because many applications spend time waiting on a single core to feed the GPU instructions and data, graphics performance increases as the CPU’s integer performance increases. This means that a CPU running at the same frequency, but built on a newer-generation architecture, can provide higher frames per second with the same graphics card!

GPU Guidelines

In general, when it comes to graphics cards (GPUs), the more you spend the more speed you can buy. Speed in graphics is most commonly associated with real-time rendering performance, as measured in frames per second. The higher your frames per second in an application, the more fluid your interactions with the data model, and the more productive you can be. Computational capabilities aside, finding the right graphics solution for a workstation depends on the desired frames per second in the applications of greatest interest.

A good rule of thumb for graphics performance is to look for a card that is capable of delivering more than 30 frames per second in the most important applications, using data models and rendering modes most like those in your day-to-day use. While roughly 25 frames per second is commonly cited as the minimum for animation to appear smooth, more is always better. If a particular graphics card is able to deliver more than 100 frames per second in a particular rendering method using a specific model size and type, it is reasonable to assume that you can increase the complexity and/or size of the model and still be able to interact with that model without observable stuttering.
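
The headroom argument above is easier to see in frame times. The sketch below is rough arithmetic only, and it rests on a simplifying assumption that is not from the benchmark data: that frame time grows in proportion to model complexity. The function names are hypothetical.

```python
def frame_time_ms(fps: float) -> float:
    """Time available to render one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def complexity_headroom(measured_fps: float, target_fps: float = 30.0) -> float:
    """How many times more complex the model could become before the
    frame rate falls to target_fps, ASSUMING frame time scales linearly
    with model complexity (a simplification)."""
    return frame_time_ms(target_fps) / frame_time_ms(measured_fps)


measured = 100.0  # frames per second reported for your model and mode
print(f"frame time at {measured:.0f} fps: {frame_time_ms(measured):.1f} ms")
print(f"headroom down to 30 fps: {complexity_headroom(measured):.1f}x")
```

Real applications rarely scale perfectly linearly, so the result is best read as an upper bound on how much the model can grow.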

SPECviewperf is an excellent benchmark for comparing workstation graphics cards because it measures the frames per second of several varied workloads using rendering methods that mirror those of popular workstation applications. Anyone can view the detailed frames-per-second measurements of several different methods of rendering and compare graphics card performance based on published results, as well as see representative screen captures of the image quality of these methods. If one were a user of PTC Creo, for example, one could use this data to compare how one card performs versus another, not just in Creo, but specifically with a data model and rendering mode that most closely represent a particular use of Creo.

Thus when considering graphics, weigh the amount of time in a typical day that the workstation will spend in either highly interactive work, or in computational work that utilizes the GPU. The more time spent in these usage types, the greater the portion of the workstation budget that should be spent on graphics.

The Importance of Memory

It has been said that you can never have too much random-access memory (RAM). While that adage may be true for modern multicore systems running massively multithreaded applications, it is still very important to weigh other factors when considering which type of memory to include in the workstation. For computational workloads, you’ll almost always want to maximize the amount of memory bandwidth available to the processing cores. Thus if given the choice between eight DIMMs of 8 GB each and four DIMMs of 16 GB each (64 GB either way), choose the option that populates more DIMM slots, provided the additional DIMMs fill memory channels that would otherwise sit empty. The increase in available memory bandwidth will reduce the likelihood that memory bandwidth is the bottleneck for computational workloads, shifting the bottleneck back to the CPU cores, frequency, and cache.
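
To put numbers on the DIMM-count advice, theoretical peak bandwidth is the number of populated channels times the transfer rate times the bus width. The sketch below assumes a hypothetical dual-socket workstation with eight memory channels in total, DDR3-1600 memory, a 64-bit (8-byte) data bus per channel, and DIMMs installed one per channel before any channel gets a second DIMM. None of these specifics come from the article; substitute your own system's values.

```python
BYTES_PER_TRANSFER = 8  # 64-bit data bus per memory channel


def populated_channels(dimm_count: int, total_channels: int) -> int:
    """Channels in use, assuming DIMMs are spread one per channel first."""
    return min(dimm_count, total_channels)


def peak_bandwidth_gb_s(channels: int, megatransfers_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal gigabytes)."""
    return channels * megatransfers_per_s * BYTES_PER_TRANSFER / 1000


TOTAL_CHANNELS = 8   # assumption: two sockets, four channels each
DATA_RATE = 1600     # assumption: DDR3-1600, in MT/s

for dimms, size_gb in ((8, 8), (4, 16)):
    channels = populated_channels(dimms, TOTAL_CHANNELS)
    bandwidth = peak_bandwidth_gb_s(channels, DATA_RATE)
    print(f"{dimms} x {size_gb} GB = {dimms * size_gb} GB, "
          f"{channels} channels, {bandwidth:.1f} GB/s peak")
```

Both configurations provide the same capacity, but under these assumptions the eight-DIMM layout has twice the theoretical peak bandwidth. Sustained bandwidth in real workloads will be lower than the peak.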

Choosing the right frequency is also important, and the best choice varies with the workload. In applications requiring maximum memory bandwidth, populating all available DIMM slots with the highest-frequency memory is important. However, some applications require the lowest latency possible, irrespective of available bandwidth, and in that case you would want to populate all available DIMM slots with lower-frequency memory that offers lower access latency. An example is random access to a dataset that is too large to fit in the CPU cache, where the CPU has no way to predict which memory location will be accessed next. While memory bandwidth remains important, the lower latency of the slower memory can benefit these random reads and writes.
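
One way to check the latency side of this tradeoff is to convert a module's CAS latency from clock cycles to nanoseconds. The sketch below uses the standard conversion for double-data-rate memory (the clock runs at half the transfer rate). The two sets of timings are illustrative values; check the datasheets of the modules you are actually comparing, since CAS latency is also only one component of total access latency.

```python
def cas_latency_ns(cas_cycles: int, megatransfers_per_s: float) -> float:
    """CAS latency in nanoseconds for DDR memory.

    DDR transfers data twice per clock, so the clock period in ns is
    2000 / (data rate in MT/s).
    """
    if megatransfers_per_s <= 0:
        raise ValueError("data rate must be positive")
    return cas_cycles * 2000.0 / megatransfers_per_s


# Illustrative timings (assumptions, not measurements of specific modules).
modules = {
    "DDR3-1333 CL9": (9, 1333),
    "DDR3-1866 CL13": (13, 1866),
}

for label, (cl, rate) in modules.items():
    print(f"{label}: {cas_latency_ns(cl, rate):.2f} ns")
```

With these example timings, the lower-frequency module has slightly lower CAS latency even though its bandwidth is lower. With different timings the comparison can go the other way.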

Lastly, when the integrity of data used in individual computations is paramount to the end result, error-correcting code (ECC) memory should be used. ECC memory stores extra check bits alongside each word of data, computed from the data bits when the word is written. When the word is read back, the check bits are recomputed and compared; the memory can not only detect when the data is incorrect, but can also correct the error (typically any single-bit error, while double-bit errors are detected but not corrected). This is especially important when iterating across a large dataset where the outputs of computations are continually provided as inputs into another sequence of computations, because one mistake missed in early computations can have a dramatic impact on the final outcome.
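
To show how check bits can locate and repair a flipped bit, here is a toy Hamming(7,4) code, which protects four data bits with three check bits. It illustrates the principle only. ECC DIMMs apply the same idea to much wider words in hardware, and the function names here are hypothetical.

```python
def hamming74_encode(data):
    """Encode four data bits as a 7-bit codeword.

    Layout by position 1..7: p1, p2, d1, p3, d2, d3, d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position).

    error_position is 0 if no error was detected; otherwise it is the
    1-based position of the single bit that was flipped and corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], error_position


word = [1, 0, 1, 1]
stored = hamming74_encode(word)

corrupted = list(stored)
corrupted[4] ^= 1  # simulate a single-bit memory error at position 5

recovered, position = hamming74_decode(corrupted)
print("recovered:", recovered, "corrected position:", position)
```

This toy code corrects any single flipped bit but cannot reliably handle two flips in the same codeword. Practical ECC schemes add a further check bit so that double-bit errors are at least detected.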


About the Author: Alex Shows
