Can 1-to-1 Remote Workstations Provide the Same Performance as Local Machines?
4 Jun, 2020 By: Alex Herrera
Herrera on Hardware: Are you concerned about latency with a remote computing solution? You’ll want to evaluate whether response time and image quality meet your expectations or are noticeable enough to be an issue.
Editor's Note: Click here to read Part 1 of this article, "Boxx Expands into Remote Workstations with Help from Cirrascale."
Boxx Technologies’ acquisition of Cirrascale and subsequent launch of Boxx Cloud Services was not only prescient in its timing, but unique in what it’s brought to the rapidly expanding cloud computing ecosystem. As explored in the first part of this article, Boxx Cloud Services is one of the most recent providers of remote desktop hosting solutions — a launch that dovetailed with the world’s urgently renewed interest in remote computing, triggered by the COVID-19 crisis.
While it’s not first to the cloud computing party, its offerings are anything but copies of hosted desktops available from names like Amazon Web Services and Microsoft Azure. Boxx Cloud Services’ for-rent workstations offer one-to-one dedicated hosted machines — not just comparable to their traditional deskbound machines, but identical. Forget slower-clocked server-optimized CPUs and the shared memory, storage, and GPU resources of a virtualized cloud platform; Boxx Cloud Services workstations would represent the top end in performance (including overclocked CPUs), were they packaged and sold as deskside towers.
Verifying the Premise of Identical Performance
Now, while it’s theoretically solid to argue that the system throughput of the remote machine should essentially match the identically configured local machine, I (with Boxx’s help) went ahead and benchmarked anyway. We ran SPECwpc 3.0.4’s Product Development (focusing on common CAD compute and visual workloads), General Operations, and GPU Compute test suites. The results supported the theory, no surprise, as five composite results for workloads stressing CPU, graphics, storage, and GPU compute showed tight tracking between systems.
Differences were extremely small — in the noise — with the exception of 3D graphics performance. Overall, graphics ran about 5% slower on the remote machine, a result with an understandable explanation: PCoIP does chew up a bit of overhead, most notably in encoding the desktop screen as a video stream for return transmission. By default, PCoIP Client Software has to both perform that encoding in software and interrupt GPU graphics processing to fetch frames from video memory, the combination of which could logically account for a 5% hit. The good news is that PCoIP Client Software now also supports hardware video encoding on Nvidia RTX–class GPUs, further offloading the CPU and reducing that penalty (though this is a remedy I did not test).
No surprise, the same workstation produces essentially the same throughput (when tested with the SPECwpc 3.0.4 benchmark), no matter where it is.
Network — Perhaps Especially Latency — Is the Most Important Performance Consideration
Using SPECwpc to confirm that two essentially identical machines deliver the same throughput is not a particularly revealing comparison (beyond quantifying that modest and explainable graphics performance penalty). We're talking about the same Boxx model, in one case sitting next to your desk and in the other sitting in a rack somewhere else. The more meaningful question, when comparing a local workstation under the desk to the same machine in a remote datacenter, is how well the network between you and the remote workstation, both the local-area and wide-area segments (LAN and WAN), can support the display of your desktop screen and your interactivity with it. Essentially, that comes down to bandwidth and latency. With respect to bandwidth, the network must carry at least one (and more likely, two or three) encoded streams at Full HD resolution or higher from datacenter to client. Thankfully, given the robust improvement in available bandwidth from mainstream LAN technologies and WAN providers, bandwidth is arguably the lesser concern: modern Internet access will more than likely suffice in the vast majority of small business and home offices.
Often, the more worthy consideration than a network connection's available bandwidth is its latency, the factor that can turn an otherwise pleasant interactive computing session into an irritating struggle. What ultimately matters in each user's environment is the round-trip time (RTT): the delay between the moment you make a request and the moment its result appears on your local screen. When you click the mouse to change the model view, for example, all of the following occurs: your local client encodes the input with PCoIP (in this case) and transmits it over your LAN, through your router, onto the WAN, and eventually across the datacenter LAN to your allocated workstation. For the return trip, the remote machine processes the request (exactly as it would a mouse click on a machine connected directly at your desk), renders the updated graphical view, encodes the updated screen with PCoIP software, and transmits it through the datacenter LAN and router, back over the WAN to your router, then across your LAN to your client, whose PCoIP software decodes the desktop screen and displays it on your monitor. It sounds like a lot, but most of that happens in the blink of an eye on a local workstation as well; the incremental difference in the remote solution is roughly the time spent crossing the entire network twice.
Dragging a window around the hosted desktop very rapidly — at a rate that would stress the responsiveness of the system, albeit far faster than I would ever move it in normal use — did reveal a noticeable lag between mouse location and window location. But while noticeable, it was certainly not irritating. Other subjective tests I used to stress the interactive round trip, like fast zooming in and out of a Google map and scrolling on a web page, showed a lag I could notice, but just barely. In the context of CAD, a more difficult test of response might be very rapid, continuous pan-and-zoom of models.
So yes, chances are that even with 70-millisecond (ms) latency, you can contrive interactive sequences that make a remote solution noticeably less responsive than the deskside machine. But that leads to two questions: How often do you actually engage in that worse- or worst-case behavior, such as extremely rapid and continuous pan-and-zoom? And even if the lag is noticeable, is it annoying? If not, then even if you can find perceptible lag by going out of your way, as I did with niche usage, it probably doesn't translate to a negative overall experience.