Published on: August 30, 2021
Which vehicle would you rather own: a car or a bus? If you typically transport only yourself and a few others, a car is the better choice: you can easily get everyone across town in one trip, and a car is less expensive to purchase and operate than a bus. But what if you had 100 people to transport? All of a sudden, making around 20 trips across town in a car becomes quite expensive, and the bus becomes the more cost-effective option.
The same analogy applies to hardware selection for discrete element method (DEM) simulations: the kind of CPU and/or GPU devices that will work best for your simulations depends upon your problem size and your budget. And with Rocky software, which supports processing across two or more GPUs (also known as multi-GPU processing), the kind of hardware you choose is more important, and potentially more confusing, than ever.
In this post, we’ll take a closer look at the multi-GPU processing available in Rocky and offer some guidance for making the best possible hardware decisions.
Rocky multi-GPU processing: how it works
Large-scale DEM simulations with millions of particles consume large amounts of hardware memory. In addition, CPU memory can be quite expensive and simulation performance can vary drastically. A single CPU or GPU (Graphics Processing Unit) has a limited amount of memory, and the number of particles it can handle is restricted by that memory.
The multi-GPU solver in Rocky, however, overcomes this memory restriction by efficiently distributing and managing the combined memory of two or more graphics cards within a single motherboard. For example, a cyclone separator model (Figure 1) run in Rocky 4.5 with the multi-GPU solver simulated 200 million particles. Particle counts this high were not possible previously but are now a reality thanks to the multi-GPU capabilities found in Rocky software.
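To make the memory argument concrete, here is a rough back-of-the-envelope sketch of how many cards it might take to hold a given particle count. The function name, the per-particle memory footprint, and the card size are all hypothetical placeholders chosen for illustration; they are not Rocky specifications, and real memory use depends on particle shape, contact data, and solver settings.

```python
import math


def gpus_needed(num_particles, bytes_per_particle, gpu_memory_gib, usable_fraction=1.0):
    """Estimate how many identical GPUs are needed to hold all particles in memory.

    Every input here is an illustrative assumption, not a Rocky figure.
    """
    if num_particles <= 0:
        return 0
    # Memory on one card that is actually available for particle data.
    usable_bytes = gpu_memory_gib * 1024**3 * usable_fraction
    # Whole particles that fit on a single card.
    particles_per_gpu = int(usable_bytes // bytes_per_particle)
    if particles_per_gpu == 0:
        raise ValueError("one GPU cannot hold even a single particle")
    # Round up: a partially filled card still counts as a card.
    return math.ceil(num_particles / particles_per_gpu)


# Hypothetical inputs: 200 million particles, 1,000 bytes each, 32 GiB cards.
print(gpus_needed(200_000_000, 1_000, 32))
```

With these placeholder numbers, a single card falls short, so the particles have to be spread across several cards. That is the situation the combined memory of a multi-GPU setup is meant to address.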

How to choose the right hardware
To get the maximum performance from the hardware, Rocky software is programmed to use only NVIDIA GPUs.
Selecting the type of hardware you need comes down largely to cost versus benefit. Even a single GPU card offers more value than an 8-core CPU.
In our benchmark studies for the Rocky 4.5 release, which measured how simulation speed scales with the number of GPUs, we observed much better scaling for large particle counts than for small ones (Figure 2). And while all cases typically benefited from the addition of at least one GPU, the higher the number of particles simulated, the more benefit was seen from adding more than one GPU device.
Figure 2: Rocky 4.5 speed-up (relative to 8 CPU cores) by number of GPUs and number of particles simulated.
For more details about how the numbers in Figure 2 were calculated, please visit the companion blog post: More GPUs = faster processing with Rocky.
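If you want to run a similar comparison on your own hardware, the usual way to express speed-up is the baseline run time divided by the new run time. The sketch below shows that generic calculation along with a simple scaling-efficiency measure. The function names and the timings are made up for illustration, and this is not necessarily the exact method used for the benchmark figure.

```python
def speedup(baseline_seconds, run_seconds):
    """Return how many times faster a run is than the baseline run."""
    if baseline_seconds <= 0 or run_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / run_seconds


def scaling_efficiency(single_gpu_seconds, multi_gpu_seconds, num_gpus):
    """Return the fraction of ideal (linear) scaling achieved with num_gpus cards."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return speedup(single_gpu_seconds, multi_gpu_seconds) / num_gpus


# Hypothetical wall-clock times for the same case, in seconds.
cpu_8_cores = 800.0   # baseline: 8 CPU cores
one_gpu = 100.0
four_gpus = 40.0

print(speedup(cpu_8_cores, one_gpu))               # speed-up of 1 GPU over the CPU baseline
print(speedup(cpu_8_cores, four_gpus))             # speed-up of 4 GPUs over the CPU baseline
print(scaling_efficiency(one_gpu, four_gpus, 4))   # 1.0 would mean perfect linear scaling
```

An efficiency close to 1.0 means each added card is pulling its weight. Values well below that suggest the problem is too small to keep all the cards busy.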
In summary, a good way to think about GPU vs. CPU performance is the car vs. bus question: just as a bus is better for transporting many people across town, a multi-GPU solver (two or more GPUs) is better for simulating many millions of particles and offers more value for money than a CPU alone. However, multi-GPU may provide little benefit for small problems (fewer than 30 thousand particles), so a CPU could be the best bang for your buck in that case, just as a car is the better choice for transporting only a few people.
Rocky 4.5
If you are curious to know more about how your simulation would perform with our Rocky 4.5 multi-GPU solver, watch our webinar for free: