Published on: April 3, 2020
Updated on: November 18, 2020
In the past, DEM simulations were restricted to relatively small problems that used, for example, only thousands of larger particles that were mostly spherical in shape.
Recent improvements in both DEM codes and computational power have enabled closer-to-reality particle simulations. Users today can expect to simulate problems using the real particle shape and the actual particle size distribution (PSD), creating DEM simulations with millions of particles.
However, these gains in simulation accuracy have come at the cost of increased computational load, in both processing time and memory requirements. Within Rocky DEM, this load can be offset considerably by GPU processing, which lets users obtain results in a more practical time frame.
The addition of GPU processing has helped make DEM a practical tool for engineering design. For example, the speed-up experienced by processing a simulation with even an inexpensive gaming GPU is remarkable when compared to a standard 8-core CPU machine working alone.
As of Rocky 4, users can take advantage of multi-GPU processing, which makes it possible to run large-scale and/or complicated simulations that were previously impossible to tackle due to memory limitations. By combining the memory of several GPU cards, users can overcome these limitations, and by aggregating the cards' computing power they can achieve a substantial performance increase.
From an investment perspective, there are many benefits to multi-GPU processing. The hardware cost of running cases with several million particles on multiple GPUs is much lower than that of an equivalent CPU-based machine. GPUs also consume less energy, and GPU-based machines are easier to upgrade, either by adding more cards or by replacing them with newer ones.
Figure 1. Rotating drum benchmark case
CPU: Intel Xeon Gold 6230 @ 2.10GHz (8 cores used)
GPU: NVIDIA A100 and NVIDIA V100 @ Oracle Cloud
Multi-GPU: 1x, 2x, 4x
Figure 2. 16-triangle polyhedron used in benchmark
A performance benchmark of a rotating drum (Figure 1) illustrates the speed-up possible for common applications. The benchmark compares CPU performance against GPU and multi-GPU performance. As the number of particles increased, the drum geometry was adapted to keep the same number of particles per unit length and the same number of contacts per particle (coordination number). The case used polyhedrons of equivalent size, each built from 16 triangles (Figure 2).
Results show a significant performance gain for GPU over CPU simulation: up to 80x faster for the NVIDIA V100 and 92x for the A100 when compared with an 8-core Intel Xeon Gold 6230 CPU @ 2.10GHz. Also, at peak performance, the A100 is 24% faster than the V100 and scales further as the particle count increases. Maximum GPU gain is achieved at approximately 500k particles per GPU. The following charts show the performance improvement when switching from V100 to A100 GPUs at different particle counts.
Figure 3. CPU vs. GPU scale-up for polyhedrons
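The benchmark figures above can be turned into a rough planning estimate. The Python sketch below is not part of Rocky DEM. The function names are hypothetical, and it assumes the roughly 500k particles/GPU sweet spot and the peak speed-ups of 80x (V100) and 92x (A100) carry over to your own case. Those numbers were measured for 16-triangle polyhedrons on the hardware listed above, so treat the output as a first guess rather than a prediction.

```python
# Rough sizing helpers based on the rotating drum benchmark figures.
# All names here are hypothetical; nothing below is a Rocky DEM API.

# Approximate per-GPU load at which the benchmark reached maximum gain.
PARTICLES_PER_GPU_AT_PEAK = 500_000

# Peak speed-ups reported against the 8-core Xeon Gold 6230 baseline.
PEAK_SPEEDUP = {"V100": 80.0, "A100": 92.0}


def gpus_for_peak_gain(n_particles, particles_per_gpu=PARTICLES_PER_GPU_AT_PEAK):
    """Largest GPU count that still gives each card about particles_per_gpu.

    Assumption: adding cards beyond this point leaves each GPU underloaded,
    so the per-card gain drops. At least one GPU is always returned.
    """
    return max(1, n_particles // particles_per_gpu)


def estimated_gpu_hours(cpu_hours, gpu_model):
    """Best-case wall-clock time if the peak benchmark speed-up applied."""
    return cpu_hours / PEAK_SPEEDUP[gpu_model]


if __name__ == "__main__":
    n = 2_000_000
    print(f"{n:,} particles -> about {gpus_for_peak_gain(n)} GPU(s)")
    for model in PEAK_SPEEDUP:
        hours = estimated_gpu_hours(184.0, model)
        print(f"184 CPU-hours on {model}: about {hours:.1f} h at peak speed-up")
```

Integer division is used so that the suggested GPU count never pushes the per-card load below the benchmark's sweet spot. Real cases with different particle shapes or contact densities may peak at a different load.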
Moreover, in a world where we push multiphysics simulations ever further, Rocky GPU and multi-GPU processing enable you to free up all your CPUs for coupled simulations, avoiding competition for hardware.
The infographic below shows how the GPU and multi-GPU processing capabilities available in Rocky DEM can help you speed up your particle simulations, regardless of the size of your business:
For speed tests and hardware recommendations, see Rocky 4 with Multi-GPU: Which Hardware is Best for You?