Graphics card computing
Graphics cards can be used very efficiently for scientific computing. As is known from many fields
of numerical physics (e.g. molecular dynamics), using a graphics card instead of the computer processor
can lead to speedups of up to several hundred times, which can be a breakthrough in many scientific problems.
Besides attempts to run different calculations on graphics cards, such as classical molecular dynamics
(see A. Campbellová, P. Klapetek, M. Valtr, Meas. Sci. Technol. 20 (2009) 84014), I am mainly interested in
the use of graphics cards for the Finite Difference in Time Domain (FDTD) method.
The Finite Difference in Time Domain (FDTD) method is a standard method for electromagnetic computations over a broad range
of frequencies, covering nearly all electromagnetic field related topics in industry and science. FDTD is based on an iterative numerical
solution of the Maxwell equations, simulating wave propagation in a sequence of very short time steps.
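To make the idea of an iterative time-stepping solution concrete, here is a minimal sketch of one FDTD step in one dimension, written in plain C. It is not the code used for the results on this page: the function name `fdtd1d_step`, the normalized field units, and the perfectly conducting boundaries are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One leapfrog time step of a 1D FDTD scheme in vacuum (illustrative sketch).
 * Fields are in normalized units, so both updates share the same coefficient
 * courant = c*dt/dx, which must not exceed 1 for stability in 1D.
 * ez holds n samples; hy holds n - 1 samples staggered by half a cell. */
void fdtd1d_step(double *ez, double *hy, size_t n, double courant)
{
    assert(n > 1 && courant > 0.0 && courant <= 1.0);

    /* Update the magnetic field from the spatial difference of E. */
    for (size_t i = 0; i + 1 < n; i++)
        hy[i] += courant * (ez[i + 1] - ez[i]);

    /* Update the electric field from the spatial difference of H.
     * ez[0] and ez[n - 1] are never touched, so they act as perfectly
     * conducting walls if they start at zero. */
    for (size_t i = 1; i + 1 < n; i++)
        ez[i] += courant * (hy[i] - hy[i - 1]);
}
```

Calling this function repeatedly advances the fields in time. In three dimensions the same pattern applies to all six field components, and each cell update depends only on its nearest neighbours, which is what makes the method suitable for massively parallel hardware.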
For the computation on a Graphics Processing Unit (GPU), the NVIDIA CUDA environment was used.
This environment provides common drivers and an API for a broad range of NVIDIA products,
running on both Windows and Linux operating systems. Both the data processing and the memory model of a GPU are completely different from those of a CPU, and the part of the code that should run on the GPU (called a kernel) must be written to respect these differences. In general, a GPU is equipped with several multiprocessors, each consisting of a large number of processors. Many hundreds of threads (kernel calls) grouped in thread blocks can therefore be processed simultaneously on the GPU. The memory available on the GPU is divided into a global memory, accessible by all the multiprocessors, a shared memory, accessible by the processors within one multiprocessor, and a local memory, accessible by a single processor. All the memories are limited by the hardware (differently for each type of GPU).
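The grouping of threads into blocks can be illustrated on the CPU. The sketch below, in plain C, shows one common way to assign each thread one cell of a 3D grid. It does not call CUDA: `block_idx` and `thread_idx` only mimic CUDA's built-in block and thread indices, and the helper names, the flat 1D indexing, and the block size of 256 used in the usage note are assumptions, not necessarily what the actual FDTD code does.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { size_t x, y, z; } cell_t;

/* Number of thread blocks needed so that every cell gets one thread
 * (integer division rounded up). */
size_t blocks_needed(size_t n_cells, size_t block_size)
{
    assert(block_size > 0);
    return (n_cells + block_size - 1) / block_size;
}

/* Map (block index, thread index within the block) to a cell of an
 * nx * ny * nz grid stored with x varying fastest.
 * Returns 1 and fills *cell when the thread owns a cell, or 0 for the
 * surplus threads in the last block, which must do no work. */
int thread_to_cell(size_t block_idx, size_t thread_idx, size_t block_size,
                   size_t nx, size_t ny, size_t nz, cell_t *cell)
{
    size_t gid = block_idx * block_size + thread_idx; /* global thread id */
    if (gid >= nx * ny * nz)
        return 0;
    cell->x = gid % nx;
    cell->y = (gid / nx) % ny;
    cell->z = gid / (nx * ny);
    return 1;
}
```

For a cube with an edge of 180 cells and an assumed block size of 256 threads, `blocks_needed(180 * 180 * 180, 256)` gives 22782 blocks. The bounds check in `thread_to_cell` corresponds to the guard a real kernel needs when the number of cells is not a multiple of the block size.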
The following speedups were observed after simply rewriting the C code of FDTD to fit the CUDA programming model. The table lists the computation time for the same number of FDTD steps on cubic computational volumes of different edge size:
|cube edge size||computer (CPU)||graphics card (GPU)|
|180||20 min 42 s||25 s|
|240||48 min 15 s||47 s|
|280||77 min 30 s||68 s|
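The speedup factors implied by the table can be worked out by converting both times to seconds and dividing. The small C helpers below do this; their names are made up for this illustration, and the only inputs are the times listed in the table.

```c
#include <assert.h>

/* Convert a time given as minutes and seconds to seconds. */
double to_seconds(int minutes, int seconds)
{
    return 60.0 * minutes + seconds;
}

/* Speedup factor: CPU time divided by GPU time, both in seconds. */
double speedup(double cpu_seconds, double gpu_seconds)
{
    assert(gpu_seconds > 0.0);
    return cpu_seconds / gpu_seconds;
}
```

For example, `speedup(to_seconds(20, 42), 25.0)` gives 1242 / 25, about 50, for the cube of edge size 180. The other two rows give 2895 / 47, about 62, and 4650 / 68, about 68, so the speedup grows with the size of the computational volume in these measurements.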
More can be found in the recent publications:
- P. Klapetek et al: Rough surface scattering simulations using graphics cards, Applied Surface Science 2010, 256, pp 5640-5643
- P. Klapetek et al: Near field optical microscopy simulations using graphics processing units, Surface and Interface Analysis 2010, 42, pp 1109-1113