CUDA expression templates
Date issued
2011
Publisher
Václav Skala - UNION Agency
Abstract
Many algorithms require vector algebra operations such as the dot product, vector norms or component-wise manipulations.
Especially for large-scale vectors, the efficiency of algorithms depends on an efficient implementation of those calculations.
The calculation of vector operations benefits from the continually increasing chip-level parallelism on graphics hardware. Very
efficient basic linear algebra libraries like CUBLAS make use of the parallelism provided by CUDA-enabled GPUs. However,
existing libraries are often not intuitive to use, and programmers may shy away from working with cumbersome and error-prone
interfaces. In this paper we introduce an approach to simplify the usage of parallel graphics hardware for vector calculus.
Our approach is based on expression templates that make it possible to obtain the performance of a hand-coded implementation
while providing an intuitive and math-like syntax. We use this technique to automatically generate CUDA kernels for various
vector calculations. In several performance tests our implementation shows superior performance compared to CPU-based
libraries and comparable results to a GPU-based library.
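To illustrate the expression-template idea the abstract refers to, here is a minimal host-only C++ sketch. It is not the paper's implementation: the names `VecExpr`, `Vec`, `Sum`, `Scaled` and `saxpy_demo` are hypothetical, and the evaluation loop runs on the CPU. In a CUDA variant, one would presumably replace that loop with a generated kernel in which each thread evaluates `expr[i]` for one index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: tags a type as a vector expression (hypothetical name).
template <typename E>
struct VecExpr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Concrete vector that owns its storage.
class Vec : public VecExpr<Vec> {
    std::vector<float> data_;
public:
    explicit Vec(std::size_t n, float v = 0.0f) : data_(n, v) {}
    std::size_t size() const { return data_.size(); }
    float operator[](std::size_t i) const { return data_[i]; }
    float& operator[](std::size_t i) { return data_[i]; }

    // Assigning an expression evaluates the whole tree in ONE loop,
    // without temporaries. A GPU version would launch a kernel here.
    template <typename E>
    Vec& operator=(const VecExpr<E>& e) {
        const E& expr = e.self();
        assert(expr.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] = expr[i];
        return *this;
    }
};

// Expression node for l + r; stores references, computes lazily.
template <typename L, typename R>
class Sum : public VecExpr<Sum<L, R>> {
    const L& l_;
    const R& r_;
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) {}
    std::size_t size() const { return l_.size(); }
    float operator[](std::size_t i) const { return l_[i] + r_[i]; }
};

// Expression node for a * e (scalar times vector expression).
template <typename E>
class Scaled : public VecExpr<Scaled<E>> {
    float a_;
    const E& e_;
public:
    Scaled(float a, const E& e) : a_(a), e_(e) {}
    std::size_t size() const { return e_.size(); }
    float operator[](std::size_t i) const { return a_ * e_[i]; }
};

// Operators build expression nodes instead of computing results.
template <typename L, typename R>
Sum<L, R> operator+(const VecExpr<L>& l, const VecExpr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

template <typename E>
Scaled<E> operator*(float a, const VecExpr<E>& e) {
    return Scaled<E>(a, e.self());
}

// Math-like syntax for y = a*x + y, evaluated in a single pass.
float saxpy_demo() {
    Vec x(4, 1.5f), y(4, 2.0f);
    y = 2.0f * x + y;
    return y[0];
}
```

The expression `2.0f * x + y` has the type `Sum<Scaled<Vec>, Vec>`, so the compiler can inline the per-element computation into the assignment loop. This is how such a design can approach hand-written code while keeping the notation close to the mathematics.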
Subject(s)
graphics processors, parallel computing, CUDA, linear algebra
Citation
WSCG '2011: Communication Papers Proceedings: The 19th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, p. 185-192.