GPU-Accelerated Sparse Matrix Vector Product based on Element-by-Element Method for Unstructured FEM using OpenACC

The development of directive based parallel programming models such as OpenACC has significantly reduced the cost in using accelerators such as GPUs. In this study, the sparse matrix vector product (SpMV), which was often the most computationally expensive part in physics-based simulations, was accelerated by GPU porting using OpenACC. Further speed-up was achieved by introducing the element-by-element (EBE) method in SpMV, an algorithm that is suitable for GPU architecture because it requires large amount of operations but small amount of memory access. In a comparison on one compute node of the supercomputer ABCI, using GPUs resulted in a 22- fold speedup over the CPU-only case, even when using the typical SpMV algorithm, and an additional 3.4-fold speedup when using the EBE method. The results on such analysis was applied to a seismic response analysis considering soil liquefaction, and using GPUs resulted in a 42-fold speedup compared to using only CPUs.