Conjugate Gradient Solvers with High Accuracy and Bit-wise Reproducibility between CPU and GPU using Ozaki scheme

Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Roman Iakymchuk

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

In Krylov subspace methods such as the Conjugate Gradient (CG) method, rounding errors in floating-point computations reduce computational accuracy, which may increase the number of iterations until convergence. At the same time, because the order of computation is nondeterministic in parallel computation, the result and the convergence behavior may differ between computational environments, even for the same input. In this study, we present an accurate and reproducible implementation of the unpreconditioned CG method on x86 CPUs and NVIDIA GPUs. In our method, while all variables are stored in FP64, all inner product operations (including matrix-vector multiplications) are performed using the Ozaki scheme. The scheme delivers correctly rounded computation as well as bit-level reproducibility among different computational environments. In this paper, we show some examples where the standard FP64 implementation of CG produces nonidentical results across different CPUs and GPUs. We then demonstrate the applicability and effectiveness of our approach in terms of accuracy and reproducibility, and evaluate its performance on both CPUs and GPUs. Furthermore, we compare the performance of our method against an existing accurate and reproducible CG implementation based on the Exact Basic Linear Algebra Subprograms (ExBLAS) on CPUs.
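The effect described in the abstract can be illustrated with a minimal sketch. It is not the Ozaki scheme and not the authors' implementation: the correctly rounded inner product is emulated here with exact rational arithmetic from Python's standard library, which is far slower but has the same property of returning the same FP64 value for any summation order. The names `naive_dot`, `exact_dot`, and `cg` are hypothetical, chosen only for this illustration.

```python
import math
from fractions import Fraction


def naive_dot(x, y):
    """Left-to-right FP64 inner product; the result depends on element order."""
    acc = 0.0
    for a, b in zip(x, y):
        acc += a * b
    return acc


def exact_dot(x, y):
    """Inner product accumulated exactly as a rational number and rounded
    to FP64 once at the end (a stand-in for a correctly rounded dot product;
    this is an assumption of the sketch, not the Ozaki scheme)."""
    total = sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))
    return float(total)


def cg(A, b, dot, tol=1e-12, max_iter=100):
    """Unpreconditioned CG for a dense symmetric positive definite matrix A
    (list of rows). Every inner product, including each row of the
    matrix-vector product, goes through the supplied `dot` function."""
    x = [0.0] * len(b)
    r = list(b)              # residual r = b - A x with x = 0
    p = list(r)              # initial search direction
    rr = dot(r, r)
    for k in range(max_iter):
        if math.sqrt(rr) <= tol:
            return x, k
        Ap = [dot(row, p) for row in A]
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


if __name__ == "__main__":
    ones = [1.0, 1.0, 1.0, 1.0]
    order_a = [1e16, 1.0, -1e16, 1.0]
    order_b = [1e16, -1e16, 1.0, 1.0]   # same values, different order
    print("naive:", naive_dot(order_a, ones), naive_dot(order_b, ones))
    print("exact:", exact_dot(order_a, ones), exact_dot(order_b, ones))

    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print("cg   :", cg(A, b, exact_dot))
```

In this sketch the naive product returns different values for the two orderings of the same data, while the exactly accumulated product does not. Passing the dot product into `cg` as a parameter mirrors the idea in the abstract of keeping all variables in FP64 and changing only how inner products are evaluated.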

Original language: English
Title of host publication: Proceedings of International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2021
Publisher: Association for Computing Machinery
Pages: 100-109
Number of pages: 10
ISBN (Electronic): 9781450388429
Publication status: Published - 2021 Jan 20
Event: 2021 International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2021 - Virtual, Online, Korea, Republic of
Duration: 2021 Jan 20 → 2021 Jan 22

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2021 International Conference on High Performance Computing in Asia-Pacific Region, HPC Asia 2021
Country/Territory: Korea, Republic of
City: Virtual, Online
Period: 21/1/20 → 21/1/22

Keywords

  • Accuracy
  • CPU
  • Conjugate Gradient
  • GPU
  • heterogeneous computing
  • reproducibility

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
