Threaded accurate matrix-matrix multiplications with sparse matrix-vector multiplications

Shuntaro Ichimura, Takahiro Katagiri, Katsuhisa Ozaki, Takeshi Ogita, Toru Nagai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

The Basic Linear Algebra Subprograms (BLAS) library is frequently used for linear algebra computations. However, it places little emphasis on computational accuracy, especially with respect to accuracy assurance of the results. Although algorithms for ensuring the computational accuracy of BLAS operations have been studied, their performance on advanced computer architectures still needs to be evaluated. In this study, we parallelize high-precision matrix-matrix multiplication using thread-level parallelism. In addition, we evaluate the implementation from the viewpoints of execution speed and accuracy. We implement a method that converts dense matrices into sparse matrices by exploiting the nature of the target algorithm and then applies sparse matrix-vector multiplication. Results obtained on the FX100 supercomputer system at Nagoya University indicate that (1) the implementation with the ELL sparse-matrix format achieves a 1.43x speedup and (2) a maximum 38x speedup is obtained compared to the conventional dense-matrix implementation with dgemm.
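The abstract mentions converting dense matrices to sparse form and applying sparse matrix-vector multiplication in the ELL (ELLPACK) storage format. As a rough illustration of that format only (this is not the authors' code; all function names here are hypothetical), ELL pads every row to the maximum per-row nonzero count, so column indices and values become dense rectangular arrays:

```python
def dense_to_ell(a):
    """Convert a dense matrix (list of rows) to ELL storage.

    Every row's (column, value) pairs are padded to the length of the
    longest row, yielding two rectangular arrays: column indices and values.
    Padding entries use column 0 with value 0.0, so they add nothing to y.
    """
    nnz = [[(j, v) for j, v in enumerate(row) if v != 0.0] for row in a]
    width = max(len(r) for r in nnz)  # max nonzeros in any row
    cols = [[e[0] for e in r] + [0] * (width - len(r)) for r in nnz]
    vals = [[e[1] for e in r] + [0.0] * (width - len(r)) for r in nnz]
    return cols, vals

def ell_spmv(cols, vals, x):
    """Compute y = A @ x from the padded ELL arrays."""
    return [sum(v * x[j] for j, v in zip(crow, vrow))
            for crow, vrow in zip(cols, vals)]
```

Padding makes the inner-loop length uniform across rows, which is what makes ELL attractive for vectorized and threaded SpMV kernels on machines such as the FX100, at the cost of wasted storage when row lengths vary widely.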

Original language: English
Title of host publication: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1093-1102
Number of pages: 10
ISBN (Print): 9781538655559
DOI: 10.1109/IPDPSW.2018.00168
Publication status: Published - 3 Aug 2018
Externally published: Yes
Event: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 - Vancouver, Canada
Duration: 21 May 2018 - 25 May 2018

Keywords

  • Accuracy Assurance
  • Component
  • Error-free Transformation
  • High-precision Matrix-Matrix Multiplications
  • Sparse Matrix-vector Multiplications
  • Thread Parallelism
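The "Error-free Transformation" keyword refers to a family of floating-point techniques in which a rounded result and its exact rounding error are both recovered. A minimal sketch of the two standard building blocks, Knuth's TwoSum and the Dekker/Veltkamp TwoProd (textbook versions, not taken from the paper):

```python
def two_sum(a, b):
    """Knuth's TwoSum: returns (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    t = s - a
    e = (a - (s - t)) + (b - t)
    return s, e

def split(a):
    """Veltkamp splitting for IEEE double (53-bit significand): a = hi + lo."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    lo = a - hi
    return hi, lo

def two_prod(a, b):
    """Dekker's TwoProd: returns (p, e) with p = fl(a * b) and a * b = p + e exactly."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = ((ah * bh - p) + ah * bl + al * bh) + al * bl
    return p, e
```

High-precision matrix products of the kind evaluated in the paper are typically built from such transformations: each rounded partial result carries an exactly representable error term that can be accumulated separately instead of being lost.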

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems and Management

Cite this

Ichimura, S., Katagiri, T., Ozaki, K., Ogita, T., & Nagai, T. (2018). Threaded accurate matrix-matrix multiplications with sparse matrix-vector multiplications. In Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 (pp. 1093-1102). [8425535] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IPDPSW.2018.00168
