Volume 18, 2004, Pages 987-996

SRUMMA: A matrix multiplication algorithm suitable for clusters and scalable shared memory systems

Author keywords

[No Author keywords available]

Indexed keywords

BASIC LINEAR ALGEBRA SUBROUTINES (BLAS); MASSIVELY PARALLEL PROCESSOR (MPP); REMOTE MEMORY ACCESS (RMA) COMMUNICATION; SRUMMA;

EID: 12444253004     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 43

References (39)
  • 1
    • 0003278639
    • Automatically tuned linear algebra software (ATLAS)
    • R. Whaley and J. Dongarra, "Automatically Tuned Linear Algebra Software (ATLAS)", Supercomputing'89.
    • Supercomputing'89
    • Whaley, R.1    Dongarra, J.2
  • 3
    • 0023288009
    • Matrix algorithms on a hypercube I: Matrix multiplication
    • G. C. Fox, S. W. Otto, and A. J. G. Hey, "Matrix algorithms on a hypercube I: Matrix multiplication", Parallel Computing, vol. 4, pp. 17-31. 1987.
    • (1987) Parallel Computing , vol.4 , pp. 17-31
    • Fox, G.C.1    Otto, S.W.2    Hey, A.J.G.3
  • 6
    • 0024883116
    • Communication efficient matrix multiplication on hypercubes
    • J. Berntsen, "Communication efficient matrix multiplication on hypercubes", Parallel Computing, vol. 12, pp. 335-342, 1989.
    • (1989) Parallel Computing , vol.12 , pp. 335-342
    • Berntsen, J.1
  • 7
    • 84904335157
    • Scalability of parallel algorithms for matrix multiplication
    • A. Gupta and V. Kumar, "Scalability of Parallel Algorithms for Matrix Multiplication", Proc. ICPP, 1993
    • (1993) Proc. ICPP
    • Gupta, A.1    Kumar, V.2
  • 10
    • 0037970044
    • Comparison of scalable parallel matrix multiplication libraries
    • IEEE Computer Society Press
    • S. Huss-Lederman, E. M. Jacobson, and A. Tsao, "Comparison of Scalable Parallel Matrix Multiplication Libraries," in Scalable Parallel Libraries Conference, IEEE Computer Society Press, 1994, pp. 142-149.
    • (1994) Scalable Parallel Libraries Conference , pp. 142-149
    • Huss-Lederman, S.1    Jacobson, E.M.2    Tsao, A.3
  • 12
    • 0039066274
    • Communication efficient matrix multiplication on hypercubes
    • H. Gupta and P. Sadayappan, "Communication Efficient Matrix Multiplication on Hypercubes", in Proc Sixth ACM SPAA, 1994.
    • (1994) Proc Sixth ACM SPAA
    • Gupta, H.1    Sadayappan, P.2
  • 13
    • 0031146653
    • A poly-algorithm for parallel dense matrix multiplication on two-dimensional process grid topologies
    • J. Li, A. Skjellum, and R. D. Falgout, "A Poly-Algorithm for Parallel Dense Matrix Multiplication on Two-Dimensional Process Grid Topologies," Concurrency, Practice and Experience, vol. 9(5), 1997.
    • (1997) Concurrency, Practice and Experience , vol.9 , Issue.5
    • Li, J.1    Skjellum, A.2    Falgout, R.D.3
  • 16
    • 0028530654
    • PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers
    • J. Choi, J. Dongarra, and D. W. Walker, "PUMMA: Parallel Universal Matrix Multiplication Algorithms on distributed memory concurrent computers," Concurrency: Practice and Experience, vol. 6(7), pp. 543-570, 1994.
    • (1994) Concurrency: Practice and Experience , vol.6 , Issue.7 , pp. 543-570
    • Choi, J.1    Dongarra, J.2    Walker, D.W.3
  • 18
    • 0028545949
    • A high performance matrix multiplication algorithm on a distributed memory parallel computer using overlapped communication
    • R. C. Agarwal, F. Gustavson, and M. Zubair, "A high performance matrix multiplication algorithm on a distributed memory parallel computer using overlapped communication," IBM J. of Research and Development, vol. 38 (6), 1994.
    • (1994) IBM J. of Research and Development , vol.38 , Issue.6
    • Agarwal, R.C.1    Gustavson, F.2    Zubair, M.3
  • 19
    • 0031123769
    • SUMMA: Scalable universal matrix multiplication algorithm
    • R. van de Geijn and J. Watts, "SUMMA: Scalable Universal Matrix Multiplication Algorithm," Concurrency: Practice and Experience, vol. 9(4), pp. 255-274, 1997.
    • (1997) Concurrency: Practice and Experience , vol.9 , Issue.4 , pp. 255-274
    • Van De Geijn, R.1    Watts, J.2
  • 22
    • 0030676131
    • A fast scalable universal matrix multiplication algorithm on distributed-memory concurrent computers
    • J. Choi, "A Fast Scalable Universal Matrix Multiplication Algorithm on Distributed-Memory Concurrent Computers", in Proc. IPPS '97, 1997.
    • (1997) Proc. IPPS '97
    • Choi, J.1
  • 23
    • 12444271603
    • OpenMP issues arising in the development of parallel BLAS and LAPACK libraries
    • C. Addison and Y. Ren, "OpenMP Issues Arising in the Development of Parallel BLAS and LAPACK libraries", in Proceedings EWOMP'01. 2001.
    • (2001) Proceedings EWOMP'01
    • Addison, C.1    Ren, Y.2
  • 24
    • 0035949178
    • Scalability and performance of OpenMP and MPI on a 128-processor SGI origin 2000
    • G. R. Luecke and W. Lin, "Scalability and Performance of OpenMP and MPI on a 128-Processor SGI Origin 2000", Concurrency and Computation: Practice and Experience, vol. 13, pp 905-928. 2001.
    • (2001) Concurrency and Computation: Practice and Experience , vol.13 , pp. 905-928
    • Luecke, G.R.1    Lin, W.2
  • 26
    • 84860083667
    • Performance analysis of various parallelization methods for BLAS3 routines on cluster architectures
    • Nov
    • T. Betcke, "Performance analysis of various parallelization methods for BLAS3 routines on cluster architectures", John von Neumann-Instituts für Computing, Tech. Rep. FZJ-ZAM-IB-2000-15, Nov, 2000.
    • (2000) John von Neumann-Instituts für Computing, Tech. Rep. , vol.FZJ-ZAM-IB-2000-15
    • Betcke, T.1
  • 32
    • 0006168939
    • ARMCI: A portable remote memory copy library for distributed array libraries and compiler run-time systems
    • J. Nieplocha and B. Carpenter, "ARMCI: A Portable Remote Memory Copy Library for Distributed Array Libraries and Compiler Run-time Systems", in Proceedings of RTSPP IPPS/SDP, 1999.
    • (1999) Proceedings of RTSPP IPPS/SDP
    • Nieplocha, J.1    Carpenter, B.2
  • 33
    • 34247349414
    • ARMCI Web page. http://www.emsl.pnl.gov/docs/parsoft/armci/
    • ARMCI Web Page
  • 35
    • 77954488753
    • Protocols and strategies for optimizing remote memory operations on clusters
    • J. Nieplocha, V. Tipparaju, A. Saify, and D. Panda, "Protocols and Strategies for Optimizing Remote Memory Operations on Clusters", Proc CAC/IPDPS'02, 2002.
    • (2002) Proc CAC/IPDPS'02
    • Nieplocha, J.1    Tipparaju, V.2    Saify, A.3    Panda, D.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.