



Volume 10, 2004, Pages 257-266

Optimizing parallel multiplication operation for rectangular and transposed matrices

Author keywords

[No Author keywords available]

Indexed keywords

BASIC LINEAR ALGEBRA SUBROUTINES (BLAS); PARALLEL MULTIPLICATION OPERATION; REMOTE MEMORY ACCESS (RMA); TRANSPOSED MATRICES;

EID: 4544240248     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICPADS.2004.1316103     Document Type: Conference Paper
Times cited : (3)

References (36)
  • 1
    • 12444253004
    • SRUMMA: A matrix multiplication algorithm suitable for clusters and scalable shared memory systems
    • M. Krishnan, J. Nieplocha, "SRUMMA: A Matrix Multiplication Algorithm Suitable for Clusters and Scalable Shared Memory Systems", in IPDPS'2004.
    • IPDPS'2004
    • Krishnan, M.1    Nieplocha, J.2
  • 3
    • 0023288009
    • Matrix algorithms on a hypercube I: Matrix multiplication
    • G. C. Fox, S. W. Otto, and A. J. G. Hey, "Matrix algorithms on a hypercube I: Matrix multiplication", Parallel Computing, vol. 4, pp. 17-31, 1987.
    • (1987) Parallel Computing , vol.4 , pp. 17-31
    • Fox, G.C.1    Otto, S.W.2    Hey, A.J.G.3
  • 6
    • 0024883116
    • Communication efficient matrix multiplication on hypercubes
    • J. Berntsen, "Communication efficient matrix multiplication on hypercubes", Parallel Computing, vol. 12, 1989.
    • (1989) Parallel Computing , vol.12
    • Berntsen, J.1
  • 7
  • 8
    • 0026973156
    • A matrix product algorithm and its comparative performance on hypercubes
    • C. Lin and L. Snyder, "A matrix product algorithm and its comparative performance on hypercubes", in SHPCC, 1992.
    • (1992) SHPCC
    • Lin, C.1    Snyder, L.2
  • 13
    • 0031146653
    • A poly-algorithm for parallel dense matrix multiplication on two-dimensional process grid topologies
    • J. Li, A. Skjellum, and R. D. Falgout, "A Poly-Algorithm for Parallel Dense Matrix Multiplication on Two-Dimensional Process Grid Topologies," Concurrency, Practice and Experience, vol. 9(5), '97.
    • Concurrency, Practice and Experience , vol.9 , Issue.5
    • Li, J.1    Skjellum, A.2    Falgout, R.D.3
  • 16
    • 0028530654
    • PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers
    • J. Choi, J. Dongarra, and D. W. Walker, "PUMMA: Parallel Universal Matrix Multiplication Algorithms on distributed memory concurrent computers," Concurrency, Practice and Experience, vol. 6(7), '94.
    • Concurrency, Practice and Experience , vol.6 , Issue.7
    • Choi, J.1    Dongarra, J.2    Walker, D.W.3
  • 18
    • 0028545949
    • A high performance matrix multiplication algorithm on a distributed memory parallel computer using overlapped communication
    • R. C. Agarwal, F. Gustavson, and M. Zubair, "A high performance matrix multiplication algorithm on a distributed memory parallel computer using overlapped communication," IBM J. of Research and Development '94.
    • IBM J. of Research and Development '94
    • Agarwal, R.C.1    Gustavson, F.2    Zubair, M.3
  • 20
    • 0003978709
    • A proposal for a set of parallel basic linear algebra subprograms
    • University of Tennessee, Knoxville
    • J. Choi, J. Dongarra, S. Ostrouchov, A. Petitet, D. Walker, and R. C. Whaley, "A Proposal for a Set of Parallel Basic Linear Algebra Subprograms", University of Tennessee, Knoxville, Tech. Rep. CS-95-292, 1995.
    • (1995) Tech. Rep. , vol.CS-95-292
    • Choi, J.1    Dongarra, J.2    Ostrouchov, S.3    Petitet, A.4    Walker, D.5    Whaley, R.C.6
  • 22
    • 0030676131
    • A fast scalable universal matrix multiplication algorithm on distributed-memory concurrent computers
    • J. Choi, "A Fast Scalable Universal Matrix Multiplication Algorithm on Distributed-Memory Concurrent Computers", in Proc. of IPPS, 1997.
    • (1997) Proc. of IPPS
    • Choi, J.1
  • 25
    • 84862431241
    • The implementation of MPI-2 one-sided communication for the NEC SX-5
    • J. L. Träff, H. Ritzdorf, R. Hempel, "The Implementation of MPI-2 One-Sided Communication for the NEC SX-5", SC'2000.
    • SC'2000
    • Träff, J.L.1    Ritzdorf, H.2    Hempel, R.3
  • 26
    • 4544241477
    • High Performance RDMA-Based MPI Implementation over InfiniBand
    • J. Liu, J. Wu, S. P. Kini, P. Wyckoff, and D. K. Panda, "High Performance RDMA-Based MPI Implementation over InfiniBand", in ACM ICS, 2003.
    • (2003) ACM ICS
    • Liu, J.1    Wu, J.2    Kini, S.P.3    Wyckoff, P.4    Panda, D.K.5
  • 30
    • 0006168939
    • ARMCI: A portable remote memory copy library for distributed array libraries and compiler run-time systems
    • J. Nieplocha, B. Carpenter, ARMCI: A Portable Remote Memory Copy Library for Distributed Array Libraries and Compiler Run-time Systems, Proc. RTSPP IPPS/SDP 1999.
    • (1999) Proc. RTSPP IPPS/SDP
    • Nieplocha, J.1    Carpenter, B.2
  • 33
    • 84862424792
    • http://www.csm.ornl.gov/evaluation
  • 34
    • 84862428917
    • http://www.csm.ornl.gov/~dunigan/


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.