Volume 9, Issue 5, 1997, Pages 345-389

A poly-algorithm for parallel dense matrix multiplication on two-dimensional process grid topologies

Author keywords

[No Author keywords available]

Indexed keywords

HEURISTIC METHODS; MATRIX ALGEBRA; PARALLEL PROCESSING SYSTEMS; TOPOLOGY

EID: 0031146653     PISSN: 10403108     EISSN: None     Source Type: Journal    
DOI: 10.1002/(SICI)1096-9128(199705)9:5<345::AID-CPE258>3.0.CO;2-7     Document Type: Article
Times cited: 26

References (27)
  • 1
    • 12444260176
    • The parallelization of level 2 and 3 BLAS operations on distributed memory machines
    • Purdue University, West Lafayette, IN
    • M. Aboelaze, N. P. Chrisochoides, E. N. Houstis and C. E. Houstis, 'The parallelization of level 2 and 3 BLAS operations on distributed memory machines', Technical Report CSD-TR-91-007, Purdue University, West Lafayette, IN, 1991.
    • (1991) Technical Report CSD-TR-91-007
    • Aboelaze, M.1    Chrisochoides, N.P.2    Houstis, E.N.3    Houstis, C.E.4
  • 3
    • 0013279262
    • Parallel numerical linear algebra
    • Computer Science Division, U. C. Berkeley, Berkeley, CA
    • J. W. Demmel, M. T. Heath and H. A. van der Vorst, 'Parallel numerical linear algebra', Technical Report CSD-92-703, Computer Science Division, U. C. Berkeley, Berkeley, CA, 1992.
    • (1992) Technical Report CSD-92-703
    • Demmel, J.W.1    Heath, M.T.2    Van Der Vorst, H.A.3
  • 5
    • 0028464291
    • Multiplication of matrices of arbitrary shape on a data parallel computer
    • K. K. Mathur and L. S. Johnsson, 'Multiplication of matrices of arbitrary shape on a data parallel computer', Parallel Comput., 20, 919-951 (1994).
    • (1994) Parallel Comput. , vol.20 , pp. 919-951
    • Mathur, K.K.1    Johnsson, L.S.2
  • 6
    • 0028530654
    • PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers
    • J. Choi, J. J. Dongarra and D. W. Walker, 'PUMMA: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers', Concurrency, Pract. Exp., 6, 543-570 (1994).
    • (1994) Concurrency, Pract. Exp. , vol.6 , pp. 543-570
    • Choi, J.1    Dongarra, J.J.2    Walker, D.W.3
  • 8
    • 0028545949
    • A high-performance matrix-multiplication algorithm on a distributed-memory parallel computer using overlapped communication
    • R. C. Agarwal, F. G. Gustavson and M. Zubair, 'A high-performance matrix-multiplication algorithm on a distributed-memory parallel computer using overlapped communication', IBM J. Res. Dev., 38, (6), 673-681 (1994).
    • (1994) IBM J. Res. Dev. , vol.38 , Issue.6 , pp. 673-681
    • Agarwal, R.C.1    Gustavson, F.G.2    Zubair, M.3
  • 9
    • 0003588633
    • SUMMA: Scalable universal matrix multiplication algorithm
    • Department of Computer Science, The University of Texas at Austin, Austin, TX
    • R. A. van de Geijn and J. Watts, 'SUMMA: Scalable universal matrix multiplication algorithm', Technical Report TR-95-13, Department of Computer Science, The University of Texas at Austin, Austin, TX, 1995.
    • (1995) Technical Report TR-95-13
    • Van De Geijn, R.A.1    Watts, J.2
  • 10
    • 0026971290
    • The multicomputer toolbox approach to concurrent BLAS and LACS
    • J. Saltz (Ed.), IEEE Press, Los Alamitos, CA, also available as LLNL Technical Report UCRL-JC-109775
    • R. D. Falgout, A. Skjellum, S. G. Smith and C. H. Still, 'The multicomputer toolbox approach to concurrent BLAS and LACS', in J. Saltz (Ed.), Proc. Scalable High Performance Computing Conf. (SHPCC), IEEE Press, Los Alamitos, CA, 1992, pp. 121-128, also available as LLNL Technical Report UCRL-JC-109775.
    • (1992) Proc. Scalable High Performance Computing Conf. (SHPCC) , pp. 121-128
    • Falgout, R.D.1    Skjellum, A.2    Smith, S.G.3    Still, C.H.4
  • 11
    • 85033286298
    • The multicomputer toolbox approach to concurrent BLAS
    • Department of Computer Science, Mississippi State University, Mississippi State, MS, October
    • R. D. Falgout, A. Skjellum, S. G. Smith and C. H. Still, 'The multicomputer toolbox approach to concurrent BLAS', Technical Report MSU-CS-TR931001, Department of Computer Science, Mississippi State University, Mississippi State, MS, October 1993.
    • (1993) Technical Report MSU-CS-TR931001
    • Falgout, R.D.1    Skjellum, A.2    Smith, S.G.3    Still, C.H.4
  • 12
    • 0025639404
    • Data redistribution and concurrency
    • E. F. van de Velde, 'Data redistribution and concurrency', Parallel Comput., 16, 125-138 (1990).
    • (1990) Parallel Comput. , vol.16 , pp. 125-138
    • Van De Velde, E.F.1
  • 13
    • 84969086546
    • Adaptive data distribution for concurrent continuation
    • California Institute of Technology, Caltech/Rice Center for Research in Parallel Computation
    • E. F. van de Velde and J. Lorenz, 'Adaptive data distribution for concurrent continuation', Technical Report CRPC-89-4, California Institute of Technology, 1989, Caltech/Rice Center for Research in Parallel Computation.
    • (1989) Technical Report CRPC-89-4
    • Van De Velde, E.F.1    Lorenz, J.2
  • 14
    • 12444263806
    • The multicomputer toolbox: Scalable parallel libraries for large-scale concurrent applications
    • Lawrence Livermore National Laboratory, December
    • A. Skjellum and C. H. Baldwin, 'The multicomputer toolbox: scalable parallel libraries for large-scale concurrent applications', Technical Report UCRL-JC-109251, Lawrence Livermore National Laboratory, December 1991.
    • (1991) Technical Report UCRL-JC-109251
    • Skjellum, A.1    Baldwin, C.H.2
  • 19
    • 12444266294
    • Driving issues in scalable libraries: Poly-algorithms, data distribution independence, redistribution, local storage schemes
    • David H. Bailey et al. (Eds.), SIAM Press, Philadelphia, PA, February
    • A. Skjellum and P. V. Bangalore, 'Driving issues in scalable libraries: poly-algorithms, data distribution independence, redistribution, local storage schemes', in David H. Bailey et al. (Eds.), Proc. of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, SIAM Press, Philadelphia, PA, February 1995, pp. 734-737.
    • (1995) Proc. of the Seventh SIAM Conference on Parallel Processing for Scientific Computing , pp. 734-737
    • Skjellum, A.1    Bangalore, P.V.2
  • 20
    • 3943101125
    • Writing libraries in MPI
    • Anthony Skjellum and Donna Reese (Eds.), IEEE Computer Society Press, October
    • A. Skjellum, N. E. Doss and P. V. Bangalore, 'Writing libraries in MPI', in Anthony Skjellum and Donna Reese (Eds.), Proc. Scalable Parallel Libraries Conference, IEEE Computer Society Press, October 1993, pp. 166-173.
    • (1993) Proc. Scalable Parallel Libraries Conference , pp. 166-173
    • Skjellum, A.1    Doss, N.E.2    Bangalore, P.V.3
  • 21
    • 0037970044
    • Comparison of scalable parallel matrix multiplication libraries
    • Anthony Skjellum and Donna Reese (Eds.), IEEE Computer Society Press, October
    • S. Huss-Lederman, E. M. Jacobson and A. Tsao, 'Comparison of scalable parallel matrix multiplication libraries', in Anthony Skjellum and Donna Reese (Eds.), Proc. Scalable Parallel Libraries Conference, IEEE Computer Society Press, October 1993, pp. 142-149.
    • (1993) Proc. Scalable Parallel Libraries Conference , pp. 142-149
    • Huss-Lederman, S.1    Jacobson, E.M.2    Tsao, A.3
  • 24
    • 34250487811
    • Gaussian elimination is not optimal
    • V. Strassen, 'Gaussian elimination is not optimal', Numer. Math., 13, 354-356 (1969).
    • (1969) Numer. Math. , vol.13 , pp. 354-356
    • Strassen, V.1
  • 25
    • 0002663082
    • GEMMW: A portable level 3 BLAS Winograd variant of Strassen's matrix-matrix multiply algorithm
    • C. C. Douglas, M. Heroux, G. Slishman and R. M. Smith, 'GEMMW: A portable level 3 BLAS Winograd variant of Strassen's matrix-matrix multiply algorithm', J. Comput. Phys., 110, 1-10 (1994).
    • (1994) J. Comput. Phys. , vol.110 , pp. 1-10
    • Douglas, C.C.1    Heroux, M.2    Slishman, G.3    Smith, R.M.4
  • 26
    • 0000456144
    • Parallel matrix and graph algorithms
    • E. Dekel, D. Nassimi and S. Sahni, 'Parallel matrix and graph algorithms', SIAM J. Comput., 10, (4), 657-673 (1981).
    • (1981) SIAM J. Comput. , vol.10 , Issue.4 , pp. 657-673
    • Dekel, E.1    Nassimi, D.2    Sahni, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.