



Volume , Issue , 2011, Pages

Improving communication performance in dense linear algebra via topology aware collectives

Author keywords

Communication; Exascale; Interconnect topology; Mapping; Performance

Indexed keywords

BLUE GENE; COMMUNICATION PERFORMANCE; COMMUNICATION REDUCTION; COMMUNICATION-INTENSIVE KERNEL; DENSE LINEAR ALGEBRA; EXASCALE; INTERCONNECT TOPOLOGY; LU FACTORIZATION; MASSIVELY PARALLEL MACHINE; MATRIX MULTIPLICATION; MULTICASTS; NETWORK CONTENTION; PERFORMANCE; PERFORMANCE MODEL; TOPOLOGY AWARE

EID: 83155193222     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2063384.2063487     Document Type: Conference Paper
Times cited : (44)

References (21)
  • 1
    • 0029370767
    • A three-dimensional approach to parallel matrix multiplication
    • R. C. Agarwal, S. M. Balle, F. G. Gustavson, M. Joshi, and P. Palkar. A three-dimensional approach to parallel matrix multiplication. IBM J. Res. Dev., 39:575-582, September 1995.
    • (1995) IBM J. Res. Dev. , vol.39 , pp. 575-582
    • Agarwal, R.C.1    Balle, S.M.2    Gustavson, F.G.3    Joshi, M.4    Palkar, P.5
  • 6
    • 0024883116
    • Communication efficient matrix multiplication on hypercubes
    • DOI 10.1016/0167-8191(89)90091-4
    • J. Berntsen. Communication efficient matrix multiplication on hypercubes. Parallel Computing, 12(3):335 - 342, 1989. (Pubitemid 20644636)
    • (1989) Parallel Computing , vol.12 , Issue.3 , pp. 335-342
    • Berntsen Jarle1
  • 11
    • 0028401457
    • The communication challenge for MPP: Intel Paragon and Meiko CS-2
    • R. W. Hockney. The communication challenge for MPP: Intel Paragon and Meiko CS-2. Parallel Computing, 20(3):389 - 398, 1994.
    • (1994) Parallel Computing , vol.20 , Issue.3 , pp. 389-398
    • Hockney, R.W.1
  • 12
    • 80052309746
    • Trading replication for communication in parallel distributed-memory dense solvers
    • D. Irony and S. Toledo. Trading replication for communication in parallel distributed-memory dense solvers. Parallel Processing Letters, 71:3-28, 2002.
    • (2002) Parallel Processing Letters , vol.71 , pp. 3-28
    • Irony, D.1    Toledo, S.2
  • 13
    • 0024735141
    • Optimum broadcasting and personalized communication in hypercubes
    • DOI 10.1109/12.29465
    • S. L. Johnsson and C.-T. Ho. Optimum broadcasting and personalized communication in hypercubes. IEEE Trans. Comput., 38:1249-1268, September 1989. (Pubitemid 20607016)
    • (1989) IEEE Transactions on Computers , vol.38 , Issue.9 , pp. 1249-1268
    • Johnsson S.Lennart1    Ho Ching-Tien2
  • 16
    • 0029535709
    • Collective communication in wormhole-routed massively parallel computers
    • P. McKinley, Y.-J. Tsai, and D. Robinson. Collective communication in wormhole-routed massively parallel computers. Computer, 28(12):39 -50, Dec. 1995.
    • (1995) Computer , vol.28 , Issue.12 , pp. 39-50
    • McKinley, P.1    Tsai, Y.-J.2    Robinson, D.3
  • 17
    • 34248676296
    • Performance analysis of MPI collective operations
    • DOI 10.1007/s10586-007-0012-0, Evaluation and Optimization of High-Performance Computing and Networking Systems
    • J. Pješivac-Grbović, T. Angskun, G. Bosilca, G. E. Fagg, E. Gabriel, and J. J. Dongarra. Performance analysis of MPI collective operations. Cluster Computing, 10:127-143, June 2007. (Pubitemid 46767521)
    • (2007) Cluster Computing , vol.10 , Issue.2 , pp. 127-143
    • Pjesivac-Grbovic, J.1    Angskun, T.2    Bosilca, G.3    Fagg, G.E.4    Gabriel, E.5    Dongarra, J.J.6
  • 18
    • 80052305141
    • Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms
    • E. Solomonik and J. Demmel. Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms. Technical Report UCB/EECS-2011-10, EECS Department, University of California, Berkeley, Feb 2011.
    • (2011) Technical Report UCB/EECS-2011-10, EECS Department
    • Solomonik, E.1    Demmel, J.2
  • 21
    • 0031123769
    • SUMMA: Scalable universal matrix multiplication algorithm
    • R. A. Van De Geijn and J. Watts. SUMMA: scalable universal matrix multiplication algorithm. Concurrency: Practice and Experience, 9(4):255-274, 1997. (Pubitemid 127679707)
    • (1997) Concurrency Practice and Experience , vol.9 , Issue.4 , pp. 255-274
    • Van De Geijn, R.A.1    Watts, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.