SCOPUS Information Search Platform

Proceedings of 2011 SC - International Conference for High Performance Computing, Networking, Storage and Analysis

Volume , Issue , 2011, Pages

Improving communication performance in dense linear algebra via topology aware collectives

(3) Solomonik, Edgar (a); Bhatele, Abhinav (b); Demmel, James (a)

a UNIVERSITY OF CALIFORNIA (United States)

b LAWRENCE LIVERMORE NATIONAL LABORATORY (United States)

Author keywords

Communication; Exascale; Interconnect topology; Mapping; Performance

Indexed keywords

BLUE GENE; COMMUNICATION PERFORMANCE; COMMUNICATION REDUCTION; COMMUNICATION-INTENSIVE KERNEL; DENSE LINEAR ALGEBRA; EXASCALE; INTERCONNECT TOPOLOGY; LU FACTORIZATION; MASSIVELY PARALLEL MACHINE; MATRIX MULTIPLICATION; MULTICASTS; NETWORK CONTENTION; PERFORMANCE; PERFORMANCE MODEL; TOPOLOGY AWARE;

ALGEBRA; ALGORITHMS; COMMUNICATION; COMPUTER SOFTWARE SELECTION AND EVALUATION; ELECTRIC NETWORK TOPOLOGY; FACTORIZATION; MAPPING; MATRIX ALGEBRA; SUPERCOMPUTERS; TOPOLOGY;

EID: 83155193222 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2063384.2063487 Document Type: Conference Paper

Times cited : (44)

References (21)

1
- 0029370767
- A three-dimensional approach to parallel matrix multiplication
- R. C. Agarwal, S. M. Balle, F. G. Gustavson, M. Joshi, and P. Palkar. A three-dimensional approach to parallel matrix multiplication. IBM J. Res. Dev., 39:575-582, September 1995.
- (1995) IBM J. Res. Dev. , vol.39 , pp. 575-582
- Agarwal, R.C.¹ Balle, S.M.² Gustavson, F.G.³ Joshi, M.⁴ Palkar, P.⁵

2
- 0025231126
- Communication complexity of PRAMs
- DOI 10.1016/0304-3975(90)90188-N
- A. Aggarwal, A. K. Chandra, and M. Snir. Communication complexity of PRAMs. Theoretical Computer Science, 71(1):3-28, 1990. (Pubitemid 20676468)
- (1990) Theoretical Computer Science , vol.71 , Issue.1 , pp. 3-28
- Aggarwal Alok¹ Chandra Ashok, K.² Snir Marc³

3
- 0029193089
- LogGP: Incorporating long messages into the LogP model - one step closer towards a realistic model for parallel computation
- New York, NY, USA, ACM
- A. Alexandrov, M. F. Ionescu, K. E. Schauser, and C. Scheiman. LogGP: incorporating long messages into the LogP model - one step closer towards a realistic model for parallel computation. In Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures, SPAA'95, pages 95-105, New York, NY, USA, 1995. ACM.
- (1995) Proceedings of the Seventh Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA'95 , pp. 95-105
- Alexandrov, A.¹ Ionescu, M.F.² Schauser, K.E.³ Scheiman, C.⁴

4
- 80054034521
- Minimizing communication in linear algebra
- G. Ballard, J. Demmel, O. Holtz, and O. Schwartz. Minimizing communication in linear algebra. To appear in SIAM J. Mat. Anal. Appl, 2011.
- (2011) SIAM J. Mat. Anal. Appl
- Ballard, G.¹ Demmel, J.² Holtz, O.³ Schwartz, O.⁴

5
- 0011438068
- Technical Report Austin, TX, USA
- M. Barnett, D. G. Payne, R. A. van de Geijn, and J. Watts. Broadcasting on meshes with worm-hole routing. Technical report, Austin, TX, USA, 1993.
- (1993) Broadcasting on Meshes with Worm-hole Routing
- Barnett, M.¹ Payne, D.G.² Van De Geijn, R.A.³ Watts, J.⁴

6
- 0024883116
- Communication efficient matrix multiplication on hypercubes
- DOI 10.1016/0167-8191(89)90091-4
- J. Berntsen. Communication efficient matrix multiplication on hypercubes. Parallel Computing, 12(3):335-342, 1989. (Pubitemid 20644636)
- (1989) Parallel Computing , vol.12 , Issue.3 , pp. 335-342
- Berntsen Jarle¹

7
- 0003615167
- Society for Industrial and Applied Mathematics, Philadelphia, PA, USA
- L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley. ScaLAPACK user's guide. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1997.
- (1997) ScaLAPACK User's Guide
- Blackford, L.S.¹ Choi, J.² Cleary, A.³ D'Azevedo, E.⁴ Demmel, J.⁵ Dhillon, I.⁶ Hammarling, S.⁷ Henry, G.⁸ Petitet, A.⁹ Stanley, K.¹⁰ Walker, D.¹¹ Whaley, R.C.¹²

8
- 0009346826
- LogP: Towards a realistic model of parallel computation
- New York, NY, USA, ACM
- D. Culler, R. Karp, D. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken. LogP: towards a realistic model of parallel computation. In Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPOPP'93, pages 1-12, New York, NY, USA, 1993. ACM.
- (1993) Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP'93 , pp. 1-12
- Culler, D.¹ Karp, R.² Patterson, D.³ Sahay, A.⁴ Schauser, K.E.⁵ Santos, E.⁶ Subramonian, R.⁷ Von Eicken, T.⁸

9
- 77950942204
- MPI collective communications on the Blue Gene/P supercomputer: Algorithms and optimizations
- A. Faraj, S. Kumar, B. Smith, A. Mamidala, and J. Gunnels. MPI collective communications on the Blue Gene/P supercomputer: Algorithms and optimizations. In High Performance Interconnects, 2009. HOTI 2009. 17th IEEE Symposium on, pages 63-72, 2009.
- (2009) High Performance Interconnects, 2009. HOTI 2009. 17th IEEE Symposium on , pp. 63-72
- Faraj, A.¹ Kumar, S.² Smith, B.³ Mamidala, A.⁴ Gunnels, J.⁵

10
- 83155188200
- L. Grigori, J. W. Demmel, and H. Xiang. Communication avoiding Gaussian Elimination. pages 29:1-29:12, 2008.
- (2008) Communication Avoiding Gaussian Elimination
- Grigori, L.¹ Demmel, J.W.² Xiang, H.³

11
- 0028401457
- The communication challenge for MPP: Intel Paragon and Meiko CS-2
- R. W. Hockney. The communication challenge for MPP: Intel Paragon and Meiko CS-2. Parallel Computing, 20(3):389-398, 1994.
- (1994) Parallel Computing , vol.20 , Issue.3 , pp. 389-398
- Hockney, R.W.¹

12
- 80052309746
- Trading replication for communication in parallel distributed-memory dense solvers
- D. Irony and S. Toledo. Trading replication for communication in parallel distributed-memory dense solvers. Parallel Processing Letters, 71:3-28, 2002.
- (2002) Parallel Processing Letters , vol.71 , pp. 3-28
- Irony, D.¹ Toledo, S.²

13
- 0024735141
- Optimum broadcasting and personalized communication in hypercubes
- DOI 10.1109/12.29465
- S. L. Johnsson and C.-T. Ho. Optimum broadcasting and personalized communication in hypercubes. IEEE Trans. Comput., 38:1249-1268, September 1989. (Pubitemid 20607016)
- (1989) IEEE Transactions on Computers , vol.38 , Issue.9 , pp. 1249-1268
- Johnsson S.Lennart¹ Ho Ching-Tien²

14
- 52649108804
- Technology-driven, highly-scalable dragonfly topology
- Washington, DC, USA, IEEE Computer Society
- J. Kim, W. J. Dally, S. Scott, and D. Abts. Technology-driven, highly-scalable dragonfly topology. In Proceedings of the 35th Annual International Symposium on Computer Architecture, ISCA'08, pages 77-88, Washington, DC, USA, 2008. IEEE Computer Society.
- (2008) Proceedings of the 35th Annual International Symposium on Computer Architecture, ISCA'08 , pp. 77-88
- Kim, J.¹ Dally, W.J.² Scott, S.³ Abts, D.⁴

15
- 57349161912
- The deep computing messaging framework: Generalized scalable message passing on the Blue Gene/P supercomputer
- New York, NY, USA, ACM
- S. Kumar, G. Dozsa, G. Almasi, P. Heidelberger, D. Chen, M. E. Giampapa, B. Michael, A. Faraj, J. Parker, J. Ratterman, B. Smith, and C. J. Archer. The deep computing messaging framework: generalized scalable message passing on the Blue Gene/P supercomputer. In Proceedings of the 22nd annual international conference on Supercomputing, ICS'08, pages 94-103, New York, NY, USA, 2008. ACM.
- (2008) Proceedings of the 22nd Annual International Conference on Supercomputing, ICS'08 , pp. 94-103
- Kumar, S.¹ Dozsa, G.² Almasi, G.³ Heidelberger, P.⁴ Chen, D.⁵ Giampapa, M.E.⁶ Michael, B.⁷ Faraj, A.⁸ Parker, J.⁹ Ratterman, J.¹⁰ Smith, B.¹¹ Archer, C.J.¹²

16
- 0029535709
- Collective communication in wormhole-routed massively parallel computers
- P. McKinley, Y.-J. Tsai, and D. Robinson. Collective communication in wormhole-routed massively parallel computers. Computer, 28(12):39-50, Dec. 1995.
- (1995) Computer , vol.28 , Issue.12 , pp. 39-50
- McKinley, P.¹ Tsai, Y.-J.² Robinson, D.³

17
- 34248676296
- Performance analysis of MPI collective operations
- DOI 10.1007/s10586-007-0012-0, Evaluation and Optimization of High-Performance Computing and Networking Systems
- J. Pješivac-Grbović, T. Angskun, G. Bosilca, G. E. Fagg, E. Gabriel, and J. J. Dongarra. Performance analysis of MPI collective operations. Cluster Computing, 10:127-143, June 2007. (Pubitemid 46767521)
- (2007) Cluster Computing , vol.10 , Issue.2 , pp. 127-143
- Pjesivac-Grbovic, J.¹ Angskun, T.² Bosilca, G.³ Fagg, G.E.⁴ Gabriel, E.⁵ Dongarra, J.J.⁶

18
- 80052305141
- Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms
- University of California, Berkeley, Feb
- E. Solomonik and J. Demmel. Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms. Technical Report UCB/EECS-2011-10, EECS Department, University of California, Berkeley, Feb 2011.
- (2011) Technical Report UCB/EECS-2011-10, EECS Department
- Solomonik, E.¹ Demmel, J.²

19
- 14744288044
- Optimization of collective communication operations in MPICH
- DOI 10.1177/1094342005051521
- R. Thakur, R. Rabenseifner, and W. Gropp. Optimization of collective communication operations in MPICH. International Journal of High Performance Computing Applications, 19(1):49-66, Spring 2005. (Pubitemid 40329106)
- (2005) International Journal of High Performance Computing Applications , vol.19 , Issue.1 , pp. 49-66
- Thakur, R.¹ Rabenseifner, R.² Gropp, W.³

20
- 83155163572
- J. Torrellas. Architectures for extreme-scale computing, Nov. 2009.
- (2009) Architectures for Extreme-scale Computing
- Torrellas, J.¹

21
- 0031123769
- SUMMA: Scalable universal matrix multiplication algorithm
- R. A. Van De Geijn and J. Watts. SUMMA: scalable universal matrix multiplication algorithm. Concurrency: Practice and Experience, 9(4):255-274, 1997. (Pubitemid 127679707)
- (1997) Concurrency Practice and Experience , vol.9 , Issue.4 , pp. 255-274
- Van De Geijn, R.A.¹ Watts, J.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.