-
1
-
-
0029370767
-
A three-dimensional approach to parallel matrix multiplication
-
September
-
R. C. Agarwal, S. M. Balle, F. G. Gustavson, M. Joshi, and P. Palkar. A three-dimensional approach to parallel matrix multiplication. IBM J. Res. Dev., 39:575-582, September 1995.
-
(1995)
IBM J. Res. Dev.
, vol.39
, pp. 575-582
-
-
Agarwal, R.C.
Balle, S.M.
Gustavson, F.G.
Joshi, M.
Palkar, P.
-
3
-
-
0029193089
-
LogGP: Incorporating long messages into the LogP model - one step closer towards a realistic model for parallel computation
-
New York, NY, USA, ACM
-
A. Alexandrov, M. F. Ionescu, K. E. Schauser, and C. Scheiman. LogGP: incorporating long messages into the LogP model - one step closer towards a realistic model for parallel computation. In Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures, SPAA'95, pages 95-105, New York, NY, USA, 1995. ACM.
-
(1995)
Proceedings of the Seventh Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA'95
, pp. 95-105
-
-
Alexandrov, A.
Ionescu, M.F.
Schauser, K.E.
Scheiman, C.
-
5
-
-
0011438068
-
-
Technical Report Austin, TX, USA
-
M. Barnett, D. G. Payne, R. A. van de Geijn, and J. Watts. Broadcasting on meshes with worm-hole routing. Technical report, Austin, TX, USA, 1993.
-
(1993)
Broadcasting on Meshes with Worm-hole Routing
-
-
Barnett, M.
Payne, D.G.
van de Geijn, R.A.
Watts, J.
-
6
-
-
0024883116
-
Communication efficient matrix multiplication on hypercubes
-
DOI 10.1016/0167-8191(89)90091-4
-
J. Berntsen. Communication efficient matrix multiplication on hypercubes. Parallel Computing, 12(3):335-342, 1989.
-
(1989)
Parallel Computing
, vol.12
, Issue.3
, pp. 335-342
-
-
Berntsen, J.
-
7
-
-
0003615167
-
-
Society for Industrial and Applied Mathematics, Philadelphia, PA, USA
-
L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley. ScaLAPACK user's guide. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1997.
-
(1997)
ScaLAPACK User's Guide
-
-
Blackford, L.S.
Choi, J.
Cleary, A.
D'Azevedo, E.
Demmel, J.
Dhillon, I.
Hammarling, S.
Henry, G.
Petitet, A.
Stanley, K.
Walker, D.
Whaley, R.C.
-
8
-
-
0009346826
-
LogP: Towards a realistic model of parallel computation
-
New York, NY, USA, ACM
-
D. Culler, R. Karp, D. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken. LogP: towards a realistic model of parallel computation. In Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPOPP'93, pages 1-12, New York, NY, USA, 1993. ACM.
-
(1993)
Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP'93
, pp. 1-12
-
-
Culler, D.
Karp, R.
Patterson, D.
Sahay, A.
Schauser, K.E.
Santos, E.
Subramonian, R.
von Eicken, T.
-
9
-
-
77950942204
-
MPI collective communications on the Blue Gene/P supercomputer: Algorithms and optimizations
-
A. Faraj, S. Kumar, B. Smith, A. Mamidala, and J. Gunnels. MPI collective communications on the Blue Gene/P supercomputer: Algorithms and optimizations. In High Performance Interconnects, 2009. HOTI 2009. 17th IEEE Symposium on, pages 63-72, 2009.
-
(2009)
High Performance Interconnects, 2009. HOTI 2009. 17th IEEE Symposium on
, pp. 63-72
-
-
Faraj, A.
Kumar, S.
Smith, B.
Mamidala, A.
Gunnels, J.
-
11
-
-
0028401457
-
The communication challenge for MPP: Intel Paragon and Meiko CS-2
-
R. W. Hockney. The communication challenge for MPP: Intel Paragon and Meiko CS-2. Parallel Computing, 20(3):389-398, 1994.
-
(1994)
Parallel Computing
, vol.20
, Issue.3
, pp. 389-398
-
-
Hockney, R.W.
-
12
-
-
80052309746
-
Trading replication for communication in parallel distributed-memory dense solvers
-
D. Irony and S. Toledo. Trading replication for communication in parallel distributed-memory dense solvers. Parallel Processing Letters, 71:3-28, 2002.
-
(2002)
Parallel Processing Letters
, vol.71
, pp. 3-28
-
-
Irony, D.
Toledo, S.
-
13
-
-
0024735141
-
Optimum broadcasting and personalized communication in hypercubes
-
DOI 10.1109/12.29465
-
S. L. Johnsson and C.-T. Ho. Optimum broadcasting and personalized communication in hypercubes. IEEE Trans. Comput., 38:1249-1268, September 1989.
-
(1989)
IEEE Transactions on Computers
, vol.38
, Issue.9
, pp. 1249-1268
-
-
Johnsson, S.L.
Ho, C.-T.
-
14
-
-
52649108804
-
Technology-driven, highly-scalable dragonfly topology
-
Washington, DC, USA, IEEE Computer Society
-
J. Kim, W. J. Dally, S. Scott, and D. Abts. Technology-driven, highly-scalable dragonfly topology. In Proceedings of the 35th Annual International Symposium on Computer Architecture, ISCA'08, pages 77-88, Washington, DC, USA, 2008. IEEE Computer Society.
-
(2008)
Proceedings of the 35th Annual International Symposium on Computer Architecture, ISCA'08
, pp. 77-88
-
-
Kim, J.
Dally, W.J.
Scott, S.
Abts, D.
-
15
-
-
57349161912
-
The deep computing messaging framework: Generalized scalable message passing on the Blue Gene/P supercomputer
-
New York, NY, USA, ACM
-
S. Kumar, G. Dozsa, G. Almasi, P. Heidelberger, D. Chen, M. E. Giampapa, B. Michael, A. Faraj, J. Parker, J. Ratterman, B. Smith, and C. J. Archer. The deep computing messaging framework: generalized scalable message passing on the Blue Gene/P supercomputer. In Proceedings of the 22nd annual international conference on Supercomputing, ICS'08, pages 94-103, New York, NY, USA, 2008. ACM.
-
(2008)
Proceedings of the 22nd Annual International Conference on Supercomputing, ICS'08
, pp. 94-103
-
-
Kumar, S.
Dozsa, G.
Almasi, G.
Heidelberger, P.
Chen, D.
Giampapa, M.E.
Michael, B.
Faraj, A.
Parker, J.
Ratterman, J.
Smith, B.
Archer, C.J.
-
16
-
-
0029535709
-
Collective communication in wormhole-routed massively parallel computers
-
Dec.
-
P. McKinley, Y.-J. Tsai, and D. Robinson. Collective communication in wormhole-routed massively parallel computers. Computer, 28(12):39-50, Dec. 1995.
-
(1995)
Computer
, vol.28
, Issue.12
, pp. 39-50
-
-
McKinley, P.
Tsai, Y.-J.
Robinson, D.
-
17
-
-
34248676296
-
Performance analysis of MPI collective operations
-
DOI 10.1007/s10586-007-0012-0, Evaluation and Optimization of High-Performance Computing and Networking Systems
-
J. Pješivac-Grbović, T. Angskun, G. Bosilca, G. E. Fagg, E. Gabriel, and J. J. Dongarra. Performance analysis of MPI collective operations. Cluster Computing, 10:127-143, June 2007.
-
(2007)
Cluster Computing
, vol.10
, Issue.2
, pp. 127-143
-
-
Pješivac-Grbović, J.
Angskun, T.
Bosilca, G.
Fagg, G.E.
Gabriel, E.
Dongarra, J.J.
-
18
-
-
80052305141
-
Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms
-
University of California, Berkeley, Feb
-
E. Solomonik and J. Demmel. Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms. Technical Report UCB/EECS-2011-10, EECS Department, University of California, Berkeley, Feb 2011.
-
(2011)
Technical Report UCB/EECS-2011-10, EECS Department
-
-
Solomonik, E.
Demmel, J.
-
19
-
-
14744288044
-
Optimization of collective communication operations in MPICH
-
DOI 10.1177/1094342005051521
-
R. Thakur, R. Rabenseifner, and W. Gropp. Optimization of collective communication operations in MPICH. International Journal of High Performance Computing Applications, 19(1):49-66, Spring 2005.
-
(2005)
International Journal of High Performance Computing Applications
, vol.19
, Issue.1
, pp. 49-66
-
-
Thakur, R.
Rabenseifner, R.
Gropp, W.
-
21
-
-
0031123769
-
SUMMA: Scalable universal matrix multiplication algorithm
-
R. A. van de Geijn and J. Watts. SUMMA: scalable universal matrix multiplication algorithm. Concurrency: Practice and Experience, 9(4):255-274, 1997.
-
(1997)
Concurrency: Practice and Experience
, vol.9
, Issue.4
, pp. 255-274
-
-
van de Geijn, R.A.
Watts, J.