-
1
-
-
0030661485
-
Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
-
J. Bilmes, K. Asanovic, C. Chin, and J. Demmel. Optimizing Matrix Multiply using PHiPAC: a Portable, High-Performance, ANSI C Coding Methodology. In Proceedings of the ACM SIGARCH International Conference on Supercomputing, 1997.
-
(1997)
Proceedings of the ACM SIGARCH International Conference on Supercomputing
-
-
Bilmes, J.
Asanovic, K.
Chin, C.
Demmel, J.
-
2
-
-
0031269329
-
Efficient algorithms for all-to-all communications in multiport message-passing systems
-
Nov.
-
J. Bruck, C. Ho, S. Kipnis, E. Upfal, and D. Weathersby. Efficient Algorithms for All-to-all Communications in Multiport Message-Passing Systems. IEEE Transactions on Parallel and Distributed Systems, 8(11):1143-1156, Nov. 1997.
-
(1997)
IEEE Transactions on Parallel and Distributed Systems
, vol.8
, Issue.11
, pp. 1143-1156
-
-
Bruck, J.
Ho, C.
Kipnis, S.
Upfal, E.
Weathersby, D.
-
3
-
-
50149106169
-
Bandwidth efficient all-to-all broadcast on switched clusters
-
Department of Computer Science, Florida State University, May
-
A. Faraj, P. Patarasuk, and X. Yuan. Bandwidth Efficient All-to-All Broadcast on Switched Clusters. Technical Report, Department of Computer Science, Florida State University, May 2005.
-
(2005)
Technical Report
-
-
Faraj, A.
Patarasuk, P.
Yuan, X.
-
4
-
-
32844460718
-
Message scheduling for all-to-all personalized communication on ethernet switched clusters
-
April
-
A. Faraj and X. Yuan. Message Scheduling for All-to-all Personalized Communication on Ethernet Switched Clusters. IEEE IPDPS, April 2005.
-
(2005)
IEEE IPDPS
-
-
Faraj, A.
Yuan, X.
-
6
-
-
0031636309
-
FFTW: An adaptive software architecture for the FFT
-
M. Frigo and S. Johnson. FFTW: An Adaptive Software Architecture for the FFT. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), volume 3, page 1381, 1998.
-
(1998)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
, vol.3
, pp. 1381
-
-
Frigo, M.
Johnson, S.
-
7
-
-
0037997900
-
A high-performance, portable implementation of the MPI message passing interface standard
-
W. Gropp, E. Lusk, N. Doss, and A. Skjellum. A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard. In MPI Developers Conference, 1995.
-
(1995)
MPI Developers Conference
-
-
Gropp, W.
Lusk, E.
Doss, N.
Skjellum, A.
-
8
-
-
24144445542
-
Reproducible measurements of MPI performance characteristics
-
Argonne National Laboratory, Argonne, IL, June
-
W. Gropp and E. Lusk. Reproducible Measurements of MPI Performance Characteristics. Technical Report ANL/MCS-P755-0699, Argonne National Laboratory, Argonne, IL, June 1999.
-
(1999)
Technical Report ANL/MCS-P755-0699
-
-
Gropp, W.
Lusk, E.
-
11
-
-
84947212732
-
A framework for collective personalized communication
-
April
-
L. V. Kale, S. Kumar, K. Varadarajan, "A Framework for Collective Personalized Communication," IPDPS'03, April 2003.
-
(2003)
IPDPS'03
-
-
Kale, L.V.
Kumar, S.
Varadarajan, K.
-
12
-
-
1442337675
-
CC-MPI: A compiled communication capable MPI prototype for ethernet switched clusters
-
June
-
A. Karwande, X. Yuan, and D.K. Lowenthal. CC-MPI: A Compiled Communication Capable MPI Prototype for Ethernet Switched Clusters. In ACM SIGPLAN PPoPP, pages 95-106, June 2003.
-
(2003)
ACM SIGPLAN PPoPP
, pp. 95-106
-
-
Karwande, A.
Yuan, X.
Lowenthal, D.K.
-
13
-
-
18844428650
-
Magpie: MPI's collective communication operations for clustered wide area systems
-
May
-
T. Kielmann, et al. Magpie: MPI's Collective Communication Operations for Clustered Wide Area Systems. In ACM SIGPLAN PPoPP, pages 131-140, May 1999.
-
(1999)
ACM SIGPLAN PPoPP
, pp. 131-140
-
-
Kielmann, T.
-
17
-
-
0038674285
-
OMPI: Optimizing MPI programs using partial evaluation
-
November
-
H. Ogawa and S. Matsuoka. OMPI: Optimizing MPI Programs Using Partial Evaluation. In Supercomputing'96, November 1996.
-
(1996)
Supercomputing'96
-
-
Ogawa, H.
Matsuoka, S.
-
18
-
-
0033463967
-
Multi-processor molecular dynamics using the Brenner potential: Parallelization of an implicit multi-body potential
-
Feb.
-
I. Rosenblum, J. Adler, and S. Brandon. Multi-processor molecular dynamics using the Brenner potential: Parallelization of an implicit multi-body potential. International Journal of Modern Physics, C 10(1):189-203, Feb. 1999.
-
(1999)
International Journal of Modern Physics
, vol.10 C
, Issue.1
, pp. 189-203
-
-
Rosenblum, I.
Adler, J.
Brandon, S.
-
20
-
-
0003576826
-
Program transformation and runtime support for threaded MPI execution on shared-memory machines
-
July
-
H. Tang, K. Shen, and T. Yang. Program Transformation and Runtime Support for Threaded MPI Execution on Shared-Memory Machines. ACM Transactions on Programming Languages and Systems, 22(4):673-700, July 2000.
-
(2000)
ACM Transactions on Programming Languages and Systems
, vol.22
, Issue.4
, pp. 673-700
-
-
Tang, H.
Shen, K.
Yang, T.
-
21
-
-
32844461816
-
Optimization of collective communication operations in MPICH
-
Mathematics and Computer Science Division, Argonne National Laboratory, March
-
R. Thakur, R. Rabenseifner, and W. Gropp. Optimization of Collective Communication Operations in MPICH. ANL/MCS-P1140-0304, Mathematics and Computer Science Division, Argonne National Laboratory, March 2004.
-
(2004)
ANL/MCS-P1140-0304
-
-
Thakur, R.
Rabenseifner, R.
Gropp, W.