-
1
-
-
33750253089
-
-
Open Systems Lab, Indiana University, Tech. Rep, 08
-
T. Hoefler, J. Squyres, G. Bosilca, G. Fagg, A. Lumsdaine, and W. Rehm, "Non-Blocking Collective Operations for MPI-2," Open Systems Lab, Indiana University, Tech. Rep., 08 2006.
-
(2006)
Non-Blocking Collective Operations for MPI-2
-
-
Hoefler, T.1
Squyres, J.2
Bosilca, G.3
Fagg, G.4
Lumsdaine, A.5
Rehm, W.6
-
2
-
-
84883859962
-
-
-
T. Hoefler, J. Squyres, W. Rehm, and A. Lumsdaine, "A Case for Non-Blocking Collective Operations," in Frontiers of High Performance Computing and Networking - ISPA 2006 Workshops, vol. 4331/2006. Springer Berlin / Heidelberg, 12 2006, pp. 155-164. [Online]. Available: ./img/hoefler-ispa06.pdf
-
-
-
-
3
-
-
84877019178
-
The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q
-
ACM
-
F. Petrini, D. J. Kerbyson, and S. Pakin, "The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q," in Proceedings of the ACM/IEEE Supercomputing. ACM, 2003, p. 55.
-
(2003)
Proceedings of the ACM/IEEE Supercomputing
, pp. 55
-
-
Petrini, F.1
Kerbyson, D.J.2
Pakin, S.3
-
4
-
-
30644479805
-
Overlapping of communication and computation and early binding: Fundamental mechanisms for improving parallel performance on clusters of workstations
-
Ph.D. dissertation, Mississippi State University
-
R. Dimitrov, "Overlapping of communication and computation and early binding: Fundamental mechanisms for improving parallel performance on clusters of workstations," Ph.D. dissertation, Mississippi State University, 2001.
-
(2001)
-
-
Dimitrov, R.1
-
5
-
-
1242332596
-
Send-receive considered harmful: Myths and realities of message passing
-
S. Gorlatch, "Send-receive considered harmful: Myths and realities of message passing," ACM Trans. Program. Lang. Syst., vol. 26, no. 1, pp. 47-56, 2004.
-
(2004)
ACM Trans. Program. Lang. Syst
, vol.26
, Issue.1
, pp. 47-56
-
-
Gorlatch, S.1
-
6
-
-
51049109755
-
-
-
Message Passing Interface Forum, "MPI: A Message Passing Interface Standard," 1995.
-
-
-
-
7
-
-
0003604499
-
MPI-2: Extensions to the Message-Passing Interface
-
Technical Report, University of Tennessee, Knoxville
-
Message Passing Interface Forum, "MPI-2: Extensions to the Message-Passing Interface," Technical Report, University of Tennessee, Knoxville, 1997.
-
(1997)
-
-
Message Passing Interface Forum1
-
9
-
-
33845393854
-
Transformations to parallel codes for communication-computation overlap
-
Washington, DC, USA: IEEE Computer Society
-
A. Danalis, K.-Y. Kim, L. Pollock, and M. Swany, "Transformations to parallel codes for communication-computation overlap," in SC '05: Proceedings of the 2005 ACM/IEEE conference on Supercomputing. Washington, DC, USA: IEEE Computer Society, 2005, p. 58.
-
(2005)
SC '05: Proceedings of the 2005 ACM/IEEE conference on Supercomputing
, pp. 58
-
-
Danalis, A.1
Kim, K.-Y.2
Pollock, L.3
Swany, M.4
-
10
-
-
84947212732
-
A Framework for Collective Personalized Communication
-
Nice, France, April
-
L. V. Kale, S. Kumar, and K. Varadarajan, "A Framework for Collective Personalized Communication," in Proceedings of IPDPS'03, Nice, France, April 2003.
-
(2003)
Proceedings of IPDPS'03
-
-
Kale, L.V.1
Kumar, S.2
Varadarajan, K.3
-
11
-
-
51049107155
-
-
Open Systems Lab, Indiana University, Tech. Rep, 08
-
T. Hoefler and A. Lumsdaine, "Design, Implementation, and Usage of LibNBC," Open Systems Lab, Indiana University, Tech. Rep., 08 2006.
-
(2006)
Design, Implementation, and Usage of LibNBC
-
-
Hoefler, T.1
Lumsdaine, A.2
-
13
-
-
33745195144
-
Hunting the overlap
-
Washington, DC, USA: IEEE Computer Society
-
C. Iancu, P. Husbands, and P. Hargrove, "Hunting the overlap," in PACT '05: Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05). Washington, DC, USA: IEEE Computer Society, 2005, pp. 279-290.
-
(2005)
PACT '05: Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT'05)
, pp. 279-290
-
-
Iancu, C.1
Husbands, P.2
Hargrove, P.3
-
14
-
-
51049113070
-
-
-
J. B. White III and S. Bova, "Where's the Overlap? - An Analysis of Popular MPI Implementations," 1999. [Online]. Available: citeseer.ist.psu.edu/white99wheres.html
-
-
-
-
15
-
-
84948981514
-
Comb: A portable benchmark suite for assessing MPI overlap
-
IEEE Computer Society
-
W. Lawry, C. Wilson, A. B. Maccabe, and R. Brightwell, "Comb: A portable benchmark suite for assessing MPI overlap," in CLUSTER. IEEE Computer Society, 2002, pp. 472-475.
-
(2002)
CLUSTER
, pp. 472-475
-
-
Lawry, W.1
Wilson, C.2
Maccabe, A.B.3
Brightwell, R.4
-
17
-
-
51049102456
-
-
-
The InfiniBand Trade Association, InfiniBand Architecture Specification Volume 1, Release 1.2, InfiniBand Trade Association, 2003.
-
-
-
-
18
-
-
81455128348
-
Assessing Single-Message and Multi-Node Communication Performance of InfiniBand
-
IEEE Computer Society
-
T. Hoefler, C. Viertel, T. Mehlan, F. Mietke, and W. Rehm, "Assessing Single-Message and Multi-Node Communication Performance of InfiniBand," in Proceedings of IEEE PARELEC 2006. IEEE Computer Society, 9 2006, pp. 227-232.
-
(2006)
Proceedings of IEEE PARELEC 2006
, pp. 227-232
-
-
Hoefler, T.1
Viertel, C.2
Mehlan, T.3
Mietke, F.4
Rehm, W.5
-
19
-
-
34548008852
-
High performance MPI design using unreliable datagram for ultra-scale InfiniBand clusters
-
New York, NY, USA: ACM Press
-
M. J. Koop, S. Sur, Q. Gao, and D. K. Panda, "High performance MPI design using unreliable datagram for ultra-scale InfiniBand clusters," in Proceedings of the 21st annual international conference on Supercomputing. New York, NY, USA: ACM Press, 2007, pp. 180-189.
-
(2007)
Proceedings of the 21st annual international conference on Supercomputing
, pp. 180-189
-
-
Koop, M.J.1
Sur, S.2
Gao, Q.3
Panda, D.K.4
-
20
-
-
51049101785
-
Scalable High Performance Message Passing over InfiniBand for Open MPI
-
RWTH Aachen, December
-
A. Friedley, T. Hoefler, M. L. Leininger, and A. Lumsdaine, "Scalable High Performance Message Passing over InfiniBand for Open MPI," in Proceedings of 2007 KiCC Workshop, RWTH Aachen, December 2007.
-
(2007)
Proceedings of 2007 KiCC Workshop
-
-
Friedley, A.1
Hoefler, T.2
Leininger, M.L.3
Lumsdaine, A.4
-
21
-
-
53349142353
-
Zero-Copy Protocol for MPI using InfiniBand Unreliable Datagram
-
Austin, TX, USA, September 17-20
-
M. J. Koop, S. Sur, Q. Gao, and D. K. Panda, "Zero-Copy Protocol for MPI using InfiniBand Unreliable Datagram," in IEEE Cluster 2007: International Conference on Cluster Computing, Austin, TX, USA, September 17-20, 2007.
-
(2007)
IEEE Cluster 2007: International Conference on Cluster Computing
-
-
Koop, M.J.1
Sur, S.2
Gao, Q.3
Panda, D.K.4
-
22
-
-
70350237882
-
Analysis of the Memory Registration Process in the Mellanox InfiniBand Software Stack
-
Springer-Verlag Berlin
-
F. Mietke, R. Baumgartl, R. Rex, T. Mehlan, T. Hoefler, and W. Rehm, "Analysis of the Memory Registration Process in the Mellanox InfiniBand Software Stack," in Euro-Par 2006 Parallel Processing. Springer-Verlag Berlin, 8 2006, pp. 124-133.
-
(2006)
Euro-Par 2006 Parallel Processing
, pp. 124-133
-
-
Mietke, F.1
Baumgartl, R.2
Rex, R.3
Mehlan, T.4
Hoefler, T.5
Rehm, W.6
-
23
-
-
3042721503
-
High Performance RDMA-Based MPI Implementation over InfiniBand
-
J. Liu, J. Wu, and D. K. Panda, "High Performance RDMA-Based MPI Implementation over InfiniBand," Int'l Journal of Parallel Programming, 2004.
-
(2004)
Int'l Journal of Parallel Programming
-
-
Liu, J.1
Wu, J.2
Panda, D.K.3
-
24
-
-
27844562921
-
Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation
-
Budapest, Hungary, September
-
E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra, J. M. Squyres, V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine, R. H. Castain, D. J. Daniel, R. L. Graham, and T. S. Woodall, "Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation," in Proceedings, 11th European PVM/MPI Users' Group Meeting, Budapest, Hungary, September 2004.
-
(2004)
Proceedings, 11th European PVM/MPI Users' Group Meeting
-
-
Gabriel, E.1
Fagg, G.E.2
Bosilca, G.3
Angskun, T.4
Dongarra, J.J.5
Squyres, J.M.6
Sahay, V.7
Kambadur, P.8
Barrett, B.9
Lumsdaine, A.10
Castain, R.H.11
Daniel, D.J.12
Graham, R.L.13
Woodall, T.S.14
-
25
-
-
33750234379
-
-
-
G. M. Shipman, T. S. Woodall, G. Bosilca, R. L. Graham, and A. B. Maccabe, "High performance RDMA protocols in HPC," in Proceedings, 13th European PVM/MPI Users' Group Meeting, ser. Lecture Notes in Computer Science. Bonn, Germany: Springer-Verlag, September 2006.
-
-
-
-
26
-
-
0018515759
-
-
-
C. L. Lawson, R. J. Hanson, D. Kincaid, and F. T. Krogh, "Basic Linear Algebra Subprograms for FORTRAN usage," ACM Trans. Math. Soft., vol. 5, pp. 308-323, 1979.
-
-
-
-
28
-
-
14744298131
-
On benchmarking collective MPI operations
-
London, UK: Springer-Verlag
-
T. Worsch, R. Reussner, and W. Augustin, "On benchmarking collective MPI operations," in Proceedings of the 9th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface. London, UK: Springer-Verlag, 2002.
-
(2002)
Proceedings of the 9th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
-
-
Worsch, T.1
Reussner, R.2
Augustin, W.3
-
30
-
-
33751036064
-
RDMA read based rendezvous protocol for MPI over InfiniBand: Design alternatives and benefits
-
New York, NY, USA: ACM
-
S. Sur, H.-W. Jin, L. Chai, and D. K. Panda, "RDMA read based rendezvous protocol for MPI over InfiniBand: Design alternatives and benefits," in PPoPP '06: Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming. New York, NY, USA: ACM, 2006, pp. 32-39.
-
(2006)
PPoPP '06: Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
, pp. 32-39
-
-
Sur, S.1
Jin, H.-W.2
Chai, L.3
Panda, D.K.4
-
31
-
-
38149121511
-
-
-
T. Hoefler, T. Mehlan, A. Lumsdaine, and W. Rehm, "Netgauge: A Network Performance Measurement Framework," in Proceedings of Third International Conference, HPCC 2007, vol. 4782. Springer, 9 2007, pp. 659-671. [Online]. Available: ./img/hoefler-netgauge.pdf
-
-
-
-
32
-
-
51049106278
-
-
-
T. Hoefler, A. Lichei, and W. Rehm, "Low-Overhead LogGP Parameter Assessment for Modern Interconnection Networks," 03 2007. [Online]. Available: ./img/hoefler-pmeo07.pdf
-
-
-
-
34
-
-
33745201924
-
The Component Architecture of Open MPI: Enabling Third-Party Collective Algorithms
-
St. Malo, France
-
J. M. Squyres and A. Lumsdaine, "The Component Architecture of Open MPI: Enabling Third-Party Collective Algorithms," in 18th ACM International Conference on Supercomputing, Workshop on Component Models and Systems for Grid Applications, St. Malo, France, 2004.
-
(2004)
18th ACM International Conference on Supercomputing, Workshop on Component Models and Systems for Grid Applications
-
-
Squyres, J.M.1
Lumsdaine, A.2
-
35
-
-
34548793392
-
-
-
T. Hoefler, C. Siebert, and W. Rehm, "A practically constant-time MPI Broadcast Algorithm for large-scale InfiniBand Clusters with Multicast," in Proceedings of the 21st IEEE International Parallel & Distributed Processing Symposium. IEEE Computer Society, 03 2007, p. 232. [Online]. Available: ./img/hoefler-cac07.pdf
-
-
-
-
36
-
-
33847106529
-
Fast Barrier Synchronization for InfiniBand
-
April
-
T. Hoefler, T. Mehlan, F. Mietke, and W. Rehm, "Fast Barrier Synchronization for InfiniBand," in Proceedings, 20th International Parallel and Distributed Processing Symposium IPDPS 2006 (CAC 06), April 2006.
-
(2006)
Proceedings, 20th International Parallel and Distributed Processing Symposium IPDPS 2006 (CAC 06)
-
-
Hoefler, T.1
Mehlan, T.2
Mietke, F.3
Rehm, W.4