2009, Pages 251-260

MPIWIZ: Subgroup reproducible replay of MPI applications

Author keywords

Design; Performance; Reliability

Indexed keywords

COARSE-GRAINED; COMMUNICATION LOCALITY; COMMUNICATION MESSAGES; CONCURRENT EXECUTION; CYCLIC DEBUGGING; DETERMINISTIC BEHAVIOR; EXECUTION TIME; GROUP BOUNDARY; HYBRID DETERMINISTIC; INTRA-GROUP; MESSAGE ORDERING; MESSAGE PASSING INTERFACE; MPI APPLICATIONS; NON-DETERMINISM; PERFORMANCE; SOURCE CODE MODIFICATION; SYSTEM CALLS; TRAFFIC PATTERN;

EID: 67650178061     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1504176.1504213     Document Type: Conference Paper
Times cited : (51)

References (39)
  • 2
    • 38449105996
    • Retrospect: Deterministic replay of MPI applications for interactive distributed debugging
    • A. Bouteiller, G. Bosilca, and J. Dongarra. Retrospect: Deterministic Replay of MPI Applications for Interactive Distributed Debugging. In 14th European PVM/MPI User's Group Meeting, pages 297-306, 2007.
    • (2007) 14th European PVM/MPI User's Group Meeting , pp. 297-306
    • Bouteiller, A.1    Bosilca, G.2    Dongarra, J.3
  • 3
    • 0026396076
    • An improved two-way partitioning algorithm with stable performance
    • C.-K. Cheng and Y.-C. A. Wei. An Improved Two-way Partitioning Algorithm with Stable Performance. IEEE Transactions on Computer Aided Design, 10(12):1502-1511, 1991.
    • (1991) IEEE Transactions on Computer Aided Design , vol.10 , Issue.12 , pp. 1502-1511
    • Cheng, C.-K.1    Wei, Y.-C.A.2
  • 4
    • 84878244338
    • An implementation of race detection and deterministic replay with MPI
    • Aug.
    • C. Clémençon, J. Fritscher, M. J. Meehan, and R. Rühl. An Implementation of Race Detection and Deterministic Replay with MPI. In EuroPar'95, pages 155-166, Aug. 1995.
    • (1995) EuroPar'95 , pp. 155-166
    • Clémençon, C.1    Fritscher, J.2    Meehan, M.J.3    Rühl, R.4
  • 7
    • 33646156415
    • Collective error detection for MPI collective operations
    • C. Falzone, A. Chan, E. L. Lusk, and W. Gropp. Collective Error Detection for MPI Collective Operations. In PVM/MPI'05, pages 138-147, 2005.
    • (2005) PVM/MPI'05 , pp. 138-147
    • Falzone, C.1    Chan, A.2    Lusk, E.L.3    Gropp, W.4
  • 8
    • 33847171466
    • Communication characteristics in the NAS parallel benchmarks
    • A. Faraj and X. Yuan. Communication Characteristics in the NAS Parallel Benchmarks. In PDCS'02, pages 724-729, 2002.
    • (2002) PDCS'02 , pp. 724-729
    • Faraj, A.1    Yuan, X.2
  • 10
    • 0030243005
    • A high-performance, portable implementation of the MPI message passing interface standard
    • DOI 10.1016/0167-8191(96)00024-5, PII S0167819196000245
    • W. Gropp, E. Lusk, N. Doss, and A. Skjellum. A High-performance, Portable Implementation of the MPI Message Passing Interface Standard. Parallel Computing, 22(6):789-828, Sept. 1996. (Pubitemid 126364517)
    • (1996) Parallel Computing , vol.22 , Issue.6 , pp. 789-828
    • Gropp, W.1    Lusk, E.2    Doss, N.3    Skjellum, A.4
  • 13
    • 67650136527
    • HPCC, Computing,Information, and Communications (CIC) R&D Subcommittee of the National Science and Technology Council's Committee on Computing, Information, and Communications (CCIC)
    • HPCC. Hpcc 1998 blue book (computing, information, and communications: Technologies for the 21st century). Computing, Information, and Communications (CIC) R&D Subcommittee of the National Science and Technology Council's Committee on Computing, Information, and Communications (CCIC), 1998.
    • (1998) Hpcc 1998 Blue Book (Computing, Information, and Communications: Technologies for the 21st Century)
  • 16
    • 0002806618
    • Multilevel k-way Partitioning Scheme for Irregular Graphs
    • DOI 10.1006/jpdc.1997.1404, PII S0743731597914040
    • G. Karypis and V. Kumar. Multilevel k-way Partitioning Scheme for Irregular Graphs. Journal of Parallel and Distributed Computing, 48(1):96-129, 1998. (Pubitemid 128347296)
    • (1998) Journal of Parallel and Distributed Computing , vol.48 , Issue.1 , pp. 96-129
    • Karypis, G.1    Kumar, V.2
  • 18
    • 84957617803
    • Characterization of Communication Patterns in Message-Passing Parallel Scientific Application Programs
    • Network-Based Parallel Computing: Communication, Architecture, and Applications
    • J. Kim and D. J. Lilja. Characterization of Communication Patterns in Message-Passing Parallel Scientific Application Programs. In CANPC'98, pages 202-216, 1998. (Pubitemid 128027868)
    • (1998) LECTURE NOTES IN COMPUTER SCIENCE , Issue.1362 , pp. 202-216
    • Kim, J.1    Lilja, D.J.2
  • 19
    • 67650152037
    • MPI application development with MARMOT
    • B. Krammer and M. S. Müller. MPI Application Development with MARMOT. In ParCo'05, pages 893-900, 2005.
    • (2005) ParCo'05 , pp. 893-900
    • Krammer, B.1    Müller, M.S.2
  • 20
    • 84947574470
    • An Integrated Record&Replay Mechanism for Nondeterministic Message Passing Programs
    • Recent Advances in Parallel Virtual Machine and Message Passing Interface
    • D. Kranzlmüller, C. Schaubschläger, and J. Volkert. An Integrated Record&Replay Mechanism for Nondeterministic Message Passing Programs. In 8th European PVM/MPI Users' Group Meeting, pages 192-200, 2001. (Pubitemid 33348545)
    • (2001) LECTURE NOTES IN COMPUTER SCIENCE , Issue.2131 , pp. 192-200
    • Kranzlmüller, D.1    Schaubschläger, C.2    Volkert, J.3
  • 21
    • 67650106321
    • NOPE: A nondeterministic program evaluator
    • D. Kranzlmüller and J. Volkert. NOPE: A Nondeterministic Program Evaluator. In ACPC'99, pages 490-499, 1999.
    • (1999) ACPC'99 , pp. 490-499
    • Kranzlmüller, D.1    Volkert, J.2
  • 22
    • 0023328934
    • Debugging parallel programs with instant replay
    • T. J. LeBlanc and J. M. Mellor-Crummey. Debugging Parallel Programs with Instant Replay. IEEE Trans. Computers, 36(4):471-482, 1987. (Pubitemid 17565734)
    • (1987) IEEE Transactions on Computers , vol.C-36 , Issue.4 , pp. 471-482
    • LeBlanc, T.J.1    Mellor-Crummey, J.M.2
  • 23
    • 38449090939
    • Correctness debugging of message passing programs using model verification techniques
    • R. Lovas and P. Kacsuk. Correctness Debugging of Message Passing Programs Using Model Verification Techniques. In 14th European PVM/MPI User's Group Meeting, pages 335-343, 2007.
    • (2007) 14th European PVM/MPI User's Group Meeting , pp. 335-343
    • Lovas, R.1    Kacsuk, P.2
  • 25
    • 67650179005
    • Parallel program debugging based on data-replay
    • M. Maruyama, T. Tsumura, and H. Nakashima. Parallel Program Debugging based on Data-Replay. In PDCS'05, pages 151-156, 2005.
    • (2005) PDCS'05 , pp. 151-156
    • Maruyama, M.1    Tsumura, T.2    Nakashima, H.3
  • 27
    • 84869348061
    • SIM-MPI Library. http://www.hpctest.org.cn/resources/sim-mpi.tgz.
    • SIM-MPI Library
  • 28
    • 0035438824
    • Net-dbx: A web-based debugger of MPI programs over low-bandwidth lines
    • DOI 10.1109/71.954636
    • N. Neophytou and P. Evripidou. Net-dbx: A Web-Based Debugger of MPI Programs Over Low-Bandwidth Lines. IEEE Trans. Parallel Distrib. Syst., 12(9):986-995, 2001. (Pubitemid 32992569)
    • (2001) IEEE Transactions on Parallel and Distributed Systems , vol.12 , Issue.9 , pp. 986-995
    • Neophytou, N.1    Evripidou, P.2
  • 30
    • 84869349198
    • PGDBG Graphical Symbolic Debugger. http://www.pgroup.com/products/pgdbg.htm.
  • 31
    • 34548240185
    • Novel techniques for debugging and optimizing parallel applications
    • M. Rudgyard. Novel Techniques for Debugging and Optimizing Parallel Applications. In SC'06, page 281, 2006.
    • (2006) SC'06 , pp. 281
    • Rudgyard, M.1
  • 32
    • 33845593340
    • A large-scale study of failures in high-performance computing systems
    • DOI 10.1109/DSN.2006.5, 1633514, Proceedings - DSN 2006: 2006 International Conference on Dependable Systems and Networks
    • B. Schroeder and G. A. Gibson. A Large-scale Study of Failures in High-performance Computing Systems. In International Conference on Dependable Systems and Networks (DSN 2006), pages 249-258, 2006. (Pubitemid 44930426)
    • (2006) Proceedings of the International Conference on Dependable Systems and Networks , vol.2006 , pp. 249-258
    • Schroeder, B.1    Gibson, G.A.2
  • 36
    • 84869368874
    • Totalview. http://www.totalviewtech.com/.
    • Totalview
  • 37
    • 33750268067
    • Verifying Collective MPI Calls
    • Recent Advances in Parallel Virtual Machine and Message Passing Interface
    • J. L. Träff and J. Worringen. Verifying Collective MPI Calls. In 11th European PVM/MPI Users' Group Meeting, pages 18-27, 2004. (Pubitemid 39289452)
    • (2004) LECTURE NOTES IN COMPUTER SCIENCE , Issue.3241 , pp. 18-27
    • Träff, J.L.1    Worringen, J.2
  • 38
    • 48949084412
    • Dynamic software testing of MPI applications with Umpire
    • November
    • J. S. Vetter and B. R. de Supinski. Dynamic Software Testing of MPI Applications with Umpire. In SC'00, pages 70-170, November, 4-10 2000.
    • (2000) SC'00 , pp. 70-170
    • Vetter, J.S.1    De Supinski, B.R.2
  • 39
    • 0242308158
    • Communication characteristics of large-scale scientific applications for contemporary cluster architectures
    • DOI 10.1016/S0743-7315(03)00104-7
    • J. S. Vetter and F. Mueller. Communication Characteristics of Large-scale Scientific Applications for Contemporary Cluster Architectures. J. Parallel Distrib. Comput., 63(9):853-865, 2003. (Pubitemid 37364491)
    • (2003) Journal of Parallel and Distributed Computing , vol.63 , Issue.9 , pp. 853-865
    • Vetter, J.S.1    Mueller, F.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.