[1] N. R. Adiga, G. Almasi, G. S. Almasi, Y. Aridor, et al., "An overview of the BlueGene/L supercomputer," in Proc. Supercomputing (SC2002), Baltimore, MD, Nov. 2002.
[3] G. Bosilca, A. Bouteiller, F. Cappello, S. Djilali, et al., "MPICH-V: Toward a scalable fault tolerant MPI for volatile nodes," in Supercomputing (SC2002), Baltimore, MD, Nov. 2002.
[4] A. Bouteiller, H.-L. Bouziane, P. Lemarinier, T. Hérault, and F. Cappello, "Hybrid preemptive scheduling for MPI applications on the grids," in 5th IEEE/ACM Workshop on Grid Computing, Nov. 2004.
[5] A. Bouteiller, F. Cappello, T. Hérault, G. Krawezik, et al., "MPICH-V2: A fault tolerant MPI for volatile nodes based on pessimistic sender based message logging," in Supercomputing (SC2003), Phoenix, AZ, Nov. 2003.
[6] R. Batchu, J. Neelamegam, Z. Cui, M. Beddhu, A. Skjellum, Y. Dandass, and M. Apte, "MPI/FT: Architecture and taxonomies for fault-tolerant, message-passing middleware for performance-portable parallel computing," in Proc. IEEE International Symposium on Cluster Computing and the Grid, Melbourne, Australia, May 2001.
[7] K. Chanchio and X.-H. Sun, "Communication state transfer for the mobility of concurrent heterogeneous computing," IEEE Trans. on Computers, vol. 53, no. 10, pp. 1260-1273, 2004.
[8] C. Du, S. Ghosh, S. Shankar, and X.-H. Sun, "A runtime system for autonomic rescheduling of MPI programs," in Proc. International Conference of Parallel Processing, Montreal, Canada, Aug. 2004.
[9] C. Du, X.-H. Sun, and K. Chanchio, "HPCM: A pre-compiler aided middleware for the mobility of legacy code," in Proc. IEEE Cluster Computing Conference, Hong Kong, Dec. 2003. Software available at: http://archive.nsf-middleware.org/NMIR4/contrib/download.asp.
[10] G. Fagg, T. Angskun, G. Bosilca, J. Pjesivac-Grbovic, and J. Dongarra, "Scalable fault tolerant MPI: Extending the recovery algorithm," in Euro PVM/MPI, Sorrento (Naples), Italy, Sep. 2005.
[11] G. Hamerly and C. Elkan, "Bayesian approaches to failure prediction for disk drives," in Proc. 18th International Conference on Machine Learning, Williams College, MA, Jun. 2001.
[13] "MPI Implementation List," http://www.lam-mpi.org/mpi/implementations/fulllist.php?show_inactive=1
[15] "MPICH2," http://www-unix.mcs.anl.gov/mpi/mpich2/
[16] "MPICH-V," http://www.lri.fr/~bouteill/MPICH-V/
[17] "MPPTEST," http://www-unix.mcs.anl.gov/mpi/mpptest
[19] S. Osman, D. Subhraveti, G. Su, and J. Nieh, "The design and implementation of Zap: A system for migrating computing environments," in Proc. USENIX 5th OSDI, Dec. 2002.
[20] J. S. Plank, M. Beck, G. Kingsley, and K. Li, "Libckpt: Transparent checkpointing under Unix," USENIX, 1995.
[21] P. Smith and N. Hutchinson, "Heterogeneous process migration: The Tui system," Software - Practice and Experience, vol. 28, no. 6, pp. 611-639, 1998.
[22] S. Sankaran, J. M. Squyres, B. Barrett, A. Lumsdaine, J. Duell, P. Hargrove, and E. Roman, "The LAM/MPI checkpoint/restart framework: System-initiated checkpointing," in LACSI Symposium, Santa Fe, NM, Oct. 2003.
[23] R. K. Sahoo, A. Sivasubramaniam, M. S. Squillante, and Y. Zhang, "Failure data analysis of a large-scale heterogeneous server environment," in Proc. Intl. Conf. on Dependable Systems and Networks (DSN'04), Florence, Italy, Jun. 2004.
[24] G. Stellner, "CoCheck: Checkpointing and process migration for MPI," in Proc. IPPS, Honolulu, Hawaii, Apr. 1996.
[26] P. Tullmann, J. Lepreau, B. Ford, and M. Hibler, "User-level checkpointing through exportable kernel state," in Proc. Intl. Workshop on Object Oriented Operating System, 1996.