-
4
-
-
0004265427
-
-
RM-3420, Technical Report
-
P. Baran, On distributed communications, RM-3420, Technical Report, 1964. http://www.rand.org/about/history/baran.list.html.
-
(1964)
On Distributed Communications
-
-
Baran, P.1
-
5
-
-
84884662651
-
MPICH-V: Toward a scalable fault tolerant MPI for volatile nodes
-
G. Bosilca, A. Bouteiller, F. Cappello, S. Djilali, G. Fedak, C. Germain, T. Herault, P. Lemarinier, O. Lodygensky, F. Magniette, V. Neri, A. Selikhov, MPICH-V: toward a scalable fault tolerant MPI for volatile nodes, in: Supercomputing, ACM/IEEE 2002 Conference, p. 29.
-
Supercomputing, ACM/IEEE 2002 Conference
, pp. 29
-
-
Bosilca, G.1
Bouteiller, A.2
Cappello, F.3
Djilali, S.4
Fedak, G.5
Germain, C.6
Herault, T.7
Lemarinier, P.8
Lodygensky, O.9
Magniette, F.10
Neri, V.11
Selikhov, A.12
-
6
-
-
68249127079
-
Fault tolerance in petascale/exascale systems: Current knowledge, challenges and research opportunities
-
F. Cappello, Fault tolerance in petascale/exascale systems: current knowledge, challenges and research opportunities, Int. J. High Perform. Comput. Appl. 23 (2009) 212-226.
-
(2009)
Int. J. High Perform. Comput. Appl.
, vol.23
, pp. 212-226
-
-
Cappello, F.1
-
11
-
-
28044460018
-
A higher order estimate of the optimum checkpoint interval for restart dumps
-
J.T. Daly, A higher order estimate of the optimum checkpoint interval for restart dumps, Future Gener. Comput. Syst. 22 (2006) 303-312.
-
(2006)
Future Gener. Comput. Syst.
, vol.22
, pp. 303-312
-
-
Daly, J.T.1
-
12
-
-
0019079721
-
Data flow supercomputers
-
J.B. Dennis, Data flow supercomputers, Computer 13 (1980) 48-56.
-
(1980)
Computer
, vol.13
, pp. 48-56
-
-
Dennis, J.B.1
-
13
-
-
9144223280
-
Checkpointing for peta-scale systems: A look into the future of practical rollback-recovery
-
E.N. Elnozahy, J.S. Plank, Checkpointing for peta-scale systems: a look into the future of practical rollback-recovery, IEEE Trans. Dependable Secure Comput. 1 (2004) 97-108.
-
(2004)
IEEE Trans. Dependable Secure Comput.
, vol.1
, pp. 97-108
-
-
Elnozahy, E.N.1
Plank, J.S.2
-
14
-
-
0027697928
-
The impossibility of implementing reliable communication in the face of crashes
-
A. Fekete, N. Lynch, Y. Mansour, J. Spinelli, The impossibility of implementing reliable communication in the face of crashes, J. ACM 40 (1993) 1087-1107.
-
(1993)
J. ACM
, vol.40
, pp. 1087-1107
-
-
Fekete, A.1
Lynch, N.2
Mansour, Y.3
Spinelli, J.4
-
16
-
-
70449631676
-
Reducers and other Cilk++ hyperobjects
-
M. Frigo, P. Halpern, C.E. Leiserson, S. Lewin-Berlin, Reducers and other Cilk++ hyperobjects, in: ACM Symposium on Parallelism in Algorithms and Architectures, 2009.
-
(2009)
ACM Symposium on Parallelism in Algorithms and Architectures
-
-
Frigo, M.1
Halpern, P.2
Leiserson, C.E.3
Lewin-Berlin, S.4
-
17
-
-
35048884271
-
Open MPI: Goals, concept, and design of a next generation MPI implementation
-
D. Kranzlmüller, P. Kacsuk, J.J. Dongarra, Lecture Notes in Computer Science Springer Berlin, Heidelberg
-
E. Gabriel, G.E. Fagg, G. Bosilca, T. Angskun, J.J. Dongarra, J.M. Squyres, V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine, R.H. Castain, D.J. Daniel, R.L. Graham, T.S. Woodall, Open MPI: goals, concept, and design of a next generation MPI implementation, in: D. Kranzlmüller, P. Kacsuk, J.J. Dongarra (Eds.), Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, vol. 3241, Springer, Berlin, Heidelberg, 2004, pp. 353-377.
-
(2004)
Recent Advances in Parallel Virtual Machine and Message Passing Interface
, vol.3241
, pp. 353-377
-
-
Gabriel, E.1
Fagg, G.E.2
Bosilca, G.3
Angskun, T.4
Dongarra, J.J.5
Squyres, J.M.6
Sahay, V.7
Kambadur, P.8
Barrett, B.9
Lumsdaine, A.10
Castain, R.H.11
Daniel, D.J.12
Graham, R.L.13
Woodall, T.S.14
-
19
-
-
4344695315
-
Fault tolerance in message passing interface programs
-
W. Gropp, E. Lusk, Fault tolerance in message passing interface programs, Int. J. High Perform. Comput. Appl. 18 (2004) 363-372.
-
(2004)
Int. J. High Perform. Comput. Appl.
, vol.18
, pp. 363-372
-
-
Gropp, W.1
Lusk, E.2
-
21
-
-
33749067567
-
Berkeley lab checkpoint/restart (BLCR) for linux clusters
-
P.H. Hargrove, J.C. Duell, Berkeley lab checkpoint/restart (BLCR) for linux clusters, J. Phys. Conf. Ser. 46 (2006) 494-503.
-
(2006)
J. Phys. Conf. Ser.
, vol.46
, pp. 494-503
-
-
Hargrove, P.H.1
Duell, J.C.2
-
22
-
-
0001059575
-
The topological structure of asynchronous computability
-
M. Herlihy, N. Shavit, The topological structure of asynchronous computability, J. ACM 46 (1999) 858-923.
-
(1999)
J. ACM
, vol.46
, pp. 858-923
-
-
Herlihy, M.1
Shavit, N.2
-
23
-
-
84866092112
-
-
High performance FPGA development group, [Online]
-
High performance FPGA development group, 2010 [Online]. http://www.fhpca.org/.
-
(2010)
-
-
-
24
-
-
81455159630
-
-
[Online]
-
HPC resilience consortium, 2010 [Online]. http://resilience.latech.edu.
-
(2010)
HPC Resilience Consortium
-
-
-
25
-
-
84866088331
-
-
[Online]
-
Introduction to javaspace technology, 2010 [Online]. http://java.sun.com/developer/technicalArticles/tools/JavaSpaces/.
-
(2010)
Introduction to Javaspace Technology
-
-
-
26
-
-
84866088329
-
-
[Online]
-
Introduction to xap 8.0 - gigaspaces, 2011 [Online]. http://www.gigaspaces.com/wiki/display/XAP8/8.0+Documentation+Home.
-
(2011)
Introduction to Xap 8.0 - Gigaspaces
-
-
-
29
-
-
84866088330
-
Lightweight checkpoint mechanism and modeling in GPGPU
-
S. Laosooksathit, C. Leangsuksan, A. Dhungana, C. Chandler, K. Chanchio, A. Farbin, Lightweight checkpoint mechanism and modeling in GPGPU, in: Proceedings of the hpcvirt2010 Conference.
-
Proceedings of the hpcvirt2010 Conference
-
-
Laosooksathit, S.1
Leangsuksan, C.2
Dhungana, A.3
Chandler, C.4
Chanchio, K.5
Farbin, A.6
-
33
-
-
84866078911
-
High-performance computing and networking
-
W. Gentzsch, U. Harms, Lecture Notes in Computer Science Springer Berlin, Heidelberg
-
H. Rolf, High-performance computing and networking, in: W. Gentzsch, U. Harms (Eds.), The MPI Standard for Message Passing, Lecture Notes in Computer Science, vol. 797, Springer, Berlin, Heidelberg, 1994, pp. 247-252.
-
(1994)
The MPI Standard for Message Passing
, vol.797
, pp. 247-252
-
-
Rolf, H.1
-
34
-
-
0019696760
-
End-to-end arguments in system design
-
J. Saltzer, D. Reed, D. Clark, End-to-end arguments in system design, in: Second International Conference on Distributed Computing Systems, 1981, pp. 509-512.
-
(1981)
Second International Conference on Distributed Computing Systems
, pp. 509-512
-
-
Saltzer, J.1
Reed, D.2
Clark, D.3
-
35
-
-
84964284306
-
Fundamentals of cloud application architectures
-
CRC, Taylor & Francis Group (Chapter 19)
-
J.Y. Shi, Fundamentals of cloud application architectures, in: Cloud Computing: Methodology, System, and Applications, CRC, Taylor & Francis Group, 2011 (Chapter 19).
-
(2011)
Cloud Computing: Methodology, System, and Applications
-
-
Shi, J.Y.1
-
38
-
-
81455157591
-
Implementation and evaluation of a scalable application-level checkpoint-recovery scheme for MPI programs
-
Pittsburgh, PA
-
M. Shultz, G.B.R. Fernandes, D.M.K. Pingali, P. Stodghill, Implementation and evaluation of a scalable application-level checkpoint-recovery scheme for MPI programs, in: Proceedings of Supercomputing 2004 Conference, Pittsburgh, PA.
-
Proceedings of Supercomputing 2004 Conference
-
-
Shultz, M.1
Fernandes, G.B.R.2
Pingali, D.M.K.3
Stodghill, P.4
-
41
-
-
77950975351
-
Checuda: A checkpoint/restart tool for cuda applications
-
K.K.H. Takizawa, K. Sato, H. Kobayashi, Checuda: a checkpoint/restart tool for cuda applications, in: 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies, 2009, pp. 408-413.
-
(2009)
2009 International Conference on Parallel and Distributed Computing, Applications and Technologies
, pp. 408-413
-
-
Takizawa, K.K.H.1
Sato, K.2
Kobayashi, H.3