[1] Wikipedia, "Fastest Chinese supercomputer," 2009. [Online]. Available: http://en.wikipedia.org/wiki/Tianhe-I
[3] F. Cappello, "Fault tolerance in petascale/exascale systems: Current knowledge, challenges and research opportunities," Int. J. High Perform. Comput. Appl., vol. 23, pp. 212-226, August 2009. [Online]. Available: http://portal.acm.org/citation.cfm?id=1572226.1572229
[4] "HPC resilience consortium," 2010. [Online]. Available: http://resilience.latech.edu
[8] P. H. Hargrove and J. C. Duell, "Berkeley lab checkpoint/restart (BLCR) for Linux clusters," Journal of Physics: Conference Series, vol. 46, no. 1, p. 494, 2006. [Online]. Available: http://stacks.iop.org/1742-6596/46/i=1/a=067
[9] M. Schulz, G. Bronevetsky, R. Fernandes, D. Marques, K. Pingali, and P. Stodghill, "Implementation and evaluation of a scalable application-level checkpoint-recovery scheme for MPI programs," in Proceedings of Supercomputing 2004 Conference, Pittsburgh, PA, November 2004.
[10] H. Takizawa, K. Sato, K. Komatsu, and H. Kobayashi, "CheCUDA: A checkpoint/restart tool for CUDA applications," in Parallel and Distributed Computing, Applications and Technologies, 2009 International Conference on, Dec. 2009, pp. 408-413.
[11] S. Laosooksathit, C. Leangsuksun, A. Dhungana, C. Chandler, K. Chanchio, and A. Farbin, "Lightweight checkpoint mechanism and modeling in GPGPU," in Proceedings of the HPCVirt 2010 Conference, 2010.
[13] R. Hempel, "The MPI standard for message passing," in High-Performance Computing and Networking, ser. Lecture Notes in Computer Science, W. Gentzsch and U. Harms, Eds. Springer Berlin / Heidelberg, 1994, vol. 797, pp. 247-252. [Online]. Available: http://dx.doi.org/10.1007/3-540-57981-8-126
[14] E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra, J. M. Squyres, V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine, R. H. Castain, D. J. Daniel, R. L. Graham, and T. S. Woodall, "Open MPI: Goals, concept, and design of a next generation MPI implementation," in Recent Advances in Parallel Virtual Machine and Message Passing Interface, ser. Lecture Notes in Computer Science, D. Kranzlmüller, P. Kacsuk, and J. J. Dongarra, Eds. Springer Berlin / Heidelberg, 2004, vol. 3241, pp. 353-377. [Online]. Available: http://dx.doi.org/10.1007/978-3-540-30218-6-19
[15] G. Bosilca, A. Bouteiller, F. Cappello, S. Djilali, G. Fedak, C. Germain, T. Herault, P. Lemarinier, O. Lodygensky, F. Magniette, V. Neri, and A. Selikhov, "MPICH-V: Toward a scalable fault tolerant MPI for volatile nodes," in Supercomputing, ACM/IEEE 2002 Conference, Nov. 2002, p. 29.
[16] M. Koop, T. Jones, and D. Panda, "MVAPICH-Aptus: Scalable high-performance multi-transport MPI over InfiniBand," in Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on, Apr. 2008, pp. 1-12.
[17] W. Gropp and E. Lusk, "Fault tolerance in message passing interface programs," Int. J. High Perform. Comput. Appl., vol. 18, pp. 363-372, August 2004. [Online]. Available: http://portal.acm.org/citation.cfm?id=1080704.1080714
[18] J. T. Daly, "A higher order estimate of the optimum checkpoint interval for restart dumps," Future Generation Computer Systems, vol. 22, pp. 303-312, 2006.
[19] E. N. Elnozahy and J. S. Plank, "Checkpointing for peta-scale systems: A look into the future of practical rollback-recovery," IEEE Trans. Dependable Secur. Comput., vol. 1, pp. 97-108, April 2004. [Online]. Available: http://dx.doi.org/10.1109/TDSC.2004.15
[20] M. Herlihy and N. Shavit, "The topological structure of asynchronous computability," J. ACM, vol. 46, pp. 858-923, November 1999. [Online]. Available: http://doi.acm.org/10.1145/331524.331529
[21] A. Fekete, N. Lynch, Y. Mansour, and J. Spinelli, "The impossibility of implementing reliable communication in the face of crashes," J. ACM, vol. 40, pp. 1087-1107, November 1993. [Online]. Available: http://doi.acm.org/10.1145/174147.169676
[25] J. B. Dennis, "Data flow supercomputers," Computer, pp. 48-56, 1980.
[26] M. Frigo, P. Halpern, C. E. Leiserson, and S. Lewin-Berlin, "Reducers and other Cilk++ hyperobjects," in ACM Symposium on Parallelism in Algorithms and Architectures, 2009.
[28] J. Y. Shi, "Fundamentals of cloud application architectures," ch. 5 in Cloud Computing: Methodology, System, and Applications. CRC, Taylor & Francis Group, 2011.
[30] B. Szymanski, Y. Shi, and N. Prywes, "Synchronized distributed termination," IEEE Transactions on Software Engineering, vol. SE-11, no. 10, pp. 1136-1140, 1985.