-
1
-
-
0038633588
-
Adaptive computing on the grid using AppLeS
-
Berman F, Wolski R, Casanova H, Cirne W, Dail H, Faerman M, Figueira S, Hayes J, Obertelli G, Schopf J, Shao G, Smallen S, Spring N, Su A, Zagorodnov D,. Adaptive computing on the grid using AppLeS. IEEE Transactions on Parallel and Distributed Systems 2003; 14 (4): 369-382.
-
(2003)
IEEE Transactions on Parallel and Distributed Systems
, vol.14
, Issue.4
, pp. 369-382
-
-
Berman, F.1
Wolski, R.2
Casanova, H.3
Cirne, W.4
Dail, H.5
Faerman, M.6
Figueira, S.7
Hayes, J.8
Obertelli, G.9
Schopf, J.10
Shao, G.11
Smallen, S.12
Spring, N.13
Su, A.14
Zagorodnov, D.15
-
2
-
-
34748838960
-
Supporting fault-tolerance in streaming grid applications
-
DOI 10.1145/1229428.1229464, Proceedings of the 2007 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP'07
-
Zhu Q, Chen L, Agrawal G,. Supporting fault-tolerance in streaming grid applications. PPoPP '07: Proceedings of the 12th ACM SIGPLAN Symposium on Principles and practice of Parallel Programming. ACM: New York, NY, U.S.A., 2007; 156-157. (Pubitemid 47479101)
-
(2007)
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP
, pp. 156-157
-
-
Zhu, Q.1
Chen, L.2
Agrawal, G.3
-
3
-
-
14244258300
-
Self adaptivity in Grid computing
-
DOI 10.1002/cpe.927, Grid Performance and Grids and Web Services for E-Science
-
Vadhiyar S, Dongarra J,. Self adaptivity in grid computing: Research articles. Concurrency and Computation: Practice and Experience 2005; 17 (2-4): 235-257. (Pubitemid 40285777)
-
(2005)
Concurrency and Computation: Practice and Experience
, vol.17
, Issue.2-4
, pp. 235-257
-
-
Vadhiyar, S.S.1
Dongarra, J.J.2
-
4
-
-
33646922841
-
Fault-tolerant scheduling of fine-grained tasks in grid environments
-
DOI 10.1177/1094342006062528
-
Wrzesinska G, van Nieuwpoort RV, Maassen J, Kielmann T, Bal HE,. Fault-tolerant scheduling of fine-grained tasks in grid environments. International Journal of High Performance Computing Applications (IJHPCA) 2006; 20 (1): 103-114. (Pubitemid 43792545)
-
(2006)
International Journal of High Performance Computing Applications
, vol.20
, Issue.1
, pp. 103-114
-
-
Wrzesinska, G.1
Van Nieuwpoort, R.V.2
Maassen, J.3
Kielmann, T.4
Bal, H.E.5
-
5
-
-
0035552014
-
The Cactus Worm: Experiments with dynamic resource discovery and allocation in a grid environment
-
DOI 10.1177/109434200101500402
-
Allen G, Angulo D, Foster I, Lanfermann G, Liu C, Radke T, Seidel Ed, Shalf J,. The cactus worm: Experiments with dynamic resource discovery and allocation in a Grid environment. The International Journal of High Performance Computing Applications 2001; 15 (4): 345-358. (Pubitemid 35017026)
-
(2001)
International Journal of High Performance Computing Applications
, vol.15
, Issue.4
, pp. 345-358
-
-
Allen, G.1
Angulo, D.2
Foster, I.3
Lanfermann, G.4
Liu, C.5
Radke, T.6
Seidel, E.7
Shalf, J.8
-
6
-
-
68149132323
-
Transparent parallel checkpointing and migration in clusters and ClusterGrids
-
Kovacs J,. Transparent parallel checkpointing and migration in clusters and ClusterGrids. International Journal of Computational Science and Engineering 2009; 4 (3): 171-181.
-
(2009)
International Journal of Computational Science and Engineering
, vol.4
, Issue.3
, pp. 171-181
-
-
Kovacs, J.1
-
8
-
-
33746779994
-
MPICH-V: A multiprotocol automatic fault tolerant MPI
-
Bouteiller A, Herault T, Krawezik G, Lemarinier P, Cappello F,. MPICH-V: A multiprotocol automatic fault tolerant MPI. International Journal of High Performance Computing Applications 2006; 20 (3): 319-330.
-
(2006)
International Journal of High Performance Computing Applications
, vol.20
, Issue.3
, pp. 319-330
-
-
Bouteiller, A.1
Herault, T.2
Krawezik, G.3
Lemarinier, P.4
Cappello, F.5
-
9
-
-
10644223387
-
Computing on large-scale distributed systems: XtremWeb architecture, programming models, security, tests and convergence with grid
-
Cappello F, Djilali S, Fedak G, Herault T, Magniette F, Néri V, Lodygensky O,. Computing on large-scale distributed systems: XtremWeb architecture, programming models, security, tests and convergence with grid. Future Generation Computer Systems 2005; 21 (3): 417-437.
-
(2005)
Future Generation Computer Systems
, vol.21
, Issue.3
, pp. 417-437
-
-
Cappello, F.1
Djilali, S.2
Fedak, G.3
Herault, T.4
Magniette, F.5
Néri, V.6
Lodygensky, O.7
-
10
-
-
79960125492
-
IBM Corporation. An architectural blueprint for autonomic computing
-
[2006 October ]
-
IBM Corporation. An architectural blueprint for autonomic computing. White Paper, 2006. Available at: [October 2009].
-
(2009)
White Paper
-
-
-
11
-
-
70350589799
-
Coordinated checkpoint versus message log for fault tolerant MPI
-
Lemarinier P, Bouteiller A, Krawezik G, Cappello F,. Coordinated checkpoint versus message log for fault tolerant MPI. International Journal of High Performance Computer Networks 2004; 2 (2-4): 146-155.
-
(2004)
International Journal of High Performance Computer Networks
, vol.2
, Issue.2-4
, pp. 146-155
-
-
Lemarinier, P.1
Bouteiller, A.2
Krawezik, G.3
Cappello, F.4
-
12
-
-
0042078549
-
A survey of rollback-recovery protocols in message-passing systems
-
Elnozahy EN, Alvisi L, Wang Y-M, Johnson DB,. A survey of rollback-recovery protocols in message-passing systems. ACM Computing Surveys 2002; 34 (3): 375-408.
-
(2002)
ACM Computing Surveys
, vol.34
, Issue.3
, pp. 375-408
-
-
Elnozahy, E.N.1
Alvisi, L.2
Wang, Y.-M.3
Johnson, D.B.4
-
14
-
-
27144432456
-
A checkpoint/recovery model for heterogeneous dataflow computations using work-stealing
-
Euro-Par 2005 Parallel Processing: 11th International Euro-Par Conference. Proceedings
-
Jafar S, Gautier T, Krings AW, Roch J-L,. A checkpoint/recovery model for heterogeneous dataflow computations using work-stealing. In Euro-Par (Lecture Notes in Computer Science, vol. 3648), Cunha JC, Medeiros PD, (eds). Springer: Berlin, 2005; 675-684. (Pubitemid 41490867)
-
(2005)
Lecture Notes in Computer Science
, vol.3648
, pp. 675-684
-
-
Jafar, S.1
Gautier, T.2
Krings, A.3
Roch, J.-L.4
-
15
-
-
33750200181
-
ASSIST as a research framework for high-performance Grid programming environments
-
In, Cunha J.C., Rana O.F. (eds). Springer: Berlin, January.
-
Aldinucci M, Coppola M, Danelutto M, Vanneschi M, Zoccolo C,. ASSIST as a research framework for high-performance Grid programming environments. In Grid Computing: Software Environments and Tools, Cunha JC, Rana OF, (eds). Springer: Berlin, January 2006; 230-256.
-
(2006)
Grid Computing: Software Environments and Tools
, pp. 230-256
-
-
Aldinucci, M.1
Coppola, M.2
Danelutto, M.3
Vanneschi, M.4
Zoccolo, C.5
-
16
-
-
33746307003
-
SPHINX: A fault-tolerant system for scheduling in dynamic grid environments
-
Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD.
-
uk In J, Avery P, Cavanaugh R, Chitnis L, Kulkarni M, Ranka S,. SPHINX: A fault-tolerant system for scheduling in dynamic grid environments. IPDPS '05: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS '05)-Papers, Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD, 2005.
-
(2005)
IPDPS '05: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS '05)-Papers
-
-
Uk In, J.1
Avery, P.2
Cavanaugh, R.3
Chitnis, L.4
Kulkarni, M.5
Ranka, S.6
-
17
-
-
77957037827
-
Towards a middleware framework for dynamically reconfigurable scientific computing
-
In, (Advances in Parallel Computing, vol. 14), Grandinetti L. (ed.). Elsevier.
-
El Maghraoui K, Desell T, Szymanski BK, Teresco JD, Varela C,. Towards a middleware framework for dynamically reconfigurable scientific computing. In Grid Computing and New Frontiers of High Performance Processing, (Advances in Parallel Computing, vol. 14), Grandinetti L, (ed.). Elsevier: 2005; 275-301.
-
(2005)
Grid Computing and New Frontiers of High Performance Processing
, pp. 275-301
-
-
El Maghraoui, K.1
Desell, T.2
Szymanski, B.K.3
Teresco, J.D.4
Varela, C.5
-
18
-
-
33750223302
-
The internet operating system: Middleware for adaptive distributed computing
-
DOI 10.1177/1094342006068411
-
El Maghraoui K, Desell TJ, Szymanski BK, Varela CA,. The internet operating system: Middleware for adaptive distributed computing. International Journal of High Performance Computing Applications 2006; 20 (4): 467-480. (Pubitemid 44605332)
-
(2006)
International Journal of High Performance Computing Applications
, vol.20
, Issue.4
, pp. 467-480
-
-
El Maghraoui, K.1
Desell, T.J.2
Szymanski, B.K.3
Varela, C.A.4
-
19
-
-
0003640863
-
-
Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, U.S.A.
-
Lange DB, Oshima M,. Programming and Deploying Java Mobile Agents with Aglets. Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, U.S.A., 1998.
-
(1998)
Programming and Deploying Java Mobile Agents with Aglets
-
-
Lange, D.B.1
Oshima, M.2
-
20
-
-
0003802908
-
Agent tcl: A flexible and secure mobile-agent system
-
Hanover, NH, U.S.A.
-
Gray RS,. Agent tcl: A flexible and secure mobile-agent system. Technical Report, Hanover, NH, U.S.A., 1998.
-
(1998)
Technical Report
-
-
Gray, R.S.1
-
21
-
-
52349122889
-
Engineering an autonomic container for WSRF-based web services
-
Reich C, Banholzer M, Buyya R, Bubendorfer K,. Engineering an autonomic container for WSRF-based web services. Adcom 2007; 0: 277-282.
-
(2007)
Adcom
, vol.0
, pp. 277-282
-
-
Reich, C.1
Banholzer, M.2
Buyya, R.3
Bubendorfer, K.4
-
22
-
-
33646415354
-
Self healing and self configuration in a WSRF grid environment
-
In (Lecture Notes in Computer Science, vol. 3719), Hobbs M., Goscinski A.M., Zhou W. (eds). Springer: Berlin.
-
Messig M, Goscinski A,. Self healing and self configuration in a WSRF grid environment. In ICA3PP (Lecture Notes in Computer Science, vol. 3719), Hobbs M, Goscinski AM, Zhou W, (eds). Springer: Berlin, 2005; 149-158.
-
(2005)
ICA3PP
, pp. 149-158
-
-
Messig, M.1
Goscinski, A.2
-
23
-
-
0022020346
-
Distributed snapshots: Determining global states of distributed systems
-
DOI 10.1145/214451.214456
-
Chandy KM, Lamport L,. Distributed snapshots: Determining global states of distributed systems. ACM Transactions on Computer Systems 1985; 3 (1): 63-75. (Pubitemid 15597765)
-
(1985)
ACM Transactions on Computer Systems
, vol.3
, Issue.1
, pp. 63-75
-
-
Chandy, K.M.1
Lamport, L.2
-
24
-
-
33646182907
-
Checkpointing BSP parallel applications on the InteGrade Grid middleware: Research articles
-
de Camargo RY, Goldchleger A, Kon F, Goldman A,. Checkpointing BSP parallel applications on the InteGrade Grid middleware: Research articles. Concurrency and Computation: Practice and Experience 2006; 18 (6): 567-579.
-
(2006)
Concurrency and Computation: Practice and Experience
, vol.18
, Issue.6
, pp. 567-579
-
-
De Camargo, R.Y.1
Goldchleger, A.2
Kon, F.3
Goldman, A.4
-
25
-
-
24944480927
-
Transparent fault tolerance for grid applications
-
Advances in Grid Computing - EGC 2005: European Grid Conference, Revised Selected Papers
-
Garbacki P, Biskupski B, Bal HE,. Transparent fault tolerance for grid applications. EGC (Lecture Notes in Computer Science, vol. 3470), Sloot PMA, Hoekstra AG, Priol T, Reinefeld A, Bubak M, (eds). Springer: Berlin, 2005; 671-680. (Pubitemid 41313248)
-
(2005)
Lecture Notes in Computer Science
, vol.3470
, pp. 671-680
-
-
Garbacki, P.1
Biskupski, B.2
Bal, H.3
-
26
-
-
41149169653
-
A serialization based approach for strong mobility of shared object
-
DOI 10.1145/1294325.1294359, Proceedings of the 2007 5th International Conference on the Principles and Practice of Programming in Java, PPPJ 2007
-
Marzouk S, Ben-Jemaa M, Jmaiel M,. A serialisation based approach for strong mobility of shared object. Proceedings of the First International Workshop on Java for Mobility (Ja4Mo 07) as part of the International Conference on Principles and Practices of Programming in Java (PPPJ 2007), Lisbon, Portugal. ACM: New York, September 2007; 237-242. (Pubitemid 351429500)
-
(2007)
ACM International Conference Proceeding Series
, vol.272
, pp. 237-242
-
-
Marzouk, S.1
Ben Jemaa, M.2
Jmaiel, M.3
-
27
-
-
0035201417
-
Processor allocation and checkpoint interval selection in cluster computing systems
-
DOI 10.1006/jpdc.2001.1757
-
Plank JS, Thomason MG,. Processor allocation and checkpoint interval selection in cluster computing systems. Journal of Parallel and Distributed Computing 2001; 61 (11): 1570-1590. (Pubitemid 33119054)
-
(2001)
Journal of Parallel and Distributed Computing
, vol.61
, Issue.11
, pp. 1570-1590
-
-
Plank, J.S.1
Thomason, M.G.2
-
28
-
-
33847764225
-
Model-based checkpoint scheduling for volatile resource environments
-
Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA.
-
Nurmi D, Wolski R, Brevik J,. Model-based checkpoint scheduling for volatile resource environments. Technical Report 2004-25, Department of Computer Science, University of California Santa Barbara, Santa Barbara, CA, 2004.
-
(2004)
Technical Report 2004-25
-
-
Nurmi, D.1
Wolski, R.2
Brevik, J.3
-
29
-
-
0030600996
-
Checkpointing in distributed computing systems
-
DOI 10.1006/jpdc.1996.0069
-
Wong KF, Franklin M,. Checkpointing in distributed computing systems. Journal of Parallel and Distributed Computing 1996; 35 (1): 67-75. (Pubitemid 126167709)
-
(1996)
Journal of Parallel and Distributed Computing
, vol.35
, Issue.1
, pp. 67-75
-
-
Wong, K.F.1
Franklin, M.2
-
30
-
-
0018454850
-
On the optimum checkpoint interval
-
Gelenbe E,. On the optimum checkpoint interval. Journal of the ACM 1979; 26 (2): 259-270.
-
(1979)
Journal of the ACM
, vol.26
, Issue.2
, pp. 259-270
-
-
Gelenbe, E.1
-
31
-
-
0031341097
-
Performance analysis of two time-based coordinated checkpointing protocols
-
Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD.
-
Kavanaugh GP, Sanders WH,. Performance analysis of two time-based coordinated checkpointing protocols. PRFTS '97: Proceedings of the 1997 Pacific Rim International Symposium on Fault-Tolerant Systems, Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD, 1997; 194.
-
(1997)
PRFTS '97: Proceedings of the 1997 Pacific Rim International Symposium on Fault-Tolerant Systems
, pp. 194
-
-
Kavanaugh, G.P.1
Sanders, W.H.2
-
32
-
-
0031388399
-
Impact of checkpoint latency on overhead ratio of a checkpointing scheme
-
Vaidya NH,. Impact of checkpoint latency on overhead ratio of a checkpointing scheme. IEEE Transactions on Computers 1997; 46 (8): 942-947. (Pubitemid 127760644)
-
(1997)
IEEE Transactions on Computers
, vol.46
, Issue.8
, pp. 942-947
-
-
Vaidya, N.H.1
-
33
-
-
84976846528
-
A first order approximation to the optimum checkpoint interval
-
Young JW,. A first order approximation to the optimum checkpoint interval. Communications of the ACM 1974; 17 (9): 530-531.
-
(1974)
Communications of the ACM
, vol.17
, Issue.9
, pp. 530-531
-
-
Young, J.W.1
-
34
-
-
0027798414
-
Lazy checkpoint coordination for bounding rollback propagation
-
Princeton, NJ, U.S.A.
-
Wang Y-M, Fuchs WK,. Lazy checkpoint coordination for bounding rollback propagation. Twelfth Symposium on Reliable Distributed Systems, Princeton, NJ, U.S.A., 1993; 78-85.
-
(1993)
Twelfth Symposium on Reliable Distributed Systems
, pp. 78-85
-
-
Wang, Y.-M.1
Fuchs, W.K.2
-
35
-
-
0028427727
-
Consistent global checkpoints based on direct dependency tracking
-
DOI 10.1016/0020-0190(94)00038-7
-
Wang Y-M, Lowry A, Fuchs WK,. Consistent global checkpoints based on direct dependency tracking. Information Processing Letters 1994; 50 (4): 223-230. (Pubitemid 124013158)
-
(1994)
Information Processing Letters
, vol.50
, Issue.4
, pp. 223-230
-
-
Wang, Y.-M.1
Lowry, A.2
Fuchs, W.K.3
-
36
-
-
0031124071
-
Consistent global checkpoints that contain a given set of local checkpoints
-
Wang Y-M,. Consistent global checkpoints that contain a given set of local checkpoints. IEEE Transactions on Computers 1997; 46 (4): 456-468. (Pubitemid 127760472)
-
(1997)
IEEE Transactions on Computers
, vol.46
, Issue.4
, pp. 456-468
-
-
Wang, Y.-M.1
-
37
-
-
0035094268
-
Direct dependency-based determination of consistent global checkpoints
-
Baldoni R, Cioffi G, Helary J-M, Raynal M,. Direct dependency-based determination of consistent global checkpoints. International Journal of Computer Systems Science and Engineering 2001; 16 (1): 43-50.
-
(2001)
International Journal of Computer Systems Science and Engineering
, vol.16
, Issue.1
, pp. 43-50
-
-
Baldoni, R.1
Cioffi, G.2
Helary, J.-M.3
Raynal, M.4
-
39
-
-
84866225421
-
A communication-induced checkpointing protocol that ensures rollback-dependency trackability
-
Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD.
-
Baldoni R,. A communication-induced checkpointing protocol that ensures rollback-dependency trackability. FTCS '97: Proceedings of the 27th International Symposium on Fault-Tolerant Computing (FTCS '97). Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD, 1997; 68.
-
(1997)
FTCS '97: Proceedings of the 27th International Symposium on Fault-Tolerant Computing (FTCS '97)
, pp. 68
-
-
Baldoni, R.1
-
42
-
-
0031339111
-
Preventing useless checkpoints in distributed computations
-
Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD.
-
Helary J-M, Mostefaoui A, Raynal M,. Preventing useless checkpoints in distributed computations. SRDS '97: Proceedings of the 16th Symposium on Reliable Distributed Systems, Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD, 1997; 183.
-
(1997)
SRDS '97: Proceedings of the 16th Symposium on Reliable Distributed Systems
, pp. 183
-
-
Helary, J.-M.1
Mostefaoui, A.2
Raynal, M.3
-
43
-
-
0141599174
-
Libckpt: Transparent checkpointing under unix
-
Knoxville, TN, U.S.A.
-
Plank JS, Beck M, Kingsley G, Li K,. Libckpt: Transparent checkpointing under unix. Technical Report, Knoxville, TN, U.S.A., 1994.
-
(1994)
Technical Report
-
-
Plank, J.S.1
Beck, M.2
Kingsley, G.3
Li, K.4
-
44
-
-
0025625415
-
CATCH: Compiler assisted techniques for checkpointing
-
Newcastle Upon Tyne, U.K.
-
Li CC, Fuchs WK,. CATCH: Compiler assisted techniques for checkpointing. The 20th Annual International Symposium on Fault-Tolerant Computing, FTCS-20, Newcastle Upon Tyne, U.K., 1990; 74-81.
-
(1990)
The 20th Annual International Symposium on Fault-Tolerant Computing, FTCS-20
, pp. 74-81
-
-
Li, C.C.1
Fuchs, W.K.2
-
45
-
-
33646071736
-
An analysis of communication-induced checkpointing
-
Austin, TX, U.S.A.
-
Alvisi L, Elnozahy E, Rao SS, Husain SA, Del Mel A,. An analysis of communication-induced checkpointing. Technical Report, Austin, TX, U.S.A., 1999.
-
(1999)
Technical Report
-
-
Alvisi, L.1
Elnozahy, E.2
Rao, S.S.3
Husain, S.A.4
Del Mel, A.5
-
46
-
-
0000364460
-
Virtual precedence in asynchronous systems: Concepts and applications
-
Saarbrücken, Germany
-
Helary J-M, Mostéfaoui A, Raynal M,. Virtual precedence in asynchronous systems: Concepts and applications. Eleventh Workshop on Distributed Algorithms, WDAG'97, Saarbrücken, Germany, 1997.
-
(1997)
Eleventh Workshop on Distributed Algorithms, WDAG'97
-
-
Helary, J.-M.1
Mostéfaoui, A.2
Raynal, M.3
-
47
-
-
0029237761
-
Message logging: Pessimistic, optimistic, and causal
-
Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD.
-
Alvisi L, Marzullo K,. Message logging: Pessimistic, optimistic, and causal. ICDCS '95: Proceedings of the 15th International Conference on Distributed Computing Systems, Washington, DC, U.S.A. IEEE Computer Society: Silver Spring, MD, 1995; 229.
-
(1995)
ICDCS '95: Proceedings of the 15th International Conference on Distributed Computing Systems
, pp. 229
-
-
Alvisi, L.1
Marzullo, K.2
-
49
-
-
0026867749
-
Manetho: Transparent rollback-recovery with low overhead, limited rollback, and fast output commit
-
Elnozahy EN, Zwaenepoel W,. Manetho: Transparent rollback-recovery with low overhead, limited rollback, and fast output commit. IEEE Transactions on Computers 1992; 41 (5): 526-531.
-
(1992)
IEEE Transactions on Computers
, vol.41
, Issue.5
, pp. 526-531
-
-
Elnozahy, E.N.1
Zwaenepoel, W.2