-
1
-
-
0036922098
-
Implementation and performance evaluation of an adaptable failure detector
-
Washington D.C., USA, June
-
M. Bertier, O. Marin, and P. Sens, "Implementation and performance evaluation of an adaptable failure detector," Proc. International Conference on Dependable Systems and Networks, pp.354-363, Washington D.C., USA, June 2002.
-
(2002)
Proc. International Conference on Dependable Systems and Networks
, pp. 354-363
-
-
Bertier, M.1
Marin, O.2
Sens, P.3
-
2
-
-
0031570635
-
Application level fault tolerance in heterogeneous networks of workstations
-
Special Issue on Workstation Clusters and Network-based Computing, June
-
A. Beguelin, E. Seligman, and P. Stephan, "Application level fault tolerance in heterogeneous networks of workstations," Special Issue on Workstation Clusters and Network-based Computing, J. Parallel Distrib. Comput., vol.43, pp. 147-155, June 1997.
-
(1997)
J. Parallel Distrib. Comput.
, vol.43
, pp. 147-155
-
-
Beguelin, A.1
Seligman, E.2
Stephan, P.3
-
3
-
-
85027106133
-
-
CERNET, http://www.cernet.edu.cn
-
-
-
-
4
-
-
0002506046
-
Application-specific tools
-
ed. I. Foster and C. Kesselman, Morgan Kaufmann
-
H. Casanova, J. Dongarra, C. Johnson, and M. Miller, "Application-specific tools," in The Grid: Blueprint for a New Computing Infrastructure, ed. I. Foster and C. Kesselman, pp. 159-180, Morgan Kaufmann, 1998.
-
(1998)
The Grid: Blueprint for a New Computing Infrastructure
, pp. 159-180
-
-
Casanova, H.1
Dongarra, J.2
Johnson, C.3
Miller, M.4
-
5
-
-
0030102105
-
Unreliable failure detectors for reliable distributed systems
-
T.D. Chandra and S. Toueg, "Unreliable failure detectors for reliable distributed systems," J. ACM, vol.43, no.2, pp.225-267, 1996.
-
(1996)
J. ACM
, vol.43
, Issue.2
, pp. 225-267
-
-
Chandra, T.D.1
Toueg, S.2
-
6
-
-
0036565809
-
On the quality of service of failure detectors
-
W. Chen, S. Toueg, and M.K. Aguilera, "On the quality of service of failure detectors," IEEE Trans. Comput., vol.51, no.2, pp.13-32, 2002.
-
(2002)
IEEE Trans. Comput.
, vol.51
, Issue.2
, pp. 13-32
-
-
Chen, W.1
Toueg, S.2
Aguilera, M.K.3
-
7
-
-
0020765766
-
The effects of checkpointing on program execution time
-
A. Duda, "The effects of checkpointing on program execution time," Inf. Process. Lett., vol.16, pp.221-229, 1983.
-
(1983)
Inf. Process. Lett.
, vol.16
, pp. 221-229
-
-
Duda, A.1
-
8
-
-
0003751935
-
Failure detectors in omission failure environments
-
Department of Computer Science, Cornell University, Sept.
-
D. Dolev, R. Friedman, I. Keidar, and D. Malkhi, "Failure detectors in omission failure environments," Technical Report, Department of Computer Science, Cornell University, Sept. 1996. http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.es/TR96-1608
-
(1996)
Technical Report
-
-
Dolev, D.1
Friedman, R.2
Keidar, I.3
Malkhi, D.4
-
9
-
-
26644472267
-
On the design of a failure detection service for large scale distributed systems
-
Ishikawa, Japan, Sept.
-
X. Défago, N. Hayashibara, and T. Katayama, "On the design of a failure detection service for large scale distributed systems," Proc. International Symposium towards Peta-Bit Ultra-Networks (PBit), pp.88-95, Ishikawa, Japan, Sept. 2003.
-
(2003)
Proc. International Symposium Towards Peta-bit Ultra-networks (PBit)
, pp. 88-95
-
-
Défago, X.1
Hayashibara, N.2
Katayama, T.3
-
10
-
-
85027099280
-
DataGrid information and monitoring services architecture report: Design, requirements and evaluation criteria
-
DataGrid-03-D3.2-334453-4-0
-
"DataGrid Information and Monitoring Services Architecture Report: Design, Requirements and Evaluation Criteria," DataGrid Report, DataGrid-03-D3.2-334453-4-0, 2002.
-
(2002)
DataGrid Report
-
-
-
11
-
-
2642568798
-
Reliability analysis of grid computing systems
-
Y.S. Dai, M. Xie, and K.L. Poh, "Reliability analysis of grid computing systems," Proc. 2002 Pacific Rim International Symposium on Dependable Computing, pp.97-104, 2002.
-
(2002)
Proc. 2002 Pacific Rim International Symposium on Dependable Computing
, pp. 97-104
-
-
Dai, Y.S.1
Xie, M.2
Poh, K.L.3
-
12
-
-
85010199652
-
Failure detectors as first class objects
-
Sept.
-
P. Felber, X. Défago, R. Guerraoui, and P. Oser, "Failure detectors as first class objects," Proc. 9th IEEE International Symposium on Distributed Objects and Applications (DOA'99), pp.132-141, Sept. 1999.
-
(1999)
Proc. 9th IEEE International Symposium on Distributed Objects and Applications (DOA'99)
, pp. 132-141
-
-
Felber, P.1
Défago, X.2
Guerraoui, R.3
Oser, P.4
-
13
-
-
84885938405
-
An adaptive failure detection protocol
-
C. Fetzer, M. Raynal, and F. Tronel, "An adaptive failure detection protocol," Proc. 8th IEEE Pacific Rim Symposium on Dependable Computing (PRDC-8), pp. 146-153, 2001.
-
(2001)
Proc. 8th IEEE Pacific Rim Symposium on Dependable Computing (PRDC-8)
, pp. 146-153
-
-
Fetzer, C.1
Raynal, M.2
Tronel, F.3
-
14
-
-
0003918059
-
-
I. Foster, C. Kesselman, J.M. Nick, and S. Tuecke, "The physiology of the grid: An open grid services architecture for distributed systems integration," http://www.gridforum.forum.org/ogsi-wg/drafts/ogsa.draft2.9.2002-06-22.pdf, 2002.
-
(2002)
The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration
-
-
Foster, I.1
Kesselman, C.2
Nick, J.M.3
Tuecke, S.4
-
15
-
-
0035455653
-
The anatomy of the grid: Enabling scalable virtual organizations
-
I. Foster, C. Kesselman, and S. Tuecke, "The anatomy of the grid: Enabling scalable virtual organizations," International Journal of High Performance Computing Applications, vol.15, no.3, pp.200-222, 2001.
-
(2001)
International Journal of High Performance Computing Applications
, vol.15
, Issue.3
, pp. 200-222
-
-
Foster, I.1
Kesselman, C.2
Tuecke, S.3
-
16
-
-
0041898464
-
Condor-G: A computation management agent for multi-institutional grids
-
J. Frey, T. Tannenbaum, I. Foster, M. Livny, and S. Tuecke, "Condor-G: A computation management agent for multi-institutional grids," Cluster Computing, vol.5, no.3, pp.237-246, 2002.
-
(2002)
Cluster Computing
, vol.5
, Issue.3
, pp. 237-246
-
-
Frey, J.1
Tannenbaum, T.2
Foster, I.3
Livny, M.4
Tuecke, S.5
-
18
-
-
0003635251
-
-
MIT Press
-
A. Geist, A. Beguelin, J. Dongarra, W. Jiang, B. Manchek, and V. Sunderam, PVM: Parallel Virtual Machine: A User's Guide and Tutorial for Network Parallel Computing, MIT Press, 1994.
-
(1994)
PVM: Parallel Virtual Machine: A User's Guide and Tutorial for Network Parallel Computing
-
-
Geist, A.1
Beguelin, A.2
Dongarra, J.3
Jiang, W.4
Manchek, B.5
Sunderam, V.6
-
19
-
-
0004137528
-
Mentat
-
ed. G.V. Wilson and P. Lu, MIT Press, Cambridge Mass.
-
A.S. Grimshaw, A. Ferrari, and E.A. West, "Mentat," in Parallel Programming Using C++, ed. G.V. Wilson and P. Lu, pp.382-427, MIT Press, Cambridge Mass., 1996.
-
(1996)
Parallel Programming Using C++
, pp. 382-427
-
-
Grimshaw, A.S.1
Ferrari, A.2
West, E.A.3
-
22
-
-
0036444933
-
Failure detectors for large-scale distributed systems
-
Oct.
-
N. Hayashibara, A. Cherif, and T. Katayama, "Failure detectors for large-scale distributed systems," Proc. 21st IEEE Symposium on Reliable Distributed Systems, pp.404-409, Oct. 2002.
-
(2002)
Proc. 21st IEEE Symposium on Reliable Distributed Systems
, pp. 404-409
-
-
Hayashibara, N.1
Cherif, A.2
Katayama, T.3
-
24
-
-
0003574196
-
-
McGraw-Hill
-
K. Hwang and Z. Xu, Scalable Parallel Computing, Technology, Architecture, Programming, pp.468-472, McGraw-Hill, 1997.
-
(1997)
Scalable Parallel Computing, Technology, Architecture, Programming
, pp. 468-472
-
-
Hwang, K.1
Xu, Z.2
-
25
-
-
0027665955
-
A generalized algorithm for evaluating distributed-program reliability
-
A. Kumar and D.P. Agrawal, "A generalized algorithm for evaluating distributed-program reliability," IEEE Trans. Reliab., vol.42, no.3, pp.416-424, 1993.
-
(1993)
IEEE Trans. Reliab.
, vol.42
, Issue.3
, pp. 416-424
-
-
Kumar, A.1
Agrawal, D.P.2
-
26
-
-
0022576318
-
Distributed program reliability analysis
-
March
-
V.K.P. Kumar, S. Hariri, and C.S. Raghavendra, "Distributed program reliability analysis," IEEE Trans. Softw. Eng., vol.SE-12, no.1, pp.42-50, March 1986.
-
(1986)
IEEE Trans. Softw. Eng.
, vol.SE-12
, Issue.1
, pp. 42-50
-
-
Kumar, V.K.P.1
Hariri, S.2
Raghavendra, C.S.3
-
28
-
-
0034593157
-
CoG kits: A bridge between commodity distributed computing and high-performance grids
-
G. von Laszewski, I. Foster, J. Gawor, W. Smith, and S. Tuecke, "CoG kits: A bridge between commodity distributed computing and high-performance grids," Proc. ACM 2000 Java Grande Conference, pp.97-106, 2000.
-
(2000)
Proc. ACM 2000 Java Grande Conference
, pp. 97-106
-
-
Von Laszewski, G.1
Foster, I.2
Gawor, J.3
Smith, W.4
Tuecke, S.5
-
29
-
-
0037090799
-
A model for availability analysis of distributed software/hardware systems
-
C.D. Lai, M. Xie, K.L. Poh, Y.S. Dai, and P. Yang, "A model for availability analysis of distributed software/hardware systems," Information and Software Technology, vol.44, pp.343-350, 2002.
-
(2002)
Information and Software Technology
, vol.44
, pp. 343-350
-
-
Lai, C.D.1
Xie, M.2
Poh, K.L.3
Dai, Y.S.4
Yang, P.5
-
31
-
-
33645224897
-
Replication in Ficus distributed file systems
-
G.J. Popek, R.G. Guy, T.W. Page, Jr., and J.S. Heidemann, "Replication in Ficus distributed file systems," IEEE Computer Society Technical Committee on Operating Systems and Application Environments Newsletter, vol.4, pp.24-29, 1990.
-
(1990)
IEEE Computer Society Technical Committee on Operating Systems and Application Environments Newsletter
, vol.4
, pp. 24-29
-
-
Popek, G.J.1
Guy, R.G.2
Page Jr., T.W.3
Heidemann, J.S.4
-
32
-
-
0002513952
-
A gossip-style failure detection service
-
R. van Renesse, Y. Minsky, and M. Hayden, "A gossip-style failure detection service," Proc. Middleware'98, pp.55-70, 1998.
-
(1998)
Proc. Middleware'98
, pp. 55-70
-
-
Van Renesse, R.1
Minsky, Y.2
Hayden, M.3
-
33
-
-
77952176829
-
A fault detection service for wide area distributed computations
-
July
-
P. Stelling, I. Foster, C. Kesselman, C. Lee, and G. von Laszewski, "A fault detection service for wide area distributed computations," Proc. 7th IEEE Symposium on High Performance Distributed Computing, pp.268-278, July 1998.
-
(1998)
Proc. 7th IEEE Symposium on High Performance Distributed Computing
, pp. 268-278
-
-
Stelling, P.1
Foster, I.2
Kesselman, C.3
Lee, C.4
Von Laszewski, G.5
-
34
-
-
33645220707
-
Reliability analysis for grid computing
-
X. Shi, H. Jin, W. Qiang, and D. Zou, "Reliability analysis for grid computing," Lect. Notes Comput. Sci., vol.3251, pp.787-790, 2004.
-
(2004)
Lect. Notes Comput. Sci.
, vol.3251
, pp. 787-790
-
-
Shi, X.1
Jin, H.2
Qiang, W.3
Zou, D.4
-
35
-
-
0031388399
-
Impact of checkpoint latency on overhead ratio of a checkpointing scheme
-
Aug.
-
N.H. Vaidya, "Impact of checkpoint latency on overhead ratio of a checkpointing scheme," IEEE Trans. Comput., vol.46, no.8, pp.942-947, Aug. 1997.
-
(1997)
IEEE Trans. Comput.
, vol.46
, Issue.8
, pp. 942-947
-
-
Vaidya, N.H.1