-
1
-
-
60449097203
-
The design of OpenMP tasks
-
IEEE Transactions on march
-
E. Ayguade, N. Copty, A. Duran, J. Hoeflinger, Y. Lin, F. Massaioli, X. Teruel, P. Unnikrishnan, and G. Zhang. The design of OpenMP tasks. Parallel and Distributed Systems, IEEE Transactions on, 20(3):404-418, March 2009.
-
(2009)
IEEE Transactions on Parallel and Distributed Systems
, vol.20
, Issue.3
, pp. 404-418
-
-
Ayguade, E.1
Copty, N.2
Duran, A.3
Hoeflinger, J.4
Lin, Y.5
Massaioli, F.6
Teruel, X.7
Unnikrishnan, P.8
Zhang, G.9
-
3
-
-
84884662651
-
MPICH-V: Toward a scalable fault tolerant MPI for volatile nodes
-
Los Alamitos, CA, USA, 2002. IEEE Computer Society Press
-
G. Bosilca, A. Bouteiller, F. Cappello, S. Djilali, G. Fedak, C. Germain, T. Herault, P. Lemarinier, O. Lodygensky, F. Magniette, V. Neri, and A. Selikhov. MPICH-V: Toward a Scalable Fault Tolerant MPI for Volatile Nodes. In Proceedings of the 2002 ACM/IEEE conference on Supercomputing, Supercomputing '02, pages 1-18, Los Alamitos, CA, USA, 2002. IEEE Computer Society Press.
-
Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, Supercomputing '02
, pp. 1-18
-
-
Bosilca, G.1
Bouteiller, A.2
Cappello, F.3
Djilali, S.4
Fedak, G.5
Germain, C.6
Herault, T.7
Lemarinier, P.8
Lodygensky, O.9
Magniette, F.10
Neri, V.11
Selikhov, A.12
-
4
-
-
0038040085
-
Automated application-level checkpointing of MPI programs
-
New York, NY, USA ACM
-
G. Bronevetsky, D. Marques, K. Pingali, and P. Stodghill. Automated application-level checkpointing of MPI programs. In Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '03, pages 84-94, New York, NY, USA, 2003. ACM.
-
(2003)
Proceedings of the Ninth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '03
, pp. 84-94
-
-
Bronevetsky, G.1
Marques, D.2
Pingali, K.3
Stodghill, P.4
-
6
-
-
34249696738
-
Parallel programmability and the Chapel language
-
DOI 10.1177/1094342007078442
-
B. Chamberlain, D. Callahan, and H. Zima. Parallel programmability and the Chapel language. International Journal of High Performance Computing Applications, 21(3):291-312, 2007. (Pubitemid 47082808)
-
(2007)
International Journal of High Performance Computing Applications
, vol.21
, Issue.3
, pp. 291-312
-
-
Chamberlain, B.L.1
Callahan, D.2
Zima, H.P.3
-
7
-
-
31744441529
-
X10: An object-oriented approach to non-uniform cluster computing
-
ACM
-
P. Charles, C. Grothoff, V. Saraswat, C. Donawa, A. Kielstra, K. Ebcioglu, C. Von Praun, and V. Sarkar. X10: An object-oriented approach to non-uniform cluster computing. In ACM SIGPLAN Notices, volume 40, pages 519-538. ACM, 2005.
-
(2005)
ACM SIGPLAN Notices
, vol.40
, pp. 519-538
-
-
Charles, P.1
Grothoff, C.2
Saraswat, V.3
Donawa, C.4
Kielstra, A.5
Ebcioglu, K.6
Von Praun, C.7
Sarkar, V.8
-
8
-
-
37549003336
-
MapReduce: Simplified data processing on large clusters
-
Jan.
-
J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1):107-113, Jan. 2008.
-
(2008)
Commun. ACM
, vol.51
, Issue.1
, pp. 107-113
-
-
Dean, J.1
Ghemawat, S.2
-
9
-
-
41149092147
-
Dynamo: Amazon's highly available key-value store
-
DOI 10.1145/1294261.1294281, SOSP'07: Proceedings of the 21st ACM Symposium on Operating Systems Principles
-
G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazon's highly available key-value store. In Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles, SOSP '07, pages 205-220, New York, NY, USA, 2007. ACM. (Pubitemid 351429377)
-
(2007)
Operating Systems Review (ACM)
, pp. 205-220
-
-
DeCandia, G.1
Hastorun, D.2
Jampani, M.3
Kakulapati, G.4
Lakshman, A.5
Pilchin, A.6
Sivasubramanian, S.7
Vosshall, P.8
Vogels, W.9
-
10
-
-
74049140383
-
Scalable work stealing
-
New York, NY, USA ACM
-
J. Dinan, D. B. Larkins, P. Sadayappan, S. Krishnamoorthy, and J. Nieplocha. Scalable work stealing. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, pages 53:1-53:11, New York, NY, USA, 2009. ACM.
-
(2009)
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09
, pp. 53:1-53:11
-
-
Dinan, J.1
Larkins, D.B.2
Sadayappan, P.3
Krishnamoorthy, S.4
Nieplocha, J.5
-
11
-
-
9144223280
-
Checkpointing for peta-scale systems: A look into the future of practical rollback-recovery
-
IEEE Transactions on April-June
-
E. Elnozahy and J. Plank. Checkpointing for peta-scale systems: a look into the future of practical rollback-recovery. Dependable and Secure Computing, IEEE Transactions on, 1(2):97-108, April-June 2004.
-
(2004)
IEEE Transactions on Dependable and Secure Computing
, vol.1
, Issue.2
, pp. 97-108
-
-
Elnozahy, E.1
Plank, J.2
-
12
-
-
0042078549
-
A survey of rollback-recovery protocols in message-passing systems
-
E. N. Elnozahy, L. Alvisi, Y.-M. Wang, and D. B. Johnson. A survey of rollback-recovery protocols in message-passing systems. ACM Comput. Surv., 34(3):375-408, 2002.
-
(2002)
ACM Comput. Surv.
, vol.34
, Issue.3
, pp. 375-408
-
-
Elnozahy, E.N.1
Alvisi, L.2
Wang, Y.-M.3
Johnson, D.B.4
-
14
-
-
83155188951
-
Evaluating the viability of process replication reliability for exascale systems
-
New York, NY, USA ACM
-
K. Ferreira, J. Stearley, J. H. Laros, III, R. Oldfield, K. Pedretti, R. Brightwell, R. Riesen, P. G. Bridges, and D. Arnold. Evaluating the viability of process replication reliability for exascale systems. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11, pages 44:1-44:12, New York, NY, USA, 2011. ACM.
-
(2011)
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11
, pp. 44:1-44:12
-
-
Ferreira, K.1
Stearley, J.2
Laros III, J.H.3
Oldfield, R.4
Pedretti, K.5
Brightwell, R.6
Riesen, R.7
Bridges, P.G.8
Arnold, D.9
-
15
-
-
0042986976
-
-
PhD thesis, MIT, Cambridge, MA, USA
-
M. Frigo. Portable High-Performance Programs. PhD thesis, MIT, Cambridge, MA, USA, 1999.
-
(1999)
Portable High-performance Programs
-
-
Frigo, M.1
-
16
-
-
0347507496
-
The implementation of the Cilk-5 multithreaded language
-
M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. SIGPLAN Not., 33:212-223, May 1998. (Pubitemid 128454798)
-
(1998)
SIGPLAN Notices (ACM Special Interest Group on Programming Languages)
, vol.33
, Issue.5
, pp. 212-223
-
-
Frigo, M.1
Leiserson, C.E.2
Randall, K.H.3
-
17
-
-
33845434226
-
Transparent, incremental checkpointing at kernel level: A foundation for fault tolerance for parallel computers
-
Washington, DC, USA IEEE Computer Society
-
R. Gioiosa, J. C. Sancho, S. Jiang, F. Petrini, and K. Davis. Transparent, incremental checkpointing at kernel level: a foundation for fault tolerance for parallel computers. In Proceedings of the 2005 ACM/IEEE conference on Supercomputing, SC '05, pages 9-, Washington, DC, USA, 2005. IEEE Computer Society.
-
(2005)
Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, SC '05
, pp. 9
-
-
Gioiosa, R.1
Sancho, J.C.2
Jiang, S.3
Petrini, F.4
Davis, K.5
-
18
-
-
33749067567
-
Berkeley lab checkpoint/restart (BLCR) for Linux clusters
-
DOI 10.1088/1742-6596/46/1/067, 067
-
P. H. Hargrove and J. C. Duell. Berkeley Lab Checkpoint/Restart (BLCR) for Linux clusters. In Journal of Physics: Conf. Series (SciDAC), volume 46, pages 494-499, June 2006. (Pubitemid 44461038)
-
(2006)
Journal of Physics: Conference Series
, vol.46
, Issue.1
, pp. 494-499
-
-
Hargrove, P.H.1
Duell, J.C.2
-
19
-
-
0021439162
-
Algorithm-based fault tolerance for matrix operations
-
K.-H. Huang and J. Abraham. Algorithm-based fault tolerance for matrix operations. Computers, IEEE Transactions on, C-33(6):518-528, June 1984. (Pubitemid 14584528)
-
(1984)
IEEE Transactions on Computers
, vol.C-33
, Issue.6
, pp. 518-528
-
-
Huang, K.-H.1
Abraham, J.A.2
-
20
-
-
85160681664
-
Transparent checkpoint-restart of multiple processes on commodity operating systems
-
O. Laadan and J. Nieh. Transparent checkpoint-restart of multiple processes on commodity operating systems. In USENIX Annual Technical Conference, 2007.
-
(2007)
USENIX Annual Technical Conference
-
-
Laadan, O.1
Nieh, J.2
-
21
-
-
77955933052
-
Cassandra: A decentralized structured storage system
-
April
-
A. Lakshman and P. Malik. Cassandra: a decentralized structured storage system. SIGOPS Oper. Syst. Rev., 44:35-40, April 2010.
-
(2010)
SIGOPS Oper. Syst. Rev.
, vol.44
, pp. 35-40
-
-
Lakshman, A.1
Malik, P.2
-
22
-
-
34548046749
-
Proactive fault tolerance for HPC with Xen virtualization
-
DOI 10.1145/1274971.1274978, Proceedings of ICS07: 21st ACM International Conference on Supercomputing
-
A. B. Nagarajan, F. Mueller, C. Engelmann, and S. L. Scott. Proactive fault tolerance for HPC with Xen virtualization. In Proceedings of the 21st annual international conference on Supercomputing, ICS '07, pages 23-32, New York, NY, USA, 2007. ACM. (Pubitemid 47281603)
-
(2007)
Proceedings of the International Conference on Supercomputing
, pp. 23-32
-
-
Nagarajan, A.B.1
Mueller, F.2
Engelmann, C.3
Scott, S.L.4
-
23
-
-
34547424386
-
Cooperative checkpointing: A robust approach to large-scale systems reliability
-
DOI 10.1145/1183401.1183406, Proceedings of the 20th Annual International Conference on Supercomputing, ICS 2006
-
A. J. Oliner, L. Rudolph, and R. K. Sahoo. Cooperative checkpointing: a robust approach to large-scale systems reliability. In Proceedings of the 20th annual international conference on Supercomputing, ICS '06, pages 14-23, New York, NY, USA, 2006. ACM. (Pubitemid 47168488)
-
(2006)
Proceedings of the International Conference on Supercomputing
, pp. 14-23
-
-
Oliner, A.J.1
Rudolph, L.2
Sahoo, R.K.3
-
26
-
-
0026812659
-
The design and implementation of a log-structured file system
-
DOI 10.1145/146941.146943
-
M. Rosenblum and J. Ousterhout. The design and implementation of a log-structured file system. ACM Transactions on Computer Systems (TOCS), 10(1):26-52, 1992. (Pubitemid 23598979)
-
(1992)
ACM Transactions on Computer Systems
, vol.10
, Issue.1
, pp. 26-52
-
-
Rosenblum, M.1
Ousterhout, J.K.2
-
27
-
-
79952810744
-
Lifeline-based global load balancing
-
New York, NY, USA ACM
-
V. A. Saraswat, P. Kambadur, S. Kodali, D. Grove, and S. Krishnamoorthy. Lifeline-based global load balancing. In Proceedings of the 16th ACM symposium on Principles and practice of parallel programming, PPoPP '11, pages 201-212, New York, NY, USA, 2011. ACM.
-
(2011)
Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, PPoPP '11
, pp. 201-212
-
-
Saraswat, V.A.1
Kambadur, P.2
Kodali, S.3
Grove, D.4
Krishnamoorthy, S.5
-
28
-
-
0026867086
-
Active messages: A mechanism for integrated communication and computation
-
New York, NY, USA ACM
-
T. von Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser. Active messages: a mechanism for integrated communication and computation. In Proceedings of the 19th annual international symposium on Computer architecture, ISCA '92, pages 256-266, New York, NY, USA, 1992. ACM.
-
(1992)
Proceedings of the 19th Annual International Symposium on Computer Architecture, ISCA '92
, pp. 256-266
-
-
Von Eicken, T.1
Culler, D.E.2
Goldstein, S.C.3
Schauser, K.E.4
-
29
-
-
34548768671
-
A job pause service under LAM/MPI+BLCR for transparent fault tolerance
-
C. Wang, F. Mueller, C. Engelmann, and S. L. Scott. A Job Pause Service under LAM/MPI+BLCR for Transparent Fault Tolerance. In IPDPS, pages 1-10, 2007.
-
(2007)
IPDPS
, pp. 1-10
-
-
Wang, C.1
Mueller, F.2
Engelmann, C.3
Scott, S.L.4
-
30
-
-
0028465953
-
Algorithm-based fault tolerance for FFT networks
-
IEEE Transactions on Jul
-
S.-J. Wang and N. Jha. Algorithm-based fault tolerance for FFT networks. Computers, IEEE Transactions on, 43(7):849-854, Jul 1994.
-
(1994)
IEEE Transactions on Computers
, vol.43
, Issue.7
, pp. 849-854
-
-
Wang, S.-J.1
Jha, N.2