SCOPUS Information Search Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 7017 LNCS, Issue PART 2, 2011, Pages 109-120

SpotMPI: A framework for auction-based HPC computing using Amazon spot instances

(3) Taifi, Moussa (a); Shi, Justin Y (a); Khreishah, Abdallah (a)

(a) Temple University (United States)

Author keywords

[No Author keywords available]

Indexed keywords

APPLICATION TOOLKIT; CLOUD PROVIDERS; COMMUNICATION COMPLEXITY; COMPUTING PLATFORM; COST EFFECTIVE; CRITICAL TIME; ECONOMY OF SCALE; FAIR MARKET; FORMAL MODEL; MPI APPLICATIONS; NON-TRIVIAL; OPTIMAL BIDDING; PERFORMANCE PARAMETERS; PRACTICAL COMPUTING; PROGRAMMING PARADIGMS; RESOURCE USE;

ALGORITHMS; FAULT TOLERANCE; MESSAGE PASSING; OPTIMIZATION;

CLOUD COMPUTING;

EID: 80455140325 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-642-24669-2_11 Document Type: Conference Paper

Times cited : (24)

References (30)

1
- 80455174959
- Starcluster (2010), http://web.mit.edu/stardev/cluster/
- (2010) Starcluster

2
- 80455151192
- Amazon hpc cluster instances (2011), http://aws.amazon.com/ec2/hpc-applications/
- (2011) Amazon Hpc Cluster Instances

3
- 0033359224
- Starfish: Fault-tolerant dynamic mpi programs on clusters of workstations
- Agbaria, A.M., Friedman, R.: Starfish: fault-tolerant dynamic mpi programs on clusters of workstations. In: Proceedings of the Eighth International Symposium on High Performance Distributed Computing, 1999, pp. 167-176 (1999)
- (1999) Proceedings of the Eighth International Symposium on High Performance Distributed Computing, 1999 , pp. 167-176
- Agbaria, A.M.¹ Friedman, R.²

4
- 85060036181
- Validity of the single processor approach to achieving large scale computing capabilities
- ACM, New York
- Amdahl, G.M.: Validity of the single processor approach to achieving large scale computing capabilities. In: Proceedings of the Spring Joint Computer Conference, April 18-20, pp. 483-485. ACM, New York (1967)
- (1967) Proceedings of the Spring Joint Computer Conference, April 18-20 , pp. 483-485
- Amdahl, G.M.¹

5
- 78049508316
- Decision model for cloud computing under sla constraints
- Andrzejak, A., Kondo, D., Yi, S.: Decision model for cloud computing under sla constraints. In: Proc. IEEE Int Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS) Symp., pp. 257-266 (2010)
- (2010) Proc. IEEE Int Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS) Symp. , pp. 257-266
- Andrzejak, A.¹ Kondo, D.² Yi, S.³

6
- 21644433634
- Xen and the art of virtualization
- ACM, New York
- Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A.: Xen and the art of virtualization. In: Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, pp. 164-177. ACM, New York (2003)
- (2003) Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles , pp. 164-177
- Barham, P.¹ Dragovic, B.² Fraser, K.³ Hand, S.⁴ Harris, T.⁵ Ho, A.⁶ Neugebauer, R.⁷ Pratt, I.⁸ Warfield, A.⁹

7
- 0039877166
- Timing models and local stopping criteria for asynchronous iterative algorithms
- Blathras, K., Szyld, D.B., Shi, Y.: Timing models and local stopping criteria for asynchronous iterative algorithms. Journal of Parallel and Distributed Computing 58(3), 446-465 (1999)
- (1999) Journal of Parallel and Distributed Computing , vol.58 , Issue.3 , pp. 446-465
- Blathras, K.¹ Szyld, D.B.² Shi, Y.³

8
- 67650326696
- Borthakur, D.: The hadoop distributed file system: Architecture and design (2007), http://developer.yahoo.com/hadoop/tutorial/
- (2007) The Hadoop Distributed File System: Architecture and Design
- Borthakur, D.¹

9
- 85088778522
- See spot run: Using spot instances for mapreduce workflows
- USENIX Association
- Chohan, N., Castillo, C., Spreitzer, M., Steinder, M., Tantawi, A., Krintz, C.: See spot run: using spot instances for mapreduce workflows. In: Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, p. 7. USENIX Association (2010)
- (2010) Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing , pp. 7
- Chohan, N.¹ Castillo, C.² Spreitzer, M.³ Steinder, M.⁴ Tantawi, A.⁵ Krintz, C.⁶

10
- 28044460018
- A higher order estimate of the optimum checkpoint interval for restart dumps
- Daly, J.T.: A higher order estimate of the optimum checkpoint interval for restart dumps. Future Generation Computer Systems 22(3), 303-312 (2006)
- (2006) Future Generation Computer Systems , vol.22 , Issue.3 , pp. 303-312
- Daly, J.T.¹

11
- 37549003336
- Mapreduce: Simplified data processing on large clusters
- Dean, J., Ghemawat, S.: Mapreduce: Simplified data processing on large clusters. Communications of the ACM 51(1), 107-113 (2008)
- (2008) Communications of the ACM , vol.51 , Issue.1 , pp. 107-113
- Dean, J.¹ Ghemawat, S.²

12
- 84940567900
- Ft-mpi: Fault tolerant mpi, supporting dynamic applications in a dynamic world
- Dongarra, J., Kacsuk, P., Podhorszki, N. (eds.) PVM/MPI 2000. Springer, Heidelberg
- Fagg, G., Dongarra, J.: Ft-mpi: Fault tolerant mpi, supporting dynamic applications in a dynamic world. In: Dongarra, J., Kacsuk, P., Podhorszki, N. (eds.) PVM/MPI 2000. LNCS, vol. 1908, pp. 346-353. Springer, Heidelberg (2000)
- (2000) LNCS , vol.1908 , pp. 346-353
- Fagg, G.¹ Dongarra, J.²

13
- 0347133226
- A network-failure-tolerant message-passing system for terascale clusters
- Graham, R.L., Choi, S.E., Daniel, D.J., Desai, N.N., Minnich, R.G., Rasmussen, C.E., Risinger, L.D., Sukalski, M.W.: A network-failure-tolerant message-passing system for terascale clusters. International Journal of Parallel Programming 31(4), 285-303 (2003)
- (2003) International Journal of Parallel Programming , vol.31 , Issue.4 , pp. 285-303
- Graham, R.L.¹ Choi, S.E.² Daniel, D.J.³ Desai, N.N.⁴ Minnich, R.G.⁵ Rasmussen, C.E.⁶ Risinger, L.D.⁷ Sukalski, M.W.⁸

14
- 33749067567
- Berkeley lab checkpoint/restart (blcr) for linux clusters
- IOP Publishing
- Hargrove, P.H., Duell, J.C.: Berkeley lab checkpoint/restart (blcr) for linux clusters. In: Journal of Physics: Conference Series, vol. 46, p. 494. IOP Publishing (2006)
- (2006) Journal of Physics: Conference Series , vol.46 , pp. 494
- Hargrove, P.H.¹ Duell, J.C.²

15
- 80455151190
- PhD thesis, Indiana University, Bloomington, IN, USA July
- Hursey, J.: Coordinated Checkpoint/Restart Process Fault Tolerance for MPI Applications on HPC Systems. PhD thesis, Indiana University, Bloomington, IN, USA (July 2010)
- (2010) Coordinated Checkpoint/Restart Process Fault Tolerance for MPI Applications on HPC Systems
- Hursey, J.¹

16
- 34548789748
- The design and implementation of checkpoint/restart process fault tolerance for open mpi
- Hursey, J., Squyres, J.M., Mattox, T.I., Lumsdaine, A.: The design and implementation of checkpoint/restart process fault tolerance for open mpi. In: Proc. IEEE Int. Parallel and Distributed Processing Symp. IPDPS 2007, pp. 1-8 (2007)
- (2007) Proc. IEEE Int. Parallel and Distributed Processing Symp. IPDPS 2007 , pp. 1-8
- Hursey, J.¹ Squyres, J.M.² Mattox, T.I.³ Lumsdaine, A.⁴

17
- 85027938495
- Performance analysis of cloud computing services for many-tasks scientific computing
- Iosup, A., Ostermann, S., Yigitbasi, N., Prodan, R., Fahringer, T., Epema, D.: Performance analysis of cloud computing services for many-tasks scientific computing. IEEE Transactions on Parallel and Distributed Systems 22(6), 931-945 (2011)
- (2011) IEEE Transactions on Parallel and Distributed Systems , vol.22 , Issue.6 , pp. 931-945
- Iosup, A.¹ Ostermann, S.² Yigitbasi, N.³ Prodan, R.⁴ Fahringer, T.⁵ Epema, D.⁶

18
- 0003912256
- Technical report, Technical Report
- Litzkow, M., Tannenbaum, T., Basney, J., Livny, M.: Checkpoint and migration of unix processes in the condor distributed processing system. Technical report, Technical Report (1997)
- (1997) Checkpoint and Migration of Unix Processes in the Condor Distributed Processing System
- Litzkow, M.¹ Tannenbaum, T.² Basney, J.³ Livny, M.⁴

19
- 20444492163
- Fault tolerance in mpi programs
- Lusk, E.: Fault tolerance in mpi programs. Special issue of the Journal High Performance Computing Applications, IJHPCA (2002)
- (2002) Special Issue of the Journal High Performance Computing Applications, IJHPCA
- Lusk, E.¹

20
- 78650831692
- Design, modeling, and evaluation of a scalable multi-level checkpointing system
- Moody, A., Bronevetsky, G., Mohror, K., de Supinski, B.R.: Design, modeling, and evaluation of a scalable multi-level checkpointing system. In: Proc. Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC), pp. 1-11 (2010)
- (2010) Proc. Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC) , pp. 1-11
- Moody, A.¹ Bronevetsky, G.² Mohror, K.³ De Supinski, B.R.⁴

21
- 0032597696
- Egida: An extensible toolkit for low-overhead fault-tolerance
- Digest of Papers, IEEE, Los Alamitos
- Rao, S., Alvisi, L., Vin, H.M.: Egida: An extensible toolkit for low-overhead fault-tolerance. In: Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing, 1999. Digest of Papers, pp. 48-55. IEEE, Los Alamitos (1999)
- (1999) Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing, 1999 , pp. 48-55
- Rao, S.¹ Alvisi, L.² Vin, H.M.³

22
- 81555198850
- Program scalability analysis
- Shi, J.Y.: Program scalability analysis. In: International Conference on Distributed and Parallel Processing. Georgetown University, Washington D.C (1997)
- International Conference on Distributed and Parallel Processing. Georgetown University, Washington D.C (1997)
- Shi, J.Y.¹

23
- 81455155109
- Sustainable gpu computing at scale
- Shi, J.Y., Taifi, M., Khreishah, A., Wu, J.: Sustainable gpu computing at scale. In: 14th IEEE International Conference in Computational Science and Engineering 2011 (2011)
- (2011) 14th IEEE International Conference in Computational Science and Engineering 2011
- Shi, J.Y.¹ Taifi, M.² Khreishah, A.³ Wu, J.⁴

24
- 0029713612
- Cocheck: Checkpointing and process migration for mpi
- IEEE Computer Society, Washington, DC, USA
- Stellner, G.: Cocheck: Checkpointing and process migration for mpi. In: Proceedings of the 10th International Parallel Processing Symposium, IPPS 1996, pp. 526-531. IEEE Computer Society, Washington, DC, USA (1996)
- (1996) Proceedings of the 10th International Parallel Processing Symposium, IPPS 1996 , pp. 526-531
- Stellner, G.¹

25
- 77949790526
- High-performance cloud computing: A view of scientific applications
- Vecchiola, C., Pandey, S., Buyya, R.: High-performance cloud computing: A view of scientific applications. In: Proc. 10th Int. Pervasive Systems, Algorithms, and Networks (ISPAN) Symp., pp. 4-16 (2009)
- (2009) Proc. 10th Int. Pervasive Systems, Algorithms, and Networks (ISPAN) Symp. , pp. 4-16
- Vecchiola, C.¹ Pandey, S.² Buyya, R.³

26
- 77957960970
- Reducing costs of spot instances via checkpointing in the amazon elastic compute cloud
- IEEE, Los Alamitos
- Yi, S., Kondo, D., Andrzejak, A.: Reducing costs of spot instances via checkpointing in the amazon elastic compute cloud. In: 2010 IEEE 3rd International Conference on Cloud Computing, pp. 236-243. IEEE, Los Alamitos (2010)
- (2010) 2010 IEEE 3rd International Conference on Cloud Computing , pp. 236-243
- Yi, S.¹ Kondo, D.² Andrzejak, A.³

27
- 84976846528
- A first order approximation to the optimum checkpoint interval
- Young, J.W.: A first order approximation to the optimum checkpoint interval. Communications of the ACM 17(9), 530-531 (1974)
- (1974) Communications of the ACM , vol.17 , Issue.9 , pp. 530-531
- Young, J.W.¹

28
- 48249138490
- Evaluating the performance impact of xen on mpi and process execution for hpc systems
- IEEE Computer Society, Los Alamitos
- Youseff, L., Wolski, R., Gorda, B., Krintz, C.: Evaluating the performance impact of xen on mpi and process execution for hpc systems. In: Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed computing, p. 1. IEEE Computer Society, Los Alamitos (2006)
- (2006) Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed Computing , pp. 1
- Youseff, L.¹ Wolski, R.² Gorda, B.³ Krintz, C.⁴

29
- 84908019185
- Dynamic resource allocation for spot markets in clouds
- Zhang, Q., Gürses, E., Boutaba, R., Xiao, J.: Dynamic resource allocation for spot markets in clouds. In: Proceedings of the 11th USENIX Conference on Hot Topics in Management of Internet, Cloud, and Enterprise Networks and Services (2011)
- Proceedings of the 11th USENIX Conference on Hot Topics in Management of Internet, Cloud, and Enterprise Networks and Services (2011)
- Zhang, Q.¹ Gürses, E.² Boutaba, R.³ Xiao, J.⁴

30
- 20444463494
- Ftc-charm++: An in-memory checkpoint-based fault tolerant runtime for charm++ and mpi
- IEEE, Los Alamitos
- Zheng, G., Shi, L., Kalé, L.V.: Ftc-charm++: An in-memory checkpoint-based fault tolerant runtime for charm++ and mpi. In: 2004 IEEE International Conference on Cluster Computing, pp. 93-103. IEEE, Los Alamitos (2004)
- (2004) 2004 IEEE International Conference on Cluster Computing , pp. 93-103
- Zheng, G.¹ Shi, L.² Kalé, L.V.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.