SCOPUS Information Retrieval Platform

Volume 38, Issue 1-2, 2012, Pages 37-51

DAGuE: A generic distributed DAG engine for High Performance Computing

(6) Bosilca, George (a); Bouteiller, Aurelien (a); Danalis, Anthony (a); Herault, Thomas (a); Lemarinier, Pierre (b); Dongarra, Jack (a, c)

a University of Tennessee (United States)

b UNIVERSITÉ DE RENNES 1 (France)

c OAK RIDGE NATIONAL LABORATORY (United States)

Author keywords

Architecture aware scheduling; Heterogeneous architectures; HPC; Micro task DAG

Indexed keywords

ACYCLIC GRAPHS; DATA DEPENDENCIES; GENERIC FRAMEWORKS; HETEROGENEOUS ARCHITECTURES; HIGH PERFORMANCE COMPUTING; HPC; MANY-CORE; MICRO-TASK DAG; MICROTASKS; PROGRAMMING ENVIRONMENT;

ARTS COMPUTING; BENCHMARKING; COMPUTER SOFTWARE SELECTION AND EVALUATION; LINEAR ALGEBRA; SCHEDULING;

ARCHITECTURE;

EID: 84655174868 PISSN: 01678191 EISSN: None Source Type: Journal
DOI: 10.1016/j.parco.2011.10.003 Document Type: Article

Times cited: 211

References (33)

1
- 84938023119
- Analysis of programs for parallel processing
- A.J. Bernstein Analysis of programs for parallel processing IEEE Transactions on Electronic Computers EC-15 5 1966 757 763
- (1966) IEEE Transactions on Electronic Computers , vol.EC-15 , Issue.5 , pp. 757-763
- Bernstein, A.J.¹

2
- 0004230536
- Prentice Hall Professional Technical Reference
- E.G. Coffman, Jr., P.J. Denning, Operating Systems Theory, Prentice Hall Professional Technical Reference, 1973.
- (1973) Operating Systems Theory
- Coffman Jr., E.G.¹ Denning, P.J.²

3
- 0010898812
- Ablex Publishing Corp
- J.A. Sharp Data Flow Computing: Theory and Practice 1992 Ablex Publishing Corp
- (1992) Data Flow Computing: Theory and Practice
- Sharp, J.A.¹

4
- 33244454775
- A taxonomy of workflow management systems for grid computing
- J. Yu, R. Buyya, A taxonomy of workflow management systems for grid computing, Tech. rep., Journal of Grid Computing, 2005.
- (2005) Tech. Rep., Journal of Grid Computing
- Yu, J.¹ Buyya, R.²

5
- 46149113240
- Workflow global computing with YML
- O. Delannoy, N. Emad, S. Petiton, Workflow global computing with YML, in: 7th IEEE/ACM International Conference on Grid Computing, 2006.
- (2006) 7th IEEE/ACM International Conference on Grid Computing
- Delannoy, O.¹ Emad, N.² Petiton, S.³

6
- 38049058008
- The impact of multicore on math software
- Lecture Notes in Computer Science Springer
- A. Buttari, J. Dongarra, J. Kurzak, J. Langou, P. Luszczek, and S. Tomov The impact of multicore on math software Applied Parallel Computing. State of the Art in Scientific Computing, 8th International Workshop, PARA Lecture Notes in Computer Science vol. 4699 2006 Springer 1 10
- (2006) Applied Parallel Computing. State of the Art in Scientific Computing, 8th International Workshop, PARA , vol.4699 , pp. 1-10
- Buttari, A.¹ Dongarra, J.² Kurzak, J.³ Langou, J.⁴ Luszczek, P.⁵ Tomov, S.⁶

7
- 67650056933
- Supermatrix: A multithreaded runtime scheduling system for algorithms-by-blocks
- ACM
- E. Chan, F.G. Van Zee, P. Bientinesi, E.S. Quintana-Ortí, G. Quintana-Ortí, and R. van de Geijn Supermatrix: a multithreaded runtime scheduling system for algorithms-by-blocks PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming 2008 ACM 123 132
- (2008) PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming , pp. 123-132
- Chan, E.¹ Van Zee, F.G.² Bientinesi, P.³ Quintana-Ortí, E.S.⁴ Quintana-Ortí, G.⁵ Van De Geijn, R.⁶

8
- 77953997924
- Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
- E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, S. Tomov, Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, Journal of Physics: Conference Series 180.
- Journal of Physics: Conference Series , vol.180
- Agullo, E.¹ Demmel, J.² Dongarra, J.³ Hadri, B.⁴ Kurzak, J.⁵ Langou, J.⁶ Ltaief, H.⁷ Luszczek, P.⁸ Tomov, S.⁹

9
- 83455228043
- HMPP: A hybrid multi-core parallel programming environment
- R. Dolbeau, S. Bihan, F. Bodin, HMPP: A hybrid multi-core parallel programming environment, in: Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 2007), 2007.
- (2007) Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 2007)
- Dolbeau, R.¹ Bihan, S.² Bodin, F.³

10
- 78651103346
- StarPU: A unified platform for task scheduling on heterogeneous multicore architectures
- C. Augonnet, S. Thibault, R. Namyst, and P.-A. Wacrenier StarPU: a unified platform for task scheduling on heterogeneous multicore architectures Concurrency and Computation: Practice and Experience 23 2 2011 187 198
- (2011) Concurrency and Computation: Practice and Experience , vol.23 , Issue.2 , pp. 187-198
- Augonnet, C.¹ Thibault, S.² Namyst, R.³ Wacrenier, P.-A.⁴

11
- 57949083229
- A dependency-aware task-based programming environment for multi-core architectures
- J. Perez, R. Badia, J. Labarta, A dependency-aware task-based programming environment for multi-core architectures, in: IEEE International Conference on Cluster Computing, 2008, pp. 142-151.
- (2008) IEEE International Conference on Cluster Computing , pp. 142-151
- Perez, J.¹ Badia, R.² Labarta, J.³

12
- 74049102092
- Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems
- ACM New York, NY, USA
- F. Song, A. YarKhan, and J. Dongarra Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis 2009 ACM New York, NY, USA 1 11
- (2009) SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis , pp. 1-11
- Song, F.¹ Yarkhan, A.² Dongarra, J.³

13
- 84655160745
- StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
- LNCS, Delft, The Netherlands
- C. Augonnet, S. Thibault, R. Namyst, P.-A. Wacrenier, StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, in: Euro-Par 2009 Euro-par'09 Proceedings, LNCS, Delft, The Netherlands, 2009.
- (2009) Euro-Par 2009 Euro-par'09 Proceedings
- Augonnet, C.¹ Thibault, S.² Namyst, R.³ Wacrenier, P.-A.⁴

14
- 0035266229
- Automatic parallelization techniques based on compact DAG extraction and symbolic scheduling
- M. Cosnard, and E. Jeannot Automatic parallelization techniques based on compact DAG extraction and symbolic scheduling Parallel Processing Letters 11 2001 151 168 (Pubitemid 32697656)
- (2001) Parallel Processing Letters , vol.11 , Issue.1 , pp. 151-168
- Cosnard, M.¹ Jeannot, E.²

15
- 10844269765
- Compact DAG representation and its symbolic scheduling
- DOI 10.1016/j.jpdc.2004.05.001
- M. Cosnard, E. Jeannot, and T. Yang Compact DAG representation and its symbolic scheduling Journal of Parallel and Distributed Computing 64 8 2004 921 935 (Pubitemid 40000764)
- (2004) Journal of Parallel and Distributed Computing , vol.64 , Issue.8 , pp. 921-935
- Cosnard, M.¹ Jeannot, E.² Yang, T.³

16
- 83455228041
- Automatic multithreaded parallel program generation for message passing multiprocessors using parameterized task graphs
- E. Jeannot, Automatic multithreaded parallel program generation for message passing multiprocessors using parameterized task graphs, in: International Conference 'Parallel Computing 2001' (ParCo2001), 2001.
- (2001) International Conference 'Parallel Computing 2001' (ParCo2001)
- Jeannot, E.¹

17
- 56749169455
- Multi-threading and one-sided communication in parallel LU factorization
- B. Verastegui, ACM Press
- P. Husbands, and K.A. Yelick Multi-threading and one-sided communication in parallel LU factorization B. Verastegui, Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, SC 2007, November 10-16, 2007, Reno, Nevada, USA 2007 ACM Press
- (2007) Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, SC 2007, November 10-16, 2007, Reno, Nevada, USA
- Husbands, P.¹ Yelick, K.A.²

18
- 65849486487
- Distributed SBP Cholesky factorization algorithms with near-optimal scheduling
- F.G. Gustavson, L. Karlsson, and B. Kågström Distributed SBP Cholesky factorization algorithms with near-optimal scheduling ACM Transactions on Mathematical Software 36 2 2009 1 25
- (2009) ACM Transactions on Mathematical Software , vol.36 , Issue.2 , pp. 1-25
- Gustavson, F.G.¹ Karlsson, L.² Kågström, B.³

19
- 0026278958
- The omega test: A fast and practical integer programming algorithm for dependence analysis
- New York, NY, USA
- W. Pugh, The omega test: a fast and practical integer programming algorithm for dependence analysis, in: Supercomputing '91: Proceedings of the 1991 ACM/IEEE Conference on Supercomputing, New York, NY, USA, 1991, pp. 4-13.
- (1991) Supercomputing '91: Proceedings of the 1991 ACM/IEEE Conference on Supercomputing , pp. 4-13
- Pugh, W.¹

20
- 0033645154
- The data locality of work stealing
- U.A. Acar, G.E. Blelloch, R.D. Blumofe, The data locality of work stealing, in: SPAA'00, 2000, pp. 1-12.
- (2000) SPAA'00 , pp. 1-12
- Acar, U.A.¹ Blelloch, G.E.² Blumofe, R.D.³

21
- 77952719747
- Hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications
- IEEE (Ed.) Pisa Italy
- F. Broquedis, J. Clet Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault, R. Namyst, hwloc: a Generic Framework for Managing Hardware Affinities in HPC Applications, in: IEEE (Ed.), PDP 2010 - The 18th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, Pisa Italy, 2010.
- (2010) PDP 2010 - The 18th Euromicro International Conference on Parallel, Distributed and Network-Based Computing
- Broquedis, F.¹ Clet Ortega, J.² Moreaud, S.³ Furmento, N.⁴ Goglin, B.⁵ Mercier, G.⁶ Thibault, S.⁷ Namyst, R.⁸

22
- 38049054439
- Minimal data copy for dense linear algebra factorization
- LNCS Umeå, Sweden
- F.G. Gustavson, J.A. Gunnels, and J.C. Sexton Minimal data copy for dense linear algebra factorization Applied Parallel Computing, State of the Art in Scientific Computing, 8th International Workshop, PARA 2006 vol. 4699 2006 LNCS Umeå, Sweden 540 549
- (2006) Applied Parallel Computing, State of the Art in Scientific Computing, 8th International Workshop, PARA 2006 , vol.4699 , pp. 540-549
- Gustavson, F.G.¹ Gunnels, J.A.² Sexton, J.C.³

23
- 0003454919
- Philadelphia, PA, USA
- G.W. Stewart, Matrix algorithms, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2001.
- (2001) Matrix Algorithms, Society for Industrial and Applied Mathematics
- Stewart, G.W.¹

24
- 58149269099
- A class of parallel tiled linear algebra algorithms for multicore architectures
- A. Buttari, J. Langou, J. Kurzak, and J. Dongarra A class of parallel tiled linear algebra algorithms for multicore architectures Parallel Computing 35 1 2009 38 53
- (2009) Parallel Computing , vol.35 , Issue.1 , pp. 38-53
- Buttari, A.¹ Langou, J.² Kurzak, J.³ Dongarra, J.⁴

25
- 50249105132
- Parallel tiled QR factorization for multicore architectures
- A. Buttari, J. Langou, J. Kurzak, and J.J. Dongarra Parallel tiled QR factorization for multicore architectures Concurrency and Computation: Practice and Experience 20 13 2008 1573 1590
- (2008) Concurrency and Computation: Practice and Experience , vol.20 , Issue.13 , pp. 1573-1590
- Buttari, A.¹ Langou, J.² Kurzak, J.³ Dongarra, J.J.⁴

26
- 0003078924
- A storage-efficient WY representation for products of Householder transformations
- R. Schreiber, and C. van Loan A storage-efficient WY representation for products of Householder transformations J. Sci. Stat. Comput. 10 1991 53 57
- (1991) J. Sci. Stat. Comput. , vol.10 , pp. 53-57
- Schreiber, R.¹ Van Loan, C.²

27
- 48849086742
- Updating an LU factorization with pivoting
- E.S. Quintana-Ortí, and R.A. van de Geijn Updating an LU factorization with pivoting ACM Transactions on Mathematical Software 35 2 2008 11
- (2008) ACM Transactions on Mathematical Software , vol.35 , Issue.2 , pp. 11
- Quintana-Ortí, E.S.¹ Van De Geijn, R.A.²

28
- 33750205459
- Grid'5000: A large scale and highly reconfigurable experimental Grid testbed
- DOI 10.1177/1094342006070078
- R. Bolze, F. Cappello, E. Caron, M.J. Daydé, F. Desprez, E. Jeannot, Y. Jégou, S. Lanteri, J. Leduc, N. Melab, G. Mornet, R. Namyst, P. Primet, B. Quétier, O. Richard, E.-G. Talbi, and I. Touche Grid'5000: a large scale and highly reconfigurable experimental grid testbed IJHPCA 20 4 2006 481 494 (Pubitemid 44605333)
- (2006) International Journal of High Performance Computing Applications , vol.20 , Issue.4 , pp. 481-494
- Bolze, R.¹ Cappello, F.² Caron, E.³ Dayde, M.⁴ Desprez, F.⁵ Jeannot, E.⁶ Jegou, Y.⁷ Lanteri, S.⁸ Leduc, J.⁹ Melab, N.¹⁰ Mornet, G.¹¹ Namyst, R.¹² Primet, P.¹³ Quetier, B.¹⁴ Richard, O.¹⁵ Talbi, E.-G.¹⁶ Touche, I.¹⁷

29
- 27144559253
- ScaLAPACK: A linear algebra library for message-passing computers
- SIAM
- L.S. Blackford, J. Choi, A.J. Cleary, E.F. D'Azevedo, J. Demmel, I.S. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D.W. Walker, and R.C. Whaley ScaLAPACK: a linear algebra library for message-passing computers Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing 1997 SIAM
- (1997) Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing
- Blackford, L.S.¹ Choi, J.² Cleary, A.J.³ D'Azevedo, E.F.⁴ Demmel, J.⁵ Dhillon, I.S.⁶ Dongarra, J.⁷ Hammarling, S.⁸ Henry, G.⁹ Petitet, A.¹⁰ Stanley, K.¹¹ Walker, D.W.¹² Whaley, R.C.¹³

30
- 0042674307
- The LINPACK benchmark: Past, present and future
- J.J. Dongarra, P. Luszczek, and A. Petitet The LINPACK benchmark: past, present and future Concurrency and Computation: Practice and Experience 15 9 2003 803 820
- (2003) Concurrency and Computation: Practice and Experience , vol.15 , Issue.9 , pp. 803-820
- Dongarra, J.J.¹ Luszczek, P.² Petitet, A.³

31
- 0040617241
- ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance
- Applied Parallel Computing: Computations in Physics, Chemistry and Engineering Science
- J. Choi, J. Demmel, I.S. Dhillon, J. Dongarra, S. Ostrouchov, A. Petitet, K. Stanley, D.W. Walker, and R.C. Whaley ScaLAPACK: a portable linear algebra library for distributed memory computers - design issues and performance J. Dongarra, K. Madsen, J. Wasniewski, Applied Parallel Computing, Computations in Physics, Chemistry and Engineering Science, Second International Workshop, PARA '95, Lyngby, Denmark, August 21-24, 1995, Proceedings Lecture Notes in Computer Science vol. 1041 1995 Springer 95 106 (Pubitemid 126043350)
- (1996) Lecture Notes in Computer Science , Issue.1041 , pp. 95-106
- Choi, J.¹ Demmel, J.² Dhillon, I.³ Dongarra, J.⁴ Ostrouchov, S.⁵ Petitet, A.⁶ Stanley, K.⁷ Walker, D.⁸ Whaley, R.C.⁹

32
- 1242291308
- Netpipe: A network protocol independent performance evaluator
- Q.O. Snell, A.R. Mikler, J.L. Gustafson, Netpipe: A network protocol independent performance evaluator, in: IASTED International Conference on Intelligent Information Management and Systems, 1996.
- (1996) IASTED International Conference on Intelligent Information Management and Systems
- Snell, Q.O.¹ Mikler, A.R.² Gustafson, J.L.³

33
- 78649256719
- The international exascale software project roadmap
- J. Dongarra, P. Beckman, et al., The international exascale software project roadmap, Tech. rep., IESP, 2011, http://www.exascale.org/iesp.
- (2011) Tech. Rep., IESP
- Dongarra, J.¹ Beckman, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.