SCOPUS Information Search Platform

Proceedings - 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security and 2015 IEEE 12th International Conference on Embedded Software and Systems, HPCC-CSS-ICESS 2015

Volume , Issue , 2015, Pages 694-699

A data-oriented method for scheduling dependent tasks on high-density multi-GPU systems

(3) Zhang, Peng (a); Gao, Yuxiang (b); Qiu, Meikang (c)

a St Francis Hospital The Heart Center (United States)

b Cray Inc (United States)

c Pace University (United States)

Author keywords

Data scheduling; Heterogeneous multi GPU systems; Parallel computing; Task scheduling

Indexed keywords

COMPUTER ARCHITECTURE; COMPUTER PROGRAMMING; COMPUTERS; DESIGN; EMBEDDED SOFTWARE; EMBEDDED SYSTEMS; NETWORK SECURITY; PARALLEL PROCESSING SYSTEMS; PROGRAM PROCESSORS; SCHEDULING;

DATA ORIENTED METHODS; DATA SCHEDULING; DATA-INTENSIVE APPLICATION; MATRIX MULTIPLICATION; MULTI-GPU SYSTEMS; PROCESSING ELEMENTS; PROGRAMMING ENVIRONMENT; TASK-SCHEDULING;

MATRIX ALGEBRA;

EID: 84961692393 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/HPCC-CSS-ICESS.2015.314 Document Type: Conference Paper

Times cited : (11)

References (22)

1
- 77954080759
- Dense linear algebra solvers for multicore with GPU accelerators
- S. Tomov, R. Nath, H. Ltaief, and J. Dongarra, "Dense linear algebra solvers for multicore with GPU accelerators," in Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on, 2010, pp. 1-8.
- (2010) Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on , pp. 1-8
- Tomov, S.¹ Nath, R.² Ltaief, H.³ Dongarra, J.⁴

2
- 78650883751
- Interlacing bypass rings to torus networks for more efficient networks
- P. Zhang, R. Powell, and Y. Deng, "Interlacing Bypass Rings to Torus Networks for More Efficient Networks," Parallel and Distributed Systems, IEEE Transactions on, vol. 22, pp. 287-295, 2011.
- (2011) Parallel and Distributed Systems, IEEE Transactions on , vol.22 , pp. 287-295
- Zhang, P.¹ Powell, R.² Deng, Y.³

3
- 84908117733
- CRC Press, Inc.
- J. Kurzak, D. A. Bader, and J. Dongarra, Scientific Computing with Multicore and Accelerators: CRC Press, Inc., 2010.
- (2010) Scientific Computing with Multicore and Accelerators
- Kurzak, J.¹ Bader, D.A.² Dongarra, J.³

4
- 84881127643
- Sparse matrix-vector multiplication on NVIDIA GPU
- H. Liu, S. Yu, Z. Chen, B. Hsieh, and L. Shao, "Sparse matrix-vector multiplication on NVIDIA GPU," Int. J. Numer. Anal. Model, vol. 3, pp. 185-191, 2012.
- (2012) Int. J. Numer. Anal. Model , vol.3 , pp. 185-191
- Liu, H.¹ Yu, S.² Chen, Z.³ Hsieh, B.⁴ Shao, L.⁵

5
- 79951616240
- Revealing feasibility of FMM on ASIC: Efficient implementation of N-Body problem on FPGA
- Z. Zheng, Y. Zhu, X. Wang, Z. Que, T. Huang, X. Yin, H. Wang, G. Rong, and M. Qiu, "Revealing feasibility of FMM on ASIC: efficient implementation of N-Body problem on FPGA," in Computational Science and Engineering (CSE), 2010 IEEE 13th International Conference on, 2010, pp. 132-139.
- (2010) Computational Science and Engineering (CSE), 2010 IEEE 13th International Conference on , pp. 132-139
- Zheng, Z.¹ Zhu, Y.² Wang, X.³ Que, Z.⁴ Huang, T.⁵ Yin, X.⁶ Wang, H.⁷ Rong, G.⁸ Qiu, M.⁹

6
- 84969913998
- BLAS for GPUs
- Kurzak J, Bader DA, Dongarra J (eds). CRC Press: Boca Raton, FL
- R. Nath, S. Tomov, and J. Dongarra, "BLAS for GPUs," Scientific Computing with Multicore and Accelerators, Kurzak J, Bader DA, Dongarra J (eds). CRC Press: Boca Raton, FL, 2010.
- (2010) Scientific Computing with Multicore and Accelerators
- Nath, R.¹ Tomov, S.² Dongarra, J.³

7
- 84928141293
- GPU-based parallel reservoir simulation for large-scale simulation problems
- S. Yu, H. Liu, Z. J. Chen, B. Hsieh, and L. Shao, "GPU-based parallel reservoir simulation for large-scale simulation problems," in SPE Europec/EAGE Annual Conference, 2012.
- (2012) SPE Europec/EAGE Annual Conference
- Yu, S.¹ Liu, H.² Chen, Z.J.³ Hsieh, B.⁴ Shao, L.⁵

8
- 84983744589
- Matrix multiplication on high-density multi-GPU architectures: Theoretical and experimental investigations
- J. M. Kunkel and T. Ludwig, Eds., ed: Springer International Publishing
- P. Zhang and Y. Gao, "Matrix Multiplication on High-Density Multi-GPU Architectures: Theoretical and Experimental Investigations," in High Performance Computing. vol. 9137, J. M. Kunkel and T. Ludwig, Eds., ed: Springer International Publishing, 2015, pp. 17-30.
- (2015) High Performance Computing , vol.9137 , pp. 17-30
- Zhang, P.¹ Gao, Y.²

9
- 79951765394
- Data-aware task scheduling on multi-accelerator based platforms
- C. Augonnet, J. Clet-Ortega, S. Thibault, and R. Namyst, "Data-Aware Task Scheduling on Multi-accelerator Based Platforms," in Parallel and Distributed Systems (ICPADS), 2010 IEEE 16th International Conference on, 2010, pp. 291-298.
- (2010) Parallel and Distributed Systems (ICPADS), 2010 IEEE 16th International Conference on , pp. 291-298
- Augonnet, C.¹ Clet-Ortega, J.² Thibault, S.³ Namyst, R.⁴

10
- 84909993282
- Eigenanalysis-based task mapping on parallel computers with cellular networks
- P. Zhang, Y. Gao, J. Fierson, and Y. Deng, "Eigenanalysis-Based Task Mapping on Parallel Computers with Cellular Networks," Mathematics of Computation, vol. 83, pp. 1727-1756, 2014.
- (2014) Mathematics of Computation , vol.83 , pp. 1727-1756
- Zhang, P.¹ Gao, Y.² Fierson, J.³ Deng, Y.⁴

11
- 70350641505
- StarPU: A unified platform for task scheduling on heterogeneous multicore architectures
- ed: Springer
- C. Augonnet, S. Thibault, R. Namyst, and P.-A. Wacrenier, "StarPU: A unified platform for task scheduling on heterogeneous multicore architectures," in Euro-Par 2009 Parallel Processing, ed: Springer, 2009, pp. 863-874.
- (2009) Euro-Par 2009 Parallel Processing , pp. 863-874
- Augonnet, C.¹ Thibault, S.² Namyst, R.³ Wacrenier, P.-A.⁴

12
- 84655174868
- DAGuE: A generic distributed DAG engine for high performance computing
- G. Bosilca, A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier, and J. Dongarra, "DAGuE: A generic distributed DAG engine for high performance computing," Parallel Computing, vol. 38, pp. 37-51, 2012.
- (2012) Parallel Computing , vol.38 , pp. 37-51
- Bosilca, G.¹ Bouteiller, A.² Danalis, A.³ Herault, T.⁴ Lemarinier, P.⁵ Dongarra, J.⁶

13
- 79959734507
- Ompss: A proposal for programming heterogeneous multi-core architectures
- A. Duran, E. Ayguadé, R. M. Badia, J. Labarta, L. Martinell, X. Martorell, and J. Planas, "Ompss: a proposal for programming heterogeneous multi-core architectures," Parallel Processing Letters, vol. 21, pp. 173-193, 2011.
- (2011) Parallel Processing Letters , vol.21 , pp. 173-193
- Duran, A.¹ Ayguadé, E.² Badia, R.M.³ Labarta, J.⁴ Martinell, L.⁵ Martorell, X.⁶ Planas, J.⁷

14
- 0027593363
- Parallax: A tool for parallel program scheduling
- T. Lewis and H. El-Rewini, "Parallax: A Tool for Parallel Program Scheduling," IEEE Parallel Distrib. Technol., vol. 1, pp. 62-72, 1993.
- (1993) IEEE Parallel Distrib. Technol. , vol.1 , pp. 62-72
- Lewis, T.¹ El-Rewini, H.²

15
- 84936100422
- A data-driven paradigm for mapping problems
- P. Zhang, L. Liu, and Y. Deng, "A data-driven paradigm for mapping problems," Parallel Computing, vol. 48, pp. 108-124, 2015.
- (2015) Parallel Computing , vol.48 , pp. 108-124
- Zhang, P.¹ Liu, L.² Deng, Y.³

16
- 84874887986
- Thermal-aware task scheduling in 3D chip multiprocessor with real-time constrained workloads
- J. Li, M. Qiu, J.-W. Niu, L. T. Yang, Y. Zhu, and Z. Ming, "Thermal-aware task scheduling in 3D chip multiprocessor with real-time constrained workloads," ACM Transactions on Embedded Computing Systems (TECS), vol. 12, p. 24, 2013.
- (2013) ACM Transactions on Embedded Computing Systems (TECS) , vol.12 , pp. 24
- Li, J.¹ Qiu, M.² Niu, J.-W.³ Yang, L.T.⁴ Zhu, Y.⁵ Ming, Z.⁶

17
- 84857684600
- arXiv preprint arXiv:1010.2000
- H. Bouwmeester and J. Langou, "A critical path approach to analyzing parallelism of algorithmic variants. Application to Cholesky inversion," arXiv preprint arXiv:1010.2000, 2010.
- (2010) A Critical Path Approach to Analyzing Parallelism of Algorithmic Variants. Application to Cholesky Inversion
- Bouwmeester, H.¹ Langou, J.²

18
- 84862806683
- Online optimization for scheduling preemptable tasks on IaaS cloud systems
- J. Li, M. Qiu, Z. Ming, G. Quan, X. Qin, and Z. Gu, "Online optimization for scheduling preemptable tasks on IaaS cloud systems," Journal of Parallel and Distributed Computing, vol. 72, pp. 666-677, 2012.
- (2012) Journal of Parallel and Distributed Computing , vol.72 , pp. 666-677
- Li, J.¹ Qiu, M.² Ming, Z.³ Quan, G.⁴ Qin, X.⁵ Gu, Z.⁶

19
- 80155187624
- Optimal data allocation for scratch-pad memory on embedded multi-core systems
- Y. Guo, Q. Zhuge, J. Hu, M. Qiu, and E. H.-M. Sha, "Optimal data allocation for scratch-pad memory on embedded multi-core systems," in Parallel Processing (ICPP), 2011 International Conference on, 2011, pp. 464-471.
- (2011) Parallel Processing (ICPP), 2011 International Conference on , pp. 464-471
- Guo, Y.¹ Zhuge, Q.² Hu, J.³ Qiu, M.⁴ Sha, E.H.-M.⁵

20
- 77953997924
- Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
- E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, and S. Tomov, "Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects," in Journal of Physics: Conference Series, 2009, p. 012037.
- (2009) Journal of Physics: Conference Series , pp. 012037
- Agullo, E.¹ Demmel, J.² Dongarra, J.³ Hadri, B.⁴ Kurzak, J.⁵ Langou, J.⁶ Ltaief, H.⁷ Luszczek, P.⁸ Tomov, S.⁹

21
- 78349252088
- SkePU: A multi-backend skeleton programming library for multi-GPU systems
- J. Enmyren and C. W. Kessler, "SkePU: a multi-backend skeleton programming library for multi-GPU systems," in Proceedings of the fourth international workshop on High-level parallel programming and applications, 2010, pp. 5-14.
- (2010) Proceedings of the Fourth International Workshop on High-level Parallel Programming and Applications , pp. 5-14
- Enmyren, J.¹ Kessler, C.W.²

22
- 0036467470
- PASTIX: A high-performance parallel direct solver for sparse symmetric positive definite systems
- P. Hénon, P. Ramet, and J. Roman, "PASTIX: a high-performance parallel direct solver for sparse symmetric positive definite systems," Parallel Computing, vol. 28, pp. 301-321, 2002.
- (2002) Parallel Computing , vol.28 , pp. 301-321
- Hénon, P.¹ Ramet, P.² Roman, J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.