SCOPUS Information Retrieval Platform

Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010

Volume , Issue , 2010, Pages

Dynamic load balancing on single- and multi-GPU systems

Authors (4): Chen, Long (a); Villa, Oreste (b); Krishnamoorthy, Sriram (b); Gao, Guang R (a)

a UNIVERSITY OF DELAWARE (United States)

b PACIFIC NORTHWEST NATIONAL LABORATORY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTATIONAL POWER; DYNAMIC LOAD BALANCING; GPU PROGRAMMING; GRAPHICS PROCESSING UNITS; LINEAR SPEED-UP; LOAD BALANCE; LOAD IMBALANCE; LOAD-BALANCING; MANY-CORE; PERFORMANCE IMPROVEMENTS; PROGRAMMING TECHNIQUE; TASK-BASED;

BENCHMARKING; DISTRIBUTED PARAMETER NETWORKS; DYNAMIC LOADS; MOLECULAR DYNAMICS; NETWORK MANAGEMENT; PARALLEL ARCHITECTURES; PROGRAM PROCESSORS;

EID: 77953985375 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/IPDPS.2010.5470413 Document Type: Conference Paper

Times cited : (124)

References (25)

1
- 77954006904
- ATI Stream
- AMD. ATI Stream. http://www.amd.com.

2
- 70350641505
- StarPU: A unified platform for task scheduling on heterogeneous multicore architectures
- Delft, Netherlands
- C. Augonnet, S. Thibault, R. Namyst, and P.-A. Wacrenier. StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures. In Euro-Par 2009, pages 863-874, Delft, Netherlands, 2009.
- Euro-Par 2009 , vol.2009 , pp. 863-874
- Augonnet, C.¹ Thibault, S.² Namyst, R.³ Wacrenier, P.-A.⁴

3
- 0003478393
- G. Bird editor, Oxford University Press
- G. Bird, editor. Molecular gas dynamics and the direct simulation of gas flows : GA Bird Oxford engineering science series: 42. Oxford University Press, 1995.
- (1995) Molecular Gas Dynamics and the Direct Simulation of Gas Flows : GA Bird Oxford Engineering Science Series , vol.42

4
- 70450059008
- Accelerating leukocyte tracking using CUDA: A case study in leveraging manycore coprocessors
- M. Boyer, D. T., S. A., and K. S. Accelerating leukocyte tracking using CUDA: A case study in leveraging manycore coprocessors. In IPDPS 2009, pages 1-12, 2009.
- (2009) IPDPS , vol.2009 , pp. 1-12
- Boyer, M.¹ T, D.² A, S.³ S, K.⁴

5
- 0001782767
- Parallelization of charmm for MIMD machines
- B. Brooks and H. M. Parallelization of Charmm for MIMD Machines. Chemical Design Automation News, 7(16):16-22, 1992.
- (1992) Chemical Design Automation News , vol.7 , Issue.16 , pp. 16-22
- Brooks, B.¹ M, H.²

6
- 77954018826
- On dynamic load balancing on graphics processors
- D. Cederman and P. T. On Dynamic Load Balancing on Graphics Processors. In GH 2008, pages 57-64, 2008.
- (2008) GH 2008 , pp. 57-64
- Cederman, D.¹ T, P.²

7
- 0029179685
- Modeling the benefits of mixed data and task parallelism
- New York, NY, USA, ACM
- S. Chakrabarti, J. Demmel, and K. Yelick. Modeling the benefits of mixed data and task parallelism. In SPAA'95, pages 74-83, New York, NY, USA, 1995. ACM.
- (1995) SPAA'95 , pp. 74-83
- Chakrabarti, S.¹ Demmel, J.² Yelick, K.³

8
- 77953970436
- Parallel molecular dynamics
- March
- T. Clark, M. J.A., and S. L.R. Parallel Molecular Dynamics. In SIAMPP'91, pages 338-344, March 1991.
- (1991) SIAMPP'91 , pp. 338-344
- Clark, T.¹ J, A.M.² L, R.S.³

9
- 0004116989
- McGraw-Hill Higher Education
- T. H. Cormen, C. Stein, R. L. Rivest, and C. E. Leiserson. Introduction to Algorithms. McGraw-Hill Higher Education, 2001.
- (2001) Introduction to Algorithms
- Cormen, T.H.¹ Stein, C.² Rivest, R.L.³ Leiserson, C.E.⁴

10
- 33750913667
- Kd-tree acceleration structures for a gpu raytracer
- New York, NY, USA
- T. Foley and J. Sugerman. Kd-tree acceleration structures for a gpu raytracer. In HWWS'05, pages 15-22, New York, NY, USA, 2005.
- (2005) HWWS'05 , pp. 15-22
- Foley, T.¹ Sugerman, J.²

11
- 84870726202
- D. Frenkel and B. Smit, editors, Academic Press, Inc., Orlando, FL, USA
- D. Frenkel and B. Smit, editors. Understanding Molecular Simulation: From Algorithms to Applications. Academic Press, Inc., Orlando, FL, USA, 1996.
- (1996) Understanding Molecular Simulation: From Algorithms to Applications

12
- 77955990292
- Enabling task parallelism in the cuda scheduler
- M. Guevara, C. Gregg, and S. K. Enabling task parallelism in the cuda scheduler. In PEMA 2009, 2009.
- PEMA 2009 , vol.2009
- Guevara, M.¹ Gregg, C.² K, S.³

13
- 38349041620
- Accelerating large graph algorithms on the gpu using cuda
- P. Harish and N. P.J. Accelerating large graph algorithms on the gpu using cuda. In HiPC, pages 197-208, 2007.
- (2007) HiPC , pp. 197-208
- Harish, P.¹ P, J.N.²

14
- 0025917643
- Wait-free synchronization
- M. Herlihy. Wait-free synchronization. ACM TPLS., 13(1):124-149, 1991.
- (1991) ACM TPLS , vol.13 , Issue.1 , pp. 124-149
- Herlihy, M.¹

15
- 77954019183
- OpenCL
- Khronos. OpenCL. http://www.khronos.org.

16
- 67650046428
- Merge: A programming model for heterogeneous multi-core systems
- M. D. Linderman, J. D. Collins, H. Wang, and T. H. M. Merge: a programming model for heterogeneous multi-core systems. SIGPLAN Not., 43(3):287-296, 2008.
- (2008) SIGPLAN Not. , vol.43 , Issue.3 , pp. 287-296
- Linderman, M.D.¹ Collins, J.D.² Wang, H.³ H, M.T.⁴

17
- 77953994799
- June
- T. Murray. Personal communication, June 2009.
- (2009) Personal Communication
- Murray, T.¹

18
- 34249052630
- Adaptive load balancing for raycasting of non-uniformly bricked volumes
- Parallel Graphics and Visualization
- M. Müller, C. and Strengert and T. Ertl. Adaptive load balancing for raycasting of non-uniformly bricked volumes. Parallel Computing, 33(6):406-419, 2007. Parallel Graphics and Visualization.
- (2007) Parallel Computing , vol.33 , Issue.6 , pp. 406-419
- Müller, M.C.¹ Strengert² Ertl, T.³

19
- 78651550268
- Scalable parallel programming with CUDA
- J. Nickolls, I. Buck, M. G., and K. S. Scalable Parallel Programming with CUDA. Queue, 6(2):40-53, 2008.
- (2008) Queue , vol.6 , Issue.2 , pp. 40-53
- Nickolls, J.¹ Buck, I.² G, M.³ S, K.⁴

20
- 77953976782
- CUDA
- Nvidia. CUDA. http://www.nvidia.com.

21
- 77952873681
- Nvidia
- Nvidia. NVIDIA CUDA Programming Guide 2.3, 2009.
- (2009) NVIDIA CUDA Programming Guide 2.3

22
- 60649087529
- A task parallel algorithm for computing the costs of all-pairs shortest paths on the cuda-compatible gpu
- T. Okuyama, F. I., and K. H. A task parallel algorithm for computing the costs of all-pairs shortest paths on the cuda-compatible gpu. In ISPA'08, pages 284-291, 2008.
- (2008) ISPA'08 , pp. 284-291
- Okuyama, T.¹ I, F.² H, K.³

23
- 79959466764
- Optimization principles and application performance evaluation of a multithreaded gpu using cuda
- S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk, and W. M. Hwu. Optimization principles and application performance evaluation of a multithreaded gpu using cuda. In PPoPP'08, pages 73-82, 2008.
- (2008) PPoPP'08 , pp. 73-82
- Ryoo, S.¹ Rodrigues, C.I.² Baghsorkhi, S.S.³ Stone, S.S.⁴ Kirk, D.B.⁵ Hwu, W.M.⁶

24
- 0026120011
- Molecular dynamics on hypercube parallel computers
- W. Smith. Molecular dynamics on hypercube parallel computers. Computer Physics Communications, 62:229-248, 1991.
- (1991) Computer Physics Communications , vol.62 , pp. 229-248
- Smith, W.¹

25
- 70350771131
- Benchmarking GPUs to tune dense linear algebra
- V. Volkov and J. W. Demmel. Benchmarking GPUs to tune dense linear algebra. In SC 2008, pages 1-11, 2008.
- (2008) SC , vol.2008 , pp. 1-11
- Volkov, V.¹ Demmel, J.W.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.