SCOPUS Information Retrieval Platform

Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC

2013, Pages 338-343

Cache Capacity Aware Thread Scheduling for Irregular Memory Access on Many-Core GPGPUs

Authors (4): Kuo, Hsien Kai (a); Yen, Ta Kan (a); Lai, Bo Cheng Charles (a); Jou, Jing Yang (a)

(a) National Chiao Tung University (Taiwan)

Author keywords

[No Author keywords available]

Indexed keywords

CACHE MISS REDUCTION; COMPLEX APPLICATIONS; CONCURRENT THREADS; IRREGULAR APPLICATIONS; MEMORY BOTTLENECK; PERFORMANCE DEGRADATION; SCHEDULING SCHEMES; THREAD SCHEDULING;

COMPUTER AIDED DESIGN; PROGRAM PROCESSORS;

SCHEDULING;

EID: 84877739484 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ASPDAC.2013.6509618 Document Type: Conference Paper

Times cited : (12)

References (16)

1
- 33847112910
- A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture
- Y. P. Zhang, T. Jeong, F. Chen, H. P. Wu, R. Nitzsche, and G. R. Gao, "A Study of the On-Chip Interconnection Network for the IBM Cyclops64 Multi-Core Architecture," in Int'l Parallel and Distributed Processing Symp., 2006.
- (2006) Int'l Parallel and Distributed Processing Symp.
- Zhang, Y.P.¹ Jeong, T.² Chen, F.³ Wu, H.P.⁴ Nitzsche, R.⁵ Gao, G.R.⁶

2
- 79955435088
- Fermi GF100 GPU architecture
- C. M. Wittenbrink, E. Kilgariff, and A. Prabhu, "Fermi GF100 GPU Architecture," IEEE Micro, vol. 31, pp. 50-59, 2011.
- (2011) IEEE Micro, vol. 31, pp. 50-59
- Wittenbrink, C.M.¹ Kilgariff, E.² Prabhu, A.³

3
- 35948991669
- NVIDIA. (2012). NVIDIA CUDA C Programming Guide 4.1. Available: http://www.nvidia.com/object/cuda-home-new.html
- (2012) NVIDIA CUDA C Programming Guide 4.1

4
- 84860003663
- Thread affinity mapping for irregular data access on shared cache GPGPU
- H.-K. Kuo, K.-T. Chen, B.-C. C. Lai, and J.-Y. Jou, "Thread Affinity Mapping for Irregular Data Access on Shared Cache GPGPU," in Asia and South Pacific Design Automation Conf., 2012.
- (2012) Asia and South Pacific Design Automation Conf.
- Kuo, H.-K.¹ Chen, K.-T.² Lai, B.-C.C.³ Jou, J.-Y.⁴

5
- 33745715056
- Exploiting locality for irregular scientific codes
- H. Han and C.-W. Tseng, "Exploiting Locality for Irregular Scientific Codes," IEEE Trans. Parallel and Distributed Systems, vol. 17, pp. 606-618, 2006.
- (2006) IEEE Trans. Parallel and Distributed Systems, vol. 17, pp. 606-618
- Han, H.¹ Tseng, C.-W.²

6
- 0001483604
- Communication optimizations for irregular scientific computations on distributed memory architectures
- R. Das, M. Uysal, J. Saltz, and Y.-S. Hwang, "Communication Optimizations for Irregular Scientific Computations on Distributed Memory Architectures," J. Parallel Distrib. Comput., vol. 22, pp. 462-478, 1994.
- (1994) J. Parallel Distrib. Comput., vol. 22, pp. 462-478
- Das, R.¹ Uysal, M.² Saltz, J.³ Hwang, Y.-S.⁴

7
- 76349105923
- Taming irregular EDA applications on GPUs
- Y. Deng, B. D. Wang, and M. Shuai, "Taming Irregular EDA Applications on GPUs," in Int'l Conf. Computer-Aided Design, 2009.
- (2009) Int'l Conf. Computer-Aided Design
- Deng, Y.¹ Wang, B.D.² Shuai, M.³

8
- 79953126288
- On-the-fly elimination of dynamic irregularities for GPU computing
- E. Z. Zhang, Y. Jiang, Z. Guo, K. Tian, and X. Shen, "On-the-Fly Elimination of Dynamic Irregularities for GPU Computing," in Int'l Conf. Architectural Support for Programming Languages and Operating Systems, 2011.
- (2011) Int'l Conf. Architectural Support for Programming Languages and Operating Systems
- Zhang, E.Z.¹ Jiang, Y.² Guo, Z.³ Tian, K.⁴ Shen, X.⁵

9
- 70349169075
- Analyzing CUDA workloads using a detailed GPU simulator
- A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt, "Analyzing CUDA Workloads Using a Detailed GPU Simulator," presented at the Int'l Symp. Performance Analysis of Systems and Software, 2009.
- (2009) Int'l Symp. Performance Analysis of Systems and Software
- Bakhoda, A.¹ Yuan, G.L.² Fung, W.W.L.³ Wong, H.⁴ Aamodt, T.M.⁵

10
- 21244474546
- Predicting inter-thread cache contention on a chip multi-processor architecture
- D. Chandra, F. Guo, S. Kim, and Y. Solihin, "Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture," in Int'l Symp. High-Performance Computer Architecture, 2005.
- (2005) Int'l Symp. High-Performance Computer Architecture
- Chandra, D.¹ Guo, F.² Kim, S.³ Solihin, Y.⁴

11
- 84863707259
- Working sets past and present
- P. J. Denning, "Working Sets Past and Present," IEEE Trans. Software Engineering, vol. SE-6, pp. 64-84, 1980.
- (1980) IEEE Trans. Software Engineering, vol. SE-6, pp. 64-84
- Denning, P.J.¹

12
- 77955986662
- To GPU synchronize or not GPU synchronize?
- W.-C. Feng and S. Xiao, "To GPU Synchronize or not GPU Synchronize?," in Int'l Symp. Circuits and Systems, 2010.
- (2010) Int'l Symp. Circuits and Systems
- Feng, W.-C.¹ Xiao, S.²

13
- 85031293038
- Worst-case analysis of memory allocation algorithms
- M. R. Garey, R. L. Graham, and J. D. Ullman, "Worst-Case Analysis of Memory Allocation Algorithms," in ACM Symp. Theory of Computing, 1972.
- (1972) ACM Symp. Theory of Computing
- Garey, M.R.¹ Graham, R.L.² Ullman, J.D.³

14
- 0016561620
- Analysis of several task-scheduling algorithms for a model of multiprogramming computer systems
- K. L. Krause, V. Y. Shen, and H. D. Schwetman, "Analysis of Several Task-Scheduling Algorithms for a Model of Multiprogramming Computer Systems," J. ACM, vol. 22, pp. 522-550, 1975.
- (1975) J. ACM, vol. 22, pp. 522-550
- Krause, K.L.¹ Shen, V.Y.² Schwetman, H.D.³

15
- 84877747717
- ITC'99 Benchmarks. Available: http://www.cad.polito.it/downloads/tools/itc99.html

16
- 78149233155
- Ocelot: A dynamic optimization framework for bulk-synchronous applications in heterogeneous systems
- G. F. Diamos, A. R. Kerr, S. Yalamanchili, and N. Clark, "Ocelot: A Dynamic Optimization Framework for Bulk-Synchronous Applications in Heterogeneous Systems," in Int'l Conf. Parallel Architectures and Compilation Techniques, 2010.
- (2010) Int'l Conf. Parallel Architectures and Compilation Techniques
- Diamos, G.F.¹ Kerr, A.R.² Yalamanchili, S.³ Clark, N.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.