SCOPUS Information Search Platform

Proceedings - International Symposium on Computer Architecture

Volume , Issue , 2009, Pages 152-163

An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness

(2) Hong, Sunpyo a Kim, Hyesoon b

a Georgia Institute of Technology (United States)

b Georgia Institute of Technology (United States)

Author keywords

Analytical model; CUDA; GPU architecture; Memory level parallelism; Performance estimation; Warp level parallelism

Indexed keywords

ABSOLUTE ERROR; ANALYTICAL MODEL; APPLICATION PERFORMANCE; DEGREE OF MEMORY; DESIGN SPACES; EXECUTION TIME; GEOMETRIC MEAN; GPU COMPUTING; KEY COMPONENT; MEMORY BANDWIDTHS; MEMORY LEVEL PARALLELISMS; MODEL ESTIMATES; MULTI CORE; OVERALL EXECUTION; PARALLEL APPLICATION; PARALLEL MEMORY; PARALLEL PROCESSOR; PARALLEL PROGRAM; PERFORMANCE BOTTLENECKS; PERFORMANCE CHARACTERISTICS; PERFORMANCE ESTIMATION; PROGRAMMING LANGUAGE; SOFTWARE ENGINEERS; THREAD LEVEL PARALLELISM;

COMPUTER GRAPHICS EQUIPMENT; COMPUTER SIMULATION; COMPUTERS; ESTIMATION; MICROPROCESSOR CHIPS; MODELS; PARALLEL ARCHITECTURES; PROGRAM PROCESSORS; QUERY LANGUAGES; WEAVING;

BENCHMARKING;

EID: 70450231944 PISSN: 10636897 EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/1555754.1555775 Document Type: Conference Paper

Times cited : (521)

References (28)

1
- 84869687636
- ATI Mobility Radeon™ HD4850/4870 Graphics-Overview
- ATI Mobility Radeon™ HD4850/4870 Graphics-Overview. http://ati.amd.com/products/radeonhd4800.

2
- 84869680496
- Intel Core2 Quad Processors. http://www.intel.com/products/processor/core2quad.
- Intel Core2 Quad Processors

3
- 79955115474
- NVIDIA GeForce series GTX280, 8800GTX, 8800GT. http://www.nvidia.com/geforce.
- NVIDIA GeForce series GTX280, 8800GTX, 8800GT

4
- 84869678664
- NVIDIA Quadro FX5600. http://www.nvidia.com/quadro.
- NVIDIA Quadro FX5600

5
- 84869664151
- Advanced Micro Devices, Inc
- Advanced Micro Devices, Inc. AMD Brook+. http://ati.amd.com/technology/streamcomputing/AMD-Brookplus.pdf.
- AMD Brook+

6
- 70450275084
- Analyzing CUDA workloads using a detailed GPU simulator
- April
- A. Bakhoda, G. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt. Analyzing CUDA workloads using a detailed GPU simulator. In IEEE ISPASS, April 2009.
- (2009) IEEE ISPASS
- Bakhoda, A.¹ Yuan, G.² Fung, W.W.L.³ Wong, H.⁴ Aamodt, T.M.⁵

7
- 64949101685
- A first-order fine-grained multithreaded throughput model
- X. E. Chen and T. M. Aamodt. A first-order fine-grained multithreaded throughput model. In HPCA, 2009.
- (2009) HPCA
- Chen, X.E.¹ Aamodt, T.M.²

8
- 44849137198
- NVIDIA Tesla: A Unified Graphics and Computing Architecture
- March-April
- E. Lindholm, J. Nickolls, S. Oberman and J. Montrym. NVIDIA Tesla: A Unified Graphics and Computing Architecture. IEEE Micro, 28(2):39-55, March-April 2008.
- (2008) IEEE Micro , vol.28 , Issue.2 , pp. 39-55
- Lindholm, E.¹ Nickolls, J.² Oberman, S.³ Montrym, J.⁴

9
- 70450275082
- M. Fatica, P. LeGresley, I. Buck, J. Stone, J. Phillips, S. Morton, and P. Micikevicius. High Performance Computing with CUDA, SC08, 2008.
- (2008) High Performance Computing with CUDA
- Fatica, M.¹ LeGresley, P.² Buck, I.³ Stone, J.⁴ Phillips, J.⁵ Morton, S.⁶ Micikevicius, P.⁷

10
- 34247369230
- Oct
- A. Glew. MLP yes! ILP no! In ASPLOS Wild and Crazy Idea Session '98, Oct. 1998.
- (1998) MLP yes! ILP no! In ASPLOS Wild and Crazy Idea Session '98
- Glew, A.¹

11
- 84871131547
- GPGPU
- GPGPU. General-Purpose Computation Using Graphics Hardware. http://www.gpgpu.org/.
- General-Purpose Computation Using Graphics Hardware

12
- 70450274279
- An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
- Technical Report TR-2009-003, Atlanta, GA, USA
- S. Hong and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. Technical Report TR-2009-003, Atlanta, GA, USA, 2009.
- (2009)
- Hong, S.¹ Kim, H.²

13
- 84869688479
- fall 2007
- W. Hwu and D. Kirk. Ece 498a11: Programming massively parallel processors, fall 2007. http://courses.ece.uiuc.edu/ece498/a11/.
- Ece 498a11: Programming massively parallel processors
- Hwu, W.¹ Kirk, D.²

14
- 70450275951
- Intel SSE/MMX2/KNI documentation. http://www.intel80386.com/simd/mmx2-doc.html.

15
- 4644299010
- ISCA
- T. S. Karkhanis and J. E. Smith. A first-order superscalar processor model. In ISCA, 2004.
- (2004) A first-order superscalar processor model
- Karkhanis, T.S.¹ Smith, J.E.²

16
- 74349092397
- Khronos. OpenCL - the open standard for parallel programming of heterogeneous systems. http://www.khronos.org/opencl/.
- OpenCL - the open standard for parallel programming of heterogeneous systems

17
- 68149168035
- Merge: A programming model for heterogeneous multi-core systems
- M. D. Linderman, J. D. Collins, H. Wang, and T. H. Meng. Merge: a programming model for heterogeneous multi-core systems. In ASPLOS XIII, 2008.
- (2008) ASPLOS , vol.13
- Linderman, M.D.¹ Collins, J.D.² Wang, H.³ Meng, T.H.⁴

18
- 0034824085
- Data-flow prescheduling for large instruction windows in out-of-order processors
- P. Michaud and A. Seznec. Data-flow prescheduling for large instruction windows in out-of-order processors. In HPCA, 2001.
- (2001) HPCA
- Michaud, P.¹ Seznec, A.²

19
- 0033365427
- Exploring instruction-fetch bandwidth requirement in wide-issue superscalar processors
- P. Michaud, A. Seznec, and S. Jourdan. Exploring instruction-fetch bandwidth requirement in wide-issue superscalar processors. In PACT, 1999.
- (1999) PACT
- Michaud, P.¹ Seznec, A.² Jourdan, S.³

20
- 78651550268
- Scalable Parallel Programming with CUDA
- March-April
- J. Nickolls, I. Buck, M. Garland, and K. Skadron. Scalable Parallel Programming with CUDA. ACM Queue, 6(2):40-53, March-April 2008.
- (2008) ACM Queue , vol.6 , Issue.2 , pp. 40-53
- Nickolls, J.¹ Buck, I.² Garland, M.³ Skadron, K.⁴

21
- 85016676932
- Theoretical modeling of superscalar processor performance
- D. B. Noonburg and J. P. Shen. Theoretical modeling of superscalar processor performance. In MICRO-27, 1994.
- (1994) MICRO-27
- Noonburg, D.B.¹ Shen, J.P.²

22
- 70450258824
- NVIDIA Corporation
- NVIDIA Corporation. CUDA Programming Guide, Version 2.1.
- CUDA Programming Guide, Version 2.1

23
- 34548400750
- Addison-Wesley Professional
- M. Pharr and R. Fernando. GPU Gems 2. Addison-Wesley Professional, 2005.
- (2005) GPU Gems 2
- Pharr, M.¹ Fernando, R.²

24
- 43449094719
- S. Ryoo, C. Rodrigues, S. Stone, S. Baghsorkhi, S. Ueng, J. Stratton, and W. Hwu. Program optimization space pruning for a multithreaded GPU. In CGO, 2008.

25
- 0342373102
- An analytical solution for a Markov chain modeling multithreaded
- Technical report, Berkeley, CA, USA
- R. H. Saavedra-Barrera and D. E. Culler. An analytical solution for a Markov chain modeling multithreaded. Technical report, Berkeley, CA, USA, 1991.
- (1991)
- Saavedra-Barrera, R.H.¹ Culler, D.E.²

26
- 49249086142
- Larrabee: A many-core x86 architecture for visual computing
- L. Seiler, D. Carmean, E. Sprangle, T. Forsyth, M. Abrash, P. Dubey, S. Junkins, A. Lake, J. Sugerman, R. Cavin, R. Espasa, E. Grochowski, T. Juan, and P. Hanrahan. Larrabee: a many-core x86 architecture for visual computing. ACM Trans. Graph., 2008.
- (2008) ACM Trans. Graph
- Seiler, L.¹ Carmean, D.² Sprangle, E.³ Forsyth, T.⁴ Abrash, M.⁵ Dubey, P.⁶ Junkins, S.⁷ Lake, A.⁸ Sugerman, J.⁹ Cavin, R.¹⁰ Espasa, R.¹¹ Grochowski, E.¹² Juan, T.¹³ Hanrahan, P.¹⁴

27
- 0031593993
- Analytic evaluation of shared-memory systems with ILP processors
- D. J. Sorin, V. S. Pai, S. V. Adve, M. K. Vernon, and D. A. Wood. Analytic evaluation of shared-memory systems with ILP processors. In ISCA, 1998.
- (1998) ISCA
- Sorin, D.J.¹ Pai, V.S.² Adve, S.V.³ Vernon, M.K.⁴ Wood, D.A.⁵

28
- 20444381978
- Face detection using spectral histograms and SVMs
- June
- C. A. Waring and X. Liu. Face detection using spectral histograms and SVMs. Systems, Man, and Cybernetics, Part B, IEEE Transactions on, 35(3):467-476, June 2005.
- (2005) Systems, Man, and Cybernetics, Part B, IEEE Transactions on , vol.35 , Issue.3 , pp. 467-476
- Waring, C.A.¹ Liu, X.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.