SCOPUS Information Retrieval Platform

IEEE Micro

Volume 32, Issue 6, 2012, Pages 4-16

Redefining the role of the CPU in the era of CPU-GPU integration

(5) Arora, Manish a; Nath, Siddhartha a; Mazumdar, Subhra a; Baden, Scott B. a; Tullsen, Dean M. a

a UNIVERSITY OF CALIFORNIA (United States)

Author keywords

CPU architecture; CPU GPU systems; heterogeneous designs

Indexed keywords

BRANCH PREDICTION; CPU ARCHITECTURE; CPU DESIGN; INSTRUCTION LEVEL PARALLELISM; THREAD LEVEL PARALLELISM; HARDWARE; MICROWAVE INTEGRATED CIRCUITS

EID: 84875981232 PISSN: 02721732 EISSN: None Source Type: Journal
DOI: 10.1109/MM.2012.57 Document Type: Article

Times cited: (47)

References (26)

1
- 77951900491
- Nvidia
- "NVIDIA's Next Generation CUDA Compute Architecture: Fermi," Nvidia, 2009.
- (2009) NVIDIA's Next Generation CUDA Compute Architecture: Fermi

2
- 35648995516
- tech. report, EECS Dept., Univ. of California, Berkeley
- K. Asanovic et al., The Landscape of Parallel Computing Research: A View from Berkeley, tech. report, EECS Dept., Univ. of California, Berkeley, 2006.
- (2006) The Landscape of Parallel Computing Research: A View from Berkeley
- Asanovic, K.¹

3
- 77954995885
- Debunking the 100x GPU vs. CPU myth: An Evaluation of throughput computing on CPU and GPU
- ACM
- V.W. Lee et al., "Debunking the 100x GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU," Proc. 37th Ann. Int'l Symp. Computer Architecture (ISCA 10), ACM, 2010, pp. 451-460.
- (2010) Proc. 37th Ann. Int'l Symp. Computer Architecture (ISCA 10) , pp. 451-460
- Lee, V.W.¹

4
- 70649092154
- Rodinia: A benchmark suite for heterogeneous computing
- IEEE CS
- S. Che et al., "Rodinia: A Benchmark Suite for Heterogeneous Computing," Proc. IEEE Int'l Symp. Workload Characterization (IISWC 09), IEEE CS, 2009, pp. 44-54.
- (2009) Proc. IEEE Int'l Symp. Workload Characterization (IISWC 09) , pp. 44-54
- Che, S.¹

5
- 34247174509
- Core architecture optimization for heterogeneous chip multiprocessors
- DOI 10.1145/1152154.1152162, PACT 2006 - Proceedings of the Fifteenth International Conference on Parallel Architectures and Compilation Techniques
- R. Kumar, D.M. Tullsen, and N.P. Jouppi, "Core Architecture Optimization for Heterogeneous Chip Multiprocessors," Proc. 15th Int'l Conf. Parallel Architecture and Compilation Techniques (PACT 06), ACM, 2006, pp. 23-32. (Pubitemid 46601078)
- (2006) Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT , vol.2006 , pp. 23-32
- Kumar, R.¹ Tullsen, D.M.² Jouppi, N.P.³

6
- 33947588048
- A survey of general-purpose computation on graphics hardware
- DOI 10.1111/j.1467-8659.2007.01012.x
- J.D. Owens et al., "A Survey of General-Purpose Computation on Graphics Hardware," Computer Graphics Forum, 2007, vol. 26, no. 1, pp. 80-113. (Pubitemid 46481097)
- (2007) Computer Graphics Forum , vol.26 , Issue.1 , pp. 80-113
- Owens, J.D.¹ Luebke, D.² Govindaraju, N.³ Harris, M.⁴ Kruger, J.⁵ Lefohn, A.E.⁶ Purcell, T.J.⁷

7
- 84856683554
- Addison-Wesley
- A. Munshi et al., OpenCL Programming Guide, Addison-Wesley, 2011.
- (2011) OpenCL Programming Guide
- Munshi, A.¹

8
- 77952340798
- Performance insights on executing nongraphics applications on CUDA on the NVIDIA GeForce 8800 GTX
- W.M. Hwu et al., "Performance Insights on Executing Nongraphics Applications on CUDA on the NVIDIA GeForce 8800 GTX," Hot Chips 19, 2007, http://www.hotchips.org/archives/hc19.
- (2007) Hot Chips , vol.19
- Hwu, W.M.¹

9
- 84876238543
- Scope for performance enhancement of CMU sphinx by parallelizing with OpenCL
- Aug
- S.C. Harish et al., "Scope for Performance Enhancement of CMU Sphinx by Parallelizing with OpenCL," J. Wisdom Based Computing, Aug. 2011, pp. 43-46.
- (2011) J. Wisdom Based Computing , pp. 43-46
- Harish, S.C.¹

10
- 84861416065
- Parallelization of particle filter algorithms
- Springer-Verlag
- M.A. Goodrum et al., "Parallelization of Particle Filter Algorithms," Proc. Int'l Conf. Computer Architecture, Springer-Verlag, 2010, pp. 139-149.
- (2010) Proc. Int'l Conf. Computer Architecture , pp. 139-149
- Goodrum, M.A.¹

11
- 51049106282
- Options pricing on the GPU
- M. Pharr and R. Fernando, eds., Addison-Wesley, chapter 45
- C. Kolb and M. Pharr, "Options Pricing on the GPU," GPU Gems 2, M. Pharr and R. Fernando, eds., Addison-Wesley, 2005, chapter 45.
- (2005) GPU Gems 2
- Kolb, C.¹ Pharr, M.²

12
- 77949647837
- Program Optimization of array-intensive SPEC2K benchmarks on multithreaded GPU using CUDA and brook+
- IEEE CS
- G. Wang et al., "Program Optimization of Array-Intensive SPEC2K Benchmarks on Multithreaded GPU Using CUDA and Brook+," Proc. 15th Int'l Conf. Parallel and Distributed Systems, IEEE CS, 2009, pp. 292-299.
- (2009) Proc. 15th Int'l Conf. Parallel and Distributed Systems , pp. 292-299
- Wang, G.¹

13
- 80053253150
- tech. report, NCSA, Univ. Illinois, Jan
- G. Shi, S. Gottlieb, and V. Kindratenko, MILC on GPUs, tech. report, NCSA, Univ. Illinois, Jan. 2010.
- (2010) MILC on GPUs
- Shi, G.¹ Gottlieb, S.² Kindratenko, V.³

14
- 70450029279
- Evaluating the use of GPUs in liver image segmentation and HMMER database searches
- IEEE CS doi:10.1109/IPDPS.2009.5161073
- J. Walters et al., "Evaluating the Use of GPUs in Liver Image Segmentation and HMMER Database Searches," Proc. IEEE Int'l Symp. Parallel & Distributed Processing, IEEE CS, 2009, doi:10.1109/IPDPS.2009.5161073.
- (2009) Proc. IEEE Int'l Symp. Parallel & Distributed Processing
- Walters, J.¹

15
- 84876265398
- G. Ruetsch and M. Fatica, "A CUDA Fortran Implementation of BWAVES," http://www.pgroup.com/lit/articles/nvidia-paper-bwaves.pdf.
- A CUDA Fortran Implementation of BWAVES
- Ruetsch, G.¹ Fatica, M.²

16
- 84876276364
- Simulation of quantum gates on a novel GPU architecture
- WSEAS
- E. Gutierrez et al., "Simulation of Quantum Gates on a Novel GPU Architecture," Proc. 7th Int'l Conf. Systems Theory and Scientific Computation, WSEAS, 2007, pp. 121-126.
- (2007) Proc. 7th Int'l Conf. Systems Theory and Scientific Computation , pp. 121-126
- Gutierrez, E.¹

17
- 79955076210
- Unstructured grid applications on GPU: Performance analysis and improvement
- ACM doi:10.1145/1964179.1964197
- L. Solano-Quinde et al., "Unstructured Grid Applications on GPU: Performance Analysis and Improvement," Proc. 4th Workshop General Purpose Processing on Graphics Processing Units, ACM, 2011, doi:10.1145/1964179.1964197.
- (2011) Proc. 4th Workshop General Purpose Processing on Graphics Processing Units
- Solano-Quinde, L.¹

18
- 84876241777
- J. Stratton, "LBM on GPU," http://impact.crhc.illinois.edu/parboil.aspx.
- LBM on GPU
- Stratton, J.¹

19
- 77953892467
- Experiences accelerating matlab systems biology applications
- L.G. Szafaryn, K. Skadron, and J.J. Saucerman, "Experiences Accelerating Matlab Systems Biology Applications," Workshop Biomedicine in Computing: Systems, Architectures, and Circuits, 2009.
- (2009) Workshop Biomedicine in Computing: Systems, Architectures, and Circuits
- Szafaryn, L.G.¹ Skadron, K.² Saucerman, J.J.³

20
- 84876277859
- tech. report 1693, Computer Sciences Dept., Univ. of Wisconsin, Madison, June
- M. Sinclair, H. Duwe, and K. Sankaralingam, Porting CMP Benchmarks to GPUs, tech. report 1693, Computer Sciences Dept., Univ. of Wisconsin, Madison, June 2011.
- (2011) Porting CMP Benchmarks to GPUs
- Sinclair, M.¹ Duwe, H.² Sankaralingam, K.³

21
- 34548329985
- Microarchitecture-independent workload characterization
- DOI 10.1109/MM.2007.56
- K. Hoste and L. Eeckhout, "Microarchitecture-Independent Workload Characterization," IEEE Micro, May/June 2007, pp. 63-72. (Pubitemid 47337548)
- (2007) IEEE Micro , vol.27 , Issue.3 , pp. 63-72
- Hoste, K.¹ Eeckhout, L.²

22
- 84988438049
- Toward kilo-instruction processors
- Dec
- A. Cristal et al., "Toward Kilo-Instruction Processors," ACM Trans. Architecture and Code Optimization, Dec. 2004, pp. 389-417.
- (2004) ACM Trans. Architecture and Code Optimization , pp. 389-417
- Cristal, A.¹

23
- 42549154520
- The L-TAGE branch predictor
- May
- A. Seznec, "The L-TAGE Branch Predictor," J. Instruction-Level Parallelism, May 2007; http://www.jilp.org/vol9/v9paper6.pdf.
- (2007) J. Instruction-Level Parallelism
- Seznec, A.¹

24
- 0030677583
- Prefetching using markov predictors
- ACM
- D. Joseph and D. Grunwald, "Prefetching Using Markov Predictors," Proc. 24th Ann. Int'l Symp. Computer Architecture (ISCA 97), ACM, 1997, pp. 252-263.
- (1997) Proc. 24th Ann. Int'l Symp. Computer Architecture (ISCA 97) , pp. 252-263
- Joseph, D.¹ Grunwald, D.²

25
- 84948959230
- Pointer cache assisted prefetching
- IEEE CS
- J. Collins et al., "Pointer Cache Assisted Prefetching," Proc. 35th Ann. ACM/IEEE Int'l Symp. Microarchitecture, IEEE CS, 2002, pp. 62-73.
- (2002) Proc. 35th Ann. ACM/IEEE Int'l Symp. Microarchitecture , pp. 62-73
- Collins, J.¹

26
- 0036949391
- A stateless, content-directed data prefetching mechanism
- DOI 10.1145/635508.605427
- R. Cooksey, S. Jourdan, and D. Grunwald, "A Stateless, Content Directed Data Prefetching Mechanism," Proc. 10th Int'l Conf. Architectural Support for Programming Languages and Operating Systems, ACM, 2002, pp. 279-290. (Pubitemid 44892240)
- (2002) Operating Systems Review (ACM) , vol.36 , Issue.5 , pp. 279-290
- Cooksey, R.¹ Jourdan, S.² Grunwald, D.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.