SCOPUS Information Search Platform

IEEE Transactions on Multimedia

Volume 15, Issue 2, 2013, Pages 279-290

Branch and data herding: Reducing control and memory divergence for error-tolerant GPU applications

(2) Sartori, John (a); Kumar, Rakesh (a)

(a) University of Illinois at Urbana-Champaign (United States)

Author keywords

Energy efficiency; Error tolerance

Indexed keywords

CONTROL PATH; DATA ERRORS; ERROR TOLERANCE; ERROR TOLERANT; HARDWARE IMPLEMENTATIONS; HARDWARE OPTIMIZATION; MEMORY BLOCKS; NVIDIA CUDA; OUTPUT QUALITY; PERFORMANCE BENEFITS; PERFORMANCE BOTTLENECKS; SOFTWARE IMPLEMENTATION; VISUAL COMPUTING;

ENERGY EFFICIENCY; HARDWARE; QUALITY CONTROL; STATIC ANALYSIS; WEAVING;

BENCHMARKING;

EID: 84872693395 PISSN: 15209210 EISSN: None Source Type: Journal
DOI: 10.1109/TMM.2012.2232647 Document Type: Article

Times cited : (45)

References (23)

1
- 70349169075
- Analyzing CUDA workloads using a detailed GPU simulator
- A. Bakhoda, G. Yuan, W. Fung, H. Wong, and T. Aamodt, "Analyzing CUDA workloads using a detailed GPU simulator," in Proc. ISPASS, 2009, pp. 163-174.
- (2009) Proc. ISPASS , pp. 163-174
- Bakhoda, A.¹ Yuan, G.² Fung, W.³ Wong, H.⁴ Aamodt, T.⁵

2
- 77952265942
- Best-effort semantic document search on GPUs
- S. Byna, J. Meng, A. Raghunathan, S. Chakradhar, and S. Cadambi, "Best-effort semantic document search on GPUs," in Proc. GPGPU, 2010, pp. 86-93.
- (2010) Proc. GPGPU , pp. 86-93
- Byna, S.¹ Meng, J.² Raghunathan, A.³ Chakradhar, S.⁴ Cadambi, S.⁵

3
- 83155184570
- Dymaxion: Optimizing memory access patterns for heterogeneous systems
- S. Che, J. Sheaffer, and K. Skadron, "Dymaxion: Optimizing memory access patterns for heterogeneous systems," in Proc. SC, 2011, pp. 13:1-13:11.
- (2011) Proc. SC , pp. 13:1-13:11
- Che, S.¹ Sheaffer, J.² Skadron, K.³

4
- 78049504879
- United States Patent #7,353,369, NVIDIA
- B. Coon, "System and Method for Managing Divergent Threads in a SIMD Architecture," United States Patent #7,353,369, 2008, NVIDIA.
- (2008) System and Method for Managing Divergent Threads in a SIMD Architecture
- Coon, B.¹

5
- 79955923056
- Thread block compaction for efficient SIMT control flow
- W. Fung and T. Aamodt, "Thread block compaction for efficient SIMT control flow," in Proc. HPCA, 2011, pp. 25-36.
- (2011) Proc. HPCA , pp. 25-36
- Fung, W.¹ Aamodt, T.²

6
- 47349104432
- Dynamic warp formation and scheduling for efficient GPU control flow
- W. Fung, I. Sham, G. Yuan, and T. Aamodt, "Dynamic warp formation and scheduling for efficient GPU control flow," in Proc. MICRO, 2007, pp. 407-420.
- (2007) Proc. MICRO , pp. 407-420
- Fung, W.¹ Sham, I.² Yuan, G.³ Aamodt, T.⁴

7
- 84872742646
- Khronos Group
- Khronos Group, OpenCL, 2010.
- (2010) OpenCL

8
- 70449885048
- Best-effort parallel execution framework for recognition and mining applications
- J. Meng, S. Chakradhar, and A. Raghunathan, "Best-effort parallel execution framework for recognition and mining applications," in Proc. IPDPS, 2009, pp. 1-12.
- (2009) Proc. IPDPS
- Meng, J.¹ Chakradhar, S.² Raghunathan, A.³

9
- 77954976292
- Dynamic warp subdivision for integrated branch and memory divergence tolerance
- J. Meng, D. Tarjan, and K. Skadron, "Dynamic warp subdivision for integrated branch and memory divergence tolerance," in Proc. ISCA, 2010, pp. 235-246.
- (2010) Proc. ISCA , pp. 235-246
- Meng, J.¹ Tarjan, D.² Skadron, K.³

10
- 84872735300
- Microsoft
- Microsoft, GPGPU Computing Horizons, 2010.
- (2010) GPGPU Computing Horizons

11
- 84863342255
- Improving GPU performance via large warps and two-level warp scheduling
- V. Narasiman, M. Shebanow, C. Lee, R. Miftakhutdinov, O. Mutlu, and Y. Patt, "Improving GPU performance via large warps and two-level warp scheduling," in Proc. MICRO, 2011, pp. 308-317.
- (2011) Proc. MICRO , pp. 308-317
- Narasiman, V.¹ Shebanow, M.² Lee, C.³ Miftakhutdinov, R.⁴ Mutlu, O.⁵ Patt, Y.⁶

12
- 77954745528
- NVIDIA
- NVIDIA, NVIDIA Compute PTX: Parallel Thread Execution, 2009.
- (2009) NVIDIA Compute PTX: Parallel Thread Execution

13
- 77951900491
- NVIDIA
- NVIDIA, NVIDIA's Next Generation CUDA Compute Architecture: Fermi, 2009.
- (2009) NVIDIA's Next Generation CUDA Compute Architecture: Fermi

14
- 35948991669
- NVIDIA
- NVIDIA, NVIDIA CUDA Programming Guide, Version 3.0, 2010.
- (2010) NVIDIA CUDA Programming Guide, Version 3.0

15
- 84864858573
- Increasing memory miss tolerance for SIMD cores
- D. Tarjan, J. Meng, and K. Skadron, "Increasing memory miss tolerance for SIMD cores," in Proc. SC, 2009, pp. 22:1-22:11.
- (2009) Proc. SC , pp. 22:1-22:11
- Tarjan, D.¹ Meng, J.² Skadron, K.³

16
- 84872703219
- [Online]
- The IMPACT Research Group, Parboil Benchmark Suite. [Online]. Available: http://impact.crhc.illinois.edu/parboil.php.
- The IMPACT Research Group Parboil Benchmark Suite

17
- 84870404832
- University of Illinois, [Online]
- University of Illinois, Clang: A C Language Family Frontend for LLVM. [Online]. Available: http://clang.llvm.org/.
- Clang: A C Language Family Frontend for LLVM

18
- 34247220310
- Energy-efficient motion estimation using error-tolerance
- DOI 10.1145/1165573.1165599, ISLPED'06 - Proceedings of the 2006 International Symposium on Low Power Electronics and Design
- G. Varatkar and N. Shanbhag, "Energy-efficient motion estimation using error-tolerance," in Proc. ISLPED, 2006, pp. 113-118. (Pubitemid 46609720)
- (2006) Proceedings of the International Symposium on Low Power Electronics and Design , vol.2006 , pp. 113-118
- Varatkar, G.V.¹ Shanbhag, N.R.²

19
- 84968854658
- Y-branches: When you come to a fork in the road, take it
- N. Wang, M. Fertig, and S. Patel, "Y-branches: When you come to a fork in the road, take it," in Proc. PACT, 2003, pp. 56-118.
- (2003) Proc. PACT
- Wang, N.¹ Fertig, M.² Patel, S.³

20
- 84872696857
- [Online]
- Wikipedia, Mandelbrot Set, 2011. [Online]. Available: http://en.wikipedia.org/wiki/Mandelbrot_set.
- (2011) Wikipedia Mandelbrot Set

21
- 77952579552
- Demystifying GPU microarchitecture through microbenchmarking
- H. Wong, M. Papadopoulou, M. Sadooghi-Alvandi, and A. Moshovos, "Demystifying GPU microarchitecture through microbenchmarking," in Proc. ISPASS, 2010, pp. 235-246.
- (2010) Proc. ISPASS , pp. 235-246
- Wong, H.¹ Papadopoulou, M.² Sadooghi-Alvandi, M.³ Moshovos, A.⁴

22
- 47349114260
- The art of deception: Adaptive precision reduction for area efficient physics acceleration
- T. Yeh, P. Faloutsos, M. Ercegovac, S. Patel, and G. Reinman, "The art of deception: Adaptive precision reduction for area efficient physics acceleration," in Proc. MICRO, 2007, pp. 394-406.
- (2007) Proc. MICRO , pp. 394-406
- Yeh, T.¹ Faloutsos, P.² Ercegovac, M.³ Patel, S.⁴ Reinman, G.⁵

23
- 79953126288
- On-the-fly elimination of dynamic irregularities for GPU computing
- E. Zhang, Y. Jiang, Z. Guo, K. Tian, and X. Shen, "On-the-fly elimination of dynamic irregularities for GPU computing," in Proc. ASPLOS, 2011, pp. 369-380.
- (2011) Proc. ASPLOS , pp. 369-380
- Zhang, E.¹ Jiang, Y.² Guo, Z.³ Tian, K.⁴ Shen, X.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.