SCOPUS Information Search Platform

Proceedings of the Annual International Symposium on Microarchitecture, MICRO

Volume , Issue , 2011, Pages 477-488

SIMD re-convergence at thread frontiers

(6) Diamos, Gregory (a); Ashbaugh, Benjamin (b); Maiyuran, Subramaniam (b); Kerr, Andrew (a); Wu, Haicheng (a); Yalamanchili, Sudhakar (a)

a GEORGIA INSTITUTE OF TECHNOLOGY (United States)

b INTEL CORPORATION (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPILATION PROCESS; COMPILER TECHNIQUES; CONTROL FLOWS; CUSTOM HARDWARES; DATA PARALLEL; DIRECTX; DYNAMIC INSTRUCTIONS; PROGRAMMING MODELS; REAL APPLICATIONS; SIMD ARCHITECTURE;

PARALLEL ARCHITECTURES; PARALLEL PROGRAMMING;

PROGRAM COMPILERS;

EID: 84863351470
PISSN: 10724451
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1145/2155620.2155676
Document Type: Conference Paper

Times cited: 61

References (23)

1
- 79959933381
- Tsubame-2 - A 2.4 pflops peak performance system
- T. Hatazaki, "Tsubame-2 - a 2.4 pflops peak performance system," in Optical Fiber Communication Conference. Optical Society of America, 2011.
- Optical Fiber Communication Conference. Optical Society of America, 2011
- Hatazaki, T.¹

2
- 84858778295
- Quantifying numa and contention effects in multi-gpu systems
- K. Spafford, J. S. Meredith, and J. S. Vetter, "Quantifying numa and contention effects in multi-gpu systems," in Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, ser. GPGPU-4. New York, NY, USA: ACM, 2011, pp. 11:1-11:7.
- (2011) Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units
- Spafford, K.¹ Meredith, J.S.² Vetter, J.S.³

3
- 0025467711
- A bridging model for parallel computation
- L. G. Valiant, "A bridging model for parallel computation," Commun. ACM, 1990.
- (1990) Commun. ACM
- Valiant, L.G.¹

4
- 84863457471
- Characterization and transformation of unstructured control flow in gpu applications
- H. Wu, G. Diamos, S. Li, and S. Yalamanchili, "Characterization and transformation of unstructured control flow in gpu applications," in The First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems. ACM, June 2011.
- (2011) The First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems
- Wu, H.¹ Diamos, G.² Li, S.³ Yalamanchili, S.⁴

5
- 0021458622
- Chap - A simd graphics processor
- A. Levinthal and T. Porter, "Chap - a simd graphics processor," SIGGRAPH Comput. Graph., vol. 18, no. 3, pp. 77-82, 1984.
- (1984) SIGGRAPH Comput. Graph. , vol.18 , Issue.3 , pp. 77-82
- Levinthal, A.¹ Porter, T.²

6
- 47349104432
- Dynamic warp formation and scheduling for efficient gpu control flow
- W. W. L. Fung, I. Sham, G. Yuan, and T. M. Aamodt, "Dynamic warp formation and scheduling for efficient gpu control flow," in Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, 2007.
- Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, 2007
- Fung, W.W.L.¹ Sham, I.² Yuan, G.³ Aamodt, T.M.⁴

7
- 84863371900
- Intel, Intel HD Graphics OpenSource Programmer Reference Manual, June 2010.
- (2010) Intel HD Graphics OpenSource Programmer Reference Manual

8
- 77952579552
- Demystifying gpu microarchitecture through microbenchmarking
- H. Wong, M.-M. Papadopoulou, M. Sadooghi-Alvandi, and A. Moshovos, "Demystifying gpu microarchitecture through microbenchmarking," in Performance Analysis of Systems Software (ISPASS), 2010 IEEE International Symposium on, 2010, pp. 235-246.
- Performance Analysis of Systems Software (ISPASS), 2010 IEEE International Symposium On, 2010 , pp. 235-246
- Wong, H.¹ Papadopoulou, M.-M.² Sadooghi-Alvandi, M.³ Moshovos, A.⁴

9
- 84863364461
- Real-time adaptive gpu multi-agent path planning
- Wen-mei W. Hwu, "Real-time adaptive gpu multi-agent path planning," in GPU Computing Gems, Wen-mei W. Hwu, Ed. Morgan Kaufmann, Sep. 2010, vol. 2.
- (2010) GPU Computing Gems , vol.2
- Hwu, W.W.¹

10
- 69949100622
- Optimizing data intensive gpgpu computations for dna sequence alignment
- C. Trapnell and M. C. Schatz, "Optimizing data intensive gpgpu computations for dna sequence alignment," Parallel Computing, vol. 35, pp. 429-440, August 2009.
- (2009) Parallel Computing , vol.35 , pp. 429-440
- Trapnell, C.¹ Schatz, M.C.²

11
- 70350729133
- Accelerating monte carlo simulations of photon transport in a voxelized geometry using a massively parallel gpu
- A. Badal and A. Badano, "Accelerating monte carlo simulations of photon transport in a voxelized geometry using a massively parallel gpu," Medical Physics 36, 2009.
- (2009) Medical Physics , vol.36
- Badal, A.¹ Badano, A.²

12
- 79952149071
- Gpu implementation of extended gaussian mixture model for background subtraction
- V. Pham, P. Vo, H. V. Thanh, and B. L. Hoai, "Gpu implementation of extended gaussian mixture model for background subtraction," International Conference on Computing and Telecommunication Technologies, 2010.
- International Conference on Computing and Telecommunication Technologies, 2010
- Pham, V.¹ Vo, P.² Thanh, H.V.³ Hoai, B.L.⁴

13
- 70749119824
- Monte carlo simulation of photon migration in 3d turbid media accelerated by graphics processing units
- Q. Fang and D. A. Boas, "Monte carlo simulation of photon migration in 3d turbid media accelerated by graphics processing units," Optical Express 17, vol. 17, pp. 20178-20190.
- Optical Express , vol.17 , pp. 20178-20190
- Fang, Q.¹ Boas, D.A.²

14
- 84863345611
- T. Tsiodras, "A real-time raytracer of triangle meshes in cuda," Tech. Rep., February 2011. [Online]. Available: http://users.softlab.ntua.gr/~ttsiod/cudarenderer-BVH.html
- (2011) A Real-time Raytracer of Triangle Meshes in Cuda
- Tsiodras, T.¹

15
- 77956373685
- Optix: A general purpose ray tracing engine
- S. G. Parker, J. Bigler, A. Dietrich, H. Friedrich, J. Hoberock, D. Luebke, D. McAllister, M. McGuire, K. Morley, A. Robison, and M. Stich, "Optix: a general purpose ray tracing engine," ACM Transactions on Graphics, vol. 29, pp. 66:1-66:13, July 2010.
- (2010) ACM Transactions on Graphics , vol.29
- Parker, S.G.¹ Bigler, J.² Dietrich, A.³ Friedrich, H.⁴ Hoberock, J.⁵ Luebke, D.⁶ McAllister, D.⁷ McGuire, M.⁸ Morley, K.⁹ Robison, A.¹⁰ Stich, M.¹¹

16
- 78149233155
- Ocelot: A dynamic compiler for bulk-synchronous applications in heterogeneous systems
- G. Diamos, A. Kerr, S. Yalamanchili, and N. Clark, "Ocelot: A dynamic compiler for bulk-synchronous applications in heterogeneous systems," in Proceedings of PACT '10, 2010.
- Proceedings of PACT '10, 2010
- Diamos, G.¹ Kerr, A.² Yalamanchili, S.³ Clark, N.⁴

17
- 70649104826
- A characterization and analysis of ptx kernels
- A. Kerr, G. Diamos, and S. Yalamanchili, "A characterization and analysis of ptx kernels," in IISWC09: IEEE International Symposium on Workload Characterization, Austin, TX, USA, October 2009.
- IISWC09: IEEE International Symposium on Workload Characterization, Austin, TX, USA, October 2009
- Kerr, A.¹ Diamos, G.² Yalamanchili, S.³

18
- 0015330108
- The illiac iv system
- W. Bouknight, S. Denenberg, D. McIntyre, J. Randall, A. Sameh, and D. Slotnick, "The illiac iv system," Proceedings of the IEEE, vol. 60, no. 4, pp. 369-388, apr. 1972.
- (1972) Proceedings of the IEEE , vol.60 , Issue.4 , pp. 369-388
- Bouknight, W.¹ Denenberg, S.² McIntyre, D.³ Randall, J.⁴ Sameh, A.⁵ Slotnick, D.⁶

19
- 84858771176
- AMD
- AMD, Evergreen Family Instruction Set Architecture Instructions and Microcode, 2010.
- (2010) Evergreen Family Instruction Set Architecture Instructions and Microcode

20
- 1942500450
- Using hammock graphs to structure programs
- F. Zhang and E. H. D'Hollander, "Using hammock graphs to structure programs," IEEE Trans. Softw. Eng., pp. 231-245, 2004.
- (2004) IEEE Trans. Softw. Eng. , pp. 231-245
- Zhang, F.¹ D'Hollander, E.H.²

21
- 77954976292
- Dynamic warp subdivision for integrated branch and memory divergence tolerance
- J. Meng, D. Tarjan, and K. Skadron, "Dynamic warp subdivision for integrated branch and memory divergence tolerance," in Proceedings of the 37th annual international symposium on Computer architecture, ser. ISCA '10. New York, NY, USA: ACM, 2010, pp. 235-246.
- (2010) Proceedings of the 37th Annual International Symposium on Computer Architecture , pp. 235-246
- Meng, J.¹ Tarjan, D.² Skadron, K.³

22
- 79955923056
- Thread block compaction for efficient simt control flow
- W. Fung and T. Aamodt, "Thread block compaction for efficient simt control flow," in 17th International Symposium on High Performance Computer Architecture, Feb. 2011, pp. 25-36.
- 17th International Symposium on High Performance Computer Architecture, Feb. 2011 , pp. 25-36
- Fung, W.¹ Aamodt, T.²

23
- 78049504879
- System and method for managing divergent threads in a simd architecture
- B. W. Coon and E. J. Lindholm, "System and method for managing divergent threads in a simd architecture," Patent US 7 353 369, April, 2008.
- (2008)
- Coon, B.W.¹ Lindholm, E.J.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.