SCOPUS Information Retrieval Platform

Science of Computer Programming

Volume 78, Issue 5, 2013, Pages 458-480

Parallel execution of Java loops on Graphics Processing Units

Authors (3): Leung, Alan (a); Lhoták, Ondřej (a); Lashari, Ghulam (a)

(a) University of Waterloo (Canada)

Author keywords

GPU; Java; Parallelization

Indexed keywords

GPU; GRAPHICS PROCESSING UNIT; GRAPHICS PROCESSING UNITS; JAVA; JAVA BYTECODE LANGUAGE; MULTIDIMENSIONAL ARRAYS; PARALLEL EXECUTIONS; PARALLELIZATIONS;

APPLICATION PROGRAMS; COMPUTER GRAPHICS; PROGRAM PROCESSORS; SEMANTICS;

COMPUTER GRAPHICS EQUIPMENT;

EID: 84875225752 PISSN: 01676423 EISSN: None Source Type: Journal
DOI: 10.1016/j.scico.2011.06.004 Document Type: Conference Paper

Times cited : (2)

References (43)

1
- 33947588048
- A survey of general-purpose computation on graphics hardware
- J.D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Krüger, A.E. Lefohn, and T.J. Purcell A survey of general-purpose computation on graphics hardware Computer Graphics Forum 26 1 2007 80 113
- (2007) Computer Graphics Forum , vol.26 , Issue.1 , pp. 80-113
- Owens, J.D.¹ Luebke, D.² Govindaraju, N.³ Harris, M.⁴ Krüger, J.⁵ Lefohn, A.E.⁶ Purcell, T.J.⁷

2
- 0342321935
- The Jalapeño virtual machine
- B. Alpern, C.R. Attanasio, J.J. Barton, M.G. Burke, P. Cheng, J.-D. Choi, A. Cocchi, S.J. Fink, D. Grove, M. Hind, S.F. Hummel, D. Lieber, V. Litvinov, M.F. Mergen, T. Ngo, J.R. Russell, V. Sarkar, M.J. Serrano, J.C. Shepherd, S.E. Smith, V.C. Sreedhar, H. Srinivasan, and J. Whaley The Jalapeño virtual machine IBM Systems Journal 39 1 2000 211 238
- (2000) IBM Systems Journal , vol.39 , Issue.1 , pp. 211-238
- Alpern, B.¹ Attanasio, C.R.² Barton, J.J.³ Burke, M.G.⁴ Cheng, P.⁵ Choi, J.-D.⁶ Cocchi, A.⁷ Fink, S.J.⁸ Grove, D.⁹ Hind, M.¹⁰ Hummel, S.F.¹¹ Lieber, D.¹² Litvinov, V.¹³ Mergen, M.F.¹⁴ Ngo, T.¹⁵ Russell, J.R.¹⁶ Sarkar, V.¹⁷ Serrano, M.J.¹⁸ Shepherd, J.C.¹⁹ Smith, S.E.²⁰

3
- 84875212124
- M. Hind, Dynamic compilation and adaptive optimization in virtual machines, tutorial presented at PLDI 2004. http://www.cs.umd.edu/~pugh/pldi04/tutorials.html.
- Dynamic Compilation and Adaptive Optimization in Virtual Machines, Tutorial Presented at PLDI 2004
- Hind, M.¹

4
- 0034448992
- Adaptive optimization in the Jalapeño JVM
- ACM Press
- M. Arnold, S. Fink, D. Grove, M. Hind, and P.F. Sweeney Adaptive optimization in the Jalapeño JVM Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications 2000 ACM Press 47 65
- (2000) Proceedings of the Conference on Object-Oriented Programming Systems, Languages, and Applications , pp. 47-65
- Arnold, M.¹ Fink, S.² Grove, D.³ Hind, M.⁴ Sweeney, P.F.⁵

5
- 84875223827
- RapidMind, http://www.rapidmind.net/.
- RapidMind

6
- 70449633228
- Automatic parallelization for graphics processing units
- ACM New York, NY, USA
- A. Leung, O. Lhoták, and G. Lashari Automatic parallelization for graphics processing units PPPJ '09: Proceedings of the 7th International Conference on Principles and Practice of Programming in Java 2009 ACM New York, NY, USA 91 100
- (2009) PPPJ '09: Proceedings of the 7th International Conference on Principles and Practice of Programming in Java , pp. 91-100
- Leung, A.¹ Lhoták, O.² Lashari, G.³

7
- 84870629709
- NVIDIA CUDA, http://developer.nvidia.com/object/cuda.html.
- NVIDIA CUDA

8
- 0003488086
- ACM Press New York, NY, USA
- H. Zima, and B. Chapman Supercompilers for Parallel and Vector Computers 1991 ACM Press New York, NY, USA
- (1991) Supercompilers for Parallel and Vector Computers
- Zima, H.¹ Chapman, B.²

9
- 0003927035
- Addison-Wesley Longman Publishing Co., Inc Boston, MA, USA
- M.J. Wolfe High Performance Compilers for Parallel Computing 1995 Addison-Wesley Longman Publishing Co., Inc Boston, MA, USA
- (1995) High Performance Compilers for Parallel Computing
- Wolfe, M.J.¹

10
- 0037952146
- Morgan Kaufmann Publishers Inc San Francisco, CA, USA
- J.R. Allen, and K. Kennedy Optimizing Compilers for Modern Architectures: a Dependence-based Approach 2002 Morgan Kaufmann Publishers Inc San Francisco, CA, USA
- (2002) Optimizing Compilers for Modern Architectures: A Dependence-based Approach
- Allen, J.R.¹ Kennedy, K.²

11
- 34447569672
- Intel 64 and IA-32 architectures software developer's manual.
- Intel 64 and IA-32 Architectures Software Developer's Manual

12
- 0033686832
- Automatic loop transformations and parallelization for Java
- P.V. Artigas, M. Gupta, S.P. Midkiff, J.E. Moreira, Automatic loop transformations and parallelization for Java, in: ICS '00: 14th Int. Conf. on Supercomputing, 2000, pp. 1-10.
- (2000) ICS '00: 14th Int. Conf. on Supercomputing , pp. 1-10
- Artigas, P.V.¹ Gupta, M.² Midkiff, S.P.³ Moreira, J.E.⁴

13
- 0035790371
- A comparison of three approaches to language, compiler, and library support for multidimensional arrays in Java
- J.E. Moreira, S.P. Midkiff, M. Gupta, A comparison of three approaches to language, compiler, and library support for multidimensional arrays in Java, in: JGI '01: Proceedings of the 2001 Joint ACM-ISCOPE Conference on Java Grande, 2001, pp. 116-125.
- (2001) JGI '01: Proceedings of the 2001 Joint ACM-ISCOPE Conference on Java Grande , pp. 116-125
- Moreira, J.E.¹ Midkiff, S.P.² Gupta, M.³

14
- 84957692616
- Automatic parallelization for non-cache coherent multiprocessors
- Y. Paek, D.A. Padua, Automatic parallelization for non-cache coherent multiprocessors, in: Languages and Compilers for Parallel Computing, 1996, pp. 266-284.
- (1996) Languages and Compilers for Parallel Computing , pp. 266-284
- Paek, Y.¹ Padua, D.A.²

15
- 84947747438
- Polaris: Improving the effectiveness of parallelizing compilers
- W. Blume, R. Eigenmann, K. Faigin, J. Grout, J. Hoeflinger, D.A. Padua, P. Petersen, W.M. Pottenger, L. Rauchwerger, P. Tu, S. Weatherford, Polaris: Improving the effectiveness of parallelizing compilers, in: Languages and Compilers for Parallel Computing, 1994, pp. 141-154.
- (1994) Languages and Compilers for Parallel Computing , pp. 141-154
- Blume, W.¹ Eigenmann, R.² Faigin, K.³ Grout, J.⁴ Hoeflinger, J.⁵ Padua, D.A.⁶ Petersen, P.⁷ Pottenger, W.M.⁸ Rauchwerger, L.⁹ Tu, P.¹⁰ Weatherford, S.¹¹

16
- 0007890215
- The structure of parafrase-2: An advanced parallelizing compiler for C and FORTRAN
- Pitman Publishing London, UK
- C.D. Polychronopoulos, M.B. Gikar, M.R. Haghighat, C.L. Lee, B.P. Leung, and D.A. Schouten The structure of parafrase-2: an advanced parallelizing compiler for C and FORTRAN Selected Papers of the Second Workshop on Languages and Compilers for Parallel Computing 1990 Pitman Publishing London, UK 423 453
- (1990) Selected Papers of the Second Workshop on Languages and Compilers for Parallel Computing , pp. 423-453
- Polychronopoulos, C.D.¹ Gikar, M.B.² Haghighat, M.R.³ Lee, C.L.⁴ Leung, B.P.⁵ Schouten, D.A.⁶

17
- 0011616679
- The PARADIGM Compiler for Distributed-Memory Message Passing Multicomputers
- P. Banerjee, J.A. Chandy, M. Gupta, J.G. Holm, A. Lain, D.J. Palermo, S. Ramaswamy, E. Su, The PARADIGM Compiler for Distributed-Memory Message Passing Multicomputers, in: The First International Workshop on Parallel Processing, Bangalore, India, 1994, pp. 322-330.
- (1994) The First International Workshop on Parallel Processing, Bangalore, India , pp. 322-330
- Banerjee, P.¹ Chandy, J.A.² Gupta, M.³ Holm, J.G.⁴ Lain, A.⁵ Palermo, D.J.⁶ Ramaswamy, S.⁷ Su, E.⁸

18
- 0003197260
- An overview of the SUIF compiler for scalable parallel machines
- S. Amarasinghe, J. Anderson, M. Lam, C.-W. Tseng, An overview of the SUIF compiler for scalable parallel machines, in: Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, CA, 1995.
- (1995) Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, CA
- Amarasinghe, S.¹ Anderson, J.² Lam, M.³ Tseng, C.-W.⁴

19
- 33646009337
- Optimizing compiler for the CELL processor
- 17-21 September 2005, St. Louis, MO, USA IEEE Computer Society
- A.E. Eichenberger, K.M. O'Brien, K. O'Brien, P. Wu, T. Chen, P.H. Oden, D.A. Prener, J.C. Shepherd, B. So, Z. Sura, A. Wang, T. Zhang, P. Zhao, and M. Gschwind Optimizing compiler for the CELL processor 14th International Conference on Parallel Architecture and Compilation Techniques, PACT 2005 17-21 September 2005, St. Louis, MO, USA 2005 IEEE Computer Society 161 172
- (2005) 14th International Conference on Parallel Architecture and Compilation Techniques, PACT 2005 , pp. 161-172
- Eichenberger, A.E.¹ O'Brien, K.M.² O'Brien, K.³ Wu, P.⁴ Chen, T.⁵ Oden, P.H.⁶ Prener, D.A.⁷ Shepherd, J.C.⁸ So, B.⁹ Sura, Z.¹⁰ Wang, A.¹¹ Zhang, T.¹² Zhao, P.¹³ Gschwind, M.¹⁴

20
- 24144474794
- Intel Press
- A.J.C. Bik Software Vectorization Handbook, The: Applying Intel Multimedia Extensions for Maximum Performance 2004 Intel Press
- (2004) Software Vectorization Handbook, The: Applying Intel Multimedia Extensions for Maximum Performance
- Bik, A.J.C.¹

21
- 0034446825
- Exploiting superword level parallelism with multimedia instruction sets
- ACM New York, NY, USA
- S. Larsen, and S. Amarasinghe Exploiting superword level parallelism with multimedia instruction sets PLDI '00: Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation 2000 ACM New York, NY, USA 145 156
- (2000) PLDI '00: Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation , pp. 145-156
- Larsen, S.¹ Amarasinghe, S.²

22
- 0344908850
- Automatic intra-register vectorization for the Intel architecture
- A.J.C. Bik, M. Girkar, P.M. Grey, and X. Tian Automatic intra-register vectorization for the Intel architecture International Journal of Parallel Programming 30 2 2002 65 98
- (2002) International Journal of Parallel Programming , vol.30 , Issue.2 , pp. 65-98
- Bik, A.J.C.¹ Girkar, M.² Grey, P.M.³ Tian, X.⁴

23
- 17644409855
- An optimizer for multimedia instruction sets
- G. Cheong, M. Lam, An optimizer for multimedia instruction sets, in: Proceedings of the Second SUIF Compiler Workshop, 1997.
- (1997) Proceedings of the Second SUIF Compiler Workshop
- Cheong, G.¹ Lam, M.²

24
- 0034250996
- Compilation techniques for multimedia processors
- A. Krall, and S. Lelait Compilation techniques for multimedia processors International Journal of Parallel Programming 28 4 2000 347 361
- (2000) International Journal of Parallel Programming , vol.28 , Issue.4 , pp. 347-361
- Krall, A.¹ Lelait, S.²

25
- 4544372264
- Vectorizing for a SIMdD DSP architecture
- ACM New York, NY, USA
- D. Naishlos, M. Biberstein, S. Ben-David, and A. Zaks Vectorizing for a SIMdD DSP architecture CASES '03: Proceedings of the 2003 International Conference on Compilers, Architecture and Synthesis for Embedded Systems 2003 ACM New York, NY, USA 2 11
- (2003) CASES '03: Proceedings of the 2003 International Conference on Compilers, Architecture and Synthesis for Embedded Systems , pp. 2-11
- Naishlos, D.¹ Biberstein, M.² Ben-David, S.³ Zaks, A.⁴

26
- 20444406225
- Autovectorization in GCC
- D. Naishlos, Autovectorization in GCC, in: GCC Developer's Summit, 2004, pp. 105-118.
- (2004) GCC Developer's Summit , pp. 105-118
- Naishlos, D.¹

27
- 0031168526
- Automatically exploiting implicit parallelism in Java
- A.J.C. Bik, and D.B. Gannon Automatically exploiting implicit parallelism in Java Concurrency: Practice and Experience 9 6 1997 579 619
- (1997) Concurrency: Practice and Experience , vol.9 , Issue.6 , pp. 579-619
- Bik, A.J.C.¹ Gannon, D.B.²

28
- 0033887171
- JavaspMT: A speculative thread pipelining parallelization model for Java programs
- Cancun, Mexico, May 1-5, 2000
- I.H. Kazi, D.J. Lilja, JavaspMT: A speculative thread pipelining parallelization model for Java programs, in: Proceedings of the 14th International Parallel & Distributed Processing Symposium, IPDPS'00, Cancun, Mexico, May 1-5, 2000, 2000, pp. 559-564.
- (2000) Proceedings of the 14th International Parallel & Distributed Processing Symposium, IPDPS'00 , pp. 559-564
- Kazi, I.H.¹ Lilja, D.J.²

29
- 42149083580
- SablespMT: A software framework for analysing speculative multithreading in Java
- C.J.F. Pickett, C. Verbrugge, SablespMT: a software framework for analysing speculative multithreading in Java, in: PASTE '05: The 6th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, 2005, pp. 59-66.
- (2005) PASTE '05: The 6th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering , pp. 59-66
- Pickett, C.J.F.¹ Verbrugge, C.²

30
- 0034249157
- A vectorizing compiler for multimedia extensions
- N. Sreraman, and R. Govindarajan A vectorizing compiler for multimedia extensions International Journal of Parallel Programming 28 4 2000 363 400
- (2000) International Journal of Parallel Programming , vol.28 , Issue.4 , pp. 363-400
- Sreraman, N.¹ Govindarajan, R.²

31
- 0032308691
- Simple vector microprocessors for multimedia applications
- C.G. Lee, M.G. Stoodley, Simple vector microprocessors for multimedia applications, in: International Symposium on Microarchitecture, 1998, pp. 25-36.
- (1998) International Symposium on Microarchitecture , pp. 25-36
- Lee, C.G.¹ Stoodley, M.G.²

32
- 19344363982
- Efficient utilization of simd extensions
- F. Franchetti, S. Kral, J. Lorenz, C. Ueberhuber, Efficient utilization of simd extensions, in: Proceedings of the IEEE, vol. 93, 2005, pp. 409-425.
- (2005) Proceedings of the IEEE , vol.93 , pp. 409-425
- Franchetti, F.¹ Kral, S.² Lorenz, J.³ Ueberhuber, C.⁴

33
- 84948740064
- Compiler-controlled caching in superword register files for multimedia extension architectures
- 22-25 September 2002, Charlottesville, VA, USA IEEE Computer Society
- J. Shin, J. Chame, and M.W. Hall Compiler-controlled caching in superword register files for multimedia extension architectures 2002 International Conference on Parallel Architectures and Compilation Techniques, PACT 2002 22-25 September 2002, Charlottesville, VA, USA 2002 IEEE Computer Society 45 55
- (2002) 2002 International Conference on Parallel Architectures and Compilation Techniques, PACT 2002 , pp. 45-55
- Shin, J.¹ Chame, J.² Hall, M.W.³

34
- 47849103500
- Introducing control flow into vectorized code
- IEEE Computer Society Washington, DC, USA
- J. Shin Introducing control flow into vectorized code PACT '07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques, PACT 2007 2007 IEEE Computer Society Washington, DC, USA 280 291
- (2007) PACT '07: Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques, PACT 2007 , pp. 280-291
- Shin, J.¹

35
- 70449680229
- Automatically translating a general purpose C++ image processing library for GPUs
- J.L.T. Cornwall, O. Beckmann, P.H.J. Kelly, Automatically translating a general purpose C++ image processing library for GPUs, in: Proceedings of the Workshop on Performance Optimisation for High-Level Languages and Libraries, POHLL, 2006, p. 381.
- (2006) Proceedings of the Workshop on Performance Optimisation for High-Level Languages and Libraries, POHLL , pp. 381
- Cornwall, J.L.T.¹ Beckmann, O.² Kelly, P.H.J.³

36
- 84875224174
- Astex, http://www.irisa.fr/caps/projects/Astex.
- Astex

37
- 84875209859
- Partitioning programs for automatically exploiting GPU
- E. Petit, S. Matz, Francois, Partitioning programs for automatically exploiting GPU, in: SC'06 Workshop: General-Purpose GPU Computing: Practice And Experience. http://www.gpgpu.org/sc2006/workshop/INRIA-GPU-partitioning.pdf.
- SC'06 Workshop: General-Purpose GPU Computing: Practice and Experience
- Petit, E.¹ Matz Francois, S.²

38
- 33745147897
- Loop parallelisation for the Jikes RVM
- J. Zhao, I. Rogers, C. Kirkham, I. Watson, Loop parallelisation for the Jikes RVM, in: PDCAT '05: Proceedings of the Sixth International Conference on Parallel and Distributed Computing Applications and Technologies, 2005, pp. 35-39.
- (2005) PDCAT '05: Proceedings of the Sixth International Conference on Parallel and Distributed Computing Applications and Technologies , pp. 35-39
- Zhao, J.¹ Rogers, I.² Kirkham, C.³ Watson, I.⁴

39
- 84875222728
- Optimizing chip multiprocessor work distribution using dynamic compilation
- J. Zhao, M. Horsnell, I. Rogers, A. Dinn, C. Kirkham, I. Watson, Optimizing chip multiprocessor work distribution using dynamic compilation, in: Proceedings of Euro-Par, 2007, pp. 28-31.
- (2007) Proceedings of Euro-Par , pp. 28-31
- Zhao, J.¹ Horsnell, M.² Rogers, I.³ Dinn, A.⁴ Kirkham, C.⁵ Watson, I.⁶

40
- 84875208846
- The Jamaica project, http://intranet.cs.man.ac.uk/apt/projects/jamaica/.
- The Jamaica Project

41
- 10644248153
- Brook for GPUs: Stream computing on graphics hardware
- I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan Brook for GPUs: stream computing on graphics hardware ACM Trans. Graph. 23 3 2004 777 786
- (2004) ACM Trans. Graph. , vol.23 , Issue.3 , pp. 777-786
- Buck, I.¹ Foley, T.² Horn, D.³ Sugerman, J.⁴ Fatahalian, K.⁵ Houston, M.⁶ Hanrahan, P.⁷

42
- 70449693008
- Ph.D. thesis, Universität des Saarlandes, August 2007
- P. Lucas, CGiS: High-level data-parallel GPU programming, Ph.D. thesis, Universität des Saarlandes, August 2007.
- CGiS: High-level Data-parallel GPU Programming
- Lucas, P.¹

43
- 33947595619
- Accelerator: Using data parallelism to program GPUs for general-purpose uses
- ACM Press New York, NY, USA
- D. Tarditi, S. Puri, and J. Oglesby Accelerator: using data parallelism to program GPUs for general-purpose uses ASPLOS-XII: Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems 2006 ACM Press New York, NY, USA 325 335
- (2006) ASPLOS-XII: Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 325-335
- Tarditi, D.¹ Puri, S.² Oglesby, J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.