SCOPUS Information Retrieval Platform

Proceedings - International Symposium on Code Generation and Optimization, CGO 2012

2012, Pages 84-93

HELIX: Automatic parallelization of irregular programs for chip multiprocessing

Authors (6): Campanoni, Simone (a); Jones, Timothy (b); Holloway, Glenn (a); Reddi, Vijay Janapa (c); Wei, Gu Yeon (a); Brooks, David (a)

a HARVARD UNIVERSITY (United States)

b UNIVERSITY OF CAMBRIDGE (United Kingdom)

c UNIVERSITY OF TEXAS AT AUSTIN (United States)

Author keywords

[No Author keywords available]

Indexed keywords

AUTOMATIC PARALLELIZATION; CHIP MULTIPROCESSING; CODE OPTIMIZATION; COMMUNICATION COST; DATA DEPENDENCE; HELPER THREAD; OPTIMIZING COMPILERS; PARALLELIZATIONS; PREFETCHES; PROFILE DATA; SEQUENTIAL PROGRAMS; SUCCESSIVE ITERATION; SYNCHRONIZATION SIGNALS;

NETWORK COMPONENTS; PROGRAM COMPILERS;

OPTIMIZATION;

EID: 84863473415 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2259016.2259028 Document Type: Conference Paper

Times cited : (72)

References (42)

1
- 34447569672
- Intel 64 and IA-32 Architectures Software Developer's Manual
- Intel 64 and IA-32 Architectures Software Developer's Manual. Specification, 2010.
- (2010) Intel 64 and IA-32 Architectures Software Developer's Manual

2
- 67650525989
- Shared memory consistency models: A tutorial
- S. Adve and K. Gharachorloo. Shared memory consistency models: A tutorial. IEEE Computer, 1995.
- (1995) IEEE Computer
- Adve, S.¹ Gharachorloo, K.²

3
- 0010355322
- Perfect pipelining: A new loop parallelization technique
- A. Aiken and A. Nicolau. Perfect pipelining: A new loop parallelization technique. ESOP, 1988.
- (1988) ESOP
- Aiken, A.¹ Nicolau, A.²

4
- 0037952146
- Optimizing compilers for modern architectures
- J. Allen and K. Kennedy. Optimizing compilers for modern architectures. Morgan Kaufmann, 2002.
- (2002) Optimizing Compilers for Modern Architectures
- Allen, J.¹ Kennedy, K.²

5
- 85060036181
- Validity of the single processor approach to achieving large scale computing capabilities
- G. Amdahl. Validity of the single processor approach to achieving large scale computing capabilities. Proc. Spring Joint Computer Conference, 1967.
- (1967) Proc. Spring Joint Computer Conference
- Amdahl, G.¹

6
- 0004242324
- Modern Compiler Implementation in Java, 2nd edition
- A. Appel. Modern Compiler Implementation in Java, 2nd edition. 2002.
- (2002) Modern Compiler Implementation in Java
- Appel, A.¹

7
- 41349123319
- Revisiting the sequential programming model for the multicore era
- M. Bridges et al. Revisiting the sequential programming model for the multicore era. IEEE Micro, 2008.
- (2008) IEEE Micro
- Bridges, M.¹

8
- 76949106140
- A highly flexible, parallel virtual machine: Design and experience of ILDJIT
- S. Campanoni et al. A highly flexible, parallel virtual machine: Design and experience of ILDJIT. Softw. Pract. Exper., 2010.
- (2010) Softw. Pract. Exper.
- Campanoni, S.¹

9
- 0032662989
- Simultaneous subordinate microthreading (SSMT)
- R. Chappell et al. Simultaneous subordinate microthreading (SSMT). ISCA, 1999.
- (1999) ISCA
- Chappell, R.¹

10
- 84863438531
- D-K. Chen and P-C. Yew. An empirical study on DOACROSS loops. 1991.
- (1991) An Empirical Study on DOACROSS Loops
- Chen, D.-K.¹ Yew, P.-C.²

11
- 0012526362
- Statement re-ordering for DOACROSS loops
- D-K. Chen and P-C. Yew. Statement re-ordering for DOACROSS loops. ICPP, 1994.
- (1994) ICPP
- Chen, D.-K.¹ Yew, P.-C.²

12
- 0030149070
- On effective execution of nonuniform DOACROSS loops
- D-K. Chen and P-C. Yew. On effective execution of nonuniform DOACROSS loops. IEEE Transactions on Parallel and Distributed Systems, 1996.
- (1996) IEEE Transactions on Parallel and Distributed Systems
- Chen, D.-K.¹ Yew, P.-C.²

13
- 0032639855
- Redundant synchronization elimination for DOACROSS loops
- D-K. Chen and P-C. Yew. Redundant synchronization elimination for DOACROSS loops. Parallel and Distributed Systems, IEEE Transactions on, 10(5), May 1999.
- (1999) Parallel and Distributed Systems, IEEE Transactions on , vol.10 , Issue.5
- Chen, D.-K.¹ Yew, P.-C.²

14
- 84863453906
- R. Costa et al. Gcc4cli. http://gcc.gnu.org/projects/cli.html.
- Gcc4cli
- Costa, R.¹

15
- 0022893044
- DOACROSS: Beyond vectorization for multiprocessors
- R. Cytron. DOACROSS: Beyond vectorization for multiprocessors. ICPP, 1986.
- (1986) ICPP
- Cytron, R.¹

16
- 35248874700
- Speculative parallelization of partially parallel loops
- F. Dang and L. Rauchwerger. Speculative parallelization of partially parallel loops. In 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers, pages 285-299, 2000.
- (2000) 5th International Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers
- Dang, F.¹ Rauchwerger, L.²

17
- 84863463722
- B. Guo et al. Practical and accurate low-level pointer analysis. 2005.
- (2005) Practical and Accurate Low-level Pointer Analysis
- Guo, B.¹

18
- 0024012163
- Reevaluating Amdahl's law
- J. Gustafson. Reevaluating Amdahl's law. Commun. ACM, 31, May 1988.
- (1988) Commun. ACM
- Gustafson, J.¹

19
- 77954006048
- Decoupled software pipelining creates parallelization opportunities
- J. Huang et al. Decoupled software pipelining creates parallelization opportunities. CGO, 2010.
- (2010) CGO
- Huang, J.¹

20
- 74349096277
- Parallelization of DOALL and DOACROSS loops - A survey
- A. Hurson et al. Parallelization of DOALL and DOACROSS loops - a survey. Advances in Computers, 1997.
- (1997) Advances in Computers
- Hurson, A.¹

21
- 84863484683
- VTune. http://software.intel.com/en-us/intel-vtune.
- VTune

22
- 3042569221
- Physical experimentation with prefetching helper threads on Intel's hyper-threaded processors
- D. Kim et al. Physical experimentation with prefetching helper threads on Intel's hyper-threaded processors. CGO, 2004.
- (2004) CGO
- Kim, D.¹

23
- 79951708803
- Scalable speculative parallelization on commodity clusters
- H. Kim et al. Scalable speculative parallelization on commodity clusters. MICRO, 2010.
- (2010) MICRO
- Kim, H.¹

24
- 0030400452
- A loop allocation policy for DOACROSS loops
- J. Lim et al. A loop allocation policy for DOACROSS loops. SPDP, 1996.
- (1996) SPDP
- Lim, J.¹

25
- 72049107238
- Optimal loop parallelization for maximizing iteration-level parallelism
- D. Liu et al. Optimal loop parallelization for maximizing iteration-level parallelism. CASES, 2009.
- (2009) CASES
- Liu, D.¹

26
- 0031199614
- Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading
- J. Lo et al. Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading. TCS, 1997.
- (1997) TCS
- Lo, J.¹

27
- 81455150594
- Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors
- C-K. Luk. Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors. SIGARCH Comp. Arch. News, 2001.
- (2001) SIGARCH Comp. Arch. News
- Luk, C.-K.¹

28
- 70449709551
- Synchronization optimizations for efficient execution on multi-cores
- A. Nicolau et al. Synchronization optimizations for efficient execution on multi-cores. ICS, 2009.
- (2009) ICS
- Nicolau, A.¹

29
- 67650096789
- Techniques for efficient placement of synchronization primitives
- A. Nicolau et al. Techniques for efficient placement of synchronization primitives. PPoPP, 2009.
- (2009) PPoPP
- Nicolau, A.¹

30
- 33749375700
- Automatic thread extraction with decoupled software pipelining
- G. Ottoni et al. Automatic thread extraction with decoupled software pipelining. MICRO, 2005.
- (2005) MICRO
- Ottoni, G.¹

31
- 84863453911
- Exposing speculative thread parallelism in SPEC2000
- M. Prabhu and K. Olukotun. Exposing speculative thread parallelism in SPEC2000. PPoPP, 2000.
- (2000) PPoPP
- Prabhu, M.¹ Olukotun, K.²

32
- 77952281906
- Speculative parallelization using software multi-threaded transactions
- A. Raman et al. Speculative parallelization using software multi-threaded transactions. ASPLOS, 2010.
- (2010) ASPLOS
- Raman, A.¹

33
- 43449113286
- Parallel-Stage decoupled software pipelining
- E. Raman et al. Parallel-Stage decoupled software pipelining. CGO, 2008.
- (2008) CGO
- Raman, E.¹

34
- 51149117060
- Performance scalability of decoupled software pipelining
- R. Rangan et al. Performance scalability of decoupled software pipelining. TACO, 2008.
- (2008) TACO
- Rangan, R.¹

35
- 84863463725
- Spin-block synchronization algorithm in the shared memory multiprocessor system
- J. Seung-Ju and K. Gil-Yong. Spin-block synchronization algorithm in the shared memory multiprocessor system. SIGOPS Oper. Syst. Rev., 1994.
- (1994) SIGOPS Oper. Syst. Rev.
- Seung-Ju, J.¹ Gil-Yong, K.²

36
- 84863431682
- Efficient DOACROSS execution on distributed shared-memory multiprocessors
- H-M. Su and P-C. Yew. Efficient DOACROSS execution on distributed shared-memory multiprocessors. ACM/IEEE conference on Supercomputing, 1991.
- (1991) ACM/IEEE Conference on Supercomputing
- Su, H.-M.¹ Yew, P.-C.²

37
- 47349118686
- A practical approach to exploiting coarse-grained pipeline parallelism in C programs
- W. Thies et al. A practical approach to exploiting coarse-grained pipeline parallelism in C programs. MICRO, 2007.
- (2007) MICRO
- Thies, W.¹

38
- 78650659831
- Towards a holistic approach to auto-parallelization
- G. Tournavitis et al. Towards a holistic approach to auto-parallelization. PLDI, 2009.
- (2009) PLDI
- Tournavitis, G.¹

39
- 41349089872
- Speculative decoupled software pipelining
- N. Vachharajani et al. Speculative decoupled software pipelining. PACT, 2007.
- (2007) PACT
- Vachharajani, N.¹

40
- 0035335764
- Time stamp algorithms for runtime parallelization of DOACROSS loops with dynamic dependences
- C.-Z. Xu and V. Chaudhary. Time stamp algorithms for runtime parallelization of DOACROSS loops with dynamic dependences. TPDS, 2001.
- (2001) TPDS
- Xu, C.-Z.¹ Chaudhary, V.²

41
- 57749168614
- Uncovering hidden loop level parallelism in sequential applications
- H. Zhong et al. Uncovering hidden loop level parallelism in sequential applications. HPCA, 2008.
- (2008) HPCA
- Zhong, H.¹

42
- 70449652981
- Exploiting parallelism with dependence-aware scheduling
- X. Zhuang et al. Exploiting parallelism with dependence-aware scheduling. In PACT, pages 193-202, 2009.
- (2009) PACT , pp. 193-202
- Zhuang, X.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.