SCOPUS Information Search Platform

Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT

Volume , Issue , 2009, Pages 327-337

Polyhedral-model guided loop-nest auto-vectorization

(5) Trifunovic, Konrad (b); Nuzman, Dorit (a); Cohen, Albert (b); Zaks, Ayal (a); Rosen, Ira (a)

a IBM HAIFA RESEARCH LAB (Israel)

b INRIA SACLAY (France)

Author keywords

[No Author keywords available]

Indexed keywords

ACCURATE PREDICTION; COST MODELS; LOOP TRANSFORMATION; LOOP VECTORIZATION; MULTIPLE TRANSFORMATION; OPTIMIZING COMPILERS; PERFORMANCE IMPACT; POLYHEDRAL FRAMEWORK; POLYHEDRAL MODELS; POLYHEDRAL REPRESENTATION; PREDICTIVE MODELLING; VECTORIZATION;

PARALLEL ARCHITECTURES;

EID: 70449626135 PISSN: 1089795X EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/PACT.2009.18 Document Type: Conference Paper

Times cited : (105)

References (26)

1
- 32844466554
- An integrated simdization framework using virtual vectors
- P. Wu, A. E. Eichenberger, A. Wang, and P. Zhao, "An integrated Simdization framework using virtual vectors," in ICS, 2005.
- (2005) ICS
- Wu, P.¹ Eichenberger, A.E.² Wang, A.³ Zhao, P.⁴

2
- 24144474794
- The software vectorization handbook
- Intel Press
- A. J. C. Bik, The Software Vectorization Handbook. Applying Multimedia Extensions for Maximum Performance. Intel Press, 2004.
- (2004) Applying Multimedia Extensions for Maximum Performance
- Bik, A.J.C.¹

3
- 0344908850
- Automatic intra-register vectorization for the intel architecture
- A. J. C. Bik, M. Girkar, P. M. Grey, and X. Tian, "Automatic intra-register vectorization for the Intel architecture," IJPP, vol. 30, no. 2, pp. 65-98, 2002.
- (2002) IJPP , vol.30 , Issue.2 , pp. 65-98
- Bik, A.J.C.¹ Girkar, M.² Grey, P.M.³ Tian, X.⁴

4
- 37149019455
- Autovectorization in GCC - Two years later
- June
- D. Nuzman and A. Zaks, "Autovectorization in GCC - two years later," in the GCC Developer's summit, June 2006.
- (2006) GCC Developer's summit
- Nuzman, D.¹ Zaks, A.²

5
- 33646554301
- Superword-level parallelism in the presence of control flow
- March
- J. Shin, M. Hall, and J. Chame, "Superword-level parallelism in the presence of control flow," in CGO, March 2005.
- (2005) CGO
- Shin, J.¹ Hall, M.² Chame, J.³

6
- 63549093768
- Outer-loop vectorization - Revisited for short SIMD architectures
- October
- D. Nuzman and A. Zaks, "Outer-loop vectorization - revisited for short SIMD architectures," in PACT, October 2008.
- (2008) PACT
- Nuzman, D.¹ Zaks, A.²

7
- 0037952146
- Morgan Kaufmann Publishers
- R. Allen and K. Kennedy, Optimizing Compilers for Modern Architectures. Morgan Kaufmann Publishers, 2001.
- (2001) Optimizing Compilers for Modern Architectures
- Allen, R.¹ Kennedy, K.²

8
- 74049164978
- A practical automatic polyhedral parallelization and locality optimization system
- Jun.
- U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, "A practical automatic polyhedral parallelization and locality optimization system," in PLDI, Jun. 2008.
- (2008) PLDI
- Bondhugula, U.¹ Hartono, A.² Ramanujam, J.³ Sadayappan, P.⁴

9
- 57349167317
- Iterative optimization in the polyhedral model: Part II, multidimensional time
- Jun.
- L.-N. Pouchet, C. Bastoul, A. Cohen, and J. Cavazos, "Iterative optimization in the polyhedral model: Part II, multidimensional time," in PLDI, Jun. 2008.
- (2008) PLDI
- Pouchet, L.-N.¹ Bastoul, C.² Cohen, A.³ Cavazos, J.⁴

10
- 0023438847
- Automatic translation of fortran programs to vector form
- R. Allen and K. Kennedy, "Automatic translation of fortran programs to vector form," ACM Tr. on Prog. Lang. and Systems, vol. 9, no. 4, pp. 491-542, 1987.
- (1987) ACM Tr. on Prog. Lang. and Systems , vol.9 , Issue.4 , pp. 491-542
- Allen, R.¹ Kennedy, K.²

11
- 0003927035
- Addison Wesley
- M. Wolfe, High Performance Compilers for Parallel Computing. Addison Wesley, 1996.
- (1996) High Performance Compilers for Parallel Computing
- Wolfe, M.¹

12
- 84948740064
- Compiler-controlled caching in superword register files for multimedia extension architectures
- September
- J. Shin, J. Chame, and M. W. Hall, "Compiler-controlled caching in superword register files for multimedia extension architectures," in PACT, September 2002.
- (2002) PACT
- Shin, J.¹ Chame, J.² Hall, M.W.³

13
- 33746034953
- Auto-vectorization of interleaved data for simd
- D. Nuzman, I. Rosen, and A. Zaks, "Auto-vectorization of interleaved data for simd," in PLDI, 2006.
- (2006) PLDI
- Nuzman, D.¹ Rosen, I.² Zaks, A.³

14
- 0028591436
- (Pen)-ultimate tiling?
- May
- P. Boulet, A. Darte, T. Risset, and Y. Robert, "(Pen)-ultimate tiling?" in IEEE Scalable High-Performance Computing Conf., May 1994.
- (1994) IEEE Scalable High-Performance Computing Conf.
- Boulet, P.¹ Darte, A.² Risset, T.³ Robert, Y.⁴

15
- 70449667592
- Compile-time based performance prediction
- C. Cascaval, L. Derose, D. A. Padua, and D. A. Reed, "Compile-time based performance prediction," in LCPC, 1999.
- (1999) LCPC
- Cascaval, C.¹ Derose, L.² Padua, D.A.³ Reed, D.A.⁴

16
- 0037340135
- Probabilistic miss equations: Evaluating memory hierarchy performance
- B. B. Fraguela, R. Doallo, and E. L. Zapata, "Probabilistic miss equations: Evaluating memory hierarchy performance," IEEE Trans. Comput., vol. 52, no. 3, pp. 321-336, 2003.
- (2003) IEEE Trans. Comput. , vol.52 , Issue.3 , pp. 321-336
- Fraguela, B.B.¹ Doallo, R.² Zapata, E.L.³

17
- 84958731989
- Array expansion
- St. Malo, France, Jul.
- P. Feautrier, "Array expansion," in ICS, St. Malo, France, Jul. 1988.
- (1988) ICS
- Feautrier, P.¹

18
- 33746593747
- Semi-automatic composition of loop transformations for deep parallelism and memory hierarchies
- Jun. special issue on Microgrids.
- S. Girbal, N. Vasilache, C. Bastoul, A. Cohen, D. Parello, M. Sigler, and O. Temam, "Semi-automatic composition of loop transformations for deep parallelism and memory hierarchies," Intl. J. of Parallel Programming, vol. 34, no. 3, pp. 261-317, Jun. 2006, special issue on Microgrids.
- (2006) Intl. J. of Parallel Programming , vol.34 , Issue.3 , pp. 261-317
- Girbal, S.¹ Vasilache, N.² Bastoul, C.³ Cohen, A.⁴ Parello, D.⁵ Sigler, M.⁶ Temam, O.⁷

19
- 0001448065
- Some efficient solutions to the affine scheduling problem, part II, multidimensional time
- Dec.
- P. Feautrier, "Some efficient solutions to the affine scheduling problem, part II, multidimensional time," Intl. J. of Parallel Programming, vol. 21, no. 6, pp. 389-420, Dec. 1992.
- (1992) Intl. J. of Parallel Programming , vol.21 , Issue.6 , pp. 389-420
- Feautrier, P.¹

20
- 35048864273
- see also Part I
- see also Part I, one dimensional time, 21(5):315-348.
- One Dimensional Time , vol.21 , Issue.5 , pp. 315-348

21
- 0004261309
- University of Maryland, Tech. Rep. CS-TR-3193
- W. Kelly and W. Pugh, "A framework for unifying reordering transformations," University of Maryland, Tech. Rep. CS-TR-3193, 1993.
- (1993) A Framework for Unifying Reordering Transformations
- Kelly, W.¹ Pugh, W.²

22
- 0030645995
- Maximizing parallelism and minimizing synchronization with affine transforms
- Paris, Jan.
- A. Lim and M. Lam, "Maximizing parallelism and minimizing synchronization with affine transforms," in PoPL'24, Paris, Jan. 1997, pp. 201-214.
- (1997) PoPL'24 , pp. 201-214
- Lim, A.¹ Lam, M.²

23
- 10444289646
- Code generation in the polyhedral model is easier than you think
- Sep.
- C. Bastoul, "Code generation in the polyhedral model is easier than you think," in PACT, Sep. 2004.
- (2004) PACT
- Bastoul, C.¹

24
- 63549147948
- C. G. Lee, "UTDSP benchmarks," http://www.eecg.toronto.edu/~corinna/DSP/infrastructure/UTDSP.html, 1998.
- (1998) UTDSP Benchmarks
- Lee, C.G.¹

25
- 84869674013
- [Online]. Available
- "European FP6 ACOTES project (Advanced Compiler Technologies for Embedded Streaming)." [Online]. Available: https://alchemy.saclay.inria.fr/ACOTES/
- European FP6 ACOTES Project (Advanced Compiler Technologies for Embedded Streaming)

26
- 84869664374
- [Online]. Available
- "European HiPEAC Network of Excellence (High-Performance Embedded Architecture and Compilation)." [Online]. Available: www.hipeac.net
- European HiPEAC Network of Excellence (High-Performance Embedded Architecture and Compilation)

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.