-
1
-
-
84886006847
-
-
F. Agakov, et al. Using machine learning to focus iterative optimization, in: Proc. 4th Annual International Symposium on Code Generation and Optimization, 2006, pp. 295-305
-
F. Agakov, et al. Using machine learning to focus iterative optimization, in: Proc. 4th Annual International Symposium on Code Generation and Optimization, 2006, pp. 295-305
-
-
-
-
2
-
-
84987205012
-
-
F.E. Allen, M. Burke, P. Charles, R. Cytron, J. Ferrante, An overview of the PTRAN analysis system for multiprocessing, in: Proc. 1st International Conference on Supercomputing, 1987, pp. 194-211
-
F.E. Allen, M. Burke, P. Charles, R. Cytron, J. Ferrante, An overview of the PTRAN analysis system for multiprocessing, in: Proc. 1st International Conference on Supercomputing, 1987, pp. 194-211
-
-
-
-
3
-
-
85033444523
-
-
J. Allen, K. Kennedy, Automatic loop interchange, in: Proc. ACM SIGPLAN '84 Symposium on Compiler Construction, 1984, pp. 233-246
-
J. Allen, K. Kennedy, Automatic loop interchange, in: Proc. ACM SIGPLAN '84 Symposium on Compiler Construction, 1984, pp. 233-246
-
-
-
-
4
-
-
0001160585
-
PFC: A program to convert Fortran to parallel form
-
Hwang K. (Ed), IEEE Computer Society Press, Los Alamitos, CA
-
Allen J.R., and Kennedy K. PFC: A program to convert Fortran to parallel form. In: Hwang K. (Ed). Supercomputers: Design and Applications (1984), IEEE Computer Society Press, Los Alamitos, CA 186-203
-
(1984)
Supercomputers: Design and Applications
, pp. 186-203
-
-
Allen, J.R.1
Kennedy, K.2
-
5
-
-
79959456077
-
-
M.M. Baskaran, et al. Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories, in: Proc. 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008, pp. 1-10
-
M.M. Baskaran, et al. Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories, in: Proc. 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008, pp. 1-10
-
-
-
-
6
-
-
51449109831
-
-
I. Buck, Brook Specification v0.2, October 2003
-
I. Buck, Brook Specification v0.2, October 2003
-
-
-
-
7
-
-
0025447908
-
-
D. Callahan, S. Carr, K. Kennedy, Improving register allocation for subscripted variables, in: Proc. SIGPLAN 1990 Conference on Program Language Design and Implementation, 1990, pp. 53-65
-
D. Callahan, S. Carr, K. Kennedy, Improving register allocation for subscripted variables, in: Proc. SIGPLAN 1990 Conference on Program Language Design and Implementation, 1990, pp. 53-65
-
-
-
-
8
-
-
4644226058
-
-
Y. Chou, B. Fahs, S. Abraham, Microarchitecture optimizations for exploiting memory-level parallelism, in: Proc. 31th Annual International Symposium on Computer Architecture, 2004, pp. 76-88
-
Y. Chou, B. Fahs, S. Abraham, Microarchitecture optimizations for exploiting memory-level parallelism, in: Proc. 31th Annual International Symposium on Computer Architecture, 2004, pp. 76-88
-
-
-
-
9
-
-
78651269052
-
-
K. Fatahalian, J. Sugerman, P. Hanrahan, Understanding the efficiency of GPU algorithms for matrix-matrix multiplication, in: Proc. 2004 ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, 2004, pp. 133-137
-
K. Fatahalian, J. Sugerman, P. Hanrahan, Understanding the efficiency of GPU algorithms for matrix-matrix multiplication, in: Proc. 2004 ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, 2004, pp. 133-137
-
-
-
-
10
-
-
0347468637
-
-
S. Ghosh, M. Martonosi, S. Malik, Precise miss analysis for program transformations with caches of arbitrary associativity, in: Proc. 8th International Conference on Architectural Support for Programming Languages and Operating Systems, 1998, pp. 228-239
-
S. Ghosh, M. Martonosi, S. Malik, Precise miss analysis for program transformations with caches of arbitrary associativity, in: Proc. 8th International Conference on Architectural Support for Programming Languages and Operating Systems, 1998, pp. 228-239
-
-
-
-
11
-
-
34548292052
-
-
N.K. Govindaraju, S. Larsen, J. Gray, D. Manocha, A memory model for scientific algorithms on graphics processors, in: Proc. 2006 ACM/IEEE Conference on Supercomputing, No. 89, 2006, pp. 89-99
-
N.K. Govindaraju, S. Larsen, J. Gray, D. Manocha, A memory model for scientific algorithms on graphics processors, in: Proc. 2006 ACM/IEEE Conference on Supercomputing, No. 89, 2006, pp. 89-99
-
-
-
-
13
-
-
33746686278
-
-
M. Haneda, P.M.W. Knijnenburg, H.A.G. Wijshoff, Automatic selection of compiler options using non-parametric inferential statistics, in: Proc. 14th International Conference on Parallel Architectures and Compilation Techniques, 2005, pp. 123-132
-
M. Haneda, P.M.W. Knijnenburg, H.A.G. Wijshoff, Automatic selection of compiler options using non-parametric inferential statistics, in: Proc. 14th International Conference on Parallel Architectures and Compilation Techniques, 2005, pp. 123-132
-
-
-
-
14
-
-
33746706203
-
-
C. Jiang, M. Snir, Automatic tuning matrix multiplication performance on graphics hardware, in: Proc. 14th International Conference on Parallel Architecture and Compilation Techniques, 2005, pp. 185-196
-
C. Jiang, M. Snir, Automatic tuning matrix multiplication performance on graphics hardware, in: Proc. 14th International Conference on Parallel Architecture and Compilation Techniques, 2005, pp. 185-196
-
-
-
-
15
-
-
36949033619
-
-
D. Jimenez-Gonzalez, X. Martorell, A. Ramirez, Performance analysis of Cell Broadband Engine for high memory bandwidth applications, in: Proc. IEEE International Symposium on Performance Analysis of Systems and Software, 2007, pp. 210-219
-
D. Jimenez-Gonzalez, X. Martorell, A. Ramirez, Performance analysis of Cell Broadband Engine for high memory bandwidth applications, in: Proc. IEEE International Symposium on Performance Analysis of Systems and Software, 2007, pp. 210-219
-
-
-
-
17
-
-
0034512401
-
-
T. Kisuki, P.M.W. Knijnenburg, M.F.P. O'Boyle, Combined selection of tile sizes and unroll factors using iterative compilation, in: Proc. 2000 International Conference on Parallel Architectures and Compilation Techniques, 2000, pp. 237-248
-
T. Kisuki, P.M.W. Knijnenburg, M.F.P. O'Boyle, Combined selection of tile sizes and unroll factors using iterative compilation, in: Proc. 2000 International Conference on Parallel Architectures and Compilation Techniques, 2000, pp. 237-248
-
-
-
-
18
-
-
0021570530
-
-
D.J. Kuck, et al. The effects of program restructuring, algorithm change, and architecture choice on program performance, in: Proc. 13th International Conference on Parallel Processing, 1984, pp. 129-138
-
D.J. Kuck, et al. The effects of program restructuring, algorithm change, and architecture choice on program performance, in: Proc. 13th International Conference on Parallel Processing, 1984, pp. 129-138
-
-
-
-
19
-
-
34547684388
-
-
P.A. Kulkarni, D.B. Whalley, G.S. Tyson, J.W. Davidson, Evaluation heuristic optimization phase order search algorithms, in: Proc. 2007 International Symposium on Code Generation and Optimization, 2007, pp. 157-169
-
P.A. Kulkarni, D.B. Whalley, G.S. Tyson, J.W. Davidson, Evaluation heuristic optimization phase order search algorithms, in: Proc. 2007 International Symposium on Code Generation and Optimization, 2007, pp. 157-169
-
-
-
-
20
-
-
51449099053
-
-
J. Nickolls, I. Buck, NVIDIA CUDA software and GPU parallel computing architecture, Microprocessor Forum, May 2007
-
J. Nickolls, I. Buck, NVIDIA CUDA software and GPU parallel computing architecture, Microprocessor Forum, May 2007
-
-
-
-
22
-
-
1542396679
-
SPIRAL: A generator for platform-adapted libraries of signal processing algorithms
-
(special issue on Automatic Performance Tuning)
-
Püschel M., et al. SPIRAL: A generator for platform-adapted libraries of signal processing algorithms. Journal of High Performance Computing and Applications 18 1 (2004) 21-45 (special issue on Automatic Performance Tuning)
-
(2004)
Journal of High Performance Computing and Applications
, vol.18
, Issue.1
, pp. 21-45
-
-
Püschel, M.1
-
24
-
-
51449122095
-
-
S. Ryoo, Program optimization strategies for many-core processors, Ph.D. thesis, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, 2008
-
S. Ryoo, Program optimization strategies for many-core processors, Ph.D. thesis, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, 2008
-
-
-
-
25
-
-
79959466764
-
-
S. Ryoo, et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA, in: Proc. 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008, pp. 73-82
-
S. Ryoo, et al. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA, in: Proc. 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008, pp. 73-82
-
-
-
-
26
-
-
51449105693
-
-
J.W. Sias, A systematic approach to delivering instruction-level parallelism in EPIC systems, Ph.D. thesis, University of Illinois at Urbana-Champaign, 2005
-
J.W. Sias, A systematic approach to delivering instruction-level parallelism in EPIC systems, Ph.D. thesis, University of Illinois at Urbana-Champaign, 2005
-
-
-
-
27
-
-
33846475612
-
-
D. Tarditi, S. Puri, J. Oglesby, Accelerator: Using data parallelism to program GPUs for general-purpose uses, in: Proc. 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006, pp. 325-335
-
D. Tarditi, S. Puri, J. Oglesby, Accelerator: Using data parallelism to program GPUs for general-purpose uses, in: Proc. 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006, pp. 325-335
-
-
-
-
28
-
-
67650534864
-
-
S. Triantafyllis, M. Vachharajani, N. Vachharajani, D.I. August, Compiler optimization-space exploration, in: Proc. 2003 International Symposium on Code Generation and Optimization, 2003, pp. 204-215
-
S. Triantafyllis, M. Vachharajani, N. Vachharajani, D.I. August, Compiler optimization-space exploration, in: Proc. 2003 International Symposium on Code Generation and Optimization, 2003, pp. 204-215
-
-
-
-
29
-
-
85006879958
-
-
D.N. Truong, F. Bodin, A. Seznec, Improving cache behavior of dynamically allocated data structures, in: Proc. Seventh International Conference on Parallel Architectures and Compilation Techniques, 1998, pp. 322+
-
D.N. Truong, F. Bodin, A. Seznec, Improving cache behavior of dynamically allocated data structures, in: Proc. Seventh International Conference on Parallel Architectures and Compilation Techniques, 1998, pp. 322+
-
-
-
-
30
-
-
0030379246
-
-
M.E. Wolf, D.E. Maydan, D.-K. Chen, Combining loop transformations considering caches and scheduling, in: Proc. 29th Annual ACM/IEEE International Symposium on Microarchitecture, 1996, pp. 274-286
-
M.E. Wolf, D.E. Maydan, D.-K. Chen, Combining loop transformations considering caches and scheduling, in: Proc. 29th Annual ACM/IEEE International Symposium on Microarchitecture, 1996, pp. 274-286
-
-
-
-
31
-
-
51449085846
-
-
M. Wolfe, Iteration space tiling for memory hierarchies, in: Proc. Third SIAM Conference on Parallel Processing for Scientific Computing, 1987, pp. 357-361
-
M. Wolfe, Iteration space tiling for memory hierarchies, in: Proc. Third SIAM Conference on Parallel Processing for Scientific Computing, 1987, pp. 357-361
-
-
-
-
32
-
-
0006424869
-
-
Y. Yamada, J. Gyllenhaal, G. Haab, W.W. Hwu, Data relocation and prefetching for large data sets, in: Proc. 27th Annual ACM/IEEE International Symposium on Microarchitecture, 1994, pp. 118-127
-
Y. Yamada, J. Gyllenhaal, G. Haab, W.W. Hwu, Data relocation and prefetching for large data sets, in: Proc. 27th Annual ACM/IEEE International Symposium on Microarchitecture, 1994, pp. 118-127
-
-
-
-
33
-
-
1442337777
-
-
M. Zhao, B. Childers, M.L. Soffa, Predicting the impact of optimizations for embedded systems, in: Proc. 2003 Conference on Languages, Compilers, and Tools for Embedded Systems, 2003, pp. 1-11
-
M. Zhao, B. Childers, M.L. Soffa, Predicting the impact of optimizations for embedded systems, in: Proc. 2003 Conference on Languages, Compilers, and Tools for Embedded Systems, 2003, pp. 1-11
-
-
-
|