SCOPUS information retrieval platform

International Journal of High Performance Computing Applications

Volume 18, Issue 1, 2004, Pages 65-94

Statistical models for empirical search-based performance tuning

Authors (3): Vuduc, Richard (a); Demmel, James W. (a); Bilmes, Jeff A. (b)

(a) University of California (United States)

(b) University of Washington (United States)

Author keywords

Algorithm selection; Automatic performance tuning; Early stopping; Feedback directed optimization; Matrix multiplication; Performance distribution; Performance optimization; Software engineering; Support vector method

Indexed keywords

ALGORITHMS; CODES (SYMBOLS); COMPUTER ARCHITECTURE; COMPUTER HARDWARE; DIGITAL LIBRARIES; FAST FOURIER TRANSFORMS; LINEAR ALGEBRA; OPTIMIZATION; PROBLEM SOLVING; PROGRAM COMPILERS; STATISTICAL METHODS; VECTORS;

ALGORITHM SELECTION; EMPIRICAL SEARCH-BASED PERFORMANCE TUNING; FEEDBACK-DIRECTED OPTIMIZATION; SUPPORT VECTOR METHOD;

COMPUTATION THEORY;

EID: 1542710758 PISSN: 10943420 EISSN: None Source Type: Journal
DOI: 10.1177/1094342004041293 Document Type: Article

Times cited : (83)

References (97)

1
- 0010074485
- LAWRA - Linear algebra with recursive algorithms
- September
- Andersen, B.S., Gustavson, F., Karaivanov, A., Wasniewski, J., and Yalamov, P. Y. September 1999. LAWRA - Linear algebra with recursive algorithms. In Proceedings of the Conference on Parallel Processing and Applied Mathematics, Kazimierz Dolny, Poland.
- (1999) Proceedings of the Conference on Parallel Processing and Applied Mathematics, Kazimierz Dolny, Poland
- Andersen, B.S.¹ Gustavson, F.² Karaivanov, A.³ Wasniewski, J.⁴ Yalamov, P.Y.⁵

2
- 3943059676
- Efficient sorting using registers and caches
- Arge, L., Chase, J., Vitter, J. S., and Wickremesinghe, R. 2001. Efficient sorting using registers and caches. ACM Journal on Experimental Algorithmics 6:1-18.
- (2001) ACM Journal on Experimental Algorithmics , vol.6 , pp. 1-18
- Arge, L.¹ Chase, J.² Vitter, J.S.³ Wickremesinghe, R.⁴

3
- 81455138402
- Adaptive optimization in the Jalapeño JVM: The controller's analytical model
- December
- Arnold, M., Fink, S., Grove, D., Hind, M., and Sweeney, P. F. December 2000. Adaptive optimization in the Jalapeño JVM: The controller's analytical model. In MICRO-33: Third ACM Workshop on Feedback-Directed Dynamic Optimization, Monterey, CA.
- (2000) MICRO-33: Third ACM Workshop on Feedback-Directed Dynamic Optimization, Monterey, CA
- Arnold, M.¹ Fink, S.² Grove, D.³ Hind, M.⁴ Sweeney, P.F.⁵

4
- 0030396393
- Efficient path profiling
- December
- Ball, T. and Larus, J. R. December 1996. Efficient path profiling. In Proceedings of MICRO 96, Paris, France, pp. 46-57.
- (1996) Proceedings of MICRO 96, Paris, France , pp. 46-57
- Ball, T.¹ Larus, J.R.²

5
- 1542706843
- Feedback-directed data cache optimizations for the x86
- November
- Barnes, R. November 1999. Feedback-directed data cache optimizations for the x86. In Proceedings of the 32nd Annual International Symposium on Microarchitecture, Second Workshop on Feedback-Directed Optimization, Haifa, Israel.
- (1999) Proceedings of the 32nd Annual International Symposium on Microarchitecture, Second Workshop on Feedback-Directed Optimization, Haifa, Israel
- Barnes, R.¹

6
- 85117163262
- A high-level approach to synthesis of high-performance codes for quantum chemistry
- November
- Baumgartner, G., Bernholdt, D. E., Cociorva, D., Harrison, R., Hirata, S., Lam, C.-C., Nooijen, M., Pitzer, R., Ramanujam, J., and Sadayappan, P. November 2002. A high-level approach to synthesis of high-performance codes for quantum chemistry. In Proceedings of the IEEE/ACM Conference on Supercomputing, Baltimore, MD.
- (2002) Proceedings of the IEEE/ACM Conference on Supercomputing, Baltimore, MD
- Baumgartner, G.¹ Bernholdt, D.E.² Cociorva, D.³ Harrison, R.⁴ Hirata, S.⁵ Lam, C.-C.⁶ Nooijen, M.⁷ Pitzer, R.⁸ Ramanujam, J.⁹ Sadayappan, P.¹⁰

7
- 85088338365
- Run-time interprocedural data placement optimization for lazy parallel libraries
- August; Springer-Verlag, Berlin
- Beckmann, O. and Kelly, P. H. J. August 1997. Run-time interprocedural data placement optimization for lazy parallel libraries. In EuroPar, Lecture Notes in Computer Science, Springer-Verlag, Berlin.
- (1997) EuroPar, Lecture Notes in Computer Science
- Beckmann, O.¹ Kelly, P.H.J.²

8
- 0004000490
- Holden-Day, San Francisco, CA
- Bickel, P. J. and Doksum, K. A. 1977. Mathematical Statistics: Basic Ideas and Selected Topics, Holden-Day, San Francisco, CA.
- (1977) Mathematical Statistics: Basic Ideas and Selected Topics
- Bickel, P.J.¹ Doksum, K.A.²

9
- 0242590437
- Advanced compiler optimizations for sparse computations
- Bik, A. J. C. and Wijshoff, H. A. G. 1995. Advanced compiler optimizations for sparse computations. Journal of Parallel and Distributed Computing 31(1):14-24.
- (1995) Journal of Parallel and Distributed Computing , vol.31 , Issue.1 , pp. 14-24
- Bik, A.J.C.¹ Wijshoff, H.A.G.²

10
- 0030661485
- Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
- July
- Bilmes, J., Asanović, K., Chin, C., and Demmel, J. July 1997. Optimizing matrix multiply using PHiPAC: a Portable, High-Performance, ANSI C coding methodology. In Proceedings of the International Conference on Supercomputing, Vienna, Austria.
- (1997) Proceedings of the International Conference on Supercomputing, Vienna, Austria
- Bilmes, J.¹ Asanović, K.² Chin, C.³ Demmel, J.⁴

11
- 0010121870
- October; Technical Report UCB/CSD-98-1020, University of California, Berkeley, CA
- Bilmes, J., Asanović, K., Demmel, J., Lam, D., and Chin, C. October 1998. The PHiPAC v1.0 matrix-multiply distribution, Technical Report UCB/CSD-98-1020, University of California, Berkeley, CA.
- (1998) The PHiPAC V1.0 Matrix-Multiply Distribution
- Bilmes, J.¹ Asanović, K.² Demmel, J.³ Lam, D.⁴ Chin, C.⁵

12
- 84937351638
- Numerical tabulation of the distribution of Kolmogorov's statistic for finite sample size
- Birnbaum, Z. W. 1952. Numerical tabulation of the distribution of Kolmogorov's statistic for finite sample size. Journal of the American Statistical Association 47:425-441.
- (1952) Journal of the American Statistical Association , vol.47 , pp. 425-441
- Birnbaum, Z.W.¹

13
- 84891471315
- Document for the basic linear algebra subprograms (BLAS) standard: BLAS technical forum
- Blackford, S. et al. 2001. Document for the Basic Linear Algebra Subprograms (BLAS) standard: BLAS Technical Forum, http://www.netlib.org/blas/blast-forum.
- Blackford, S.¹

14
- 0029193257
- High-level optimization via automated statistical modeling
- July
- Brewer, E. July 1995. High-level optimization via automated statistical modeling. In Symposium on Parallel Architectures and Algorithms, Santa Barbara, CA.
- (1995) Symposium on Parallel Architectures and Algorithms, Santa Barbara, CA
- Brewer, E.¹

15
- 84943422723
- An infrastructure for adaptive dynamic optimization
- March
- Bruening, D., Garnett, T., and Amarasinghe, S. March 2003. An infrastructure for adaptive dynamic optimization. In Proceedings of the 1st International Symposium on Code Generation and Optimization, San Francisco, CA.
- (2003) Proceedings of the 1st International Symposium on Code Generation and Optimization, San Francisco, CA
- Bruening, D.¹ Garnett, T.² Amarasinghe, S.³

16
- 84964748976
- Compiler blockability of numerical algorithms
- Carr, S. and Kennedy, K. 1992. Compiler blockability of numerical algorithms. In Proceedings of Supercomputing, Minneapolis, MN, pp. 114-124.
- (1992) Proceedings of Supercomputing, Minneapolis, MN , pp. 114-124
- Carr, S.¹ Kennedy, K.²

17
- 0026368758
- Using profile information to assist classic code optimizations
- Chang, P. P., Mahlke, S. A., and Hwu, W. W. 1991. Using profile information to assist classic code optimizations. Software - Practice and Experience 21(12):1301-1321.
- (1991) Software - Practice and Experience , vol.21 , Issue.12 , pp. 1301-1321
- Chang, P.P.¹ Mahlke, S.A.² Hwu, W.W.³

18
- 0034832018
- Exact analysis of the cache behavior of nested loops
- June
- Chatterjee, S., Parker, E., Hanlon, P. J., and Lebeck, A. R. June 2001. Exact analysis of the cache behavior of nested loops. In Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation, Snowbird, UT, pp. 286-297.
- (2001) Proceedings of the ACM SIGPLAN 2001 Conference on Programming Language Design and Implementation, Snowbird, UT , pp. 286-297
- Chatterjee, S.¹ Parker, E.² Hanlon, P.J.³ Lebeck, A.R.⁴

19
- 1542392274
- January; Technical Report UT-CS-03-499, University of Tennessee
- Chen, Z., Dongarra, J., Luszczek, P., and Roche, K. January 2003. Self-adapting software for numerical linear algebra and LAPACK for clusters, Technical Report UT-CS-03-499, University of Tennessee.
- (2003) Self-adapting Software for Numerical Linear Algebra and LAPACK for Clusters
- Chen, Z.¹ Dongarra, J.² Luszczek, P.³ Roche, K.⁴

20
- 16244396196
- Feedback-directed selection and characterization of compiler optimizations
- November
- Chow, K. and Wu, Y. November 1999. Feedback-directed selection and characterization of compiler optimizations. In Second Workshop on Feedback-Directed Optimization, Haifa, Israel.
- (1999) Second Workshop on Feedback-Directed Optimization, Haifa, Israel
- Chow, K.¹ Wu, Y.²

21
- 0003741921
- Houghton-Mifflin, Boston, MA
- Chow, Y. S., Robbins, H., and Siegmund, D. 1971. Great Expectations: The Theory of Optimal Stopping, Houghton-Mifflin, Boston, MA.
- (1971) Great Expectations: The Theory of Optimal Stopping
- Chow, Y.S.¹ Robbins, H.² Siegmund, D.³

22
- 0036679993
- Adaptive optimizing compilers for the 21st century
- Cooper, K. D., Subramanian, D., and Torczon, L. 2002a. Adaptive optimizing compilers for the 21st century. Journal of Supercomputing 23(1):7-22.
- (2002) Journal of Supercomputing , vol.23 , Issue.1 , pp. 7-22
- Cooper, K.D.¹ Subramanian, D.² Torczon, L.³

23
- 1542601855
- Technical Report, Rice University, Houston, TX
- Cooper, K. D., Harvey, T. J., Subramanian, D., and Torczon, L. 2002b. Compilation order matters, Technical Report, Rice University, Houston, TX.
- (2002) Compilation Order Matters
- Cooper, K.D.¹ Harvey, T.J.² Subramanian, D.³ Torczon, L.⁴

24
- 85039587556
- Darcy, J. D. 2002. Finding a fast quicksort implementation for Java, http://www.sonic.net/~jddarcy/Research/cs339-quick-sort.pdf.
- (2002) Finding a Fast Quicksort Implementation for Java
- Darcy, J.D.¹

25
- 0030706481
- Dynamic feedback: An effective technique for adaptive computing
- June
- Diniz, P. and Rinard, M. June 1997. Dynamic feedback: An effective technique for adaptive computing. In Proceedings of Programming Language Design and Implementation, Las Vegas, NV.
- (1997) Proceedings of Programming Language Design and Implementation, Las Vegas, NV
- Diniz, P.¹ Rinard, M.²

26
- 28244496090
- Benchmarking optimization software with performance profiles
- Dolan, E. D. and Moré, J. J. 2002. Benchmarking optimization software with performance profiles. Mathematical Programming 91:201-213.
- (2002) Mathematical Programming , vol.91 , pp. 201-213
- Dolan, E.D.¹ Moré, J.J.²

27
- 0025402476
- A set of level 3 basic linear algebra subprograms
- Dongarra, J., Du Croz, J., Duff, I., and Hammarling, S. 1990. A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software 16(1):1-17.
- (1990) ACM Transactions on Mathematical Software , vol.16 , Issue.1 , pp. 1-17
- Dongarra, J.¹ Du Croz, J.² Duff, I.³ Hammarling, S.⁴

28
- 33750232647
- Self-adapting numerical software and automatic tuning of heuristics
- June
- Dongarra, J. and Eijkhout, V. June 2003. Self-adapting numerical software and automatic tuning of heuristics. In Proceedings of the International Conference on Computational Science, Melbourne, Australia.
- (2003) Proceedings of the International Conference on Computational Science, Melbourne, Australia
- Dongarra, J.¹ Eijkhout, V.²

29
- 0033358624
- Automatic analytic modeling for the estimation of cache misses
- October
- Fraguela, B. B., Doallo, R., and Zapata, E. L. October 1999. Automatic analytic modeling for the estimation of cache misses. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, Newport Beach, CA, pp. 221-231.
- (1999) Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, Newport Beach, CA , pp. 221-231
- Fraguela, B.B.¹ Doallo, R.² Zapata, E.L.³

30
- 0030688479
- Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code
- July
- Frens, J. D. and Wise, D. S. July 1997. Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code. In Proceedings of the 6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Las Vegas, NV, pp. 206-216.
- (1997) Proceedings of the 6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Las Vegas, NV , pp. 206-216
- Frens, J.D.¹ Wise, D.S.²

31
- 0003747665
- MIT Press, Cambridge, MA
- Frey, B. 1998. Graphical Models for Machine Learning and Digital Communications, MIT Press, Cambridge, MA.
- (1998) Graphical Models for Machine Learning and Digital Communications
- Frey, B.¹

32
- 0031636309
- FFTW: An adaptive software architecture for the FFT
- May
- Frigo, M. and Johnson, S. May 1998. FFTW: An adaptive software architecture for the FFT. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Seattle, WA.
- (1998) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Seattle, WA
- Frigo, M.¹ Johnson, S.²

33
- 0033350255
- Cache-oblivious algorithms
- October
- Frigo, M., Leiserson, C. E., Prokop, H., and Ramachandran, S. October 1999. Cache-oblivious algorithms. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science, New York, NY.
- (1999) Proceedings of the 40th Annual Symposium on Foundations of Computer Science, New York, NY
- Frigo, M.¹ Leiserson, C.E.² Prokop, H.³ Ramachandran, S.⁴

34
- 1542392268
- Architecture-cognizant divide and conquer algorithms
- November
- Gatlin, K. S. and Carter, L. November 1999. Architecture-cognizant divide and conquer algorithms. In Proceedings of Supercomputing, Portland, OR.
- (1999) Proceedings of Supercomputing, Portland, OR
- Gatlin, K.S.¹ Carter, L.²

35
- 84955599429
- MPI-2: Extending the message-passing interface
- Springer-Verlag, Berlin
- Geist, A., Gropp, W., Huss-Lederman, S., Lumsdaine, A., Lusk, E., Saphir, W., Skjellum, T., and Snir, M. 1996. MPI-2: extending the message-passing interface. In Proceedings of the 2nd European Conference on Parallel Processing (Euro-Par'96), Lyon, France, Lecture Notes in Computer Science Vol. 1123-1124, Springer-Verlag, Berlin, pp. 128-135. http://www.mpi-forum.org.
- (1996) Proceedings of the 2nd European Conference on Parallel Processing (Euro-Par'96), Lyon, France, Lecture Notes in Computer Science , vol.1123-1124 , pp. 128-135
- Geist, A.¹ Gropp, W.² Huss-Lederman, S.³ Lumsdaine, A.⁴ Lusk, E.⁵ Saphir, W.⁶ Skjellum, T.⁷ Snir, M.⁸

36
- 0001714824
- Cache miss equations: A compiler framework for analyzing and tuning memory behavior
- Ghosh, S., Martonosi, M., and Malik, S. 1999. Cache miss equations: a compiler framework for analyzing and tuning memory behavior. ACM Transactions on Programming Languages and Systems 21(4):703-746.
- (1999) ACM Transactions on Programming Languages and Systems , vol.21 , Issue.4 , pp. 703-746
- Ghosh, S.¹ Martonosi, M.² Malik, S.³

37
- 1542392269
- November; Technical Report TR-2002-55, University of Texas at Austin
- Goto, K. and Van de Geijn, R. November 2002. On reducing TLB misses in matrix multiplication, Technical Report TR-2002-55, University of Texas at Austin.
- (2002) On Reducing TLB Misses in Matrix Multiplication
- Goto, K.¹ Van De Geijn, R.²

38
- 84976736522
- Gprof: A call graph execution profiler
- Graham, S. L., Kessler, P. B., and McKusick, M. K. 1982. gprof: a call graph execution profiler. SIGPLAN Notices 17(6):120-126.
- (1982) SIGPLAN Notices , vol.17 , Issue.6 , pp. 120-126
- Graham, S.L.¹ Kessler, P.B.² McKusick, M.K.³

39
- 0026965653
- Eliminating branches using a superoptimizer and the GNU C compiler
- Granlund, T. and Kenner, R. 1992. Eliminating branches using a superoptimizer and the GNU C compiler. SIGPLAN Notices 27(7):341-352.
- (1992) SIGPLAN Notices , vol.27 , Issue.7 , pp. 341-352
- Granlund, T.¹ Kenner, R.²

40
- 0039435412
- FLAME: Formal linear algebra methods environment
- Gunnels, J. A., Gustavson, F. G., Henry, G. M., and Van de Geijn, R. A. 2001. FLAME: Formal Linear Algebra Methods Environment. ACM Transactions on Mathematical Software 27(4):422-455.
- (2001) ACM Transactions on Mathematical Software , vol.27 , Issue.4 , pp. 422-455
- Gunnels, J.A.¹ Gustavson, F.G.² Henry, G.M.³ Van De Geijn, R.A.⁴

41
- 84971853043
- I/O complexity: The red-blue pebble game
- May
- Hong, J. W. and Kung, H. T. May 1981. I/O complexity: the red-blue pebble game. In Proceedings of the 13th Annual ACM Symposium on Theory of Computing, Milwaukee, WI, pp. 326-333.
- (1981) Proceedings of the 13th Annual ACM Symposium on Theory of Computing, Milwaukee, WI , pp. 326-333
- Hong, J.W.¹ Kung, H.T.²

42
- 0001576756
- PYTHIA-II: A knowledge/database system for managing performance data and recommending scientific software
- Houstis, E. N., Catlin, A. C., Rice, J. R., Verykios, V. S., Ramakrishnan, N., and Houstis, C. E. 2000. PYTHIA-II: a knowledge/database system for managing performance data and recommending scientific software. ACM Transactions on Mathematical Software 26(2):227-253.
- (2000) ACM Transactions on Mathematical Software , vol.26 , Issue.2 , pp. 227-253
- Houstis, E.N.¹ Catlin, A.C.² Rice, J.R.³ Verykios, V.S.⁴ Ramakrishnan, N.⁵ Houstis, C.E.⁶

43
- 33947652414
- Implementation of Strassen's algorithm for matrix multiplication
- November
- Huss-Lederman, S., Jacobson, E. M., Johnson, J. R., Tsao, A., and Turnbull, T. November 1996. Implementation of Strassen's algorithm for matrix multiplication. In Proceedings of Supercomputing, Pittsburgh, PA.
- (1996) Proceedings of Supercomputing, Pittsburgh, PA
- Huss-Lederman, S.¹ Jacobson, E.M.² Johnson, J.R.³ Tsao, A.⁴ Turnbull, T.⁵

44
- 38149066662
- Optimizing sparse matrix vector multiplication on SMPs
- March
- Im, E.-J. and Yelick, K. March 1999. Optimizing sparse matrix vector multiplication on SMPs. In Proceedings of the 9th SIAM Conference on Parallel Processing for Scientific Computing, San Antonio, TX.
- (1999) Proceedings of the 9th SIAM Conference on Parallel Processing for Scientific Computing, San Antonio, TX
- Im, E.-J.¹ Yelick, K.²

45
- 0003841602
- Technical Report 9503, MIT, Cambridge, MA
- Jordan, M. I. 1995. Why the logistic function?, Technical Report 9503, MIT, Cambridge, MA.
- (1995) Why the Logistic Function?
- Jordan, M.I.¹

46
- 85039563016
- August; Technical Report 171, Compaq SRC
- Joshi, R., Nelson, G., and Randall, K. August 2001. Denali: A goal-directed superoptimizer, Technical Report 171, Compaq SRC.
- (2001) Denali: A Goal-directed Superoptimizer
- Joshi, R.¹ Nelson, G.² Randall, K.³

47
- 0032155271
- GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark
- Kågström, B., Ling, P., and Van Loan, C. 1998. GEMM-based level 3 BLAS: high-performance model implementations and performance evaluation benchmark. ACM Transactions on Mathematical Software 24(3):268-302.
- (1998) ACM Transactions on Mathematical Software , vol.24 , Issue.3 , pp. 268-302
- Kågström, B.¹ Ling, P.² Van Loan, C.³

48
- 1542497130
- Continuous program optimization: A case study
- Kistler, T. and Franz, M. 2003. Continuous program optimization: a case study. ACM Transactions on Programming Languages and Systems 25(4):500-548.
- (2003) ACM Transactions on Programming Languages and Systems , vol.25 , Issue.4 , pp. 500-548
- Kistler, T.¹ Franz, M.²

49
- 0002363292
- Iterative compilation in program optimization
- Kisuki, T., Knijnenburg, P. M., O'Boyle, M. F., and Wijshoff, H. 2000. Iterative compilation in program optimization. In Proceedings of the 8th International Workshop on Compilers for Parallel Computers, Aussois, France, pp. 35-44.
- (2000) Proceedings of the 8th International Workshop on Compilers for Parallel Computers, Aussois, France , pp. 35-44
- Kisuki, T.¹ Knijnenburg, P.M.² O'Boyle, M.F.³ Wijshoff, H.⁴

50
- 84983965442
- An empirical study of FORTRAN programs
- Knuth, D. 1971. An empirical study of FORTRAN programs. Software - Practice and Experience 1(2):105-133.
- (1971) Software - Practice and Experience , vol.1 , Issue.2 , pp. 105-133
- Knuth, D.¹

51
- 1542706838
- MDSimAid: Automatic optimization of fast electrostatics algorithms for molecular simulations
- Springer-Verlag, Berlin
- Ko, A. N. and Izaguirre, J. A. 2003. MDSimAid: automatic optimization of fast electrostatics algorithms for molecular simulations. In Proceedings of the International Conference on Computational Science, Melbourne, Australia, LNCS Vol. 2659, Springer-Verlag, Berlin.
- (2003) Proceedings of the International Conference on Computational Science, Melbourne, Australia, LNCS , vol.2659
- Ko, A.N.¹ Izaguirre, J.A.²

52
- 0010007403
- Algorithm selection using reinforcement learning
- June
- Lagoudakis, M. G. and Littman, M. L. June 2000. Algorithm selection using reinforcement learning. In Proceedings of the 17th International Conference on Machine Learning, Stanford, CA, pp. 511-518.
- (2000) Proceedings of the 17th International Conference on Machine Learning, Stanford, CA , pp. 511-518
- Lagoudakis, M.G.¹ Littman, M.L.²

53
- 0026137116
- The cache performance and optimizations of blocked algorithms
- April
- Lam, M. S., Rothberg, E. E., and Wolf, M. E. April 1991. The cache performance and optimizations of blocked algorithms. In Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, Santa Clara, CA.
- (1991) Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, Santa Clara, CA
- Lam, M.S.¹ Rothberg, E.E.² Wolf, M.E.³

54
- 0003641991
- September; Technical Report TR-490, Department of Computer Science, Indiana University
- Leone, M. and Dybvig, R. K. September 1997. Dynamo: a staged compiler architecture for dynamic program optimization, Technical Report TR-490, Department of Computer Science, Indiana University.
- (1997) Dynamo: A Staged Compiler Architecture for Dynamic Program Optimization
- Leone, M.¹ Dybvig, R.K.²

55
- 85039586972
- Delayed evaluation, self-optimising software components as a programming model
- August; Paderborn, Germany
- Liniker, P., Beckmann, O., and Kelly, P. H. J. August 2002. Delayed evaluation, self-optimising software components as a programming model. In Euro-Par, Paderborn, Germany.
- (2002) Euro-Par
- Liniker, P.¹ Beckmann, O.² Kelly, P.H.J.³

56
- 0026828592
- Automated selection of mathematical software
- Lucks, M. and Gladwell, I. 1992. Automated selection of mathematical software. ACM Transactions on Mathematical Software 18(1):11-34.
- (1992) ACM Transactions on Mathematical Software , vol.18 , Issue.1 , pp. 11-34
- Lucks, M.¹ Gladwell, I.²

57
- 0029204029
- Automatic benchmark generation for cache optimization of matrix algorithms
- March; R. Geist and S. Junkins, editors, ACM, New York
- McCalpin, J. D. and Smotherman, M. March 1995. Automatic benchmark generation for cache optimization of matrix algorithms. In Proceedings of the 33rd Annual Southeast Conference, Clemson, SC, USA, R. Geist and S. Junkins, editors, ACM, New York, pp. 195-204.
- (1995) Proceedings of the 33rd Annual Southeast Conference, Clemson, SC, USA , pp. 195-204
- McCalpin, J.D.¹ Smotherman, M.²

58
- 0030190854
- Improving data locality with loop transformations
- McKinley, K. S., Carr, S., and Tseng, C.-W. 1996. Improving data locality with loop transformations. ACM Transactions on Programming Languages and Systems 18(4):424-453.
- (1996) ACM Transactions on Programming Languages and Systems , vol.18 , Issue.4 , pp. 424-453
- McKinley, K.S.¹ Carr, S.² Tseng, C.-W.³

59
- 0023592629
- Superoptimizer: a look at the smallest program
- Massalin, H. 1987. Superoptimizer: a look at the smallest program. In Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems, Palo Alto, CA, pp. 122-126.
- (1987) Proceedings of the 2nd International Conference on Architectural Support for Programming Languages and Operating Systems, Palo Alto, CA , pp. 122-126
- Massalin, H.¹

60
- 0033703909
- An adaptive software library for fast Fourier transforms
- May
- Mirkovic, D., Mahasoom, R., and Johnsson, L. May 2000. An adaptive software library for fast Fourier transforms. In Proceedings of the International Conference on Supercomputing, Santa Fe, NM, pp. 215-224.
- (2000) Proceedings of the International Conference on Supercomputing, Santa Fe, NM , pp. 215-224
- Mirkovic, D.¹ Mahasoom, R.² Johnsson, L.³

61
- 0032308685
- Quantifying the multi-level nature of tiling interactions
- Mitchell, N., Hogstedt, K., Carter, L., and Ferrante, J. 1998. Quantifying the multi-level nature of tiling interactions. International Journal of Parallel Programming 26(6):641-670.
- (1998) International Journal of Parallel Programming , vol.26 , Issue.6 , pp. 641-670
- Mitchell, N.¹ Hogstedt, K.² Carter, L.³ Ferrante, J.⁴

62
- 84949641205
- A modal model of memory
- May; Springer-Verlag, Berlin
- Mitchell, N., Carter, L., and Ferrante, J. May 2001. A modal model of memory. In Proceedings of the International Conference on Computational Science, San Francisco, CA, LNCS Vol. 2073, Springer-Verlag, Berlin, pp. 81-96.
- (2001) Proceedings of the International Conference on Computational Science, San Francisco, CA, LNCS , vol.2073 , pp. 81-96
- Mitchell, N.¹ Carter, L.² Ferrante, J.³

63
- 70449692010
- GAPS: Iterative feedback directed parallelization using genetic algorithms
- June
- Nisbet, A. June 1998. GAPS: Iterative feedback directed parallelization using genetic algorithms. In Proceedings of the Workshop on Profile and Feedback Directed Compilation, Paris, France.
- (1998) Proceedings of the Workshop on Profile and Feedback Directed Compilation, Paris, France
- Nisbet, A.¹

64
- 0012987759
- Note on the Kolmogorov statistic in the discrete case
- Noether, G. E. 1963. Note on the Kolmogorov statistic in the discrete case. Metrika 7:115-116.
- (1963) Metrika , vol.7 , pp. 115-116
- Noether, G.E.¹

65
- 25944474315
- Cache-oblivious algorithms in practice
- Master's thesis, University of Copenhagen, Copenhagen, Denmark
- Olsen, J. H. and Skov, S. C. 2002. Cache-oblivious algorithms in practice, Master's thesis, University of Copenhagen, Copenhagen, Denmark.
- (2002) Cache-oblivious Algorithms in Practice
- Olsen, J.H.¹ Skov, S.C.²

66
- 85117254435
- On increasing architecture awareness in program optimizations to bridge the gap between peak and sustained processor performance: matrix multiply revisited
- Parello, D., Temam, O., and Verdun, J.-M. November 2002. On increasing architecture awareness in program optimizations to bridge the gap between peak and sustained processor performance: matrix multiply revisited. In Proceedings of the IEEE/ACM Conference on Supercomputing, Baltimore, MD.
- (2002) Proceedings of the IEEE/ACM Conference on Supercomputing, Baltimore, MD
- Parello, D.¹ Temam, O.² Verdun, J.-M.³

67
- 0036040497
- The hardness of cache conscious data placement
- January
- Petrank, E. and Rawitz, D. January 2002. The hardness of cache conscious data placement. In Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on the Principles of Programming Languages, Portland, OR, ACM, New York, pp. 101-112.
- (2002) Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on the Principles of Programming Languages, Portland, OR, ACM, New York , pp. 101-112
- Petrank, E.¹ Rawitz, D.²

68
- 27844503782
- Better tiling and array contraction for compiling scientific programs
- November
- Pike, G. and Hilfinger, P. November 2002. Better tiling and array contraction for compiling scientific programs. In Proceedings of the IEEE/ACM Conference on Supercomputing, Baltimore, MD.
- (2002) Proceedings of the IEEE/ACM Conference on Supercomputing, Baltimore, MD
- Pike, G.¹ Hilfinger, P.²

69
- 0003120218
- Fast training of support vector machines using sequential minimal optimization
- January; B. Schölkopf, C. Burges, and A. Smola, editors, MIT Press, Cambridge, MA
- Platt, J. January 1999. Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. Burges, and A. Smola, editors, MIT Press, Cambridge, MA, pp. 185-208.
- (1999) Advances in Kernel Methods - Support Vector Learning , pp. 185-208
- Platt, J.¹

70
- 84951088709
- Generation of efficient code for sparse matrix computations
- August; Springer-Verlag, Berlin
- Pugh, W. and Shpeisman, T. August 1998. Generation of efficient code for sparse matrix computations. In Proceedings of the 11th Workshop on Languages and Compilers for Parallel Computing, Chapel Hill, NC, LNCS, Springer-Verlag, Berlin.
- (1998) Proceedings of the 11th Workshop on Languages and Compilers for Parallel Computing, Chapel Hill, NC, LNCS
- Pugh, W.¹ Shpeisman, T.²

71
- 84949679103
- Fast automatic generation of DSP algorithms
- May; Springer-Verlag, Berlin
- Püschel, M., Singer, B., Veloso, M., and Moura, J. M. F. May 2001. Fast automatic generation of DSP algorithms. In Proceedings of the International Conference on Computational Science, San Francisco, CA, LNCS Vol. 2073, Springer-Verlag, Berlin, pp. 97-106.
- (2001) Proceedings of the International Conference on Computational Science, San Francisco, CA, LNCS , vol.2073 , pp. 97-106
- Püschel, M.¹ Singer, B.² Veloso, M.³ Moura, J.M.F.⁴

72
- 0012458953
- Note on generalization in experimental algorithmics
- Ramakrishnan, N. and Valdés-Pérez, R. E. 2000. Note on generalization in experimental algorithmics. ACM Transactions on Mathematical Software 26(4):568-580.
- (2000) ACM Transactions on Mathematical Software , vol.26 , Issue.4 , pp. 568-580
- Ramakrishnan, N.¹ Valdés-Pérez, R.E.²

73
- 0003056605
- The algorithm selection problem
- Rice, J. R. 1976. The algorithm selection problem. Advances in Computers 15:65-118.
- (1976) Advances in Computers , vol.15 , pp. 65-118
- Rice, J.R.¹

74
- 84954548975
- A statistical approach for the analysis of the relation between low-level performance information, the code, and the environment
- August
- Santiago, N. G., Rover, D. T., and Rodriguez, D. August 2002. A statistical approach for the analysis of the relation between low-level performance information, the code, and the environment. In Proceedings of the ICPP 4th Workshop on High Performance Scientific and Engineering Computing with Applications, Vancouver, BC, Canada, pp. 282-289.
- (2002) Proceedings of the ICPP 4th Workshop on High Performance Scientific and Engineering Computing with Applications, Vancouver, BC, Canada , pp. 282-289
- Santiago, N.G.¹ Rover, D.T.² Rodriguez, D.³

75
- 84957579840
- Extending the Hong-Kung model to memory hierarchies
- D.-Z. Du and M. Li, editors, LNCS; Springer-Verlag, Berlin
- Savage, J. E. 1995. Extending the Hong-Kung model to memory hierarchies. In Computing and Combinatorics, D.-Z. Du and M. Li, editors, LNCS Vol. 959, Springer-Verlag, Berlin, pp. 270-281.
- (1995) Computing and Combinatorics , vol.959 , pp. 270-281
- Savage, J.E.¹

76
- 85039578805
- March
- Schwartz, D. A., Judd, R. R., Harrod, W. J., and Manley, D. P. March 2000. VSIPL 1.0 API. http://www.vsipl.org.
- (2000) VSIPL 1.0 API
- Schwartz, D.A.¹ Judd, R.R.² Harrod, W.J.³ Manley, D.P.⁴

77
- 1542706831
- A rational approach to portable high performance: The basic linear algebra instruction set (BLAIS) and the fixed algorithm size template (FAST) library
- Siek, J. G. and Lumsdaine, A. 1998. A rational approach to portable high performance: the Basic Linear Algebra Instruction Set (BLAIS) and the Fixed Algorithm Size Template (FAST) library. In Proceedings of ECOOP, Brussels, Belgium.
- (1998) Proceedings of ECOOP, Brussels, Belgium
- Siek, J.G.¹ Lumsdaine, A.²

78
- 17244380718
- Overcoming the challenges to feedback-directed optimization
- January
- Smith, M. D. January 2000. Overcoming the challenges to feedback-directed optimization. In Proceedings of the ACM SIGPLAN Workshop on Dynamic and Adaptive Compilation and Optimization (Dynamo), Boston, MA.
- (2000) Proceedings of the ACM SIGPLAN Workshop on Dynamic and Adaptive Compilation and Optimization (Dynamo), Boston, MA
- Smith, M.D.¹

79
- 0003401675
- A tutorial on support vector regression
- Technical Report NC2-TR-1998-030. European Community ESPRIT Working Group in Neural and Computational Learning Theory
- Smola, A. J. and Schölkopf, B. 1998. A tutorial on support vector regression, Technical Report NC2-TR-1998-030. European Community ESPRIT Working Group in Neural and Computational Learning Theory. http://www.neurocolt.com.
- (1998)
- Smola, A.J.¹ Schölkopf, B.²

80
- 0038035143
- Meta optimization: Improving compiler heuristics with machine learning
- June
- Stephenson, M., Amarasinghe, S., Martin, M., and O'Reilly, U. M. June 2003. Meta optimization: improving compiler heuristics with machine learning. In Proceedings of the ACM Conference on Programming Language Design and Implementation, San Diego, CA.
- (2003) Proceedings of the ACM Conference on Programming Language Design and Implementation, San Diego, CA
- Stephenson, M.¹ Amarasinghe, S.² Martin, M.³ O'Reilly, U.M.⁴

81
- 1542601850
- August; Ph.D. Thesis, Cornell University
- Stodghill, P. August 1997. A relational approach to the automatic generation of sequential sparse matrix codes, Ph.D. Thesis, Cornell University.
- (1997) A Relational Approach to the Automatic Generation of Sequential Sparse Matrix Codes
- Stodghill, P.¹

82
- 0011916941
- Tuning Strassen's matrix multiplication for memory efficiency
- November
- Thottethodi, M., Chatterjee, S., and Lebeck, A. R. November 1998. Tuning Strassen's matrix multiplication for memory efficiency. In Proceedings of Supercomputing '98, Orlando, FL.
- (1998) Proceedings of Supercomputing '98, Orlando, FL
- Thottethodi, M.¹ Chatterjee, S.² Lebeck, A.R.³

83
- 0031496750
- Locality of reference in LU decomposition with partial pivoting
- Toledo, S. 1997. Locality of reference in LU decomposition with partial pivoting, SIAM Journal on Matrix Analysis and Applications 18(4):1065-1081.
- (1997) SIAM Journal on Matrix Analysis and Applications , vol.18 , Issue.4 , pp. 1065-1081
- Toledo, S.¹

84
- 67650534864
- Compiler optimization-space exploration
- March
- Triantafyllis, S., Vachharajani, M., Vachharajani, N., and August, D. I. March 2003. Compiler optimization-space exploration. In Proceedings of the International Symposium on Code Generation and Optimization, San Francisco, CA, pp. 204-215.
- (2003) Proceedings of the International Symposium on Code Generation and Optimization, San Francisco, CA , pp. 204-215
- Triantafyllis, S.¹ Vachharajani, M.² Vachharajani, N.³ August, D.I.⁴

85
- 85117245869
- Active harmony: Towards automated performance tuning
- November
- Ţăpuş, C., Chung, I.-H., and Hollingsworth, J. K. November 2002. Active Harmony: towards automated performance tuning. In Proceedings of the IEEE/ACM Conference on Supercomputing, Baltimore, MD.
- (2002) Proceedings of the IEEE/ACM Conference on Supercomputing, Baltimore, MD
- Ţăpuş, C.¹ Chung, I.-H.² Hollingsworth, J.K.³

86
- 33646095964
- Automatically tuned collective operations
- November
- Vadhiyar, S. S., Fagg, G. E., and Dongarra, J. November 2000. Automatically tuned collective operations. In Proceedings of Supercomputing 2000, Dallas, TX.
- (2000) Proceedings of Supercomputing 2000, Dallas, TX
- Vadhiyar, S.S.¹ Fagg, G.E.² Dongarra, J.³

87
- 1542601849
- Using iterative compilation for managing software pipeline-unrolling trade-offs
- September
- Van Der Mark, P., Rohou, E., Bodin, F., Chamski, Z., and Eisenbeis, C. September 1999. Using iterative compilation for managing software pipeline-unrolling trade-offs. In Proceedings of the 4th International Workshop on Compilers for Embedded Systems, St. Goar, Germany.
- (1999) Proceedings of the 4th International Workshop on Compilers for Embedded Systems, St. Goar, Germany
- Van Der Mark, P.¹ Rohou, E.² Bodin, F.³ Chamski, Z.⁴ Eisenbeis, C.⁵

88
- 0003991806
- Wiley, New York
- Vapnik, V. N. 1998. Statistical Learning Theory, Wiley, New York.
- (1998) Statistical Learning Theory
- Vapnik, V.N.¹

89
- 84947558148
- Arrays in Blitz++
- Springer-Verlag, Berlin
- Veldhuizen, T. 1998. Arrays in Blitz++. In Proceedings of ISCOPE, LNCS Vol. 1505, Springer-Verlag, Berlin.
- (1998) Proceedings of ISCOPE, LNCS , vol.1505
- Veldhuizen, T.¹

90
- 0345983802
- Active libraries: Rethinking the roles of compilers and libraries
- Veldhuizen, T. L. and Gannon, D. 1998. Active libraries: rethinking the roles of compilers and libraries. In Proceedings of the SIAM Workshop on Object Oriented Methods for Interoperable Scientific and Engineering Computing, Philadelphia, PA.
- (1998) Proceedings of the SIAM Workshop on Object Oriented Methods for Interoperable Scientific and Engineering Computing, Philadelphia, PA
- Veldhuizen, T.L.¹ Gannon, D.²

91
- 84949504639
- ADAPT: Automated de-coupled adaptive program transformation
- August
- Voss, M. J. and Eigenmann, R. August 2000. ADAPT: automated de-coupled adaptive program transformation. In Proceedings of the International Conference on Parallel Processing, Toronto, Canada.
- (2000) Proceedings of the International Conference on Parallel Processing, Toronto, Canada
- Voss, M.J.¹ Eigenmann, R.²

92
- 84990830919
- Performance optimizations and bounds for sparse matrix-vector multiply
- November
- Vuduc, R., Demmel, J. W., Yelick, K. A., Kamil, S., Nishtala, R., and Lee, B. November 2002. Performance optimizations and bounds for sparse matrix-vector multiply. In Proceedings of Supercomputing, Baltimore, MD.
- (2002) Proceedings of Supercomputing, Baltimore, MD
- Vuduc, R.¹ Demmel, J.W.² Yelick, K.A.³ Kamil, S.⁴ Nishtala, R.⁵ Lee, B.⁶

93
- 0343462141
- Automated empirical optimizations of software and the ATLAS project
- Whaley, R. C., Petitet, A., and Dongarra, J. 2001. Automated empirical optimizations of software and the ATLAS project. Parallel Computing 27(1):3-25.
- (2001) Parallel Computing , vol.27 , Issue.1 , pp. 3-25
- Whaley, R.C.¹ Petitet, A.² Dongarra, J.³

94
- 0034819362
- Language support for Morton-order matrices
- ACM, New York
- Wise, D. S., Frens, J. D., Gu, Y., and Alexander, G. A. 2001. Language support for Morton-order matrices. In Proceedings of the 8th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Snowbird, UT, ACM, New York, pp. 24-33.
- (2001) Proceedings of the 8th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Snowbird, UT , pp. 24-33
- Wise, D.S.¹ Frens, J.D.² Gu, Y.³ Alexander, G.A.⁴

95
- 3142767727
- A data locality optimizing algorithm
- June
- Wolf, M. E. and Lam, M. S. June 1991. A data locality optimizing algorithm. In Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation, Toronto, Ontario, Canada.
- (1991) Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation, Toronto, Ontario, Canada
- Wolf, M.E.¹ Lam, M.S.²

96
- 0034447396
- Transforming loops to recursion for multi-level memory hierarchies
- June
- Yi, Q., Adve, V., and Kennedy, K. June 2000. Transforming loops to recursion for multi-level memory hierarchies. In Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation, Vancouver, BC, Canada, pp. 169-181.
- (2000) Proceedings of the ACM SIGPLAN 2000 Conference on Programming Language Design and Implementation, Vancouver, BC, Canada , pp. 169-181
- Yi, Q.¹ Adve, V.² Kennedy, K.³

97
- 0038378242
- A comparison of empirical and model-driven optimization
- June
- Yotov, K., Li, X., Ren, G., Cibulskis, M., DeJong, G., Garzaran, M., Padua, D., Pingali, K., Stodghill, P., and Wu, P. June 2003. A comparison of empirical and model-driven optimization. In Proceedings of the ACM Conference on Programming Language Design and Implementation, San Diego, CA.
- (2003) Proceedings of the ACM Conference on Programming Language Design and Implementation, San Diego, CA
- Yotov, K.¹ Li, X.² Ren, G.³ Cibulskis, M.⁴ DeJong, G.⁵ Garzaran, M.⁶ Padua, D.⁷ Pingali, K.⁸ Stodghill, P.⁹ Wu, P.¹⁰

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.