Volume 5235 LNCS, 2008, Pages 196-259

How to write fast numerical code: A small introduction

Author keywords

[No Author keywords available]

Indexed keywords

DISCRETE FOURIER TRANSFORMS;

EID: 57049117343     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-540-88643-3_5     Document Type: Conference Paper
Times cited : (24)

References (82)
  • 1
    • 57049113013
    • Cramming more components onto integrated circuits
    • Moore, G.E.: Cramming more components onto integrated circuits. Readings in computer architecture, 56-59 (2000)
    • (2000) Readings in computer architecture , pp. 56-59
    • Moore, G.E.1
  • 2
    • 0005052298
    • A vectorizing, software pipelining compiler for LIW and superscalar architecture
    • Meadows, L., Nakamoto, S., Schuster, V.: A vectorizing, software pipelining compiler for LIW and superscalar architecture. In: Proceedings of Risc (1992)
    • (1992) Proceedings of Risc
    • Meadows, L.1    Nakamoto, S.2    Schuster, V.3
  • 3
    • 57049170624
    • Group, S.S.C.: SUIF: A parallelizing & optimizing research compiler. Technical Report CSL-TR-94-620, Computer Systems Laboratory, Stanford University (May 1994)
  • 4
    • 14844346973
    • A complete compiler approach to auto-parallelizing C programs for multi-DSP systems
    • Franke, B., O'Boyle, M.F.R.: A complete compiler approach to auto-parallelizing C programs for multi-DSP systems. IEEE Trans. Parallel Distrib. Syst. 16(3), 234-245 (2005)
    • (2005) IEEE Trans. Parallel Distrib. Syst , vol.16 , Issue.3 , pp. 234-245
    • Franke, B.1    O'Boyle, M.F.R.2
  • 8
    • 57049153232
    • Website: Spiral (1998), http://www.spiral.net
    • (1998) Website: Spiral
  • 12
    • 1542392269
    • On reducing TLB misses in matrix multiplication, FLAME working note 9
    • Technical Report TR-2002-55, The University of Texas at Austin, Department of Computer Sciences November
    • Goto, K., van de Geijn, R.: On reducing TLB misses in matrix multiplication, FLAME working note 9. Technical Report TR-2002-55, The University of Texas at Austin, Department of Computer Sciences (November 2002)
    • (2002)
    • Goto, K.1    van de Geijn, R.2
  • 13
    • 0002515795
    • Automatically Tuned Linear Algebra Software (ATLAS)
    • Whaley, R.C., Dongarra, J.: Automatically Tuned Linear Algebra Software (ATLAS). In: Proc. Supercomputing (1998)
    • (1998) Proc. Supercomputing
    • Whaley, R.C.1    Dongarra, J.2
  • 15
    • 34548080393
    • An automatically-tuned sorting library
    • Bida, E., Toledo, S.: An automatically-tuned sorting library. Software: Practice and Experience 37(11), 1161-1192 (2007)
    • (2007) Software: Practice and Experience , vol.37 , Issue.11 , pp. 1161-1192
    • Bida, E.1    Toledo, S.2
  • 19
    • 57049128550
    • Website: BeBOP, http://bebop.cs.berkeley.edu/
    • Website: BeBOP
  • 20
    • 24344485098
    • OSKI: A library of automatically tuned sparse matrix kernels
    • Proc. SciDAC. Journal of Physics
    • Vuduc, R., Demmel, J.W., Yelick, K.A.: OSKI: A library of automatically tuned sparse matrix kernels. In: Proc. SciDAC. Journal of Physics: Conference Series, vol. 16, pp. 521-530 (2005)
    • (2005) Conference Series , vol.16 , pp. 521-530
    • Vuduc, R.1    Demmel, J.W.2    Yelick, K.A.3
  • 21
    • 0343462141
    • Automated empirical optimization of software and the ATLAS project
    • Whaley, R., Petitet, A., Dongarra, J.: Automated empirical optimization of software and the ATLAS project. Parallel Computing 27(1-2), 3-35 (2001)
    • (2001) Parallel Computing , vol.27 , Issue.1-2 , pp. 3-35
    • Whaley, R.1    Petitet, A.2    Dongarra, J.3
  • 26
    • 34548238305
    • A rewriting system for the vectorization of signal transforms
    • Daydé, M, Palma, J.M.L.M, Coutinho, Á.L.G.A, Pacitti, E, Lopes, J.C, eds, VECPAR 2006, Springer, Heidelberg
    • Franchetti, F., Voronenko, Y., Püschel, M.: A rewriting system for the vectorization of signal transforms. In: Daydé, M., Palma, J.M.L.M., Coutinho, Á.L.G.A., Pacitti, E., Lopes, J.C. (eds.) VECPAR 2006. LNCS, vol. 4395. Springer, Heidelberg (2006)
    • (2006) LNCS , vol.4395
    • Franchetti, F.1    Voronenko, Y.2    Püschel, M.3
  • 32
    • 84904093461
    • GTTSE 2005
    • Lämmel, R, Saraiva, J, Visser, J, eds, Springer, Heidelberg
    • Lämmel, R., Saraiva, J., Visser, J. (eds.): GTTSE 2005. LNCS, vol. 4143. Springer, Heidelberg (2006)
    • (2006) LNCS , vol.4143
  • 34
    • 0004116989
    • Cormen, T.H, Leiserson, C.E, Rivest, R.L, Stein, C, eds, MIT Press, Cambridge
    • Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C. (eds.): Introduction to algorithms. MIT Press, Cambridge (2001)
    • (2001) Introduction to algorithms
  • 39
    • 34250487811
    • Gaussian elimination is not optimal
    • Strassen, V.: Gaussian elimination is not optimal. Numerische Mathematik 14(3), 354-356 (1969)
    • (1969) Numerische Mathematik , vol.14 , Issue.3 , pp. 354-356
    • Strassen, V.1
  • 43
    • 57049125418
    • Website: ATLAS, http://math-atlas.sourceforge.net/
    • Website: ATLAS
  • 44
    • 57049136924
    • Website: Goto BLAS, http://www.tacc.utexas.edu/general/staff/goto/
    • Website: Goto BLAS
  • 45
  • 48
    • 57049180484
    • Website: PLAPACK, http://www.cs.utexas.edu/users/plapack/
    • Website: PLAPACK
  • 50
    • 57049089181
    • Website: FLAME, http://www.cs.utexas.edu/users/flame/
    • Website: FLAME
  • 51
    • 33947416576
    • A modified split-radix FFT with fewer arithmetic operations
    • Johnson, S.G., Frigo, M.: A modified split-radix FFT with fewer arithmetic operations. IEEE Trans. Signal Processing 55(1), 111-119 (2007)
    • (2007) IEEE Trans. Signal Processing , vol.55 , Issue.1 , pp. 111-119
    • Johnson, S.G.1    Frigo, M.2
  • 53
    • 0025600627
    • A methodology for designing, modifying, and implementing FFT algorithms on various architectures
    • Johnson, J.R., Johnson, R.W., Rodriguez, D., Tolimieri, R.: A methodology for designing, modifying, and implementing FFT algorithms on various architectures. Circuits Systems Signal Processing 9(4), 449-500 (1990)
    • (1990) Circuits Systems Signal Processing , vol.9 , Issue.4 , pp. 449-500
    • Johnson, J.R.1    Johnson, R.W.2    Rodriguez, D.3    Tolimieri, R.4
  • 55
    • 84947753208
    • Bonelli, A., Franchetti, F., Lorenz, J., Püschel, M., Ueberhuber, C.W.: Automatic performance optimization of the discrete Fourier transform on distributed memory computers. In: Guo, M., Yang, L.T., Di Martino, B., Zima, H.P., Dongarra, J., Tang, F. (eds.) ISPA 2006. LNCS, vol. 4330. Springer, Heidelberg (2006)
  • 57
    • 57049174848
    • GNU
    • GNU: GSL http://www.gnu.org/software/gsl/
    • GSL
  • 58
    • 84949653778
    • Automatic performance tuning in the UHFFT library
    • Alexandrov, V.N, Dongarra, J, Juliano, B.A, Renner, R.S, Tan, C.J.K, eds, ICCS-Comput Sci, Springer, Heidelberg
    • Mirković, D., Johnsson, S.L.: Automatic performance tuning in the UHFFT library. In: Alexandrov, V.N., Dongarra, J., Juliano, B.A., Renner, R.S., Tan, C.J.K. (eds.) ICCS-Comput Sci 2001. LNCS, vol. 2073, pp. 71-80. Springer, Heidelberg (2001)
    • (2001) LNCS , vol.2073 , pp. 71-80
    • Mirković, D.1    Johnsson, S.L.2
  • 59
    • 57049138050
    • Website: UHFFT, http://www2.cs.uh.edu/~mirkovic/fft/parfft.htm
    • Website: UHFFT
  • 61
    • 57049144601
    • Website: ACML, http://developer.amd.com/acml.jsp
    • Website: ACML
  • 62
    • 57049096799
    • Website: Intel MKL, http://www.intel.com/cd/software/products/asmona/eng/307757.htm
    • Website: Intel MKL
  • 63
    • 57049160192
    • Website: Intel IPP, http://www.intel.com/cd/software/products/asmona/eng/perflib/ipp/302910.htm
    • Website: Intel IPP
  • 66
    • 57049086177
    • Website: IMSL, http://www.vni.com/products/imsl/
    • Website: IMSL
  • 67
    • 0024903997
    • Evaluating associativity in CPU caches
    • Hill, M.D., Smith, A.J.: Evaluating associativity in CPU caches. IEEE Trans. Comput. 38(12), 1612-1630 (1989)
    • (1989) IEEE Trans. Comput , vol.38 , Issue.12 , pp. 1612-1630
    • Hill, M.D.1    Smith, A.J.2
  • 70
    • 57049129655
    • GNU: GCC: optimization options. http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
  • 71
    • 57049142416
    • Intel: Quick-reference guide to optimization with Intel compilers version 10.x, http://cache-www.intel.com/cd/00/00/22/23/222300-22300.pdf
  • 72
    • 57049114677
    • Intel: Intel VTune
  • 73
    • 57049098946
    • Microsoft: Microsoft Visual Studio
  • 74
    • 13144295805
    • GNU
    • GNU: Gnu gprof manual, http://www.gnu.org/software/binutils/manual/gprof-2.9.1/html-mono/gprof.html
    • Gnu gprof manual
  • 77
    • 84968470212
    • An algorithm for the machine calculation of complex Fourier series
    • Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. of Computation 19, 297-301 (1965)
    • (1965) Math. of Computation , vol.19 , pp. 297-301
    • Cooley, J.W.1    Tukey, J.W.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.