Volume , Issue , 2009, Pages

Auto-tuning 3-D FFT library for CUDA GPUs

Author keywords

[No Author keywords available]

Indexed keywords

AUTOTUNING; DENSE KERNELS; NUMBER OF THREADS; PROBLEM SIZE; SHARED MEMORIES; SINGLE PROCESSORS; TRANSFORM SIZE

EID: 74049114159     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1654059.1654090     Document Type: Conference Paper
Times cited: 105

References (19)
  • 1
    • 84968470212
    • An Algorithm for the Machine Calculation of Complex Fourier Series
    • J. W. Cooley and J. W. Tukey. An Algorithm for the Machine Calculation of Complex Fourier Series. Math. Comput., Vol. 19:297-301, 1965.
    • (1965) Math. Comput., vol. 19, pp. 297-301
    • Cooley, J.W.; Tukey, J.W.
  • 3
    • 20744449792
    • M. Frigo and S. G. Johnson. The Design and Implementation of FFTW3. Proceedings of the IEEE, 93(2):216-231, 2005. Special issue on "Program Generation, Optimization, and Platform Adaptation".
  • 4
    • 74049116773
    • General-Purpose Computation Using Graphics Hardware. http://www.gpgpu.org/.
  • 9
    • 58349104360
    • The Rise of the Commodity Vectors
    • S. Matsuoka. The Rise of the Commodity Vectors. In J. M. L. M. Palma, P. Amestoy, M. J. Daydé, M. Mattoso, and J. C. Lopes, editors, VECPAR, volume 5336 of Lecture Notes in Computer Science, pages 53-62. Springer, 2008.
    • (2008) Lecture Notes in Computer Science, vol. 5336, pp. 53-62
    • Matsuoka, S.
  • 12
    • 44849094749
    • Fast N-Body Simulation with CUDA
    • L. Nyland, M. Harris, and J. Prins. Fast N-Body Simulation with CUDA. In H. Nguyen, editor, GPU Gems 3, chapter 31, pages 677-695. Addison-Wesley, 2007.
    • (2007) GPU Gems 3, pp. 677-695
    • Nyland, L.; Harris, M.; Prins, J.
  • 13
    • 19344368072
    • M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code Generation for DSP Transforms. Proceedings of the IEEE, special issue on "Program Generation, Optimization, and Adaptation", 93(2):232-275, 2005.
  • 19
    • 0343462141
    • Automated empirical optimizations of software and the ATLAS project
    • R. C. Whaley, A. Petitet, and J. J. Dongarra. Automated empirical optimizations of software and the ATLAS project. Parallel Computing, 27(1-2):3-35, 2001.
    • (2001) Parallel Computing, vol. 27, issue 1-2, pp. 3-35
    • Whaley, R.C.; Petitet, A.; Dongarra, J.J.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.