



Volume , Issue , 2006, Pages

FFT program generation for shared memory: SMP and multicore

Author keywords

Automatic parallelization; Chip multiprocessor; Fast Fourier transform; Multicore; Shared memory

Indexed keywords

AUTOMATIC PARALLELIZATION; CHIP MULTIPROCESSORS; CPU FREQUENCY SCALING; SHARED MEMORY

EID: 34548244593     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1188455.1188575     Document Type: Conference Paper
Times cited: 49

References (31)
  • 3
    • 0025403252
    • FFTs in external or hierarchical memory
    • BAILEY, D. H. 1990. FFTs in external or hierarchical memory. J. Supercomputing 4, 23-35.
    • (1990) J. Supercomputing, vol. 4, pp. 23-35
    • BAILEY, D.H.
  • 5
    • 17644412337
    • BIENTINESI, P., GUNNELS, J. A., MYERS, M. E., QUINTANA-ORTI, E., AND VAN DE GEIJN, R. 2005. The science of deriving dense linear algebra algorithms. TOMS 31, 1 (March), 1-26.
  • 7
    • 20744452904
    • DEMMEL, J., DONGARRA, J., EIJKHOUT, V., FUENTES, E., PETITET, A., VUDUC, R., WHALEY, C., AND YELICK, K. 2005. Self adapting linear algebra algorithms and software. Proceedings of the IEEE 93, 2, 293-312. Special issue on Program Generation, Optimization, and Adaptation.
  • 8
    • 0000459334
    • Rewriting
    • A. Robinson and A. Voronkov, Eds., Elsevier, ch. 9
    • DERSHOWITZ, N., AND PLAISTED, D. A. 2001. Rewriting. In Handbook of Automated Reasoning, A. Robinson and A. Voronkov, Eds., vol. 1. Elsevier, ch. 9, 535-610.
    • (2001) Handbook of Automated Reasoning, vol. 1, pp. 535-610
    • DERSHOWITZ, N.; PLAISTED, D.A.
  • 13
    • 20744449792
    • FRIGO, M., AND JOHNSON, S. G. 2005. The design and implementation of FFTW3. Proceedings of the IEEE 93, 2, 216-231. Special issue on Program Generation, Optimization, and Adaptation.
  • 15
    • 0039435412
    • GUNNELS, J. A., GUSTAVSON, F. G., HENRY, G. M., AND VAN DE GEIJN, R. A. 2001. FLAME: Formal linear algebra methods environment. TOMS 27, 4 (December), 422-455.
  • 16
    • 84976813879
    • Compiling Fortran D for MIMD distributed-memory machines
    • HIRANANDANI, S., KENNEDY, K., AND TSENG, C.W. 1992. Compiling Fortran D for MIMD distributed-memory machines. Commun. ACM 35, 8, 66-80.
    • (1992) Commun. ACM, vol. 35, issue 8, pp. 66-80
    • HIRANANDANI, S.; KENNEDY, K.; TSENG, C.W.
  • 18
    • 0025600627
    • A methodology for designing, modifying, and implementing FFT algorithms on various architectures
    • JOHNSON, J. R., JOHNSON, R. W., RODRIGUEZ, D., AND TOLIMIERI, R. 1990. A methodology for designing, modifying, and implementing FFT algorithms on various architectures. Circuits Systems Signal Processing 9, 449-500.
    • (1990) Circuits Systems Signal Processing, vol. 9, pp. 449-500
    • JOHNSON, J.R.; JOHNSON, R.W.; RODRIGUEZ, D.; TOLIMIERI, R.
  • 19
    • 84945709131
    • MCKELLAR, A. C., AND COFFMAN, E. G., JR. 1969. Organizing matrices and matrix operations for paged memory systems. Communications ACM 12, 3, 153-165.
  • 20
    • 0023347849
    • Parallelization and performance analysis of the Cooley-Tukey FFT algorithm for shared-memory architectures
    • NORTON, A., AND SILBERGER, A. J. 1987. Parallelization and performance analysis of the Cooley-Tukey FFT algorithm for shared-memory architectures. IEEE Trans. Comput. 36, 5, 581-591.
    • (1987) IEEE Trans. Comput., vol. 36, issue 5, pp. 581-591
    • NORTON, A.; SILBERGER, A.J.
  • 21
    • 19344368072
    • PÜSCHEL, M., MOURA, J. M. F., JOHNSON, J., PADUA, D., VELOSO, M., SINGER, B. W., XIONG, J., FRANCHETTI, F., GAČIĆ, A., VORONENKO, Y., CHEN, K., JOHNSON, R. W., AND RIZZOLO, N. 2005. SPIRAL: Code generation for DSP transforms. Proc. of the IEEE 93, 2, 232-275. Special issue on Program Generation, Optimization, and Adaptation.
  • 23
    • 0141696394
    • Stochastic search for signal processing algorithm optimization
    • SINGER, B., AND VELOSO, M. 2001. Stochastic search for signal processing algorithm optimization. In Proc. Supercomputing.
    • (2001) Proc. Supercomputing
    • SINGER, B.; VELOSO, M.
  • 24
    • 34548206810
    • An OpenMP implementation of parallel FFT and its performance on IA-64 processors
    • TAKAHASHI, D., SATO, M., AND BOKU, T. 2003. An OpenMP implementation of parallel FFT and its performance on IA-64 processors. Lecture Notes in Computer Science 2716, 99-108.
    • (2003) Lecture Notes in Computer Science, vol. 2716, pp. 99-108
    • TAKAHASHI, D.; SATO, M.; BOKU, T.
  • 25
    • 84945901304
    • A blocking algorithm for parallel 1-D FFT on shared-memory parallel computers
    • TAKAHASHI, D. 2002. A blocking algorithm for parallel 1-D FFT on shared-memory parallel computers. Lecture Notes in Computer Science 2367, 380-389.
    • (2002) Lecture Notes in Computer Science, vol. 2367, pp. 380-389
    • TAKAHASHI, D.
  • 27
    • 0343462141
    • Automated empirical optimization of software and the ATLAS project
    • WHALEY, R. C., PETITET, A., AND DONGARRA, J. J. 2001. Automated empirical optimization of software and the ATLAS project. Parallel Computing 27, 1-2, 3-35.
    • (2001) Parallel Computing, vol. 27, issue 1-2, pp. 3-35
    • WHALEY, R.C.; PETITET, A.; DONGARRA, J.J.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.