Volume 55, Issue 9, 2007, Pages 4458-4473

Mechanical derivation of fused multiply-add algorithms for linear transforms

Author keywords

Automatic program generation; Discrete cosine transform (DCT); Discrete Fourier transform (DFT); Fast algorithm; Implementation; Multiply and accumulate (MAC); Multiply and accumulate (MAC) instruction

Indexed keywords

AUTOMATIC PROGRAMMING; DISCRETE COSINE TRANSFORMS; DISCRETE FOURIER TRANSFORMS; MATRIX ALGEBRA; OPTIMIZATION

EID: 34548317511     PISSN: 1053587X     EISSN: None     Source Type: Journal    
DOI: 10.1109/TSP.2007.896116     Document Type: Article
Times cited: 7

References (26)
  • 1
    • 10844263523
    • Y. Nievergelt, "Scalar fused multiply-add instructions produce floating-point matrix arithmetic provably accurate to the penultimate digit," ACM Trans. Math. Softw. (TOMS), vol. 29, no. 1, pp. 27-48, 2003.
  • 2
    • 0027274408
    • E. Linzer and E. Feig, "Implementation of efficient FFT algorithms on fused multiply-add architectures," IEEE Trans. Signal Process., vol. 41, no. 1, p. 93, Jan. 1993.
  • 3
    • 0026396564
    • C. Lu, "Implementation of multiply-add FFT algorithms for complex and real data sequences," in Proc. Int. Symp. Circuits Systems (ISCAS) 1991, vol. 1, pp. 480-483.
  • 4
    • 0031261334
    • S. Goedecker, "Fast radix 2, 3, 4, and 5 kernels for fast Fourier transformations on computers with overlapping multiply-add instructions," SIAM J. Scientif. Comput., vol. 18, no. 6, pp. 1605-1611, 1997.
  • 5
    • 0027540716
    • C. Lu, J. W. Cooley, and R. Tolimieri, "FFT algorithms for prime transform sizes and their implementations on VAX, IBM3090VF, and IBM RS/6000," IEEE Trans. Signal Process., vol. 41, no. 2, pp. 638-648, Feb. 1993.
  • 7
    • 0141676637
    • D. Takahashi, "A radix-16 FFT algorithm suitable for multiply-add instruction based on Goedecker method," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), Apr. 2003, vol. 2, pp. 665-668.
  • 9
    • 0028410101
    • D. Coppersmith, E. Feig, and E. Linzer, "Hadamard transforms on multiply/add architectures," IEEE Trans. Signal Process., vol. 42, no. 4, pp. 969-970, Apr. 1994.
  • 12
    • 34548342614
    • Spiral website [Online]. Available: www.spiral.net
  • 15
    • 0000459334
    • N. Dershowitz and D. A. Plaisted, "Rewriting," in Handbook of Automated Reasoning, A. Robinson and A. Voronkov, Eds. New York: Elsevier, 2001, vol. 1, ch. 9, pp. 535-610.
  • 19
    • 34548370373
    • FFTW 3.1.2, 2006 [Online]. Available: www.fftw.org
  • 22
    • 0022665487
    • H. V. Sorensen, M. T. Heideman, and C. S. Burrus, "On computing the split-radix FFT," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, no. 1, pp. 152-156, 1986.
  • 24
    • 0026678378
    • F. Bruekers and A. Enden, "New networks for perfect inversion and perfect reconstruction," IEEE J. Sel. Areas Commun., vol. 10, pp. 130-137, Jan. 1992.
  • 25
    • 30244489068
    • W. Sweldens, "The lifting scheme: A custom-design construction of biorthogonal wavelets," Appl. Comput. Harmon. Anal., vol. 3, no. 2, pp. 186-200, 1996.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.