-
1
-
-
84968470212
-
An algorithm for the machine calculation of complex Fourier series
-
J. Cooley and J. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90):297-301, 1965.
-
(1965)
Mathematics of Computation
, vol.19
, Issue.90
, pp. 297-301
-
-
Cooley, J.1
Tukey, J.2
-
2
-
-
79952782168
-
Auto-tuning of fast Fourier transform on graphics processors
-
New York, NY, USA, ACM
-
Y. Dotsenko, S. Baghsorkhi, B. Lloyd, and N. Govindaraju. Auto-tuning of fast Fourier transform on graphics processors. In Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, PPoPP '11, pages 257-266, New York, NY, USA, 2011. ACM.
-
(2011)
Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, PPoPP '11
, pp. 257-266
-
-
Dotsenko, Y.1
Baghsorkhi, S.2
Lloyd, B.3
Govindaraju, N.4
-
4
-
-
0027642189
-
Rotating a three-dimensional array in an optimal position for vector processing: Case study for a three-dimensional fast Fourier transform
-
Aug.
-
S. Goedecker. Rotating a three-dimensional array in an optimal position for vector processing: Case study for a three-dimensional fast Fourier transform. Computer Physics Communications, 76:294-300, Aug. 1993.
-
(1993)
Computer Physics Communications
, vol.76
, pp. 294-300
-
-
Goedecker, S.1
-
5
-
-
34548292052
-
A memory model for scientific algorithms on graphics processors
-
ACM
-
N. K. Govindaraju, S. Larsen, J. Gray, and D. Manocha. A memory model for scientific algorithms on graphics processors. In Proceedings of the 2006 ACM/IEEE conference on Supercomputing, SC '06, New York, NY, USA, 2006. ACM.
-
(2006)
Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC '06, New York, NY, USA
-
-
Govindaraju, N.K.1
Larsen, S.2
Gray, J.3
Manocha, D.4
-
6
-
-
70350754502
-
High performance discrete Fourier transforms on graphics processors
-
Piscataway, NJ, USA, IEEE Press
-
N. K. Govindaraju, B. Lloyd, Y. Dotsenko, B. Smith, and J. Manferdelli. High performance discrete Fourier transforms on graphics processors. In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC '08, pages 2:1-2:12, Piscataway, NJ, USA, 2008. IEEE Press.
-
(2008)
Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC '08
-
-
Govindaraju, N.K.1
Lloyd, B.2
Dotsenko, Y.3
Smith, B.4
Manferdelli, J.5
-
7
-
-
77954713684
-
An empirically tuned 2D and 3D FFT library on CUDA GPU
-
New York, NY, USA, ACM
-
L. Gu, X. Li, and J. Siegel. An empirically tuned 2D and 3D FFT library on CUDA GPU. In Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10, pages 305-314, New York, NY, USA, 2010. ACM.
-
(2010)
Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10
, pp. 305-314
-
-
Gu, L.1
Li, X.2
Siegel, J.3
-
8
-
-
84966205227
-
Computing the Fast Fourier Transform on a Vector Computer
-
D. G. Korn and J. J. Lambiotte. Computing the Fast Fourier Transform on a Vector Computer. Mathematics of Computation, 33:977-992, 1979.
-
(1979)
Mathematics of Computation
, vol.33
, pp. 977-992
-
-
Korn, D.G.1
Lambiotte, J.J.2
-
9
-
-
35048828869
-
The FFT on a GPU
-
Aire-la-Ville, Switzerland, Eurographics Association
-
K. Moreland and E. Angel. The FFT on a GPU. In Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, HWWS '03, pages 112-119, Aire-la-Ville, Switzerland, 2003. Eurographics Association.
-
(2003)
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, HWWS '03
, pp. 112-119
-
-
Moreland, K.1
Angel, E.2
-
10
-
-
0021470572
-
FFT algorithms for vector computers
-
P. N. Swarztrauber. FFT algorithms for vector computers. Parallel Computing, 1(1):45-53, 1984.
-
(1984)
Parallel Computing
, vol.1
, Issue.1
, pp. 45-53
-
-
Swarztrauber, P.N.1
-
11
-
-
84870710877
-
-
Nukada. website
-
Nukada. Nukada FFT Library website. http://matsu-www.is.titech.ac.jp/ nukada/nufft/, 2011.
-
(2011)
Nukada FFT Library
-
-
-
12
-
-
74049114159
-
Auto-tuning 3-D FFT library for CUDA GPUs
-
New York, NY, USA, ACM
-
A. Nukada and S. Matsuoka. Auto-tuning 3-D FFT library for CUDA GPUs. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, pages 30:1-30:10, New York, NY, USA, 2009. ACM.
-
(2009)
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09
-
-
Nukada, A.1
Matsuoka, S.2
-
13
-
-
70350759823
-
Bandwidth intensive 3-D FFT kernel for GPUs using CUDA
-
Piscataway, NJ, USA, IEEE Press
-
A. Nukada, Y. Ogata, T. Endo, and S. Matsuoka. Bandwidth intensive 3-D FFT kernel for GPUs using CUDA. In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC '08, pages 5:1-5:11, Piscataway, NJ, USA, 2008. IEEE Press.
-
(2008)
Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC '08
-
-
Nukada, A.1
Ogata, Y.2
Endo, T.3
Matsuoka, S.4
-
14
-
-
84870653313
-
-
NVIDIA Corporation
-
NVIDIA Corporation. CUDA and Fermi Update, 2010.
-
(2010)
CUDA and Fermi Update
-
-
-
18
-
-
51049119174
-
An efficient, model-based CPU-GPU heterogeneous FFT library
-
Apr.
-
Y. Ogata, T. Endo, N. Maruyama, and S. Matsuoka. An efficient, model-based CPU-GPU heterogeneous FFT library. In Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on, pages 1-10, Apr. 2008.
-
(2008)
Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
, pp. 1-10
-
-
Ogata, Y.1
Endo, T.2
Maruyama, N.3
Matsuoka, S.4
-
22
-
-
76749123978
-
Complexity effective memory access scheduling for many-core accelerator architectures
-
New York, NY, USA, ACM
-
G. L. Yuan, A. Bakhoda, and T. M. Aamodt. Complexity effective memory access scheduling for many-core accelerator architectures. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 42, pages 34-44, New York, NY, USA, 2009. ACM.
-
(2009)
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 42
, pp. 34-44
-
-
Yuan, G.L.1
Bakhoda, A.2
Aamodt, T.M.3