-
1
-
-
77954741573
-
Large-scale FFT on GPU clusters
-
New York, NY, USA, ACM
-
Y. Chen, X. Cui, and H. Mei. Large-scale FFT on GPU clusters. In Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10, pages 315-324, New York, NY, USA, 2010. ACM.
-
(2010)
Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10
, pp. 315-324
-
-
Chen, Y.1
Cui, X.2
Mei, H.3
-
2
-
-
84968470212
-
An algorithm for the machine calculation of complex Fourier series
-
J. Cooley and J. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90):297-301, 1965.
-
(1965)
Mathematics of Computation
, vol.19
, Issue.90
, pp. 297-301
-
-
Cooley, J.1
Tukey, J.2
-
3
-
-
79952782168
-
Auto-tuning of fast Fourier transform on graphics processors
-
New York, NY, USA, ACM
-
Y. Dotsenko, S. Baghsorkhi, B. Lloyd, and N. Govindaraju. Auto-tuning of fast Fourier transform on graphics processors. In Proceedings of the 16th ACM symposium on Principles and practice of parallel programming, PPoPP '11, pages 257-266, New York, NY, USA, 2011. ACM.
-
(2011)
Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, PPoPP '11
, pp. 257-266
-
-
Dotsenko, Y.1
Baghsorkhi, S.2
Lloyd, B.3
Govindaraju, N.4
-
5
-
-
84884849756
-
-
website
-
M. Frigo and G. Johnson. The FFTW website, 2012. http://www.fftw.org.
-
(2012)
-
-
Frigo, M.1
Johnson, G.2
-
7
-
-
78649807974
-
Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid
-
Jan.
-
D. Goddeke and R. Strzodka. Cyclic reduction tridiagonal solvers on GPUs applied to mixed-precision multigrid. IEEE Trans. Parallel Distrib. Syst., 22(1):22-32, Jan. 2011.
-
(2011)
IEEE Trans. Parallel Distrib. Syst.
, vol.22
, Issue.1
, pp. 22-32
-
-
Goddeke, D.1
Strzodka, R.2
-
8
-
-
34548292052
-
A memory model for scientific algorithms on graphics processors
-
ACM
-
N. K. Govindaraju, S. Larsen, J. Gray, and D. Manocha. A memory model for scientific algorithms on graphics processors. In Proceedings of the 2006 ACM/IEEE conference on Supercomputing, SC '06, New York, NY, USA, 2006. ACM.
-
Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC '06, New York, NY, USA, 2006
-
-
Govindaraju, N.K.1
Larsen, S.2
Gray, J.3
Manocha, D.4
-
9
-
-
70350754502
-
High performance discrete Fourier transforms on graphics processors
-
Piscataway, NJ, USA, IEEE Press
-
N. K. Govindaraju, B. Lloyd, Y. Dotsenko, B. Smith, and J. Manferdelli. High performance discrete Fourier transforms on graphics processors. In Proceedings of the 2008 ACM/IEEE conference on Supercomputing, SC '08, pages 2:1-2:12, Piscataway, NJ, USA, 2008. IEEE Press.
-
(2008)
Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC '08
-
-
Govindaraju, N.K.1
Lloyd, B.2
Dotsenko, Y.3
Smith, B.4
Manferdelli, J.5
-
10
-
-
77954713684
-
An empirically tuned 2D and 3D FFT library on CUDA GPU
-
New York, NY, USA, ACM
-
L. Gu, X. Li, and J. Siegel. An empirically tuned 2D and 3D FFT library on CUDA GPU. In Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10, pages 305-314, New York, NY, USA, 2010. ACM.
-
(2010)
Proceedings of the 24th ACM International Conference on Supercomputing, ICS '10
, pp. 305-314
-
-
Gu, L.1
Li, X.2
Siegel, J.3
-
11
-
-
79959598034
-
Using GPUs to compute large out-of-card FFTs
-
New York, NY, USA, ACM
-
L. Gu, J. Siegel, and X. Li. Using GPUs to compute large out-of-card FFTs. In Proceedings of the international conference on Supercomputing, ICS '11, pages 255-264, New York, NY, USA, 2011. ACM.
-
(2011)
Proceedings of the International Conference on Supercomputing, ICS '11
, pp. 255-264
-
-
Gu, L.1
Siegel, J.2
Li, X.3
-
13
-
-
74049114159
-
Auto-tuning 3-D FFT library for CUDA GPUs
-
New York, NY, USA, ACM
-
A. Nukada and S. Matsuoka. Auto-tuning 3-D FFT library for CUDA GPUs. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, pages 30:1-30:10, New York, NY, USA, 2009. ACM.
-
(2009)
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09
-
-
Nukada, A.1
Matsuoka, S.2
-
14
-
-
70350759823
-
Bandwidth intensive 3-D FFT kernel for GPUs using CUDA
-
Piscataway, NJ, USA, IEEE Press
-
A. Nukada, Y. Ogata, T. Endo, and S. Matsuoka. Bandwidth intensive 3-D FFT kernel for GPUs using CUDA. In Proceedings of the 2008 ACM/IEEE conference on Supercomputing, SC '08, pages 5:1-5:11, Piscataway, NJ, USA, 2008. IEEE Press.
-
(2008)
Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC '08
-
-
Nukada, A.1
Ogata, Y.2
Endo, T.3
Matsuoka, S.4
-
17
-
-
84884877516
-
An optimized FFT-based direct Poisson solver on CUDA GPUs
-
To appear
-
J. Wu and J. JaJa. An optimized FFT-based direct Poisson solver on CUDA GPUs. IEEE Trans. Parallel Distrib. Syst. To appear.
-
IEEE Trans. Parallel Distrib. Syst.
-
-
Wu, J.1
JaJa, J.2
-
18
-
-
84870704125
-
Optimized strategies for mapping three-dimensional FFTs onto CUDA GPUs
-
IEEE Press
-
J. Wu and J. JaJa. Optimized strategies for mapping three-dimensional FFTs onto CUDA GPUs. In Innovative Parallel Computing (INPAR). IEEE Press, 2012.
-
(2012)
Innovative Parallel Computing (INPAR)
-
-
Wu, J.1
JaJa, J.2