-
1
-
-
84968470212
-
An Algorithm for the Machine Calculation of Complex Fourier Series
-
J. W. Cooley and J. W. Tukey. An Algorithm for the Machine Calculation of Complex Fourier Series. Math. Comput., Vol. 19:297-301, 1965.
-
(1965)
Math. Comput
, vol.19
, pp. 297-301
-
-
Cooley, J.W.1
Tukey, J.W.2
-
2
-
-
70350771127
-
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
-
Piscataway, NJ, USA, IEEE Press
-
K. Datta, M. Murphy, V. Volkov, S. Williams, J. Carter, L. Oliker, D. Patterson, J. Shalf, and K. Yelick. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing, pages 1-12, Piscataway, NJ, USA, 2008. IEEE Press.
-
(2008)
SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing
, pp. 1-12
-
-
Datta, K.1
Murphy, M.2
Volkov, V.3
Williams, S.4
Carter, J.5
Oliker, L.6
Patterson, D.7
Shalf, J.8
Yelick, K.9
-
3
-
-
20744449792
-
-
M. Frigo and S. G. Johnson. The Design and Implementation of FFTW3. Proceedings of the IEEE, 93(2):216-231, 2005. Special issue on "Program Generation, Optimization, and Platform Adaptation".
-
-
-
-
4
-
-
74049116773
-
-
General-Purpose Computation Using Graphics Hardware. http://www.gpgpu.org/.
-
-
-
-
6
-
-
70350754502
-
High Performance Discrete Fourier Transforms on Graphics Processors
-
N. K. Govindaraju, B. Lloyd, Y. Dotsenko, B. Smith, and J. Manferdelli. High Performance Discrete Fourier Transforms on Graphics Processors. In the 2008 ACM/IEEE conference on supercomputing, 2008.
-
(2008)
the 2008 ACM/IEEE conference on supercomputing
-
-
Govindaraju, N.K.1
Lloyd, B.2
Dotsenko, Y.3
Smith, B.4
Manferdelli, J.5
-
9
-
-
58349104360
-
The Rise of the Commodity Vectors
-
J. M. L. M. Palma, P. Amestoy, M. J. Daydé, M. Mattoso, and J. C. Lopes, editors, VECPAR, volume 5336 of Lecture Notes in Computer Science, Springer
-
S. Matsuoka. The Rise of the Commodity Vectors. In J. M. L. M. Palma, P. Amestoy, M. J. Daydé, M. Mattoso, and J. C. Lopes, editors, VECPAR, volume 5336 of Lecture Notes in Computer Science, pages 53-62. Springer, 2008.
-
(2008)
Lecture Notes in Computer Science
, vol.5336
, pp. 53-62
-
-
Matsuoka, S.1
-
11
-
-
70350759823
-
Bandwidth intensive 3-D FFT kernel for GPUs using CUDA
-
Piscataway, NJ, USA, IEEE Press
-
A. Nukada, Y. Ogata, T. Endo, and S. Matsuoka. Bandwidth intensive 3-D FFT kernel for GPUs using CUDA. In SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing, pages 1-11, Piscataway, NJ, USA, 2008. IEEE Press.
-
(2008)
SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing
, pp. 1-11
-
-
Nukada, A.1
Ogata, Y.2
Endo, T.3
Matsuoka, S.4
-
12
-
-
44849094749
-
Fast N-Body Simulation with CUDA
-
H. Nguyen, editor, GPU Gems 3, chapter 31, Addison-Wesley
-
L. Nyland, M. Harris, and J. Prins. Fast N-Body Simulation with CUDA. In H. Nguyen, editor, GPU Gems 3, chapter 31, pages 677-695. Addison-Wesley, 2007.
-
(2007)
GPU Gems 3
, pp. 677-695
-
-
Nyland, L.1
Harris, M.2
Prins, J.3
-
13
-
-
19344368072
-
-
M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. Singer, J. Xiong, F. Franchetti, A. Gacic, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code Generation for DSP Transforms. Proceedings of the IEEE, special issue on "Program Generation, Optimization, and Adaptation", 93(2):232-275, 2005.
-
-
-
-
17
-
-
70350771131
-
Benchmarking GPUs to tune dense linear algebra
-
Piscataway, NJ, USA, IEEE Press
-
V. Volkov and J. W. Demmel. Benchmarking GPUs to tune dense linear algebra. In SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing, pages 1-11, Piscataway, NJ, USA, 2008. IEEE Press.
-
(2008)
SC '08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing
, pp. 1-11
-
-
Volkov, V.1
Demmel, J.W.2
-
19
-
-
0343462141
-
Automated empirical optimizations of software and the ATLAS project
-
R. C. Whaley, A. Petitet, and J. J. Dongarra. Automated empirical optimizations of software and the ATLAS project. Parallel Computing, 27(1-2):3-35, 2001.
-
(2001)
Parallel Computing
, vol.27
, Issue.1-2
, pp. 3-35
-
-
Whaley, R.C.1
Petitet, A.2
Dongarra, J.J.3