-
1
-
-
70450059008
-
Accelerating leukocyte tracking using CUDA: A case study in leveraging manycore coprocessors
-
M. Boyer, D. Tarjan, S. T. Acton, and K. Skadron, "Accelerating Leukocyte Tracking using CUDA: A Case Study in Leveraging Manycore Coprocessors," in Proceedings of the international parallel and distributed processing symposium (IPDPS), 2009, pp. 1-12.
-
(2009)
Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS)
, pp. 1-12
-
-
Boyer, M.1
Tarjan, D.2
Acton, S.T.3
Skadron, K.4
-
2
-
-
51449118065
-
A performance study of general-purpose applications on graphics processors using CUDA
-
S. Che, J. Meng, J. W. Sheaffer, and K. Skadron, "A performance study of general-purpose applications on graphics processors using CUDA," Journal of parallel and distributed computing, vol. 68, no. 10, pp. 1370-1380, 2008.
-
(2008)
Journal of Parallel and Distributed Computing
, vol.68
, Issue.10
, pp. 1370-1380
-
-
Che, S.1
Meng, J.2
Sheaffer, J.W.3
Skadron, K.4
-
4
-
-
78651550268
-
Scalable parallel programming with CUDA
-
J. Nickolls, I. Buck, M. Garland, and K. Skadron, "Scalable Parallel Programming with CUDA," Queue, vol. 6, no. 2, pp. 40-53, 2008.
-
(2008)
Queue
, vol.6
, Issue.2
, pp. 40-53
-
-
Nickolls, J.1
Buck, I.2
Garland, M.3
Skadron, K.4
-
5
-
-
70350759823
-
Bandwidth intensive 3-D FFT kernel for GPUs using CUDA
-
A. Nukada, Y. Ogata, T. Endo, and S. Matsuoka, "Bandwidth Intensive 3-D FFT Kernel for GPUs using CUDA," in Proceedings of the ACM/IEEE SC conference on high performance networking and computing, 2008, pp. 1-11.
-
(2008)
Proceedings of the ACM/IEEE SC Conference on High Performance Networking and Computing
, pp. 1-11
-
-
Nukada, A.1
Ogata, Y.2
Endo, T.3
Matsuoka, S.4
-
6
-
-
79959466764
-
Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
-
S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk, and W. M. Hwu, "Optimization Principles and Application Performance Evaluation of a Multithreaded GPU using CUDA," in Proceedings of the ACM SIGPLAN symposium on principles and practice of parallel programming (PPoPP), 2008, pp. 73-82.
-
(2008)
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)
, pp. 73-82
-
-
Ryoo, S.1
Rodrigues, C.I.2
Baghsorkhi, S.S.3
Stone, S.S.4
Kirk, D.B.5
Hwu, W.M.6
-
7
-
-
38849131252
-
High-throughput sequence alignment using graphics processing units
-
M. Schatz, C. Trapnell, A. Delcher, and A. Varshney, "High-throughput Sequence Alignment Using Graphics Processing Units," BMC Bioinformatics, vol. 8, no. 1, p. 474, 2007.
-
(2007)
BMC Bioinformatics
, vol.8
, Issue.1
, p. 474
-
-
Schatz, M.1
Trapnell, C.2
Delcher, A.3
Varshney, A.4
-
9
-
-
22844457256
-
A critical assessment of coupled cluster method in quantum chemistry
-
J. Paldus and X. Li, "A Critical Assessment of Coupled Cluster Method in Quantum Chemistry," Advances in Chemical Physics, vol. 110, pp. 1-175, 1999.
-
(1999)
Advances in Chemical Physics
, vol.110
, pp. 1-175
-
-
Paldus, J.1
Li, X.2
-
11
-
-
36849099976
-
On correlation problem in atomic and molecular systems. Calculation of wavefunction components in ursell-type expansion using quantum-field theoretical methods
-
J. Cizek, "On Correlation Problem in Atomic and Molecular Systems. Calculation of Wavefunction Components in Ursell-Type Expansion Using Quantum-Field Theoretical Methods," Journal of Chemical Physics, vol. 45, no. 11, pp. 4256-4266, 1966.
-
(1966)
Journal of Chemical Physics
, vol.45
, Issue.11
, pp. 4256-4266
-
-
Cizek, J.1
-
12
-
-
33847389465
-
Coupled-cluster theory in quantum chemistry
-
R. J. Bartlett and M. Musiał, "Coupled-cluster theory in quantum chemistry," Reviews of Modern Physics, vol. 79, no. 1, pp. 291-352, Feb 2007.
-
(2007)
Reviews of Modern Physics
, vol.79
, Issue.1
, pp. 291-352
-
-
Bartlett, R.J.1
Musiał, M.2
-
13
-
-
0006244148
-
A 5th-order perturbation comparison of electron correlation theories
-
K. Raghavachari, G. W. Trucks, J. A. Pople, and M. Head-Gordon, "A 5th-Order Perturbation Comparison of Electron Correlation Theories," Chemical Physics Letters, vol. 157, no. 6, pp. 479-483, 1989.
-
(1989)
Chemical Physics Letters
, vol.157
, Issue.6
, pp. 479-483
-
-
Raghavachari, K.1
Trucks, G.W.2
Pople, J.A.3
Head-Gordon, M.4
-
14
-
-
31744435977
-
Automatic code generation for many-body electronic structure methods: The tensor contraction engine
-
A A Auer et al., "Automatic Code Generation for Many-body Electronic Structure Methods: the Tensor Contraction Engine," Molecular Physics, vol. 2, p. 211, 2006.
-
(2006)
Molecular Physics
, vol.2
, p. 211
-
-
Auer, A.A.1
-
15
-
-
0345566357
-
Tensor Contraction engine: Abstraction and automated parallel implementation of configuration-interaction, coupled-cluster, and many-body perturbation theories
-
S. Hirata, "Tensor Contraction Engine: Abstraction and Automated Parallel Implementation of Configuration-Interaction, Coupled-Cluster, and Many-Body Perturbation Theories," The Journal of Physical Chemistry A, vol. 107, no. 46, pp. 9887-9897, 2003.
-
(2003)
The Journal of Physical Chemistry A
, vol.107
, Issue.46
, pp. 9887-9897
-
-
Hirata, S.1
-
16
-
-
68849128792
-
A note on auto-tuning GEMM for GPUs
-
Y. Li, J. Dongarra, and S. Tomov, "A Note on Auto-tuning GEMM for GPUs," in Proceedings of the international conference on computational science (ICCS), 2009, pp. 884-892.
-
(2009)
Proceedings of the International Conference on Computational Science (ICCS)
, pp. 884-892
-
-
Li, Y.1
Dongarra, J.2
Tomov, S.3
-
17
-
-
67650056991
-
-
V. Volkov and J. Demmel, "LU, QR and Cholesky Factorizations using Vector Capabilities of GPUs," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2008-49, May 2008. [Online]. Available: http://www.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-49.html
-
(2008)
LU, QR and Cholesky Factorizations Using Vector Capabilities of GPUs
-
-
Volkov, V.1
Demmel, J.2
-
18
-
-
57349180412
-
A compiler framework for optimization of affine loop nests for GPGPUs
-
M. M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan, "A compiler framework for optimization of affine loop nests for GPGPUs," in Proceedings of the international conference on Supercomputing (ICS), 2008, pp. 225-234.
-
(2008)
Proceedings of the International Conference on Supercomputing (ICS)
, pp. 225-234
-
-
Baskaran, M.M.1
Bondhugula, U.2
Krishnamoorthy, S.3
Ramanujam, J.4
Rountev, A.5
Sadayappan, P.6
-
20
-
-
77749340082
-
Model-driven autotuning of sparse matrix-vector multiply on GPUs
-
J. W. Choi, A. Singh, and R. W. Vuduc, "Model-driven Autotuning of Sparse Matrix-vector Multiply on GPUs," in Proceedings of the ACM SIGPLAN symposium on principles and practice of parallel programming (PPoPP), 2010, pp. 115-126.
-
(2010)
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)
, pp. 115-126
-
-
Choi, J.W.1
Singh, A.2
Vuduc, R.W.3
-
21
-
-
67650563116
-
Software pipelined execution of stream programs on GPUs
-
A. Udupa, R. Govindarajan, and M. J. Thazhuthaveetil, "Software pipelined execution of stream programs on GPUs," in Proceedings of the international symposium on code generation and optimization (CGO), 2009, pp. 200-209.
-
(2009)
Proceedings of the International Symposium on Code Generation and Optimization (CGO)
, pp. 200-209
-
-
Udupa, A.1
Govindarajan, R.2
Thazhuthaveetil, M.J.3
-
22
-
-
33846471996
-
Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
-
M. I. Gordon, W. Thies, and S. Amarasinghe, "Exploiting Coarse-grained Task, Data, and Pipeline Parallelism in Stream Programs," SIGOPS Oper. Syst. Rev., vol. 40, no. 5, pp. 151-162, 2006.
-
(2006)
SIGOPS Oper. Syst. Rev.
, vol.40
, Issue.5
, pp. 151-162
-
-
Gordon, M.I.1
Thies, W.2
Amarasinghe, S.3
-
23
-
-
43449094719
-
Program optimization space pruning for a multithreaded GPU
-
S. Ryoo, C. I. Rodrigues, S. S. Stone, S. S. Baghsorkhi, S.-Z. Ueng, J. A. Stratton, and W.-m. W. Hwu, "Program Optimization Space Pruning for a Multithreaded GPU," in Proceedings of the international symposium on code generation and optimization (CGO), 2008, pp. 195-204.
-
(2008)
Proceedings of the International Symposium on Code Generation and Optimization (CGO)
, pp. 195-204
-
-
Ryoo, S.1
Rodrigues, C.I.2
Stone, S.S.3
Baghsorkhi, S.S.4
Ueng, S.-Z.5
Stratton, J.A.6
Hwu, W.-M.W.7
-
24
-
-
70450231944
-
An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
-
S. Hong and H. Kim, "An Analytical Model for a GPU Architecture with Memory-level and Thread-level Parallelism Awareness," SIGARCH Comput. Archit. News, vol. 37, no. 3, pp. 152-163, 2009.
-
(2009)
SIGARCH Comput. Archit. News
, vol.37
, Issue.3
, pp. 152-163
-
-
Hong, S.1
Kim, H.2
-
26
-
-
77957561221
-
An adaptive performance modeling tool for GPU architectures
-
S. S. Baghsorkhi, M. Delahaye, S. J. Patel, W. D. Gropp, and W.-m. W. Hwu, "An adaptive performance modeling tool for GPU architectures," in Proceedings of the ACM SIGPLAN symposium on principles and practice of parallel programming (PPoPP), 2010, pp. 105-114.
-
(2010)
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)
, pp. 105-114
-
-
Baghsorkhi, S.S.1
Delahaye, M.2
Patel, S.J.3
Gropp, W.D.4
Hwu, W.-M.W.5
-
27
-
-
70349204069
-
Absorption spectrum of the green fluorescent protein chromophore: A difficult case for ab initio methods?
-
C. Filippi, M. Zaccheddu, and F. Buda, "Absorption Spectrum of the Green Fluorescent Protein Chromophore: A Difficult Case for ab Initio Methods?" Journal of Chemical Theory and Computation, vol. 5, pp. 2074-2087, Jul 2009.
-
(2009)
Journal of Chemical Theory and Computation
, vol.5
, pp. 2074-2087
-
-
Filippi, C.1
Zaccheddu, M.2
Buda, F.3
-
28
-
-
33746614482
-
Gaussian basis sets for use in correlated molecular calculations. I. the atoms boron through neon and hydrogen
-
T. Dunning, "Gaussian Basis Sets for Use in Correlated Molecular Calculations. I. The Atoms Boron through Neon and Hydrogen," Journal of Chemical Physics, vol. 90, pp. 1007-1023, 1989.
-
(1989)
Journal of Chemical Physics
, vol.90
, pp. 1007-1023
-
-
Dunning, T.1
-
29
-
-
74049154762
-
Liquid water: Obtaining the right answer for the right reasons
-
E. Aprà, A. P. Rendell, R. J. Harrison, V. Tipparaju, W. A. de Jong, and S. S. Xantheas, "Liquid water: Obtaining the Right Answer for the Right Reasons," in Proceedings of the ACM/IEEE SC conference on high performance networking and computing, 2009, pp. 1-7.
-
(2009)
Proceedings of the ACM/IEEE SC Conference on High Performance Networking and Computing
, pp. 1-7
-
-
Aprà, E.1
Rendell, A.P.2
Harrison, R.J.3
Tipparaju, V.4
de Jong, W.A.5
Xantheas, S.S.6
-
30
-
-
34247114368
-
Combining analytical and empirical approaches in tuning matrix transposition
-
Q. Lu, S. Krishnamoorthy, and P. Sadayappan, "Combining Analytical and Empirical Approaches in Tuning Matrix Transposition," in Proceedings of the conference on parallel architectures and compilation techniques (PACT), 2006, pp. 233-242.
-
(2006)
Proceedings of the Conference on Parallel Architectures and Compilation Techniques (PACT)
, pp. 233-242
-
-
Lu, Q.1
Krishnamoorthy, S.2
Sadayappan, P.3
-
33
-
-
70449643566
-
Memory performance and cache coherency effects on an intel nehalem multiprocessor system
-
D. Molka, D. Hackenberg, R. Schone, and M. S. Muller, "Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System," in Proceedings of the conference on parallel architectures and compilation techniques (PACT), 2009, pp. 261-270.
-
(2009)
Proceedings of the Conference on Parallel Architectures and Compilation Techniques (PACT)
, pp. 261-270
-
-
Molka, D.1
Hackenberg, D.2
Schone, R.3
Muller, M.S.4
-
34
-
-
84873160007
-
-
H. T. Consortium, "PCI Express 3.0 specification," http://www.hypertransport.org/docs/twgdocs/HTC20051222-00046-0028.pdf.
-
PCI Express 3.0 Specification
-
-
-
35
-
-
70449693703
-
-
Intel, "An Introduction to the Intel QuickPath Interconnect," Document Number: 320412, January 2009, http://www.intel.com/technology/quickpath/introduction.pdf.
-
(2009)
An Introduction to the Intel QuickPath Interconnect
-
-