-
1
-
-
0004072686
-
-
Addison Wesley
-
Alfred Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison Wesley, 1986.
-
(1986)
Compilers: Principles, Techniques, and Tools.
-
-
Aho, A.1
Sethi, R.2
Ullman, J.D.3
-
2
-
-
70349169075
-
Analyzing CUDA workloads using a detailed GPU simulator
-
Ali Bakhoda, George L. Yuan, Wilson W. L. Fung, Henry Wong, and Tor M. Aamodt. Analyzing CUDA Workloads Using a Detailed GPU Simulator. Proc. of the 2009 IEEE International Symposium on Performance Analysis of Systems and Software, pp. 163-174, 2009.
-
(2009)
Proc. of the 2009 IEEE International Symposium on Performance Analysis of Systems and Software
, pp. 163-174
-
-
Bakhoda, A.1
Yuan, G.L.2
Fung, W.W.L.3
Wong, H.4
Aamodt, T.M.5
-
3
-
-
33846349887
-
A hierarchical O(N log N) force-calculation algorithm
-
Josh Barnes and Piet Hut. A hierarchical O(N log N) force-calculation algorithm. Nature, 324(4):446-449, 1986.
-
(1986)
Nature
, vol.324
, Issue.4
, pp. 446-449
-
-
Barnes, J.1
Hut, P.2
-
5
-
-
26944443478
-
Survey propagation: An algorithm for satisfiability
-
A. Braunstein, M. Mezard, and R. Zecchina. Survey Propagation: An Algorithm for Satisfiability. Random Structures and Algorithms, 27:201-226, 2005.
-
(2005)
Random Structures and Algorithms
, vol.27
, pp. 201-226
-
-
Braunstein, A.1
Mezard, M.2
Zecchina, R.3
-
6
-
-
84858427151
-
An efficient CUDA implementation of the tree-based Barnes Hut n-body algorithm
-
Morgan Kaufmann
-
Martin Burtscher and Keshav Pingali. An efficient CUDA implementation of the tree-based Barnes Hut n-body algorithm. GPU Computing Gems Emerald Edition, pp. 75-92. Morgan Kaufmann, 2011.
-
(2011)
GPU Computing Gems Emerald Edition
, pp. 75-92
-
-
Burtscher, M.1
Pingali, K.2
-
9
-
-
51449118065
-
A performance study of general-purpose applications on graphics processors using CUDA
-
Shuai Che, Michael Boyer, Jiayuan Meng, David Tarjan, Jeremy W. Sheaffer, and Kevin Skadron. A performance study of general-purpose applications on graphics processors using CUDA. Journal of Parallel and Distributed Computing, 68:1370-1380, 2008.
-
(2008)
Journal of Parallel and Distributed Computing
, vol.68
, pp. 1370-1380
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.W.5
Skadron, K.6
-
10
-
-
78751505898
-
A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads
-
Shuai Che, Jeremy W. Sheaffer, Michael Boyer, Lukasz G. Szafaryn, Liang Wang, and Kevin Skadron. A Characterization of the Rodinia Benchmark Suite with Comparison to Contemporary CMP Workloads. Proc. of the 2010 IEEE International Symposium on Workload Characterization, pp. 1-11, 2010.
-
(2010)
Proc. of the 2010 IEEE International Symposium on Workload Characterization
, pp. 1-11
-
-
Che, S.1
Sheaffer, J.W.2
Boyer, M.3
Szafaryn, L.G.4
Wang, L.5
Skadron, K.6
-
13
-
-
84946074430
-
-
CUB, http://nvlabs.github.io/cub/, 2014.
-
(2014)
CUB
-
-
-
14
-
-
84946039062
-
-
Fermi. http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf, 2010.
-
(2010)
Fermi.
-
-
-
15
-
-
78751477137
-
Exploring GPGPU workloads: Characterization methodology, analysis and microarchitecture evaluation implications
-
Nilanjan Goswami, Ramkumar Shankar, Madhura Joshi, and Tao Li. Exploring GPGPU Workloads: Characterization Methodology, Analysis and Microarchitecture Evaluation Implications. Proc. of the 2010 IEEE International Symposium on Workload Characterization, pp. 1-10, 2010.
-
(2010)
Proc. of the 2010 IEEE International Symposium on Workload Characterization
, pp. 1-10
-
-
Goswami, N.1
Shankar, R.2
Joshi, M.3
Li, T.4
-
16
-
-
84892549898
-
-
GPGPU-Sim 3.x Manual. http://gpgpusim.org/manual/index.php5/GPGPU-Sim-3.x-Manual, 2012.
-
(2012)
GPGPU-Sim 3.x Manual.
-
-
-
20
-
-
67650076853
-
How much parallelism is there in irregular applications?
-
Milind Kulkarni, Martin Burtscher, Rajasekhar Inkulu, and Keshav Pingali. How Much Parallelism is There in Irregular Applications? Proc. of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pp. 3-14, 2009.
-
(2009)
Proc. of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 3-14
-
-
Kulkarni, M.1
Burtscher, M.2
Inkulu, R.3
Pingali, K.4
-
21
-
-
35448941890
-
Optimistic parallelism requires abstractions
-
Milind Kulkarni, Keshav Pingali, Bruce Walter, Ganesh Ramanarayanan, Kavita Bala, and L. Paul Chew. Optimistic Parallelism Requires Abstractions. Proc. of the ACM Conference on Programming Language Design and Implementation, pp. 211-222, 2007.
-
(2007)
Proc. of the ACM Conference on Programming Language Design and Implementation
, pp. 211-222
-
-
Kulkarni, M.1
Pingali, K.2
Walter, B.3
Ramanarayanan, G.4
Bala, K.5
Chew, L.P.6
-
23
-
-
84938839413
-
-
LonestarGPU, http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu
-
LonestarGPU
-
-
-
32
-
-
33947588048
-
A survey of general purpose computation on graphics hardware
-
John D. Owens, David Luebke, Naga Govindaraju, Mark Harris, Jens Krüger, Aaron Lefohn, and Timothy J. Purcell. A survey of general purpose computation on graphics hardware. Computer Graphics Forum, 26(1):80-113, 2007.
-
(2007)
Computer Graphics Forum
, vol.26
, Issue.1
, pp. 80-113
-
-
Owens, J.D.1
Luebke, D.2
Govindaraju, N.3
Harris, M.4
Krüger, J.5
Lefohn, A.6
Purcell, T.J.7
-
34
-
-
77952579552
-
Demystifying GPU microarchitecture through microbenchmarking
-
Henry Wong, Misel-Myrto Papadopoulou, Maryam Sadooghi-Alvandi, and Andreas Moshovos. Demystifying GPU Microarchitecture through Microbenchmarking. Proc. of the 2010 IEEE International Symposium on Performance Analysis of Systems and Software, pp. 235-246, 2010.
-
(2010)
Proc. of the 2010 IEEE International Symposium on Performance Analysis of Systems and Software
, pp. 235-246
-
-
Wong, H.1
Papadopoulou, M.2
Sadooghi-Alvandi, M.3
Moshovos, A.4