-
1
-
-
41249087856
-
General purpose molecular dynamics simulations fully implemented on graphics processing units
-
J. A. Anderson, C. D. Lorenz, and A. Travesset, "General purpose molecular dynamics simulations fully implemented on graphics processing units," Journal of Computational Physics, vol. 227, no. 10, 2008.
-
(2008)
Journal of Computational Physics
, vol.227
, Issue.10
-
-
Anderson, J.A.1
Lorenz, C.D.2
Travesset, A.3
-
2
-
-
36849056785
-
Real-time deformation of detailed geometry based on mappings to a less detailed physical simulation on the GPU
-
Eurographics Association
-
J. Mosegaard and T. S. Sorensen, "Real-time deformation of detailed geometry based on mappings to a less detailed physical simulation on the GPU," in Proceedings of the 11th Eurographics Conference on Virtual Environments, pp. 105-111, Eurographics Association, 2005.
-
(2005)
Proceedings of the 11th Eurographics Conference on Virtual Environments
, pp. 105-111
-
-
Mosegaard, J.1
Sorensen, T.S.2
-
4
-
-
77956373685
-
Optix: A general purpose ray tracing engine
-
ACM
-
S. G. Parker, J. Bigler, A. Dietrich, H. Friedrich, J. Hoberock, D. Luebke, D. McAllister, M. McGuire, K. Morley, A. Robison, et al., "Optix: a general purpose ray tracing engine," in ACM Transactions on Graphics (TOG), vol. 29, p. 66, ACM, 2010.
-
(2010)
ACM Transactions on Graphics (TOG)
, vol.29
, pp. 66
-
-
Parker, S.G.1
Bigler, J.2
Dietrich, A.3
Friedrich, H.4
Hoberock, J.5
Luebke, D.6
McAllister, D.7
McGuire, M.8
Morley, K.9
Robison, A.10
-
8
-
-
84892547586
-
Divergence-aware warp scheduling
-
T. G. Rogers, M. O'Connor, and T. M. Aamodt, "Divergence-aware warp scheduling," in Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46), pp. 99-110, 2013.
-
(2013)
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46)
, pp. 99-110
-
-
Rogers, T.G.1
O'Connor, M.2
Aamodt, T.M.3
-
9
-
-
84863342255
-
Improving GPU performance via large warps and two-level warp scheduling
-
V. Narasiman, M. Shebanow, C. J. Lee, R. Miftakhutdinov, O. Mutlu, and Y. N. Patt, "Improving GPU performance via large warps and two-level warp scheduling," in Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, (MICRO-44), 2011.
-
(2011)
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, (MICRO-44)
-
-
Narasiman, V.1
Shebanow, M.2
Lee, C.J.3
Miftakhutdinov, R.4
Mutlu, O.5
Patt, Y.N.6
-
10
-
-
84875640178
-
Owl: Cooperative thread array aware scheduling techniques for improving GPGPU performance
-
A. Jog, O. Kayiran, N. Chidambaram Nachiappan, A. K. Mishra, M. T. Kandemir, O. Mutlu, R. Iyer, and C. R. Das, "Owl: Cooperative thread array aware scheduling techniques for improving GPGPU performance," in Proceedings of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'13), 2013.
-
(2013)
Proceedings of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'13)
-
-
Jog, A.1
Kayiran, O.2
Chidambaram Nachiappan, N.3
Mishra, A.K.4
Kandemir, M.T.5
Mutlu, O.6
Iyer, R.7
Das, C.R.8
-
11
-
-
84903999614
-
Warp-level divergence in GPUs: Characterization, impact, and mitigation
-
P. Xiang, Y. Yang, and H. Zhou, "Warp-level divergence in GPUs: Characterization, impact, and mitigation," in Proceedings of 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA-20), 2014.
-
(2014)
Proceedings of 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA-20)
-
-
Xiang, P.1
Yang, Y.2
Zhou, H.3
-
12
-
-
84887477265
-
Neither more nor less: Optimizing thread-level parallelism for GPGPUs
-
O. Kayiran, A. Jog, M. T. Kandemir, and C. R. Das, "Neither more nor less: Optimizing thread-level parallelism for GPGPUs," in Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT'13), 2013.
-
(2013)
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT'13)
-
-
Kayiran, O.1
Jog, A.2
Kandemir, M.T.3
Das, C.R.4
-
13
-
-
84903951085
-
Improving GPGPU resource utilization through alternative thread block scheduling
-
M. Lee, S. Song, J. Moon, J. Kim, W. Seo, Y. Cho, and S. Ryu, "Improving GPGPU resource utilization through alternative thread block scheduling," in Proceedings of 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA-20), 2014.
-
(2014)
Proceedings of 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA-20)
-
-
Lee, M.1
Song, S.2
Moon, J.3
Kim, J.4
Seo, W.5
Cho, Y.6
Ryu, S.7
-
16
-
-
84960076275
-
Dynamic thread block launch: A lightweight execution mechanism to support irregular applications on GPUs
-
J. Wang, N. Rubin, A. Sidelnik, and S. Yalamanchili, "Dynamic thread block launch: A lightweight execution mechanism to support irregular applications on GPUs," in Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA-42), 2015.
-
(2015)
Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA-42)
-
-
Wang, J.1
Rubin, N.2
Sidelnik, A.3
Yalamanchili, S.4
-
21
-
-
84892519096
-
A locality-aware memory hierarchy for energy-efficient GPU architectures
-
M. Rhu, M. Sullivan, J. Leng, and M. Erez, "A locality-aware memory hierarchy for energy-efficient GPU architectures," in Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46), 2013.
-
(2013)
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46)
-
-
Rhu, M.1
Sullivan, M.2
Leng, J.3
Erez, M.4
-
22
-
-
78751505898
-
A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads
-
S. Che, J. W. Sheaffer, M. Boyer, L. G. Szafaryn, L. Wang, and K. Skadron, "A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads," in Proceedings of 2010 IEEE International Symposium on Workload Characterization (IISWC'10), 2010.
-
(2010)
Proceedings of 2010 IEEE International Symposium on Workload Characterization (IISWC'10)
-
-
Che, S.1
Sheaffer, J.W.2
Boyer, M.3
Szafaryn, L.G.4
Wang, L.5
Skadron, K.6
-
24
-
-
84905509992
-
Enabling preemptive multiprogramming on GPUs
-
I. Tanasic, I. Gelado, J. Cabezas, A. Ramirez, N. Navarro, and M. Valero, "Enabling preemptive multiprogramming on GPUs," in Proceedings of the 41st Annual International Symposium on Computer Architecture (ISCA-41), 2014.
-
(2014)
Proceedings of the 41st Annual International Symposium on Computer Architecture (ISCA-41)
-
-
Tanasic, I.1
Gelado, I.2
Cabezas, J.3
Ramirez, A.4
Navarro, N.5
Valero, M.6
-
26
-
-
70349169075
-
Analyzing CUDA workloads using a detailed GPU simulator
-
A. Bakhoda, G. Yuan, W. Fung, H. Wong, and T. Aamodt, "Analyzing CUDA workloads using a detailed GPU simulator," in Proceedings of 2009 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS'09), 2009.
-
(2009)
Proceedings of 2009 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS'09)
-
-
Bakhoda, A.1
Yuan, G.2
Fung, W.3
Wong, H.4
Aamodt, T.5
-
27
-
-
84960162410
-
Thermodynamic states in explosion fields
-
Coeur d'Alene Resort, ID, USA
-
A. Kuhl, "Thermodynamic states in explosion fields," in 14th International Symposium on Detonation, Coeur d'Alene Resort, ID, USA, 2010.
-
(2010)
14th International Symposium on Detonation
-
-
Kuhl, A.1
-
28
-
-
84858427151
-
An efficient CUDA implementation of the tree-based Barnes Hut n-body algorithm
-
M. Burtscher and K. Pingali, "An efficient CUDA implementation of the tree-based Barnes Hut n-body algorithm," GPU Computing Gems Emerald Edition, p. 75, 2011.
-
(2011)
GPU Computing Gems Emerald Edition
, pp. 75
-
-
Burtscher, M.1
Pingali, K.2
-
32
-
-
80052350460
-
Gregex: GPU based high speed regular expression matching engine
-
IEEE
-
L. Wang, S. Chen, Y. Tang, and J. Su, "Gregex: GPU based high speed regular expression matching engine," in Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS), 2011 Fifth International Conference on, pp. 366-370, IEEE, 2011.
-
(2011)
Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS), 2011 Fifth International Conference on
, pp. 366-370
-
-
Wang, L.1
Chen, S.2
Tang, Y.3
Su, J.4
-
33
-
-
85019691440
-
Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory
-
J. McHugh, "Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory," ACM Transactions on Information and System Security, vol. 3, no. 4, pp. 262-294, 2000.
-
(2000)
ACM Transactions on Information and System Security
, vol.3
, Issue.4
, pp. 262-294
-
-
McHugh, J.1
-
34
-
-
84893303174
-
GPU accelerated item-based collaborative filtering for bigdata applications
-
C. H. Nadungodage, Y. Xia, J. J. Lee, M. Lee, and C. S. Park, "GPU accelerated item-based collaborative filtering for bigdata applications," in Proceedings of 2013 IEEE International Conference on Big Data, 2013.
-
(2013)
Proceedings of 2013 IEEE International Conference on Big Data
-
-
Nadungodage, C.H.1
Xia, Y.2
Lee, J.J.3
Lee, M.4
Park, C.S.5
-
35
-
-
85015559680
-
An algorithmic framework for performing collaborative filtering
-
J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl, "An algorithmic framework for performing collaborative filtering," in Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1999.
-
(1999)
Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
-
-
Herlocker, J.L.1
Konstan, J.A.2
Borchers, A.3
Riedl, J.4
-
36
-
-
84875193084
-
Relational algorithms for multi-bulk-synchronous processors
-
G. Diamos, H. Wu, J. Wang, A. Lele, and S. Yalamanchili, "Relational algorithms for multi-bulk-synchronous processors," in Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'13), 2013.
-
(2013)
Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'13)
-
-
Diamos, G.1
Wu, H.2
Wang, J.3
Lele, A.4
Yalamanchili, S.5
-
37
-
-
70349191933
-
Lonestar: A suite of parallel irregular programs
-
M. Kulkarni, M. Burtscher, C. Cascaval, and K. Pingali, "Lonestar: A suite of parallel irregular programs," in Proceedings of 2009 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS'09), 2009.
-
(2009)
Proceedings of 2009 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS'09)
-
-
Kulkarni, M.1
Burtscher, M.2
Cascaval, C.3
Pingali, K.4
-
39
-
-
84870410502
-
Nested data-parallelism on the GPU
-
ACM
-
L. Bergstrom and J. Reppy, "Nested data-parallelism on the GPU," in ACM SIGPLAN Notices, vol. 47, pp. 247-258, ACM, 2012.
-
(2012)
ACM SIGPLAN Notices
, vol.47
, pp. 247-258
-
-
Bergstrom, L.1
Reppy, J.2
-
40
-
-
84905454859
-
Fine-grain task aggregation and coordination on GPUs
-
M. S. Orr, B. M. Beckmann, S. K. Reinhardt, and D. A. Wood, "Fine-grain task aggregation and coordination on GPUs," in Proceedings of the 41st Annual International Symposium on Computer Architecture (ISCA-41), 2014.
-
(2014)
Proceedings of the 41st Annual International Symposium on Computer Architecture (ISCA-41)
-
-
Orr, M.S.1
Beckmann, B.M.2
Reinhardt, S.K.3
Wood, D.A.4
-
41
-
-
85047004205
-
Locality-aware mapping of nested parallel patterns on GPUs
-
H. Lee, K. Brown, A. Sujeeth, T. Rompf, and K. Olukotun, "Locality-aware mapping of nested parallel patterns on GPUs," in Proceedings of the 47th International Symposium on Microarchitecture (MICRO-47), 2014.
-
(2014)
Proceedings of the 47th International Symposium on Microarchitecture (MICRO-47)
-
-
Lee, H.1
Brown, K.2
Sujeeth, A.3
Rompf, T.4
Olukotun, K.5