-
1
-
-
84855693023
-
Speeding up the evaluation phase of GP classification algorithms on GPUs
-
A. Cano, A. Zafra, and S. Ventura, "Speeding up the evaluation phase of GP classification algorithms on GPUs," Soft Computing, vol. 16, no. 2, pp. 187-202, 2012.
-
(2012)
Soft Computing
, vol.16
, Issue.2
, pp. 187-202
-
-
Cano, A.1
Zafra, A.2
Ventura, S.3
-
3
-
-
84885199802
-
Relational algorithms for multi-bulk-synchronous processors
-
G. Diamos, H. Wu, J. Wang, A. Lele, and S. Yalamanchili, "Relational algorithms for multi-bulk-synchronous processors," in Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013, pp. 301-302.
-
(2013)
Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 301-302
-
-
Diamos, G.1
Wu, H.2
Wang, J.3
Lele, A.4
Yalamanchili, S.5
-
4
-
-
84942543488
-
A portable benchmark suite for highly parallel data intensive query processing
-
I. Saeed, J. Young, and S. Yalamanchili, "A portable benchmark suite for highly parallel data intensive query processing," in Proceedings of the 2nd Workshop on Parallel Programming for Analytics Applications, 2015, pp. 31-38.
-
(2015)
Proceedings of the 2nd Workshop on Parallel Programming for Analytics Applications
, pp. 31-38
-
-
Saeed, I.1
Young, J.2
Yalamanchili, S.3
-
5
-
-
63449109979
-
Fast BVH construction on GPUs
-
C. Lauterbach, M. Garland, S. Sengupta, D. Luebke, and D. Manocha, "Fast BVH construction on GPUs," Computer Graphics Forum, vol. 28, no. 2, 2009.
-
(2009)
Computer Graphics Forum
, vol.28
, Issue.2
-
-
Lauterbach, C.1
Garland, M.2
Sengupta, S.3
Luebke, D.4
Manocha, D.5
-
6
-
-
38149002407
-
Whitted raytracing for dynamic scenes using a ray-space hierarchy on the GPU
-
D. Roger, U. Assarsson, and N. Holzschuch, "Whitted raytracing for dynamic scenes using a ray-space hierarchy on the GPU," in Proceedings of the 18th Eurographics Conference on Rendering Techniques, 2007, pp. 99-110.
-
(2007)
Proceedings of the 18th Eurographics Conference on Rendering Techniques
, pp. 99-110
-
-
Roger, D.1
Assarsson, U.2
Holzschuch, N.3
-
7
-
-
79955825340
-
Load balancing versus occupancy maximization on graphics processing units: The generalized hough transform as a case study
-
J. Gómez-Luna, J. M. González-Linares, J. Ignacio Benavides, E. L. Zapata, and N. Guil, "Load balancing versus occupancy maximization on graphics processing units: The generalized hough transform as a case study," International Journal of High Performance Computing Applications, vol. 25, no. 2, pp. 205-222, 2011.
-
(2011)
International Journal of High Performance Computing Applications
, vol.25
, Issue.2
, pp. 205-222
-
-
Gómez-Luna, J.1
González-Linares, J.M.2
Ignacio Benavides, J.3
Zapata, E.L.4
Guil, N.5
-
8
-
-
84865331893
-
Algorithm and data optimization techniques for scaling to massively threaded systems
-
J. Stratton, C. Rodrigues, I.-J. Sung, L.-W. Chang, N. Anssari, G. Liu, W.-M. Hwu, and N. Obeid, "Algorithm and data optimization techniques for scaling to massively threaded systems," Computer, vol. 45, no. 8, pp. 26-32, 2012.
-
(2012)
Computer
, vol.45
, Issue.8
, pp. 26-32
-
-
Stratton, J.1
Rodrigues, C.2
Sung, I.-J.3
Chang, L.-W.4
Anssari, N.5
Liu, G.6
Hwu, W.-M.7
Obeid, N.8
-
9
-
-
84896811291
-
A decomposition for in-place matrix transposition
-
B. Catanzaro, A. Keller, and M. Garland, "A decomposition for in-place matrix transposition," in Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014, pp. 193-206.
-
(2014)
Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 193-206
-
-
Catanzaro, B.1
Keller, A.2
Garland, M.3
-
10
-
-
84976494728
-
In-place matrix transposition on GPUs
-
J. Gómez-Luna, I. Sung, L.-W. Chang, J. González-Linares, N. Guil, and W.-M. W. Hwu, "In-place matrix transposition on GPUs," IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp. 1-1, 2015.
-
(2015)
IEEE Transactions on Parallel and Distributed Systems
, vol.PP
, Issue.99
, pp. 1
-
-
Gómez-Luna, J.1
Sung, I.2
Chang, L.-W.3
González-Linares, J.4
Guil, N.5
Hwu, W.-M.W.6
-
11
-
-
84896808494
-
-
Ph.D. dissertation, University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering
-
I.-J. Sung, "Data layout transformation through in-place transposition," Ph.D. dissertation, University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering, 2013.
-
(2013)
Data Layout Transformation Through In-place Transposition
-
-
Sung, I.-J.1
-
13
-
-
0029492798
-
Transposing a matrix on a vector computer
-
M. Dow, "Transposing a matrix on a vector computer," Parallel Computing, vol. 21, no. 12, pp. 1997-2005, 1995.
-
(1995)
Parallel Computing
, vol.21
, Issue.12
, pp. 1997-2005
-
-
Dow, M.1
-
14
-
-
84875175606
-
StreamScan: Fast scan algorithms for GPUs without global barrier synchronization
-
S. Yan, G. Long, and Y. Zhang, "StreamScan: Fast scan algorithms for GPUs without global barrier synchronization," in Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013, pp. 229-238.
-
(2013)
Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 229-238
-
-
Yan, S.1
Long, G.2
Zhang, Y.3
-
16
-
-
57349184047
-
Fast scan algorithms on graphics processors
-
Y. Dotsenko, N. K. Govindaraju, P.-P. Sloan, C. Boyd, and J. Manferdelli, "Fast scan algorithms on graphics processors," in Proceedings of the 22nd Annual International Conference on Supercomputing, 2008, pp. 205-213.
-
(2008)
Proceedings of the 22nd Annual International Conference on Supercomputing
, pp. 205-213
-
-
Dotsenko, Y.1
Govindaraju, N.K.2
Sloan, P.-P.3
Boyd, C.4
Manferdelli, J.5
-
18
-
-
0002924004
-
-
Carnegie Mellon University, Technical Report CMU-CS-90-190
-
G. E. Blelloch, "Prefix sums and their applications," Carnegie Mellon University, Technical Report CMU-CS-90-190, 1990.
-
(1990)
Prefix Sums and Their Applications
-
-
Blelloch, G.E.1
-
19
-
-
84877899022
-
Optimizing parallel prefix operations for the Fermi architecture
-
M. Harris and M. Garland, "Optimizing parallel prefix operations for the Fermi architecture," GPU Computing Gems: Jade Edition, 2012.
-
(2012)
GPU Computing Gems: Jade Edition
-
-
Harris, M.1
Garland, M.2
-
21
-
-
84961314978
-
Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures
-
H.-S. Kim, I. El Hajj, J. Stratton, S. Lumetta, and W.-M. Hwu, "Locality-centric thread scheduling for bulk-synchronous programming models on CPU architectures," in Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015, pp. 257-268.
-
(2015)
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization
, pp. 257-268
-
-
Kim, H.-S.1
El Hajj, I.2
Stratton, J.3
Lumetta, S.4
Hwu, W.-M.5