-
1
-
-
80052312080
-
Keeneland: Bringing heterogeneous GPU computing to the computational science community
-
J. S. Vetter, R. Glassbrook, J. Dongarra, K. Schwan, B. Loftis, S. McNally, J. Meredith, J. Rogers, P. Roth, K. Spafford, and S. Yalamanchili, "Keeneland: Bringing heterogeneous GPU computing to the computational science community," IEEE Computing in Science and Engineering, vol. 13, no. 5, pp. 90-95, 2011.
-
(2011)
IEEE Computing in Science and Engineering
, vol.13
, Issue.5
, pp. 90-95
-
-
Vetter, J.S.1
Glassbrook, R.2
Dongarra, J.3
Schwan, K.4
Loftis, B.5
McNally, S.6
Meredith, J.7
Rogers, J.8
Roth, P.9
Spafford, K.10
Yalamanchili, S.11
-
2
-
-
79951595196
-
The International Exascale Software Project RoadMap
-
J. Dongarra, P. Beckman, T. Moore, P. Aerts, G. Aloisio, J.-. Andre, D. Barkai, J.-. Berthou, T. Boku, B. Braunschweig, F. Cappello, B. Chapman, X. Chi, A. Choudhary, S. Dosanjh, T. Dunning, S. Fiore, A. Geist, B. Gropp, R. Harrison, M. Hereld, M. Heroux, A. Hoisie, K. Hotta, Y. Ishikawa, Z. Jin, F. Johnson, S. Kale, R. Kenway, D. Keyes, B. Kramer, J. Labarta, A. Lichnewsky, T. Lippert, B. Lucas, B. Maccabe, S. Matsuoka, P. Messina, P. Michielse, B. Mohr, M. Mueller, W. Nagel, H. Nakashima, M. E. Papka, D. Reed, M. Sato, E. Seidel, J. Shalf, D. Skinner, M. Snir, T. Sterling, R. Stevens, F. Streitz, B. Sugar, S. Sumimoto, W. Tang, J. Taylor, R. Thakur, A. Trefethen, M. Valero, A. van der Steen, J. Vetter, P. Williams, R. Wisniewski, and K. Yelick, "The International Exascale Software Project RoadMap," International Journal of High Performance Computing Applications, vol. 25, no. 1, 2011.
-
(2011)
International Journal of High Performance Computing Applications
, vol.25
, Issue.1
-
-
Dongarra, J.1
Beckman, P.2
Moore, T.3
Aerts, P.4
Aloisio, G.5
Andre, J.6
Barkai, D.7
Berthou, J.8
Boku, T.9
Braunschweig, B.10
Cappello, F.11
Chapman, B.12
Chi, X.13
Choudhary, A.14
Dosanjh, S.15
Dunning, T.16
Fiore, S.17
Geist, A.18
Gropp, B.19
Harrison, R.20
Hereld, M.21
Heroux, M.22
Hoisie, A.23
Hotta, K.24
Ishikawa, Y.25
Jin, Z.26
Johnson, F.27
Kale, S.28
Kenway, R.29
Keyes, D.30
Kramer, B.31
Labarta, J.32
Lichnewsky, A.33
Lippert, T.34
Lucas, B.35
Maccabe, B.36
Matsuoka, S.37
Messina, P.38
Michielse, P.39
Mohr, B.40
Mueller, M.41
Nagel, W.42
Nakashima, H.43
Papka, M.E.44
Reed, D.45
Sato, M.46
Seidel, E.47
Shalf, J.48
Skinner, D.49
Snir, M.50
Sterling, T.51
Stevens, R.52
Streitz, F.53
Sugar, B.54
Sumimoto, S.55
Tang, W.56
Taylor, J.57
Thakur, R.58
Trefethen, A.59
Valero, M.60
Van Der Steen, A.61
Vetter, J.62
Williams, P.63
Wisniewski, R.64
Yelick, K.65
-
3
-
-
84877709144
-
-
US Department of Energy, Tech. Rep.
-
S. Amarasinghe, M. Hall, R. Lethin, K. Pingali, D. Quinlan, V. Sarkar, J. Shalf, R. Lucas, K. Yelick, P. Balaji, P. C. Diniz, A. Koniges, M. Snir, and S. R. Sachs, "Report of the 2011 workshop on exascale programming challenges," US Department of Energy, Tech. Rep., 2011.
-
(2011)
Report of the 2011 Workshop on Exascale Programming Challenges
-
-
Amarasinghe, S.1
Hall, M.2
Lethin, R.3
Pingali, K.4
Quinlan, D.5
Sarkar, V.6
Shalf, J.7
Lucas, R.8
Yelick, K.9
Balaji, P.10
Diniz, P.C.11
Koniges, A.12
Snir, M.13
Sachs, S.R.14
-
5
-
-
84877609547
-
Brook for GPUs: Stream computing on graphics hardware
-
New York, NY, USA: ACM
-
I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan, "Brook for GPUs: stream computing on graphics hardware," in SIGGRAPH '04: ACM SIGGRAPH 2004 Papers. New York, NY, USA: ACM, 2004, pp. 777-786.
-
(2004)
SIGGRAPH '04: ACM SIGGRAPH 2004 Papers
, pp. 777-786
-
-
Buck, I.1
Foley, T.2
Horn, D.3
Sugerman, J.4
Fatahalian, K.5
Houston, M.6
Hanrahan, P.7
-
6
-
-
57349101237
-
Data and computation transformations for Brook streaming applications on multiprocessors
-
Washington, DC, USA: IEEE Computer Society
-
S.-W. Liao, Z. Du, G. Wu, and G.-Y. Lueh, "Data and computation transformations for Brook streaming applications on multiprocessors," in CGO '06: Proceedings of the International Symposium on Code Generation and Optimization. Washington, DC, USA: IEEE Computer Society, 2006, pp. 196-207.
-
(2006)
CGO '06: Proceedings of the International Symposium on Code Generation and Optimization
, pp. 196-207
-
-
Liao, S.W.1
Du, Z.2
Wu, G.3
Lueh, G.-Y.4
-
7
-
-
77951558943
-
A performance-oriented data parallel virtual machine for GPUs
-
New York, NY, USA: ACM
-
M. Peercy, M. Segal, and D. Gerstmann, "A performance-oriented data parallel virtual machine for GPUs," in SIGGRAPH '06: ACM SIGGRAPH 2006 Sketches. New York, NY, USA: ACM, 2006, p. 184.
-
(2006)
SIGGRAPH '06: ACM SIGGRAPH 2006 Sketches
, p. 184
-
-
Peercy, M.1
Segal, M.2
Gerstmann, D.3
-
8
-
-
84870766925
-
-
CUDA, available: (accessed April 02, 2012)
-
CUDA, "NVIDIA CUDA [Online]. Available: http://developer.nvidia.com/category/zone/cuda-zone," 2012, (accessed April 02, 2012).
-
(2012)
NVIDIA CUDA [Online]
-
-
-
9
-
-
84870744206
-
-
OpenCL, Available: (accessed April 02, 2012)
-
OpenCL, "OpenCL [Online]. Available: http://www.khronos.org/opencl/," 2012, (accessed April 02, 2012).
-
(2012)
OpenCL [Online]
-
-
-
10
-
-
84877712851
-
-
Available: (accessed April 02, 2012)
-
OpenMP, "OpenMP [Online]. Available: http://openmp.org/wp/," 2012, (accessed April 02, 2012).
-
(2012)
OpenMP [Online]
-
-
-
11
-
-
78649898391
-
hiCUDA: High-level GPGPU programming
-
T. D. Han and T. S. Abdelrahman, "hiCUDA: High-level GPGPU programming," IEEE Transactions on Parallel and Distributed Systems, vol. 22, no. 1, pp. 78-90, 2011.
-
(2011)
IEEE Transactions on Parallel and Distributed Systems
, vol.22
, Issue.1
, pp. 78-90
-
-
Han, T.D.1
Abdelrahman, T.S.2
-
13
-
-
77952268356
-
-
PGI-Accelerator, Available: (accessed April 02, 2012)
-
PGI-Accelerator, "The Portland Group, PGI Fortran and C Accelerator Programming Model [Online]. Available: http://www.pgroup.com/resources/accel.htm," 2009, (accessed April 02, 2012).
-
(2009)
PGI Fortran and C Accelerator Programming Model [Online]
-
-
-
15
-
-
77952264175
-
A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction
-
Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, ser. New York, NY, USA: ACM
-
A. Leung, N. Vasilache, B. Meister, M. Baskaran, D. Wohlford, C. Bastoul, and R. Lethin, "A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction," in Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, ser. GPGPU '10. New York, NY, USA: ACM, 2010, pp. 51-61.
-
(2010)
GPGPU '10
, pp. 51-61
-
-
Leung, A.1
Vasilache, N.2
Meister, B.3
Baskaran, M.4
Wohlford, D.5
Bastoul, C.6
Lethin, R.7
-
16
-
-
84867263494
-
-
Available: (accessed April 02, 2012)
-
OpenACC, "OpenACC: Directives for Accelerators [Online]. Available: http://www.openacc-standard.org," 2011, (accessed April 02, 2012).
-
(2011)
OpenACC: Directives for Accelerators [Online]
-
-
-
17
-
-
79959202540
-
OpenMP for Accelerators
-
J. C. Beyer, E. J. Stotzer, A. Hart, and B. R. de Supinski, "OpenMP for Accelerators." in IWOMP'11, 2011, pp. 108-121.
-
(2011)
IWOMP'11
, pp. 108-121
-
-
Beyer, J.C.1
Stotzer, E.J.2
Hart, A.3
De Supinski, B.R.4
-
18
-
-
84877712802
-
Experiences with High-Level Programming Directives for Porting Applications to GPUs
-
Springer Berlin Heidelberg
-
O. Hernandez, W. Ding, B. Chapman, C. Kartsaklis, R. Sankaran, and R. Graham, "Experiences with High-Level Programming Directives for Porting Applications to GPUs," in Facing the Multicore - Challenge II. Springer Berlin Heidelberg, 2012, pp. 96-107.
-
(2012)
Facing the Multicore - Challenge II
, pp. 96-107
-
-
Hernandez, O.1
Ding, W.2
Chapman, B.3
Kartsaklis, C.4
Sankaran, R.5
Graham, R.6
-
19
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S.-H. Lee, and K. Skadron, "Rodinia: A benchmark suite for heterogeneous computing," in Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), 2009.
-
Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), 2009
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.W.5
Lee, S.H.6
Skadron, K.7
-
20
-
-
84877716238
-
-
Available: (accessed April 02, 2012)
-
L. L. Pilla, "Hpcgpu Project [Online]. Available: http://hpcgpu.codeplex.com/," 2012, (accessed April 02, 2012).
-
(2012)
Hpcgpu Project [Online]
-
-
Pilla, L.L.1
-
21
-
-
70350583252
-
OpenMP to GPGPU: A compiler framework for automatic translation and optimization
-
New York, NY, USA: ACM, Feb.
-
S. Lee, S.-J. Min, and R. Eigenmann, "OpenMP to GPGPU: A compiler framework for automatic translation and optimization," in ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). New York, NY, USA: ACM, Feb. 2009, pp. 101-110.
-
(2009)
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)
, pp. 101-110
-
-
Lee, S.1
Min, S.-J.2
Eigenmann, R.3
-
22
-
-
77956200064
-
An effective GPU implementation of breadth-first search
-
Proceedings of the 47th Design Automation Conference, ser. New York, NY, USA: ACM
-
L. Luo, M. Wong, and W.-m. Hwu, "An effective GPU implementation of breadth-first search," in Proceedings of the 47th Design Automation Conference, ser. DAC '10. New York, NY, USA: ACM, 2010, pp. 52-55.
-
(2010)
DAC '10
, pp. 52-55
-
-
Luo, L.1
Wong, M.2
Hwu, W.-M.3
-
23
-
-
84877693197
-
-
CUDA-reduction, available: (accessed April 02, 2012)
-
CUDA-reduction, "NVIDIA CUDA SDK - CUDA Parallel Reduction [Online]. Available: http://developer.nvidia.com/cuda-cc-sdk-code-samples#reduction," 2012, (accessed April 02, 2012).
-
(2012)
NVIDIA CUDA SDK - CUDA Parallel Reduction [Online]
-
-
-
24
-
-
80054871942
-
Performance implications of nonuniform device topologies in scalable heterogeneous architectures
-
-
J. S. Meredith, P. C. Roth, K. L. Spafford, and J. S. Vetter, "Performance implications of nonuniform device topologies in scalable heterogeneous architectures," IEEE Micro, vol. 31, no. 5, pp. 66-75, 2011. [Online]. Available: http://dx.doi.org/10.1109/MM.2011.79
-
(2011)
IEEE Micro
, vol.31
, Issue.5
, pp. 66-75
-
-
Meredith, J.S.1
Roth, P.C.2
Spafford, K.L.3
Vetter, J.S.4
-
25
-
-
84862695013
-
The tradeoffs of fused memory hierarchies in heterogeneous architectures
-
Cagliari, Italy: ACM
-
K. Spafford, J. S. Meredith, S. Lee, D. Li, P. C. Roth, and J. S. Vetter, "The tradeoffs of fused memory hierarchies in heterogeneous architectures," in ACM Computing Frontiers (CF). Cagliari, Italy: ACM, 2012.
-
(2012)
ACM Computing Frontiers (CF)
-
-
Spafford, K.1
Meredith, J.S.2
Lee, S.3
Li, D.4
Roth, P.C.5
Vetter, J.S.6