-
1
-
-
78651550268
-
Scalable parallel programming with CUDA
-
March
-
J. Nickolls, I. Buck, M. Garland, and K. Skadron, "Scalable parallel programming with CUDA," Queue, vol. 6, pp. 40-53, March 2008.
-
(2008)
Queue
, vol.6
, pp. 40-53
-
-
Nickolls, J.1
Buck, I.2
Garland, M.3
Skadron, K.4
-
2
-
-
70349100958
-
-
Rev. 1.2, [Online]. Available
-
OpenCL Specification, Khronos OpenCL Working Group Std., Rev. 1.2, 2011. [Online]. Available: http://www.khronos.org/opencl/
-
(2011)
OpenCL Specification
-
-
-
3
-
-
79959904195
-
Automatic CPU-GPU communication management and optimization
-
T. B. Jablin, P. Prabhu, J. A. Jablin, N. P. Johnson, S. R. Beard, and D. I. August, "Automatic CPU-GPU communication management and optimization," in Proc. the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'11), New York, NY, USA, 2011, pp. 142-151.
-
Proc. the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'11), New York, NY, USA, 2011
, pp. 142-151
-
-
Jablin, T.B.1
Prabhu, P.2
Jablin, J.A.3
Johnson, N.P.4
Beard, S.R.5
August, D.I.6
-
4
-
-
67650081010
-
OpenMP to GPGPU: A compiler framework for automatic translation and optimization
-
S. Lee, S.-J. Min, and R. Eigenmann, "OpenMP to GPGPU: A compiler framework for automatic translation and optimization," in Proc. the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'09), New York, NY, USA, 2009, pp. 101-110.
-
Proc. the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'09), New York, NY, USA, 2009
, pp. 101-110
-
-
Lee, S.1
Min, S.-J.2
Eigenmann, R.3
-
5
-
-
0037521913
-
StreamIt: A language for streaming applications
-
W. Thies, M. Karczmarek, and S. Amarasinghe, "StreamIt: A language for streaming applications," in Proc. International Conference on Compiler Construction (CC'02), Grenoble, France, Apr. 2002.
-
Proc. International Conference on Compiler Construction (CC'02), Grenoble, France, Apr. 2002
-
-
Thies, W.1
Karczmarek, M.2
Amarasinghe, S.3
-
6
-
-
47349118686
-
A practical approach to exploiting coarse-grained pipeline parallelism in C programs
-
W. Thies, V. Chandrasekhar, and S. Amarasinghe, "A practical approach to exploiting coarse-grained pipeline parallelism in C programs," in Proc. the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'07), Chicago, Illinois, USA, Dec. 2007, pp. 356-369.
-
Proc. the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'07), Chicago, Illinois, USA, Dec. 2007
, pp. 356-369
-
-
Thies, W.1
Chandrasekhar, V.2
Amarasinghe, S.3
-
7
-
-
79959906704
-
Kremlin: Rethinking and rebooting gprof for the multicore age
-
S. Garcia, D. Jeon, C. M. Louie, and M. B. Taylor, "Kremlin: Rethinking and rebooting gprof for the multicore age," in Proc. the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '11), New York, NY, USA, 2011, pp. 458-469.
-
Proc. the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '11), New York, NY, USA, 2011
, pp. 458-469
-
-
Garcia, S.1
Jeon, D.2
Louie, C.M.3
Taylor, M.B.4
-
8
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S.-H. Lee, and K. Skadron, "Rodinia: A benchmark suite for heterogeneous computing," in Proc. IEEE International Symposium on Workload Characterization (IISWC '09), Washington, DC, USA, 2009, pp. 44-54.
-
Proc. IEEE International Symposium on Workload Characterization (IISWC '09), Washington, DC, USA, 2009
, pp. 44-54
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.W.5
Lee, S.-H.6
Skadron, K.7
-
9
-
-
83155190228
-
Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer
-
T. Shimokawabe, T. Aoki, T. Takaki, T. Endo, A. Yamanaka, N. Maruyama, A. Nukada, and S. Matsuoka, "Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer," in Proc. the 2011 ACM/IEEE Conference on Supercomputing (SC'11), New York, NY, USA, 2011, pp. 3:1-3:11.
-
Proc. the 2011 ACM/IEEE Conference on Supercomputing (SC'11), New York, NY, USA, 2011
-
-
Shimokawabe, T.1
Aoki, T.2
Takaki, T.3
Endo, T.4
Yamanaka, A.5
Maruyama, N.6
Nukada, A.7
Matsuoka, S.8
-
10
-
-
80054863945
-
Medical ultrasound imaging: To GPU or not to GPU?
-
H. K.-H. So, J. Chen, B. Y. Yiu, and A. C. Yu, "Medical ultrasound imaging: To GPU or not to GPU?" IEEE Micro, vol. 31, pp. 54-65, 2011.
-
(2011)
IEEE Micro
, vol.31
, pp. 54-65
-
-
So, H.K.-H.1
Chen, J.2
Yiu, B.Y.3
Yu, A.C.4
-
11
-
-
84866867636
-
-
[Online]. Available
-
CUDA Toolkit 4.0. NVIDIA Corporation. [Online]. Available: http://developer.nvidia.com/cuda-toolkit-40
-
CUDA Toolkit 4.0
-
-
-
12
-
-
83155190224
-
Physis: An implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers
-
N. Maruyama, T. Nomura, K. Sato, and S. Matsuoka, "Physis: An implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers," in Proc. the 2011 ACM/IEEE Conference on Supercomputing (SC'11), New York, NY, USA, 2011, pp. 1-12.
-
Proc. the 2011 ACM/IEEE Conference on Supercomputing (SC'11), New York, NY, USA, 2011
, pp. 1-12
-
-
Maruyama, N.1
Nomura, T.2
Sato, K.3
Matsuoka, S.4