-
2
-
-
57349180412
-
A compiler framework for optimization of affine loop nests for GPGPUs
-
Muthu Manikandan Baskaran, Uday Bondhugula, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, and P. Sadayappan. A compiler framework for optimization of affine loop nests for GPGPUs. In ICS '08: Proceedings of the 22nd annual international conference on Supercomputing, pages 225-234, 2008.
-
(2008)
ICS '08: Proceedings of the 22nd Annual International Conference on Supercomputing
, pp. 225-234
-
-
Baskaran, M.M.
Bondhugula, U.
Krishnamoorthy, S.
Ramanujam, J.
Rountev, A.
Sadayappan, P.
-
3
-
-
78649547021
-
A block red-black SOR method for a two-dimensional parabolic equation using Hermite collocation
-
Stephen H. Brill and George F. Pinder. A block red-black SOR method for a two-dimensional parabolic equation using Hermite collocation. The Mathematics of Finite Elements and Applications, 1997.
-
(1997)
The Mathematics of Finite Elements and Applications
-
-
Brill, S.H.
Pinder, G.F.
-
4
-
-
34548747985
-
Coarse-grain parallel execution for 2-dimensional PDE problems
-
Georgios Goumas, Nikolaos Drosinos, Vasileios Karakasis, and Nectarios Koziris. Coarse-grain parallel execution for 2-dimensional PDE problems. In International Parallel and Distributed Processing Symposium, page 381, 2007.
-
(2007)
International Parallel and Distributed Processing Symposium
, pp. 381
-
-
Goumas, G.
Drosinos, N.
Karakasis, V.
Koziris, N.
-
6
-
-
70450231944
-
An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness
-
Sunpyo Hong and Hyesoon Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. SIGARCH Comput. Archit. News, 37(3):152-163, 2009.
-
(2009)
SIGARCH Comput. Archit. News
, vol.37
, Issue.3
, pp. 152-163
-
-
Hong, S.
Kim, H.
-
9
-
-
0036396915
-
The Imagine stream processor
-
Ujval Kapasi, William J. Dally, Scott Rixner, John D. Owens, and Brucek Khailany. The Imagine stream processor. In ICCD '02: Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors, page 282, 2002.
-
(2002)
ICCD '02: Proceedings of the 2002 IEEE International Conference on Computer Design: VLSI in Computers and Processors
, pp. 282
-
-
Kapasi, U.
Dally, W.J.
Rixner, S.
Owens, J.D.
Khailany, B.
-
12
-
-
0024030170
-
Multicolor reordering of sparse matrices resulting from irregular grids
-
Rami G. Melhem and K. V. S. Ramarao. Multicolor reordering of sparse matrices resulting from irregular grids. ACM Trans. Math. Softw., 14(2):117-138, 1988.
-
(1988)
ACM Trans. Math. Softw.
, vol.14
, Issue.2
, pp. 117-138
-
-
Melhem, R.G.
Ramarao, K.V.S.
-
15
-
-
79959466764
-
Optimization principles and application performance evaluation of a multithreaded GPU using CUDA
-
Shane Ryoo, Christopher I. Rodrigues, Sara S. Baghsorkhi, Sam S. Stone, David B. Kirk, and Wen-mei W. Hwu. Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, pages 73-82, 2008.
-
(2008)
PPoPP '08: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 73-82
-
-
Ryoo, S.
Rodrigues, C.I.
Baghsorkhi, S.S.
Stone, S.S.
Kirk, D.B.
Hwu, W.-M.W.
-
16
-
-
43449094719
-
Program optimization space pruning for a multithreaded GPU
-
Shane Ryoo, Christopher I. Rodrigues, Sam S. Stone, Sara S. Baghsorkhi, Sain-Zee Ueng, John A. Stratton, and Wen-mei W. Hwu. Program optimization space pruning for a multithreaded GPU. In CGO '08: Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization, pages 195-204, 2008.
-
(2008)
CGO '08: Proceedings of the Sixth Annual IEEE/ACM International Symposium on Code Generation and Optimization
, pp. 195-204
-
-
Ryoo, S.
Rodrigues, C.I.
Stone, S.S.
Baghsorkhi, S.S.
Ueng, S.-Z.
Stratton, J.A.
Hwu, W.-M.W.
-
18
-
-
1542710739
-
Sparse tiling for stationary iterative methods
-
Michelle Mills Strout, Larry Carter, Jeanne Ferrante, and Barbara Kreaseck. Sparse tiling for stationary iterative methods. Int. J. High Perform. Comput. Appl., 18(1):95-113, 2004.
-
(2004)
Int. J. High Perform. Comput. Appl.
, vol.18
, Issue.1
, pp. 95-113
-
-
Strout, M.M.
Carter, L.
Ferrante, J.
Kreaseck, B.
-
19
-
-
0000778059
-
Generating efficient tiled code for distributed memory machines
-
P. Tang and J. Xue. Generating efficient tiled code for distributed memory machines. Parallel Computing, 26(11):1369-1410, 2000.
-
(2000)
Parallel Computing
, vol.26
, Issue.11
, pp. 1369-1410
-
-
Tang, P.
Xue, J.
-
20
-
-
33750456975
-
New stable group explicit finite difference method for solution of diffusion equation
-
Rohallah Tavakoli and Parviz Davami. New stable group explicit finite difference method for solution of diffusion equation. Applied Mathematics and Computation, 181(2):1379-1386, 2006.
-
(2006)
Applied Mathematics and Computation
, vol.181
, Issue.2
, pp. 1379-1386
-
-
Tavakoli, R.
Davami, P.
-
21
-
-
34547433110
-
Multigrid and Gauss-Seidel smoothers revisited: Parallelization on chip multiprocessors
-
Dan Wallin, Henrik Löf, Erik Hagersten, and Sverker Holmgren. Multigrid and Gauss-Seidel smoothers revisited: parallelization on chip multiprocessors. In ICS '06: Proceedings of the 20th annual international conference on Supercomputing, pages 145-155, 2006.
-
(2006)
ICS '06: Proceedings of the 20th Annual International Conference on Supercomputing
, pp. 145-155
-
-
Wallin, D.
Löf, H.
Hagersten, E.
Holmgren, S.
-
22
-
-
33748798219
-
A new block parallel SOR method and its analysis
-
Dexuan Xie. A new block parallel SOR method and its analysis. SIAM J. Sci. Comput., 27(5):1513-1533, 2006.
-
(2006)
SIAM J. Sci. Comput.
, vol.27
, Issue.5
, pp. 1513-1533
-
-
Xie, D.
-
23
-
-
0000703719
-
On tiling as a loop transformation
-
Jingling Xue. On tiling as a loop transformation. Parallel Processing Letters, 7(4):409-424, 1997.
-
(1997)
Parallel Processing Letters
, vol.7
, Issue.4
, pp. 409-424
-
-
Xue, J.
-
25
-
-
70350678845
-
JCUDA: A programmer-friendly interface for accelerating Java programs with CUDA
-
Yonghong Yan, Max Grossman, and Vivek Sarkar. JCUDA: A programmer-friendly interface for accelerating Java programs with CUDA. In Euro-Par, pages 887-899, 2009.
-
(2009)
Euro-Par
, pp. 887-899
-
-
Yan, Y.
Grossman, M.
Sarkar, V.
-
26
-
-
14944383149
-
A fast sweeping method for Eikonal equations
-
Hongkai Zhao. A fast sweeping method for Eikonal equations. Math. Comp., 74:603-627, 2005.
-
(2005)
Math. Comp.
, vol.74
, pp. 603-627
-
-
Zhao, H.