ABTS, D., BATAINEH, A., SCOTT, S., FAANES, G., SCHWARZMEIER, J., LUNDBERG, E., JOHNSON, T., BYE, M., AND SCHWOERER, G. 2007. The Cray BlackWidow: A Highly Scalable Vector Multiprocessor, SC'07.
AGARWAL, R. C., AND GUSTAVSON, F. G. 1989. Vector and parallel algorithms for Cholesky factorization on IBM 3090, Supercomputing '89, 225-233.
ALVERSON, R., CALLAHAN, D., CUMMINGS, D., KOBLENZ, B., PORTERFIELD, A., AND SMITH, B. 1990. The Tera Computer System, ICS'90, 1-6.
AMD. 2006. ATI CTM Guide, version 1.01.
ANDERSON, E., BAI, Z., DONGARRA, J., GREENBAUM, A., MCKENNEY, A., DU CROZ, J., HAMMARLING, S., DEMMEL, J., BISCHOF, C., AND SORENSEN, D. 1990. LAPACK: a portable linear algebra library for high-performance computers, Supercomputing '90, 2-11.
BABOULIN, M., DONGARRA, J., AND TOMOV, S. 2008. Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures, Technical Report UT-CS-08-200, University of Tennessee, May 6, 2008 (also LAPACK Working Note 200).
BARRACHINA, S., CASTILLO, M., IGUAL, F. D., MAYO, R., AND QUINTANA-ORTI, E. S. 2008. Solving Dense Linear Systems on Graphics Processors, Technical Report ICC 02-02-2008, Universidad Jaime I, February 2008.
BASKARAN, M., BONDHUGULA, U., KRISHNAMOORTHY, S., RAMANUJAM, J., ROUNTEV, A., AND SADAYAPPAN, P. 2008. A Compiler Framework for Optimization of Affine Loop Nests for GPGPUs, ISC'08.
CASTILLO, M., CHAN, E., IGUAL, F. D., MAYO, R., QUINTANA-ORTI, E. S., QUINTANA-ORTI, G., VAN DE GEIJN, R., AND VAN ZEE, F. G. 2008. Making Programming Synonymous with Programming for Linear Algebra Libraries, FLAME Working Note #31. The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-08-20, April 17, 2008.
CHOI, J., DONGARRA, J. J., OSTROUCHOV, L. S., PETITET, A. P., WALKER, D. W., AND WHALEY, R. C. 1996. The Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines, Scientific Programming 5, 3, 173-184 (also LAPACK Working Note 80).
DONGARRA, J., DUFF, I. S., SORENSEN, D. C., AND VAN DER VORST, H. A. 1998. Numerical Linear Algebra for High-Performance Computers, SIAM.
DONGARRA, J. J., DU CROZ, J., HAMMARLING, S., AND DUFF, I. 1990. A Set of Level 3 Basic Linear Algebra Subprograms, ACM Transactions on Mathematical Software 16, 1, 1-17.
DONGARRA, J., AND OSTROUCHOV, S. 1990. LAPACK Block Factorization Algorithms on the Intel iPSC/860, Technical Report CS-90-115, University of Tennessee (also LAPACK Working Note 24).
GALOPPO, N., GOVINDARAJU, N. K., HENSON, M., AND MANOCHA, D. 2005. LU-GPU: Efficient Algorithms for Solving Dense Linear Systems on Graphics Hardware, SC'05.
GOVINDARAJU, N. K., LARSEN, S., GRAY, J., AND MANOCHA, D. 2006. A Memory Model for Scientific Algorithms on Graphics Processors, SC'06.
FATAHALIAN, K., SUGERMAN, J., AND HANRAHAN, P. 2004. Understanding the efficiency of GPU algorithms for matrix-matrix multiplication, In Graphics Hardware 2004, 133-137.
HE, B., GOVINDARAJU, N. K., LUO, Q., AND SMITH, B. 2007. Efficient Gather and Scatter Operations on Graphics Processors, SC'07.
HWU, W. W., AND KIRK, D. 2007. ECE 498 AL1: Programming Massively Parallel Processors, Lecture Slides, University of Illinois, Urbana-Champaign.
NVIDIA. 2006. NVIDIA GeForce 8800 GPU Architecture Overview, Technical Brief, November 2006.
QUINTANA-ORTI, G., IGUAL, F. D., QUINTANA-ORTI, E. S., AND VAN DE GEIJN, R. 2008. Solving Dense Linear Systems on Platforms with Multiple Hardware Accelerators, FLAME Working Note #32. The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-08-22. May 9, 2008.
RYOO, S., RODRIGUES, C. I., BAGHSORKHI, S. S., STONE, S. S., KIRK, D. B., AND HWU, W. W. 2008. Optimization Principles and Application Performance Evaluation of a Multithreaded GPU using CUDA, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, ACM Press, 2008, 73-82.