-
1
-
-
0037834788
-
OpenMP issues arising in the development of parallel BLAS and LAPACK libraries
-
C. Addison, Y. Ren, and M. van Waveren. OpenMP issues arising in the development of parallel BLAS and LAPACK libraries. Scientific Programming, 11(2), 2003.
-
(2003)
Scientific Programming
, vol.11
, Issue.2
-
-
Addison, C.1
Ren, Y.2
van Waveren, M.3
-
2
-
-
0024891893
-
Vector and parallel algorithms for Cholesky factorization on IBM 3090
-
New York, NY, USA, ACM Press
-
R. C. Agarwal and F. G. Gustavson. Vector and parallel algorithms for Cholesky factorization on IBM 3090. In Supercomputing '89: Proceedings of the 1989 ACM/IEEE Conference on Supercomputing, pages 225-233, New York, NY, USA, 1989. ACM Press.
-
(1989)
Supercomputing '89: Proceedings of the 1989 ACM/IEEE Conference on Supercomputing
, pp. 225-233
-
-
Agarwal, R.C.1
Gustavson, F.G.2
-
3
-
-
18044400448
-
-
B. S. Andersen, J. Waśniewski, and F. G. Gustavson. A recursive formulation of Cholesky factorization of a matrix in packed storage. ACM Trans. Math. Soft., 27(2):214-244, 2001.
-
-
-
-
4
-
-
0003706460
-
-
third ed, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA
-
E. Anderson, Z. Bai, C. Bischof, L. S. Blackford, J. Demmel, J. J. Dongarra, J. Du Croz, S. Hammarling, A. Greenbaum, A. McKenney, and D. Sorensen. LAPACK Users' guide (third ed.). Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1999.
-
(1999)
LAPACK Users' guide
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Blackford, L.S.4
Demmel, J.5
Dongarra, J.J.6
Du Croz, J.7
Hammarling, S.8
Greenbaum, A.9
McKenney, A.10
Sorensen, D.11
-
6
-
-
17644412337
-
The science of deriving dense linear algebra algorithms
-
March
-
P. Bientinesi, J. A. Gunnels, M. E. Myers, E. S. Quintana-Ortí, and R. A. van de Geijn. The science of deriving dense linear algebra algorithms. ACM Trans. Math. Soft., 31(1):1-26, March 2005.
-
(2005)
ACM Trans. Math. Soft
, vol.31
, Issue.1
, pp. 1-26
-
-
Bientinesi, P.1
Gunnels, J.A.2
Myers, M.E.3
Quintana-Ortí, E.S.4
van de Geijn, R.A.5
-
7
-
-
35248815316
-
-
P. Bientinesi, B. Gunter, and R. van de Geijn. Families of algorithms related to the inversion of a symmetric positive definite matrix. FLAME Working Note #19 TR-2006-20, The University of Texas at Austin, Department of Computer Sciences, 2006.
-
-
-
-
8
-
-
17644370328
-
Representing linear algebra algorithms in code: The FLAME application programming interfaces
-
March
-
P. Bientinesi, E. S. Quintana-Ortí, and R. A. van de Geijn. Representing linear algebra algorithms in code: The FLAME application programming interfaces. ACM Trans. Math. Soft., 31(1):27-59, March 2005.
-
(2005)
ACM Trans. Math. Soft
, vol.31
, Issue.1
, pp. 27-59
-
-
Bientinesi, P.1
Quintana-Ortí, E.S.2
van de Geijn, R.A.3
-
9
-
-
35248821570
-
-
P. Bientinesi and R. A. van de Geijn. Representing dense linear algebra algorithms: A farewell to indices. FLAME Working Note #17 TR-2006-10, The University of Texas at Austin, Department of Computer Sciences, 2006.
-
-
-
-
10
-
-
0036870763
-
Recursive array layouts and fast matrix multiplication
-
S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. Thottethodi. Recursive array layouts and fast matrix multiplication. IEEE Trans. on Parallel and Distributed Systems, 13(11):1105-1123, 2002.
-
(2002)
IEEE Trans. on Parallel and Distributed Systems
, vol.13
, Issue.11
, pp. 1105-1123
-
-
Chatterjee, S.1
Lebeck, A.R.2
Patnala, P.K.3
Thottethodi, M.4
-
11
-
-
0002924772
-
ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers
-
IEEE Comput. Soc. Press
-
J. Choi, J. J. Dongarra, R. Pozo, and D. W. Walker. ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers. In Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, pages 120-127. IEEE Comput. Soc. Press, 1992.
-
(1992)
Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation
, pp. 120-127
-
-
Choi, J.1
Dongarra, J.J.2
Pozo, R.3
Walker, D.W.4
-
14
-
-
0025402476
-
A set of level 3 basic linear algebra subprograms
-
March
-
J. J. Dongarra, J. Du Croz, S. Hammarling, and I. Duff. A set of level 3 basic linear algebra subprograms. ACM Trans. Math. Soft., 16(1):1-17, March 1990.
-
(1990)
ACM Trans. Math. Soft
, vol.16
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
15
-
-
1842832833
-
Recursive blocked algorithms and hybrid data structures for dense matrix library software
-
E. Elmroth, F. Gustavson, I. Jonsson, and B. Kagstrom. Recursive blocked algorithms and hybrid data structures for dense matrix library software. SIAM Review, 46(1):3-45, 2004.
-
(2004)
SIAM Review
, vol.46
, Issue.1
, pp. 3-45
-
-
Elmroth, E.1
Gustavson, F.2
Jonsson, I.3
Kagstrom, B.4
-
16
-
-
35248894624
-
-
K. Goto. http://www.tacc.utexas.edu/resources/software.
-
-
-
Goto, K.1
-
17
-
-
0039435412
-
FLAME: Formal linear algebra methods environment
-
December
-
J. A. Gunnels, F. G. Gustavson, G. M. Henry, and R. A. van de Geijn. FLAME: Formal linear algebra methods environment. ACM Trans. Math. Soft., 27(4):422-455, December 2001.
-
(2001)
ACM Trans. Math. Soft
, vol.27
, Issue.4
, pp. 422-455
-
-
Gunnels, J.A.1
Gustavson, F.G.2
Henry, G.M.3
van de Geijn, R.A.4
-
18
-
-
35248890693
-
-
F. G. Gustavson, L. Karlsson, and B. Kagstrom. Three algorithms on distributed memory using packed storage. Computational Science - Para 2006. B. Kagstrom, E. Elmroth, eds., accepted for Lecture Notes in Computer Science. Springer-Verlag, 2007.
-
-
-
-
19
-
-
35248867212
-
BLAS based on block data structures
-
CTC92TR89, Cornell University, February
-
G. Henry. BLAS based on block data structures. Theory Center Technical Report CTC92TR89, Cornell University, February 1992.
-
(1992)
Theory Center Technical Report
-
-
Henry, G.1
-
21
-
-
35248880835
-
-
IBM. IBM Engineering and Scientific Subroutine Library for AIX Version 3, Release 3. IBM Pub. No. SA22-7272-04, December 2001.
-
-
-
-
22
-
-
1642372163
-
Parallel and fully recursive multifrontal sparse Cholesky
-
D. Irony, G. Shklarski, and S. Toledo. Parallel and fully recursive multifrontal sparse Cholesky. Future Gener. Comput. Syst., 20(3):425-440, 2004.
-
(2004)
Future Gener. Comput. Syst
, vol.20
, Issue.3
, pp. 425-440
-
-
Irony, D.1
Shklarski, G.2
Toledo, S.3
-
24
-
-
0012525494
-
Programming parallel applications in Cilk
-
C. Leiserson and A. Plaat. Programming parallel applications in Cilk. SINEWS: SIAM News, 31, 1998.
-
(1998)
SINEWS: SIAM News
, vol.31
-
-
Leiserson, C.1
Plaat, A.2
-
25
-
-
35248901228
-
-
T. M. Low and R. van de Geijn. An API for manipulating matrices stored by blocks. FLAME Working Note #12 TR-2004-15, The University of Texas at Austin, Department of Computer Sciences, May 2004.
-
-
-
-
26
-
-
0030679296
-
-
H. Lu, A. L. Cox, S. Dwarkadas, R. Rajamony, and W. Zwaenepoel. Compiler and software distributed shared memory support for irregular applications. In PPOPP '97: Proceedings of the Sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 48-56, New York, NY, USA, 1997. ACM Press.
-
-
-
-
27
-
-
0042235298
-
Tiling, block data layout, and memory hierarchy performance
-
N. Park, B. Hong, and V. K. Prasanna. Tiling, block data layout, and memory hierarchy performance. IEEE Trans. on Parallel and Distributed Systems, 14(7):640-654, 2003.
-
(2003)
IEEE Trans. on Parallel and Distributed Systems
, vol.14
, Issue.7
, pp. 640-654
-
-
Park, N.1
Hong, B.2
Prasanna, V.K.3
-
28
-
-
0035003299
-
A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization
-
June
-
P. Strazdins. A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization. International Journal of Parallel and Distributed Systems and Networks, 4(1):26-35, June 2001.
-
(2001)
International Journal of Parallel and Distributed Systems and Networks
, vol.4
, Issue.1
, pp. 26-35
-
-
Strazdins, P.1
-
30
-
-
33847129885
-
An efficient algorithm for exploiting multiple arithmetic units
-
R. Tomasulo. An efficient algorithm for exploiting multiple arithmetic units. IBM J. of Research and Development, 11(1), 1967.
-
(1967)
IBM J. of Research and Development
, vol.11
, Issue.1
-
-
Tomasulo, R.1
-
31
-
-
0037173976
-
A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels
-
V. Valsalam and A. Skjellum. A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels. Concurrency and Computation: Practice and Experience, 14(10):805-840, 2002.
-
(2002)
Concurrency and Computation: Practice and Experience
, vol.14
, Issue.10
, pp. 805-840
-
-
Valsalam, V.1
Skjellum, A.2
-
32
-
-
85029485046
-
-
R. von Hanxleden, K. Kennedy, C. H. Koelbel, R. Das, and J. H. Saltz. Compiler analysis for irregular problems in Fortran D. In 1992 Workshop on Languages and Compilers for Parallel Computing, number 757, pages 97-111, New Haven, Conn., 1992. Berlin: Springer-Verlag.
-
-
-
-
33
-
-
35248846531
-
An experimental comparison of cache-oblivious and cache-aware programs
-
June
-
K. Yotov, T. Roeder, K. Pingali, J. Gunnels, and F. Gustavson. An experimental comparison of cache-oblivious and cache-aware programs. In SPAA '07: Symposium on Parallelism in Algorithms and Architectures, June 2007.
-
(2007)
SPAA '07: Symposium on Parallelism in Algorithms and Architectures
-
-
Yotov, K.1
Roeder, T.2
Pingali, K.3
Gunnels, J.4
Gustavson, F.5