-
1
-
-
0037834788
-
OpenMP issues arising in the development of parallel BLAS and LAPACK libraries
-
C. Addison, Y. Ren, and M. van Waveren. OpenMP issues arising in the development of parallel BLAS and LAPACK libraries. Scientific Programming, 11(2), 2003.
-
(2003)
Scientific Programming
, vol.11
, Issue.2
-
-
Addison, C.1
Ren, Y.2
van Waveren, M.3
-
2
-
-
0024891893
-
Vector and parallel algorithms for Cholesky factorization on IBM 3090
-
New York, NY, USA
-
R. C. Agarwal and F. G. Gustavson. Vector and parallel algorithms for Cholesky factorization on IBM 3090. In SC '89: Proceedings of the 1989 ACM/IEEE Conference on Supercomputing, pages 225-233, New York, NY, USA, 1989.
-
(1989)
SC '89: Proceedings of the 1989 ACM/IEEE Conference on Supercomputing
, pp. 225-233
-
-
Agarwal, R.C.1
Gustavson, F.G.2
-
3
-
-
0003706460
-
-
third ed, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA
-
E. Anderson, Z. Bai, C. Bischof, L. S. Blackford, J. Demmel, Jack J. Dongarra, J. Du Croz, S. Hammarling, A. Greenbaum, A. McKenney, and D. Sorensen. LAPACK Users' Guide (third ed.). Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1999.
-
(1999)
LAPACK Users' Guide
-
-
Anderson, E.1
Bai, Z.2
Bischof, C.3
Blackford, L.S.4
Demmel, J.5
Dongarra, J.J.6
Du Croz, J.7
Hammarling, S.8
Greenbaum, A.9
McKenney, A.10
Sorensen, D.11
-
4
-
-
34548265764
-
CellSs: A programming model for the Cell BE architecture
-
Tampa, FL, USA, November
-
Pieter Bellens, Josep M. Perez, Rosa M. Badia, and Jesus Labarta. CellSs: A programming model for the Cell BE architecture. In SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, pages 5-15, Tampa, FL, USA, November 2006.
-
(2006)
SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing
, pp. 5-15
-
-
Bellens, P.1
Perez, J.M.2
Badia, R.M.3
Labarta, J.4
-
5
-
-
53349177347
-
Families of algorithms related to the inversion of a symmetric positive definite matrix
-
Submitted
-
Paolo Bientinesi, Brian Gunter, and Robert van de Geijn. Families of algorithms related to the inversion of a symmetric positive definite matrix. ACM Transactions on Mathematical Software. Submitted.
-
ACM Transactions on Mathematical Software
-
-
Bientinesi, P.1
Gunter, B.2
van de Geijn, R.3
-
6
-
-
17644370328
-
-
Paolo Bientinesi, Enrique S. Quintana-Ortí, and Robert A. van de Geijn. Representing linear algebra algorithms in code: The FLAME application programming interfaces. ACM Transactions on Mathematical Software, 31(1):27-59, March 2005.
-
-
-
-
7
-
-
35248843628
-
-
Ernie Chan, Enrique S. Quintana-Ortí, Gregorio Quintana-Ortí, and Robert van de Geijn. SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures. In SPAA '07: Proceedings of the Nineteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures, pages 116-125, San Diego, CA, USA, June 2007.
-
-
-
-
8
-
-
0036870763
-
Recursive array layouts and fast matrix multiplication
-
S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. Thottethodi. Recursive array layouts and fast matrix multiplication. IEEE Transactions on Parallel and Distributed Systems, 13(11):1105-1123, 2002.
-
(2002)
IEEE Transactions on Parallel and Distributed Systems
, vol.13
, Issue.11
, pp. 1105-1123
-
-
Chatterjee, S.1
Lebeck, A.R.2
Patnala, P.K.3
Thottethodi, M.4
-
11
-
-
0025402476
-
-
A set of level 3 Basic Linear Algebra Subprograms, March
-
Jack J. Dongarra, Jeremy Du Croz, Sven Hammarling, and Iain Duff. A set of level 3 Basic Linear Algebra Subprograms. ACM Transactions on Mathematical Software, 16(1):1-17, March 1990.
-
(1990)
ACM Transactions on Mathematical Software
, vol.16
, Issue.1
, pp. 1-17
-
-
Dongarra, J.J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
12
-
-
53349099685
-
-
Kazushige Goto. http://www.tacc.utexas.edu/resources/software.
-
Kazushige Goto
-
-
-
15
-
-
33646015987
-
Synergistic processing in Cell's multicore architecture
-
Michael Gschwind, H. Peter Hofstee, Brian Flachs, Martin Hopkins, Yukio Watanabe, and Takeshi Yamazaki. Synergistic processing in Cell's multicore architecture. IEEE Micro, 26(2):10-24, 2006.
-
(2006)
IEEE Micro
, vol.26
, Issue.2
, pp. 10-24
-
-
Gschwind, M.1
Hofstee, H.P.2
Flachs, B.3
Hopkins, M.4
Watanabe, Y.5
Yamazaki, T.6
-
16
-
-
0039435412
-
-
John A. Gunnels, Fred G. Gustavson, Greg M. Henry, and Robert A. van de Geijn. FLAME: Formal linear algebra methods environment. ACM Transactions on Mathematical Software, 27(4):422-455, December 2001.
-
-
-
-
17
-
-
53349171913
-
Three algorithms on distributed memory using packed storage
-
B. Kågström, E. Elmroth, editors, Computational Science, PARA '06, Springer-Verlag, To appear
-
F. G. Gustavson, L. Karlsson, and B. Kågström. Three algorithms on distributed memory using packed storage. In B. Kågström and E. Elmroth, editors, Computational Science - PARA '06, Lecture Notes in Computer Science. Springer-Verlag, 2007. To appear.
-
(2007)
Lecture Notes in Computer Science
-
-
Gustavson, F.G.1
Karlsson, L.2
Kågström, B.3
-
18
-
-
53349120767
-
-
BLAS based on block data structures, CTC92TR89, Cornell University, February
-
Greg Henry. BLAS based on block data structures. Theory Center Technical Report CTC92TR89, Cornell University, February 1992.
-
(1992)
Theory Center Technical Report
-
-
Henry, G.1
-
21
-
-
0012525494
-
Programming parallel applications in Cilk
-
Charles Leiserson and Aske Plaat. Programming parallel applications in Cilk. SINEWS: SIAM News, 31, 1998.
-
(1998)
SINEWS: SIAM News
, vol.31
-
-
Leiserson, C.1
Plaat, A.2
-
22
-
-
53349128024
-
-
Tze Meng Low and Robert van de Geijn. An API for manipulating matrices stored by blocks. FLAME Working Note #12 TR-2004-15, The University of Texas at Austin, Department of Computer Sciences, May 2004.
-
-
-
-
23
-
-
0030679296
-
-
Honghui Lu, Alan L. Cox, Sandhya Dwarkadas, Ramakrishnan Rajamony, and Willy Zwaenepoel. Compiler and software distributed shared memory support for irregular applications. In PPOPP '97: Proceedings of the Sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 48-56, New York, NY, USA, 1997.
-
-
-
-
24
-
-
0042235298
-
Tiling, block data layout, and memory hierarchy performance
-
N. Park, B. Hong, and V. K. Prasanna. Tiling, block data layout, and memory hierarchy performance. IEEE Transactions on Parallel and Distributed Systems, 14(7):640-654, 2003.
-
(2003)
IEEE Transactions on Parallel and Distributed Systems
, vol.14
, Issue.7
, pp. 640-654
-
-
Park, N.1
Hong, B.2
Prasanna, V.K.3
-
25
-
-
0035003299
-
A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization
-
June
-
Peter Strazdins. A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization. International Journal of Parallel and Distributed Systems and Networks, 4(1):26-35, June 2001.
-
(2001)
International Journal of Parallel and Distributed Systems and Networks
, vol.4
, Issue.1
, pp. 26-35
-
-
Strazdins, P.1
-
26
-
-
0003081830
-
An efficient algorithm for exploiting multiple arithmetic units
-
R. Tomasulo. An efficient algorithm for exploiting multiple arithmetic units. IBM Journal of Research and Development, 11(1), 1967.
-
(1967)
IBM Journal of Research and Development
, vol.11
, Issue.1
-
-
Tomasulo, R.1
-
27
-
-
0037173976
-
A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels
-
Vinod Valsalam and Anthony Skjellum. A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels. Concurrency and Computation: Practice and Experience, 14(10):805-840, 2002.
-
(2002)
Concurrency and Computation: Practice and Experience
, vol.14
, Issue.10
, pp. 805-840
-
-
Valsalam, V.1
Skjellum, A.2
-
28
-
-
53349090142
-
-
Robert A. van de Geijn. Using PLAPACK: Parallel Linear Algebra Package. The MIT Press, 1997.
-
-
-
-
29
-
-
85029485046
-
-
Reinhard von Hanxleden, Ken Kennedy, Charles H. Koelbel, Raja Das, and Joel H. Saltz. Compiler analysis for irregular problems in Fortran D. In 1992 Workshop on Languages and Compilers for Parallel Computing, number 757, pages 97-111, New Haven, CT, USA, 1992.
-
-
-