



Volume , Issue , 2007, Pages 116-125

SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures

Author keywords

Data affinity; Data flow parallelism; Dense linear algebra libraries; Dynamic scheduling; Out of order execution

Indexed keywords

ALGORITHMS; COMPUTER ARCHITECTURE; DATA PROCESSING; PROGRAM PROCESSORS; SCHEDULING;

EID: 35248843628     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1248377.1248397     Document Type: Conference Paper
Times cited : (106)

References (33)
  • 1
    • 0037834788
    • OpenMP issues arising in the development of parallel BLAS and LAPACK libraries
    • C. Addison, Y. Ren, and M. van Waveren. OpenMP issues arising in the development of parallel BLAS and LAPACK libraries. Scientific Programming, 11(2), 2003.
    • (2003) Scientific Programming , vol.11 , Issue.2
    • Addison, C.1    Ren, Y.2    van Waveren, M.3
  • 3
    • 18044400448
    • B. S. Andersen, J. Waśniewski, and F. G. Gustavson. A recursive formulation of Cholesky factorization of a matrix in packed storage. ACM Trans. Math. Soft., 27(2):214-244, 2001.
  • 7
    • 35248815316
    • P. Bientinesi, B. Gunter, and R. van de Geijn. Families of algorithms related to the inversion of a symmetric positive definite matrix. FLAME Working Note #19 TR-2006-20, The University of Texas at Austin, Department of Computer Sciences, 2006.
  • 8
    • 17644370328
    • Representing linear algebra algorithms in code: The FLAME application programming interfaces
    • March
    • P. Bientinesi, E. S. Quintana-Ortí, and R. A. van de Geijn. Representing linear algebra algorithms in code: The FLAME application programming interfaces. ACM Trans. Math. Soft., 31(1):27-59, March 2005.
    • (2005) ACM Trans. Math. Soft , vol.31 , Issue.1 , pp. 27-59
    • Bientinesi, P.1    Quintana-Ortí, E.S.2    van de Geijn, R.A.3
  • 9
    • 35248821570
    • P. Bientinesi and R. A. van de Geijn. Representing dense linear algebra algorithms: A farewell to indices. FLAME Working Note #17 TR-2006-10, The University of Texas at Austin, Department of Computer Sciences, 2006.
  • 15
    • 1842832833
    • Recursive blocked algorithms and hybrid data structures for dense matrix library software
    • E. Elmroth, F. Gustavson, I. Jonsson, and B. Kagstrom. Recursive blocked algorithms and hybrid data structures for dense matrix library software. SIAM Review, 46(1):3-45, 2004.
    • (2004) SIAM Review , vol.46 , Issue.1 , pp. 3-45
    • Elmroth, E.1    Gustavson, F.2    Jonsson, I.3    Kagstrom, B.4
  • 16
    • 35248894624
    • K. Goto. http://www.tacc.utexas.edu/resources/software.
    • Goto, K.1
  • 18
    • 35248890693
    • F. G. Gustavson, L. Karlsson, and B. Kagstrom. Three algorithms on distributed memory using packed storage. Computational Science - Para 2006. B. Kagstrom, E. Elmroth, eds., accepted for Lecture Notes in Computer Science. Springer-Verlag, 2007.
  • 19
    • 35248867212
    • BLAS based on block data structures
    • CTC92TR89, Cornell University, February
    • G. Henry. BLAS based on block data structures. Theory Center Technical Report CTC92TR89, Cornell University, February 1992.
    • (1992) Theory Center Technical Report
    • Henry, G.1
  • 21
    • 35248880835
    • IBM. IBM Engineering and Scientific Subroutine Library for AIX Version 3, Release 3. IBM Pub. No. SA22-7272-04, December 2001.
  • 22
    • 1642372163
    • Parallel and fully recursive multifrontal sparse Cholesky
    • D. Irony, G. Shklarski, and S. Toledo. Parallel and fully recursive multifrontal sparse Cholesky. Future Gener. Comput. Syst., 20(3):425-440, 2004.
    • (2004) Future Gener. Comput. Syst , vol.20 , Issue.3 , pp. 425-440
    • Irony, D.1    Shklarski, G.2    Toledo, S.3
  • 24
    • 0012525494
    • Programming parallel applications in Cilk
    • C. Leiserson and A. Plaat. Programming parallel applications in Cilk. SINEWS: SIAM News, 31, 1998.
    • (1998) SINEWS: SIAM News , vol.31
    • Leiserson, C.1    Plaat, A.2
  • 25
    • 35248901228
    • T. M. Low and R. van de Geijn. An API for manipulating matrices stored by blocks. FLAME Working Note #12 TR-2004-15, The University of Texas at Austin, Department of Computer Sciences, May 2004.
  • 26
    • 0030679296
    • H. Lu, A. L. Cox, S. Dwarkadas, R. Rajamony, and W. Zwaenepoel. Compiler and software distributed shared memory support for irregular applications. In PPOPP '97: Proceedings of the Sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 48-56, New York, NY, USA, 1997. ACM Press.
  • 28
    • 0035003299
    • A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization
    • June
    • P. Strazdins. A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization. International Journal of Parallel and Distributed Systems and Networks, 4(1):26-35, June 2001.
    • (2001) International Journal of Parallel and Distributed Systems and Networks , vol.4 , Issue.1 , pp. 26-35
    • Strazdins, P.1
  • 30
    • 33847129885
    • An efficient algorithm for exploiting multiple arithmetic units
    • R. Tomasulo. An efficient algorithm for exploiting multiple arithmetic units. IBM J. of Research and Development, 11(1), 1967.
    • (1967) IBM J. of Research and Development , vol.11 , Issue.1
    • Tomasulo, R.1
  • 31
    • 0037173976
    • A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels
    • V. Valsalam and A. Skjellum. A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels. Concurrency and Computation: Practice and Experience, 14(10):805-840, 2002.
    • (2002) Concurrency and Computation: Practice and Experience , vol.14 , Issue.10 , pp. 805-840
    • Valsalam, V.1    Skjellum, A.2
  • 32
    • 85029485046
    • R. von Hanxleden, K. Kennedy, C. H. Koelbel, R. Das, and J. H. Saltz. Compiler analysis for irregular problems in Fortran D. In 1992 Workshop on Languages and Compilers for Parallel Computing, number 757, pages 97-111, New Haven, Conn., 1992. Berlin: Springer-Verlag.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.