Volume 39, Issue 3, 2013, Pages

High-performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures

Author keywords

Bidiagonal reduction; Bulge chasing; Data translation layer; Dynamic scheduling; High performance kernels; Tile algorithms; Two stage approach

Indexed keywords

BANDWIDTH; CACHE MEMORY; COMPUTATIONAL EFFICIENCY; MEMORY ARCHITECTURE; OPEN SOURCE SOFTWARE; OPEN SYSTEMS; SINGULAR VALUE DECOMPOSITION; SOFTWARE ARCHITECTURE;

EID: 84877905452     PISSN: 00983500     EISSN: 15577295     Source Type: Journal    
DOI: 10.1145/2450153.2450154     Document Type: Article
Times cited : (20)

References (42)
  • 4
    • 0039762126
    • Evaluating block algorithm variants in LAPACK
    • J. Dongarra et al. Eds., SIAM, Philadelphia, PA
    • ANDERSON, E. AND DONGARRA, J. J. 1990. Evaluating block algorithm variants in LAPACK. In Parallel Processing for Scientific Computing, J. Dongarra et al. Eds., SIAM, Philadelphia, PA., 3-8.
    • (1990) Parallel Processing for Scientific Computing , pp. 3-8
    • Anderson, E.1    Dongarra, J.J.2
  • 5
    • 12444316073
    • A new stable bidiagonal reduction algorithm
    • BARLOW, J. L., BOSNER, N., AND DRMAČ, Z. 2005. A new stable bidiagonal reduction algorithm. Linear Algebra Appl. 397, 1, 35-84.
    • (2005) Linear Algebra Appl. , vol.397 , Issue.1 , pp. 35-84
    • Barlow, J.L.1    Bosner, N.2    Drmač, Z.3
  • 6
    • 77955109739
    • Reduction to condensed forms for symmetric eigenvalue problems on multi-core architectures
    • BIENTINESI, P., IGUAL, F., KRESSNER, D., AND QUINTANA-ORTÍ, E. 2010. Reduction to condensed forms for symmetric eigenvalue problems on multi-core architectures. Parallel Process. Appl. Math. 6067, 387-395.
    • (2010) Parallel Process. Appl. Math. , vol.6067 , pp. 387-395
    • Bientinesi, P.1    Igual, F.2    Kressner, D.3    Quintana-Ortí, E.4
  • 7
    • 0012881041
    • Algorithm 807: The SBR toolbox - Software for successive band reduction
    • BISCHOF, C. H., LANG, B., AND SUN, X. 2000. Algorithm 807: The SBR Toolbox - Software for successive band reduction. ACM Trans. Math. Softw. 26, 4, 602-616.
    • (2000) ACM Trans. Math. Softw. , vol.26 , Issue.4 , pp. 602-616
    • Bischof, C.H.1    Lang, B.2    Sun, X.3
  • 10
    • 48249107440
    • Block and parallel versions of one-sided bidiagonalization
    • BOSNER, N. AND BARLOW, J. L. 2007. Block and parallel versions of one-sided bidiagonalization. SIAM J. Matrix Anal. Appl. 29, 3, 927-953.
    • (2007) SIAM J. Matrix Anal. Appl. , vol.29 , Issue.3 , pp. 927-953
    • Bosner, N.1    Barlow, J.L.2
  • 12
    • 50249105132
    • Parallel tiled QR factorization for multicore architectures
    • http://dx.doi.org/10.1002/cpe.1301.
    • BUTTARI, A., LANGOU, J., KURZAK, J., AND DONGARRA, J. J. 2008. Parallel tiled QR factorization for multicore architectures. Concurrency Comput. Pract. Exper. 20, 13, 1573-1590. http://dx.doi.org/10.1002/cpe.1301.
    • (2008) Concurrency Comput. Pract. Exper. , vol.20 , Issue.13 , pp. 1573-1590
    • Buttari, A.1    Langou, J.2    Kurzak, J.3    Dongarra, J.J.4
  • 13
    • 58149269099
    • A class of parallel tiled linear algebra algorithms for multicore architectures
    • http://dx.doi.org/10.1016/j.parco.2008.10.002.
    • BUTTARI, A., LANGOU, J., KURZAK, J., AND DONGARRA, J. J. 2009. A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Comput. Syst. Appl. 35, 38-53. http://dx.doi.org/10.1016/j.parco.2008.10.002.
    • (2009) Parallel Comput. Syst. Appl. , vol.35 , pp. 38-53
    • Buttari, A.1    Langou, J.2    Kurzak, J.3    Dongarra, J.J.4
  • 14
    • 0030244536
    • The design and implementation of the ScaLAPACK LU, QR, and Cholesky factorization routines
    • CHOI, J., DONGARRA, J. J., OSTROUCHOV, S., PETITET, A., WALKER, D. W., AND WHALEY, R. C. 1996. The design and implementation of the ScaLAPACK LU, QR, and Cholesky factorization routines. Sci. Program. 5, 173-184.
    • (1996) Sci. Program. , vol.5 , pp. 173-184
    • Choi, J.1    Dongarra, J.J.2    Ostrouchov, S.3    Petitet, A.4    Walker, D.W.5    Whaley, R.C.6
  • 17
    • 0026238244
    • The bidiagonal singular value decomposition and Hamiltonian mechanics
    • (LAPACK Working Note #11)
    • DEIFT, P., DEMMEL, J. W., LI, L.-C., AND TOMEI, C. 1991. The bidiagonal singular value decomposition and Hamiltonian mechanics. SIAM J. Numer. Anal. 28, 5, 1463-1516. (LAPACK Working Note #11).
    • (1991) SIAM J. Numer. Anal. , vol.28 , Issue.5 , pp. 1463-1516
    • Deift, P.1    Demmel, J.W.2    Li, L.-C.3    Tomei, C.4
  • 18
    • 0001192187
    • Accurate singular values of bidiagonal matrices
    • (Also LAPACK Working Note #3)
    • DEMMEL, J. W. AND KAHAN, W. 1990. Accurate singular values of bidiagonal matrices. SIAM J. Sci. Stat. Comput. 11, 5, 873-912. (Also LAPACK Working Note #3).
    • (1990) SIAM J. Sci. Stat. Comput. , vol.11 , Issue.5 , pp. 873-912
    • Demmel, J.W.1    Kahan, W.2
  • 20
    • 21344496407
    • Accurate singular values and differential qd algorithms
    • FERNANDO, V. AND PARLETT, B. 1994. Accurate singular values and differential qd algorithms. Numer. Math. 67, 191-229.
    • (1994) Numer. Math. , vol.67 , pp. 191-229
    • Fernando, V.1    Parlett, B.2
  • 21
    • 33747738463
    • Singular value decomposition and least squares solutions
    • GOLUB, G. H. AND REINSCH, C. 1970. Singular value decomposition and least squares solutions. Numer. Math. 14, 403-420.
    • (1970) Numer. Math. , vol.14 , pp. 403-420
    • Golub, G.H.1    Reinsch, C.2
  • 22
    • 0004236492
    • 3rd Ed. Johns Hopkins University Press, Baltimore, MD
    • GOLUB, G. H. AND VAN LOAN, C. F. 1996. Matrix Computations, 3rd Ed. Johns Hopkins University Press, Baltimore, MD.
    • (1996) Matrix Computations
    • Golub, G.H.1    Van Loan, C.F.2
  • 23
    • 0343090855
    • Efficient parallel reduction to bidiagonal form
    • GROSSER, B. AND LANG, B. 1999. Efficient parallel reduction to bidiagonal form. Parallel Comput. 25, 8, 969-986.
    • (1999) Parallel Comput. , vol.25 , Issue.8 , pp. 969-986
    • Grosser, B.1    Lang, B.2
  • 24
    • 1542533583
    • A divide-and-conquer algorithm for the bidiagonal SVD
    • GU, M. AND EISENSTAT, S. 1995. A divide-and-conquer algorithm for the bidiagonal SVD. SIAM J. Matrix Anal. Appl. 16, 79-92.
    • (1995) SIAM J. Matrix Anal. Appl. , vol.16 , pp. 79-92
    • Gu, M.1    Eisenstat, S.2
  • 26
    • 84868568003
    • Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures. Concurrency and computations: Practice and experience
    • University of Tennessee
    • HAIDAR, A., LTAIEF, H., YARKHAN, A., AND DONGARRA, J. 2011. Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures. concurrency and computations: Practice and experience. Tech. rep. UT-CS-11-666, University of Tennessee.
    • (2011) Tech. Rep. UT-CS-11-666
    • Haidar, A.1    Ltaief, H.2    Yarkhan, A.3    Dongarra, J.4
  • 28
    • 58149421595
    • Analysis of a complex of statistical variables into principal components
    • 498-520
    • HOTELLING, H. 1933. Analysis of a complex of statistical variables into principal components. J. Edu. Psych. 24, 417-441, 498-520.
    • (1933) J. Edu. Psych. , vol.24 , pp. 417-441
    • Hotelling, H.1
  • 29
    • 0002467254
    • Simplified calculation of principal components
    • HOTELLING, H. 1935. Simplified calculation of principal components. Psychometrika 1, 27-35.
    • (1935) Psychometrika , vol.1 , pp. 27-35
    • Hotelling, H.1
  • 30
    • 0000652188
    • Unitary triangularization of a nonsymmetric matrix
    • DOI 10.1145/320941.320947
    • HOUSEHOLDER, A. S. 1958. Unitary triangularization of a nonsymmetric matrix. J. ACM 5, 4. DOI 10.1145/320941.320947.
    • (1958) J. ACM , vol.5 , pp. 4
    • Householder, A.S.1
  • 31
    • 21344498628
    • A parallel algorithm for computing the singular value decomposition of a matrix
    • JESSUP, E. R. AND SORENSEN, D. 1994. A parallel algorithm for computing the singular value decomposition of a matrix. SIAM J. Matrix Anal. Appl. 15, 530-548.
    • (1994) SIAM J. Matrix Anal. Appl. , vol.15 , pp. 530-548
    • Jessup, E.R.1    Sorensen, D.2
  • 33
    • 77649275879
    • Parallel two-sided matrix reduction to band bidiagonal form on multicore architectures
    • LTAIEF, H., KURZAK, J., AND DONGARRA, J. 2010. Parallel two-sided matrix reduction to band bidiagonal form on multicore architectures. IEEE Trans. Parallel Distrib. Syst. 417-423.
    • (2010) IEEE Trans. Parallel Distrib. Syst , pp. 417-423
    • Ltaief, H.1    Kurzak, J.2    Dongarra, J.3
  • 34
    • 80053252490
    • Two-stage tridiagonal reduction for dense symmetric matrices using tile algorithms on multicore architectures
    • ACM, New York
    • LUSZCZEK, P., LTAIEF, H., AND DONGARRA, J. 2011. Two-stage tridiagonal reduction for dense symmetric matrices using tile algorithms on multicore architectures. In Proceedings of IPDPS 2011. ACM, New York.
    • (2011) Proceedings of IPDPS 2011
    • Luszczek, P.1    Ltaief, H.2    Dongarra, J.3
  • 35
    • 84870611591
    • MKL Version 10.2
    • MKL. 2011. Intel, Math Kernel Library (MKL). http://www.intel.com/software/products/mkl/. Version 10.2.
    • (2011) Intel, Math Kernel Library (MKL)
  • 36
    • 0019533482
    • Principal component analysis in linear systems: Controllability, observability, and model reduction
    • MOORE, B. C. 1981. Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE Trans. Autom. Control AC-26, 1.
    • (1981) IEEE Trans. Autom. Control AC-26 , pp. 1
    • Moore, B.C.1
  • 38
    • 84867961757
    • One-sided reduction to bidiagonal form
    • RUI RALHA
    • RUI RALHA. 2003. One-sided reduction to bidiagonal form. Linear Algebra Appl. 358, 219-238.
    • (2003) Linear Algebra Appl. , vol.358 , pp. 219-238
  • 40
    • 0347737736
    • The decompositional approach to matrix computation
    • STEWART, G. W. 2000. The decompositional approach to matrix computation. Comput. Sci. Eng. 2, 1, 50-59.
    • (2000) Comput. Sci. Eng. , vol.2 , Issue.1 , pp. 50-59
    • Stewart, G.W.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.