Volume , Issue , 2013, Pages

An improved parallel singular value algorithm and its implementation for multicore hardware

Author keywords

Eigenvalues and eigenvectors; Performance; Reduction to bidiagonal; Singular Value Decomposition; Task parallelism

Indexed keywords

EIGENVALUES AND EIGENFUNCTIONS; HARDWARE; PROGRAM PROCESSORS; SINGULAR VALUE DECOMPOSITION;

EID: 84899676338     PISSN: 21674329     EISSN: 21674337     Source Type: Conference Proceeding    
DOI: 10.1145/2503210.2503292     Document Type: Conference Paper
Times cited : (24)

References (63)
  • 5
    • 0001045187
    • Numerical techniques in mathematical programming
    • New York. Academic Press
    • R. H. Bartels, G. H. Golub, and M. Saunders. Numerical techniques in mathematical programming. In Nonlinear Programming, pages 123-176, New York, 1971. Academic Press.
    • (1971) Nonlinear Programming , pp. 123-176
    • Bartels, R.H.1    Golub, G.H.2    Saunders, M.3
  • 6
    • 84949180378
    • editors, Methods High-Performance Scientific Computing. Springer, London Dordrecht Heidelberg New York, 2012. ISBN 978-1-4471-2436-8 e-ISBN 978-1-4471-2437-5 DOI 10.1007/978-1-4471-2437-5
    • M. Becka, G. Oksa, and M. Vajtersic. Parallel Block-Jacobi SVD. In M. W. Berry, K. A. Gallivan, E. Gallopoulos, A. Grama, B. Philippe, Y. Saad, and F. Saied, editors, Methods High-Performance Scientific Computing, pages 185-197. Springer, London Dordrecht Heidelberg New York, 2012. ISBN 978-1-4471-2436-8 e-ISBN 978-1-4471-2437-5 DOI 10.1007/978-1-4471-2437-5.
    • Parallel block-Jacobi SVD , pp. 185-197
    • Becka, M.1    Oksa, G.2    Vajtersic, M.3    Berry, M.W.4    Gallivan, K.A.5    Gallopoulos, E.6    Grama, A.7    Philippe, B.8    Saad, Y.9    Saied, F.10
  • 10
    • 0005571418
    • Ghosts in tomography: The effects of poor angular coverage in 2-D seismic traveltime inversion
    • N. Bregman, R. Bailey, and C. Chapman. Ghosts in tomography: The effects of poor angular coverage in 2-D seismic traveltime inversion. Can. J. Explor. Geophys, 25(1):7-27, 1989.
    • (1989) Can. J. Explor. Geophys , vol.25 , Issue.1 , pp. 7-27
    • Bregman, N.1    Bailey, R.2    Chapman, C.3
  • 11
    • 38049058008
    • The impact of multicore on math software
    • B. Kågström, E. Elmroth, J. Dongarra, and J. Wasniewski, editors, Applied Parallel Computing. State of the Art in Scientific Computing. Springer
    • A. Buttari, J. Dongarra, J. Kurzak, J. Langou, P. Luszczek, and S. Tomov. The impact of multicore on math software. In B. Kågström, E. Elmroth, J. Dongarra, and J. Wasniewski, editors, Applied Parallel Computing. State of the Art in Scientific Computing, 8th International Workshop, PARA, volume 4699 of Lecture Notes in Computer Science, pages 1-10. Springer, 2006.
    • (2006) 8th International Workshop, PARA, Volume 4699 of Lecture Notes in Computer Science , pp. 1-10
    • Buttari, A.1    Dongarra, J.2    Kurzak, J.3    Langou, J.4    Luszczek, P.5    Tomov, S.6
  • 12
    • 36048997493
    • Multithreading for synchronization tolerance in matrix factorization
    • Boston, MA, June 24-28 2007. Journal of Physics: Conference Series, IOP Publishing. DOI: 10.1088/1742-6596/78/1/012028
    • A. Buttari, J. J. Dongarra, P. Husbands, J. Kurzak, and K. Yelick. Multithreading for synchronization tolerance in matrix factorization. In Scientific Discovery through Advanced Computing, SciDAC 2007, Boston, MA, June 24-28 2007. Journal of Physics: Conference Series 78:012028, IOP Publishing. DOI: 10.1088/1742-6596/78/1/012028.
    • Scientific Discovery Through Advanced Computing, SciDAC 2007 , vol.78 , pp. 012028
    • Buttari, A.1    Dongarra, J.J.2    Husbands, P.3    Kurzak, J.4    Yelick, K.5
  • 13
    • 50249105132
    • Parallel tiled QR factorization for multicore architectures
    • DOI: 10.1002/cpe.1301
    • A. Buttari, J. Langou, J. Kurzak, and J. J. Dongarra. Parallel tiled QR factorization for multicore architectures. Concurrency Computat.: Pract. Exper., 20(13):1573-1590, 2008. DOI: 10.1002/cpe.1301.
    • (2008) Concurrency Computat.: Pract. Exper. , vol.20 , Issue.13 , pp. 1573-1590
    • Buttari, A.1    Langou, J.2    Kurzak, J.3    Dongarra, J.J.4
  • 14
    • 58149269099
    • Class of parallel tiled linear algebra algorithms for multicore architectures
    • DOI: 10.1016/j.parco.2008.10.002
    • A. Buttari, J. Langou, J. Kurzak, and J. J. Dongarra. A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Comput. Syst. Appl., 35:38-53, 2009. DOI: 10.1016/j.parco.2008.10.002.
    • (2009) Parallel Comput. Syst. Appl. , vol.35 , pp. 38-53
    • Buttari, A.1    Langou, J.2    Kurzak, J.3    Dongarra, J.J.4
  • 17
    • 0026238244
    • The bidiagonal singular value decomposition and Hamiltonian mechanics
    • October 1991. (LAPACK Working Note #11)
    • P. Deift, J. W. Demmel, L.-C. Li, and C. Tomei. The bidiagonal singular value decomposition and Hamiltonian mechanics. SIAM J. Numer. Anal., 28(5):1463-1516, October 1991. (LAPACK Working Note #11).
    • SIAM J. Numer. Anal. , vol.28 , Issue.5 , pp. 1463-1516
    • Deift, P.1    Demmel, J.W.2    Li, L.-C.3    Tomei, C.4
  • 18
    • 0001192187
    • Accurate singular values of bidiagonal matrices
    • September 1990. (Also LAPACK LAWN #3)
    • J. W. Demmel and W. Kahan. Accurate singular values of bidiagonal matrices. SIAM J. Sci. Stat. Comput., 11(5):873-912, September 1990. (Also LAPACK LAWN #3).
    • SIAM J. Sci. Stat. Comput. , vol.11 , Issue.5 , pp. 873-912
    • Demmel, J.W.1    Kahan, W.2
  • 21
    • 84899693051
    • Exploiting fine-grain parallelism in recursive LU factorization
    • ISBN 978-1-61499-040-6 (print); ISBN 978-1-61499-041-3
    • J. Dongarra, M. Faverge, H. Ltaief, and P. Luszczek. Exploiting fine-grain parallelism in recursive LU factorization. Advances in Parallel Computing, Special Issue, 22:429-436, 2012. ISBN 978-1-61499-040-6 (print); ISBN 978-1-61499-041-3 (online).
    • (2012) Advances in Parallel Computing, Special Issue , vol.22 , pp. 429-436
    • Dongarra, J.1    Faverge, M.2    Ltaief, H.3    Luszczek, P.4
  • 23
    • 0023488724
    • Seismic waveform modeling in heterogeneous media by ray perturbation theory
    • V. Farra and R. Madariaga. Seismic waveform modeling in heterogeneous media by ray perturbation theory. Journal of Geophysical Research: Solid Earth, 92(B3):2697-2712, 1987.
    • (1987) Journal of Geophysical Research: Solid Earth , vol.92 B3 , pp. 2697-2712
    • Farra, V.1    Madariaga, R.2
  • 24
    • 21344496407
    • Accurate singular values and differential qd algorithms
    • V. Fernando and B. Parlett. Accurate singular values and differential qd algorithms. Numerische Math., 67:191-229, 1994.
    • (1994) Numerische Math. , vol.67 , pp. 191-229
    • Fernando, V.1    Parlett, B.2
  • 26
    • 0000288016
    • Calculating the singular values and pseudoinverse of a matrix
    • G. H. Golub and W. Kahan. Calculating the singular values and pseudoinverse of a matrix. SIAM J. Numer. Anal., 2(3):205-224, 1965.
    • (1965) SIAM J. Numer. Anal. , vol.2 , Issue.3 , pp. 205-224
    • Golub, G.H.1    Kahan, W.2
  • 27
    • 0004236492
    • The Johns Hopkins University Press, 4th edition, December 27 2012. ISBN-10: 1421407949, ISBN-13: 978-1421407944
    • G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins University Press, 4th edition, December 27 2012. ISBN-10: 1421407949, ISBN-13: 978-1421407944.
    • Matrix Computations
    • Golub, G.H.1    Van Loan, C.F.2
  • 28
    • 0007051545
    • J. Wilkinson and C. Reinsch, editors, Handbook for Automatic Computation, II, Linear Algebra. Springer-Verlag, New York
    • G. H. Golub and C. Reinsch. Singular value decomposition and least squares solutions. In J. Wilkinson and C. Reinsch, editors, Handbook for Automatic Computation, II, Linear Algebra. Springer-Verlag, New York, 1971.
    • (1971) Singular Value Decomposition and Least Squares Solutions
    • Golub, G.H.1    Reinsch, C.2
  • 29
    • 0017011163
    • Ill-conditioned eigensystems and the computation of the Jordan canonical form
    • October
    • G. H. Golub and J. H. Wilkinson. Ill-conditioned eigensystems and the computation of the Jordan canonical form. SIAM Rev., 18(4), October 1976.
    • (1976) SIAM Rev. , vol.18 , Issue.4
    • Golub, G.H.1    Wilkinson, J.H.2
  • 30
    • 0542421948
    • The solution of large dense generalized eigenvalue problems on the Cray X-MP/24 with SSD
    • April
    • R. Grimes, H. Krakauer, J. Lewis, H. Simon, and S.-H. Wei. The solution of large dense generalized eigenvalue problems on the Cray X-MP/24 with SSD. J. Comput. Phys., 69:471-481, April 1987.
    • (1987) J. Comput. Phys. , vol.69 , pp. 471-481
    • Grimes, R.1    Krakauer, H.2    Lewis, J.3    Simon, H.4    Wei, S.-H.5
  • 31
    • 0024082507
    • Solution of large, dense symmetric generalized eigenvalue problems using secondary storage
    • September
    • R. G. Grimes and H. D. Simon. Solution of large, dense symmetric generalized eigenvalue problems using secondary storage. ACM Transactions on Mathematical Software, 14:241-256, September 1988.
    • (1988) ACM Transactions on Mathematical Software , vol.14 , pp. 241-256
    • Grimes, R.G.1    Simon, H.D.2
  • 32
    • 1542533583
    • A divide-and-conquer algorithm for the bidiagonal SVD
    • M. Gu and S. Eisenstat. A divide-and-conquer algorithm for the bidiagonal SVD. SIAM J. Matrix Anal. Appl., 16:79-92, 1995.
    • (1995) SIAM J. Matrix Anal. Appl. , vol.16 , pp. 79-92
    • Gu, M.1    Eisenstat, S.2
  • 33
    • 84862107202
    • Parallel and cache-efficient in-place matrix storage format conversion
    • article 17. DOI: 10.1145/2168773.2168775
    • F. G. Gustavson, L. Karlsson, and B. Kågström. Parallel and cache-efficient in-place matrix storage format conversion. ACM Trans. Math. Soft., 38(3):article 17, 2012. DOI: 10.1145/2168773.2168775.
    • (2012) ACM Trans. Math. Soft. , vol.38 , Issue.3
    • Gustavson, F.G.1    Karlsson, L.2    Kågström, B.3
  • 34
    • 83155188961
    • Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels
    • New York, NY, USA. ACM
    • A. Haidar, H. Ltaief, and J. Dongarra. Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels. In Proceedings of SC'11, pages 8:1-8:11, New York, NY, USA, 2011. ACM.
    • (2011) Proceedings of SC'11 , pp. 8:1-8:11
    • Haidar, A.1    Ltaief, H.2    Dongarra, J.3
  • 36
    • 84860412769
    • Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures
    • DOI: 10.1002/cpe.1829
    • A. Haidar, H. Ltaief, A. YarKhan, and J. J. Dongarra. Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures. Concurrency Computat.: Pract. Exper., 2011. DOI: 10.1002/cpe.1829.
    • (2011) Concurrency Computat.: Pract. Exper.
    • Haidar, A.1    Ltaief, H.2    Yarkhan, A.3    Dongarra, J.J.4
  • 37
    • 84899682038
    • New multi-stage algorithm for symmetric eigenvalues and eigenvectors achieves two-fold speedup
    • Aachen, Germany, August 26-30 (submitted)
    • A. Haidar, P. Luszczek, and J. Dongarra. New multi-stage algorithm for symmetric eigenvalues and eigenvectors achieves two-fold speedup. In Euro-Par 2013, Aachen, Germany, August 26-30 2013. (submitted).
    • (2013) Euro-Par 2013
    • Haidar, A.1    Luszczek, P.2    Dongarra, J.3
  • 39
    • 0015127540
    • A numerical method for solving Fredholm integral equations of the first kind using singular values
    • R. J. Hanson. A numerical method for solving Fredholm integral equations of the first kind using singular values. SIAM J. Numer. Anal., 8(3):616-626, 1971.
    • (1971) SIAM J. Numer. Anal. , vol.8 , Issue.3 , pp. 616-626
    • Hanson, R.J.1
  • 40
    • 58149421595
    • Analysis of a complex of statistical variables into principal components
    • 498-520
    • H. Hotelling. Analysis of a complex of statistical variables into principal components. J. Educ. Psych., 24:417-441, 498-520, 1933.
    • (1933) J. Educ. Psych. , vol.24 , pp. 417-441
    • Hotelling, H.1
  • 41
    • 0002467254
    • Simplified calculation of principal components
    • H. Hotelling. Simplified calculation of principal components. Psychometrika, 1:27-35, 1935.
    • (1935) Psychometrika , vol.1 , pp. 27-35
    • Hotelling, H.1
  • 42
    • 0000652188
    • Unitary triangularization of a nonsymmetric matrix
    • October. DOI 10.1145/320941.320947
    • A. S. Householder. Unitary triangularization of a nonsymmetric matrix. Journal of the ACM (JACM), 5(4), October 1958. DOI 10.1145/320941.320947.
    • (1958) Journal of the ACM (JACM) , vol.5 , Issue.4
    • Householder, A.S.1
  • 44
    • 21344498628
    • A Parallel Algorithm for Computing the Singular Value Decomposition of a Matrix
    • E. R. Jessup and D. Sorensen. A Parallel Algorithm for Computing the Singular Value Decomposition of a Matrix. SIAM J. Matrix Anal. Appl., 15:530-548, 1994.
    • (1994) SIAM J. Matrix Anal. Appl. , vol.15 , pp. 530-548
    • Jessup, E.R.1    Sorensen, D.2
  • 45
    • 0346688721
    • Information filtering using the Riemannian SVD (R-SVD)
    • A. Ferreira, J. D. P. Rolim, H. D. Simon, and S.-H. Teng, editors, Berkeley, California, USA, August 9-11, Proceedings, volume 1457 of Lecture Notes in Computer Science. Springer, 1998
    • E. P. Jiang and M. W. Berry. Information filtering using the Riemannian SVD (R-SVD). In A. Ferreira, J. D. P. Rolim, H. D. Simon, and S.-H. Teng, editors, Solving Irregularly Structured Problems in Parallel, 5th International Symposium, IRREGULAR 98, Berkeley, California, USA, August 9-11, 1998, Proceedings, volume 1457 of Lecture Notes in Computer Science, pages 386-395. Springer, 1998.
    • (1998) Solving Irregularly Structured Problems in Parallel, 5th International Symposium, IRREGULAR 98 , pp. 386-395
    • Jiang, E.P.1    Berry, M.W.2
  • 46
    • 49349111725
    • Solving systems of linear equations on the CELL processor using Cholesky factorization
    • DOI: TPDS.2007.70813
    • J. Kurzak, A. Buttari, and J. J. Dongarra. Solving systems of linear equations on the CELL processor using Cholesky factorization. Trans. Parallel Distrib. Syst., 19(9):1175-1186, 2008. DOI: TPDS.2007.70813.
    • (2008) Trans. Parallel Distrib. Syst. , vol.19 , Issue.9 , pp. 1175-1186
    • Kurzak, J.1    Buttari, A.2    Dongarra, J.J.3
  • 47
    • 80053238375
    • QR factorization for the CELL processor
    • DOI: 10.3233/SPR-2008-0268
    • J. Kurzak and J. J. Dongarra. QR factorization for the CELL processor. Scientific Programming, 00:1-12, 2008. DOI: 10.3233/SPR-2008-0268.
    • (2008) Scientific Programming , vol.0 , pp. 1-12
    • Kurzak, J.1    Dongarra, J.J.2
  • 48
    • 73149105729
    • Scheduling dense linear algebra operations on multicore processors
    • DOI: 10.1002/cpe.1467
    • J. Kurzak, H. Ltaief, J. J. Dongarra, and R. M. Badia. Scheduling dense linear algebra operations on multicore processors. Concurrency Computat.: Pract. Exper., 21(1):15-44, 2009. DOI: 10.1002/cpe.1467.
    • (2009) Concurrency Computat.: Pract. Exper. , vol.21 , Issue.1 , pp. 15-44
    • Kurzak, J.1    Ltaief, H.2    Dongarra, J.J.3    Badia, R.M.4
  • 49
    • 0040250198
    • A parallel algorithm for reducing symmetric banded matrices to tridiagonal form
    • November
    • B. Lang. A parallel algorithm for reducing symmetric banded matrices to tridiagonal form. SIAM J. Sci. Comput., 14:1320-1338, November 1993.
    • (1993) SIAM J. Sci. Comput. , vol.14 , pp. 1320-1338
    • Lang, B.1
  • 50
    • 0032678430
    • Efficient eigenvalue and singular value computations on shared memory machines
    • B. Lang. Efficient eigenvalue and singular value computations on shared memory machines. Parallel Computing, 25(7):845-860, 1999.
    • (1999) Parallel Computing , vol.25 , Issue.7 , pp. 845-860
    • Lang, B.1
  • 53
    • 84877905452
    • High performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures
    • In publication
    • H. Ltaief, P. Luszczek, and J. Dongarra. High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures. ACM TOMS, 39(3), 2013. In publication.
    • (2013) ACM TOMS , vol.39 , Issue.3
    • Ltaief, H.1    Luszczek, P.2    Dongarra, J.3
  • 54
    • 84865266292
    • Enhancing parallelism of tile bidiagonal transformation on multicore architectures using tree reduction
    • R. Wyrzykowski, J. Dongarra, K. Karczewski, and J. Wasniewski, editors, Torun, Poland
    • H. Ltaief, P. Luszczek, A. Haidar, and J. Dongarra. Enhancing parallelism of tile bidiagonal transformation on multicore architectures using tree reduction. In R. Wyrzykowski, J. Dongarra, K. Karczewski, and J. Wasniewski, editors, Proceedings of 9th International Conference, PPAM 2011, volume 7203, pages 661-670, Torun, Poland, 2012.
    • (2012) Proceedings of 9th International Conference, PPAM 2011 , vol.7203 , pp. 661-670
    • Ltaief, H.1    Luszczek, P.2    Haidar, A.3    Dongarra, J.4
  • 56
    • 85057401994
    • Parallel methods for the singular value decomposition
    • E. Kontoghiorghes, editor. Chapman & Hall/CRC
    • B. P. M. Berry, D. Mezher and A. Sameh. Parallel methods for the singular value decomposition. In E. Kontoghiorghes, editor, Parallel Computing and Statistics, pages 117-164. Chapman & Hall/CRC, 2006.
    • (2006) Parallel Computing and Statistics , pp. 117-164
    • Berry, B.P.M.1    Mezher, D.2    Sameh, A.3
  • 57
    • 0019533482
    • Principal component analysis in linear systems: Controllability, observability, and model reduction
    • February
    • B. C. Moore. Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE Transactions on Automatic Control, AC-26(1), February 1981.
    • (1981) IEEE Transactions on Automatic Control , vol.AC-26 , Issue.1
    • Moore, B.C.1
  • 59
    • 0347737736
    • The decompositional approach to matrix computation
    • Jan/Feb. ISSN: 1521-9615; DOI 10.1109/5992.814658
    • G. W. Stewart. The decompositional approach to matrix computation. Computing in Science & Engineering, 2(1):50-59, Jan/Feb 2000. ISSN: 1521-9615; DOI 10.1109/5992.814658.
    • (2000) Computing in Science & Engineering , vol.2 , Issue.1 , pp. 50-59
    • Stewart, G.W.1
  • 61
    • 0025467711
    • A bridging model for parallel computation
    • Aug. DOI 10.1145/79173.79181
    • L. G. Valiant. A bridging model for parallel computation. Communications of the ACM, 33(8), Aug. 1990. DOI 10.1145/79173.79181.
    • (1990) Communications of the ACM , vol.33 , Issue.8
    • Valiant, L.G.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.