SCOPUS Information Retrieval Platform

SIAM Journal on Scientific Computing

Volume 34, Issue 1, 2012, Pages

Communication-optimal parallel and sequential QR and LU factorizations

(4) Demmel, James (a); Grigori, Laura (b); Hoemmen, Mark (c); Langou, Julien (d)

a UNIVERSITY OF CALIFORNIA (United States)

b UFR 919 Laboratoire d'Informatique Pour la Mécanique et les Sciences de l'Ingénieur (France)

c SANDIA NATIONAL LABORATORIES (United States)

d UNIVERSITY OF COLORADO (United States)

Author keywords

Linear algebra; LU factorization; QR factorization

Indexed keywords

LOWER-UPPER DECOMPOSITION;

FACTORIZATION ALGORITHMS; LOW BOUND; LU FACTORIZATION; MATRIX; OPTIMALITY; PERFORMANCE MODELING; POLY-LOGARITHMIC FACTORS; QR ALGORITHMS; QR FACTORIZATIONS; SCALAPACK;

FACTORIZATION;

EID: 84861354409 PISSN: 10648275 EISSN: None Source Type: Journal
DOI: 10.1137/080731992 Document Type: Article

Times cited : (299)

References (49)

1
- 0003706460
- 3rd ed., SIAM, Philadelphia
- E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, S. Blackford, and D. Sorensen, LAPACK Users' Guide, 3rd ed., SIAM, Philadelphia, 1999.
- (1999) LAPACK Users' Guide
- Anderson, E.¹ Bai, Z.² Bischof, C.³ Demmel, J.⁴ Dongarra, J.⁵ Du Croz, J.⁶ Greenbaum, A.⁷ Hammarling, S.⁸ McKenney, A.⁹ Blackford, S.¹⁰ Sorensen, D.¹¹

2
- 84861387586
- University of Tennessee, Knoxville
- M. Baboulin, L. Giraud, S. Gratton, and J. Langou, Parallel Tools for Solving Incremental Dense Least Squares Problems. Application to Space Geodesy, Technical report UT-CS-06-582, University of Tennessee, Knoxville, 2006.
- (2006) Parallel Tools for Solving Incremental Dense Least Squares Problems. Application to Space Geodesy, Technical Report UT-CS-06-582
- Baboulin, M.¹ Giraud, L.² Gratton, S.³ Langou, J.⁴

3
- 2442576081
- Algorithm 827: Irbleigs: A MATLAB program for computing a few eigenpairs of a large sparse Hermitian matrix
- J. Baglama, D. Calvetti, and L. Reichel, Algorithm 827: Irbleigs: A MATLAB program for computing a few eigenpairs of a large sparse Hermitian matrix, ACM Trans. Math. Software, 29 (2003), pp. 337-348.
- (2003) ACM Trans. Math. Software , vol.29 , pp. 337-348
- Baglama, J.¹ Calvetti, D.² Reichel, L.³

4
- 84861395314
- Block Arnoldi method
- Z. Bai, J. W. Demmel, J. J. Dongarra, A. Ruhe, and H. van der Vorst, eds., SIAM, Philadelphia
- Z. Bai and D. Day, Block Arnoldi method, in Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, Z. Bai, J. W. Demmel, J. J. Dongarra, A. Ruhe, and H. van der Vorst, eds., SIAM, Philadelphia, 2000, pp. 196-204.
- (2000) Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide , pp. 196-204
- Bai, Z.¹ Day, D.²

5
- 84861380627
- C. G. Baker, U. L. Hetmaniuk, R. B. Lehoucq, and H. K. Thornquist, Anasazi webpage, http://trilinos.sandia.gov/packages/anasazi.
- Anasazi Webpage
- Baker, C.G.¹ Hetmaniuk, U.L.² Lehoucq, R.B.³ Thornquist, H.K.⁴

6
- 80052309144
- University of California, Berkeley, CA
- G. Ballard, J. Demmel, O. Holtz, and O. Schwartz, Minimizing Communication in Numerical Linear Algebra. Technical report UCB/EECS-2011-15, University of California, Berkeley, CA, 2011.
- (2011) Minimizing Communication in Numerical Linear Algebra. Technical Report UCB/EECS-2011-15
- Ballard, G.¹ Demmel, J.² Holtz, O.³ Schwartz, O.⁴

7
- 80054034521
- Minimizing communication in numerical linear algebra
- G. Ballard, J. Demmel, O. Holtz, and O. Schwartz, Minimizing communication in numerical linear algebra, SIAM J. Matrix Anal. Appl., 32 (2011), pp. 866-901.
- (2011) SIAM J. Matrix Anal. Appl. , vol.32 , pp. 866-901
- Ballard, G.¹ Demmel, J.² Holtz, O.³ Schwartz, O.⁴

8
- 0003615167
- SIAM, Philadelphia
- L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. W. Demmel, I. Dhillon, J. J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley, ScaLAPACK Users' Guide, SIAM, Philadelphia, 1997.
- (1997) ScaLAPACK Users' Guide
- Blackford, L.S.¹ Choi, J.² Cleary, A.³ D'Azevedo, E.⁴ Demmel, J.W.⁵ Dhillon, I.⁶ Dongarra, J.J.⁷ Hammarling, S.⁸ Henry, G.⁹ Petitet, A.¹⁰ Stanley, K.¹¹ Walker, D.¹² Whaley, R.C.¹³

9
- 51049083291
- University of Tennessee, Knoxville
- A. Buttari, J. Langou, J. Kurzak, and J. J. Dongarra, A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures, Technical report UT-CS-07-600, University of Tennessee, Knoxville, 2007.
- (2007) A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures, Technical Report UT-CS-07-600
- Buttari, A.¹ Langou, J.² Kurzak, J.³ Dongarra, J.J.⁴

10
- 51049083291
- University of Tennessee, Knoxville
- A. Buttari, J. Langou, J. Kurzak, and J. J. Dongarra, Parallel tiled QR factorization for multicore architectures, Technical report UT-CS-07-598, University of Tennessee, Knoxville, 2007.
- (2007) Parallel Tiled QR Factorization for Multicore Architectures, Technical Report UT-CS-07-598
- Buttari, A.¹ Langou, J.² Kurzak, J.³ Dongarra, J.J.⁴

11
- 0001023112
- Parallel QR decomposition of a rectangular matrix
- M. Cosnard, J.-M. Muller, and Y. Robert, Parallel QR decomposition of a rectangular matrix, Numer. Math., 48 (1986), pp. 239-249.
- (1986) Numer. Math. , vol.48 , pp. 239-249
- Cosnard, M.¹ Muller, J.-M.² Robert, Y.³

12
- 0000538288
- Fast parallel matrix inversion algorithms
- L. Csanky, Fast parallel matrix inversion algorithms, SIAM J. Comput., 5 (1976), pp. 618-623.
- (1976) SIAM J. Comput. , vol.5 , pp. 618-623
- Csanky, L.¹

13
- 84861360773
- New parallel (rank-revealing) QR factorization algorithms
- Parallel Processing: Eighth International Euro-Par Conference, Paderborn, Germany
- R. D. D. Cunha, D. Becker, and J. C. Patterson, New parallel (rank-revealing) QR factorization algorithms, in Proceedings of the Euro-Par 2002. Parallel Processing: Eighth International Euro-Par Conference, Paderborn, Germany, 2002.
- (2002) Proceedings of the Euro-Par 2002
- Cunha, R.D.D.¹ Becker, D.² Patterson, J.C.³

14
- 0004375934
- University of Tennessee, Knoxville
- E. F. D'Azevedo and J. J. Dongarra, The Design and Implementation of the Parallel Out-of-Core ScaLAPACK LU, QR, and Cholesky Factorization Routines, Technical report 118 CS-97-247, University of Tennessee, Knoxville, 1997.
- (1997) The Design and Implementation of the Parallel Out-of-Core ScaLAPACK LU, QR, and Cholesky Factorization Routines, Technical Report 118 CS-97-247
- D'Azevedo, E.F.¹ Dongarra, J.J.²

15
- 0034487070
- The design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines
- E. D'Azevedo and J. Dongarra, The design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines, Concurrency Practice Experience, 12 (2000), pp. 1481-1483.
- (2000) Concurrency Practice Experience , vol.12 , pp. 1481-1483
- D'Azevedo, E.¹ Dongarra, J.²

16
- 77953980008
- University of California, Berkeley, CA
- J. W. Demmel, L. Grigori, M. Hoemmen, and J. Langou, Communication-Avoiding Parallel and Sequential QR and LU Factorizations: Theory and Practice, Technical report UCB/EECS-2008-89, University of California, Berkeley, CA, 2008.
- (2008) Communication-Avoiding Parallel and Sequential QR and LU Factorizations: Theory and Practice, Technical Report UCB/EECS-2008-89
- Demmel, J.W.¹ Grigori, L.² Hoemmen, M.³ Langou, J.⁴

17
- 74049121700
- Nonnegative diagonals and high performance on low-profile matrices from Householder QR
- J. W. Demmel, M. Hoemmen, Y. Hida, and E. J. Riedy, Nonnegative diagonals and high performance on low-profile matrices from Householder QR, SIAM J. Sci. Comput., 31 (2009), pp. 2832-2841.
- (2009) SIAM J. Sci. Comput. , vol.31 , pp. 2832-2841
- Demmel, J.W.¹ Hoemmen, M.² Hida, Y.³ Riedy, E.J.⁴

18
- 84861398561
- Minimizing communication in sparse matrix solvers
- New York
- J. W. Demmel, M. Hoemmen, M. Mohiyuddin, and K. A. Yelick, Minimizing communication in sparse matrix solvers, in Proceedings of the 2009 ACM/IEEE Conference on Supercomputing, New York, 2009.
- (2009) Proceedings of the 2009 ACM/IEEE Conference on Supercomputing
- Demmel, J.W.¹ Hoemmen, M.² Mohiyuddin, M.³ Yelick, K.A.⁴

19
- 0342583534
- University of Tennessee, Knoxville
- J. W. Demmel, Trading Off Parallelism and Numerical Stability, Technical report UT-CS-92-179, University of Tennessee, Knoxville, 1992.
- (1992) Trading Off Parallelism and Numerical Stability, Technical Report UT-CS-92-179
- Demmel, J.W.¹

20
- 84947936389
- New serial and parallel recursive QR factorization algorithms for SMP systems
- B. Kågström, E. Elmroth, J. Dongarra, and J. Wasniewski, eds., Lecture Notes in Comput. Sci., Springer, New York
- E. Elmroth and F. Gustavson, New serial and parallel recursive QR factorization algorithms for SMP systems, in Proceedings of the Fourth International Workshop on Applied Parallel Computing, Large Scale Scientific and Industrial Problems, B. Kågström, E. Elmroth, J. Dongarra, and J. Wasniewski, eds., Lecture Notes in Comput. Sci. 1541, Springer, New York, 1998, pp. 120-128.
- (1998) Proceedings of the Fourth International Workshop on Applied Parallel Computing, Large Scale Scientific and Industrial Problems , vol.1541 , pp. 120-128
- Elmroth, E.¹ Gustavson, F.²

21
- 0034224207
- Applying recursion to serial and parallel QR factorization leads to better performance
- E. Elmroth and F. Gustavson, Applying recursion to serial and parallel QR factorization leads to better performance, IBM J. Res. Develop., 44 (2000), pp. 605-624.
- (2000) IBM J. Res. Develop. , vol.44 , pp. 605-624
- Elmroth, E.¹ Gustavson, F.²

22
- 0038716587
- QR factorization with Morton-ordered quadtree matrices for memory re-use and parallelism
- J. D. Frens and D. S. Wise, QR factorization with Morton-ordered quadtree matrices for memory re-use and parallelism, SIGPLAN Not., 38 (2003), pp. 144-154.
- (2003) SIGPLAN Not. , vol.38 , pp. 144-154
- Frens, J.D.¹ Wise, D.S.²

23
- 0009598276
- A block QMR algorithm for non-Hermitian linear systems with multiple right-hand sides
- PII S0024379596005290
- R. W. Freund and M. Malhotra, A block QMR algorithm for non-Hermitian linear systems with multiple right-hand sides, Linear Algebra Appl., 254 (1997), pp. 119-157. (Pubitemid 127377532)
- (1997) Linear Algebra and Its Applications , vol.254 , Issue.1-3 , pp. 119-157
- Freund, R.W.¹ Malhotra, M.²

24
- 77953973267
- Parallel block schemes for large-scale least-squares computations
- R. B. Wilhelmson, ed., University of Illinois Press, Chicago, IL
- G. H. Golub, R. J. Plemmons, and A. Sameh, Parallel block schemes for large-scale least-squares computations, in High-Speed Computing: Scientific Applications and Algorithm Design, R. B. Wilhelmson, ed., University of Illinois Press, Chicago, IL, 1988, pp. 171-179.
- (1988) High-Speed Computing: Scientific Applications and Algorithm Design , pp. 171-179
- Golub, G.H.¹ Plemmons, R.J.² Sameh, A.³

25
- 85014324703
- National Academies Press, Washington, D.C.
- S. L. Graham, M. Snir, and C. A. Patterson, eds., Getting Up to Speed: The Future of Supercomputing, National Academies Press, Washington, D.C., 2005.
- (2005) Getting Up to Speed: The Future of Supercomputing
- Graham, S.L.¹ Snir, M.² Patterson, C.A.³

26
- 70350784030
- Communication avoiding Gaussian elimination
- L. Grigori, J. W. Demmel, and H. Xiang, Communication avoiding Gaussian elimination, Proceedings of the ACM/IEEE SC08 Conference, 2008.
- (2008) Proceedings of the ACM/IEEE SC08 Conference
- Grigori, L.¹ Demmel, J.W.² Xiang, H.³

27
- 84861371897
- INRIA
- L. Grigori, J. W. Demmel, and H. Xiang, CALU: A Communication Optimal LU Factorization Algorithm, Technical report UCB-EECS-2010-29, INRIA, 2010.
- (2010) CALU: A Communication Optimal LU Factorization Algorithm, Technical Report UCB-EECS-2010-29
- Grigori, L.¹ Demmel, J.W.² Xiang, H.³

28
- 17644368925
- Parallel out-of-core computation and updating of the QR factorization
- DOI 10.1145/1055531.1055534
- B. C. Gunter and R. A. van de Geijn, Parallel out-of-core computation and updating of the QR factorization, ACM Trans. Math. Software, 31 (2005), pp. 60-78. (Pubitemid 40557862)
- (2005) ACM Transactions on Mathematical Software , vol.31 , Issue.1 , pp. 60-78
- Gunter, B.C.¹ Van De Geijn, R.A.²

29
- 33748688428
- Basis selection in LOBPCG
- DOI 10.1016/j.jcp.2006.02.007, PII S0021999106000866
- U. Hetmaniuk and R. Lehoucq, Basis selection in LOBPCG, J. Comput. Phys., 218 (2006), pp. 324-332. (Pubitemid 44389052)
- (2006) Journal of Computational Physics , vol.218 , Issue.1 , pp. 324-332
- Hetmaniuk, U.¹ Lehoucq, R.²

30
- 80053254284
- Ph.D. thesis, EECS Department, University of California, Berkeley, CA
- M. Hoemmen, Communication-Avoiding Krylov Subspace Methods, Ph.D. thesis, EECS Department, University of California, Berkeley, CA, 2010.
- (2010) Communication-Avoiding Krylov Subspace Methods
- Hoemmen, M.¹

31
- 84971853043
- I/O complexity: The red-blue pebble game
- New York
- J. W. Hong and H. T. Kung, I/O complexity: The red-blue pebble game, in STOC '81: Proceedings of the 13th Annual ACM Symposium on Theory of Computing, New York, 1981, pp. 326-333.
- (1981) STOC '81: Proceedings of the 13th Annual ACM Symposium on Theory of Computing , pp. 326-333
- Hong, J.W.¹ Kung, H.T.²

32
- 10844258198
- Communication lower bounds for distributed-memory matrix multiplication
- DOI 10.1016/j.jpdc.2004.03.021
- D. Irony, S. Toledo, and A. Tiskin, Communication lower bounds for distributed-memory matrix multiplication, J. Parallel Distrib. Comput., 64 (2004), pp. 1017-1026. (Pubitemid 40000755)
- (2004) Journal of Parallel and Distributed Computing , vol.64 , Issue.9 , pp. 1017-1026
- Irony, D.¹ Toledo, S.² Tiskin, A.³

33
- 84861398559
- University of California, Davis
- A. V. Knyazev, M. Argentati, I. Lashuk, and E. E. Ovtchinnikov, Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) in HYPRE and PETSc, Technical report UCDHSC-CCM-251P, University of California, Davis, 2007.
- (2007) Block Locally Optimal Preconditioned Eigenvalue Xolvers (BLOPEX) in HYPRE and PETSc, Technical Report UCDHSC-CCM-251P
- Knyazev, A.V.¹ Argentati, M.² Lashuk, I.³ Ovtchinnikov, E.E.⁴

34
- 33746412371
- A. V. Knyazev, BLOPEX, http://www-math.cudenver.edu/~aknyazev/software/BLOPEX.
- BLOPEX
- Knyazev, A.V.¹

35
- 84861398557
- University of Tennessee, Knoxville
- J. Kurzak and J. J. Dongarra, QR Factorization for the CELL Processor, Technical report UT-CS-08-616, University of Tennessee, Knoxville, 2008.
- (2008) QR Factorization for the CELL Processor, Technical Report UT-CS-08-616
- Kurzak, J.¹ Dongarra, J.J.²

36
- 0005632874
- Block Arnoldi method
- Z. Bai, J. W. Demmel, J. J. Dongarra, A. Ruhe, and H. van der Vorst, eds., SIAM, Philadelphia
- R. Lehoucq and K. Maschhoff, Block Arnoldi method, in Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, Z. Bai, J. W. Demmel, J. J. Dongarra, A. Ruhe, and H. van der Vorst, eds., SIAM, Philadelphia, 2000, pp. 185-187.
- (2000) Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide , pp. 185-187
- Lehoucq, R.¹ Maschhoff, K.²

37
- 0033323425
- Parallel complexity of numerically accurate linear system solvers
- M. Leoncini, G. Manzini, and L. Margara, Parallel complexity of numerically accurate linear system solvers, SIAM J. Comput., 28 (1999), pp. 2030-2058. (Pubitemid 30530990)
- (1999) SIAM Journal on Computing , vol.28 , Issue.6 , pp. 2030-2058
- Leoncini, M.¹ Manzini, G.² Margara, L.³

38
- 84861351118
- O. Marques, BLZPACK, http://crd.lbl.gov/~osni.
- BLZPACK
- Marques, O.¹

39
- 70350625706
- Performance without pain = productivity: Data layout and collective communication in UPC
- R. Nishtala, G. Almási, and C. Caşcaval, Performance without pain = productivity: Data layout and collective communication in UPC, in Proceedings of the ACM SIGPLAN 2008 Symposium on Principles and Practice of Parallel Programming, 2008.
- (2008) Proceedings of the ACM SIGPLAN 2008 Symposium on Principles and Practice of Parallel Programming
- Nishtala, R.¹ Almási, G.² Caşcaval, C.³

40
- 0001084178
- The block conjugate gradient algorithm and related methods
- D. P. O'Leary, The block conjugate gradient algorithm and related methods, Linear Algebra Appl., 29 (1980), pp. 293-322.
- (1980) Linear Algebra Appl. , vol.29 , pp. 293-322
- O'Leary, D.P.¹

41
- 0004481424
- Distributed orthogonal factorization: Givens and Householder algorithms
- A. Pothen and P. Raghavan, Distributed orthogonal factorization: Givens and Householder algorithms, SIAM J. Sci. Statist. Comput., 10 (1989), pp. 1113-1134.
- (1989) SIAM J. Sci. Statist. Comput. , vol.10 , pp. 1113-1134
- Pothen, A.¹ Raghavan, P.²

42
- 47349122478
- Scheduling of QR factorization algorithms on SMP and multi-core architectures
- Toulouse, France
- G. Quintana-Ortí, E. S. Quintana-Ortí, E. Chan, F. G. V. Zee, and R. A. van de Geijn, Scheduling of QR factorization algorithms on SMP and multi-core architectures, in Proceedings of the 16th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, Toulouse, France, 2008.
- (2008) Proceedings of the 16th Euromicro International Conference on Parallel, Distributed and Network-Based Processing
- Quintana-Ortí, G.¹ Quintana-Ortí, E.S.² Chan, E.³ Zee, F.G.V.⁴ Van De Geijn, R.A.⁵

43
- 2442513740
- Out-of-core SVD and QR decompositions
- Norfolk, VA
- E. Rabani and S. Toledo, Out-of-core SVD and QR decompositions, in Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing, Norfolk, VA, 2001.
- (2001) Proceedings of the 10th SIAM Conference on Parallel Processing for Scientific Computing
- Rabani, E.¹ Toledo, S.²

44
- 0344153336
- On the complexity of matrix product
- R. Raz, On the complexity of matrix product, SIAM J. Comput., 32 (2003), pp. 1356-1369.
- (2003) SIAM J. Comput. , vol.32 , pp. 1356-1369
- Raz, R.¹

45
- 0003078924
- A storage efficient WY representation for products of Householder transformations
- R. Schreiber and C. Van Loan, A storage efficient WY representation for products of Householder transformations, SIAM J. Sci. Statist. Comput., 10 (1989), pp. 53-57.
- (1989) SIAM J. Sci. Statist. Comput. , vol.10 , pp. 53-57
- Schreiber, R.¹ Van Loan, C.²

46
- 84861356843
- A. Stathopoulos, PRIMME, http://www.cs.wm.edu/~andreas/software.
- PRIMME
- Stathopoulos, A.¹

47
- 0031496750
- Locality of reference in LU decomposition with partial pivoting
- S. Toledo, Locality of reference in LU decomposition with partial pivoting, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 1065-1081.
- (1997) SIAM J. Matrix Anal. Appl. , vol.18 , pp. 1065-1081
- Toledo, S.¹

48
- 0004201627
- Ph.D. thesis, Université de Rennes I, Rennes, France
- B. Vital, Étude de quelques méthodes de résolution de problèmes linéaires de grande taille sur multiprocesseur, Ph.D. thesis, Université de Rennes I, Rennes, France, 1990.
- (1990) Étude de Quelques Méthodes de Résolution de Problèmes Linéaires de Grande Taille sur Multiprocesseur
- Vital, B.¹

49
- 84861402518
- K. Wu and H. D. Simon, TRLAN, http://crd.lbl.gov/~kewu/ps/trlan.html.
- TRLAN
- Wu, K.¹ Simon, H.D.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.