SCOPUS Information Search Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 4967 LNCS, 2008, Pages 639-648

Parallel tiled QR factorization for multicore architectures

(4) Buttari, Alfredo a; Langou, Julien b; Kurzak, Jakub a; Dongarra, Jack a,c,d

a University of Tennessee (United States)

b University of Colorado (United States)

c Oak Ridge National Laboratory (United States)

d University of Manchester (United Kingdom)

Author keywords

[No Author keywords available]

Indexed keywords

ALGEBRA; ALGORITHMS; BOOLEAN FUNCTIONS; COMPUTATIONAL METHODS; EVOLUTIONARY ALGORITHMS; FACTORIZATION; LEARNING ALGORITHMS; LINEAR ALGEBRA; MULTITASKING; PAPER; SCHEDULING ALGORITHMS; STANDARDS; STATISTICS; SUPERCOMPUTERS; TREES (MATHEMATICS);

(+ MOD 2N) OPERATION; APPLIED MATHEMATICS; ARCHITECTURAL FEATURES; COMPUTATIONAL RESOURCES; FINE GRAIN PARALLELISM; HEIDELBERG (CO); HIGH PERFORMANCE COMPUTING (HIPC); IN ORDER; INTERNATIONAL CONFERENCES; LINEAR ALGEBRA ALGORITHMS; LOOSE SYNCHRONIZATION (LS); MULTI CORE ARCHITECTURE; MULTI CORE SYSTEMS; OUT-OF-ORDER EXECUTION; PARALLEL EXECUTIONS; PARALLEL PROCESSING; PERFORMANCE COMPARISON; QR FACTORIZATIONS; SPRINGER (CO);

PARALLEL ALGORITHMS;

EID: 45449096678 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-540-68111-3_67 Document Type: Conference Paper

Times cited : (6)

References (26)

1
- 27344435504
- The design and implementation of a first-generation CELL processor
- Pham, D., Asano, S., Bolliger, M., Day, M.N., Hofstee, H.P., Johns, C., Kahle, J., Kameyama, A., Keaty, J., Masubuchi, Y., Riley, M., Shippy, D., Stasiak, D., Suzuoki, M., Wang, M., Warnock, J., Weitzel, S., Wendel, D., Yamazaki, T., Yazawa, K.: The design and implementation of a first-generation CELL processor. In: IEEE International Solid-State Circuits Conference, pp. 184-185 (2005)
- (2005) IEEE International Solid-State Circuits Conference , pp. 184-185
- Pham, D.¹ Asano, S.² Bolliger, M.³ Day, M.N.⁴ Hofstee, H.P.⁵ Johns, C.⁶ Kahle, J.⁷ Kameyama, A.⁸ Keaty, J.⁹ Masubuchi, Y.¹⁰ Riley, M.¹¹ Shippy, D.¹² Stasiak, D.¹³ Suzuoki, M.¹⁴ Wang, M.¹⁵ Warnock, J.¹⁶ Weitzel, S.¹⁷ Wendel, D.¹⁸ Yamazaki, T.¹⁹ Yazawa, K.²⁰

2
- 45449102400
- Teraflops research chip
- Teraflops research chip, http://www.intel.com/research/platform/terascale/teraflops.htm

3
- 0003706460
- 3rd edn. SIAM, Philadelphia
- Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Croz, J.D., Greenbaum, A., Hammarling, S., McKenney, A., Sorensen, D.: LAPACK User's Guide, 3rd edn. SIAM, Philadelphia (1999)
- (1999) LAPACK User's Guide
- Anderson, E.¹ Bai, Z.² Bischof, C.³ Blackford, S.⁴ Demmel, J.⁵ Dongarra, J.⁶ Croz, J.D.⁷ Greenbaum, A.⁸ Hammarling, S.⁹ McKenney, A.¹⁰ Sorensen, D.¹¹

4
- 0030564728
- ScaLAPACK: A portable linear algebra library for distributed memory computers - design issues and performance
- also as LAPACK Working Note #95
- Choi, J., Demmel, J., Dhillon, I., Dongarra, J., Ostrouchov, S., Petitet, A., Stanley, K., Walker, D., Whaley, R.C.: ScaLAPACK: A portable linear algebra library for distributed memory computers - design issues and performance. Computer Physics Communications 97, 1-15 (1996), (also as LAPACK Working Note #95)
- (1996) Computer Physics Communications , vol.97 , pp. 1-15
- Choi, J.¹ Demmel, J.² Dhillon, I.³ Dongarra, J.⁴ Ostrouchov, S.⁵ Petitet, A.⁶ Stanley, K.⁷ Walker, D.⁸ Whaley, R.C.⁹

5
- 35248868578
- Implementing linear algebra routines on multi-core processors with pipelining and a look ahead
- Also available as UT-CS-06-581, September
- Kurzak, J., Dongarra, J.: Implementing linear algebra routines on multi-core processors with pipelining and a look ahead. LAPACK Working Note 178 (September 2006), Also available as UT-CS-06-581
- (2006) LAPACK Working Note , vol.178
- Kurzak, J.¹ Dongarra, J.²

6
- 38049058008
- Buttari, A., Dongarra, J., Kurzak, J., Langou, J., Luszczek, P., Tomov, S.: The impact of multicore on math software. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds.) PARA 2006. LNCS, vol. 4699, pp. 1-10. Springer, Heidelberg (2007)

7
- 35248843628
- Supermatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures
- ACM Press, New York
- Chan, E., Quintana-Orti, E.S., Quintana-Orti, G., van de Geijn, R.: Supermatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures. In: SPAA 2007: Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures, pp. 116-125. ACM Press, New York (2007)
- (2007) SPAA 2007: Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures , pp. 116-125
- Chan, E.¹ Quintana-Orti, E.S.² Quintana-Orti, G.³ van de Geijn, R.⁴

8
- 1842832833
- Recursive blocked algorithms and hybrid data structures for dense matrix library software
- Elmroth, E., Gustavson, F., Jonsson, I., Kågström, B.: Recursive blocked algorithms and hybrid data structures for dense matrix library software. SIAM Review 46(1), 3-45 (2004)
- (2004) SIAM Review , vol.46 , Issue.1 , pp. 3-45
- Elmroth, E.¹ Gustavson, F.² Jonsson, I.³ Kågström, B.⁴

9
- 38049087210
- Gustavson, F., Karlsson, L., Kågström, B.: Three algorithms for cholesky factorization on distributed memory using packed storage. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds.) PARA 2006. LNCS, vol. 4699, pp. 550-559. Springer, Heidelberg (2007)

10
- 45449118422
- Kurzak, J., Buttari, A., Dongarra, J.: Solving systems of linear equations on the CELL processor using Cholesky factorization. Technical Report UT-CS-07-596, Innovative Computing Laboratory, University of Tennessee Knoxville (April 2007)

11
- 0020593101
- Solving linear algebraic equations on an mimd computer
- Lord, R.E., Kowalik, J.S., Kumar, S.P.: Solving linear algebraic equations on an mimd computer. J. ACM 30(1), 103-117 (1983)
- (1983) J. ACM , vol.30 , Issue.1 , pp. 103-117
- Lord, R.E.¹ Kowalik, J.S.² Kumar, S.P.³

12
- 0021572957
- December
- Dongarra, J.J., Hiromoto, R.E.: A collection of parallel linear equations routines for the Denelcor HEP 1(2), 133-142 (December 1984)
- (1984) A collection of parallel linear equations routines for the Denelcor HEP , vol.1 , Issue.2 , pp. 133-142
- Dongarra, J.J.¹ Hiromoto, R.E.²

13
- 0024891893
- Vector and parallel algorithms for cholesky factorization on ibm 3090
- ACM Press, New York
- Agarwal, R.C., Gustavson, F.G.: Vector and parallel algorithms for cholesky factorization on ibm 3090. In: Supercomputing 1989: Proceedings of the 1989 ACM/IEEE conference on Supercomputing, pp. 225-233. ACM Press, New York (1989)
- (1989) Supercomputing 1989: Proceedings of the 1989 ACM/IEEE conference on Supercomputing , pp. 225-233
- Agarwal, R.C.¹ Gustavson, F.G.²

14
- 45449117612
- Agarwal, R.C., Gustavson, F.G.: A parallel implementation of matrix multiplication and LU factorization on the IBM 3090. In: Proceedings of the IFIP WG 2.5 Working Group on Aspects of Computation on Asynchronous Parallel Processors, Stanford CA, August 22-26, 1988, North Holland, Amsterdam (1988)

15
- 0034224207
- Applying recursion to serial and parallel QR factorization leads to better performance
- Elmroth, E., Gustavson, F.G.: Applying recursion to serial and parallel QR factorization leads to better performance. IBM Journal of Research and Development 44(4), 605 (2000)
- (2000) IBM Journal of Research and Development , vol.44 , Issue.4 , pp. 605
- Elmroth, E.¹ Gustavson, F.G.²

16
- 0004236492
- 3rd edn. Johns Hopkins University Press, Baltimore
- Golub, G., Van Loan, C.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)
- (1996) Matrix Computations
- Golub, G.¹ Van Loan, C.²

17
- 0004094905
- 1st edn, SIAM, Philadelphia
- Stewart, G.W.: Matrix Algorithms, 1st edn., vol. 1. SIAM, Philadelphia (1998)
- (1998) Matrix Algorithms , vol.1
- Stewart, G.W.¹

18
- 45449092245
- FORTRAN Subroutines for Out-of-Core Solutions of Large Complex Linear Systems
- Technical Report CR-159142, NASA November
- Yip, E.L.: FORTRAN Subroutines for Out-of-Core Solutions of Large Complex Linear Systems. Technical Report CR-159142, NASA (November 1979)
- (1979)
- Yip, E.L.¹

19
- 45449110534
- Updating an LU factorization with pivoting
- Technical Report TR-2006-42, The University of Texas at Austin, Department of Computer Sciences , FLAME Working Note 21
- Quintana-Orti, E., van de Geijn, R.: Updating an LU factorization with pivoting, Technical Report TR-2006-42, The University of Texas at Austin, Department of Computer Sciences (2006), FLAME Working Note 21
- (2006)
- Quintana-Orti, E.¹ van de Geijn, R.²

20
- 17644368925
- Parallel out-of-core computation and updating of the QR factorization
- Gunter, B.C., van de Geijn, R.A.: Parallel out-of-core computation and updating of the QR factorization. ACM Trans. Math. Softw. 31(1), 60-78 (2005)
- (2005) ACM Trans. Math. Softw , vol.31 , Issue.1 , pp. 60-78
- Gunter, B.C.¹ van de Geijn, R.A.²

21
- 0029358998
- A parallel algorithm for the reduction of a nonsymmetric matrix to block upper-hessenberg form
- Berry, M.W., Dongarra, J.J., Kim, Y.: A parallel algorithm for the reduction of a nonsymmetric matrix to block upper-hessenberg form. Parallel Comput. 21(8), 1189-1211 (1995)
- (1995) Parallel Comput , vol.21 , Issue.8 , pp. 1189-1211
- Berry, M.W.¹ Dongarra, J.J.² Kim, Y.³

22
- 84947583789
- Gustavson, F.G.: New generalized data structures for matrices lead to a variety of high performance algorithms. In: Wyrzykowski, R., Dongarra, J., Paprzycki, M., Waśniewski, J. (eds.) PPAM 2001. LNCS, vol. 2328, pp. 418-436. Springer, Heidelberg (2002)

23
- 0001951009
- The WY representation for products of householder matrices
- Bischof, C., van Loan, C.: The WY representation for products of householder matrices. SIAM J. Sci. Stat. Comput. 8(1), 2-13 (1987)
- (1987) SIAM J. Sci. Stat. Comput , vol.8 , Issue.1 , pp. 2-13
- Bischof, C.¹ van Loan, C.²

24
- 0003078924
- A storage-efficient WY representation for products of Householder transformations
- Schreiber, R., van Loan, C.: A storage-efficient WY representation for products of Householder transformations. SIAM J. Sci. Stat. Comput. 10(1), 53-57 (1989)
- (1989) SIAM J. Sci. Stat. Comput , vol.10 , Issue.1 , pp. 53-57
- Schreiber, R.¹ van Loan, C.²

25
- 0001951009
- The WY representation for products of householder matrices
- Bischof, C., van Loan, C.: The WY representation for products of householder matrices. SIAM J. Sci. Stat. Comput. 8(1), 2-13 (1987)
- (1987) SIAM J. Sci. Stat. Comput , vol.8 , Issue.1 , pp. 2-13
- Bischof, C.¹ van Loan, C.²

26
- 45449098829
- Buttari, A., Langou, J., Kurzak, J., Dongarra, J.: Parallel Tiled QR Factorization for Multicore Architectures. Technical Report UT-CS-07-598, University of Tennessee (2007), LAPACK Working Note 190

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.