SCOPUS Information Retrieval Platform

Concurrency and Computation: Practice and Experience

Volume 20, Issue 13, 2008, Pages 1573-1590

Parallel tiled QR factorization for multicore architectures

Authors (4): Buttari, Alfredo (a); Langou, Julien (b); Kurzak, Jakub (a); Dongarra, Jack (a, c, d)

a University of Tennessee Knoxville (United States)

b University of Colorado (United States)

c Oak Ridge National Laboratory (United States)

d University of Manchester (United Kingdom)

Author keywords

Linear algebra; Multicore; QR factorization

Indexed keywords

LINEAR ALGEBRA; PARALLEL ARCHITECTURES; SOFTWARE ARCHITECTURE;

COMPUTATIONAL RESOURCES; HIGH PERFORMANCE COMPUTING; LINEAR ALGEBRA ALGORITHMS; LOOSE SYNCHRONIZATIONS; MULTI CORE; MULTICORE ARCHITECTURES; OUT-OF-ORDER EXECUTION; QR FACTORIZATIONS;

FACTORIZATION;

EID: 50249105132 PISSN: 15320626 EISSN: 15320634 Source Type: Journal
DOI: 10.1002/cpe.1301 Document Type: Article

Times cited : (109)

References (28)

1
- 50249141640
- 24 July 2007
- http://top500.org [24 July 2007].

2
- 27344435504
- The design and implementation of a first-generation CELL processor
- Pham D, Asano S, Bolliger M, Day MN, Hofstee HP, Johns C, Kahle J, Kameyama A, Keaty J, Masubuchi Y, Riley M, Shippy D, Stasiak D, Suzuoki M, Wang M, Warnock J, Weitzel S, Wendel D, Yamazaki T, Yazawa K. The design and implementation of a first-generation CELL processor. IEEE International Solid-State Circuits Conference 2005; 184-185.
- (2005) IEEE International Solid-State Circuits Conference , pp. 184-185
- Pham, D.¹ Asano, S.² Bolliger, M.³ Day, M.N.⁴ Hofstee, H.P.⁵ Johns, C.⁶ Kahle, J.⁷ Kameyama, A.⁸ Keaty, J.⁹ Masubuchi, Y.¹⁰ Riley, M.¹¹ Shippy, D.¹² Stasiak, D.¹³ Suzuoki, M.¹⁴ Wang, M.¹⁵ Warnock, J.¹⁶ Weitzel, S.¹⁷ Wendel, D.¹⁸ Yamazaki, T.¹⁹ Yazawa, K.²⁰

3
- 50249133005
- Teraflops research chip, 24 July 2007
- Teraflops research chip. http://www.intel.com/research/platform/terascale/teraflops.htm [24 July 2007].

4
- 0003706460
- 3rd edn, SIAM: Philadelphia
- Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J, Du Croz J, Greenbaum A, Hammarling S, McKenney A, Sorensen D. LAPACK User's Guide (3rd edn). SIAM: Philadelphia, 1999.
- (1999) LAPACK User's Guide
- Anderson, E.¹ Bai, Z.² Bischof, C.³ Blackford, S.⁴ Demmel, J.⁵ Dongarra, J.⁶ Du Croz, J.⁷ Greenbaum, A.⁸ Hammarling, S.⁹ McKenney, A.¹⁰ Sorensen, D.¹¹

5
- 0030564728
- ScaLAPACK: A portable linear algebra library for distributed memory computers - Design issues and performance
- Also as LAPACK Working Note #95
- Choi J, Demmel J, Dhillon I, Dongarra J, Ostrouchov S, Petitet A, Stanley K, Walker D, Whaley RC. ScaLAPACK: A portable linear algebra library for distributed memory computers - Design issues and performance. Computer Physics Communications 1996; 97:1-15. (Also as LAPACK Working Note #95).
- (1996) Computer Physics Communications , vol.97 , pp. 1-15
- Choi, J.¹ Demmel, J.² Dhillon, I.³ Dongarra, J.⁴ Ostrouchov, S.⁵ Petitet, A.⁶ Stanley, K.⁷ Walker, D.⁸ Whaley, R.C.⁹

6
- 0343462141
- Automated empirical optimization of software and the ATLAS project
- 1-2:3-25
- Whaley RC, Petitet A, Dongarra J. Automated empirical optimization of software and the ATLAS project. Parallel Computing 2001; 27(1-2):3-25.
- (2001) Parallel Computing , vol.27
- Whaley, R.C.¹ Petitet, A.² Dongarra, J.³

7
- 34548762396
- High-performance implementation of the level-3 BLAS
- Technical Report TR-2006-23, Department of Computer Sciences, The University of Texas at Austin, FLAME Working Note 20
- Goto K, van de Geijn R. High-performance implementation of the level-3 BLAS. Technical Report TR-2006-23, Department of Computer Sciences, The University of Texas at Austin, 2006. FLAME Working Note 20.
- (2006)
- Goto, K.¹ van de Geijn, R.²

8
- 50249110532
- 24 July 2007
- http://www.intel.com/cd/software/products/asmo-na/eng/307757.htm [24 July 2007].

9
- 50249118960
- 24 July 2007
- http://developer.amd.com/acml.jsp [24 July 2007].

10
- 50249118105
- International Organization for Standardization. Informational Technology - Portable Operating System Interface (POSIX) - Part 1: System Application Program Interface (API) [C Language], ISO: Adr, 1996; 743. http://www.iso.ch/cate/d24426.html [24 July 2007].

11
- 0002806690
- OpenMP: An industry-standard API for shared-memory programming
- Dagum L, Menon R. OpenMP: An industry-standard API for shared-memory programming. IEEE Computational Science and Engineering 1998; 5(1):46-55.
- (1998) IEEE Computational Science and Engineering , vol.5 , Issue.1 , pp. 46-55
- Dagum, L.¹ Menon, R.²

12
- 84947808952
- Choi J, Dongarra J, Ostrouchov S, Petitet A, Walker DW, Clinton Whaley R. A proposal for a set of parallel basic linear algebra subprograms. PARA '95: Proceedings of the Second International Workshop on Applied Parallel Computing, Computations in Physics, Chemistry and Engineering Science, London, U.K., 1996. Springer: Berlin, 1996; 107-114.

13
- 50249129153
- Message Passing Interface Forum. MPI: A message-passing interface standard. The International Journal of Supercomputer Applications and High Performance Computing 1994; 8:165-414.

14
- 38049058008
- The impact of multicore on math software
- Proceedings of Workshop on State-of-the-art in Scientific and Parallel Computing Para06, Umeå, Sweden
- Buttari A, Dongarra J, Kurzak J, Langou J, Luszczek P, Tomov S. The impact of multicore on math software. Proceedings of Workshop on State-of-the-art in Scientific and Parallel Computing (Para06). Springer's Lecture Notes in Computer Science 4699, Umeå, Sweden, 2007; 1-10.
- (2007) Springer's Lecture Notes in Computer Science , vol.4699 , pp. 1-10
- Buttari, A.¹ Dongarra, J.² Kurzak, J.³ Langou, J.⁴ Luszczek, P.⁵ Tomov, S.⁶

15
- 35248843628
- Supermatrix out-of-order scheduling of matrix operations for SMP and multicore architectures
- New York, NY, U.S.A, ACM: New York
- Chan E, Quintana-Orti ES, Quintana-Orti G, van de Geijn R. Supermatrix out-of-order scheduling of matrix operations for SMP and multicore architectures. SPAA '07: Proceedings of the 19th Annual ACM Symposium on Parallel Algorithms and Architectures, New York, NY, U.S.A., 2007. ACM: New York, 2007; 116-125.
- (2007) SPAA '07: Proceedings of the 19th Annual ACM Symposium on Parallel Algorithms and Architectures , pp. 116-125
- Chan, E.¹ Quintana-Orti, E.S.² Quintana-Orti, G.³ van de Geijn, R.⁴

16
- 38049005629
- Implementing linear algebra routines on multicore processors with pipelining and a look ahead
- Proceedings of Workshop on State-of-the-art in Scientific and Parallel Computing Para06, Umeå, Sweden
- Kurzak J, Dongarra J. Implementing linear algebra routines on multicore processors with pipelining and a look ahead. Proceedings of Workshop on State-of-the-art in Scientific and Parallel Computing (Para06). Springer's Lecture Notes in Computer Science 4699, Umeå, Sweden, 2007; 147-156.
- (2007) Springer's Lecture Notes in Computer Science , vol.4699 , pp. 147-156
- Kurzak, J.¹ Dongarra, J.²

17
- 50249166476
- Solving systems of linear equations on the CELL processor using Cholesky factorization
- Technical Report UT-CS-07-596, Innovative Computing Laboratory, University of Tennessee, Knoxville, April
- Kurzak J, Buttari A, Dongarra J. Solving systems of linear equations on the CELL processor using Cholesky factorization. Technical Report UT-CS-07-596, Innovative Computing Laboratory, University of Tennessee, Knoxville, April 2007.
- (2007)
- Kurzak, J.¹ Buttari, A.² Dongarra, J.³

18
- 0034224207
- Applying recursion to serial and parallel QR factorization leads to better performance
- Elmroth E, Gustavson FG. Applying recursion to serial and parallel QR factorization leads to better performance. IBM Journal of Research and Development 2000; 44(4):605-624.
- (2000) IBM Journal of Research and Development , vol.44 , Issue.4 , pp. 605-624
- Elmroth, E.¹ Gustavson, F.G.²

19
- 0004236492
- 3rd edn, Johns Hopkins University Press: Baltimore, MD
- Golub G, Van Loan C. Matrix Computations (3rd edn). Johns Hopkins University Press: Baltimore, MD, 1996.
- Matrix Computations
- Golub, G.¹ Van Loan, C.²

20
- 0004094905
- 1st edn, SIAM: Philadelphia, PA
- Stewart GW. Matrix Algorithms (1st edn), vol. 1. SIAM: Philadelphia, PA, 1998.
- (1998) Matrix Algorithms , vol.1
- Stewart, G.W.¹

21
- 0003424374
- SIAM: Philadelphia, PA
- Trefethen LN, Bau D. Numerical Linear Algebra. SIAM: Philadelphia, PA, 1997.
- (1997) Numerical Linear Algebra
- Trefethen, L.N.¹ Bau, D.²

22
- 0003078924
- A storage-efficient WY representation for products of Householder transformations
- Schreiber R, van Loan C. A storage-efficient WY representation for products of Householder transformations. SIAM Journal on Scientific and Statistical Computing 1989; 10(1):53-57.
- (1989) SIAM Journal on Scientific and Statistical Computing , vol.10 , Issue.1 , pp. 53-57
- Schreiber, R.¹ van Loan, C.²

23
- 45449092245
- FORTRAN subroutines for out-of-core solutions of large complex linear systems
- Technical Report CR-159142, NASA, November
- Yip EL. FORTRAN subroutines for out-of-core solutions of large complex linear systems. Technical Report CR-159142, NASA, November 1979.
- (1979)
- Yip, E.L.¹

24
- 17644368925
- Parallel out-of-core computation and updating of the QR factorization
- Gunter BC, van de Geijn RA. Parallel out-of-core computation and updating of the QR factorization. ACM Transactions on Mathematical Software 2005; 31(1):60-78.
- (2005) ACM Transactions on Mathematical Software , vol.31 , Issue.1 , pp. 60-78
- Gunter, B.C.¹ van de Geijn, R.A.²

25
- 45449110534
- Updating an LU factorization with pivoting
- Technical Report TR-2006-42, Department of Computer Sciences, The University of Texas at Austin, FLAME Working Note 21
- Quintana-Orti E, van de Geijn R. Updating an LU factorization with pivoting. Technical Report TR-2006-42, Department of Computer Sciences, The University of Texas at Austin, 2006. FLAME Working Note 21.
- (2006)
- Quintana-Orti, E.¹ van de Geijn, R.²

26
- 0029358998
- A parallel algorithm for the reduction of a nonsymmetric matrix to block upper-Hessenberg form
- Berry MW, Dongarra JJ, Kim Y. A parallel algorithm for the reduction of a nonsymmetric matrix to block upper-Hessenberg form. Parallel Computing 1995; 21(8):1189-1211.
- (1995) Parallel Computing , vol.21 , Issue.8 , pp. 1189-1211
- Berry, M.W.¹ Dongarra, J.J.² Kim, Y.³

27
- 84947583789
- New generalized data structures for matrices lead to a variety of high performance algorithms
- London, U.K, Springer: Berlin, ISBN: 3-540-43792-4
- Gustavson FG. New generalized data structures for matrices lead to a variety of high performance algorithms. PPAM '01: Proceedings of the International Conference on Parallel Processing and Applied Mathematics - Revised Papers, London, U.K., 2002. Springer: Berlin, 2002; 418-436. ISBN: 3-540-43792-4.
- (2002) PPAM '01: Proceedings of the International Conference on Parallel Processing and Applied Mathematics - Revised Papers , pp. 418-436
- Gustavson, F.G.¹

28
- 50249182748
- SMP Superscalar (SMPSs) User's Manual, July 2007. www.bsc.es/media/1002.pdf [24 July 2007].

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.