SCOPUS Information Search Platform

Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011

Volume , Issue , 2011, Pages 944-955

Two-stage tridiagonal reduction for dense symmetric matrices using tile algorithms on multicore architectures

(3) Luszczek, Piotr (a); Ltaief, Hatem (a); Dongarra, Jack (a)

(a) University of Tennessee (United States)

Author keywords

Bulge Chasing; Scheduling; Tile Algorithms; Translation Layer; Tridiagonal Reduction

Indexed keywords

BIDIAGONAL; BULGE CHASING; CHOLESKY FACTORIZATIONS; DATA LOCALITY; EFFICIENT IMPLEMENTATION; MATRIX; MATRIX SIZE; MULTICORE ARCHITECTURES; MULTITHREADED; NUMERICAL LIBRARY; NUMERICAL SOFTWARE; PERFORMANCE DATA; PROCESSOR-MEMORY; RESEARCH PROBLEMS; RUNTIME SYSTEMS; SPECTRAL DECOMPOSITION; SYMMETRIC MATRICES; TRIDIAGONAL; TWO STAGE;

DISTRIBUTED PARAMETER NETWORKS; ECONOMIC ANALYSIS; FACTORIZATION; LINEAR TRANSFORMATIONS; MEMORY ARCHITECTURE; SCHEDULING ALGORITHMS; SOFTWARE ARCHITECTURE;

MATRIX ALGEBRA;

EID: 80053252490 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/IPDPS.2011.91 Document Type: Conference Paper

Times cited : (34)

References (43)

1
- 0347702154
- 35th ed., June The report can be downloaded from
- H. W. Meuer, E. Strohmaier, J. J. Dongarra, and H. D. Simon, TOP500 Supercomputer Sites, 35th ed., June 2010, (The report can be downloaded from http://www.netlib.org/benchmark/top500.html).
- (2010) TOP500 Supercomputer Sites
- Meuer, H.W.¹ Strohmaier, E.² Dongarra, J.J.³ Simon, H.D.⁴

2
- 35648995516
- Electrical Engineering and Computer Sciences University of California at Berkeley, Tech. Rep. Technical Report No. UCB/EECS-2006-183, December 18
- K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands, K. Keutzer, D. A. Patterson, W. L. Plishker, J. Shalf, S. W. Williams, and K. A. Yelick, "The Landscape of Parallel Computing Research: A View from Berkeley," Electrical Engineering and Computer Sciences University of California at Berkeley, Tech. Rep. Technical Report No. UCB/EECS-2006-183, December 18 2006.
- (2006) The Landscape of Parallel Computing Research: A View from Berkeley
- Asanovic, K.¹ Bodik, R.² Catanzaro, B.C.³ Gebis, J.J.⁴ Husbands, P.⁵ Keutzer, K.⁶ Patterson, D.A.⁷ Plishker, W.L.⁸ Shalf, J.⁹ Williams, S.W.¹⁰ Yelick, K.A.¹¹

3
- 80053269009
- University of Tennessee, November
- PLASMA Users' Guide, Parallel Linear Algebra Software for Multicore Architectures, Version 2.2, University of Tennessee, November 2009.
- (2009) PLASMA Users' Guide, Parallel Linear Algebra Software for Multicore Architectures, Version 2.2

4
- 80053278288
- April
- "The FLAME project," April 2010, http://z.cs.utexas.edu/wiki/flame.wiki/FrontPage.
- (2010) The FLAME Project

5
- 0004236492
- 3rd ed. Baltimore, MD: Johns Hopkins University Press
- G. Golub and C. van Loan, Matrix Computations, 3rd ed. Baltimore, MD: Johns Hopkins University Press, 1996.
- (1996) Matrix Computations
- Golub, G.¹ Van Loan, C.²

6
- 0003424374
- Philadelphia, PA
- L. N. Trefethen and D. Bau, Numerical Linear Algebra. Philadelphia, PA: SIAM, 1997, http://www.siam.org/books/OT50/Index.htm.
- (1997) Numerical Linear Algebra
- Trefethen, L.N.¹ Bau, D.²

7
- 80053289864
- ParaGauss: The Density Functional Program ParaGauss for Complex Systems in Chemistry
- Springer, DOI: 10.1007/3-540-28555-5-25.
- N. Rösch, S. Krüger, V. Nasluzov, and A. Matveev, "ParaGauss: The Density Functional Program ParaGauss for Complex Systems in Chemistry," in High Performance Computing in Science and Engineering, Garching 2004, part III. Springer, 2005, pp. 285-296, DOI: 10.1007/3-540-28555-5-25.
- (2005) High Performance Computing in Science and Engineering, Garching 2004 , Issue.PART III , pp. 285-296
- Rösch, N.¹ Krüger, S.² Nasluzov, V.³ Matveev, A.⁴

8
- 0542421948
- The solution of large dense generalized eigenvalue problems on the Cray X-MP/24 with SSD
- April [Online]. Available
- R. Grimes, H. Krakauer, J. Lewis, H. Simon, and S.-H. Wei, "The solution of large dense generalized eigenvalue problems on the Cray X-MP/24 with SSD," J. Comput. Phys., vol. 69, pp. 471-481, April 1987. [Online]. Available: http://portal.acm.org/citation.cfm?id=32855.32865
- (1987) J. Comput. Phys. , vol.69 , pp. 471-481
- Grimes, R.¹ Krakauer, H.² Lewis, J.³ Simon, H.⁴ Wei, S.-H.⁵

9
- 85188489071
- Cambridge University Press
- R. M. Martin, "Electronic structure: Basic theory and practical methods," Cambridge University Press, 2008.
- (2008) Electronic Structure: Basic Theory and Practical Methods
- Martin, R.M.¹

10
- 57949083229
- A dependency-aware task-based programming environment for multi-core architectures
- J. Perez, R. Badia, and J. Labarta, "A dependency-aware task-based programming environment for multi-core architectures," in Cluster Computing, 2008 IEEE International Conference on, 29 2008-oct. 1 2008, pp. 142-151.
- Cluster Computing, 2008 IEEE International Conference on, 29 2008-Oct. 1 2008 , pp. 142-151
- Perez, J.¹ Badia, R.² Labarta, J.³

11
- 0003706460
- 3rd ed. Philadelphia: Society for Industrial and Applied Mathematics
- E. Anderson, Z. Bai, C. Bischof, S. L. Blackford, J. W. Demmel, J. J. Dongarra, J. D. Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. C. Sorensen, LAPACK User's Guide, 3rd ed. Philadelphia: Society for Industrial and Applied Mathematics, 1999.
- (1999) LAPACK User's Guide
- Anderson, E.¹ Bai, Z.² Bischof, C.³ Blackford, S.L.⁴ Demmel, J.W.⁵ Dongarra, J.J.⁶ Croz, J.D.⁷ Greenbaum, A.⁸ Hammarling, S.⁹ McKenney, A.¹⁰ Sorensen, D.C.¹¹

12
- 0003615167
- Philadelphia: Society for Industrial and Applied Mathematics
- L. S. Blackford, J. Choi, A. Cleary, E. F. D'Azevedo, J. W. Demmel, I. S. Dhillon, J. J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. W. Walker, and R. C. Whaley, ScaLAPACK Users' Guide. Philadelphia: Society for Industrial and Applied Mathematics, 1997.
- (1997) ScaLAPACK Users' Guide
- Blackford, L.S.¹ Choi, J.² Cleary, A.³ D'Azevedo, E.F.⁴ Demmel, J.W.⁵ Dhillon, I.S.⁶ Dongarra, J.J.⁷ Hammarling, S.⁸ Henry, G.⁹ Petitet, A.¹⁰ Stanley, K.¹¹ Walker, D.W.¹² Whaley, R.C.¹³

13
- 0030244536
- The design and implementation of the ScaLAPACK LU, QR, and Cholesky factorization routines
- J. Choi, J. J. Dongarra, S. Ostrouchov, A. Petitet, D. W. Walker, and R. C. Whaley, "The design and implementation of the ScaLAPACK LU, QR, and Cholesky factorization routines," Scientific Programming, vol. 5, pp. 173-184, 1996.
- (1996) Scientific Programming , vol.5 , pp. 173-184
- Choi, J.¹ Dongarra, J.J.² Ostrouchov, S.³ Petitet, A.⁴ Walker, D.W.⁵ Whaley, R.C.⁶

14
- 0012881041
- Algorithm 807: The SBR Toolbox - Software for successive band reduction
- C. H. Bischof, B. Lang, and X. Sun, "Algorithm 807: The SBR Toolbox - software for successive band reduction," ACM Trans. Math. Softw., vol. 26, no. 4, pp. 602-616, 2000.
- (2000) ACM Trans. Math. Softw. , vol.26 , Issue.4 , pp. 602-616
- Bischof, C.H.¹ Lang, B.² Sun, X.³

15
- 77952650268
- "Intel, Math Kernel Library (MKL)," http://www.intel.com/software/products/mkl/.
- Intel, Math Kernel Library (MKL)

16
- 80053261524
- T. A. Davis and S. Rajamanickam, "PIRO BAND, Pipelined Plane Rotations for Blocked Band Reduction."
- PIRO BAND, Pipelined Plane Rotations for Blocked Band Reduction
- Davis, T.A.¹ Rajamanickam, S.²

17
- 80054983967
- Blocked Algorithms for the Reduction to Hessenberg-Triangular Form Revisited
- B. Kågström, D. Kressner, E. Quintana-Orti, and G. Quintana-Orti, "Blocked Algorithms for the Reduction to Hessenberg-Triangular Form Revisited," BIT Numerical Mathematics, vol. 48, pp. 563-584, 2008.
- (2008) BIT Numerical Mathematics , vol.48 , pp. 563-584
- Kågström, B.¹ Kressner, D.² Quintana-Orti, E.³ Quintana-Orti, G.⁴

18
- 77957870814
- Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing
- vol. DOI information: 10.1016/j.parco.2010.06.001
- S. Tomov, R. Nath, and J. Dongarra, "Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing," Parallel Computing, vol. DOI information: 10.1016/j.parco.2010.06.001, 2010.
- (2010) Parallel Computing
- Tomov, S.¹ Nath, R.² Dongarra, J.³

19
- 77955109739
- Reduction to Condensed Forms for Symmetric Eigenvalue Problems on Multi-core Architectures
- P. Bientinesi, F. Igual, D. Kressner, and E. Quintana-Orti, "Reduction to Condensed Forms for Symmetric Eigenvalue Problems on Multi-core Architectures," Parallel Processing and Applied Mathematics, pp. 387-395, 2010.
- (2010) Parallel Processing and Applied Mathematics , pp. 387-395
- Bientinesi, P.¹ Igual, F.² Kressner, D.³ Quintana-Orti, E.⁴

20
- 50249105132
- Parallel Tiled QR Factorization for Multicore Architectures
- DOI: 10.1002/cpe.1301.
- A. Buttari, J. Langou, J. Kurzak, and J. J. Dongarra, "Parallel Tiled QR Factorization for Multicore Architectures," Concurrency Computat.: Pract. Exper., vol. 20, no. 13, pp. 1573-1590, 2008, http://dx.doi.org/10.1002/cpe.1301 DOI: 10.1002/cpe.1301.
- (2008) Concurrency Computat.: Pract. Exper. , vol.20 , Issue.13 , pp. 1573-1590
- Buttari, A.¹ Langou, J.² Kurzak, J.³ Dongarra, J.J.⁴

21
- 58149269099
- A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures
- DOI: 10.1016/j.parco.2008.10.002
- -, "A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures," Parallel Comput. Syst. Appl., vol. 35, pp. 38-53, 2009, http://dx.doi.org/10.1016/j.parco.2008.10.002 DOI: 10.1016/j.parco.2008.10.002.
- (2009) Parallel Comput. Syst. Appl. , vol.35 , pp. 38-53
- Buttari, A.¹ Langou, J.² Kurzak, J.³ Dongarra, J.J.⁴

22
- 48849086742
- Updating an LU Factorization with Pivoting
- DOI: 10.1145/1377612.1377615
- E. S. Quintana-Ortí and R. A. van de Geijn, "Updating an LU Factorization with Pivoting," ACM Trans. Math. Softw., vol. 35, no. 2, p. 11, 2008, http://doi.acm.org/10.1145/1377612.1377615 DOI: 10.1145/1377612.1377615.
- (2008) ACM Trans. Math. Softw. , vol.35 , Issue.2 , pp. 11
- Quintana-Ortí, E.S.¹ Van De Geijn, R.A.²

23
- 49349111725
- Solving systems of linear equation on the CELL processor using Cholesky factorization
- DOI: 10.1109/TPDS.2007.70813
- J. Kurzak, A. Buttari, and J. J. Dongarra, "Solving systems of linear equation on the CELL processor using Cholesky factorization," Trans. Parallel Distrib. Syst., vol. 19, no. 9, pp. 1175-1186, 2008, http://dx.doi.org/10.1109/TPDS.2007.70813 DOI: 10.1109/TPDS.2007.70813.
- (2008) Trans. Parallel Distrib. Syst. , vol.19 , Issue.9 , pp. 1175-1186
- Kurzak, J.¹ Buttari, A.² Dongarra, J.J.³

24
- 80053238375
- QR factorization for the CELL processor
- DOI: 10.3233/SPR-2008-0268
- J. Kurzak and J. J. Dongarra, "QR factorization for the CELL processor," Scientific Programming, vol. 17, pp. 1-12, 2008, http://dx.doi.org/10.3233/SPR-2008-0268 DOI: 10.3233/SPR-2008-0268.
- (2008) Scientific Programming , vol.17 , pp. 1-12
- Kurzak, J.¹ Dongarra, J.J.²

25
- 74049090446
- Comparative study of one-sided factorizations with multiple software packages on multi-core hardware
- New York, NY, USA: ACM
- E. Agullo, B. Hadri, H. Ltaief, and J. Dongarra, "Comparative study of one-sided factorizations with multiple software packages on multi-core hardware," in SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis. New York, NY, USA: ACM, 2009, pp. 1-12.
- (2009) SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis , pp. 1-12
- Agullo, E.¹ Hadri, B.² Ltaief, H.³ Dongarra, J.⁴

26
- 74049122130
- Parallel block hessenberg reduction using algorithms-by-tiles for multicore architectures revisited
- UT-CS-08-624 also
- H. Ltaief, J. Kurzak, and J. Dongarra, "Parallel block hessenberg reduction using algorithms-by-tiles for multicore architectures revisited," UT-CS-08-624 (also LAPACK Working Note 208), 2008.
- (2008) LAPACK Working Note 208
- Ltaief, H.¹ Kurzak, J.² Dongarra, J.³

27
- 0010898812
- Ablex Publishing Corp
- J. A. Sharp, Ed., Data flow computing: theory and practice. Ablex Publishing Corp, 1992.
- (1992) Data Flow Computing: Theory and Practice
- Sharp, J.A.¹

28
- 33244454775
- A Taxonomy of Workflow Management Systems for Grid Computing
- J. Yu and R. Buyya, "A Taxonomy of Workflow Management Systems for Grid Computing," Journal of Grid Computing, 2005.
- (2005) Journal of Grid Computing
- Yu, J.¹ Buyya, R.²

29
- 46149113240
- Workflow global computing with yml
- O. Delannoy, N. Emad, and S. Petiton, "Workflow global computing with yml," in 7th IEEE/ACM International Conference on Grid Computing, september 2006.
- 7th IEEE/ACM International Conference on Grid Computing, September 2006
- Delannoy, O.¹ Emad, N.² Petiton, S.³

30
- 38049058008
- The Impact of Multicore on Math Software
- Applied Parallel Computing. State of the Art in Scientific Computing, 8th International Workshop, PARA, ser. B. Kågström, E. Elmroth, J. Dongarra, and J. Wasniewski, Eds., Springer
- A. Buttari, J. Dongarra, J. Kurzak, J. Langou, P. Luszczek, and S. Tomov, "The Impact of Multicore on Math Software," in Applied Parallel Computing. State of the Art in Scientific Computing, 8th International Workshop, PARA, ser. Lecture Notes in Computer Science, B. Kågström, E. Elmroth, J. Dongarra, and J. Wasniewski, Eds., vol. 4699. Springer, 2006, pp. 1-10.
- (2006) Lecture Notes in Computer Science , vol.4699 , pp. 1-10
- Buttari, A.¹ Dongarra, J.² Kurzak, J.³ Langou, J.⁴ Luszczek, P.⁵ Tomov, S.⁶

31
- 35248843628
- Supermatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures
- New York, NY, USA: ACM
- E. Chan, E. S. Quintana-Orti, G. Quintana-Orti, and R. van de Geijn, "Supermatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures," in SPAA '07: Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures. New York, NY, USA: ACM, 2007, pp. 116-125.
- (2007) SPAA '07: Proceedings of the Nineteenth Annual ACM Symposium on Parallel Algorithms and Architectures , pp. 116-125
- Chan, E.¹ Quintana-Orti, E.S.² Quintana-Orti, G.³ Van De Geijn, R.⁴

32
- 77953997924
- Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
- E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, and S. Tomov, "Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects," Journal of Physics: Conference Series, vol. 180, 2009.
- (2009) Journal of Physics: Conference Series , vol.180
- Agullo, E.¹ Demmel, J.² Dongarra, J.³ Hadri, B.⁴ Kurzak, J.⁵ Langou, J.⁶ Ltaief, H.⁷ Luszczek, P.⁸ Tomov, S.⁹

33
- 68249112512
- HMPP: A hybrid multicore parallel programming environment
- R. Dolbeau, S. Bihan, and F. Bodin, "HMPP: A hybrid multicore parallel programming environment," in Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 2007), 2007.
- Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 2007), 2007
- Dolbeau, R.¹ Bihan, S.² Bodin, F.³

34
- 70350641505
- StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
- Euro-Par 2009 Euro-par'09 Proceedings, ser. Delft Pays-Bas, [Online]. Available
- C. Augonnet, S. Thibault, R. Namyst, and P.-A. Wacrenier, "StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures," in Euro-Par 2009 Euro-par'09 Proceedings, ser. LNCS, Delft Pays-Bas, 2009. [Online]. Available: http://hal.inria.fr/inria-00384363/en/
- (2009) LNCS
- Augonnet, C.¹ Thibault, S.² Namyst, R.³ Wacrenier, P.-A.⁴

35
- 74049102092
- Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems
- New York, NY, USA: ACM, DOI: 10.1145/1654059.1654079
- F. Song, A. YarKhan, and J. Dongarra, "Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems," in SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis. New York, NY, USA: ACM, 2009, pp. 1-11, http://doi.acm.org/10.1145/1654059.1654079 DOI: 10.1145/1654059.1654079.
- (2009) SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis , pp. 1-11
- Song, F.¹ YarKhan, A.² Dongarra, J.³

36
- 0035266229
- Automatic Parallelization Techniques Based on Compact DAG Extraction and Symbolic Scheduling
- [Online]. Available: http://hal.inria.fr/inria-00000278/en
- M. Cosnard and E. Jeannot, "Automatic Parallelization Techniques Based on Compact DAG Extraction and Symbolic Scheduling," Parallel Processing Letters, vol. 11, pp. 151-168, 2001. [Online]. Available: http://dx.doi.org/10.1142/S012962640100049X http://hal.inria.fr/inria-00000278/en/
- (2001) Parallel Processing Letters , vol.11 , pp. 151-168
- Cosnard, M.¹ Jeannot, E.²

37
- 33646107115
- Automatic blocking of QR and LU factorizations for locality
- Q. Yi, K. Kennedy, H. You, K. Seymour, and J. Dongarra, "Automatic blocking of QR and LU factorizations for locality," in 2nd ACM SIGPLAN Workshop on Memory System Performance (MSP 2004), 2004.
- 2nd ACM SIGPLAN Workshop on Memory System Performance (MSP 2004), 2004
- Yi, Q.¹ Kennedy, K.² You, H.³ Seymour, K.⁴ Dongarra, J.⁵

38
- 0003635251
- Cambridge, MA: MIT Press
- A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam, PVM: Parallel Virtual Machine. A Users' Guide and Tutorial for Networked Parallel Computing. Cambridge, MA: MIT Press, 1994.
- (1994) PVM: Parallel Virtual Machine. A Users' Guide and Tutorial for Networked Parallel Computing
- Geist, A.¹ Beguelin, A.² Dongarra, J.³ Jiang, W.⁴ Manchek, R.⁵ Sunderam, V.⁶

39
- 0003710740
- Cambridge, MA: MIT Press
- M. Snir, S. W. Otto, S. Huss-Lederman, D. W. Walker, and J. J. Dongarra, MPI: The Complete Reference. Cambridge, MA: MIT Press, 1996.
- (1996) MPI: The Complete Reference
- Snir, M.¹ Otto, S.W.² Huss-Lederman, S.³ Walker, D.W.⁴ Dongarra, J.J.⁵

40
- 0001439335
- MPI: A Message-Passing Interface Standard
- M. P. I. Forum
- M. P. I. Forum, "MPI: A Message-Passing Interface Standard," The International Journal of Supercomputer Applications and High Performance Computing, vol. 8, 1994.
- (1994) The International Journal of Supercomputer Applications and High Performance Computing , vol.8

41
- 0003413672
- available at
- -, "MPI: A Message-Passing Interface Standard (version 1.1)," 1995, available at: http://www.mpi-forum.org/.
- (1995) MPI: A Message-Passing Interface Standard (Version 1.1)
- Snir, M.¹ Otto, S.W.² Huss-Lederman, S.³ Walker, D.W.⁴ Dongarra, J.J.⁵

42
- 0003604499
- 18 Jul. available at
- -, "MPI-2: Extensions to the Message-Passing Interface," 18 Jul. 1997, available at http://www.mpi-forum.org/docs/mpi-20.ps.
- (1997) MPI-2: Extensions to the Message-Passing Interface
- Snir, M.¹ Otto, S.W.² Huss-Lederman, S.³ Walker, D.W.⁴ Dongarra, J.J.⁵

43
- 77951935506
- Scheduling two-sided transformations using tile algorithms on multicore architectures
- H. Ltaief, J. Kurzak, J. Dongarra, and R. M. Badia, "Scheduling two-sided transformations using tile algorithms on multicore architectures," Sci. Program., vol. 18, no. 1, pp. 35-50, 2010.
- (2010) Sci. Program. , vol.18 , Issue.1 , pp. 35-50
- Ltaief, H.¹ Kurzak, J.² Dongarra, J.³ Badia, R.M.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.