Volume 27, Issue 4, 2012, Pages 277-287

Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency

Author keywords

Dense linear algebra; Energy efficiency; Multicore architectures; Power profile; Tile algorithms

Indexed keywords

BLOCK ALGORITHM; DENSE LINEAR ALGEBRA; MULTICORE ARCHITECTURES; PARALLEL PERFORMANCE; POWER PROFILE; POWER PROFILING; SUB-MATRICES; TASK PARALLELISM

EID: 84868119278     PISSN: 18652034     EISSN: 18652042     Source Type: Journal    
DOI: 10.1007/s00450-011-0191-z     Document Type: Article
Times cited : (17)

References (21)
  • 3
    • 77958512320
    • Energy efficiency of mixed precision iterative refinement methods using hybrid hardware platforms: an evaluation of different solver and hardware configurations
    • 10.1007/s00450-010-0124-2
    • Anzt H, Rocker B, Heuveline V (2010) Energy efficiency of mixed precision iterative refinement methods using hybrid hardware platforms: an evaluation of different solver and hardware configurations. Comput Sci 25(3-4):141-148. doi: 10.1007/s00450-010-0124-2
    • (2010) Comput Sci , vol.25 , Issue.3-4 , pp. 141-148
    • Anzt, H.1    Rocker, B.2    Heuveline, V.3
  • 4
    • 77958509771
    • A new energy aware performance metric
    • 10.1007/s00450-010-0119-z
    • Bekas C, Curioni A (2010) A new energy aware performance metric. Comput Sci 25(3-4):187-195. doi: 10.1007/s00450-010-0119-z
    • (2010) Comput Sci , vol.25 , Issue.3-4 , pp. 187-195
    • Bekas, C.1    Curioni, A.2
  • 5
    • 0012881041
    • Algorithm 807: The SBR toolbox - software for successive band reduction
    • 1941086 10.1145/365723.365736 http://doi.acm.org/10.1145/365723.365736
    • Bischof CH, Lang B, Sun X (2000) Algorithm 807: the SBR toolbox - software for successive band reduction. ACM Trans Math Softw 26(4):602-616. http://doi.acm.org/10.1145/365723.365736
    • (2000) ACM Trans Math Softw , vol.26 , Issue.4 , pp. 602-616
    • Bischof, C.H.1    Lang, B.2    Sun, X.3
  • 6
    • 35548933706
    • Mixed precision iterative refinement techniques for the solution of dense linear systems
    • 10.1177/1094342007084026
    • Buttari A, Dongarra J, Langou J, Langou J, Luszczek P, Kurzak J (2007) Mixed precision iterative refinement techniques for the solution of dense linear systems. Int J High Perform Comput Appl 21(4):457-466. doi: 10.1177/1094342007084026
    • (2007) Int J High Perform Comput Appl , vol.21 , Issue.4 , pp. 457-466
    • Buttari, A.1    Dongarra, J.2    Langou, J.3    Langou, J.4    Luszczek, P.5    Kurzak, J.6
  • 7
    • 58149269099
    • A class of parallel tiled linear algebra algorithms for multicore architectures
    • 2492567 10.1016/j.parco.2008.10.002
    • Buttari A, Langou J, Kurzak J, Dongarra J (2009) A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Comput 35(1):38-53
    • (2009) Parallel Comput , vol.35 , Issue.1 , pp. 38-53
    • Buttari, A.1    Langou, J.2    Kurzak, J.3    Dongarra, J.4
  • 8
    • 33746318690
    • Reducing power with performance constraints for parallel sparse applications
    • IEEE Comput Soc Los Alamitos http://doi.ieeecomputersociety.org/10.1109/IPDPS.2005.378
    • Chen G, Malkowski K, Kandemir MT, Raghavan P (2005) Reducing power with performance constraints for parallel sparse applications. In: IPDPS. IEEE Comput Soc, Los Alamitos. http://doi.ieeecomputersociety.org/10.1109/IPDPS.2005.378
    • (2005) IPDPS
    • Chen, G.1    Malkowski, K.2    Kandemir, M.T.3    Raghavan, P.4
  • 9
    • 51049107284
    • Towards energy efficient scaling of scientific codes
    • IEEE Press New York 10.1109/IPDPS.2008.4536217
    • Ding Y, Malkowski K, Raghavan P, Kandemir MT (2008) Towards energy efficient scaling of scientific codes. In: IPDPS. IEEE Press, New York, pp 1-8. doi: 10.1109/IPDPS.2008.4536217
    • (2008) IPDPS , pp. 1-8
    • Ding, Y.1    Malkowski, K.2    Raghavan, P.3    Kandemir, M.T.4
  • 11
    • 77950629423
    • Powerpack: Energy profiling and analysis of high-performance systems and applications
    • 10.1109/TPDS.2009.76
    • Ge R, Feng X, Song S, Chang HC, Li D, Cameron KW (2010) Powerpack: Energy profiling and analysis of high-performance systems and applications. IEEE Trans Parallel Distrib Syst PDS-21(5):658-671
    • (2010) IEEE Trans Parallel Distrib Syst , vol.21 , Issue.5 , pp. 658-671
    • Ge, R.1    Feng, X.2    Song, S.3    Chang, H.C.4    Li, D.5    Cameron, K.W.6
  • 12
    • 0004236492
    • 3 Johns Hopkins studies in the mathematical sciences Johns Hopkins University Press Baltimore
    • Golub GH, Van Loan CF (1996) Matrix computations, 3rd edn. Johns Hopkins studies in the mathematical sciences. Johns Hopkins University Press, Baltimore
    • (1996) Matrix Computations
    • Golub, G.H.1    Van Loan, C.F.2
  • 13
    • 80054983967
    • Blocked algorithms for the reduction to Hessenberg-triangular form revisited
    • 1157.65348 10.1007/s10543-008-0180-1
    • Kågström B, Kressner D, Quintana-Ortí E, Quintana-Ortí G (2008) Blocked algorithms for the reduction to Hessenberg-triangular form revisited. BIT Numer Math 48:563-584
    • (2008) BIT Numer Math , vol.48 , pp. 563-584
    • Kågström, B.1    Kressner, D.2    Quintana-Ortí, E.3    Quintana-Ortí, G.4
  • 14
    • 33746618548
    • Just in time dynamic voltage scaling: Exploiting inter-node slack to save energy in MPI programs
    • IEEE Comput Soc Los Alamitos http://doi.acm.org/10.1145/1105760.1105797
    • Kappiah N, Freeh VW, Lowenthal DK (2005) Just in time dynamic voltage scaling: exploiting inter-node slack to save energy in MPI programs. In: SC. IEEE Comput Soc, Los Alamitos, p 33. http://doi.acm.org/10.1145/1105760.1105797
    • (2005) SC , pp. 33
    • Kappiah, N.1    Freeh, V.W.2    Lowenthal, D.K.3
  • 16
    • 84868144129
    • High performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures
    • submitted
    • Ltaief H, Luszczek P, Dongarra J (2011, submitted) High performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures. ACM Trans Math Softw
    • (2011) ACM Trans Math Softw
    • Ltaief, H.1    Luszczek, P.2    Dongarra, J.3
  • 17
    • 80053252490
    • Two-stage tridiagonal reduction for dense symmetric matrices using tile algorithms on multicore architectures
    • ACM Anchorage
    • Luszczek P, Ltaief H, Dongarra J (2011) Two-stage tridiagonal reduction for dense symmetric matrices using tile algorithms on multicore architectures. In: Proceedings of IPDPS 2011. ACM, Anchorage
    • (2011) Proceedings of IPDPS 2011
    • Luszczek, P.1    Ltaief, H.2    Dongarra, J.3
  • 19
    • 13444302326
    • The free lunch is over: A fundamental turn toward concurrency in software
    • Sutter H (2005) The free lunch is over: a fundamental turn toward concurrency in software. Dr Dobb's Journal 30(3). http://www.ddj.com/184405990
    • (2005) Dr Dobb's Journal , vol.30 , Issue.3
    • Sutter, H.1
  • 20
    • 0003424374
    • SIAM Philadelphia 0874.65013 10.1137/1.9780898719574 http://www.siam.org/books/OT50/Index.htm
    • Trefethen LN, Bau D (1997) Numerical linear algebra. SIAM, Philadelphia. http://www.siam.org/books/OT50/Index.htm
    • (1997) Numerical Linear Algebra
    • Trefethen, L.N.1    Bau, D.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.