2009, Pages 71-80

Mapping the LU decomposition on a many-core architecture: Challenges and solutions

Author keywords

Load balancing; Local memory; LU decomposition; Multi core; Register tiling

Indexed keywords

ADAPTIVE LOAD DISTRIBUTION; LOCAL MEMORY; LU DECOMPOSITION; MANY-CORE ARCHITECTURE; MULTI CORE; MULTICORE ARCHITECTURES; PERFORMANCE POTENTIALS; REGISTER TILING

EID: 84885779509     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1531743.1531756     Document Type: Conference Paper
Times cited: 26

References (24)
  • 6
    • 84885772106
    • Clearspeed White Paper: CSX Processor Architecture. http://www.clearspeed.com/newsevents/presskit.
    • CSX Processor Architecture
  • 10
    • 0029324485
    • Software libraries for linear algebra computations on high performance computers
    • J. J. Dongarra and D. W. Walker. Software Libraries for Linear Algebra Computations on High Performance Computers. SIAM Review, 37(2):151-180, 1995.
    • (1995) SIAM Review , vol.37 , Issue.2 , pp. 151-180
    • Dongarra, J.J.1    Walker, D.W.2
  • 13
    • 0031273280
    • Recursion leads to automatic variable blocking for dense linear algebra algorithms
    • Nov.
    • F. G. Gustavson. Recursion leads to automatic variable blocking for dense linear algebra algorithms. IBM Journal of Research and Development, 41(6):737-753, Nov. 1997.
    • (1997) IBM Journal of Research and Development , vol.41 , Issue.6 , pp. 737-753
    • Gustavson, F.G.1
  • 16
    • 0003648799
    • The OpenMP implementation of NAS parallel benchmarks and its performance
    • NASA Ames Research Center
    • H. Jin, M. Frumkin, and J. Yan. The OpenMP Implementation of NAS Parallel Benchmarks and its Performance. Technical Report NAS-99-011, NASA Ames Research Center, 1999.
    • (1999) Technical Report NAS-99-011
    • Jin, H.1    Frumkin, M.2    Yan, J.3
  • 22
    • 35348879576
    • Technical Memo 75, Computer Architecture and Parallel Systems Laboratory, University of Delaware, Feb.
    • I. E. Venetis and G. R. Gao. Optimizing the LU Benchmark for the Cyclops-64 Architecture. Technical Memo 75, Computer Architecture and Parallel Systems Laboratory, University of Delaware, Feb. 2007. http://www.capsl.udel.edu/publications.shtml.
    • (2007) Optimizing the LU Benchmark for the Cyclops-64 Architecture
    • Venetis, I.E.1    Gao, G.R.2
  • 24
    • 35348812496
    • Synchronization state buffer: Supporting efficient fine-grain synchronization on many-core architectures
    • San Diego, California, USA, June
    • W. Zhu, V. C. Sreedhar, Z. Hu, and G. R. Gao. Synchronization State Buffer: Supporting Efficient Fine-Grain Synchronization on Many-Core Architectures. In Proceedings of the 34th International Symposium on Computer Architecture, pages 35-45, San Diego, California, USA, June 2007.
    • (2007) Proceedings of the 34th International Symposium on Computer Architecture , pp. 35-45
    • Zhu, W.1    Sreedhar, V.C.2    Hu, Z.3    Gao, G.R.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.