2007, Pages 71-80

The cache-oblivious Gaussian elimination paradigm: Theoretical framework, parallelization and experimental evaluation

Author keywords

All pairs shortest path; Cache oblivious algorithm; Gaussian elimination; Matrix multiplication; Tiling

Indexed keywords

CACHE MEMORY; COMPUTER SOFTWARE PORTABILITY; PARALLEL ALGORITHMS

EID: 35248831668; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: 10.1145/1248377.1248392; Document Type: Conference Paper
Times cited: 20

References (26)
  • 1
    • 8344240379
    • Effectively sharing a cache among threads
    • G. Blelloch and P. Gibbons. Effectively sharing a cache among threads. Proc. SPAA, pp. 235-244, 2004.
    • (2004) Proc. SPAA, pp. 235-244
    • Blelloch, G.; Gibbons, P.
  • 2
    • 0030387154
    • An analysis of DAG-consistent distributed shared-memory algorithms
    • R. Blumofe, M. Frigo, C. Joerg, C. Leiserson, and K. Randall. An analysis of DAG-consistent distributed shared-memory algorithms. Proc. SPAA, pp. 297-308, 1996.
    • (1996) Proc. SPAA, pp. 297-308
    • Blumofe, R.; Frigo, M.; Joerg, C.; Leiserson, C.; Randall, K.
  • 3
    • 35248843628
    • SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures
    • E. Chan, E. Quintana-Orti, G. Quintana-Orti, and R. van de Geijn. SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures. Proc. SPAA, 2007.
    • (2007) Proc. SPAA
    • Chan, E.; Quintana-Orti, E.; Quintana-Orti, G.; van de Geijn, R.
  • 4
    • 0032659795
    • Recursive array layouts and fast parallel matrix multiplication
    • S. Chatterjee, A. Lebeck, P. Patnala, and M. Thottethodi. Recursive array layouts and fast parallel matrix multiplication. Proc. SPAA, pp. 222-231, 1999.
    • (1999) Proc. SPAA, pp. 222-231
    • Chatterjee, S.; Lebeck, A.; Patnala, P.; Thottethodi, M.
  • 6
    • 33244497406
    • Cache-oblivious dynamic programming
    • R. Chowdhury and V. Ramachandran. Cache-oblivious dynamic programming. Proc. SODA, pp. 591-600, 2006.
    • (2006) Proc. SODA, pp. 591-600
    • Chowdhury, R.; Ramachandran, V.
  • 7
    • 35248860442
    • R. Chowdhury and V. Ramachandran. The cache-oblivious Gaussian elimination paradigm: theoretical framework and experimental evaluation. CS TR-06-04, UT Austin, 2006.
  • 8
    • 33846864717
    • R-Kleene: A high-performance divide-and-conquer algorithm for the all-pair shortest path for densely connected networks
    • P. D'Alberto and A. Nicolau. R-Kleene: a high-performance divide-and-conquer algorithm for the all-pair shortest path for densely connected networks. Algorithmica, 47(2):203-213, 2007.
    • (2007) Algorithmica, vol. 47, issue 2, pp. 203-213
    • D'Alberto, P.; Nicolau, A.
  • 9
    • 27144501107
    • STXXL: Standard template library for XXL data sets
    • R. Dementiev, L. Kettner, and P. Sanders. STXXL: Standard template library for XXL data sets. Proc. ESA, pp. 640-651, 2005.
    • (2005) Proc. ESA, pp. 640-651
    • Dementiev, R.; Kettner, L.; Sanders, P.
  • 10
    • 84945709831
    • Algorithm 97 (SHORTEST PATH)
    • R. Floyd. Algorithm 97 (SHORTEST PATH). CACM, 5(6):345, 1962.
    • (1962) CACM, vol. 5, issue 6, p. 345
    • Floyd, R.
  • 12
    • 0031622953
    • The implementation of the Cilk-5 multithreaded language
    • M. Frigo, C. Leiserson, and K. Randall. The implementation of the Cilk-5 multithreaded language. Proc. PLDI, pp. 212-223, 1998.
    • (1998) Proc. PLDI, pp. 212-223
    • Frigo, M.; Leiserson, C.; Randall, K.
  • 13
    • 33749564381
    • The cache-complexity of multithreaded cache-oblivious algorithms
    • M. Frigo and V. Strumpen. The cache-complexity of multithreaded cache-oblivious algorithms. Proc. SPAA, pp. 271-280, 2006.
    • (2006) Proc. SPAA, pp. 271-280
    • Frigo, M.; Strumpen, V.
  • 14
    • 0039435412
    • FLAME: Formal linear algebra methods environment
    • J. Gunnels, F. Gustavson, G. Henry, and R. van de Geijn. FLAME: Formal linear algebra methods environment. ACM TOMS, 27(4):422-455, 2001.
    • (2001) ACM TOMS, vol. 27, issue 4, pp. 422-455
    • Gunnels, J.; Gustavson, F.; Henry, G.; van de Geijn, R.
  • 15
    • 35248823889
    • K. Goto. GotoBLAS, 2005. http://www.tacc.utexas.edu/resources/software.
    • (2005)
    • Goto, K.
  • 17
    • 35248900168
    • MAP3147NC/NP~MAP3735NC/NP MAP3367NC/NP disk drives product/maintenance manual. http://www.fujitsu.com/downloads/COMP/fcpa/hdd/.
  • 19
    • 34547953706
    • Algorithms to take advantage of hardware prefetching
    • S. Pan, C. Cherng, K. Dick, and R. Ladner. Algorithms to take advantage of hardware prefetching. Proc. ALENEX, pp. 91-98, 2007.
    • (2007) Proc. ALENEX, pp. 91-98
    • Pan, S.; Cherng, C.; Dick, K.; Ladner, R.
  • 20
    • 4544352521
    • Optimizing graph algorithms for improved cache performance
    • J. Park, M. Penner, and V. Prasanna. Optimizing graph algorithms for improved cache performance. IEEE TPDS, 15(9):769-782, 2004.
    • (2004) IEEE TPDS, vol. 15, issue 9, pp. 769-782
    • Park, J.; Penner, M.; Prasanna, V.
  • 22
    • 84945708259
    • A theorem on boolean matrices
    • S. Warshall. A theorem on boolean matrices. JACM, 9(1):11-12, 1962.
    • (1962) JACM, vol. 9, issue 1, pp. 11-12
    • Warshall, S.
  • 24
    • 84976827033
    • A data locality optimizing algorithm
    • M. Wolf and M. Lam. A data locality optimizing algorithm. Proc. PLDI, pp. 30-44, 1991.
    • (1991) Proc. PLDI, pp. 30-44
    • Wolf, M.; Lam, M.
  • 25
    • 0343462141
    • Automated empirical optimization of software and the ATLAS project
    • R. Whaley, A. Petitet, and J. Dongarra. Automated empirical optimization of software and the ATLAS project. Parallel Computing, 27(1-2):3-35, 2001.
    • (2001) Parallel Computing, vol. 27, issue 1-2, pp. 3-35
    • Whaley, R.; Petitet, A.; Dongarra, J.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.