Volume 47, Issue 4, 2010, Pages 878-919

The Cache-Oblivious Gaussian Elimination Paradigm: Theoretical Framework, Parallelization and Experimental Evaluation

Author keywords

All pairs shortest path; Cache oblivious; Gaussian elimination; Matrix multiplication; Parallel; Tiling

Indexed keywords

ALL-PAIRS SHORTEST PATHS; CACHE-OBLIVIOUS; GAUSSIAN ELIMINATION; MATRIX MULTIPLICATION; PARALLEL; TILING;

EID: 77956615160     PISSN: 14324350     EISSN: 14330490     Source Type: Journal    
DOI: 10.1007/s00224-010-9273-8     Document Type: Article
Times cited: 36

References (34)
  • 2
    • 0024082546
    • Aggarwal, A., Vitter, J.: The input/output complexity of sorting and related problems. Commun. ACM 31(9), 1116-1127 (1988).
  • 4
    • 58449090994
    • Blelloch, G., Chowdhury, R., Gibbons, P., Ramachandran, V., Chen, S., Kozuch, M.: Provably good multicore cache performance for divide-and-conquer algorithms. In: Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, pp. 501-510 (2008).
  • 5
    • 8344240379
    • Blelloch, G., Gibbons, P.: Effectively sharing a cache among threads. In: Proceedings of the 16th ACM Symposium on Parallelism in Algorithms and Architectures, Barcelona, Spain, pp. 235-244 (2004).
  • 6
    • 0030387154
    • Blumofe, R., Frigo, M., Joerg, C., Leiserson, C., Randall, K.: An analysis of DAG-consistent distributed shared-memory algorithms. In: Proceedings of the 8th ACM Symposium on Parallel Algorithms and Architectures, pp. 297-308 (1996).
  • 7
    • 0032659795
    • Chatterjee, S., Lebeck, A., Patnala, P., Thottethodi, M.: Recursive array layouts and fast parallel matrix multiplication. In: Proceedings of the 11th ACM Symposium on Parallel Algorithms and Architectures, pp. 222-231 (1999).
  • 8
    • 33244497406
    • Chowdhury, R., Ramachandran, V.: Cache-oblivious dynamic programming. In: Proceedings of the 17th ACM-SIAM Symposium on Discrete Algorithms, Miami, Florida, pp. 591-600 (2006).
  • 9
    • 35248831668
    • Chowdhury, R., Ramachandran, V.: The cache-oblivious Gaussian Elimination Paradigm: Theoretical framework, parallelization and experimental evaluation. In: Proceedings of the 19th ACM Symposium on Parallelism in Algorithms and Architectures, San Diego, California, pp. 71-80 (2007).
  • 10
    • 57349161938
    • Chowdhury, R., Ramachandran, V.: Cache-efficient dynamic programming algorithms for multicores. In: Proceedings of the 20th ACM Symposium on Parallelism in Algorithms and Architectures, Munich, Germany, pp. 207-216 (2008).
  • 11
    • 77954024841
    • Chowdhury, R., Silvestri, F., Blakeley, B., Ramachandran, V.: Oblivious algorithms for multicores and network of processors. In: Proceedings of the 24th IEEE International Parallel and Distributed Processing Symposium, Atlanta, Georgia, April 2010.
  • 13
    • 33846864717
    • D'Alberto, P., Nicolau, A.: R-Kleene: a high-performance divide-and-conquer algorithm for the all-pair shortest path for densely connected networks. Algorithmica 47(2), 203-213 (2007).
  • 15
    • 84945709831
    • Floyd, R.: Algorithm 97 (SHORTEST PATH). Commun. ACM 5(6), 345 (1962).
  • 16
    • 0033350255
    • Frigo, M., Leiserson, C., Prokop, H., Ramachandran, S.: Cache-oblivious algorithms. In: Proceedings of the 40th Annual Symposium on Foundations of Computer Science, pp. 285-297 (1999).
  • 17
    • 0031622953
    • Frigo, M., Leiserson, C., Randall, K.: The implementation of the Cilk-5 multithreaded language. In: Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Montreal, Canada, pp. 212-223 (1998).
  • 18
    • 33749564381
    • Frigo, M., Strumpen, V.: The cache complexity of multithreaded cache oblivious algorithms. In: Proceedings of the 18th ACM Symposium on Parallelism in Algorithms and Architectures, Cambridge, Massachusetts, pp. 271-280 (2006).
  • 19
    • 84888274568
    • Fujitsu MAP3147NC/NP MAP3735NC/NP MAP3367NC/NP disk drives product/maintenance manual.
  • 20
    • 84888281279
    • Goto, K.: GotoBLAS (2005). http://www.tacc.utexas.edu/resources/software
  • 22
    • 84971853043
    • Hong, J., Kung, H.: I/O complexity: the red-blue pebble game. In: Proceedings of the 13th Annual ACM Symposium on Theory of Computing, pp. 326-333 (1981).
  • 24
    • 0000650782
    • Knuth, D.: Two notes on notation. Am. Math. Mon. 99, 403-422 (1992).
  • 26
    • 34547953706
    • Pan, S., Cherng, C., Dick, K., Ladner, R.: Algorithms to take advantage of hardware prefetching. In: Proceedings of the 9th Workshop on Algorithm Engineering and Experiments, pp. 91-98 (2007).
  • 27
    • 4544352521
    • Park, J., Penner, M., Prasanna, V.: Optimizing graph algorithms for improved cache performance. IEEE Trans. Parallel Distrib. Syst. 15(9), 769-782 (2004).
  • 28
    • 0343462141
    • Powell, D., Allison, L., Dix, T.: Automated empirical optimization of software and the ATLAS project. Parallel Comput. 27(1-2), 3-35 (2001). http://math-atlas.sourceforge.net
  • 30
    • 0031496750
    • Toledo, S.: Locality of reference in LU decomposition with partial pivoting. SIAM J. Matrix Anal. Appl. 18(4), 1065-1081 (1997).
  • 31
    • 84945708259
    • Warshall, S.: A theorem on boolean matrices. J. ACM 9(1), 11-12 (1962).
  • 32
    • 85013942562
    • Wolf, M., Lam, M.: A data locality optimizing algorithm. In: Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation, pp. 30-44 (1991).
  • 33
    • 84888279563
    • Womble, D., Greenberg, D., Wheat, S., Riesen, R.: Beyond core: Making parallel computer I/O practical. In: Proceedings of the 1993 DAGS/PC Symposium, pp. 56-63 (1993).
  • 34
    • 35248846531
    • Yotov, K., Roeder, T., Pingali, K., Gunnels, J., Gustavson, F.: An experimental comparison of cache-oblivious and cache-aware programs. In: Proceedings of the 19th ACM Symposium on Parallelism in Algorithms and Architectures, San Diego, California, pp. 93-104 (2007).


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.