-
1
-
-
0033645154
-
The data locality of work stealing
-
New York: ACM
-
Acar, U. A., Blelloch, G. E., Blumofe, R. D.: The data locality of work stealing. In: Proceedings of the 12th Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 1-12. ACM, New York (2000).
-
(2000)
Proceedings of the 12th Annual ACM Symposium on Parallel Algorithms and Architectures
, pp. 1-12
-
-
Acar, U.A.1
Blelloch, G.E.2
Blumofe, R.D.3
-
2
-
-
0024082546
-
The input/output complexity of sorting and related problems
-
Aggarwal, A., Vitter, J.: The input/output complexity of sorting and related problems. Commun. ACM 31(9), 1116-1127 (1988).
-
(1988)
Commun. ACM
, vol.31
, Issue.9
, pp. 1116-1127
-
-
Aggarwal, A.1
Vitter, J.2
-
4
-
-
58449090994
-
-
Blelloch, G., Chowdhury, R., Gibbons, P., Ramachandran, V., Chen, S., Kozuch, M.: Provably good multicore cache performance for divide-and-conquer algorithms. In: Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, pp. 501-510 (2008).
-
-
-
-
5
-
-
8344240379
-
-
Blelloch, G., Gibbons, P.: Effectively sharing a cache among threads. In: Proceedings of the 16th ACM Symposium on Parallelism in Algorithms and Architectures, Barcelona, Spain, pp. 235-244 (2004).
-
-
-
-
6
-
-
0030387154
-
-
Blumofe, R., Frigo, M., Joerg, C., Leiserson, C., Randall, K.: An analysis of DAG-consistent distributed shared-memory algorithms. In: Proceedings of the 8th ACM Symposium on Parallel Algorithms and Architectures, pp. 297-308 (1996).
-
-
-
-
7
-
-
0032659795
-
-
Chatterjee, S., Lebeck, A., Patnala, P., Thottethodi, M.: Recursive array layouts and fast parallel matrix multiplication. In: Proceedings of the 11th ACM Symposium on Parallel Algorithms and Architectures, pp. 222-231 (1999).
-
-
-
-
8
-
-
33244497406
-
-
Chowdhury, R., Ramachandran, V.: Cache-oblivious dynamic programming. In: Proceedings of the 17th ACM-SIAM Symposium on Discrete Algorithms, Miami, Florida, pp. 591-600 (2006).
-
-
-
-
9
-
-
35248831668
-
-
Chowdhury, R., Ramachandran, V.: The cache-oblivious Gaussian Elimination Paradigm: Theoretical framework, parallelization and experimental evaluation. In: Proceedings of the 19th ACM Symposium on Parallelism in Algorithms and Architectures, San Diego, California, pp. 71-80 (2007).
-
-
-
-
10
-
-
57349161938
-
-
Chowdhury, R., Ramachandran, V.: Cache-efficient dynamic programming algorithms for multicores. In: Proceedings of the 20th ACM Symposium on Parallelism in Algorithms and Architectures, Munich, Germany, pp. 207-216 (2008).
-
-
-
-
11
-
-
77954024841
-
-
Chowdhury, R., Silvestri, F., Blakeley, B., Ramachandran, V.: Oblivious algorithms for multicores and network of processors. In: Proceedings of the 24th IEEE International Parallel and Distributed Processing Symposium, Atlanta, Georgia, April 2010.
-
-
-
-
12
-
-
0004116989
-
-
2nd edn., Cambridge: MIT Press
-
Cormen, T., Leiserson, C., Rivest, R., Stein, C.: Introduction to Algorithms, 2nd edn. MIT Press, Cambridge (2001).
-
(2001)
Introduction to Algorithms
-
-
Cormen, T.1
Leiserson, C.2
Rivest, R.3
Stein, C.4
-
13
-
-
33846864717
-
R-Kleene: A high-performance divide-and-conquer algorithm for the all-pair shortest path for densely connected networks
-
D'Alberto, P., Nicolau, A.: R-Kleene: a high-performance divide-and-conquer algorithm for the all-pair shortest path for densely connected networks. Algorithmica 47(2), 203-213 (2007).
-
(2007)
Algorithmica
, vol.47
, Issue.2
, pp. 203-213
-
-
D'Alberto, P.1
Nicolau, A.2
-
14
-
-
27144501107
-
STXXL: Standard template library for XXL data sets
-
LNCS, Berlin: Springer
-
Dementiev, R., Kettner, L., Sanders, P.: STXXL: Standard template library for XXL data sets. In: Proceedings of the 13th Annual European Symposium on Algorithms. LNCS, vol. 1004, pp. 640-651. Springer, Berlin (2005).
-
(2005)
Proceedings of the 13th Annual European Symposium on Algorithms
, vol.1004
, pp. 640-651
-
-
Dementiev, R.1
Kettner, L.2
Sanders, P.3
-
15
-
-
84945709831
-
Algorithm 97 (SHORTEST PATH)
-
Floyd, R.: Algorithm 97 (SHORTEST PATH). Commun. ACM 5(6), 345 (1962).
-
(1962)
Commun. ACM
, vol.5
, Issue.6
, pp. 345
-
-
Floyd, R.1
-
16
-
-
0033350255
-
-
Frigo, M., Leiserson, C., Prokop, H., Ramachandran, S.: Cache-oblivious algorithms. In: Proceedings of the 40th Annual Symposium on Foundations of Computer Science, pp. 285-297 (1999).
-
-
-
-
17
-
-
0031622953
-
-
Frigo, M., Leiserson, C., Randall, K.: The implementation of the Cilk-5 multithreaded language. In: Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Montreal, Canada, pp. 212-223 (1998).
-
-
-
-
18
-
-
33749564381
-
-
Frigo, M., Strumpen, V.: The cache complexity of multithreaded cache oblivious algorithms. In: Proceedings of the 18th ACM Symposium on Parallelism in Algorithms and Architectures, Cambridge, Massachusetts, pp. 271-280 (2006).
-
-
-
-
19
-
-
84888274568
-
-
Fujitsu MAP3147NC/NP MAP3735NC/NP MAP3367NC/NP disk drives product/maintenance manual.
-
-
-
-
20
-
-
84888281279
-
-
Goto, K.: GotoBLAS (2005). http://www.tacc.utexas.edu/resources/software.
-
(2005)
GotoBLAS
-
-
Goto, K.1
-
21
-
-
0039435412
-
FLAME: Formal linear algebra methods environment
-
Gunnels, J., Gustavson, F., Henry, G., van de Geijn, R.: FLAME: Formal linear algebra methods environment. ACM Trans. Math. Softw. 27(4), 422-455 (2001).
-
(2001)
ACM Trans. Math. Softw.
, vol.27
, Issue.4
, pp. 422-455
-
-
Gunnels, J.1
Gustavson, F.2
Henry, G.3
van de Geijn, R.4
-
22
-
-
84971853043
-
-
Hong, J., Kung, H.: I/O complexity: the red-blue pebble game. In: Proceedings of the 13th Annual ACM Symposium on Theory of Computing, pp. 326-333 (1981).
-
-
-
-
24
-
-
0000650782
-
Two notes on notation
-
Knuth, D.: Two notes on notation. Am. Math. Mon. 99, 403-422 (1992).
-
(1992)
Am. Math. Mon.
, vol.99
, pp. 403-422
-
-
Knuth, D.1
-
26
-
-
34547953706
-
-
Pan, S., Cherng, C., Dick, K., Ladner, R.: Algorithms to take advantage of hardware prefetching. In: Proceedings of the 9th Workshop on Algorithm Engineering and Experiments, pp. 91-98 (2007).
-
-
-
-
27
-
-
4544352521
-
Optimizing graph algorithms for improved cache performance
-
Park, J., Penner, M., Prasanna, V.: Optimizing graph algorithms for improved cache performance. IEEE Trans. Parallel Distrib. Syst. 15(9), 769-782 (2004).
-
(2004)
IEEE Trans. Parallel Distrib. Syst.
, vol.15
, Issue.9
, pp. 769-782
-
-
Park, J.1
Penner, M.2
Prasanna, V.3
-
28
-
-
0343462141
-
Automated empirical optimization of software and the ATLAS project
-
Powell, D., Allison, L., Dix, T.: Automated empirical optimization of software and the ATLAS project. Parallel Comput. 27(1-2), 3-35 (2001). http://math-atlas.sourceforge.net.
-
(2001)
Parallel Comput.
, vol.27
, Issue.1-2
, pp. 3-35
-
-
Powell, D.1
Allison, L.2
Dix, T.3
-
30
-
-
0031496750
-
Locality of reference in LU decomposition with partial pivoting
-
Toledo, S.: Locality of reference in LU decomposition with partial pivoting. SIAM J. Matrix Anal. Appl. 18(4), 1065-1081 (1997).
-
(1997)
SIAM J. Matrix Anal. Appl.
, vol.18
, Issue.4
, pp. 1065-1081
-
-
Toledo, S.1
-
31
-
-
84945708259
-
A theorem on boolean matrices
-
Warshall, S.: A theorem on boolean matrices. J. ACM 9(1), 11-12 (1962).
-
(1962)
J. ACM
, vol.9
, Issue.1
, pp. 11-12
-
-
Warshall, S.1
-
32
-
-
85013942562
-
-
Wolf, M., Lam, M.: A data locality optimizing algorithm. In: Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation, pp. 30-44 (1991).
-
-
-
-
33
-
-
84888279563
-
-
Womble, D., Greenberg, D., Wheat, S., Riesen, R.: Beyond core: Making parallel computer I/O practical. In: Proceedings of the 1993 DAGS/PC Symposium, pp. 56-63 (1993).
-
-
-
-
34
-
-
35248846531
-
-
Yotov, K., Roeder, T., Pingali, K., Gunnels, J., Gustavson, F.: An experimental comparison of cache-oblivious and cache-aware programs. In: Proceedings of the 19th ACM Symposium on Parallelism in Algorithms and Architectures, San Diego, California, pp. 93-104 (2007).
-
-
-