SCOPUS Information Retrieval Platform

Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms

Volume , Issue , 2008, Pages 501-510

Provably good multicore cache performance for divide-and-conquer algorithms

(6) Blelloch, Guy E. a; Chowdhury, Rezaul A. b; Gibbons, Phillip B. c; Ramachandran, Vijaya b; Chen, Shimin c; Kozuch, Michael c

a CARNEGIE MELLON UNIVERSITY (United States)

b UNIVERSITY OF TEXAS AT AUSTIN (United States)

c INTEL CORPORATION (United States)

Author keywords

[No Author keywords available]

Indexed keywords

CACHE COMPLEXITY; CACHE PERFORMANCE; MULTI CORE PROCESSORS; MULTI CORES; TIME COMPLEXITIES;

ALGORITHMS; SCHEDULING; SEQUENTIAL SWITCHING;

PARALLEL ALGORITHMS;

EID: 58449090994 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (99)

References (30)

1
- 84868887384
- www.sun.com/processors/UltraSPARC-T1/, 2007.
- (2007)

2
- 84868882190
- www.tilera.com, 2007.
- (2007)

3
- 84868880821
- Intel shows off 80-core processor. www.news.com/2100-1006-3-6158181.html, 2007.
- (2007) Intel shows off 80-core processor

4
- 0036590708
- The data locality of work stealing
- Springer
- U. A. Acar, G. E. Blelloch, and R. D. Blumofe. The data locality of work stealing. Theory of Computing Systems, 35(3), 2002. Springer.
- (2002) Theory of Computing Systems , vol.35 , Issue.3
- Acar, U.A.¹ Blelloch, G.E.² Blumofe, R.D.³

5
- 58449115196
- A model for hierarchical memory
- A. Aggarwal, B. Alpern, A. Chandra, and M. Snir. A model for hierarchical memory. In ACM STOC, 1987.
- (1987) ACM STOC
- Aggarwal, A.¹ Alpern, B.² Chandra, A.³ Snir, M.⁴

6
- 0024082546
- The input/output complexity of sorting and related problems
- A. Aggarwal and J. S. Vitter. The input/output complexity of sorting and related problems. Communications of the ACM, 31(9), 1988.
- (1988) Communications of the ACM , vol.31 , Issue.9
- Aggarwal, A.¹ Vitter, J.S.²

7
- 0023451961
- Optimal parallel merging and sorting without memory conflicts
- S. Akl and N. Santoro. Optimal parallel merging and sorting without memory conflicts. IEEE Transactions on Computers, 36(11), 1987.
- (1987) IEEE Transactions on Computers , vol.36 , Issue.11
- Akl, S.¹ Santoro, N.²

8
- 0028483922
- The uniform memory hierarchy model of computation
- Springer
- B. Alpern, L. Carter, E. Feig, and T. Selker. The uniform memory hierarchy model of computation. Algorithmica, 12(2/3), 1994. Springer.
- (1994) Algorithmica , vol.12 , Issue.2-3
- Alpern, B.¹ Carter, L.² Feig, E.³ Selker, T.⁴

9
- 0033722744
- Piranha: A scalable architecture based on single-chip multiprocessing
- L. A. Barroso, K. Gharachorloo, R. McNamara, A. Nowatzyk, S. Qadeer, B. Sano, S. Smith, R. Stets, and B. Verghese. Piranha: A scalable architecture based on single-chip multiprocessing. In ACM ISCA, 2000.
- (2000) ACM ISCA
- Barroso, L.A.¹ Gharachorloo, K.² McNamara, R.³ Nowatzyk, A.⁴ Qadeer, S.⁵ Sano, B.⁶ Smith, S.⁷ Stets, R.⁸ Verghese, B.⁹

10
- 35248813384
- Optimal sparse matrix dense vector multiplication in the I/O-model
- M. A. Bender, G. S. Brodal, R. Fagerberg, R. Jacob, and E. Vicari. Optimal sparse matrix dense vector multiplication in the I/O-model. In ACM SPAA, 2007.
- (2007) ACM SPAA
- Bender, M.A.¹ Brodal, G.S.² Fagerberg, R.³ Jacob, R.⁴ Vicari, E.⁵

11
- 58449118654
- Concurrent cache-oblivious B-trees
- M. A. Bender, J. T. Fineman, S. Gilbert, and B. C. Kuszmaul. Concurrent cache-oblivious B-trees. In ACM SPAA, 2005.
- (2005) ACM SPAA
- Bender, M.A.¹ Fineman, J.T.² Gilbert, S.³ Kuszmaul, B.C.⁴

12
- 8344240379
- Effectively sharing a cache among threads
- G. E. Blelloch and P. B. Gibbons. Effectively sharing a cache among threads. In ACM SPAA, 2004.
- (2004) ACM SPAA
- Blelloch, G.E.¹ Gibbons, P.B.²

13
- 0003575841
- Provably efficient scheduling for languages with fine-grained parallelism
- G. E. Blelloch, P. B. Gibbons, and Y. Matias. Provably efficient scheduling for languages with fine-grained parallelism. Journal of the ACM, 46(2), 1999.
- (1999) Journal of the ACM , vol.46 , Issue.2
- Blelloch, G.E.¹ Gibbons, P.B.² Matias, Y.³

14
- 0030707347
- Space-efficient scheduling of parallelism with synchronization variables
- G. E. Blelloch, P. B. Gibbons, Y. Matias, and G. J. Narlikar. Space-efficient scheduling of parallelism with synchronization variables. In ACM SPAA, 1997.
- (1997) ACM SPAA
- Blelloch, G.E.¹ Gibbons, P.B.² Matias, Y.³ Narlikar, G.J.⁴

15
- 0030387154
- An analysis of dag-consistent distributed shared-memory algorithms
- R. D. Blumofe, M. Frigo, C. F. Joerg, C. E. Leiserson, and K. H. Randall. An analysis of dag-consistent distributed shared-memory algorithms. In ACM SPAA, 1996.
- (1996) ACM SPAA
- Blumofe, R.D.¹ Frigo, M.² Joerg, C.F.³ Leiserson, C.E.⁴ Randall, K.H.⁵

16
- 0016046965
- The parallel evaluation of general arithmetic expressions
- R. Brent. The parallel evaluation of general arithmetic expressions. Journal of the ACM, 21:201-206, 1974.
- (1974) Journal of the ACM , vol.21 , pp. 201-206
- Brent, R.¹

17
- 35248843628
- SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures
- E. Chan, E. S. Quintana-Orti, G. Quintana-Orti, and R. van de Geijn. SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures. In ACM SPAA, 2007.
- (2007) ACM SPAA
- Chan, E.¹ Quintana-Orti, E.S.² Quintana-Orti, G.³ van de Geijn, R.⁴

18
- 35248852476
- Scheduling threads for constructive cache sharing on CMPs
- S. Chen, P. B. Gibbons, M. Kozuch, V. Liaskovitis, A. Ailamaki, G. E. Blelloch, B. Falsafi, L. Fix, N. Hardavellas, T. C. Mowry, and C. Wilkerson. Scheduling threads for constructive cache sharing on CMPs. In ACM SPAA, 2007.
- (2007) ACM SPAA
- Chen, S.¹ Gibbons, P.B.² Kozuch, M.³ Liaskovitis, V.⁴ Ailamaki, A.⁵ Blelloch, G.E.⁶ Falsafi, B.⁷ Fix, L.⁸ Hardavellas, N.⁹ Mowry, T.C.¹⁰ Wilkerson, C.¹¹

19
- 58449123296
- The cache-oblivious Gaussian elimination paradigm: Theoretical framework, parallelization and experimental evaluation
- R. Chowdhury and V. Ramachandran. The cache-oblivious Gaussian elimination paradigm: Theoretical framework, parallelization and experimental evaluation. In ACM SPAA, 2007.
- (2007) ACM SPAA
- Chowdhury, R.¹ Ramachandran, V.²

20
- 34548334096
- Performance of multithreaded chip multiprocessors and implications for operating system design
- A. Fedorova, M. Seltzer, C. Small, and D. Nussbaum. Performance of multithreaded chip multiprocessors and implications for operating system design. In USENIX Ann. Tech. Conf., 2005.
- (2005) USENIX Ann. Tech. Conf
- Fedorova, A.¹ Seltzer, M.² Small, C.³ Nussbaum, D.⁴

21
- 58449121093
- Cache-oblivious algorithms
- M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms. In IEEE FOCS, 1999.
- (1999) IEEE FOCS
- Frigo, M.¹ Leiserson, C.E.² Prokop, H.³ Ramachandran, S.⁴

22
- 58449120511
- The cache complexity of multithreaded cache oblivious algorithms
- M. Frigo and V. Strumpen. The cache complexity of multithreaded cache oblivious algorithms. In ACM SPAA, 2006.
- (2006) ACM SPAA
- Frigo, M.¹ Strumpen, V.²

23
- 58449128303
- Sorting in parallel external-memory multicores
- M. T. Goodrich, M. Nelson, and N. Sitchinava. Sorting in parallel external-memory multicores. Technical report, U.C. Irvine, 2007.

24
- 0033880036
- The Stanford Hydra CMP
- L. Hammond, B. A. Hubbert, M. Siu, M. K. Prabhu, M. Chen, and K. Olukotun. The Stanford Hydra CMP. IEEE Micro, 20(2), 2000.
- (2000) IEEE Micro , vol.20 , Issue.2
- Hammond, L.¹ Hubbert, B.A.² Siu, M.³ Prabhu, M.K.⁴ Chen, M.⁵ Olukotun, K.⁶

25
- 0031235242
- A single-chip multiprocessor
- L. Hammond, B. Nayfeh, and K. Olukotun. A single-chip multiprocessor. IEEE Computer, 30(9), 1997.
- (1997) IEEE Computer , vol.30 , Issue.9
- Hammond, L.¹ Nayfeh, B.² Olukotun, K.³

26
- 0018457301
- A separator theorem for planar graphs
- R. J. Lipton and R. E. Tarjan. A separator theorem for planar graphs. SIAM Journal on Applied Mathematics, 36(2), 1979.
- (1979) SIAM Journal on Applied Mathematics , vol.36 , Issue.2
- Lipton, R.J.¹ Tarjan, R.E.²

27
- 0036489340
- Scheduling threads for low space requirement and good locality
- Springer
- G. J. Narlikar. Scheduling threads for low space requirement and good locality. Theory of Computing Systems, 35(2), 2002. Springer.
- (2002) Theory of Computing Systems , vol.35 , Issue.2
- Narlikar, G.J.¹

28
- 0029666647
- Evaluation of design alternatives for a multiprocessor microprocessor
- B. A. Nayfeh, L. Hammond, and K. Olukotun. Evaluation of design alternatives for a multiprocessor microprocessor. In ACM ISCA, 1996.
- (1996) ACM ISCA
- Nayfeh, B.A.¹ Hammond, L.² Olukotun, K.³

29
- 34250487811
- Gaussian elimination is not optimal
- Springer
- V. Strassen. Gaussian elimination is not optimal. Numerische Mathematik, 13(4), 1969. Springer.
- (1969) Numerische Mathematik , vol.13 , Issue.4
- Strassen, V.¹

30
- 34548030923
- Thread clustering: Sharing-aware scheduling on SMP-CMP-SMT multiprocessors
- D. Tam, R. Azimi, and M. Stumm. Thread clustering: Sharing-aware scheduling on SMP-CMP-SMT multiprocessors. In ACM EuroSys, 2007.
- (2007) ACM EuroSys
- Tam, D.¹ Azimi, R.² Stumm, M.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.