SCOPUS Information Retrieval Platform

9th International Symposium on Parallel and Distributed Computing, ISPDC 2010

Volume , Issue , 2010, Pages 133-140

Resource-aware compiler prefetching for many-cores

(5) Caragea, George C. (a); Tzannes, Alexandros (a); Keceli, Fuat (a); Barua, Rajeev (a); Vishkin, Uzi (a)

(a) University of Maryland (United States)

Author keywords

Optimizing compilers; Parallel architectures

Indexed keywords

CACHE HIERARCHIES; CACHE MISS; COMPILER ALGORITHMS; HARDWARE AND SOFTWARE; MANY-CORE ARCHITECTURE; MEMORY LEVEL PARALLELISMS; OPTIMIZING COMPILERS; OUT-OF-ORDER PROCESSORS; PREFETCHES; PREFETCHING; PREFETCHING ALGORITHM; RESOURCE AWARE; STATE OF THE ART;

BUFFER STORAGE; OPTIMIZATION; PARALLEL ARCHITECTURES; PROGRAM COMPILERS;

ALGORITHMS;

EID: 77956435385 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ISPDC.2010.16 Document Type: Conference Paper

Times cited: 9

References (23)

1
- 40349098914
- Scalable cache miss handling for high memory-level parallelism
- Washington, DC, USA: IEEE Computer Society
- J. Tuck, L. Ceze, and J. Torrellas, "Scalable cache miss handling for high memory-level parallelism," in MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture. Washington, DC, USA: IEEE Computer Society, 2006, pp. 409-422.
- (2006) MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture , pp. 409-422
- Tuck, J.¹ Ceze, L.² Torrellas, J.³

2
- 77956424086
- Empirical evaluation of multi-socket, multi-core memory concurrency
- Renaissance Computing Institute January [Online]. Available
- A. Porterfield, R. Fowler, A. Mandel, and M. Y. Lim, "Empirical evaluation of multi-socket, multi-core memory concurrency," Renaissance Computing Institute, Tech. Rep. RENCI TR-09-01, January 2009. [Online]. Available: http://www.renci.org/publications/techreports/TR-09-01.pdf
- (2009) Tech. Rep. RENCI TR-09-01
- Porterfield, A.¹ Fowler, R.² Mandel, A.³ Lim, M.Y.⁴

3
- 0034823696
- Towards a first vertical prototyping of an extremely fine-grained parallel programming approach
- New York, NY, USA: ACM
- D. Naishlos, J. Nuzman, C.-W. Tseng, and U. Vishkin, "Towards a first vertical prototyping of an extremely fine-grained parallel programming approach," in SPAA '01: Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures. New York, NY, USA: ACM, 2001, pp. 93-102.
- (2001) SPAA '01: Proceedings of the Thirteenth Annual ACM Symposium on Parallel Algorithms and Architectures , pp. 93-102
- Naishlos, D.¹ Nuzman, J.² Tseng, C.-W.³ Vishkin, U.⁴

4
- 70449674958
- Brief announcement: Performance potential of an easy-to-program PRAM-on-chip prototype versus state-of-the-art processor
- New York, NY, USA: ACM
- G. C. Caragea, A. B. Saybasili, X. Wen, and U. Vishkin, "Brief announcement: performance potential of an easy-to-program PRAM-on-chip prototype versus state-of-the-art processor," in SPAA '09: Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures. New York, NY, USA: ACM, 2009, pp. 163-165.
- (2009) SPAA '09: Proceedings of the Twenty-First Annual Symposium on Parallelism in Algorithms and Architectures , pp. 163-165
- Caragea, G.C.¹ Saybasili, A.B.² Wen, X.³ Vishkin, U.⁴

5
- 56749182827
- FPGA-based prototype of a PRAM-on-chip processor
- New York, NY, USA: ACM
- X. Wen and U. Vishkin, "FPGA-based prototype of a PRAM-on-chip processor," in CF '08: Proceedings of the 2008 conference on Computing frontiers. New York, NY, USA: ACM, 2008, pp. 55-66.
- (2008) CF '08: Proceedings of the 2008 Conference on Computing Frontiers , pp. 55-66
- Wen, X.¹ Vishkin, U.²

6
- 77956435722
- General-purpose vs. GPU: Comparison of many-cores on irregular workloads
- USENIX, June
- G. C. Caragea, F. Keceli, A. Tzannes, and U. Vishkin, "General-purpose vs. GPU: Comparison of many-cores on irregular workloads," in HotPar '10: Proceedings of the 2nd Workshop on Hot Topics in Parallelism. USENIX, June 2010.
- (2010) HotPar '10: Proceedings of the 2nd Workshop on Hot Topics in Parallelism
- Caragea, G.C.¹ Keceli, F.² Tzannes, A.³ Vishkin, U.⁴

7
- 0026918402
- Design and evaluation of a compiler algorithm for prefetching
- T. C. Mowry, M. S. Lam, and A. Gupta, "Design and evaluation of a compiler algorithm for prefetching," SIGPLAN Not., vol. 27, no. 9, pp. 62-73, 1992.
- (1992) SIGPLAN Not. , vol.27 , Issue.9 , pp. 62-73
- Mowry, T.C.¹ Lam, M.S.² Gupta, A.³

8
- 46449113366
- Layout-accurate design and implementation of a high-throughput interconnection network for single-chip parallel processing
- A. O. Balkan, M. N. Horak, G. Qu, and U. Vishkin, "Layout-accurate design and implementation of a high-throughput interconnection network for single-chip parallel processing," hoti, pp. 21-28, 2007.
- (2007) Hoti , pp. 21-28
- Balkan, A.O.¹ Horak, M.N.² Qu, G.³ Vishkin, U.⁴

9
- 70349754483
- CRC Press ch. Models for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-On-Chip Platform
- U. Vishkin, G. C. Caragea, and B. C. Lee, Handbook of Parallel Computing: Models, Algorithms and Applications. CRC Press, 2007, ch. Models for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-On-Chip Platform.
- (2007) Handbook of Parallel Computing: Models, Algorithms and Applications
- Vishkin, U.¹ Caragea, G.C.² Lee, B.C.³

10
- 77952220812
- Is teaching parallel algorithmic thinking to high-school student possible? One teacher's experience
- Milwaukee, WI, March
- S. Torbert, U. Vishkin, R. Tzur, and D. Ellison, "Is teaching parallel algorithmic thinking to high-school student possible? one teacher's experience," in Proc. 41st ACM Technical Symposium on Computer Science Education (SIGCSE), Milwaukee, WI, March 2010.
- (2010) Proc. 41st ACM Technical Symposium on Computer Science Education (SIGCSE)
- Torbert, S.¹ Vishkin, U.² Tzur, R.³ Ellison, D.⁴

11
- 52049104934
- A pilot study to compare programming effort for two parallel programming models
- L. Hochstein, V. R. Basili, U. Vishkin, and J. Gilbert, "A pilot study to compare programming effort for two parallel programming models," Journal of Systems and Software, vol. 81, no. 11, pp. 1920 - 1930, 2008.
- (2008) Journal of Systems and Software , vol.81 , Issue.11 , pp. 1920-1930
- Hochstein, L.¹ Basili, V.R.² Vishkin, U.³ Gilbert, J.⁴

12
- 70449690168
- August
- "Software release of the explicit multi-threading (xmt) programming environment," http://www.umiacs.umd.edu/users/vishkin/XMT/sw-release.html, August 2008.
- (2008) Software Release of the Explicit Multi-Threading (xmt) Programming Environment

13
- 0020177251
- Cache memories
- A. J. Smith, "Cache memories," ACM Comput. Surv., vol. 14, no. 3, pp. 473-530, 1982.
- (1982) ACM Comput. Surv. , vol.14 , Issue.3 , pp. 473-530
- Smith, A.J.¹

14
- 0006674590
- Ph.D. dissertation, Rice University adviser-Ken Kennedy
- N. McIntosh, "Compiler support for software prefetching," Ph.D. dissertation, Rice University, 1998, adviser-Ken Kennedy.
- (1998) Compiler Support for Software Prefetching
- McIntosh, N.¹

15
- 84944799568
- Data access microarchitectures for superscalar processors with compiler-assisted data prefetching
- ACM Press
- W. Y. Chen, S. A. Mahlke, P. P. Chang, and W. M. W. Hwu, "Data access microarchitectures for superscalar processors with compiler-assisted data prefetching," in MICRO 24: Proceedings of the 24th annual international symposium on Microarchitecture. ACM Press, 1991.
- (1991) MICRO 24: Proceedings of the 24th Annual International Symposium on Microarchitecture
- Chen, W.Y.¹ Mahlke, S.A.² Chang, P.P.³ Hwu, W.M.W.⁴

16
- 0025429331
- Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers
- N. P. Jouppi, "Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," in ISCA '90: Proceedings of the 17th annual international symposium on Computer Architecture, 1990.
- (1990) ISCA '90: Proceedings of the 17th Annual International Symposium on Computer Architecture
- Jouppi, N.P.¹

17
- 0034818343
- Reducing DRAM latencies with an integrated memory hierarchy design
- W. F. Lin, S. K. Reinhardt, and D. Burger, "Reducing DRAM latencies with an integrated memory hierarchy design," hpca, vol. 00, p. 0301, 2001.
- (2001) HPCA , pp. 0301
- Lin, W.F.¹ Reinhardt, S.K.² Burger, D.³

18
- 84976656398
- Effective cache prefetching on bus-based multiprocessors
- D. M. Tullsen and S. J. Eggers, "Effective cache prefetching on bus-based multiprocessors," ACM Trans. Comput. Syst., vol. 13, no. 1, pp. 57-88, 1995.
- (1995) ACM Trans. Comput. Syst. , vol.13 , Issue.1 , pp. 57-88
- Tullsen, D.M.¹ Eggers, S.J.²

19
- 0031988272
- Tolerating latency in multiprocessors through compiler-inserted prefetching
- T. C. Mowry, "Tolerating latency in multiprocessors through compiler-inserted prefetching," ACM Trans. Comput. Syst., vol. 16, no. 1, pp. 55-92, 1998.
- (1998) ACM Trans. Comput. Syst. , vol.16 , Issue.1 , pp. 55-92
- Mowry, T.C.¹

20
- 0029341212
- Sequential hardware prefetching in shared-memory multiprocessors
- F. Dahlgren, M. Dubois, and P. Stenström, "Sequential hardware prefetching in shared-memory multiprocessors," IEEE Trans. Parallel Distrib. Syst., vol. 6, no. 7, pp. 733-746, 1995.
- (1995) IEEE Trans. Parallel Distrib. Syst. , vol.6 , Issue.7 , pp. 733-746
- Dahlgren, F.¹ Dubois, M.² Stenström, P.³

21
- 0004033521
- Ph.D. dissertation, Stanford, CA, USA
- T. C. Mowry, "Tolerating latency through software-controlled data prefetching," Ph.D. dissertation, Stanford, CA, USA, 1995.
- (1995) Tolerating Latency Through Software-Controlled Data Prefetching
- Mowry, T.C.¹

22
- 0026153646
- An architecture for software-controlled data prefetching
- ACM Press
- A. C. Klaiber and H. M. Levy, "An architecture for software-controlled data prefetching," in ISCA '91: Proceedings of the 18th annual international symposium on Computer architecture. ACM Press, 1991, pp. 43-53.
- (1991) ISCA '91: Proceedings of the 18th Annual International Symposium on Computer Architecture , pp. 43-53
- Klaiber, A.C.¹ Levy, H.M.²

23
- 33646497615
- Springer-Verlag ch. Improving the Performance of GCC by Exploiting IA-64 Architectural Features
- C. Yang, X. Yang, and J. Xue, Advances in Computer Systems Architecture. Springer-Verlag, 2005, vol. 3740/2005, ch. Improving the Performance of GCC by Exploiting IA-64 Architectural Features, pp. 236-251.
- (2005) Advances in Computer Systems Architecture , vol.3740 , Issue.2005 , pp. 236-251
- Yang, C.¹ Yang, X.² Xue, J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.