Volume 49, Issue 2, 2016, Pages

Survey of recent prefetching techniques for processor caches

Author keywords

Cache pollution; Classification; Data prefetching; Hardware (HW) prefetching; Helper thread prefetching; Instruction prefetching; Review; Software (SW) prefetching; Speculative pre-execution

Indexed keywords

CLASSIFICATION (OF INFORMATION); ENERGY EFFICIENCY; REVIEWS;

EID: 84965105240     PISSN: 03600300     EISSN: 15577341     Source Type: Journal    
DOI: 10.1145/2907071     Document Type: Article
Times cited : (90)

References (108)
  • 2
    • 34547676257
    • Interactions between compression and prefetching in chip multiprocessors
    • Alaa R. Alameldeen and David A. Wood. 2007. Interactions between compression and prefetching in chip multiprocessors. In HPCA. 228-239.
    • (2007) HPCA , pp. 228-239
    • Alameldeen, A.R.1    Wood, D.A.2
  • 3
    • 84857828566
    • ABS: A low-cost adaptive controller for prefetching in a banked shared last-level cache
    • 2012
    • Jorge Albericio, Rubén Gran, Pablo Ibánez, Víctor Viñals, and Jose María Llabería. 2012. ABS: A low-cost adaptive controller for prefetching in a banked shared last-level cache. ACM Trans. Arch. Code Opt. 8, 4 (2012), 19.
    • (2012) ACM Trans. Arch. Code Opt , vol.8 , Issue.4 , pp. 19
    • Albericio, J.1    Gran, R.2    Ibánez, P.3    Viñals, V.4    Llabería, J.M.5
  • 7
    • 34548021671
    • Performance driven data cache prefetching in a dynamic software optimization system
    • Jean Christophe Beyler and Philippe Clauss. 2007. Performance driven data cache prefetching in a dynamic software optimization system. In International Conference on Supercomputing. 202-209.
    • (2007) International Conference on Supercomputing , pp. 202-209
    • Beyler, J.C.1    Clauss, P.2
  • 8
    • 0014814325
    • Space/time trade-offs in hash coding with allowable errors
    • 1970
    • Burton H. Bloom. 1970. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13, 7 (1970), 422-426.
    • (1970) Commun. ACM , vol.13 , Issue.7 , pp. 422-426
    • Bloom, B.H.1
  • 13
    • 34548432523
    • Improving hash join performance through prefetching
    • 2007
    • Shimin Chen, Anastassia Ailamaki, Phillip B. Gibbons, and Todd C. Mowry. 2007. Improving hash join performance through prefetching. ACM Trans. Database Syst. 32, 3 (2007), 17.
    • (2007) ACM Trans. Database Syst , vol.32 , Issue.3 , pp. 17
    • Chen, S.1    Ailamaki, A.2    Gibbons, P.B.3    Mowry, T.C.4
  • 14
    • 0029308368
    • Effective hardware-based data prefetching for high-performance processors
    • 1995
    • Tien-Fu Chen and Jean-Loup Baer. 1995. Effective hardware-based data prefetching for high-performance processors. IEEE Trans. Comput. 44, 5 (1995), 609-623.
    • (1995) IEEE Trans. Comput , vol.44 , Issue.5 , pp. 609-623
    • Chen, T.-F.1    Baer, J.-L.2
  • 15
    • 64949144540
    • Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs
    • Xi E. Chen and Tor M. Aamodt. 2008. Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs. In International Symposium on Microarchitecture. 59-70.
    • (2008) International Symposium on Microarchitecture , pp. 59-70
    • Chen, X.E.1    Aamodt, T.M.2
  • 16
    • 0036038136
    • Dynamic hot data stream prefetching for general-purpose programs
    • Trishul M. Chilimbi and Martin Hirzel. 2002. Dynamic hot data stream prefetching for general-purpose programs. ACM SIGPLAN Notices 37, 5 (2002), 199-209.
    • (2002) ACM SIGPLAN Notices , vol.37 , Issue.5 , pp. 199-209
    • Chilimbi, T.M.1    Hirzel, M.2
  • 17
    • 47349132413
    • Low-cost epoch-based correlation prefetching for commercial applications
    • Yuan Chou. 2007. Low-cost epoch-based correlation prefetching for commercial applications. In International Symposium on Microarchitecture. 301-313.
    • (2007) International Symposium on Microarchitecture , pp. 301-313
    • Chou, Y.1
  • 23
  • 28
    • 84902193542
    • A primer on hardware prefetching
    • 2014
    • Babak Falsafi and Thomas F. Wenisch. 2014. A primer on hardware prefetching. Synth. Lect. Comput. Arch. 9, 1 (2014), 1-67.
    • (2014) Synth. Lect. Comput. Arch , vol.9 , Issue.1 , pp. 1-67
    • Falsafi, B.1    Wenisch, T.F.2
  • 39
    • 14944355925
    • Memory-side prefetching for linked data structures for processor-in-memory systems
    • Christopher J. Hughes and Sarita V. Adve. 2005. Memory-side prefetching for linked data structures for processor-in-memory systems. J. Parallel and Distrib. Comput. 65, 4 (2005), 448-463.
    • (2005) J. Parallel and Distrib. Comput , vol.65 , Issue.4 , pp. 448-463
    • Hughes, C.J.1    Adve, S.V.2
  • 43
    • 84892527825
    • Linearizing irregular memory accesses for improved correlated prefetching
    • Akanksha Jain and Calvin Lin. 2013. Linearizing irregular memory accesses for improved correlated prefetching. In International Symposium on Microarchitecture. 247-259.
    • (2013) International Symposium on Microarchitecture , pp. 247-259
    • Jain, A.1    Lin, C.2
  • 46
    • 0025429331
    • Improving direct-mapped cache performance by the addition of a small fully associative cache and prefetch buffers
    • Norman P. Jouppi. 1990. Improving direct-mapped cache performance by the addition of a small fully associative cache and prefetch buffers. In International Symposium on Computer Architecture. 364-373.
    • (1990) International Symposium on Computer Architecture , pp. 364-373
    • Jouppi, N.P.1
  • 52
    • 84904110329
    • Multiple stream tracker: A new hardware stride prefetcher
    • Taesu Kim, Dali Zhao, and Alexander V. Veidenbaum. 2014. Multiple stream tracker: A new hardware stride prefetcher. In Computing Frontiers. 34.
    • (2014) Computing Frontiers , pp. 34
    • Kim, T.1    Zhao, D.2    Veidenbaum, A.V.3
  • 57
    • 84859463353
    • When prefetching works, when it doesn't, and why
    • 2012
    • Jaekyu Lee, Hyesoon Kim, and Richard Vuduc. 2012. When prefetching works, when it doesn't, and why. ACM Trans. Arch. Code Opt. 9, 1 (2012), 21-229.
    • (2012) ACM Trans. Arch. Code Opt , vol.9 , Issue.1 , pp. 21-229
    • Lee, J.1    Kim, H.2    Vuduc, R.3
  • 58
    • 62349097505
    • Exploiting producer patterns and L2 cache for timely dependence-based prefetching
    • Chungsoo Lim and Gregory T. Byrd. 2008. Exploiting producer patterns and L2 cache for timely dependence-based prefetching. In International Conference on Computer Design. IEEE, 685-692.
    • (2008) International Conference on Computer Design. IEEE , pp. 685-692
    • Lim, C.1    Byrd, G.T.2
  • 62
    • 84866864300
    • Miss-correlation folding: Encoding per-block miss correlations in compressed DRAM for data prefetching
    • Gang Liu, Jih-Kwon Peir, and Victor Lee. 2012. Miss-correlation folding: Encoding per-block miss correlations in compressed DRAM for data prefetching. In International Parallel & Distributed Processing Symposium (IPDPS). 691-702.
    • (2012) International Parallel & Distributed Processing Symposium (IPDPS) , pp. 691-702
    • Liu, G.1    Peir, J.-K.2    Lee, V.3
  • 65
    • 0034839064
    • Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors
    • Chi-Keung Luk. 2001. Tolerating memory latency through software-controlled pre-execution in simultaneous multithreading processors. In International Symposium on Computer Architecture. 40-51.
    • (2001) International Symposium on Computer Architecture , pp. 40-51
    • Luk, C.-K.1
  • 70
    • 84897572369
    • A survey of architectural techniques for improving cache power efficiency
    • 2014
    • Sparsh Mittal. 2014. A survey of architectural techniques for improving cache power efficiency. Elsev. Sust. Comput.: Inform. Syst. 4, 1 (2014), 33-43.
    • (2014) Elsev. Sust. Comput.: Inform. Syst , vol.4 , Issue.1 , pp. 33-43
    • Mittal, S.1
  • 71
    • 84994668261
    • A survey of techniques for modeling and improving reliability of computing systems
    • 2015
    • Sparsh Mittal and Jeffrey Vetter. 2015. A survey of techniques for modeling and improving reliability of computing systems. IEEE Trans. Parallel Distrib. Syst. (2015).
    • (2015) IEEE Trans. Parallel Distrib. Syst
    • Mittal, S.1    Vetter, J.2
  • 72
    • 84929352865
    • A survey of architectural approaches for managing embedded DRAM and non-volatile on-chip caches
    • 2015
    • Sparsh Mittal, Jeffrey S. Vetter, and Dong Li. 2015. A survey of architectural approaches for managing embedded DRAM and non-volatile on-chip caches. IEEE Trans. Parallel Distrib. Syst. (2015).
    • (2015) IEEE Trans. Parallel Distrib. Syst
    • Mittal, S.1    Vetter, J.S.2    Li, D.3
  • 80
    • 84978636839
    • PBC: Prefetched blocks compaction
    • 2015
    • K. Raghavendra, B. Panda, and M. Mutyam. 2015. PBC: Prefetched blocks compaction. IEEE Trans. Comput. (2015). DOI: http://dx.doi.org/10.1109/TC.2015.2493533.
    • (2015) IEEE Trans. Comput
    • Raghavendra, K.1    Panda, B.2    Mutyam, M.3
  • 84
    • 84981522272
    • Spectral prefetcher: An effective mechanism for L2 cache prefetching
    • 2005
    • Saurabh Sharma, Jesse G. Beu, and Thomas M. Conte. 2005. Spectral prefetcher: An effective mechanism for L2 cache prefetching. ACM Trans. Arch. Code Opt. 2, 4 (2005), 423-450.
    • (2005) ACM Trans. Arch. Code Opt , vol.2 , Issue.4 , pp. 423-450
    • Sharma, S.1    Beu, J.G.2    Conte, T.M.3
  • 86
    • 0042850375
    • Correlation prefetching with a user-level memory thread
    • D. Solihin, Jaejin Lee, and Josep Torrellas. 2003. Correlation prefetching with a user-level memory thread. IEEE Trans. Parallel Distrib. Syst. 14, 6 (2003), 563-580.
    • (2003) IEEE Trans. Parallel Distrib. Syst , vol.14 , Issue.6 , pp. 563-580
    • Solihin, D.1    Lee, J.2    Torrellas, J.3
  • 93
    • 0001589803
    • Data prefetch mechanisms
    • 2000
    • Steven P. Vanderwiel and David J. Lilja. 2000. Data prefetch mechanisms. Comput. Surv. 32, 2 (2000), 174-199.
    • (2000) Comput. Surv , vol.32 , Issue.2 , pp. 174-199
    • Vanderwiel, S.P.1    Lilja, D.J.2
  • 94
    • 84862185055
    • The interaction and relative effectiveness of hardware and software data prefetch
    • 2012
    • Santhosh Verma and David M. Koppelman. 2012. The interaction and relative effectiveness of hardware and software data prefetch. J. Circ., Syst. Comput. 21, 02 (2012).
    • (2012) J. Circ., Syst. Comput , vol.21 , Issue.2
    • Verma, S.1    Koppelman, D.M.2
  • 100
    • 67549137647
    • Analyzing the worst-case execution time for instruction caches with prefetching
    • 2008
    • Jun Yan and Wei Zhang. 2008. Analyzing the worst-case execution time for instruction caches with prefetching. ACM Trans. Embedd. Comput. Syst. 8, 1 (2008), 7.
    • (2008) ACM Trans. Embedd. Comput. Syst , vol.8 , Issue.1 , pp. 7
    • Yan, J.1    Zhang, W.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.