SCOPUS Information Retrieval Platform

Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium, IPDPS 2012

Volume , Issue , 2012, Pages 691-702

Miss-correlation folding: Encoding per-block miss correlations in compressed DRAM for data prefetching

(3) Liu, Gang (a); Peir, Jih Kwon (a); Lee, Victor (b)

(a) University of Florida (United States)

(b) Intel Corporation (United States)

Author keywords

cache; compress; data parallel; miss correlation; prefetch; spatial; temporal

Indexed keywords

CACHE; COMPRESS; DATA PARALLEL; PREFETCHES; SPATIAL; TEMPORAL;

DISTRIBUTED PARAMETER NETWORKS; METADATA;

DATA COMPRESSION;

EID: 84866864300 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/IPDPS.2012.68 Document Type: Conference Paper

Times cited : (2)

References (54)

1
- 0034825713
- Performance of Hardware Compressed Main Memory
- B. Abali, H. Franke, X. Shen, D. E. Poff, and T. B. Smith, "Performance of Hardware Compressed Main Memory," in 7th HPCA, 2001.
- (2001) 7th HPCA
- Abali, B.¹ Franke, H.² Shen, X.³ Poff, D.E.⁴ Smith, T.B.⁵

2
- 34548034460
- Compression in cache design
- A.-R. Adl-Tabatabai, A. M. Ghuloum, and S. O. Kanaujia, "Compression in cache design," in 21st ACM International Conference on Supercomputing, 2007.
- 21st ACM International Conference on Supercomputing, 2007
- Adl-Tabatabai, A.-R.¹ Ghuloum, A.M.² Kanaujia, S.O.³

3
- 34547676257
- Interactions between compression and prefetching in chip multiprocessors
- A. R. Alameldeen, and D. A. Wood, "Interactions between compression and prefetching in chip multiprocessors," in 13th HPCA, 2007.
- (2007) 13th HPCA
- Alameldeen, A.R.¹ Wood, D.A.²

4
- 36949000058
- Technical Report EE-CEG-95-1, Cornell University
- M. Charney, and A. Reeves, Generalized correlation based hardware prefetching, Technical Report EE-CEG-95-1, Cornell University, 1995.
- (1995) Generalized Correlation Based Hardware Prefetching
- Charney, M.¹ Reeves, A.²

5
- 0026917364
- Reducing memory latency via non-blocking and prefetching caches
- T.-F. Chen, and J.-L. Baer, "Reducing memory latency via non-blocking and prefetching caches," in 5th ASPLOS, 1992.
- (1992) 5th ASPLOS
- Chen, T.-F.¹ Baer, J.-L.²

6
- 0036038136
- Dynamic hot data stream prefetching for general-purpose programs
- T. M. Chilimbi, and M. Hirzel, "Dynamic hot data stream prefetching for general-purpose programs," in PLDI, 2002.
- (2002) PLDI
- Chilimbi, T.M.¹ Hirzel, M.²

7
- 47349132413
- Low-Cost Epoch-Based Correlation Prefetching for Commercial Applications
- Y. Chou, "Low-Cost Epoch-Based Correlation Prefetching for Commercial Applications," in 40th Micro, 2007.
- (2007) 40th Micro
- Chou, Y.¹

8
- 0027621679
- Practical prefetching via data compression
- K. Curewitz, P. Krishnan, and J. Vitter, "Practical prefetching via data compression," ACM SIGMOD Record, vol. 22, no. 2, pp. 266, 1993.
- (1993) ACM SIGMOD Record , vol.22 , Issue.2 , pp. 266
- Curewitz, K.¹ Krishnan, P.² Vitter, J.³

9
- 70450233836
- Stream chaining: Exploiting multiple levels of correlation in data prefetching
- P. Diaz, and M. Cintra, "Stream chaining: Exploiting multiple levels of correlation in data prefetching," in 36th ISCA, 2009.
- (2009) 36th ISCA
- Diaz, P.¹ Cintra, M.²

10
- 27544435752
- A Robust Main-Memory Compression Scheme
- M. Ekman, and P. Stenstrom, "A Robust Main-Memory Compression Scheme," in 32nd ISCA, 2005.
- (2005) 32nd ISCA
- Ekman, M.¹ Stenstrom, P.²

11
- 36949027123
- Last-touch correlated data streaming
- M. Ferdman, and B. Falsafi, "Last-touch correlated data streaming," in ISPASS, 2007.
- (2007) ISPASS
- Ferdman, M.¹ Falsafi, B.²

12
- 84866869889
- FeS2: A Full-system Execution-driven Simulator for x86, http://fes2.cs.uiuc.edu/.
- FeS2: A Full-system Execution-driven Simulator for x86

13
- 77956977035
- Stride directed prefetching in scalar processors
- J. Fu, J. H. Patel, and B. L. Janssens, "Stride directed prefetching in scalar processors," in 25th MICRO, 1992.
- (1992) 25th MICRO
- Fu, J.¹ Patel, J.H.² Janssens, B.L.³

14
- 79951763686
- TCP: Tag Correlating Prefetchers
- Z. Hu, M. Martonosi, and S. Kaxiras, "TCP: Tag Correlating Prefetchers," in 9th HPCA, 2003.
- (2003) 9th HPCA
- Hu, Z.¹ Martonosi, M.² Kaxiras, S.³

15
- 0035187053
- Exploring the design space of future CMPs
- J. Huh, D. Burger, and S. W. Keckler, "Exploring the design space of future CMPs," in PACT, 2001.
- (2001) PACT
- Huh, J.¹ Burger, D.² Keckler, S.W.³

16
- 0030677583
- Prefetching using Markov predictors
- D. Joseph, and D. Grunwald, "Prefetching using Markov predictors," in 24th ISCA, 1997.
- (1997) 24th ISCA
- Joseph, D.¹ Grunwald, D.²

17
- 0025429331
- Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers
- N. P. Jouppi, "Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers," in 17th ISCA, 1990.
- (1990) 17th ISCA
- Jouppi, N.P.¹

18
- 0036287598
- Going the distance for TLB prefetching: An application-driven study
- G. B. Kandiraju, and A. Sivasubramaniam, "Going the distance for TLB prefetching: An application-driven study," in 29th ISCA, 2002.
- (2002) 29th ISCA
- Kandiraju, G.B.¹ Sivasubramaniam, A.²

19
- 70450235471
- Architecting phase change memory as a scalable DRAM alternative
- B. C. Lee, E. Ipek, O. Mutlu, and D. Burger, "Architecting phase change memory as a scalable DRAM alternative," in 36th ISCA, 2009.
- (2009) 36th ISCA
- Lee, B.C.¹ Ipek, E.² Mutlu, O.³ Burger, D.⁴

20
- 0033300356
- Design and evaluation of a selective compressed memory system
- J.-S. Lee, W.-K. Hong, and S.-D. Kim, "Design and evaluation of a selective compressed memory system," in IEEE International Conference on Computer Design: VLSI in Computers and Processors, 1999.
- IEEE International Conference on Computer Design: VLSI in Computers and Processors, 1999
- Lee, J.-S.¹ Hong, W.-K.² Kim, S.-D.³

21
- 77954995885
- Debunking the 100x GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU
- V. Lee, C. Kim, J. Chhugani, M. Deisher, D. Kim et al., "Debunking the 100x GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU," in 37th ISCA, 2010.
- (2010) 37th ISCA
- Lee, V.¹ Kim, C.² Chhugani, J.³ Deisher, M.⁴ Kim, D.⁵

22
- 79960875021
- Emerging Applications for Multi/Many-Core Processor
- V. Lee, Y. Chen, and P. Dubey, "Emerging Applications for Multi/Many-Core Processor," in 38th ISCA, 2011.
- (2011) 38th ISCA
- Lee, V.¹ Chen, Y.² Dubey, P.³

23
- 0036469676
- Simics: A full system simulation platform
- P. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hållberg et al., "Simics: A full system simulation platform," Computer, pp. 50-58, 2002.
- (2002) Computer , pp. 50-58
- Magnusson, P.¹ Christensson, M.² Eskilson, J.³ Forsgren, D.⁴ Hållberg, G.⁵

24
- 33748870886
- Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
- M. Martin, D. Sorin, B. Beckmann, M. Marty, M. Xu et al., "Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset," ACM SIGARCH Computer Architecture News, vol. 33, no. 4, pp. 92-99, 2005.
- (2005) ACM SIGARCH Computer Architecture News , vol.33 , Issue.4 , pp. 92-99
- Martin, M.¹ Sorin, D.² Beckmann, B.³ Marty, M.⁴ Xu, M.⁵

25
- 0038998034
- Memory Bandwidth and Machine Balance in Current High Performance Computers
- J. D. McCalpin, "Memory Bandwidth and Machine Balance in Current High Performance Computers," IEEE Computer Society Technical Committee on Computer Architecture Newsletter, 1995.
- (1995) IEEE Computer Society Technical Committee on Computer Architecture Newsletter
- McCalpin, J.D.¹

26
- 10444284911
- AC/DC: An adaptive data cache prefetcher
- K. J. Nesbit, A. S. Dhodapkar, and J. E. Smith, "AC/DC: An adaptive data cache prefetcher," in 13th PACT, 2004.
- (2004) 13th PACT
- Nesbit, K.J.¹ Dhodapkar, A.S.² Smith, J.E.³

27
- 2342644731
- Data cache prefetching using a Global History Buffer
- K. J. Nesbit, and J. E. Smith, "Data cache prefetching using a Global History Buffer," in 10th HPCA, 2004.
- (2004) 10th HPCA
- Nesbit, K.J.¹ Smith, J.E.²

28
- 84866840089
- OpenMP, http://www.openmp.org/.

29
- 84956111432
- Lattice BGK models for Navier-Stokes equation
- Y. Qian, D. d'Humieres, and P. Lallemand, "Lattice BGK models for Navier-Stokes equation," EPL (Europhysics Letters), vol. 17, pp. 479, 1992.
- (1992) EPL (Europhysics Letters) , vol.17 , pp. 479
- Qian, Y.¹ D'Humieres, D.² Lallemand, P.³

30
- 70450273507
- Scalable high performance main memory system using phase-change memory technology
- M. K. Qureshi, V. Srinivasan, and J. A. Rivers, "Scalable high performance main memory system using phase-change memory technology," in 36th ISCA, 2009.
- (2009) 36th ISCA
- Qureshi, M.K.¹ Srinivasan, V.² Rivers, J.A.³

31
- 55449106208
- Phase-change random access memory: A scalable technology
- S. Raoux, G. Burr, M. Breitwisch, C. Rettner, Y. C. Chen et al., "Phase-change random access memory: A scalable technology," IBM Journal of Research and Development, vol. 52, no. 4.5, pp. 465-479, 2008.
- (2008) IBM Journal of Research and Development , vol.52 , Issue.4-5 , pp. 465-479
- Raoux, S.¹ Burr, G.² Breitwisch, M.³ Rettner, C.⁴ Chen, Y.C.⁵

32
- 84866876373
- Internal Report, unpublished
- Internal Report, "Single-Value Dynamic Encoding," unpublished.
- Single-Value Dynamic Encoding

33
- 70450077484
- Designing efficient sorting algorithms for manycore gpus
- N. Satish, M. Harris, and M. Garland, "Designing efficient sorting algorithms for manycore gpus," in IPDPS, 2009.
- (2009) IPDPS
- Satish, N.¹ Harris, M.² Garland, M.³

34
- 33847108092
- Coterminous locality and coterminous group data prefetching on chip-multiprocessors
- X. Shi, Z. Yang, J.-K. Peir, L. Peng, Y.-K. Chen et al., "Coterminous locality and coterminous group data prefetching on chip-multiprocessors," in 20th IPDPS, 2006.
- (2006) 20th IPDPS
- Shi, X.¹ Yang, Z.² Peir, J.-K.³ Peng, L.⁴ Chen, Y.-K.⁵

35
- 0036296856
- Using a user-level memory thread for correlation prefetching
- Y. Solihin, J. Lee, and J. Torrellas, "Using a user-level memory thread for correlation prefetching," in 29th ISCA, 2002.
- (2002) 29th ISCA
- Solihin, Y.¹ Lee, J.² Torrellas, J.³

36
- 33845894426
- Spatial memory streaming
- S. Somogyi, T. F. Wenisch, A. Ailamaki, B. Falsafi, and A. Moshovos, "Spatial memory streaming," in 33rd ISCA, 2006.
- (2006) 33rd ISCA
- Somogyi, S.¹ Wenisch, T.F.² Ailamaki, A.³ Falsafi, B.⁴ Moshovos, A.⁵

37
- 70450279104
- Spatio-temporal memory streaming
- S. Somogyi, T. F. Wenisch, A. Ailamaki, and B. Falsafi, "Spatio-temporal memory streaming," in 36th ISCA, 2009.
- (2009) 36th ISCA
- Somogyi, S.¹ Wenisch, T.F.² Ailamaki, A.³ Falsafi, B.⁴

38
- 34547655822
- Feedback directed prefetching: Improving the performance and bandwidth-efficiency of hardware prefetchers
- S. Srinath, O. Mutlu, H. Kim, and Y. N. Patt, "Feedback directed prefetching: Improving the performance and bandwidth-efficiency of hardware prefetchers," in 13th HPCA, 2007.
- (2007) 13th HPCA
- Srinath, S.¹ Mutlu, O.² Kim, H.³ Patt, Y.N.⁴

39
- 0038138424
- POWER4 system microarchitecture
- J. Tendler, S. Dodson, S. Fields, H. Le, and B. Sinharoy, "POWER4 system microarchitecture," IBM Technical White Paper, Oct, 2001.
- (2001) IBM Technical White Paper
- Tendler, J.¹ Dodson, S.² Fields, S.³ Le, H.⁴ Sinharoy, B.⁵

40
- 84866851700
- POSIX thread, https://computing.llnl.gov/tutorials/pthreads/.
- POSIX Thread

41
- 0035266001
- IBM memory expansion technology (MXT)
- R. Tremaine, P. Franaszek, J. Robinson, C. Schulz, T. Smith et al., "IBM memory expansion technology (MXT)," IBM Journal of Research and Development, vol. 45, no. 2, pp. 271-285, 2001.
- (2001) IBM Journal of Research and Development , vol.45 , Issue.2 , pp. 271-285
- Tremaine, R.¹ Franaszek, P.² Robinson, J.³ Schulz, C.⁴ Smith, T.⁵

42
- 0026405369
- Optimal prefetching via data compression
- J. S. Vitter, and P. Krishnan, "Optimal prefetching via data compression," in 32nd Annual Symposium on Foundations of Computer Science, 1991.
- 32nd Annual Symposium on Foundations of Computer Science, 1991
- Vitter, J.S.¹ Krishnan, P.²

43
- 27544508955
- Temporal Streaming of Shared Memory
- T. F. Wenisch, S. Somogyi, N. Hardavellas, J. Kim, A. Ailamaki et al., "Temporal Streaming of Shared Memory," in 32nd ISCA, 2005.
- (2005) 32nd ISCA
- Wenisch, T.F.¹ Somogyi, S.² Hardavellas, N.³ Kim, J.⁴ Ailamaki, A.⁵

44
- 56449097232
- Temporal streams in commercial server applications
- T. F. Wenisch, M. Ferdman, A. Ailamaki, B. Falsafi, and A. Moshovos, "Temporal streams in commercial server applications," in IEEE International Symposium on Workload Characterization, 2008.
- IEEE International Symposium on Workload Characterization, 2008
- Wenisch, T.F.¹ Ferdman, M.² Ailamaki, A.³ Falsafi, B.⁴ Moshovos, A.⁵

45
- 64949123191
- Practical off-chip meta-data for temporal memory streaming
- T. F. Wenisch, M. Ferdman, A. Ailamaki, B. Falsafi, and A. Moshovos, "Practical off-chip meta-data for temporal memory streaming," in 15th HPCA, 2009.
- (2009) 15th HPCA
- Wenisch, T.F.¹ Ferdman, M.² Ailamaki, A.³ Falsafi, B.⁴ Moshovos, A.⁵

46
- 85084162609
- The case for compressed caching in virtual memory systems
- P. R. Wilson, S. F. Kaplan, and Y. Smaragdakis, "The case for compressed caching in virtual memory systems," in Proceedings of USENIX Annual Technical Conference, 1999.
- Proceedings of USENIX Annual Technical Conference, 1999
- Wilson, P.R.¹ Kaplan, S.F.² Smaragdakis, Y.³

47
- 0038364440
- Frequent value locality and its applications
- J. Yang, and R. Gupta, "Frequent value locality and its applications," ACM Trans. on Embedded Computing Systems, vol. 1, no. 1, pp. 79-105, 2002.
- (2002) ACM Trans. on Embedded Computing Systems , vol.1 , Issue.1 , pp. 79-105
- Yang, J.¹ Gupta, R.²

48
- 34547204184
- High-performance operating system controlled memory compression
- L. Yang, H. Lekatsas, and R. P. Dick, "High-performance operating system controlled memory compression," in 43rd ACM/IEEE Design Automation Conference, 2006.
- 43rd ACM/IEEE Design Automation Conference, 2006
- Yang, L.¹ Lekatsas, H.² Dick, R.P.³

49
- 36949014308
- PTLsim: A Cycle Accurate Full System x86-64 Microarchitectural Simulator
- M. T. Yourst, "PTLsim: A Cycle Accurate Full System x86-64 Microarchitectural Simulator," in ISPASS, 2007.
- (2007) ISPASS
- Yourst, M.T.¹

50
- 0034443222
- Frequent value locality and value-centric data cache design
- Y. Zhang, J. Yang, and R. Gupta, "Frequent value locality and value-centric data cache design," in 9th ASPLOS, 2000.
- (2000) 9th ASPLOS
- Zhang, Y.¹ Yang, J.² Gupta, R.³

51
- 84944720428
- Enabling partial cache line prefetching through data compression
- Y. Zhang, and R. Gupta, "Enabling partial cache line prefetching through data compression," in International Conference on Parallel Processing, 2003.
- International Conference on Parallel Processing, 2003
- Zhang, Y.¹ Gupta, R.²

52
- 0012525243
- Benchmark health considered harmful
- C. Zilles, "Benchmark health considered harmful," ACM SIGARCH Computer Architecture News, vol. 29, no. 3, pp. 4-5, 2001.
- (2001) ACM SIGARCH Computer Architecture News , vol.29 , Issue.3 , pp. 4-5
- Zilles, C.¹

53
- 0017493286
- A universal algorithm for sequential data compression
- J. Ziv, and A. Lempel, "A universal algorithm for sequential data compression," IEEE Transactions on Information Theory, vol. 23, no. 3, pp. 337-343, 1977.
- (1977) IEEE Transactions on Information Theory , vol.23 , Issue.3 , pp. 337-343
- Ziv, J.¹ Lempel, A.²

54
- 0018019231
- Compression of individual sequences via variable-rate coding
- J. Ziv, and A. Lempel, "Compression of individual sequences via variable-rate coding," IEEE Transactions on Information Theory, vol. 24, no. 5, pp. 530-536, 1978.
- (1978) IEEE Transactions on Information Theory , vol.24 , Issue.5 , pp. 530-536
- Ziv, J.¹ Lempel, A.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.