SCOPUS Information Search Platform

Journal of Low Power Electronics and Applications

Volume 6, Issue 2, 2016

A survey of cache bypassing techniques

Oak Ridge National Laboratory (United States)

Author keywords

Cache bypassing; Classification; CPU; CPU GPU heterogeneous system; Dead block prediction; GPU; Non volatile memory; Review; Selective caching

EID: 84965128688
PISSN: None
EISSN: 20799268
Source Type: Journal
DOI: 10.3390/jlpea6020005
Document Type: Article

Times cited: 29

References (90)

1
- 84898061010
- 5.1 POWER8™: A 12-core server-class processor in 22 nm SOI with 7.6 Tb/s off-chip bandwidth
- San Francisco, CA, USA, 9–13 February
- Fluhr, E.J.; Friedrich, J.; Dreps, D.; Zyuban, V.; Still, G.; Gonzalez, C.; Hall, A.; Hogenmiller, D.; Malgioglio, F.; Nett, R. et al. 5.1 POWER8™: A 12-core server-class processor in 22 nm SOI with 7.6 Tb/s off-chip bandwidth. In Proceedings of the International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 9–13 February 2014; pp. 96–97.
- (2014) Proceedings of the International Solid-State Circuits Conference (ISSCC) , pp. 96-97
- Fluhr, E.J.¹ Friedrich, J.² Dreps, D.³ Zyuban, V.⁴ Still, G.⁵ Gonzalez, C.⁶ Hall, A.⁷ Hogenmiller, D.⁸ Malgioglio, F.⁹ Nett, R.¹⁰

2
- 84898062900
- 5.9 Haswell: A family of IA 22 nm processors
- San Francisco, CA, USA, 9–13 February
- Kurd, N.; Chowdhury, M.; Burton, E.; Thomas, T.P.; Mozak, C.; Boswell, B.; Lal, M.; Deval, A.; Douglas, J.; Elassal, M. et al. 5.9 Haswell: A family of IA 22 nm processors. In Proceedings of the International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 9–13 February 2014; pp. 112–113.
- (2014) Proceedings of the International Solid-State Circuits Conference (ISSCC) , pp. 112-113
- Kurd, N.¹ Chowdhury, M.² Burton, E.³ Thomas, T.P.⁴ Mozak, C.⁵ Boswell, B.⁶ Lal, M.⁷ Deval, A.⁸ Douglas, J.⁹ Elassal, M.¹⁰

3
- 84965137094
- NVIDIA. NVIDIA’s Next Generation CUDA Compute Architecture: Fermi
- NVIDIA. NVIDIA’s Next Generation CUDA Compute Architecture: Fermi. 2009. Available online: http://goo.gl/X2AI0b (accessed on 27 April 2016).
- (2009)

4
- 84965138205
- NVIDIA. NVIDIA’s Next Generation CUDA Compute Architecture: Kepler GK110/210
- NVIDIA. NVIDIA’s Next Generation CUDA Compute Architecture: Kepler GK110/210. 2014. Available online: http://goo.gl/qOSWW1 (accessed on 27 April 2016).
- (2014)

5
- 84939277360
- Harris, M. 5 Things You Should Know about the New Maxwell GPU Architecture. 2014. Available online: http://goo.gl/8NV82n (accessed on 27 April 2016).
- (2014) 5 Things You Should Know about the New Maxwell GPU Architecture
- Harris, M.¹

6
- 84903384277
- A survey of techniques for managing and leveraging caches in GPUs
- Mittal, S. A survey of techniques for managing and leveraging caches in GPUs. J. Circuits Syst. Comput. 2014, 23, 229–236.
- (2014) J. Circuits Syst. Comput , vol.23 , pp. 229-236
- Mittal, S.¹

7
- 84962881097
- Real-Time GPU Computing: Cache or No Cache?
- Auckland, New Zealand, 13–17 April
- Huangfu, Y.; Zhang, W. Real-Time GPU Computing: Cache or No Cache? In Proceedings of the International Symposium on Real-Time Distributed Computing (ISORC), Auckland, New Zealand, 13–17 April 2015; pp. 182–189.
- (2015) Proceedings of the International Symposium on Real-Time Distributed Computing (ISORC) , pp. 182-189
- Huangfu, Y.¹ Zhang, W.²

8
- 0024906840
- Improving cache performance by selective cache bypass
- Kailua-Kona, HI, USA, 3–6 January
- Chi, C.H.; Dietz, H. Improving cache performance by selective cache bypass. In Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences, Kailua-Kona, HI, USA, 3–6 January 1989; Volume 1, pp. 277–285.
- (1989) Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences , vol.1 , pp. 277-285
- Chi, C.H.¹ Dietz, H.²

9
- 0031353109
- Design and performance evaluation of a cache assist to implement selective caching
- Austin, TX, USA, 12–15 October
- John, L.K.; Subramanian, A. Design and performance evaluation of a cache assist to implement selective caching. In Proceedings of the International Conference on Computer Design, Austin, TX, USA, 12–15 October 1997; pp. 510–518.
- (1997) Proceedings of the International Conference on Computer Design , pp. 510-518
- John, L.K.¹ Subramanian, A.²

10
- 0033311745
- Hardware identification of cache conflict misses
- Haifa, Israel, 16–18 November
- Collins, J.D.; Tullsen, D.M. Hardware identification of cache conflict misses. In Proceedings of the International Symposium on Microarchitecture, Haifa, Israel, 16–18 November 1999; pp. 126–135.
- (1999) Proceedings of the International Symposium on Microarchitecture , pp. 126-135
- Collins, J.D.¹ Tullsen, D.M.²

11
- 84937704296
- Adaptive cache management for energy-efficient GPU computing
- Cambridge, UK, 13–17 December
- Chen, X.; Chang, L.W.; Rodrigues, C.I.; Lv, J.; Wang, Z.; Hwu, W.M. Adaptive cache management for energy-efficient GPU computing. In Proceedings of the 47th International Symposium on Microarchitecture, Cambridge, UK, 13–17 December 2014; pp. 343–355.
- (2014) Proceedings of the 47th International Symposium on Microarchitecture , pp. 343-355
- Chen, X.¹ Chang, L.W.² Rodrigues, C.I.³ Lv, J.⁴ Wang, Z.⁵ Hwu, W.M.⁶

12
- 84906813610
- SBAC: A statistics based cache bypassing method for asymmetric-access caches
- La Jolla, CA, USA, 11–13 August
- Zhang, C.; Sun, G.; Li, P.; Wang, T.; Niu, D.; Chen, Y. SBAC: A statistics based cache bypassing method for asymmetric-access caches. In Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), La Jolla, CA, USA, 11–13 August 2014; pp. 345–350.
- (2014) Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED) , pp. 345-350
- Zhang, C.¹ Sun, G.² Li, P.³ Wang, T.⁴ Niu, D.⁵ Chen, Y.⁶

13
- 84904004484
- DASCA: Dead write prediction assisted STT-RAM cache architecture
- Orlando, FL, USA, 15–19 February
- Ahn, J.; Yoo, S.; Choi, K. DASCA: Dead write prediction assisted STT-RAM cache architecture. In Proceedings of the 20th International Symposium on High Performance Computer Architecture (HPCA), Orlando, FL, USA, 15–19 February 2014; pp. 25–36.
- (2014) Proceedings of the 20th International Symposium on High Performance Computer Architecture (HPCA) , pp. 25-36
- Ahn, J.¹ Yoo, S.² Choi, K.³

14
- 84876561848
- Improving cache management policies using dynamic reuse distances
- Vancouver, BC, Canada, 1–5 December
- Duong, N.; Zhao, D.; Kim, T.; Cammarota, R.; Valero, M.; Veidenbaum, A.V. Improving cache management policies using dynamic reuse distances. In Proceedings of the 45th International Symposium on Microarchitecture, Vancouver, BC, Canada, 1–5 December 2012; pp. 389–400.
- (2012) Proceedings of the 45th International Symposium on Microarchitecture , pp. 389-400
- Duong, N.¹ Zhao, D.² Kim, T.³ Cammarota, R.⁴ Valero, M.⁵ Veidenbaum, A.V.⁶

15
- 84897572369
- A Survey of Architectural Techniques For Improving Cache Power Efficiency
- Mittal, S. A Survey of Architectural Techniques For Improving Cache Power Efficiency. Sustain. Comput. Inform. Syst. 2014, 4, 33–43.
- (2014) Sustain. Comput. Inform. Syst , vol.4 , pp. 33-43
- Mittal, S.¹

16
- 0003003638
- A study of replacement algorithms for a virtual-storage computer
- Belady, L.A. A study of replacement algorithms for a virtual-storage computer. IBM Syst. J. 1966, 5, 78–101.
- (1966) IBM Syst. J , vol.5 , pp. 78-101
- Belady, L.A.¹

17
- 0026242244
- Performance and the i860 microprocessor
- Atkins, M. Performance and the i860 microprocessor. IEEE Micro 1991, 11, 24–27.
- (1991) IEEE Micro , vol.11 , pp. 24-27
- Atkins, M.¹

18
- 84965160711
- Intel Corporation. Intel 64 and IA-32 Architectures, Software Developer’s Manual, Instruction Set Reference, A-Z; Intel Corporation: Santa Clara, CA, USA
- Intel Corporation. Intel 64 and IA-32 Architectures, Software Developer’s Manual, Instruction Set Reference, A-Z; Intel Corporation: Santa Clara, CA, USA, 2011; Volume 2.
- (2011) , vol.2

19
- 84965136367
- NVIDIA Corporation. Parallel Thread Execution ISA Version 4.2; NVIDIA Corporation: Santa Clara, CA, USA
- NVIDIA Corporation. Parallel Thread Execution ISA Version 4.2; NVIDIA Corporation: Santa Clara, CA, USA, 2015.
- (2015) Parallel Thread Execution ISA

20
- 41149104074
- Counter-based cache replacement and bypassing algorithms
- Kharbutli, M.; Solihin, Y. Counter-based cache replacement and bypassing algorithms. IEEE Trans. Comput. 2008, 57, 433–447.
- (2008) IEEE Trans. Comput , vol.57 , pp. 433-447
- Kharbutli, M.¹ Solihin, Y.²

21
- 80052536606
- Bypass and insertion algorithms for exclusive last-level caches
- San Jose, CA, USA, 4–8 June
- Gaur, J.; Chaudhuri, M.; Subramoney, S. Bypass and insertion algorithms for exclusive last-level caches. In Proceedings of the 38th International Symposium on Computer Architecture (ISCA), San Jose, CA, USA, 4–8 June 2011; pp. 81–92.
- (2011) Proceedings of the 38th International Symposium on Computer Architecture (ISCA) , pp. 81-92
- Gaur, J.¹ Chaudhuri, M.² Subramoney, S.³

22
- 84892550871
- FlexiWay: A Cache Energy Saving Technique Using Fine-grained Cache Reconfiguration
- Asheville, NC, USA, 6–9 October
- Mittal, S.; Zhang, Z.; Vetter, J. FlexiWay: A Cache Energy Saving Technique Using Fine-grained Cache Reconfiguration. In Proceedings of the 31st IEEE International Conference on Computer Design (ICCD), Asheville, NC, USA, 6–9 October 2013.
- (2013) Proceedings of the 31st IEEE International Conference on Computer Design (ICCD)
- Mittal, S.¹ Zhang, Z.² Vetter, J.³

23
- 84871651935
- Energy savings via dead sub-block prediction
- New York, NY, USA, 24–26 October
- Alves, M.; Khubaib, K.; Ebrahimi, E.; Narasiman, V.; Villavieja, C.; Navaux, P.O.A.; Patt, Y.N. Energy savings via dead sub-block prediction. In Proceedings of the 24th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), New York, NY, USA, 24–26 October 2012; pp. 51–58.
- (2012) Proceedings of the 24th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) , pp. 51-58
- Alves, M.¹ Khubaib, K.² Ebrahimi, E.³ Narasiman, V.⁴ Villavieja, C.⁵ Navaux, P.O.A.⁶ Patt, Y.N.⁷

24
- 84928400002
- EnCache: A Dynamic Profiling Based Reconfiguration Technique for Improving Cache Energy Efficiency
- Mittal, S.; Zhang, Z. EnCache: A Dynamic Profiling Based Reconfiguration Technique for Improving Cache Energy Efficiency. J. Circuits Syst. Comput. 2014, 23, 1450147.
- (2014) J. Circuits Syst. Comput , vol.23 , article 1450147
- Mittal, S.¹ Zhang, Z.²

25
- 84929352865
- A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-volatile On-chip Caches
- Mittal, S.; Vetter, J.S.; Li, D. A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-volatile On-chip Caches. IEEE Trans. Parallel Distrib. Syst. 2015, 26, 1524–1537.
- (2015) IEEE Trans. Parallel Distrib. Syst , vol.26 , pp. 1524-1537
- Mittal, S.¹ Vetter, J.S.² Li, D.³

26
- 84963816640
- A Survey of Power Management Techniques for Phase Change Memory
- Mittal, S. A Survey of Power Management Techniques for Phase Change Memory. Int. J. Comput. Aided Eng. Technol. 2014.
- (2014) Int. J. Comput. Aided Eng. Technol
- Mittal, S.¹

27
- 84945941165
- Technical Report ORNL/TM-2014/636; Oak Ridge National Laboratory: Oak Ridge, TN, USA
- Mittal, S.; Poremba, M.; Vetter, J.; Xie, Y. Exploring Design Space of 3D NVM and eDRAM Caches Using DESTINY Tool; Technical Report ORNL/TM-2014/636; Oak Ridge National Laboratory: Oak Ridge, TN, USA, 2014.
- (2014) Exploring Design Space of 3D NVM and eDRAM Caches Using DESTINY Tool
- Mittal, S.¹ Poremba, M.² Vetter, J.³ Xie, Y.⁴

28
- 84963787521
- A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems
- Mittal, S.; Vetter, J.S. A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems. IEEE Trans. Parallel Distrib. Syst. 2016, 27, 1537–1550.
- (2016) IEEE Trans. Parallel Distrib. Syst , vol.27 , pp. 1537-1550
- Mittal, S.¹ Vetter, J.S.²

29
- 84885645578
- OAP: An obstruction-aware cache management policy for STT-RAM last-level caches
- Grenoble, France, 18–22 March
- Wang, J.; Dong, X.; Xie, Y. OAP: An obstruction-aware cache management policy for STT-RAM last-level caches. In Proceedings of the Conference on Design, Automation and Test in Europe, Grenoble, France, 18–22 March 2013; pp. 847–852.
- (2013) Proceedings of the Conference on Design, Automation and Test in Europe , pp. 847-852
- Wang, J.¹ Dong, X.² Xie, Y.³

30
- 84969930755
- A Survey of Techniques for Architecting DRAM Caches
- Mittal, S.; Vetter, J. A Survey of Techniques for Architecting DRAM Caches. IEEE Trans. Parallel Distrib. Syst. 2015, doi:10.1109/TPDS.2015.2461155.
- (2015) IEEE Trans. Parallel Distrib. Syst
- Mittal, S.¹ Vetter, J.S.²

31
- 84965119872
- AMD. AMD Graphics Cores Next (GCN) Architecture
- AMD. AMD Graphics Cores Next (GCN) Architecture. 2012. Available online: https://goo.gl/NjNcDY (accessed on 27 April 2016).
- (2012)

32
- 84966487021
- Adaptive and Transparent Cache Bypassing for GPUs
- Austin, TX, USA, 15–20 November
- Li, A.; van den Braak, G.J.; Kumar, A.; Corporaal, H. Adaptive and Transparent Cache Bypassing for GPUs. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Austin, TX, USA, 15–20 November 2015.
- (2015) Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
- Li, A.¹ Van Den Braak, G.J.² Kumar, A.³ Corporaal, H.⁴

33
- 84965125176
- Hagedoorn, H. Core i7 5775C Processor Review: Desktop Broadwell—The Broadwell-H Architecture. 2015. Available online: http://goo.gl/1QFwja (accessed on 27 April 2016).
- (2015) Core i7 5775C Processor Review: Desktop Broadwell—The Broadwell-H Architecture
- Hagedoorn, H.¹

34
- 84903985058
- MRPB: Memory request prioritization for massively parallel processors
- Orlando, FL, USA, 15–19 February
- Jia, W.; Shaw, K.; Martonosi, M. MRPB: Memory request prioritization for massively parallel processors. In Proceedings of the 20th International Symposium on High Performance Computer Architecture (HPCA), Orlando, FL, USA, 15–19 February 2014; pp. 272–283.
- (2014) Proceedings of the 20th International Symposium on High Performance Computer Architecture (HPCA) , pp. 272-283
- Jia, W.¹ Shaw, K.² Martonosi, M.³

35
- 84938849494
- Adaptive GPU cache bypassing
- San Francisco, CA, USA, 7 February
- Tian, Y.; Puthoor, S.; Greathouse, J.L.; Beckmann, B.M.; Jiménez, D.A. Adaptive GPU cache bypassing. In Proceedings of the 8th Workshop on General Purpose Processing Using GPUs, San Francisco, CA, USA, 7 February 2015; pp. 25–35.
- (2015) Proceedings of the 8th Workshop on General Purpose Processing Using GPUs , pp. 25-35
- Tian, Y.¹ Puthoor, S.² Greathouse, J.L.³ Beckmann, B.M.⁴ Jiménez, D.A.⁵

36
- 84867291675
- Exploiting core working sets to filter the L1 cache with random sampling
- Etsion, Y.; Feitelson, D.G. Exploiting core working sets to filter the L1 cache with random sampling. IEEE Trans. Comput. 2012, 61, 1535–1550.
- (2012) IEEE Trans. Comput , vol.61 , pp. 1535-1550
- Etsion, Y.¹ Feitelson, D.G.²

37
- 84960122898
- BEAR: Techniques for Mitigating Bandwidth Bloat in Gigascale DRAM Caches
- Portland, OR, USA, 13–17 June
- Chou, C.; Jaleel, A.; Qureshi, M.K. BEAR: Techniques for Mitigating Bandwidth Bloat in Gigascale DRAM Caches. In Proceedings of the 42nd International Symposium on Computer Architecture (ISCA), Portland, OR, USA, 13–17 June 2015.
- (2015) Proceedings of the 42nd International Symposium on Computer Architecture (ISCA)
- Chou, C.¹ Jaleel, A.² Qureshi, M.K.³

38
- 84934293443
- Coordinated static and dynamic cache bypassing for GPUs
- Burlingame, CA, USA, 7–11 February
- Xie, X.; Liang, Y.; Wang, Y.; Sun, G.; Wang, T. Coordinated static and dynamic cache bypassing for GPUs. In Proceedings of the 21st International Symposium on High Performance Computer Architecture (HPCA), Burlingame, CA, USA, 7–11 February 2015; pp. 76–88.
- (2015) Proceedings of the 21st International Symposium on High Performance Computer Architecture (HPCA) , pp. 76-88
- Xie, X.¹ Liang, Y.² Wang, Y.³ Sun, G.⁴ Wang, T.⁵

39
- 84867487115
- Optimal bypass monitor for high performance last-level caches
- Minneapolis, MN, USA, 19–23 September
- Li, L.; Tong, D.; Xie, Z.; Lu, J.; Cheng, X. Optimal bypass monitor for high performance last-level caches. In Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques, Minneapolis, MN, USA, 19–23 September 2012; pp. 315–324.
- (2012) Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques , pp. 315-324
- Li, L.¹ Tong, D.² Xie, Z.³ Lu, J.⁴ Cheng, X.⁵

40
- 84894205278
- SCIP: Selective cache insertion and bypassing to improve the performance of last-level caches
- Amman, Jordan, 3–5 December
- Kharbutli, M.; Jarrah, M.; Jararweh, Y. SCIP: Selective cache insertion and bypassing to improve the performance of last-level caches. In Proceedings of the IEEE Conference on Applied Electrical Engineering and Computing Technologies (AEECT), Amman, Jordan, 3–5 December 2013; pp. 1–6.
- (2013) Proceedings of the IEEE Conference on Applied Electrical Engineering and Computing Technologies (AEECT) , pp. 1-6
- Kharbutli, M.¹ Jarrah, M.² Jararweh, Y.³

41
- 84903977516
- Full system simulation framework for integrated CPU/GPU architecture
- Hsinchu, Taiwan, 28–30 April
- Wang, P.H.; Liu, G.H.; Yeh, J.C.; Chen, T.M.; Huang, H.Y.; Yang, C.L.; Liu, S.L.; Greensky, J. Full system simulation framework for integrated CPU/GPU architecture. In Proceedings of the International Symposium on VLSI Design, Automation and Test (VLSI-DAT), Hsinchu, Taiwan, 28–30 April 2014; pp. 1–4.
- (2014) Proceedings of the International Symposium on VLSI Design, Automation and Test (VLSI-DAT) , pp. 1-4
- Wang, P.H.¹ Liu, G.H.² Yeh, J.C.³ Chen, T.M.⁴ Huang, H.Y.⁵ Yang, C.L.⁶ Liu, S.L.⁷ Greensky, J.⁸

42
- 84939814753
- A Survey of CPU-GPU Heterogeneous Computing Techniques
- Mittal, S.; Vetter, J. A Survey of CPU-GPU Heterogeneous Computing Techniques. ACM Comput. Surv. 2015, 47, 69:1–69:35.
- (2015) ACM Comput. Surv , vol.47
- Mittal, S.¹ Vetter, J.S.²

43
- 84884874409
- Adaptive cache bypassing for inclusive last level caches
- Cambridge, MA, USA, 20–24 May
- Gupta, S.; Gao, H.; Zhou, H. Adaptive cache bypassing for inclusive last level caches. In Proceedings of the International Symposium on Parallel & Distributed Processing (IPDPS), Cambridge, MA, USA, 20–24 May 2013; pp. 1243–1253.
- (2013) Proceedings of the International Symposium on Parallel & Distributed Processing (IPDPS) , pp. 1243-1253
- Gupta, S.¹ Gao, H.² Zhou, H.³

44
- 84960841884
- Bypassing method for STT-RAM based inclusive last-level cache
- Prague, Czech Republic, 9–12 October
- Kim, M.K.; Choi, J.H.; Kwak, J.W.; Jhang, S.T.; Jhon, C.S. Bypassing method for STT-RAM based inclusive last-level cache. In Proceedings of the Conference on Research in Adaptive and Convergent Systems, Prague, Czech Republic, 9–12 October 2015; pp. 424–429.
- (2015) Proceedings of the Conference on Research in Adaptive and Convergent Systems , pp. 424-429
- Kim, M.K.¹ Choi, J.H.² Kwak, J.W.³ Jhang, S.T.⁴ Jhon, C.S.⁵

45
- 84867490427
- Introducing hierarchy-awareness in replacement and bypass algorithms for last-level caches
- Minneapolis, MN, USA, 19–23 September
- Chaudhuri, M.; Gaur, J.; Bashyam, N.; Subramoney, S.; Nuzman, J. Introducing hierarchy-awareness in replacement and bypass algorithms for last-level caches. In Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques, Minneapolis, MN, USA, 19–23 September 2012; pp. 293–304.
- (2012) Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques , pp. 293-304
- Chaudhuri, M.¹ Gaur, J.² Bashyam, N.³ Subramoney, S.⁴ Nuzman, J.⁵

46
- 2642585003
- Using cache mapping to improve memory performance of handheld devices
- Austin, TX, USA, 10–12 March
- Xu, R.; Li, Z. Using cache mapping to improve memory performance of handheld devices. In Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS), Austin, TX, USA, 10–12 March 2004; pp. 106–114.
- (2004) Proceedings of the International Symposium on Performance Analysis of Systems and Software (ISPASS) , pp. 106-114
- Xu, R.¹ Li, Z.²

47
- 84957538591
- Locality-Driven Dynamic GPU Cache Bypassing
- Newport Beach, CA, USA, 8–11 June
- Li, C.; Song, S.L.; Dai, H.; Sidelnik, A.; Hari, S.K.S.; Zhou, H. Locality-Driven Dynamic GPU Cache Bypassing. In Proceedings of the International Conference on Supercomputing (ICS), Newport Beach, CA, USA, 8–11 June 2015.
- (2015) Proceedings of the International Conference on Supercomputing (ICS)
- Li, C.¹ Song, S.L.² Dai, H.³ Sidelnik, A.⁴ Hari, S.K.S.⁵ Zhou, H.⁶

48
- 84960075571
- A fully associative, tagless DRAM cache
- Portland, OR, USA, 13–17 June
- Lee, Y.; Kim, J.; Jang, H.; Yang, H.; Kim, J.; Jeong, J.; Lee, J.W. A fully associative, tagless DRAM cache. In Proceedings of the International Symposium on Computer Architecture, Portland, OR, USA, 13–17 June 2015; pp. 211–222.
- (2015) Proceedings of the International Symposium on Computer Architecture , pp. 211-222
- Lee, Y.¹ Kim, J.² Jang, H.³ Yang, H.⁴ Kim, J.⁵ Jeong, J.⁶ Lee, J.W.⁷

49
- 70449729946
- Less reused filter: Improving L2 cache performance via filtering less reused lines
- Yorktown Heights, NY, USA, 8–12 June
- Xiang, L.; Chen, T.; Shi, Q.; Hu, W. Less reused filter: Improving L2 cache performance via filtering less reused lines. In Proceedings of the 23rd International Conference on Supercomputing, Yorktown Heights, NY, USA, 8–12 June 2009; pp. 68–79.
- (2009) Proceedings of the 23rd International Conference on Supercomputing , pp. 68-79
- Xiang, L.¹ Chen, T.² Shi, Q.³ Hu, W.⁴

50
- 66749155879
- Cache bursts: A new approach for eliminating dead blocks and increasing cache efficiency
- Como, Italy, 8–12 November
- Liu, H.; Ferdman, M.; Huh, J.; Burger, D. Cache bursts: A new approach for eliminating dead blocks and increasing cache efficiency. In Proceedings of the International Symposium on Microarchitecture, Como, Italy, 8–12 November 2008; pp. 222–233.
- (2008) Proceedings of the International Symposium on Microarchitecture , pp. 222-233
- Liu, H.¹ Ferdman, M.² Huh, J.³ Burger, D.⁴

51
- 84871174086
- Enhancing LRU replacement via phantom associativity
- New Orleans, LA, USA, 25 February
- Feng, M.; Tian, C.; Gupta, R. Enhancing LRU replacement via phantom associativity. In Proceedings of the 16th Workshop on Interaction between Compilers and Computer Architectures (INTERACT), New Orleans, LA, USA, 25 February 2012; pp. 9–16.
- (2012) Proceedings of the 16th Workshop on Interaction between Compilers and Computer Architectures (INTERACT) , pp. 9-16
- Feng, M.¹ Tian, C.² Gupta, R.³

52
- 84899697182
- Location-aware cache management for many-core processors with deep cache hierarchy
- Denver, CO, USA, 17–22 November
- Park, J.; Yoo, R.M.; Khudia, D.S.; Hughes, C.J.; Kim, D. Location-aware cache management for many-core processors with deep cache hierarchy. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, Denver, CO, USA, 17–22 November 2013; p. 20.
- (2013) Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
- Park, J.¹ Yoo, R.M.² Khudia, D.S.³ Hughes, C.J.⁴ Kim, D.⁵

53
- 84903934670
- Adaptive placement and migration policy for an STT-RAM-based hybrid cache
- Orlando, FL, USA, 15–19 February
- Wang, Z.; Jiménez, D.A.; Xu, C.; Sun, G.; Xie, Y. Adaptive placement and migration policy for an STT-RAM-based hybrid cache. In Proceedings of the 20th International Symposium on High Performance Computer Architecture (HPCA), Orlando, FL, USA, 15–19 February 2014; pp. 13–24.
- (2014) Proceedings of the 20th International Symposium on High Performance Computer Architecture (HPCA) , pp. 13-24
- Wang, Z.¹ Jiménez, D.A.² Xu, C.³ Sun, G.⁴ Xie, Y.⁵

54
- 84862965853
- Global Priority Table for Last-Level Caches
- Sydney, Australia, 12–14 December
- Yu, B.; Ma, J.; Chen, T.; Wu, M. Global Priority Table for Last-Level Caches. In Proceedings of the International Conference on Dependable, Autonomic and Secure Computing (DASC), Sydney, Australia, 12–14 December 2011; pp. 279–285.
- (2011) Proceedings of the International Conference on Dependable, Autonomic and Secure Computing (DASC) , pp. 279-285
- Yu, B.¹ Ma, J.² Chen, T.³ Wu, M.⁴

55
- 84946560845
- SLIP: Reducing wire energy in the memory hierarchy
- Portland, OR, USA, 13–17 June
- Das, S.; Aamodt, T.M.; Dally, W.J. SLIP: Reducing wire energy in the memory hierarchy. In Proceedings of the International Symposium on Computer Architecture, Portland, OR, USA, 13–17 June 2015; pp. 349–361.
- (2015) Proceedings of the International Symposium on Computer Architecture , pp. 349-361
- Das, S.¹ Aamodt, T.M.² Dally, W.J.³

56
- 80455174925
- A dueling segmented LRU replacement algorithm with adaptive bypassing
- Saint-Malo, France, 20 June
- Gao, H.; Wilkerson, C. A dueling segmented LRU replacement algorithm with adaptive bypassing. In Proceedings of the JILP Workshop on Computer Architecture Competitions: Cache Replacement Championship (JWAC), Saint-Malo, France, 20 June 2010.
- (2010) Proceedings of the JILP Workshop on Computer Architecture Competitions: Cache Replacement Championship (JWAC)
- Gao, H.¹ Wilkerson, C.²

57
- 84948958301
- Compiler managed micro-cache bypassing for high performance EPIC processors
- Istanbul, Turkey, 18–22 November
- Wu, Y.; Rakvic, R.; Chen, L.L.; Miao, C.C.; Chrysos, G.; Fang, J. Compiler managed micro-cache bypassing for high performance EPIC processors. In Proceedings of the 35th Annual IEEE International Symposium on Microarchitecture, Istanbul, Turkey, 18–22 November 2002; pp. 134–145.
- (2002) Proceedings of the 35th Annual IEEE International Symposium on Microarchitecture , pp. 134-145
- Wu, Y.¹ Rakvic, R.² Chen, L.L.³ Miao, C.C.⁴ Chrysos, G.⁵ Fang, J.⁶

58
- 84938828873
- Efficient utilization of GPGPU cache hierarchy
- San Francisco, CA, USA, 7 February
- Khairy, M.; Zahran, M.; Wassal, A.G. Efficient utilization of GPGPU cache hierarchy. In Proceedings of the 8th Workshop on General Purpose Processing Using GPUs, San Francisco, CA, USA, 7 February 2015; pp. 36–47.
- (2015) Proceedings of the 8th Workshop on General Purpose Processing Using GPUs , pp. 36-47
- Khairy, M.¹ Zahran, M.² Wassal, A.G.³

59
- 84961750961
- Adaptive cache and concurrency allocation on GPGPUs
- Zheng, Z.; Wang, Z.; Lipasti, M. Adaptive cache and concurrency allocation on GPGPUs. IEEE Comput. Archit. Lett. 2015, 14, 90–93.
- (2015) IEEE Comput. Archit. Lett , vol.14 , pp. 90-93
- Zheng, Z.¹ Wang, Z.² Lipasti, M.³

60
- 85027162994
- Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance
- San Francisco, CA, USA, 18–21 October
- Ausavarungnirun, R.; Ghose, S.; Kayiran, O.; Loh, G.H.; Das, C.R.; Kandemir, M.T.; Mutlu, O. Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance. In Proceedings of the International Conference on Parallel Architecture and Compilation (PACT), San Francisco, CA, USA, 18–21 October 2015.
- (2015) Proceedings of the International Conference on Parallel Architecture and Compilation (PACT)
- Ausavarungnirun, R.¹ Ghose, S.² Kayiran, O.³ Loh, G.H.⁴ Das, C.R.⁵ Kandemir, M.T.⁶ Mutlu, O.⁷

61
- 0029508817
- A modified approach to data cache management
- Ann Arbor, MI, USA, 29 November–1 December
- Tyson, G.; Farrens, M.; Matthews, J.; Pleszkun, A.R. A modified approach to data cache management. In Proceedings of the 28th Annual International Symposium on Microarchitecture, Ann Arbor, MI, USA, 29 November–1 December 1995; pp. 93–103.
- (1995) Proceedings of the 28th Annual International Symposium on Microarchitecture , pp. 93-103
- Tyson, G.¹ Farrens, M.² Matthews, J.³ Pleszkun, A.R.⁴

62
- 84977080696
- A Model-Driven Approach to Warp/Thread-Block Level GPU Cache Bypassing
- Austin, TX, USA, 5–9 June
- Dai, H.; Gupta, S.; Li, C.; Kartsaklis, C.; Mantor, M.; Zhou, H. A Model-Driven Approach to Warp/Thread-Block Level GPU Cache Bypassing. In Proceedings of the Design Automation Conference (DAC), Austin, TX, USA, 5–9 June 2016.
- (2016) Proceedings of the Design Automation Conference (DAC)
- Dai, H.¹ Gupta, S.² Li, C.³ Kartsaklis, C.⁴ Mantor, M.⁵ Zhou, H.⁶

63
- 84858775441
- Reducing off-chip memory traffic by selective cache management scheme in GPGPUs
- London, UK, 3 March
- Choi, H.; Ahn, J.; Sung, W. Reducing off-chip memory traffic by selective cache management scheme in GPGPUs. In Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units, London, UK, 3 March 2012; pp. 110–119.
- (2012) Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units , pp. 110-119
- Choi, H.¹ Ahn, J.² Sung, W.³

64
- 84905105251
- Orchestrating cache management and memory scheduling for GPGPU applications
- Mu, S.; Deng, Y.; Chen, Y.; Li, H.; Pan, J.; Zhang, W.; Wang, Z. Orchestrating cache management and memory scheduling for GPGPU applications. IEEE Trans. Very Large Scale Integr. Syst. 2014, 22, 1803–1814.
- (2014) IEEE Trans. Very Large Scale Integr. Syst , vol.22 , pp. 1803-1814
- Mu, S.¹ Deng, Y.² Chen, Y.³ Li, H.⁴ Pan, J.⁵ Zhang, W.⁶ Wang, Z.⁷

65
- 0030717768
- Run-time adaptive cache hierarchy management via reference analysis
- Denver, CO, USA, 1–4 June
- Johnson, T.L.; Hwu, W.M.W. Run-time adaptive cache hierarchy management via reference analysis. In Proceedings of the International Symposium on Computer Architecture, Denver, CO, USA, 1–4 June 1997; Volume 25, pp. 315–326.
- (1997) Proceedings of the International Symposium on Computer Architecture , vol.25 , pp. 315-326
- Johnson, T.L.¹ Hwu, W.M.W.²

66
- 66749173006
- A novel approach to cache block reuse prediction
- Kaohsiung, Taiwan, 6–9 October
- Jalminger, J.; Stenström, P. A novel approach to cache block reuse prediction. In Proceedings of the 42nd International Conference on Parallel Processing, Kaohsiung, Taiwan, 6–9 October 2003; pp. 294–302.
- (2003) Proceedings of the 42nd International Conference on Parallel Processing , pp. 294-302
- Jalminger, J.¹ Stenström, P.²

67
- 84892456364
- WADE: Writeback-aware dynamic cache management for NVM-based main memory system
- Wang, Z.; Shan, S.; Cao, T.; Gu, J.; Xu, Y.; Mu, S.; Xie, Y.; Jiménez, D.A. WADE: Writeback-aware dynamic cache management for NVM-based main memory system. ACM Trans. Archit. Code Optim. 2013, 10, 51:1–51:21.
- (2013) ACM Trans. Archit. Code Optim , vol.10
- Wang, Z.¹ Shan, S.² Cao, T.³ Gu, J.⁴ Xu, Y.⁵ Mu, S.⁶ Xie, Y.⁷ Jiménez, D.A.⁸

68
- 84957553600
- DaCache: Memory Divergence-Aware GPU Cache Management
- Newport Beach, CA, USA, 8–11 June
- Wang, B.; Yu, W.; Sun, X.H.; Wang, X. DaCache: Memory Divergence-Aware GPU Cache Management. In Proceedings of the 29th International Conference on Supercomputing, Newport Beach, CA, USA, 8–11 June 2015; pp. 89–98.
- (2015) Proceedings of the 29th International Conference on Supercomputing , pp. 89-98
- Wang, B.¹ Yu, W.² Sun, X.H.³ Wang, X.⁴

69
- 84893396474
- An Efficient Compiler Framework for Cache Bypassing on GPUs
- San Jose, CA, USA, 18–21 November
- Liang, Y.; Xie, X.; Sun, G.; Chen, D. An Efficient Compiler Framework for Cache Bypassing on GPUs. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, CA, USA, 18–21 November 2013.
- (2013) Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD)
- Liang, Y.¹ Xie, X.² Sun, G.³ Chen, D.⁴

70
- 34548810162
- Load miss prediction-exploiting power performance trade-offs
- Long Beach, CA, USA, 26–30 March
- Malkowski, K.; Link, G.; Raghavan, P.; Irwin, M.J. Load miss prediction-exploiting power performance trade-offs. In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS), Long Beach, CA, USA, 26–30 March 2007; pp. 1–8.
- (2007) Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS) , pp. 1-8
- Malkowski, K.¹ Link, G.² Raghavan, P.³ Irwin, M.J.⁴

71
- 0029204095
- A Data Cache with Multiple Caching Strategies Tuned to Different Types of Locality
- Barcelona, Spain, 3–7 July
- González, A.; Aliagas, C.; Valero, M. A Data Cache with Multiple Caching Strategies Tuned to Different Types of Locality. In Proceedings of the 9th International Conference on Supercomputing, Barcelona, Spain, 3–7 July 1995; pp. 338–347.
- (1995) Proceedings of the 9th International Conference on Supercomputing , pp. 338-347
- González, A.¹ Aliagas, C.² Valero, M.³

72
- 84961121803
- A Technique For Improving Lifetime of Non-volatile Caches using Write-minimization
- Mittal, S.; Vetter, J. A Technique For Improving Lifetime of Non-volatile Caches using Write-minimization. J. Low Power Electron. Appl. 2016, 6, 1.
- (2016) J. Low Power Electron. Appl , vol.6
- Mittal, S.¹ Vetter, J.A.²

73
- 84965166416
- Chan, K.K.; Hay, C.C.; Keller, J.R.; Kurpanek, G.P.; Schumacher, F.X.; Zheng, J. Design of the HP PA 7200 CPU. HP J. 1996.
- (1996) Design of the HP PA 7200 CPU. HP J
- Chan, K.K.¹ Hay, C.C.² Keller, J.R.³ Kurpanek, G.P.⁴ Schumacher, F.X.⁵ Zheng, J.⁶

74
- 84965101162
- Springer: New York, NY, USA
- Karlsson, M.; Hagersten, E. Timestamp-based selective cache allocation. In High Performance Memory Systems; Springer: New York, NY, USA, 2004; pp. 43–59.
- (2004) Timestamp-Based Selective Cache Allocation. in High Performance Memory Systems , pp. 43-59
- Karlsson, M.¹ Hagersten, E.²

75
- 84960893370
- GREEN Cache: Exploiting the Disciplined Memory Model of OpenCL on GPUs
- Lee, J.; Woo, D.H.; Kim, H.; Azimi, M. GREEN Cache: Exploiting the Disciplined Memory Model of OpenCL on GPUs. IEEE Trans. Comput. 2015, 64, 3167–3180.
- (2015) IEEE Trans. Comput , vol.64 , pp. 3167-3180
- Lee, J.¹ Woo, D.H.² Kim, H.³ Azimi, M.⁴

76
- 79951697650
- Sampling dead block prediction for last-level caches
- Atlanta, GA, USA, 4–8 December
- Khan, S.; Tian, Y.; Jiménez, D. Sampling dead block prediction for last-level caches. In Proceedings of the International Symposium on Microarchitecture (MICRO), Atlanta, GA, USA, 4–8 December 2010; pp. 175–186.
- (2010) Proceedings of the International Symposium on Microarchitecture (MICRO) , pp. 175-186
- Khan, S.¹ Tian, Y.² Jiménez, D.³

77
- 84887456430
- Managing shared last-level cache in a heterogeneous multicore processor
- Edinburgh, UK, 7–11 September
- Mekkat, V.; Holey, A.; Yew, P.C.; Zhai, A. Managing shared last-level cache in a heterogeneous multicore processor. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), Edinburgh, UK, 7–11 September 2013; pp. 225–234.
- (2013) Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT) , pp. 225-234
- Mekkat, V.¹ Holey, A.² Yew, P.C.³ Zhai, A.⁴

78
- 84965128359
- A Survey Of Techniques for Cache Locking
- Mittal, S. A Survey Of Techniques for Cache Locking. ACM Trans. Des. Autom. Electron. Syst. 2016, 21, 49:1–49:24.
- (2016) ACM Trans. Des. Autom. Electron. Syst , vol.21
- Mittal, S.¹

79
- 84965105240
- A Survey of Recent Prefetching Techniques for Processor Caches
- Mittal, S. A Survey of Recent Prefetching Techniques for Processor Caches. ACM Comput. Surv. 2016.
- (2016) ACM Comput. Surv
- Mittal, S.¹

80
- 84905112592
- MASTER: A multicore cache energy saving technique using dynamic cache reconfiguration
- Mittal, S.; Cao, Y.; Zhang, Z. MASTER: A multicore cache energy saving technique using dynamic cache reconfiguration. IEEE Trans. Very Large Scale Integr. Syst. 2014, 22, 1653–1665.
- (2014) IEEE Trans. Very Large Scale Integr. Syst , vol.22 , pp. 1653-1665
- Mittal, S.¹ Cao, Y.² Zhang, Z.³

81
- 4143137693
- Self-correcting LRU replacement policies
- Ischia, Italy, 14–16 April
- Kampe, M.; Stenstrom, P.; Dubois, M. Self-correcting LRU replacement policies. In Proceedings of the 1st Conference on Computing Frontiers, Ischia, Italy, 14–16 April 2004; pp. 181–191.
- (2004) Proceedings of the 1st Conference on Computing Frontiers , pp. 181-191
- Kampe, M.¹ Stenstrom, P.² Dubois, M.³

82
- 84946136069
- Improve LLC Bypassing Performance by Memory Controller Improvements in Heterogeneous Multicore System
- Hong Kong, 9–11 December
- Ma, J.; Meng, J.; Chen, T.; Shi, Q.; Wu, M.; Liu, L. Improve LLC Bypassing Performance by Memory Controller Improvements in Heterogeneous Multicore System. In Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), Hong Kong, 9–11 December 2014; pp. 82–89.
- (2014) Proceedings of the International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT) , pp. 82-89
- Ma, J.¹ Meng, J.² Chen, T.³ Shi, Q.⁴ Wu, M.⁵ Liu, L.⁶

83
- 84982084476
- RACB: Resource Aware Cache Bypass on GPUs
- Paris, France, 22–24 October
- Dai, H.; Kartsaklis, C.; Li, C.; Janjusic, T.; Zhou, H. RACB: Resource Aware Cache Bypass on GPUs. In Proceedings of the International Symposium on Computer Architecture and High Performance Computing Workshop (SBAC-PADW), Paris, France, 22–24 October 2014; pp. 24–29.
- (2014) Proceedings of the International Symposium on Computer Architecture and High Performance Computing Workshop (SBAC-PADW) , pp. 24-29
- Dai, H.¹ Kartsaklis, C.² Li, C.³ Janjusic, T.⁴ Zhou, H.⁵

84
- 84894459157
- Shared Data Caches Conflicts Reduction for WCET Computation in Multi-Core Architectures
- Toulouse, France, 4–5 November
- Lesage, B.; Hardy, D.; Puaut, I. Shared Data Caches Conflicts Reduction for WCET Computation in Multi-Core Architectures. In Proceedings of the 18th International Conference on Real-Time and Network Systems, Toulouse, France, 4–5 November 2010; p. 2283.
- (2010) Proceedings of the 18th International Conference on Real-Time and Network Systems
- Lesage, B.¹ Hardy, D.² Puaut, I.³

85
- 77649302111
- Using bypass to tighten WCET estimates for multi-core processors with shared instruction caches
- Washington, DC, USA, 1–4 December
- Hardy, D.; Piquet, T.; Puaut, I. Using bypass to tighten WCET estimates for multi-core processors with shared instruction caches. In Proceedings of the 34th IEEE Real-Time Systems Symposium (RTSS), Washington, DC, USA, 1–4 December 2009; pp. 68–77.
- (2009) Proceedings of the 34th IEEE Real-Time Systems Symposium (RTSS) , pp. 68-77
- Hardy, D.¹ Piquet, T.² Puaut, I.³

86
- 77954998134
- High performance cache replacement using re-reference interval prediction (RRIP)
- Saint-Malo, France, 19–23 June
- Jaleel, A.; Theobald, K.B.; Steely, S.C., Jr.; Emer, J. High performance cache replacement using re-reference interval prediction (RRIP). In Proceedings of the 37th International Symposium on Computer Architecture, Saint-Malo, France, 19–23 June 2010; pp. 60–71.
- (2010) Proceedings of the 37th International Symposium on Computer Architecture , pp. 60-71
- Jaleel, A.¹ Theobald, K.B.² Steely, S.C.³ Emer, J.⁴

87
- 84965140743
- Intel Corporation. Intel StrongARM SA-1110 Microprocessor Developer’s Manual; Intel Corporation: Santa Clara, CA, USA
- Intel Corporation. Intel StrongARM SA-1110 Microprocessor Developer’s Manual; Intel Corporation: Santa Clara, CA, USA, 2000.
- (2000)

88
- 84893396474
- An efficient compiler framework for cache bypassing on GPUs
- San Jose, CA, USA, 18–21 November
- Xie, X.; Liang, Y.; Sun, G.; Chen, D. An efficient compiler framework for cache bypassing on GPUs. In Proceedings of the International Conference on Computer-Aided Design (ICCAD), San Jose, CA, USA, 18–21 November 2013; pp. 516–523.
- (2013) Proceedings of the International Conference on Computer-Aided Design (ICCAD) , pp. 516-523
- Xie, X.¹ Liang, Y.² Sun, G.³ Chen, D.⁴

89
- 84958770551
- A survey of architectural techniques for managing process variation
- Mittal, S. A survey of architectural techniques for managing process variation. ACM Comput. Surv. 2016, 48, Article No. 54.
- (2016) ACM Comput. Surv
- Mittal, S.¹

90
- 84963984095
- A survey of techniques for approximate computing
- Mittal, S. A survey of techniques for approximate computing. ACM Comput. Surv. 2016, 48, Article No. 62.
- (2016) ACM Comput. Surv
- Mittal, S.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.