SCOPUS Information Retrieval Platform

Proceedings - International Symposium on Computer Architecture

Volume , Issue , 2012, Pages 368-379

A case for exploiting subarray-level parallelism (SALP) in DRAM

(5) Kim, Yoongu (a); Seshadri, Vivek (a); Lee, Donghyuk (a); Liu, Jamie (a); Mutlu, Onur (a)

(a) CARNEGIE MELLON UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ACCESS LATENCY; AREA OVERHEAD; LOW COST APPROACH; MULTI-CORE SYSTEMS; NEW MECHANISMS; OFF-CHIP MEMORIES; REQUEST SCHEDULING; SUB-ARRAYS; SYSTEM COSTS; TIMING PARAMETERS;

COMPUTER ARCHITECTURE; MICROPROCESSOR CHIPS;

DYNAMIC RANDOM ACCESS STORAGE;

EID: 84864850807 PISSN: 10636897 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ISCA.2012.6237032 Document Type: Conference Paper

Times cited : (353)

References (71)

1
- 84864837658
- Multicore DIMM: An energy efficient memory module with independently controlled DRAMs
- Jan.
- J. H. Ahn et al. Multicore DIMM: An energy efficient memory module with independently controlled DRAMs. IEEE CAL, Jan. 2009.
- (2009) IEEE CAL
- Ahn, J.H.¹

2
- 84864831816
- Improving system energy efficiency with memory rank subsetting
- Mar.
- J. H. Ahn et al. Improving system energy efficiency with memory rank subsetting. ACM TACO, Mar. 2012.
- (2012) ACM TACO
- Ahn, J.H.¹

3
- 84860350704
- Staged reads: Mitigating the impact of DRAM writes on DRAM reads
- N. Chatterjee et al. Staged reads: Mitigating the impact of DRAM writes on DRAM reads. In HPCA, 2012.
- (2012) HPCA
- Chatterjee, N.¹

4
- 4644226058
- Microarchitecture optimizations for exploiting memory-level parallelism
- Y. Chou et al. Microarchitecture optimizations for exploiting memory-level parallelism. In ISCA, 2004.
- (2004) ISCA
- Chou, Y.¹

5
- 0030662863
- Improving data cache performance by pre-executing instructions under a cache miss
- J. Dundas and T. Mudge. Improving data cache performance by pre-executing instructions under a cache miss. In ICS, 1997.
- (1997) ICS
- Dundas, J.¹ Mudge, T.²

6
- 84863348772
- Parallel application memory scheduling
- E. Ebrahimi et al. Parallel application memory scheduling. In MICRO, 2011.
- (2011) MICRO
- Ebrahimi, E.¹

7
- 84864847006
- Enhanced SDRAM SM2604
- Enhanced Memory Systems. Enhanced SDRAM SM2604, 2002.
- (2002) Enhanced Memory Systems

8
- 68949189534
- Improvement potential and equalization example for multidrop DRAM memory buses
- H. Fredriksson and C. Svensson. Improvement potential and equalization example for multidrop DRAM memory buses. IEEE Transactions on Advanced Packaging, 2009.
- (2009) IEEE Transactions on Advanced Packaging
- Fredriksson, H.¹ Svensson, C.²

9
- 34547653935
- Fully-buffered DIMM memory architectures: Understanding mechanisms, overheads and scaling
- B. Ganesh et al. Fully-buffered DIMM memory architectures: Understanding mechanisms, overheads and scaling. In HPCA, 2007.
- (2007) HPCA
- Ganesh, B.¹

10
- 0003997750
- CDRAM in a unified memory architecture
- C. A. Hart. CDRAM in a unified memory architecture. In Compcon, 1994.
- (1994) Compcon
- Hart, C.A.¹

11
- 0025419834
- The cache DRAM architecture: A DRAM with an on-chip cache memory
- Mar.
- H. Hidaka et al. The cache DRAM architecture: A DRAM with an on-chip cache memory. IEEE Micro, Mar. 1990.
- (1990) IEEE Micro
- Hidaka, H.¹

12
- 84864860761
- HPCC. RandomAccess. http://icl.cs.utk.edu/hpcc/.
- RandomAccess

13
- 0027191655
- Performance of cached DRAM organizations in vector supercomputers
- W.-C. Hsu and J. E. Smith. Performance of cached DRAM organizations in vector supercomputers. In ISCA, 1993.
- (1993) ISCA
- Hsu, W.-C.¹ Smith, J.E.²

14
- 84855901019
- Intel. 2nd Gen.
- Intel. 2nd Gen. Intel Core Processor Family Desktop Datasheet, 2011.
- (2011) Intel Core Processor Family Desktop Datasheet

15
- 84864848542
- Intel
- Intel. Intel Core Desktop Processor Series Datasheet, 2011.
- (2011) Intel Core Desktop Processor Series Datasheet

16
- 52649148744
- Self optimizing memory controllers: A reinforcement learning approach
- E. Ipek et al. Self optimizing memory controllers: A reinforcement learning approach. In ISCA, 2008.
- (2008) ISCA
- Ipek, E.¹

17
- 0004000428
- Springer
- K. Itoh. VLSI Memory Chip Design. Springer, 2001.
- (2001) VLSI Memory Chip Design
- Itoh, K.¹

18
- 78650934251
- JEDEC. Standard No 79-3E
- JEDEC. Standard No. 79-3E. DDR3 SDRAM Specification, 2010.
- (2010) DDR3 SDRAM Specification

19
- 84864833348
- JEDEC. Standard No 21-C
- JEDEC. Standard No. 21-C. Annex K: Serial Presence Detect (SPD) for DDR3 SDRAM Modules, 2011.
- (2011) Annex K: Serial Presence Detect (SPD) for DDR3 SDRAM Modules

20
- 0003985543
- CS-1997-03, Duke
- G. Kedem and R. P. Koganti. WCDRAM: A fully associative integrated cached-DRAM with wide cache lines. CS-1997-03, Duke, 1997.
- (1997) WCDRAM: A Fully Associative Integrated Cached-DRAM with Wide Cache Lines
- Kedem, G.¹ Koganti, R.P.²

21
- 84864847003
- DRAM circuit design
- Wiley-IEEE Press
- B. Keeth et al. DRAM Circuit Design. Fundamental and High-Speed Topics. Wiley-IEEE Press, 2007.
- (2007) Fundamental and High-Speed Topics.
- Keeth, B.¹

22
- 70349280616
- 75nm 7Gb/s/pin 1Gb GDDR5 graphics memory device with bandwidth-improvement techniques
- R. Kho et al. 75nm 7Gb/s/pin 1Gb GDDR5 graphics memory device with bandwidth-improvement techniques. In ISSCC, 2009.
- (2009) ISSCC
- Kho, R.¹

23
- 77952558442
- ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers
- Y. Kim et al. ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers. In HPCA, 2010.
- (2010) HPCA
- Kim, Y.¹

24
- 79951718838
- Thread cluster memory scheduling: Exploiting differences in memory access behavior
- Y. Kim et al. Thread cluster memory scheduling: Exploiting differences in memory access behavior. In MICRO, 2010.
- (2010) MICRO
- Kim, Y.¹

25
- 84864831812
- Latched row decoder for a random access memory
- U.S. patent number 5615164
- T. Kirihata. Latched row decoder for a random access memory. U.S. patent number 5615164, 1997.
- (1997)
- Kirihata, T.¹

26
- 84864847008
- Conditional-capture flip-flop for statistical power reduction
- B.-S. Kong et al. Conditional-capture flip-flop for statistical power reduction. IEEE JSSC, 2001.
- (2001) IEEE JSSC
- Kong, B.-S.¹

27
- 84904279959
- Lockup-free instruction fetch/prefetch cache organization
- D. Kroft. Lockup-free instruction fetch/prefetch cache organization. In ISCA, 1981.
- (1981) ISCA
- Kroft, D.¹

28
- 70450235471
- Architecting phase change memory as a scalable DRAM alternative
- B. C. Lee et al. Architecting phase change memory as a scalable DRAM alternative. In ISCA, 2009.
- (2009) ISCA
- Lee, B.C.¹

29
- 84860332549
- DRAM-aware last-level cache writeback: Reducing write-caused interference in memory systems
- UT Austin
- C. J. Lee et al. DRAM-aware last-level cache writeback: Reducing write-caused interference in memory systems. TR-HPS-2010-002, UT Austin, 2010.
- (2010) TR-HPS-2010-002
- Lee, C.J.¹

30
- 31944440969
- Pin: Building customized program analysis tools with dynamic instrumentation
- C.-K. Luk et al. Pin: Building customized program analysis tools with dynamic instrumentation. In PLDI, 2005.
- (2005) PLDI
- Luk, C.-K.¹

31
- 77954148891
- Micron.
- Micron. DDR3 SDRAM System-Power Calculator, 2010.
- (2010) DDR3 SDRAM System-Power Calculator

32
- 84864847007
- Micron
- Micron. 2Gb: x16, x32 Mobile LPDDR2 SDRAM, 2012.
- (2012) 2Gb: x16, x32 Mobile LPDDR2 SDRAM

33
- 84864837344
- Micron
- Micron. 2Gb: x4, x8, x16, DDR3 SDRAM, 2012.
- (2012) 2Gb: x4, x8, x16, DDR3 SDRAM

34
- 84864833352
- Micron
- Micron. DDR3 SDRAM Verilog Model, 2012.
- (2012) DDR3 SDRAM Verilog Model

35
- 84855258577
- Bandwidth engine serial memory chip breaks 2 billion accesses/sec
- M. J. Miller. Bandwidth engine serial memory chip breaks 2 billion accesses/sec. In HotChips, 2011.
- (2011) HotChips
- Miller, M.J.¹

36
- 70349280617
- 1.2V 1.6Gb/s 56nm 6F² 4Gb DDR3 SDRAM with hybrid-I/O sense amplifier and segmented sub-array architecture
- Y. Moon et al. 1.2V 1.6Gb/s 56nm 6F² 4Gb DDR3 SDRAM with hybrid-I/O sense amplifier and segmented sub-array architecture. In ISSCC, 2009.
- (2009) ISSCC
- Moon, Y.¹

37
- 52649128991
- Memory performance attacks: Denial of memory service in multi-core systems
- T. Moscibroda and O. Mutlu. Memory performance attacks: Denial of memory service in multi-core systems. In USENIX SS, 2007.
- (2007) USENIX SS
- Moscibroda, T.¹ Mutlu, O.²

38
- 84858771269
- Reducing memory interference in multicore systems via application-aware memory channel partitioning
- S. P. Muralidhara et al. Reducing memory interference in multicore systems via application-aware memory channel partitioning. In MICRO, 2011.
- (2011) MICRO
- Muralidhara, S.P.¹

39
- 84955506994
- Runahead execution: An alternative to very large instruction windows for out-of-order processors
- O. Mutlu et al. Runahead execution: An alternative to very large instruction windows for out-of-order processors. In HPCA, 2003.
- (2003) HPCA
- Mutlu, O.¹

40
- 47349122373
- Stall-time fair memory access scheduling for chip multiprocessors
- O. Mutlu and T. Moscibroda. Stall-time fair memory access scheduling for chip multiprocessors. In MICRO, 2007.
- (2007) MICRO
- Mutlu, O.¹ Moscibroda, T.²

41
- 52649119398
- Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems
- O. Mutlu and T. Moscibroda. Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems. In ISCA, 2008.
- (2008) ISCA
- Mutlu, O.¹ Moscibroda, T.²

42
- 84864834197
- NEC
- NEC. Virtual Channel SDRAM uPD4565421, 1999.
- (1999) Virtual Channel SDRAM uPD4565421

43
- 34548050337
- Fair queuing memory systems
- K. J. Nesbit et al. Fair queuing memory systems. In MICRO, 2006.
- (2006) MICRO
- Nesbit, K.J.¹

44
- 84864860561
- Semiconductor memory having a bank with sub-banks
- U.S. patent number 7782703
- J.-h. Oh. Semiconductor memory having a bank with sub-banks. U.S. patent number 7782703, 2010.
- (2010)
- Oh, J.-H.¹

45
- 33845874613
- A case for MLP-aware cache replacement
- M. K. Qureshi et al. A case for MLP-aware cache replacement. In ISCA, 2006.
- (2006) ISCA
- Qureshi, M.K.¹

46
- 84864831814
- Rambus
- Rambus. DRAM Power Model, 2010.
- (2010) DRAM Power Model

47
- 0033691565
- Memory access scheduling
- S. Rixner et al. Memory access scheduling. In ISCA, 2000.
- (2000) ISCA
- Rixner, S.¹

48
- 84864831813
- DRAMSim2: A cycle accurate memory system simulator
- Jan.
- P. Rosenfeld et al. DRAMSim2: A cycle accurate memory system simulator. IEEE CAL, Jan. 2011.
- (2011) IEEE CAL
- Rosenfeld, P.¹

49
- 84864831815
- U.S. patent number 5887272
- R. H. Sartore et al. Enhanced DRAM with embedded registers. U.S. patent number 5887272, 1999.
- (1999) Enhanced DRAM with Embedded Registers
- Sartore, R.H.¹

50
- 0031641453
- Fast Cycle RAM (FCRAM); A 20-ns random row access, pipe-lined operating DRAM
- Y. Sato et al. Fast Cycle RAM (FCRAM); a 20-ns random row access, pipe-lined operating DRAM. In Symposium on VLSI Circuits, 1998.
- (1998) Symposium on VLSI Circuits
- Sato, Y.¹

51
- 81255177633
- IBM POWER7 multicore server processor
- May
- B. Sinharoy et al. IBM POWER7 multicore server processor. IBM Journal Res. Dev., May 2011.
- (2011) IBM Journal Res. Dev.
- Sinharoy, B.¹

52
- 0018282603
- A pipelined shared resource MIMD computer
- B. J. Smith. A pipelined shared resource MIMD computer. In ICPP, 1978.
- (1978) ICPP
- Smith, B.J.¹

53
- 0034443570
- Symbiotic jobscheduling for a simultaneous multithreaded processor
- A. Snavely and D. M. Tullsen. Symbiotic jobscheduling for a simultaneous multithreaded processor. In ASPLOS, 2000.
- (2000) ASPLOS
- Snavely, A.¹ Tullsen, D.M.²

54
- 84864847009
- STREAM Benchmark. http://www.streambench.org/.

55
- 77954992165
- The virtual write queue: Coordinating DRAM and last-level cache policies
- J. Stuecheli et al. The virtual write queue: Coordinating DRAM and last-level cache policies. In ISCA, 2010.
- (2010) ISCA
- Stuecheli, J.¹

56
- 77952283542
- Micro-pages: Increasing DRAM efficiency with locality-aware data placement
- K. Sudan et al. Micro-pages: Increasing DRAM efficiency with locality-aware data placement. In ASPLOS, 2010.
- (2010) ASPLOS
- Sudan, K.¹

57
- 37449003277
- Sun Microsystems
- Sun Microsystems. OpenSPARC T1 microarch. specification, 2006.
- (2006) OpenSPARC T1 Microarch. Specification

58
- 84863352139
- Parallel operation in the Control Data 6600
- J. E. Thornton. Parallel operation in the Control Data 6600. In AFIPS, 1965.
- (1965) AFIPS
- Thornton, J.E.¹

59
- 52649139073
- A comprehensive memory modeling tool and its application to the design and analysis of future memory hierarchies
- S. Thoziyoor et al. A comprehensive memory modeling tool and its application to the design and analysis of future memory hierarchies. In ISCA, 2008.
- (2008) ISCA
- Thoziyoor, S.¹

60
- 0003081830
- An efficient algorithm for exploiting multiple arithmetic units
- Jan.
- R. M. Tomasulo. An efficient algorithm for exploiting multiple arithmetic units. IBM Journal Res. Dev., Jan. 1967.
- (1967) IBM Journal Res. Dev.
- Tomasulo, R.M.¹

61
- 84864847011
- TPC. http://www.tpc.org/.

62
- 77954989143
- Rethinking DRAM design and organization for energy-constrained multi-cores
- A. N. Udipi et al. Rethinking DRAM design and organization for energy-constrained multi-cores. In ISCA, 2010.
- (2010) ISCA
- Udipi, A.N.¹

63
- 79951702954
- Understanding the energy consumption of dynamic random access memories
- T. Vogelsang. Understanding the energy consumption of dynamic random access memories. In MICRO, 2010.
- (2010) MICRO
- Vogelsang, T.¹

64
- 49749122679
- Improving power and data efficiency with threaded memory modules
- F. Ware and C. Hampel. Improving power and data efficiency with threaded memory modules. In ICCD, 2006.
- (2006) ICCD
- Ware, F.¹ Hampel, C.²

65
- 0006997407
- CSE-97-03-04, UW
- W. A. Wong and J.-L. Baer. DRAM caching. CSE-97-03-04, UW, 1997.
- (1997) DRAM Caching
- Wong, W.A.¹ Baer, J.-L.²

66
- 0031363421
- The hierarchical multi-bank DRAM: A high-performance architecture for memory integrated with processors
- T. Yamauchi et al. The hierarchical multi-bank DRAM: A high-performance architecture for memory integrated with processors. In Advanced Research in VLSI, 1997.
- (1997) Advanced Research in VLSI
- Yamauchi, T.¹

67
- 76749123978
- Complexity effective memory access scheduling for many-core accelerator architectures
- G. L. Yuan et al. Complexity effective memory access scheduling for many-core accelerator architectures. In MICRO, 2009.
- (2009) MICRO
- Yuan, G.L.¹

68
- 0034460897
- A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality
- Z. Zhang et al. A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality. In MICRO, 2000.
- (2000) MICRO
- Zhang, Z.¹

69
- 0035389657
- Cached DRAM for ILP processor memory access latency reduction
- Jul.
- Z. Zhang et al. Cached DRAM for ILP processor memory access latency reduction. IEEE Micro, Jul. 2001.
- (2001) IEEE Micro
- Zhang, Z.¹

70
- 66749162556
- Mini-rank: Adaptive DRAM architecture for improving memory power efficiency
- H. Zheng et al. Mini-rank: Adaptive DRAM architecture for improving memory power efficiency. In MICRO, 2008.
- (2008) MICRO
- Zheng, H.¹

71
- 52649113530
- Controller for a synchronous DRAM that maximizes throughput by allowing memory requests and commands to be issued out of order
- U.S. patent number 5630096
- W. K. Zuravleff and T. Robinson. Controller for a synchronous DRAM that maximizes throughput by allowing memory requests and commands to be issued out of order. U.S. patent number 5630096, 1997.
- (1997)
- Zuravleff, W.K.¹ Robinson, T.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.