SCOPUS Information Retrieval Platform

Proceedings of the International Conference on Supercomputing

Volume , Issue , 2004, Pages 326-335

Cluster prefetch: Tolerating on-chip wire delays in clustered microarchitectures

a Department of Electrical and Computer Engineering (United States)

Author keywords

Clustered microarchitectures; Communication bound processors; Data prefetch; Distributed caches; Effective address and memory dependence prediction

Indexed keywords

CLUSTERED MICROARCHITECTURES; COMMUNICATION-BOUND PROCESSOR; DATA PREFETCH; DISTRIBUTED CACHES;

ALGORITHMS; BUFFER STORAGE; COMPUTATIONAL COMPLEXITY; DATA REDUCTION; ELECTRIC POWER UTILIZATION; MATHEMATICAL MODELS; MICROPROCESSOR CHIPS;

COMPUTER ARCHITECTURE;

EID: 8344258257
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited: 4

References (38)

1
- 0033717865
- Clock rate versus IPC: The end of the road for conventional microarchitectures
- June
- V. Agarwal, M. Hrishikesh, S. Keckler, and D. Burger. Clock Rate versus IPC: The End of the Road for Conventional Microarchitectures. In Proceedings of ISCA-27, pages 248-259, June 2000.
- (2000) Proceedings of ISCA-27 , pp. 248-259
- Agarwal, V.¹ Hrishikesh, M.² Keckler, S.³ Burger, D.⁴

2
- 84962184017
- An empirical study of the scalability aspects of instruction distribution algorithms for clustered processors
- A. Aggarwal and M. Franklin. An Empirical Study of the Scalability Aspects of Instruction Distribution Algorithms for Clustered Processors. In Proceedings of ISPASS, 2001.
- (2001) Proceedings of ISPASS
- Aggarwal, A.¹ Franklin, M.²

3
- 33745778388
- Hierarchical interconnects for on-chip clustering
- April
- A. Aggarwal and M. Franklin. Hierarchical Interconnects for On-Chip Clustering. In Proceedings of IPDPS, April 2002.
- (2002) Proceedings of IPDPS
- Aggarwal, A.¹ Franklin, M.²

4
- 8344231932
- Performance potential of effective address prediction of load instructions
- June
- P. Ahuja, J. Emer, A. Klauser, and S. Mukherjee. Performance Potential of Effective Address Prediction of Load Instructions. In Proceedings of Workshop on Memory Performance Issues (in conjunction with ISCA-28), June 2001.
- (2001) Proceedings of Workshop on Memory Performance Issues (In Conjunction with ISCA-28)
- Ahuja, P.¹ Emer, J.² Klauser, A.³ Mukherjee, S.⁴

5
- 0038346226
- Dynamically managing the communication-parallelism trade-off in future clustered processors
- June
- R. Balasubramonian, S. Dwarkadas, and D. Albonesi. Dynamically Managing the Communication-Parallelism Trade-Off in Future Clustered Processors. In Proceedings of ISCA-30, pages 275-286, June 2003.
- (2003) Proceedings of ISCA-30 , pp. 275-286
- Balasubramonian, R.¹ Dwarkadas, S.² Albonesi, D.³

6
- 0034462014
- Instruction distribution heuristics for quad-cluster, dynamically-scheduled, superscalar processors
- December
- A. Baniasadi and A. Moshovos. Instruction Distribution Heuristics for Quad-Cluster, Dynamically-Scheduled, Superscalar Processors. In Proceedings of MICRO-33, pages 337-347, December 2000.
- (2000) Proceedings of MICRO-33 , pp. 337-347
- Baniasadi, A.¹ Moshovos, A.²

7
- 0032630442
- Maps: A compiler-managed memory system for Raw machines
- May
- R. Barua, W. Lee, S. Amarasinghe, and A. Agarwal. Maps: A Compiler-Managed Memory System for Raw Machines. In Proceedings of ISCA-26, May 1999.
- (1999) Proceedings of ISCA-26
- Barua, R.¹ Lee, W.² Amarasinghe, S.³ Agarwal, A.⁴

8
- 0032686330
- Correlated load-address predictors
- May
- M. Bekerman, S. Jourdan, R. Ronen, G. Kirshenboim, L. Rappoport, A. Yoaz, and U. Weiser. Correlated Load-Address Predictors. In Proceedings of ISCA-26, pages 54-63, May 1999.
- (1999) Proceedings of ISCA-26 , pp. 54-63
- Bekerman, M.¹ Jourdan, S.² Ronen, R.³ Kirshenboim, G.⁴ Rappoport, L.⁵ Yoaz, A.⁶ Weiser, U.⁷

9
- 0031642196
- Load execution latency reduction
- June
- B. Black, B. Mueller, S. Postal, R. Rakvic, N. Utamaphethai, and J. Shen. Load Execution Latency Reduction. In Proceedings of the 12th ICS, June 1998.
- (1998) Proceedings of the 12th ICS
- Black, B.¹ Mueller, B.² Postal, S.³ Rakvic, R.⁴ Utamaphethai, N.⁵ Shen, J.⁶

10
- 0003465202
- The SimpleScalar toolset, version 2.0
- University of Wisconsin-Madison, June
- D. Burger and T. Austin. The SimpleScalar Toolset, Version 2.0. Technical Report TR-97-1342, University of Wisconsin-Madison, June 1997.
- (1997) Technical Report TR-97-1342
- Burger, D.¹ Austin, T.²

11
- 0034581207
- Dynamic cluster assignment mechanisms
- January
- R. Canal, J. M. Parcerisa, and A. Gonzalez. Dynamic Cluster Assignment Mechanisms. In Proceedings of HPCA-6, pages 132-142, January 2000.
- (2000) Proceedings of HPCA-6 , pp. 132-142
- Canal, R.¹ Parcerisa, J.M.² Gonzalez, A.³

12
- 0029308368
- Effective hardware based data prefetching for high performance processors
- May
- T. Chen and J. Baer. Effective Hardware Based Data Prefetching for High Performance Processors. IEEE Transactions on Computers, 44(5):609-623, May 1995.
- (1995) IEEE Transactions on Computers , vol.44 , Issue.5 , pp. 609-623
- Chen, T.¹ Baer, J.²

13
- 0031594025
- Memory dependence prediction using store sets
- June
- G. Chrysos and J. Emer. Memory Dependence Prediction Using Store Sets. In Proceedings of ISCA-25, June 1998.
- (1998) Proceedings of ISCA-25
- Chrysos, G.¹ Emer, J.²

14
- 0031374601
- The multicluster architecture: Reducing cycle time through partitioning
- December
- K. Farkas, P. Chow, N. Jouppi, and Z. Vranesic. The Multicluster Architecture: Reducing Cycle Time through Partitioning. In Proceedings of MICRO-30, pages 149-159, December 1997.
- (1997) Proceedings of MICRO-30 , pp. 149-159
- Farkas, K.¹ Chow, P.² Jouppi, N.³ Vranesic, Z.⁴

15
- 57649085955
- Effective instruction scheduling techniques for an interleaved cache clustered VLIW processor
- November
- E. Gibert, J. Sanchez, and A. Gonzalez. Effective Instruction Scheduling Techniques for an Interleaved Cache Clustered VLIW Processor. In Proceedings of MICRO-35, pages 123-133, November 2002.
- (2002) Proceedings of MICRO-35 , pp. 123-133
- Gibert, E.¹ Sanchez, J.² Gonzalez, A.³

16
- 84944397775
- Flexible compiler-managed L0 buffers for clustered VLIW processors
- December
- E. Gibert, J. Sanchez, and A. Gonzalez. Flexible Compiler-Managed L0 Buffers for Clustered VLIW Processors. In Proceedings of MICRO-36, December 2003.
- (2003) Proceedings of MICRO-36
- Gibert, E.¹ Sanchez, J.² Gonzalez, A.³

17
- 0030721866
- Speculative execution via address prediction and data prefetching
- July
- J. Gonzalez and A. Gonzalez. Speculative Execution via Address Prediction and Data Prefetching. In Proceedings of the 11th ICS, pages 196-203, July 1997.
- (1997) Proceedings of the 11th ICS , pp. 196-203
- Gonzalez, J.¹ Gonzalez, A.²

18
- 0003278283
- The microarchitecture of the Pentium 4 processor
- G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel. The Microarchitecture of the Pentium 4 Processor. Intel Technology Journal, Q1, 2001.
- (2001) Intel Technology Journal , vol.Q1
- Hinton, G.¹ Sager, D.² Upton, M.³ Boggs, D.⁴ Carmean, D.⁵ Kyker, A.⁶ Roussel, P.⁷

19
- 0036396915
- The Imagine stream processor
- September
- U. Kapasi, W. Dally, S. Rixner, J. Owens, and B. Khailany. The Imagine Stream Processor. In Proceedings of ICCD, September 2002.
- (2002) Proceedings of ICCD
- Kapasi, U.¹ Dally, W.² Rixner, S.³ Owens, J.⁴ Khailany, B.⁵

20
- 0026865602
- Processor coupling: Integrating compile time and runtime scheduling for parallelism
- May
- S. Keckler and W. Dally. Processor Coupling: Integrating Compile Time and Runtime Scheduling for Parallelism. In Proceedings of ISCA-19, pages 202-213, May 1992.
- (1992) Proceedings of ISCA-19 , pp. 202-213
- Keckler, S.¹ Dally, W.²

21
- 0032639289
- The Alpha 21264 microprocessor
- March/April
- R. Kessler. The Alpha 21264 Microprocessor. IEEE Micro, 19(2):24-36, March/April 1999.
- (1999) IEEE Micro , vol.19 , Issue.2 , pp. 24-36
- Kessler, R.¹

22
- 0032297487
- The Alpha 21264 microprocessor architecture
- R. Kessler, E. McLellan, and D. Webb. The Alpha 21264 Microprocessor Architecture. In Proceedings of ICCD, 1998.
- (1998) Proceedings of ICCD
- Kessler, R.¹ McLellan, E.² Webb, D.³

23
- 2842554734
- Value locality and load value prediction
- October
- M. Lipasti, C. Wilkerson, and J. Shen. Value Locality and Load Value Prediction. In Proceedings of ASPLOS-VII, pages 138-147, October 1996.
- (1996) Proceedings of ASPLOS-VII , pp. 138-147
- Lipasti, M.¹ Wilkerson, C.² Shen, J.³

24
- 0030717767
- Dynamic speculation and synchronization of data dependences
- May
- A. Moshovos, S. Breach, T. Vijaykumar, and G. Sohi. Dynamic Speculation and Synchronization of Data Dependences. In Proceedings of ISCA-24, May 1997.
- (1997) Proceedings of ISCA-24
- Moshovos, A.¹ Breach, S.² Vijaykumar, T.³ Sohi, G.⁴

25
- 0035693945
- A design space evaluation of grid processor architectures
- December
- R. Nagarajan, K. Sankaralingam, D. Burger, and S. Keckler. A Design Space Evaluation of Grid Processor Architectures. In Proceedings of MICRO-34, pages 40-51, December 2001.
- (2001) Proceedings of MICRO-34 , pp. 40-51
- Nagarajan, R.¹ Sankaralingam, K.² Burger, D.³ Keckler, S.⁴

26
- 0002432406
- The case for a single-chip multiprocessor
- October
- K. Olukotun, B. Nayfeh, L. Hammond, K. Wilson, and K.-Y. Chang. The Case for a Single-Chip Multiprocessor. In Proceedings of ASPLOS-VII, October 1996.
- (1996) Proceedings of ASPLOS-VII
- Olukotun, K.¹ Nayfeh, B.² Hammond, L.³ Wilson, K.⁴ Chang, K.-Y.⁵

27
- 0030676681
- Complexity-effective superscalar processors
- June
- S. Palacharla, N. Jouppi, and J. Smith. Complexity-Effective Superscalar Processors. In Proceedings of ISCA-24, pages 206-218, June 1997.
- (1997) Proceedings of ISCA-24 , pp. 206-218
- Palacharla, S.¹ Jouppi, N.² Smith, J.³

28
- 0034461108
- Reducing wire delay penalty through value prediction
- December
- J.-M. Parcerisa and A. Gonzalez. Reducing Wire Delay Penalty through Value Prediction. In Proceedings of MICRO-33, pages 317-326, December 2000.
- (2000) Proceedings of MICRO-33 , pp. 317-326
- Parcerisa, J.-M.¹ Gonzalez, A.²

29
- 84948777317
- Efficient interconnects for clustered microarchitectures
- September
- J.-M. Parcerisa, J. Sahuquillo, A. Gonzalez, and J. Duato. Efficient Interconnects for Clustered Microarchitectures. In Proceedings of PACT, September 2002.
- (2002) Proceedings of PACT
- Parcerisa, J.-M.¹ Sahuquillo, J.² Gonzalez, A.³ Duato, J.⁴

30
- 1142280992
- Partitioned first-level cache design for clustered microarchitectures
- June
- P. Racunas and Y. Patt. Partitioned First-Level Cache Design for Clustered Microarchitectures. In Proceedings of ICS-17, June 2003.
- (2003) Proceedings of ICS-17
- Racunas, P.¹ Patt, Y.²

31
- 0031605773
- An empirical study of decentralized ILP execution models
- October
- N. Ranganathan and M. Franklin. An Empirical Study of Decentralized ILP Execution Models. In Proceedings of ASPLOS-VIII, pages 272-281, October 1998.
- (1998) Proceedings of ASPLOS-VIII , pp. 272-281
- Ranganathan, N.¹ Franklin, M.²

32
- 0032315195
- Predictive techniques for aggressive load speculation
- December
- G. Reinman and B. Calder. Predictive Techniques for Aggressive Load Speculation. In Proceedings of MICRO-31, December 1998.
- (1998) Proceedings of MICRO-31
- Reinman, G.¹ Calder, B.²

33
- 0034459218
- Modulo scheduling for a fully-distributed clustered VLIW architecture
- December
- J. Sanchez and A. Gonzalez. Modulo Scheduling for a Fully-Distributed Clustered VLIW Architecture. In Proceedings of MICRO-33, pages 124-133, December 2000.
- (2000) Proceedings of MICRO-33 , pp. 124-133
- Sanchez, J.¹ Gonzalez, A.²

34
- 0030409867
- The performance potential of data dependence speculation and collapsing
- December
- Y. Sazeides, S. Vassiliadis, and J. Smith. The Performance Potential of Data Dependence Speculation and Collapsing. In Proceedings of MICRO-29, pages 238-247, December 1996.
- (1996) Proceedings of MICRO-29 , pp. 238-247
- Sazeides, Y.¹ Vassiliadis, S.² Smith, J.³

35
- 0003450887
- CACTI 3.0: An integrated cache timing, power, and area model
- Compaq Western Research Laboratory, August
- P. Shivakumar and N. P. Jouppi. CACTI 3.0: An Integrated Cache Timing, Power, and Area Model. Technical Report TN-2001/2, Compaq Western Research Laboratory, August 2001.
- (2001) Technical Report TN-2001/2
- Shivakumar, P.¹ Jouppi, N.P.²

36
- 0003535436
- Power4 system microarchitecture
- Technical White Paper, IBM, October
- J. Tendler, S. Dodson, S. Fields, H. Le, and B. Sinharoy. Power4 System Microarchitecture. Technical report, Technical White Paper, IBM, October 2001.
- (2001) Technical Report
- Tendler, J.¹ Dodson, S.² Fields, S.³ Le, H.⁴ Sinharoy, B.⁵

37
- 0034817930
- Dynamic prediction of critical path instructions
- January
- E. Tune, D. Liang, D. Tullsen, and B. Calder. Dynamic Prediction of Critical Path Instructions. In Proceedings of HPCA-7, pages 185-196, January 2001.
- (2001) Proceedings of HPCA-7 , pp. 185-196
- Tune, E.¹ Liang, D.² Tullsen, D.³ Calder, B.⁴

38
- 0035273395
- Inherently lower-power high-performance superscalar architectures
- March
- V. Zyuban and P. Kogge. Inherently Lower-Power High-Performance Superscalar Architectures. IEEE Transactions on Computers, March 2001.
- (2001) IEEE Transactions on Computers
- Zyuban, V.¹ Kogge, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.