SCOPUS Information Search Platform

Proceedings - International Symposium on Computer Architecture

Volume , Issue , 2014, Pages 97-108

Aladdin: A pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures

(4) Shao, Yakun Sophia (a); Reagen, Brandon (a); Wei, Gu Yeon (a); Brooks, David (a)

(a) HARVARD UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; APPLICATION SPECIFIC INTEGRATED CIRCUITS; COMPUTER ARCHITECTURE; SYSTEM-ON-CHIP;

ACCURATE TIMING; DATA PATHS; ITS APPLICATIONS; LARGE DESIGNS; MEMORY HIERARCHY; MODEL FRAMEWORK; SYSTEM-ON-CHIP SIMULATIONS; TRADITIONAL ARCHITECTURE;

ACCELERATION;

EID: 84905487457
PISSN: 10636897
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ISCA.2014.6853196
Document Type: Conference Paper

Times cited : (265)

References (56)

1
- 0004245602
- "The international technology roadmap for semiconductors (itrs), system drivers," 2007, http://www.itrs.net/.
- (2007) The International Technology Roadmap for Semiconductors (Itrs), System Drivers

2
- 84879875776
- "Xilinx vivado high-level synthesis," http://www.xilinx.com/products/design-tools/vivado/.
- Xilinx Vivado High-level Synthesis

3
- 0036469652
- Simplescalar: An infrastructure for computer system modeling
- T. M. Austin, E. Larson, and D. Ernst, "Simplescalar: An infrastructure for computer system modeling," IEEE Computer, 2002.
- (2002) IEEE Computer
- Austin, T.M.¹ Larson, E.² Ernst, D.³

4
- 0026867085
- Dynamic dependency analysis of ordinary programs
- T. M. Austin and G. S. Sohi, "Dynamic dependency analysis of ordinary programs," in ISCA, 1992.
- (1992) ISCA
- Austin, T.M.¹ Sohi, G.S.²

5
- 70349169075
- Analyzing cuda workloads using a detailed gpu simulator
- A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt, "Analyzing cuda workloads using a detailed gpu simulator," in ISPASS, 2009.
- (2009) ISPASS
- Bakhoda, A.¹ Yuan, G.L.² Fung, W.W.L.³ Wong, H.⁴ Aamodt, T.M.⁵

6
- 84881175680
- Continuous real-world inputs can open up alternative accelerator designs
- B. Belhadj, A. Joubert, Z. Li, R. Héliot, and O. Temam, "Continuous real-world inputs can open up alternative accelerator designs," in ISCA, 2013.
- (2013) ISCA
- Belhadj, B.¹ Joubert, A.² Li, Z.³ Héliot, R.⁴ Temam, O.⁵

7
- 84859464490
- The gem5 simulator
- N. L. Binkert, B. M. Beckmann, G. Black, S. K. Reinhardt, A. G. Saidi, A. Basu, J. Hestness, D. Hower, T. Krishna, S. Sardashti, R. Sen, K. Sewell, M. Shoaib, N. Vaish, M. D. Hill, and D. A. Wood, "The gem5 simulator," SIGARCH Computer Architecture News, 2011.
- (2011) SIGARCH Computer Architecture News
- Binkert, N.L.¹ Beckmann, B.M.² Black, G.³ Reinhardt, S.K.⁴ Saidi, A.G.⁵ Basu, A.⁶ Hestness, J.⁷ Hower, D.⁸ Krishna, T.⁹ Sardashti, S.¹⁰ Sen, R.¹¹ Sewell, K.¹² Shoaib, M.¹³ Vaish, N.¹⁴ Hill, M.D.¹⁵ Wood, D.A.¹⁶

8
- 0033719421
- Wattch: A framework for architectural-level power analysis and optimizations
- D. Brooks, V. Tiwari, and M. Martonosi, "Wattch: A framework for architectural-level power analysis and optimizations," in ISCA, 2000.
- (2000) ISCA
- Brooks, D.¹ Tiwari, V.² Martonosi, M.³

9
- 0029666646
- Memory bandwidth limitations of future microprocessors
- D. Burger, J. R. Goodman, and A. Kagi, "Memory bandwidth limitations of future microprocessors," in ISCA, 1996.
- (1996) ISCA
- Burger, D.¹ Goodman, J.R.² Kagi, A.³

10
- 76949106140
- A highly flexible, parallel virtual machine: Design and experience of ildjit
- S. Campanoni, G. Agosta, S. Crespi-Reghizzi, and A. D. Biagio, "A highly flexible, parallel virtual machine: Design and experience of ildjit," Software: Practice and Experience, 2010.
- (2010) Software: Practice and Experience
- Campanoni, S.¹ Agosta, G.² Crespi-Reghizzi, S.³ Biagio, A.D.⁴

11
- 84874530623
- An fpga memcached appliance
- S. R. Chalamalasetti, K. Lim, M. Wright, A. AuYoung, P. Ranganathan, and M. Margala, "An fpga memcached appliance," in FPGA, 2013.
- (2013) FPGA
- Chalamalasetti, S.R.¹ Lim, K.² Wright, M.³ Auyoung, A.⁴ Ranganathan, P.⁵ Margala, M.⁶

12
- 84881142714
- Linqits: Big data on little clients
- E. S. Chung, J. D. Davis, and J. Lee, "Linqits: big data on little clients," ISCA, 2013.
- (2013) ISCA
- Chung, E.S.¹ Davis, J.D.² Lee, J.³

13
- 79951696448
- Single-chip heterogeneous computing: Does the future include custom logic, fpgas, and gpgpus?
- E. S. Chung, P. A. Milder, J. C. Hoe, and K. Mai, "Single-chip heterogeneous computing: Does the future include custom logic, fpgas, and gpgpus?" in MICRO, 2010.
- (2010) MICRO
- Chung, E.S.¹ Milder, P.A.² Hoe, J.C.³ Mai, K.⁴

14
- 52649095061
- Veal: Virtualized execution accelerator for loops
- N. Clark, A. Hormati, and S. A. Mahlke, "Veal: Virtualized execution accelerator for loops," in ISCA, 2008.
- (2008) ISCA
- Clark, N.¹ Hormati, A.² Mahlke, S.A.³

15
- 2442428419
- Application-specific instruction generation for configurable processor architectures
- J. Cong, Y. Fan, G. Han, and Z. Zhang, "Application-specific instruction generation for configurable processor architectures," in FPGA, 2004.
- (2004) FPGA
- Cong, J.¹ Fan, Y.² Han, G.³ Zhang, Z.⁴

16
- 67650692183
- Synthesis of reconfigurable high-performance multicore systems
- J. Cong, K. Gururaj, and G. Han, "Synthesis of reconfigurable high-performance multicore systems," in FPGA, 2009.
- (2009) FPGA
- Cong, J.¹ Gururaj, K.² Han, G.³

17
- 77952273045
- The scalable heterogeneous computing (shoc) benchmark suite
- A. Danalis, G. Marin, C. McCurdy, J. S. Meredith, P. C. Roth, K. Spafford, V. Tipparaju, and J. S. Vetter, "The scalable heterogeneous computing (shoc) benchmark suite," in Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, 2010.
- (2010) Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units
- Danalis, A.¹ Marin, G.² McCurdy, C.³ Meredith, J.S.⁴ Roth, P.C.⁵ Spafford, K.⁶ Tipparaju, V.⁷ Vetter, J.S.⁸

18
- 84861950149
- Dark silicon and the end of multicore scaling
- H. Esmaeilzadeh, E. Blem, R. St. Amant, K. Sankaralingam, and D. Burger, "Dark silicon and the end of multicore scaling," Micro, IEEE, 2012.
- (2012) Micro IEEE
- Esmaeilzadeh, H.¹ Blem, E.² St. Amant, R.³ Sankaralingam, K.⁴ Burger, D.⁵

19
- 84876591853
- Neural acceleration for general-purpose approximate programs
- H. Esmaeilzadeh, A. Sampson, L. Ceze, and D. Burger, "Neural acceleration for general-purpose approximate programs," in MICRO, 2012.
- (2012) MICRO
- Esmaeilzadeh, H.¹ Sampson, A.² Ceze, L.³ Burger, D.⁴

20
- 80052679438
- Buffer-integrated-cache: A cost-effective sram architecture for handheld and embedded platforms
- C. F. Fajardo, Z. Fang, R. Iyer, G. F. Garcia, S. E. Lee, and L. Zhao, "Buffer-integrated-cache: A cost-effective sram architecture for handheld and embedded platforms," in DAC, 2011.
- (2011) DAC
- Fajardo, C.F.¹ Fang, Z.² Iyer, R.³ Garcia, G.F.⁴ Lee, S.E.⁵ Zhao, L.⁶

21
- 84857306124
- The program dependence graph and its use in optimization
- J. Ferrante, K. J. Ottenstein, and J. D. Warren, "The program dependence graph and its use in optimization," in Symposium on Programming, 1984.
- (1984) Symposium on Programming
- Ferrante, J.¹ Ottenstein, K.J.² Warren, J.D.³

22
- 0036296821
- Slack: Maximizing performance under technological constraints
- B. A. Fields, R. Bodík, and M. D. Hill, "Slack: Maximizing performance under technological constraints," in ISCA, 2002.
- (2002) ISCA
- Fields, B.A.¹ Bodík, R.² Hill, M.D.³

23
- 79952656083
- M. Fingeroff, High-Level Synthesis Blue Book, 2010.
- (2010) High-Level Synthesis Blue Book
- Fingeroff, M.¹

24
- 79959906704
- Kremlin: Rethinking and rebooting gprof for the multicore age
- S. Garcia, D. Jeon, C. M. Louie, and M. B. Taylor, "Kremlin: rethinking and rebooting gprof for the multicore age," in PLDI, 2011.
- (2011) PLDI
- Garcia, S.¹ Jeon, D.² Louie, C.M.³ Taylor, M.B.⁴

25
- 84869168810
- Dyser: Unifying functionality and parallelism specialization for energy-efficient computing
- V. Govindaraju, C.-H. Ho, T. Nowatzki, J. Chhugani, N. Satish, K. Sankaralingam, and C. Kim, "Dyser: Unifying functionality and parallelism specialization for energy-efficient computing," IEEE Micro, 2012.
- (2012) IEEE Micro
- Govindaraju, V.¹ Ho, C.-H.² Nowatzki, T.³ Chhugani, J.⁴ Satish, N.⁵ Sankaralingam, K.⁶ Kim, C.⁷

26
- 84887502088
- Breaking simd shackles with an exposed flexible microarchitecture and the access execute pdg
- V. Govindaraju, T. Nowatzki, and K. Sankaralingam, "Breaking simd shackles with an exposed flexible microarchitecture and the access execute pdg," in PACT, 2013.
- (2013) PACT
- Govindaraju, V.¹ Nowatzki, T.² Sankaralingam, K.³

27
- 84863374615
- Bundled execution of recurring traces for energy-efficient general purpose processing
- S. Gupta, S. Feng, A. Ansari, S. Mahlke, and D. August, "Bundled execution of recurring traces for energy-efficient general purpose processing," in MICRO, 2011.
- (2011) MICRO
- Gupta, S.¹ Feng, S.² Ansari, A.³ Mahlke, S.⁴ August, D.⁵

28
- 77954995378
- Understanding sources of inefficiency in general-purpose chips
- R. Hameed, W. Qadeer, M. Wachs, O. Azizi, A. Solomatnikov, B. C. Lee, S. Richardson, C. Kozyrakis, and M. Horowitz, "Understanding sources of inefficiency in general-purpose chips," in ISCA, 2010.
- (2010) ISCA
- Hameed, R.¹ Qadeer, W.² Wachs, M.³ Azizi, O.⁴ Solomatnikov, A.⁵ Lee, B.C.⁶ Richardson, S.⁷ Kozyrakis, C.⁸ Horowitz, M.⁹

29
- 84905475765
- Optimal huffman tree-height reduction for instruction-level parallelism
- Department of Computer Sciences, The University of Texas at Austin
- W. Hunt, B. A. Maher, D. Burger, and K. S. McKinley, "Optimal huffman tree-height reduction for instruction-level parallelism," Technical Report TR-08-34, Department of Computer Sciences, The University of Texas at Austin, 2008.
- (2008) Technical Report TR-08-34
- Hunt, W.¹ Maher, B.A.² Burger, D.³ McKinley, K.S.⁴

30
- 77952985184
- Code coverage and input variability: Effects on architecture and compiler research
- H. C. Hunter and W.-m. W. Hwu, "Code coverage and input variability: effects on architecture and compiler research," in CASES, 2002.
- (2002) CASES
- Hunter, H.C.¹ Hwu, W.-M.W.²

31
- 81455154902
- Kismet: Parallel speedup estimates for serial programs
- D. Jeon, S. Garcia, C. M. Louie, and M. B. Taylor, "Kismet: parallel speedup estimates for serial programs," in OOPSLA, 2011.
- (2011) OOPSLA
- Jeon, D.¹ Garcia, S.² Louie, C.M.³ Taylor, M.B.⁴

32
- 79951696651
- Sd3: A scalable approach to dynamic data-dependence profiling
- M. Kim, H. Kim, and C.-K. Luk, "Sd3: A scalable approach to dynamic data-dependence profiling," in MICRO, 2010.
- (2010) MICRO
- Kim, M.¹ Kim, H.² Luk, C.-K.³

33
- 0024068822
- Measuring parallelism in computation-intensive scientific/engineering applications
- M. Kumar, "Measuring parallelism in computation-intensive scientific/engineering applications," IEEE Trans. Computers, 1988.
- (1988) IEEE Trans. Computers
- Kumar, M.¹

34
- 0026867146
- Limits of control flow on parallelism
- M. S. Lam and R. P. Wilson, "Limits of control flow on parallelism," in ISCA, 1992.
- (1992) ISCA
- Lam, M.S.¹ Wilson, R.P.²

35
- 84979255264
- An fpga-based in-line accelerator for memcached
- M. Lavasani, H. Angepat, and D. Chiou, "An fpga-based in-line accelerator for memcached," IEEE Computer Architecture Letters, 2013.
- (2013) IEEE Computer Architecture Letters
- Lavasani, M.¹ Angepat, H.² Chiou, D.³

36
- 84881151222
- Gpuwattch: Enabling energy optimizations in gpgpus
- J. Leng, T. H. Hetherington, A. ElTantawy, S. Z. Gilani, N. S. Kim, T. M. Aamodt, and V. J. Reddi, "Gpuwattch: enabling energy optimizations in gpgpus," in ISCA, 2013.
- (2013) ISCA
- Leng, J.¹ Hetherington, T.H.² Eltantawy, A.³ Gilani, S.Z.⁴ Kim, N.S.⁵ Aamodt, T.M.⁶ Reddi, V.J.⁷

37
- 76749146060
- Mcpat: An integrated power, area, and timing modeling framework for multicore and manycore architectures
- S. Li, J. H. Ahn, R. D. Strong, J. B. Brockman, D. M. Tullsen, and N. P. Jouppi, "Mcpat: an integrated power, area, and timing modeling framework for multicore and manycore architectures," in MICRO, 2009.
- (2009) MICRO
- Li, S.¹ Ahn, J.H.² Strong, R.D.³ Brockman, J.B.⁴ Tullsen, D.M.⁵ Jouppi, N.P.⁶

38
- 84881144734
- Thin servers with smart pipes: Designing soc accelerators for memcached
- K. T. Lim, D. Meisner, A. G. Saidi, P. Ranganathan, and T. F. Wenisch, "Thin servers with smart pipes: designing soc accelerators for memcached," in ISCA, 2013.
- (2013) ISCA
- Lim, K.T.¹ Meisner, D.² Saidi, A.G.³ Ranganathan, P.⁴ Wenisch, T.F.⁵

39
- 84879851819
- On learning-based methods for design-space exploration with high-level synthesis
- H.-Y. Liu and L. P. Carloni, "On learning-based methods for design-space exploration with high-level synthesis," in DAC, 2013.
- (2013) DAC
- Liu, H.-Y.¹ Carloni, L.P.²

40
- 84862058364
- Compositional system-level design exploration with planning of high-level synthesis
- H.-Y. Liu, M. Petracca, and L. P. Carloni, "Compositional system-level design exploration with planning of high-level synthesis," in DATE, 2012.
- (2012) DATE
- Liu, H.-Y.¹ Petracca, M.² Carloni, L.P.³

41
- 40349109005
- Pathexpander: Architectural support for increasing the path coverage of dynamic bug detection
- S. Lu, P. Zhou, W. Liu, Y. Zhou, and J. Torrellas, "Pathexpander: Architectural support for increasing the path coverage of dynamic bug detection," in MICRO, 2006.
- (2006) MICRO
- Lu, S.¹ Zhou, P.² Liu, W.³ Zhou, Y.⁴ Torrellas, J.⁵

42
- 31944440969
- Pin: Building customized program analysis tools with dynamic instrumentation
- C.-K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V. J. Reddi, and K. Hazelwood, "Pin: building customized program analysis tools with dynamic instrumentation," PLDI, 2005.
- (2005) PLDI
- Luk, C.-K.¹ Cohn, R.² Muth, R.³ Patil, H.⁴ Klauser, A.⁵ Lowney, G.⁶ Wallace, S.⁷ Reddi, V.J.⁸ Hazelwood, K.⁹

43
- 84881162326
- Convolution engine: Balancing efficiency & flexibility in specialized computing
- W. Qadeer, R. Hameed, O. Shacham, P. Venkatesan, C. Kozyrakis, and M. A. Horowitz, "Convolution engine: balancing efficiency & flexibility in specialized computing," in ISCA, 2013.
- (2013) ISCA
- Qadeer, W.¹ Hameed, R.² Shacham, O.³ Venkatesan, P.⁴ Kozyrakis, C.⁵ Horowitz, M.A.⁶

44
- 84863430504
- Measuring limits of parallelism and characterizing its vulnerability to resource constraints
- L. Rauchwerger, P. K. Dubey, and R. Nair, "Measuring limits of parallelism and characterizing its vulnerability to resource constraints," in MICRO, 1993.
- (1993) MICRO
- Rauchwerger, L.¹ Dubey, P.K.² Nair, R.³

45
- 84889594827
- Quantifying acceleration: Power/performance trade-offs of application kernels in hardware
- B. Reagen, Y. S. Shao, G.-Y. Wei, and D. Brooks, "Quantifying acceleration: Power/performance trade-offs of application kernels in hardware," in ISLPED, 2013.
- (2013) ISLPED
- Reagen, B.¹ Shao, Y.S.² Wei, G.-Y.³ Brooks, D.⁴

46
- 79959550547
- Dramsim2: A cycle accurate memory system simulator
- P. Rosenfeld, E. Cooper-Balis, and B. Jacob, "Dramsim2: A cycle accurate memory system simulator," IEEE Comput. Archit. Lett., 2011.
- (2011) IEEE Comput. Archit. Lett
- Rosenfeld, P.¹ Cooper-Balis, E.² Jacob, B.³

47
- 84880285819
- Sonic millip3de: A massively parallel 3d-stacked accelerator for 3d ultrasound
- R. Sampson, M. Yang, S. Wei, C. Chakrabarti, and T. F. Wenisch, "Sonic millip3de: A massively parallel 3d-stacked accelerator for 3d ultrasound," in HPCA, 2013.
- (2013) HPCA
- Sampson, R.¹ Yang, M.² Wei, S.³ Chakrabarti, C.⁴ Wenisch, T.F.⁵

48
- 34249810603
- Nosq: Store-load communication without a store queue
- T. Sha, M. M. K. Martin, and A. Roth, "Nosq: Store-load communication without a store queue," in MICRO, 2006.
- (2006) MICRO
- Sha, T.¹ Martin, M.M.K.² Roth, A.³

49
- 84881437667
- Isa-independent workload characterization and its implications for specialized architectures
- Y. S. Shao and D. Brooks, "Isa-independent workload characterization and its implications for specialized architectures," in ISPASS, 2013.
- (2013) ISPASS
- Shao, Y.S.¹ Brooks, D.²

50
- 84864858301
- A defect-tolerant accelerator for emerging high-performance applications
- O. Temam, "A defect-tolerant accelerator for emerging high-performance applications," in ISCA, 2012.
- (2012) ISCA
- Temam, O.¹

51
- 0026989702
- On the limits of program parallelism and its smoothability
- K. B. Theobald, G. R. Gao, and L. J. Hendren, "On the limits of program parallelism and its smoothability," in MICRO, 1992.
- (1992) MICRO
- Theobald, K.B.¹ Gao, G.R.² Hendren, L.J.³

52
- 77952256041
- Conservation cores: Reducing the energy of mature computations
- G. Venkatesh, J. Sampson, N. Goulding, S. Garcia, V. Bryksin, J. Lugo-Martinez, S. Swanson, and M. B. Taylor, "Conservation cores: reducing the energy of mature computations," ASPLOS, 2010.
- (2010) ASPLOS
- Venkatesh, G.¹ Sampson, J.² Goulding, N.³ Garcia, S.⁴ Bryksin, V.⁵ Lugo-Martinez, J.⁶ Swanson, S.⁷ Taylor, M.B.⁸

53
- 0026137115
- Limits of instruction-level parallelism
- D.W. Wall, "Limits of instruction-level parallelism," in ASPLOS, 1991.
- (1991) ASPLOS
- Wall, D.W.¹

54
- 0030149507
- Cacti: An enhanced cache access and cycle time model
- S. J. E. Wilton and N. P. Jouppi, "Cacti: An enhanced cache access and cycle time model," IEEE Journal of Solid-State Circuits, 1996.
- (1996) IEEE Journal of Solid-State Circuits
- Wilton, S.J.E.¹ Jouppi, N.P.²

55
- 84881185269
- Navigating big data with high-throughput, energy-efficient data partitioning
- L. Wu, R. J. Barker, M. A. Kim, and K. A. Ross, "Navigating big data with high-throughput, energy-efficient data partitioning," in ISCA, 2013.
- (2013) ISCA
- Wu, L.¹ Barker, R.J.² Kim, M.A.³ Ross, K.A.⁴

56
- 84893898462
- A 3d-stacked logic-in-memory accelerator for application-specific data intensive computing
- Q. Zhu, B. Akin, H. E. Sumbul, F. Sadi, J. Hoe, L. Pileggi, and F. Franchetti, "A 3d-stacked logic-in-memory accelerator for application-specific data intensive computing," in 3DIC, 2013.
- (2013) 3DIC
- Zhu, Q.¹ Akin, B.² Sumbul, H.E.³ Sadi, F.⁴ Hoe, J.⁵ Pileggi, L.⁶ Franchetti, F.⁷

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.