SCOPUS Information Search Platform

Proceedings of the Annual International Symposium on Microarchitecture, MICRO

Volume , Issue , 2011, Pages 12-23

Bundled execution of recurring traces for energy-efficient general purpose processing

(5) Gupta, Shantanu a,b; Feng, Shuguang b; Ansari, Amin b; Mahlke, Scott b; August, David c

a Intel Corporation (United States)

b University of Michigan (United States)

c Princeton University (United States)

Author keywords

co processor; efficiency; energy saving; microarchitecture

Indexed keywords

APPLICATION SPECIFIC HARDWARES; AVERAGE ENERGY; CO-PROCESSORS; ENERGY EFFICIENT; EXECUTION MODEL; GENERAL PURPOSE; INSTRUCTION FETCH; MEDIA APPLICATION; MICRO ARCHITECTURES; OFF-LOADING; POWER BUDGETS; POWER CONSTRAINTS; PROCESSING RESOURCES; PROGRAM EXECUTION; REDUCED-COMPLEXITY; REGISTER FILES; SINGLE CHIPS; TECHNOLOGY SCALING; VOLTAGE-SCALING;

EFFICIENCY; ENERGY CONSERVATION; ENERGY UTILIZATION;

ENERGY EFFICIENCY;

EID: 84863374615
PISSN: 10724451
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1145/2155620.2155623
Document Type: Conference Paper

Times cited: (87)

References (32)

1
- 84863344334
- ARM. Arm11. http://www.arm.com/products/CPUs/families/ARM11Family.html.
- Arm11

2
- 12844273425
- Spatial computation
- M. Budiu, G. Venkataramani, T. Chelcea, and S. C. Goldstein. Spatial computation. In 12th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 14-26, 2004.
- (2004) 12th International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 14-26
- Budiu, M.¹ Venkataramani, G.² Chelcea, T.³ Goldstein, S.C.⁴

3
- 21644435314
- Application-specific processing on a general-purpose core via transparent instruction set customization
- N. Clark et al. Application-specific processing on a general-purpose core via transparent instruction set customization. In Proc. of the 37th Annual International Symposium on Microarchitecture, pages 30-40, Dec. 2004.
- (2004) Proc. of the 37th Annual International Symposium on Microarchitecture , pp. 30-40
- Clark, N.¹

4
- 52649095061
- VEAL: Virtualized execution accelerator for loops
- N. Clark, A. Hormati, and S. Mahlke. VEAL: Virtualized execution accelerator for loops. In Proc. of the 35th Annual International Symposium on Computer Architecture, pages 389-400, June 2008.
- (2008) Proc. of the 35th Annual International Symposium on Computer Architecture , pp. 389-400
- Clark, N.¹ Hormati, A.² Mahlke, S.³

5
- 48249092127
- Efficient embedded computing
- W. J. Dally, J. Balfour, D. Black-Schaffer, J. Chen, R. Harting, V. Parikh, J. Park, and D. Sheffield. Efficient embedded computing. IEEE Computer, 41(7):27-32, July 2008.
- (2008) IEEE Computer , vol.41 , Issue.7 , pp. 27-32
- Dally, W.J.¹ Balfour, J.² Black-Schaffer, D.³ Chen, J.⁴ Harting, R.⁵ Parikh, V.⁶ Park, J.⁷ Sheffield, D.⁸

6
- 64849117951
- Bridging the computation gap between programmable processors and hardwired accelerators
- K. Fan, M. Kudlur, G. Dasika, and S. Mahlke. Bridging the computation gap between programmable processors and hardwired accelerators. In Proc. of the 15th International Symposium on High-Performance Computer Architecture, pages 313-322, Feb. 2009.
- (2009) Proc. of the 15th International Symposium on High-Performance Computer Architecture , pp. 313-322
- Fan, K.¹ Kudlur, M.² Dasika, G.³ Mahlke, S.⁴

7
- 34548705938
- Compiler-directed synthesis of multifunction loop accelerators
- K. Fan, M. Kudlur, H. Park, and S. Mahlke. Compiler-directed synthesis of multifunction loop accelerators. In Proc. of the 2005 Workshop on Application Specific Processors, pages 91-98, Sept. 2005.
- (2005) Proc. of the 2005 Workshop on Application Specific Processors , pp. 91-98
- Fan, K.¹ Kudlur, M.² Park, H.³ Mahlke, S.⁴

8
- 0032312214
- Putting the fill unit to work: Dynamic optimizations for trace cache microprocessors
- D. Friendly, S. Patel, and Y. Patt. Putting the fill unit to work: Dynamic optimizations for trace cache microprocessors. In Proc. of the 25th Annual International Symposium on Computer Architecture, pages 173-181, June 1998.
- (1998) Proc. of the 25th Annual International Symposium on Computer Architecture , pp. 173-181
- Friendly, D.¹ Patel, S.² Patt, Y.³

9
- 79955890625
- Dynamically specialized datapaths for energy efficient computing
- V. Govindaraju, C. H. Ho, and K. Sankaralingam. Dynamically specialized datapaths for energy efficient computing. In Proc. of the 17th International Symposium on High-Performance Computer Architecture, 2011.
- Proc. of the 17th International Symposium on High-Performance Computer Architecture, 2011
- Govindaraju, V.¹ Ho, C.H.² Sankaralingam, K.³

10
- 77954995378
- Understanding sources of inefficiency in general-purpose chips
- R. Hameed, W. Qadeer, M. Wachs, O. Azizi, A. Solomatnikov, B. C. Lee, S. Richardson, C. Kozyrakis, and M. Horowitz. Understanding sources of inefficiency in general-purpose chips. In Proc. of the 37th Annual International Symposium on Computer Architecture, pages 37-47, 2010.
- (2010) Proc. of the 37th Annual International Symposium on Computer Architecture , pp. 37-47
- Hameed, R.¹ Qadeer, W.² Wachs, M.³ Azizi, O.⁴ Solomatnikov, A.⁵ Lee, B.C.⁶ Richardson, S.⁷ Kozyrakis, C.⁸ Horowitz, M.⁹

11
- 0031360911
- GARP: A MIPS processor with a reconfigurable coprocessor
- J. R. Hauser and J. Wawrzynek. GARP: A MIPS processor with a reconfigurable coprocessor. In Proc. of the 5th IEEE Symposium on Field-Programmable Custom Computing Machines, pages 12-21, Apr. 1997.
- (1997) Proc. of the 5th IEEE Symposium on Field-Programmable Custom Computing Machines , pp. 12-21
- Hauser, J.R.¹ Wawrzynek, J.²

12
- 0345521552
- Texas Instruments. TMS320C2x User's Guide, Jan. 1993.
- (1993) TMS320C2x User's Guide

13
- 84863362111
- Intel. Intel Xeon processor with 512 KB L2 cache, 2004.
- (2004) Intel Xeon Processor with 512 KB L2 Cache

14
- 33750401079
- The H.264 video coding standard
- H. Kalva. The H.264 video coding standard. IEEE MultiMedia, 13(4):86-90, 2006.
- (2006) IEEE MultiMedia , vol.13 , Issue.4 , pp. 86-90
- Kalva, H.¹

15
- 0027595384
- The superblock: An effective technique for VLIW and superscalar compilation
- W.-m. W. Hwu, S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A. Bringmann, R. G. Ouellette, R. E. Hank, T. Kiyohara, G. E. Haab, J. G. Holm, and D. M. Lavery. The superblock: An effective technique for VLIW and superscalar compilation. Journal of Supercomputing, 7(1):229-248, May 1993.
- (1993) Journal of Supercomputing , vol.7 , Issue.1 , pp. 229-248
- Hwu, W.W.¹ Mahlke, S.A.² Chen, W.Y.³ Chang, P.P.⁴ Warter, N.J.⁵ Bringmann, R.A.⁶ Ouellette, R.G.⁷ Hank, R.E.⁸ Kiyohara, T.⁹ Haab, G.E.¹⁰ Holm, J.G.¹¹ Lavery, D.M.¹²

16
- 47349084021
- Optimizing NUCA organizations and wiring alternatives for large caches with CACTI 6.0
- N. Muralimanohar, R. Balasubramonian, and N. P. Jouppi. Optimizing NUCA organizations and wiring alternatives for large caches with CACTI 6.0. In IEEE Micro, pages 3-14, 2007.
- (2007) IEEE Micro , pp. 3-14
- Muralimanohar, N.¹ Balasubramonian, R.² Jouppi, N.P.³

17
- 34548297517
- Efficient high-performance ASIC implementation of JPEG-LS encoder
- M. Papadonikolakis et al. Efficient high-performance ASIC implementation of JPEG-LS encoder. In Proc. of the 2007 Design, Automation and Test in Europe, pages 159-164, Apr. 2007.
- (2007) Proc. of the 2007 Design, Automation and Test in Europe , pp. 159-164
- Papadonikolakis, M.¹

18
- 0035363244
- rePLay: A hardware framework for dynamic optimization
- DOI 10.1109/12.931895
- S. J. Patel and S. S. Lumetta. rePLay: A hardware framework for dynamic optimization. IEEE Transactions on Computers, 50(6):590-608, June 2001. (Pubitemid 32609869)
- (2001) IEEE Transactions on Computers , vol.50 , Issue.6 , pp. 590-608
- Patel, S.J.¹ Lumetta, S.S.²

19
- 0024682923
- Force-directed scheduling for the behavioral synthesis of ASICs
- P. G. Paulin and J. P. Knight. Force-directed scheduling for the behavioral synthesis of ASICs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 8(6):661-679, June 1989.
- (1989) IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems , vol.8 , Issue.6 , pp. 661-679
- Paulin, P.G.¹ Knight, J.P.²

20
- 0028768023
- A high-performance microarchitecture with hardware-programmable function units
- R. Razdan and M. D. Smith. A high-performance microarchitecture with hardware-programmable function units. In Proc. of the 27th Annual International Symposium on Microarchitecture, pages 172-180, Dec. 1994.
- (1994) Proc. of the 27th Annual International Symposium on Microarchitecture , pp. 172-180
- Razdan, R.¹ Smith, M.D.²

21
- 0037669851
- Exploiting ILP, TLP, and DLP using polymorphism in the TRIPS architecture
- K. Sankaralingam et al. Exploiting ILP, TLP, and DLP using polymorphism in the TRIPS architecture. In Proc. of the 30th Annual International Symposium on Computer Architecture, pages 422-433, June 2003.
- (2003) Proc. of the 30th Annual International Symposium on Computer Architecture , pp. 422-433
- Sankaralingam, K.¹

22
- 0036603298
- PICO-NPA: High-level synthesis of nonprogrammable hardware accelerators
- DOI 10.1023/A:1015341305426
- R. Schreiber et al. PICO-NPA: High-level synthesis of nonprogrammable hardware accelerators. Journal of VLSI Signal Processing, 31(2):127-142, 2002. (Pubitemid 34669474)
- (2002) Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology , vol.31 , Issue.2 , pp. 127-142
- Schreiber, R.¹ Aditya, S.² Mahlke, S.³ Kathail, V.⁴ Rau, B.R.⁵ Cronquist, D.⁶ Sivaraman, M.⁷

23
- 80055103082
- R. Singhal. Inside Intel next generation Nehalem microarchitecture, 2008. http://software.intel.com/file/18976.
- (2008) Inside Intel Next Generation Nehalem Microarchitecture
- Singhal, R.¹

24
- 84944392428
- WaveScalar
- S. Swanson, K. Michelson, A. Schwerin, and M. Oskin. WaveScalar. In Proc. of the 36th Annual International Symposium on Microarchitecture, page 291. IEEE Computer Society, 2003.
- (2003) Proc. of the 36th Annual International Symposium on Microarchitecture , pp. 291
- Swanson, S.¹ Michelson, K.² Schwerin, A.³ Oskin, M.⁴

25
- 84868174385
- Tensilica Inc. Diamond Standard Processor Core Family Architecture, July 2007. http://www.tensilica.com/pdf/Diamond WP.pdf.
- (2007) Diamond Standard Processor Core Family Architecture

26
- 24144484255
- Trimaran. An infrastructure for research in ILP, 2000. http://www.trimaran.org/.
- (2000) An Infrastructure for Research in ILP

27
- 52649150458
- Achieving out-of-order performance with almost in-order complexity
- F. Tseng and Y. N. Patt. Achieving out-of-order performance with almost in-order complexity. In Proc. of the 35th Annual International Symposium on Computer Architecture, pages 3-12, June 2008.
- (2008) Proc. of the 35th Annual International Symposium on Computer Architecture , pp. 3-12
- Tseng, F.¹ Patt, Y.N.²

28
- 77952256041
- Conservation cores: Reducing the energy of mature computations
- G. Venkatesh, J. Sampson, N. Goulding, S. Garcia, V. Bryksin, J. Lugo-Martinez, S. Swanson, and M. B. Taylor. Conservation cores: reducing the energy of mature computations. In 18th International Conference on Architectural Support for Programming Languages and Operating Systems, pages 205-218, 2010.
- (2010) 18th International Conference on Architectural Support for Programming Languages and Operating Systems , pp. 205-218
- Venkatesh, G.¹ Sampson, J.² Goulding, N.³ Garcia, S.⁴ Bryksin, V.⁵ Lugo-Martinez, J.⁶ Swanson, S.⁷ Taylor, M.B.⁸

29
- 66749136924
- From SODA to scotch: The evolution of a wireless baseband processor
- M. Woh et al. From SODA to scotch: The evolution of a wireless baseband processor. In Proc. of the 41st Annual International Symposium on Microarchitecture, pages 152-163, Nov. 2008.
- (2008) Proc. of the 41st Annual International Symposium on Microarchitecture , pp. 152-163
- Woh, M.¹

30
- 0033703884
- CHIMAERA: A high-performance architecture with a tightly-coupled reconfigurable functional unit
- Z. A. Ye et al. CHIMAERA: a high-performance architecture with a tightly-coupled reconfigurable functional unit. In Proc. of the 27th Annual International Symposium on Computer Architecture, pages 225-235, 2000.
- (2000) Proc. of the 27th Annual International Symposium on Computer Architecture , pp. 225-235
- Ye, Z.A.¹

31
- 29144465665
- Exploring the design space of LUT-based transparent accelerators
- S. Yehia et al. Exploring the design space of LUT-based transparent accelerators. In Proc. of the 2005 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, pages 11-21, Sept. 2005.
- (2005) Proc. of the 2005 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems , pp. 11-21
- Yehia, S.¹

32
- 64949084227
- Reconciling specialization and flexibility through compound circuits
- S. Yehia, S. Girbal, H. Berry, and O. Temam. Reconciling specialization and flexibility through compound circuits. In Proc. of the 15th International Symposium on High-Performance Computer Architecture, pages 277-288, 2009.
- (2009) Proc. of the 15th International Symposium on High-Performance Computer Architecture , pp. 277-288
- Yehia, S.¹ Girbal, S.² Berry, H.³ Temam, O.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.