Volume , Issue , 2013, Pages 487-498

GPUWattch: Enabling energy optimizations in GPGPUs

Author keywords

CUDA; Energy; GPU architecture; Power estimation

Indexed keywords

CUDA; DYNAMIC VOLTAGE AND FREQUENCY SCALING; ENERGY; ENERGY OPTIMIZATION; GENERAL-PURPOSE GPUS; MODELING UNCERTAINTIES; POWER ESTIMATIONS; RUN-TIME VARIATIONS;

EID: 84881151222     PISSN: 10636897     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2485922.2485964     Document Type: Conference Paper
Times cited: 498

References (38)
  • 1
    • 84881152686
    • MacSim, http://code.google.com/p/macsim.
    • MacSim
  • 2
    • 84881172649
    • Predictive technology model, http://ptm.asu.edu.
  • 3
    • 79954541450
    • Synopsys Inc., Power Compiler, www.synopsys.com.
    • Power Compiler
  • 4
    • 70349169075
    • Analyzing CUDA workloads using a detailed GPU simulator
    • A. Bakhoda et al. Analyzing CUDA workloads using a detailed GPU simulator. In ISPASS, 2009.
    • (2009) ISPASS
    • Bakhoda, A.1
  • 5
    • 83155188972
    • CudaDMA: Optimizing GPU memory bandwidth via warp specialization
    • M. Bauer et al. CudaDMA: optimizing GPU memory bandwidth via warp specialization. In SC, 2011.
    • (2011) SC
    • Bauer, M.1
  • 6
    • 0033719421
    • Wattch: A framework for architectural-level power analysis and optimizations
    • D. Brooks et al. Wattch: a framework for architectural-level power analysis and optimizations. In ISCA, 2000.
    • (2000) ISCA
    • Brooks, D.1
  • 7
    • 70649092154
    • Rodinia: A benchmark suite for heterogeneous computing
    • S. Che et al. Rodinia: A benchmark suite for heterogeneous computing. In IISWC, 2009.
    • (2009) IISWC
    • Che, S.1
  • 8
    • 77951006223
    • Power consumption of GPUs from a software perspective
    • S. Collange et al. Power consumption of GPUs from a software perspective. In ICCS, 2009.
    • (2009) ICCS
    • Collange, S.1
  • 9
    • 78651498216
    • Moving the needle, computer architecture research in academe and industry
    • W. J. Dally. Moving the needle, computer architecture research in academe and industry. In ISCA, 2010.
    • (2010) ISCA
    • Dally, W.J.1
  • 11
    • 79955923056
    • Thread block compaction for efficient SIMT control flow
    • W. Fung and T. Aamodt. Thread block compaction for efficient SIMT control flow. In HPCA, 2011.
    • (2011) HPCA
    • Fung, W.1    Aamodt, T.2
  • 12
    • 47349104432
    • Dynamic warp formation and scheduling for efficient GPU control flow
    • W. Fung et al. Dynamic warp formation and scheduling for efficient GPU control flow. In MICRO, 2007.
    • (2007) MICRO
    • Fung, W.1
  • 13
    • 77954994853
    • An integrated GPU power and performance model
    • S. Hong and H. Kim. An integrated GPU power and performance model. In ISCA, 2010.
    • (2010) ISCA
    • Hong, S.1    Kim, H.2
  • 14
    • 36949023020
    • Live, runtime phase monitoring and prediction on real systems with application to dynamic power management
    • C. Isci et al. Live, runtime phase monitoring and prediction on real systems with application to dynamic power management. In MICRO, 2006.
    • (2006) MICRO
    • Isci, C.1
  • 15
    • 28444498110
    • Stretching the limits of clock-gating efficiency in server-class processors
    • H. Jacobson et al. Stretching the limits of clock-gating efficiency in server-class processors. In HPCA, 2005.
    • (2005) HPCA
    • Jacobson, H.1
  • 17
    • 84870698638
    • Power aware computing on GPUs
    • K. Kasichayanula et al. Power aware computing on GPUs. SAAHPC, 2012.
    • (2012) SAAHPC
    • Kasichayanula, K.1
  • 18
    • 84862328133
    • Life after Dennard and how I learned to love the picojoule
    • S. Keckler. Life After Dennard and How I Learned to Love the Picojoule. In MICRO, 2012.
    • (2012) MICRO
    • Keckler, S.1
  • 19
    • 57749178620
    • System level analysis of fast, per-core DVFS using on-chip switching regulators
    • W. Kim et al. System level analysis of fast, per-core DVFS using on-chip switching regulators. In HPCA, 2008.
    • (2008) HPCA
    • Kim, W.1
  • 20
    • 84863037228
    • Improving throughput of power-constrained GPUs using dynamic voltage/frequency and core scaling
    • J. Lee et al. Improving throughput of power-constrained GPUs using dynamic voltage/frequency and core scaling. In PACT, 2011.
    • (2011) PACT
    • Lee, J.1
  • 21
    • 27944465327
    • Deterministic clock gating for microprocessor power reduction
    • H. Li et al. Deterministic clock gating for microprocessor power reduction. In HPCA, 2003.
    • (2003) HPCA
    • Li, H.1
  • 22
    • 76749146060
    • McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures
    • S. Li et al. McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures. In MICRO, 2009.
    • (2009) MICRO
    • Li, S.1
  • 23
    • 44849137198
    • NVIDIA Tesla: A unified graphics and computing architecture
    • E. Lindholm et al. NVIDIA Tesla: A unified graphics and computing architecture. IEEE Micro, 2008.
    • (2008) IEEE Micro
    • Lindholm, E.1
  • 26
    • 84908696193
    • Statistical power modeling of GPU kernels using performance counters
    • H. Nagasaka et al. Statistical power modeling of GPU kernels using performance counters. In Green Computing Conference, 2010.
    • (2010) Green Computing Conference
    • Nagasaka, H.1
  • 27
    • 84863342255
    • Improving GPU performance via large warps and two-level warp scheduling
    • V. Narasiman et al. Improving GPU performance via large warps and two-level warp scheduling. In MICRO, 2011.
    • (2011) MICRO
    • Narasiman, V.1
  • 31
    • 33645675534
    • A fully pipelined single-precision floating-point unit in the synergistic processor element of a CELL processor
    • H.-J. Oh et al. A fully pipelined single-precision floating-point unit in the synergistic processor element of a CELL processor. JSSC, 2006.
    • (2006) JSSC
    • Oh, H.-J.1
  • 32
    • 84881168427
    • Lossless and lossy memory-link compression techniques for improving performance of memory-bound GPGPU workloads
    • V. Sathish et al. Lossless and lossy memory-link compression techniques for improving performance of memory-bound GPGPU workloads. In PACT, 2012.
    • (2012) PACT
    • Sathish, V.1
  • 33
    • 52649139073
    • A comprehensive memory modeling tool and its application to the design and analysis of future memory hierarchies
    • S. Thoziyoor et al. A comprehensive memory modeling tool and its application to the design and analysis of future memory hierarchies. In ISCA, 2008.
    • (2008) ISCA
    • Thoziyoor, S.1
  • 34
    • 84867504986
    • Multi2Sim: A simulation framework for CPU-GPU computing
    • R. Ubal et al. Multi2Sim: A simulation framework for CPU-GPU computing. In PACT, 2012.
    • (2012) PACT
    • Ubal, R.1
  • 35
    • 79951702954
    • Understanding the energy consumption of dynamic random access memories
    • T. Vogelsang. Understanding the energy consumption of dynamic random access memories. In MICRO, 2010.
    • (2010) MICRO
    • Vogelsang, T.1
  • 36
    • 84860369077
    • Power estimating model and analysis of general programming on GPU
    • H. Wang and Q. Chen. Power estimating model and analysis of general programming on GPU. Journal of Software, 2012.
    • (2012) Journal of Software
    • Wang, H.1    Chen, Q.2
  • 37
    • 33644928947
    • A dynamic compilation framework for controlling microprocessor energy and performance
    • Q. Wu et al. A dynamic compilation framework for controlling microprocessor energy and performance. In MICRO, 2005.
    • (2005) MICRO
    • Wu, Q.1
  • 38
    • 80052901858
    • Performance and power analysis of ATI GPU: A statistical approach
    • Y. Zhang et al. Performance and power analysis of ATI GPU: A statistical approach. In NAS, 2011.
    • (2011) NAS
    • Zhang, Y.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.