Volume 43, Issue 3, 2008, Pages 277-286

Feedback-driven threading: Power-efficient and high-performance execution of multi-threaded workloads on CMPs

Author keywords

Bandwidth; CMP; Multi threaded; Synchronization

Indexed keywords

AVERAGE EXECUTION TIME; BANDWIDTH-AWARE; CHIP MULTIPROCESSOR; CMP; EXECUTION TIME; IMPROVING PERFORMANCE; INPUT SET; MINIMAL SUPPORTS; MULTI-THREADED; MULTI-THREADED APPLICATION; MULTIPLE THREADS; NUMBER OF THREADS; OFF-CHIP; ON CHIPS; OPTIMAL NUMBER; PERFORMANCE COUNTERS; PERFORMANCE IMPROVEMENTS; POWER EFFICIENT; RUN-TIME INFORMATION; RUNTIME

EID: 67650075028; PISSN: 15232867; EISSN: None; Source Type: Journal
DOI: None; Document Type: Article
Times cited: 27

References (33)
  • 1
    • 67650096031
    • Advanced Micro Devices, Inc. White Paper: Multi-Core Processors - The next evolution in computing. 2005.
  • 2
    • 67650033734
    • D. an Mey et al. The RWTH Aachen SMP-Cluster User's Guide Version 6.2, May 2007.
  • 3
    • 0003605996
    • D. Bailey et al. NAS parallel benchmarks. Technical report, NASA, 1994.
  • 5
    • 0030258428
    • T. Brecht and K. Guha. Using parallel program characteristics in dynamic processor allocation policies. Performance Evaluation, 27/28(4), 1996.
  • 6
    • 84947280741
    • J. Corbalan et al. Dynamic speedup calculation through self-analysis. Technical Report UPC-DAC-1999-43, UPC, 1999.
  • 8
    • 0002806690
    • L. Dagum and R. Menon. OpenMP: An industry-standard API for shared-memory programming. IEEE Comput. Sci. Eng., 1998.
  • 10
    • 34748904216
    • R. Ennals. Efficient Software Transactional Memory. Technical Report IRC-TR-05-051, Intel Research Cambridge Tech Report, Jan 2005.
  • 11
    • 67650089193
    • M. Gillespie and C. Breshears (Intel Corp.). Achieving Threading Success. www.intel.com/cd/ids/developer/asmo-na/eng/212806.htm, 2005.
  • 12
    • 42549137614
    • J. Huh et al. Exploring design space of future CMPs. In PACT '01.
  • 14
    • 42549091263
    • Intel. ICC 9.1 for Linux. http://www.intel.com/cd/software/products/asmo-na/eng/compilers/284264.htm.
  • 16
    • 67650078561
    • Intel. Intel Itanium 2 Processor Reference Manual, 2004.
  • 17
    • 3042669130
    • R. Kalla, B. Sinharoy, and J. M. Tendler. IBM Power5 Chip: A Dual-Core Multithreaded Processor. IEEE Micro, 24(2):40-47, 2004.
  • 19
    • 20344374162
    • P. Kongetira, K. Aingaran, and K. Olukotun. Niagara: A 32-Way Multithreaded SPARC Processor. IEEE Micro, 25(2):21-29, 2005.
  • 20
    • 84966600269
    • R. Kumar et al. Compiling several classes of communication patterns on a multithreaded architecture. In IPDPS '02, 2002.
  • 21
    • 84944403811
    • R. Kumar et al. Single-ISA Heterogeneous Multi-Core Architectures: The Potential for Processor Power Reduction. In MICRO 36, 2003.
  • 22
    • 4644370318
    • R. Kumar et al. Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance. In ISCA 31, 2004.
  • 24
    • 0031599142
    • M. Matsumoto and T. Nishimura. Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans. Model. Comput. Simul., 1998.
  • 25
    • 0027594835
    • C. McCann et al. A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors. Trans. Comp. Sys., 1993.
  • 26
    • 47349098275
    • R. Narayanan et al. MineBench: A Benchmark Suite for Data Mining Workloads. In IISWC, 2006.
  • 27
    • 42549121345
    • T. D. Nguyen et al. Maximizing speedup through self-tuning of processor allocation. In Intn'l Parallel Processing Symposium, 1996.
  • 28
    • 35348859034
    • J. Nieplocha et al. Evaluating the potential of multithreaded platforms for irregular scientific computations. In Computing frontiers, 2007.
  • 29
    • 42549083083
    • Y. Nishitani, K. Negishi, H. Ohta, and E. Nunohiro. Implementation and Evaluation of OpenMP for Hitachi SR8000. In ISHPC 3, 2000.
  • 30
    • 84869362304
    • Nvidia. CUDA SDK Code Samples. http://developer.download.nvidia.com/compute/cuda/sdk/website/samples.html, 2007.
  • 31
    • 42549125305
    • R. Ramanathan. Intel multi-core processors: Making the move to quad-core and beyond. Technology@Intel Magazine, Dec 2006.
  • 32
    • 42549166728
    • S. Saini et al. A Scalability Study of Columbia using the NAS Parallel Benchmarks. Journal of Comput. Methods in Sci. and Engr., 2006.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.