



Volume 49, Issue 2, 2006, Pages 211-233

Instruction level parallelism through microthreading - A scalable approach to chip multiprocessors

Author keywords

CMP; Code fragments; Concurrency; Microthreads

Indexed keywords

COMPUTATIONAL COMPLEXITY; CONCURRENCY CONTROL; MULTIPROCESSING SYSTEMS; PARALLEL PROCESSING SYSTEMS; PROGRAM COMPILERS; SYNCHRONIZATION

EID: 33644893333     PISSN: 00104620     EISSN: 14602067     Source Type: Journal    
DOI: 10.1093/comjnl/bxh157     Document Type: Article
Times cited: 23

References (49)
  • 5
    • 0034316177
    • The MAJC architecture: A synthesis of parallelism and scalability
    • Tremblay, M., Chan, J., Chaudhry, S., Conigliaro, A. W. and Tse, S. S. (2000) The MAJC architecture: A synthesis of parallelism and scalability. IEEE Micro, 20, 12-25.
    • (2000) IEEE Micro , vol.20 , pp. 12-25
    • Tremblay, M.1    Chan, J.2    Chaudhry, S.3    Conigliaro, A.W.4    Tse, S.S.5
  • 7
    • 33644896874
    • Billion transistor chips in mainstream enterprise platforms of the future
    • Anaheim, CA, February 8-12, IEEE Computer Society, Washington, DC
    • Bhandarkar, D. (2003) Billion transistor chips in mainstream enterprise platforms of the future. In Proc. 9th Int. Symp. High-Performance Computer Architecture, Anaheim, CA, February 8-12, pp. 3. IEEE Computer Society, Washington, DC.
    • (2003) Proc. 9th Int. Symp. High-Performance Computer Architecture , pp. 3
    • Bhandarkar, D.1
  • 8
    • 0033717865
    • Clock rate versus IPC: The end of the road for conventional microarchitectures
    • Vancouver, British Columbia, Canada, June 10-14, ACM Press, New York, NY
    • Agarwal, V., Hrishikesh, M. S., Keckler, S. W. and Burger, D. (2000) Clock rate versus IPC: The end of the road for conventional microarchitectures. In Proc. 27th Annual Int. Symp. Computer Architecture, Vancouver, British Columbia, Canada, June 10-14, pp. 248-259. ACM Press, New York, NY.
    • (2000) Proc. 27th Annual Int. Symp. Computer Architecture , pp. 248-259
    • Agarwal, V.1    Hrishikesh, M.S.2    Keckler, S.W.3    Burger, D.4
  • 9
    • 65349143924
    • Instruction wake-up in wide issue superscalars
    • Manchester, UK, August 28-31, Springer-Verlag, London, UK
    • Onder, S. and Gupta, R. (2001) Instruction wake-up in wide issue superscalars. In Proc. 7th Int. Euro-Par Conf. Manchester on Parallel Processing, Manchester, UK, August 28-31, pp. 418-427. Springer-Verlag, London, UK.
    • (2001) Proc. 7th Int. Euro-Par Conf. Manchester on Parallel Processing , pp. 418-427
    • Onder, S.1    Gupta, R.2
  • 10
    • 28444453374
    • Superscalar execution with dynamic data forwarding
    • Paris, France, October 12-18, IEEE Computer Society, Washington, DC
    • Onder, S. and Gupta, R. (1998) Superscalar execution with dynamic data forwarding. In Proc. Int. Conf. Parallel Architectures and Compilation Techniques, Paris, France, October 12-18, pp. 130-135. IEEE Computer Society, Washington, DC.
    • (1998) Proc. Int. Conf. Parallel Architectures and Compilation Techniques , pp. 130-135
    • Onder, S.1    Gupta, R.2
  • 12
    • 0030676681
    • Complexity-effective superscalar processors
    • Denver, CO, June 1-4, ACM Press, New York, NY
    • Palacharla, S., Jouppi, N. P. and Smith, J. (1997) Complexity-effective superscalar processors. In Proc. 24th Int. Symp. Computer Architecture, Denver, CO, June 1-4, pp. 206-218. ACM Press, New York, NY.
    • (1997) Proc. 24th Int. Symp. Computer Architecture , pp. 206-218
    • Palacharla, S.1    Jouppi, N.P.2    Smith, J.3
  • 13
    • 0029200683
    • Simultaneous multithreading: Maximizing on chip parallelism
    • Santa Margherita Ligure, Italy, June 22-24, ACM Press, New York, NY
    • Tullsen, D. M., Eggers, S. and Levy, H. M. (1995) Simultaneous multithreading: Maximizing on chip parallelism. In Proc. 22nd Annual Int. Symp. Computer Architecture, Santa Margherita Ligure, Italy, June 22-24, pp. 392-403. ACM Press, New York, NY.
    • (1995) Proc. 22nd Annual Int. Symp. Computer Architecture , pp. 392-403
    • Tullsen, D.M.1    Eggers, S.2    Levy, H.M.3
  • 15
    • 0035696763
    • Reducing the Complexity of the register file in dynamic superscalar processors
    • Austin, TX, December 1-5, IEEE Computer Society, Washington, DC
    • Balasubramonian, R., Dwarkadas, S. and Albonesi, D. (2001) Reducing the Complexity of the register file in dynamic superscalar processors. In Proc. 34th Int. Symp. on Microarchitecture, Austin, TX, December 1-5, pp. 237-248. IEEE Computer Society, Washington, DC.
    • (2001) Proc. 34th Int. Symp. on Microarchitecture , pp. 237-248
    • Balasubramonian, R.1    Dwarkadas, S.2    Albonesi, D.3
  • 16
    • 33644888865
    • Complex SOCs require new architectures
    • Available at
    • Diefendorff, K. and Duquesne, Y. (2002) Complex SOCs require new architectures. EE Times. Available at http://www.eetimes.com/issue/se/ OEG20020911S0076.
    • (2002) EE Times
    • Diefendorff, K.1    Duquesne, Y.2
  • 17
    • 2042458649
    • A survey of processors with explicit multithreading
    • Ungerer, T., Robic, B. and Silc, J. (2003) A survey of processors with explicit multithreading. ACM Comput. Surveys, 35, 29-63.
    • (2003) ACM Comput. Surveys , vol.35 , pp. 29-63
    • Ungerer, T.1    Robic, B.2    Silc, J.3
  • 18
    • 0035182594
    • Area and system clock effects on SMT/CMP processors
    • Barcelona, Spain, September 8-12, IEEE Computer Society, Washington, DC
    • Burns, J. and Gaudiot, J.-L. (2001) Area and system clock effects on SMT/CMP processors. In Proc. 2001 Int. Conf. Parallel Architectures and Compilation Techniques, Barcelona, Spain, September 8-12, pp. 211-218. IEEE Computer Society, Washington, DC.
    • (2001) Proc. 2001 Int. Conf. Parallel Architectures and Compilation Techniques , pp. 211-218
    • Burns, J.1    Gaudiot, J.-L.2
  • 19
    • 35248832488
    • Multi-threaded microprocessors evolution or revolution
    • Aizu, Japan, September 23-26, LNCS 2823, Springer, Berlin, Germany
    • Jesshope, C. R. (2003) Multi-threaded microprocessors evolution or revolution. In Proc. 8th Asia-Pacific Conf. ACSAC'2003, Aizu, Japan, September 23-26, LNCS 2823, pp. 21-45. Springer, Berlin, Germany.
    • (2003) Proc. 8th Asia-Pacific Conf. ACSAC'2003 , pp. 21-45
    • Jesshope, C.R.1
  • 20
    • 33644889137
    • Performance of a microthreaded pipeline
    • Melbourne, Victoria, Australia, January 28-February 2, Australia Computer Society, Inc. Darlinghurst, Australia
    • Luo, B. and Jesshope, C. R. (2002) Performance of a microthreaded pipeline. In Proc. 7th Asia-Pacific Conf. Computer Systems Architecture, Melbourne, Victoria, Australia, January 28-February 2, pp. 83-90. Australia Computer Society, Inc. Darlinghurst, Australia.
    • (2002) Proc. 7th Asia-Pacific Conf. Computer Systems Architecture , pp. 83-90
    • Luo, B.1    Jesshope, C.R.2
  • 21
    • 84949521500
    • Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines
    • Gold Coast, Queensland, Australia, January 29-30, IEEE Computer Society, Los Alamitos, CA
    • Jesshope, C. R. (2001) Implementing an efficient vector instruction set in a chip multi-processor using micro-threaded pipelines. In Proc. ACSAC 2001, Gold Coast, Queensland, Australia, January 29-30, pp. 80-88. IEEE Computer Society, Los Alamitos, CA.
    • (2001) Proc. ACSAC 2001 , pp. 80-88
    • Jesshope, C.R.1
  • 23
    • 0003574189
    • MIT and McGraw-Hill, New York, St Louis, San Francisco
    • Hwang, K. (1993) Advanced Computer Architecture. MIT and McGraw-Hill, New York, St Louis, San Francisco.
    • (1993) Advanced Computer Architecture
    • Hwang, K.1
  • 24
    • 0034442570
    • Image and video processing using MAJC 5200
    • Vancouver, BC, Canada, September 10-13, IEEE Computer Society, Washington, DC
    • Sudharsanan, S., Sriram, P., Frederickson, H. and Gulati, A. (2000) Image and video processing using MAJC 5200. In Proc. 2000 IEEE Int. Conf. Image Processing, Vancouver, BC, Canada, September 10-13, pp. 122-125. IEEE Computer Society, Washington, DC.
    • (2000) Proc. 2000 IEEE Int. Conf. Image Processing , pp. 122-125
    • Sudharsanan, S.1    Sriram, P.2    Frederickson, H.3    Gulati, A.4
  • 25
    • 84949810527
    • Eliminating squashes through learning cross-thread violations in speculative parallelisation for multiprocessors
    • Boston, MA, February 2-6, IEEE Computer Society, Washington, DC
    • Cintra, M. and Torrellas, J. (2002) Eliminating squashes through learning cross-thread violations in speculative parallelisation for multiprocessors. In Proc. 8th Int. Symp. High-Performance Computer Architecture, Boston, MA, February 2-6, pp. 43-54. IEEE Computer Society, Washington, DC.
    • (2002) Proc. 8th Int. Symp. High-Performance Computer Architecture , pp. 43-54
    • Cintra, M.1    Torrellas, J.2
  • 26
    • 0033689702
    • Architecture support for scalable speculative parallelization in shared-memory multiprocessors
    • Vancouver, Canada, June 10-14, ACM Press, New York, NY
    • Cintra, M., Martinez, J. S. and Torrellas, J. (2000) Architecture support for scalable speculative parallelization in shared-memory multiprocessors. In Proc. Int. Symp. Computer Architecture, Vancouver, Canada, June 10-14, pp. 13-24. ACM Press, New York, NY.
    • (2000) Proc. Int. Symp. Computer Architecture , pp. 13-24
    • Cintra, M.1    Martinez, J.S.2    Torrellas, J.3
  • 28
    • 0041583138
    • Inside IA-64
    • Halfhill, T. (1998) Inside IA-64. Byte Magaz., 23, 81-88.
    • (1998) Byte Magaz. , vol.23 , pp. 81-88
    • Halfhill, T.1
  • 29
    • 0342673658
    • EPIC: An architecture for instruction-level parallel processors
    • Compiler and Architecture Research, HPL-1999-111. HP Laboratories, Palo Alto
    • Schlansker, M. S. and Rau, B. R. (2000) EPIC: An architecture for instruction-level parallel processors. Compiler and Architecture Research, HPL-1999-111. HP Laboratories, Palo Alto.
    • (2000)
    • Schlansker, M.S.1    Rau, B.R.2
  • 30
    • 0030684575
    • Multiscalar execution along a single flow of control
    • Bloomington, IL, August 11-15, IEEE Computer Society, Washington, DC
    • Sundararaman, K. and Franklin, M. (1997) Multiscalar execution along a single flow of control. In Proc. IEEE Int. Conf. Parallel Processing, Bloomington, IL, August 11-15, pp. 106-113. IEEE Computer Society, Washington, DC.
    • (1997) Proc. IEEE Int. Conf. Parallel Processing , pp. 106-113
    • Sundararaman, K.1    Franklin, M.2
  • 32
    • 0028768028
    • The anatomy of the register file in a multiscalar processor
    • San Jose, CA, November 30-December 2, ACM Press, New York, NY
    • Breach, S. E., Vijaykumar, T. N. and Sohi, G. S. (1994) The anatomy of the register file in a multiscalar processor. In Proc. 27th Int. Symp. Microarchitecture, San Jose, CA, November 30-December 2, pp. 181-190. ACM Press, New York, NY.
    • (1994) Proc. 27th Int. Symp. Microarchitecture , pp. 181-190
    • Breach, S.E.1    Vijaykumar, T.N.2    Sohi, G.S.3
  • 34
  • 36
    • 0003336316
    • Simultaneous multithreading: Multiple Alpha's performance
    • In Presentation at the Microprocessor Forum'99, MicroDesign Resources, San Jose, CA
    • Emer, J. (1999) Simultaneous multithreading: Multiple Alpha's performance. In Presentation at the Microprocessor Forum'99, MicroDesign Resources, San Jose, CA.
    • (1999)
    • Emer, J.1
  • 37
    • 0035104225
    • Architecture of the Atlas Chip Multiprocessor: Dynamically parallelising irregular applications
    • Codrescu, L., Wills, D. S. and Meindl, J. D. (2001) Architecture of the Atlas Chip Multiprocessor: Dynamically parallelising irregular applications. IEEE Comput. Soc., 50, 67-82.
    • (2001) IEEE Comput. Soc. , vol.50 , pp. 67-82
    • Codrescu, L.1    Wills, D.S.2    Meindl, J.D.3
  • 38
    • 0001948133
    • Power4 focuses on memory bandwidth: IBM confronts IA-64, says ISA not important
    • Diefendorff, K. (1999) Power4 focuses on memory bandwidth: IBM confronts IA-64, says ISA not important. Microprocessor Rep., 13, 11-17.
    • (1999) Microprocessor Rep. , vol.13 , pp. 11-17
    • Diefendorff, K.1
  • 39
    • 0036110799
    • Design of an 8-wide superscalar RISC microprocessor with simultaneous multithreading
    • San Francisco, CA, February 4-6, IEEE Solid-State Circuits, USA
    • Preston, R. P. et al. (2002) Design of an 8-wide superscalar RISC microprocessor with simultaneous multithreading. In Proc. 2002 IEEE Int. Solid-State Circuits Conf., San Francisco, CA, February 4-6, pp. 334-335. IEEE Solid-State Circuits, USA.
    • (2002) Proc. 2002 IEEE Int. Solid-State Circuits Conf. , pp. 334-335
    • Preston, R.P.1
  • 41
    • 41349090027
    • Reducing register ports for higher speed and lower energy
    • Istanbul, Turkey, November 18-22, IEEE Computer Society, Los Alamitos, CA
    • Park, I., Powell, M. D. and Vijaykumar, T. N. (2002) Reducing register ports for higher speed and lower energy. In Proc. 35th Annual ACM/IEEE Int. Symp. Microarchitecture, Istanbul, Turkey, November 18-22, pp. 171-182. IEEE Computer Society, Los Alamitos, CA.
    • (2002) Proc. 35th Annual ACM/IEEE Int. Symp. Microarchitecture , pp. 171-182
    • Park, I.1    Powell, M.D.2    Vijaykumar, T.N.3
  • 42
    • 1142280977
    • Reducing register ports using delayed write-back queues and operand pre-fetch
    • San Francisco, CA, June 23-26, ACM Press, New York, NY
    • Kim, N. S. and Mudge, T. (2003) Reducing register ports using delayed write-back queues and operand pre-fetch. In Proc. 17th Annual Int. Conf. Supercomputing, San Francisco, CA, June 23-26, pp. 172-182. ACM Press, New York, NY.
    • (2003) Proc. 17th Annual Int. Conf. Supercomputing , pp. 172-182
    • Kim, N.S.1    Mudge, T.2
  • 43
    • 0038008204
    • Banked multiported register files for high-frequency superscalar microprocessors
    • San Diego, CA, June 9-11, ACM Press, New York, NY
    • Tseng, J. H. and Asanovic, K. (2003) Banked multiported register files for high-frequency superscalar microprocessors. In Proc. 30th Int. Symp. Computer Architecture, San Diego, CA, June 9-11, pp. 62-71. ACM Press, New York, NY.
    • (2003) Proc. 30th Int. Symp. Computer Architecture , pp. 62-71
    • Tseng, J.H.1    Asanovic, K.2
  • 44
    • 0345412726
    • Reducing operand transport complexity of superscalar processors using distributed register files
    • San Jose, CA, October 13-15, IEEE Computer Society, Los Alamitos, CA
    • Bunchua, S., Wills, D. S. and Wills, L. M. (2003) Reducing operand transport complexity of superscalar processors using distributed register files. In Proc. 21st Int. Conf. Computer Design, San Jose, CA, October 13-15, pp. 532-535. IEEE Computer Society, Los Alamitos, CA.
    • (2003) Proc. 21st Int. Conf. Computer Design , pp. 532-535
    • Bunchua, S.1    Wills, D.S.2    Wills, L.M.3
  • 46
    • 33644880589
    • Micro-grids - The exploitation of massive on-chip concurrency
    • L. Grandinetti (ed.), Cetraro, Italy, May 31-June 3. Elsevier, Amsterdam
    • Jesshope, C. R. (2005) Micro-grids - the exploitation of massive on-chip concurrency. In Proc. HPC Workshop 2004, Grid Computing: A New Frontier of High Performance Computing, L. Grandinetti (ed.), Cetraro, Italy, May 31-June 3. Elsevier, Amsterdam.
    • (2005) Proc. HPC Workshop 2004, Grid Computing: A New Frontier of High Performance Computing
    • Jesshope, C.R.1
  • 47
    • 33646534810
    • The challenges of massive on-chip concurrency
    • Singapore, October 24-26. LNCS, Springer-Verlag
    • Bousias, K. and Jesshope, C. R. (2005) The challenges of massive on-chip concurrency. Tenth Asia-Pacific Computer Systems Architecture Conference, Singapore, October 24-26. LNCS 3740, pp. 157-170. Springer-Verlag.
    • (2005) Tenth Asia-Pacific Computer Systems Architecture Conference , vol.3740 , pp. 157-170
    • Bousias, K.1    Jesshope, C.R.2
  • 48
    • 0003657403
    • Globally Asynchronous Locally Synchronous Circuits
    • PhD Thesis, Report No. STAN-CS-84-1026, Stanford University
    • Shapiro, D. (1984) Globally Asynchronous Locally Synchronous Circuits. PhD Thesis, Report No. STAN-CS-84-1026, Stanford University.
    • (1984)
    • Shapiro, D.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.