



Volume 24, Issue 6, 2009, Pages 1061-1073

Godson-T: An efficient many-core architecture for parallel program executions

Author keywords

Data communication; Many core; Multithread; Parallel computing; Runtime system; Thread synchronization

Indexed keywords

DATA-COMMUNICATION; MANY-CORE; MULTI-THREAD; PARALLEL COMPUTING; RUNTIME SYSTEMS; THREAD SYNCHRONIZATION;

EID: 70450170935     PISSN: 10009000     EISSN: None     Source Type: Journal    
DOI: 10.1007/s11390-009-9295-3     Document Type: Article
Times cited : (41)

References (38)
  • 1
    • 35648995516
    • The landscape of parallel computing research: A view from Berkeley
    • University of California, Berkeley, December 18
    • Asanovic K et al. The landscape of parallel computing research: A view from Berkeley. Technical Report No.UCB/EECS-2006-183, University of California, Berkeley, December 18, 2006.
    • (2006) Technical Report No.UCB/EECS-2006-183
    • Asanovic, K.1
  • 2
    • 33646892173
    • The problem with threads
    • DOI 10.1109/MC.2006.180
    • EA Lee 2006 The problem with threads Computer 39 5 33 42 10.1109/MC.2006.180 (Pubitemid 43786509)
    • (2006) Computer , vol.39 , Issue.5 , pp. 33-42
    • Lee, E.A.1
  • 3
    • 78651582149
    • Real-world concurrency
    • 10.1145/1454456.1454462
    • B Cantrill J Bonwick 2008 Real-world concurrency ACM Queue 6 5 16 25 10.1145/1454456.1454462
    • (2008) ACM Queue , vol.6 , Issue.5 , pp. 16-25
    • Cantrill, B.1    Bonwick, J.2
  • 6
    • 0000269759
    • Scheduling multithreaded computations by work stealing
    • 1065.68504 10.1145/324133.324234 1747653
    • RD Blumofe CE Leiserson 1999 Scheduling multithreaded computations by work stealing Journal of the ACM 46 5 720 748 1065.68504 10.1145/324133.324234 1747653
    • (1999) Journal of the ACM , vol.46 , Issue.5 , pp. 720-748
    • Blumofe, R.D.1    Leiserson, C.E.2
  • 8
    • 63649096141
    • Efficiency and scalability of barrier synchronization on NoC based many-core architecture
    • Atlanta, USA, Oct. 19-24
    • Villa O, Palermo G, Silvano C. Efficiency and scalability of barrier synchronization on NoC based many-core architecture. In Proc. CASES 2008, Atlanta, USA, Oct. 19-24, 2008, pp.81-90.
    • (2008) Proc. CASES 2008 , pp. 81-90
    • Villa, O.1    Palermo, G.2    Silvano, C.3
  • 10
    • 0002081678
    • Co-array Fortran for parallel programming
    • 10.1145/289918.289920
    • RW Numrich J Reid 1998 Co-array Fortran for parallel programming SIGPLAN Fortran Forum 17 2 1 31 10.1145/289918.289920
    • (1998) SIGPLAN Fortran Forum , vol.17 , Issue.2 , pp. 1-31
    • Numrich, R.W.1    Reid, J.2
  • 11
    • 0032155556
    • Titanium: A high-performance Java dialect
    • 10.1002/(SICI)1096-9128(199809/11)10:11/13<825::AID-CPE383>3.0.CO;2-H
    • K Yelick L Semenzato, et al. 1998 Titanium: A high-performance Java dialect Concurrency: Practice and Experience 10 11-13 825 836 10.1002/(SICI)1096-9128(199809/11)10:11/13<825::AID-CPE383>3.0.CO;2-H
    • (1998) Concurrency: Practice and Experience , vol.10 , Issue.11-13 , pp. 825-836
    • Yelick, K.1    Semenzato, L.2
  • 16
    • 35348812496
    • Synchronization state buffer: Supporting efficient fine-grain synchronization on many-core architectures
    • San Diego, USA, June 9-13
    • Zhu W, Sreedhar V C et al. Synchronization state buffer: Supporting efficient fine-grain synchronization on many-core architectures. In Proc. the 34th Annual International Symposium on Computer Architecture, San Diego, USA, June 9-13, 2007, pp.35-45.
    • (2007) Proc. the 34th Annual International Symposium on Computer Architecture , pp. 35-45
    • Zhu, W.1    Sreedhar, V.C.2
  • 17
    • 0029179077
    • The SPLASH-2 programs: Characterization and methodological considerations
    • Santa Margherita Ligure, Italy, June 22-24
    • Woo S C, Ohara M et al. The SPLASH-2 programs: Characterization and methodological considerations. In Proc. the 22nd Annual International Symposium on Computer Architecture, Santa Margherita Ligure, Italy, June 22-24, 1995, pp.24-36.
    • (1995) Proc. the 22nd Annual International Symposium on Computer Architecture , pp. 24-36
    • Woo, S.C.1    Ohara, M.2
  • 18
    • 4444237022
    • Exploiting the kernel trick to correlate fragment ions for peptide identification via tandem mass spectrometry
    • DOI 10.1093/bioinformatics/bth186
    • Y Fu Q Yang, et al. 2004 Exploiting the kernel trick to correlate fragment ions for peptide identification via tandem mass spectrometry Bioinformatics 20 12 1948 1954 10.1093/bioinformatics/bth186 (Pubitemid 39199057)
    • (2004) Bioinformatics , vol.20 , Issue.12 , pp. 1948-1954
    • Fu, Y.1    Yang, Q.2    Sun, R.3    Li, D.4    Zeng, R.5    Ling, C.X.6    Gao, W.7
  • 20
    • 0032627704
    • Evaluating synchronization on shared address space multiprocessors: Methodology and performance
    • 10.1145/301464.301477
    • S Kumar D Jiang, et al. 1999 Evaluating synchronization on shared address space multiprocessors: Methodology and performance ACM SIGMETRICS Performance Evaluation Review (SIGMETRICS 1999) 27 1 23 34 10.1145/301464.301477
    • (1999) ACM SIGMETRICS Performance Evaluation Review (SIGMETRICS 1999) , vol.27 , Issue.1 , pp. 23-34
    • Kumar, S.1    Jiang, D.2
  • 21
    • 0024032163
    • An analysis of the computational and parallel complexity of the Livermore loops
    • DOI 10.1016/0167-8191(88)90037-3
    • J Feo 1988 An analysis of the computational and parallel complexity of the Livermore loops Parallel Computing 7 2 163 185 0651.65033 10.1016/0167-8191(88)90037-3 (Pubitemid 18648054)
    • (1988) Parallel Computing , vol.7 , Issue.2 , pp. 163-185
    • Feo, J.T.1
  • 23
  • 25
    • 33750004191
    • Optimization of dense matrix multiplication on IBM Cyclops-64: Challenges and experiences
    • Dresden, Germany, August 28-September 1
    • Hu Z, Cuvillo J et al. Optimization of dense matrix multiplication on IBM Cyclops-64: Challenges and experiences. In Proc. Euro-Par 2006, Dresden, Germany, August 28-September 1, 2006, pp.134-144.
    • (2006) Proc. Euro-Par 2006 , pp. 134-144
    • Hu, Z.1    Cuvillo, J.2
  • 27
    • 34247349114
    • The potential of the cell processor for scientific computing
    • Ischia, Italy, May 3-5
    • Williams S, Shalf J et al. The potential of the cell processor for scientific computing. In Proc. CF'06, Ischia, Italy, May 3-5, 2006, pp.9-20.
    • (2006) Proc. CF'06 , pp. 9-20
    • Williams, S.1    Shalf, J.2    et al.
  • 28
    • 0034246578
    • Location consistency - a new memory model and cache consistency protocol
    • DOI 10.1109/12.868026
    • GR Gao V Sarkar 2000 Location consistency - A new memory model and cache consistency protocol IEEE Transactions on Computers 49 8 798 813 10.1109/12.868026 (Pubitemid 30927304)
    • (2000) IEEE Transactions on Computers , vol.49 , Issue.8 , pp. 798-813
    • Gao, G.R.1    Sarkar, V.2
  • 29
    • 0032671416
    • Commit-reconcile & fences (CRF): A new memory model for architects and compiler writers
    • Atlanta, USA, May 2-4
    • Shen X et al. Commit-reconcile & fences (CRF): A new memory model for architects and compiler writers. In Proc. the 26th Annual International Symposium on Computer Architecture, Atlanta, USA, May 2-4, 1999, pp.150-161.
    • (1999) Proc. the 26th Annual International Symposium on Computer Architecture , pp. 150-161
    • Shen, X.1    et al.
  • 32
    • 27644567646
    • Power efficient architecture and the cell processor
    • San Francisco, USA, February 12-16
    • Hofstee P. Power efficient architecture and the cell processor. In Proc. HPCA-11, San Francisco, USA, February 12-16, 2005, pp.258-262.
    • (2005) Proc. HPCA-11 , pp. 258-262
    • Hofstee, P.1
  • 33
    • 33746304031
    • Dissecting cyclops: A detailed analysis of a multithreaded architecture
    • 10.1145/773365.773369
    • G Almasi C Cascaval, et al. 2003 Dissecting cyclops: A detailed analysis of a multithreaded architecture ACM SIGARCH Computer Architecture News 31 1 26 38 10.1145/773365.773369
    • (2003) ACM SIGARCH Computer Architecture News , vol.31 , Issue.1 , pp. 26-38
    • Almasi, G.1    Cascaval, C.2
  • 34
    • 44849137198
    • NVIDIA Tesla: A unified graphics and computing architecture
    • DOI 10.1109/MM.2008.31
    • E Lindholm, et al. 2008 NVIDIA Tesla: A unified graphics and computing architecture IEEE Micro 28 2 39 55 10.1109/MM.2008.31 (Pubitemid 351796170)
    • (2008) IEEE Micro , vol.28 , Issue.2 , pp. 39-55
    • Lindholm, E.1    Nickolls, J.2    Oberman, S.3    Montrym, J.4
  • 36
    • 0031593999
    • Exploiting fine-grain thread level parallelism on the MIT multi-alu processor
    • Barcelona, Spain, June 27-July 1
    • Keckler S W et al. Exploiting fine-grain thread level parallelism on the MIT multi-alu processor. In Proc. the 25th Annual International Symposium on Computer Architecture, Barcelona, Spain, June 27-July 1, 1998, pp.306-317.
    • (1998) Proc. the 25th Annual International Symposium on Computer Architecture , pp. 306-317
    • Keckler, S.W.1
  • 38
    • 63649096141
    • Efficiency and scalability of barrier synchronization on NoC based many-core architecture
    • Atlanta, USA, October 19-24
    • Villa O et al. Efficiency and scalability of barrier synchronization on NoC based many-core architecture. In Proc. CASES 2008, Atlanta, USA, October 19-24, 2008, pp.81-90.
    • (2008) Proc. CASES 2008 , pp. 81-90
    • Villa, O.1    et al.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.