



2012, Pages 165-176

Supporting efficient collective communication in NoCs

Author keywords

[No Author keywords available]

Indexed keywords

COLLECTIVE COMMUNICATIONS; COMMUNICATION OPERATION; DIRECTORY PROTOCOL; HARDWARE SUPPORTS; MANY-TO-ONE; MULTICAST ALGORITHMS; MULTICAST COMMUNICATION; MULTICASTS; NETWORK LOAD; NETWORK SATURATION; NETWORKS ON CHIPS; PACKET LATENCIES; PARALLEL PROGRAMMING PARADIGMS; POWER SAVINGS; PRIMARY CONTRIBUTION; RESOURCE CONFIGURATIONS; SATURATION THROUGHPUT; THROUGHPUT IMPROVEMENT

EID: 84860323871     PISSN: 15300897     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/HPCA.2012.6168953     Document Type: Conference Paper
Times cited: 25

References (43)
  • 1
    • 64949116918
    • Enabling fully adaptive multicast routing for CMP interconnection networks
    • MRR
    • P. Abad et al. MRR: Enabling fully adaptive multicast routing for CMP interconnection networks. In HPCA 2009.
    • HPCA 2009
    • Abad, P.1
  • 2
    • 11144287593
    • An overview of the BlueGene/L supercomputer
    • N. R. Adiga et al. An overview of the BlueGene/L supercomputer. In SC 2002.
    • SC 2002
    • Adiga, N.R.1
  • 3
    • 63549095070
    • The PARSEC benchmark suite: Characterization and architectural implications
    • C. Bienia et al. The PARSEC benchmark suite: characterization and architectural implications. In PACT 2008.
    • PACT 2008
    • Bienia, C.1
  • 4
    • 36348965353
    • The Power of Priority: NoC Based Distributed Cache Coherency
    • E. Bolotin et al. The Power of Priority: NoC Based Distributed Cache Coherency. In NOCS 2007.
    • NOCS 2007
    • Bolotin, E.1
  • 8
    • 0027837827
    • A new theory of deadlock-free adaptive routing in wormhole networks
    • Dec.
    • J. Duato. A new theory of deadlock-free adaptive routing in wormhole networks. IEEE Trans. Parallel Distrib. Syst., 4(12):1320-1331, Dec. 1993.
    • (1993) IEEE Trans. Parallel Distrib. Syst. , vol.4 , Issue.12 , pp. 1320-1331
    • Duato, J.1
  • 10
    • 77953105129
    • SigNet: Network-on-chip filtering for coarse vector directories
    • N. Enright Jerger. SigNet: Network-on-chip filtering for coarse vector directories. In DATE 2010.
    • DATE 2010
    • Enright Jerger, N.1
  • 12
    • 52649171528
    • Virtual Circuit Tree Multicasting: A case for on-chip hardware multicast support
    • N. Enright Jerger et al. Virtual Circuit Tree Multicasting: A case for on-chip hardware multicast support. In ISCA 2008.
    • ISCA 2008
    • Enright Jerger, N.1
  • 13
    • 66749163103
    • Virtual tree coherence: Leveraging regions and in-network multicast trees for scalable cache coherence
    • N. Enright Jerger et al. Virtual tree coherence: Leveraging regions and in-network multicast trees for scalable cache coherence. In MICRO 2008.
    • MICRO 2008
    • Enright Jerger, N.1
  • 14
    • 0030243819
    • Energy dissipation in general purpose microprocessors
    • Sep.
    • R. Gonzalez and M. Horowitz. Energy dissipation in general purpose microprocessors. IEEE J. Sol. St. Cir., 31(9):1277-1284, Sep. 1996.
    • (1996) IEEE J. Sol. St. Cir. , vol.31 , Issue.9 , pp. 1277-1284
    • Gonzalez, R.1    Horowitz, M.2
  • 15
    • 84860351827
    • The NYU Ultracomputer - Designing a MIMD, shared-memory parallel machine
    • A. Gottlieb et al. The NYU Ultracomputer - designing a MIMD, shared-memory parallel machine. In ISCA 1982.
    • ISCA 1982
    • Gottlieb, A.1
  • 16
    • 57749191721
    • Regional Congestion Awareness for load balance in networks-on-chip
    • P. Gratz et al. Regional Congestion Awareness for load balance in networks-on-chip. In HPCA 2008.
    • HPCA 2008
    • Gratz, P.1
  • 17
    • 36349000348
    • Implementation and evaluation of a dynamically routed processor operand network
    • P. Gratz et al. Implementation and evaluation of a dynamically routed processor operand network. In NOCS 2007.
    • NOCS 2007
    • Gratz, P.1
  • 18
    • 0001617669
    • Reducing memory and traffic requirements for scalable directory-based cache coherence schemes
    • A. Gupta et al. Reducing memory and traffic requirements for scalable directory-based cache coherence schemes. In ICPP 1990.
    • ICPP 1990
    • Gupta, A.1
  • 20
    • 70350584637
    • Multicast routing with dynamic packet fragmentation
    • Y. H. Kang et al. Multicast routing with dynamic packet fragmentation. In GLSVLSI 2009.
    • GLSVLSI 2009
    • Kang, Y.H.1
  • 21
    • 27944435722
    • A low latency router supporting adaptivity for on-chip interconnects
    • J. Kim et al. A low latency router supporting adaptivity for on-chip interconnects. In DAC 2005.
    • DAC 2005
    • Kim, J.1
  • 22
    • 84858790896
    • Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication
    • T. Krishna et al. Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication. In MICRO 2011.
    • MICRO 2011
    • Krishna, T.1
  • 23
    • 52949114554
    • A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS
    • A. Kumar et al. A 4.6Tbits/s 3.6GHz single-cycle NoC router with a novel switch allocator in 65nm CMOS. In ICCD 2007.
    • ICCD 2007
    • Kumar, A.1
  • 24
    • 0026980902
    • The network architecture of the connection machine CM-5
    • C. Leiserson et al. The network architecture of the connection machine CM-5. In J. Parallel Distrib. Comput., pages 272-285, 1992.
    • (1992) J. Parallel Distrib. Comput. , pp. 272-285
    • Leiserson, C.1
  • 25
    • 0025429467
    • The directory-based cache coherence protocol for the DASH multiprocessor
    • D. Lenoski et al. The directory-based cache coherence protocol for the DASH multiprocessor. In ISCA 1990.
    • ISCA 1990
    • Lenoski, D.1
  • 26
    • 33749348991
    • Connection-oriented multicasting in wormhole-switched networks on chip
    • Z. Lu et al. Connection-oriented multicasting in wormhole-switched networks on chip. In ISVLSI 2006.
    • ISVLSI 2006
    • Lu, Z.1
  • 27
    • 80052533252
    • DBAR: An efficient routing algorithm to support multiple concurrent applications in networks-on-chip
    • S. Ma et al. DBAR: an efficient routing algorithm to support multiple concurrent applications in networks-on-chip. In ISCA 2011.
    • ISCA 2011
    • Ma, S.1
  • 28
    • 0036469676
    • Simics: A full system simulation platform
    • February
    • P. S. Magnusson et al. Simics: A full system simulation platform. Computer, 35:50-58, February 2002.
    • (2002) Computer , vol.35 , pp. 50-58
    • Magnusson, P.S.1
  • 29
    • 0038346234
    • Token coherence: Decoupling performance and correctness
    • M. Martin et al. Token coherence: decoupling performance and correctness. In ISCA 2003.
    • ISCA 2003
    • Martin, M.1
  • 30
    • 77955102506
    • Evaluating bufferless flow control for on-chip networks
    • G. Michelogiannakis et al. Evaluating bufferless flow control for on-chip networks. In NOCS 2010.
    • NOCS 2010
    • Michelogiannakis, G.1
  • 31
    • 80052536229
    • A case for heterogeneous on-chip interconnects for CMPs
    • A. K. Mishra et al. A case for heterogeneous on-chip interconnects for CMPs. In ISCA 2011.
    • ISCA 2011
    • Mishra, A.K.1
  • 34
    • 80052543351
    • TLSync: Support for multiple fast barriers using on-chip transmission lines
    • J. Oh et al. TLSync: support for multiple fast barriers using on-chip transmission lines. In ISCA 2011.
    • ISCA 2011
    • Oh, J.1
  • 35
    • 0002483291
    • Fast barrier synchronization in wormhole k-ary n-cube networks with multidestination worms
    • D. Panda. Fast barrier synchronization in wormhole k-ary n-cube networks with multidestination worms. In HPCA 1995.
    • HPCA 1995
    • Panda, D.1
  • 36
    • 0034818435
    • A delay model and speculative architecture for pipelined routers
    • L.-S. Peh and W. Dally. A delay model and speculative architecture for pipelined routers. In HPCA 2001.
    • HPCA 2001
    • Peh, L.-S.1    Dally, W.2
  • 37
    • 66749138110
    • Efficient unicast and multicast support for CMPs
    • S. Rodrigo et al. Efficient unicast and multicast support for CMPs. In MICRO 2008.
    • MICRO 2008
    • Rodrigo, S.1
  • 38
    • 79952070570
    • New theory for deadlock-free multicast routing in wormhole-switched virtual-channelless networks-on-chip
    • April
    • F. Samman et al. New theory for deadlock-free multicast routing in wormhole-switched virtual-channelless networks-on-chip. IEEE Trans. Parallel Distrib. Syst., 22(4):544-557, April 2011.
    • (2011) IEEE Trans. Parallel Distrib. Syst. , vol.22 , Issue.4 , pp. 544-557
    • Samman, F.1
  • 39
    • 70349826938
    • Recursive Partitioning Multicast: A bandwidth-efficient routing for networks-on-chip
    • L. Wang et al. Recursive Partitioning Multicast: A bandwidth-efficient routing for networks-on-chip. In NOCS 2009.
    • NOCS 2009
    • Wang, L.1
  • 40
    • 78650447497
    • Efficient lookahead routing and header compression for multicasting in networks-on-chip
    • L. Wang et al. Efficient lookahead routing and header compression for multicasting in networks-on-chip. In ANCS 2010.
    • ANCS 2010
    • Wang, L.1
  • 41
    • 79951882970
    • On an efficient NoC multicasting scheme in support of multiple applications running on irregular sub-networks
    • March
    • X. Wang et al. On an efficient NoC multicasting scheme in support of multiple applications running on irregular sub-networks. Microprocess. Microsyst., 35:119-129, March 2011.
    • (2011) Microprocess. Microsyst. , vol.35 , pp. 119-129
    • Wang, X.1
  • 42
    • 84910890874
    • Efficient implementation of barrier synchronization in wormhole-routed hypercube multicomputers
    • H. Xu et al. Efficient implementation of barrier synchronization in wormhole-routed hypercube multicomputers. In ICDCS 1992.
    • ICDCS 1992
    • Xu, H.1
  • 43
    • 79959969892
    • The Tianhe-1A supercomputer: Its hardware and software
    • X. Yang et al. The Tianhe-1A supercomputer: Its hardware and software. J. Comput. Sci. Technol., 26(3):344-351, 2011.
    • (2011) J. Comput. Sci. Technol. , vol.26 , Issue.3 , pp. 344-351
    • Yang, X.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.