2011, Pages 296-307

Hardware transactional memory for GPU architectures

Author keywords

[No Author keywords available]

Indexed keywords

AREA OVERHEAD; ATOMIC OPERATION; BLOOM FILTERS; BROADCAST COMMUNICATION; CACHE COHERENCY; CONCURRENT THREADS; CONCURRENT TRANSACTIONS; CONFLICT DETECTION; GRAPHICS PROCESSOR UNITS; MEMORY ACCESS; MEMORY LOCATIONS; MULTIPLE THREADS; NOVEL HARDWARE; ON-CHIP STORAGE; SCRATCH PAD MEMORY; THREAD-LEVEL PARALLELISM; TRANSACTIONAL MEMORY; VALUE-BASED; WORKGROUPS;

EID: 84858761190     PISSN: 10724451     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2155620.2155655     Document Type: Conference Paper
Times cited: 84

References (59)
  • 2
    • 0001371923
    • Fast Discovery of Association Rules
    • American Association for Artificial Intelligence
    • R. Agrawal et al. Advances in Knowledge Discovery and Data Mining. chapter Fast Discovery of Association Rules. American Association for Artificial Intelligence, 1996.
    • (1996) Advances in Knowledge Discovery and Data Mining
    • Agrawal, R.1
  • 4
    • 34548737032
    • Stack Trace Analysis for Large Scale Debugging
    • D. Arnold et al. Stack Trace Analysis for Large Scale Debugging. In IPDPS, 2007.
    • (2007) IPDPS
    • Arnold, D.1
  • 6
    • 70349169075
    • Analyzing CUDA Workloads Using a Detailed GPU Simulator
    • A. Bakhoda et al. Analyzing CUDA Workloads Using a Detailed GPU Simulator. In ISPASS, 2009.
    • (2009) ISPASS
    • Bakhoda, A.1
  • 7
    • 79955888551
    • Bloom Filter Guided Transaction Scheduling
    • G. Blake, R. G. Dreslinski, and T. Mudge. Bloom Filter Guided Transaction Scheduling. In HPCA, 2011.
    • (2011) HPCA
    • Blake, G.1    Dreslinski, R.G.2    Mudge, T.3
  • 8
    • 35348871241
    • Making the Fast Case Common and the Uncommon Case Simple in Unbounded Transactional Memory
    • C. Blundell et al. Making the Fast Case Common and the Uncommon Case Simple in Unbounded Transactional Memory. In ISCA, 2007.
    • (2007) ISCA
    • Blundell, C.1
  • 9
    • 35348875372
    • Performance Pathologies in Hardware Transactional Memory
    • J. Bobba et al. Performance Pathologies in Hardware Transactional Memory. In ISCA, 2007.
    • (2007) ISCA
    • Bobba, J.1
  • 10
    • 52649149963
    • TokenTM: Efficient Execution of Large Transactions with Hardware Transactional Memory
    • J. Bobba et al. TokenTM: Efficient Execution of Large Transactions with Hardware Transactional Memory. In ISCA, 2008.
    • (2008) ISCA
    • Bobba, J.1
  • 12
    • 84858427151
    • An Efficient CUDA Implementation of the Tree-based Barnes Hut n-Body Algorithm
    • Chapter 6
    • M. Burtscher and K. Pingali. An Efficient CUDA Implementation of the Tree-based Barnes Hut n-Body Algorithm. Chapter 6 in GPU Computing Gems Emerald Edition, 2011.
    • (2011) GPU Computing Gems Emerald Edition
    • Burtscher, M.1    Pingali, K.2
  • 13
    • 79953100976
    • Hardware Acceleration of Transactional Memory on Commodity Systems
    • J. Casper et al. Hardware Acceleration of Transactional Memory on Commodity Systems. In ASPLOS, 2011.
    • (2011) ASPLOS
    • Casper, J.1
  • 14
    • 84863389339
    • Towards a Software Transactional Memory for Graphics Processors
    • D. Cederman et al. Towards a Software Transactional Memory for Graphics Processors. In EGPGV, 2010.
    • (2010) EGPGV
    • Cederman, D.1
  • 15
    • 33845866604
    • Bulk Disambiguation of Speculative Threads in Multiprocessors
    • L. Ceze, J. Tuck, J. Torrellas, and C. Cascaval. Bulk Disambiguation of Speculative Threads in Multiprocessors. In ISCA, 2006.
    • (2006) ISCA
    • Ceze, L.1    Tuck, J.2    Torrellas, J.3    Cascaval, C.4
  • 16
    • 34547700390
    • A Scalable, Non-blocking Approach to Transactional Memory
    • H. Chafi et al. A Scalable, Non-blocking Approach to Transactional Memory. In HPCA, 2007.
    • (2007) HPCA
    • Chafi, H.1
  • 17
    • 79951701409
    • ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory
    • J. Chung et al. ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory. In MICRO, 2010.
    • (2010) MICRO
    • Chung, J.1
  • 18
    • 84858771156
    • United States Patent #7,353,369: System and Method for Managing Divergent Threads in a SIMD Architecture (Assignee NVIDIA Corp.)
    • B. W. Coon et al. United States Patent #7,353,369: System and Method for Managing Divergent Threads in a SIMD Architecture (Assignee NVIDIA Corp.), April 2008.
    • (2008)
    • Coon, B.W.1
  • 19
    • 77749243410
    • NOrec: Streamlining STM by Abolishing Ownership Records
    • L. Dalessandro, M. F. Spear, and M. L. Scott. NOrec: Streamlining STM by Abolishing Ownership Records. In PPoPP, 2010.
    • (2010) PPoPP
    • Dalessandro, L.1    Spear, M.F.2    Scott, M.L.3
  • 21
    • 67650093724
    • Early Experience with a Commercial Hardware Transactional Memory Implementation
    • D. Dice et al. Early Experience With a Commercial Hardware Transactional Memory Implementation. In ASPLOS, 2009.
    • (2009) ASPLOS
    • Dice, D.1
  • 22
    • 79955887509
    • Cuckoo Directory: A Scalable Directory for Many-Core Systems
    • M. Ferdman et al. Cuckoo Directory: A Scalable Directory for Many-Core Systems. In HPCA, 2011.
    • (2011) HPCA
    • Ferdman, M.1
  • 23
    • 47349104432
    • Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow
    • W. Fung et al. Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow. In MICRO, 2007.
    • (2007) MICRO
    • Fung, W.1
  • 24
    • 68549096107
    • Dynamic Warp Formation: Efficient MIMD Control Flow on SIMD Graphics Hardware
    • W. Fung et al. Dynamic Warp Formation: Efficient MIMD Control Flow on SIMD Graphics Hardware. ACM TACO, 6(2), 2009.
    • (2009) ACM TACO , vol.6 , Issue.2
    • Fung, W.1
  • 25
    • 77954004688
    • An Efficient Software Transactional Memory Using Commit-Time Invalidation
    • J. E. Gottschlich et al. An Efficient Software Transactional Memory Using Commit-Time Invalidation. In CGO, 2010.
    • (2010) CGO
    • Gottschlich, J.E.1
  • 26
    • 57349163660
    • On the Correctness of Transactional Memory
    • R. Guerraoui and M. Kapalka. On the Correctness of Transactional Memory. In PPoPP, 2008.
    • (2008) PPoPP
    • Guerraoui, R.1    Kapalka, M.2
  • 28
    • 0027262011
    • Transactional Memory: Architectural Support for Lock-Free Data Structures
    • M. Herlihy and J. E. B. Moss. Transactional Memory: Architectural Support for Lock-Free Data Structures. In ISCA, 1993.
    • (1993) ISCA
    • Herlihy, M.1    Moss, J.E.B.2
  • 29
    • 78149269568
    • WAYPOINT: Scaling Coherence to Thousand-Core Architectures
    • J. H. Kelm et al. WAYPOINT: Scaling Coherence to Thousand-Core Architectures. In PACT, 2010.
    • (2010) PACT
    • Kelm, J.H.1
  • 30
    • 79953889255
    • RMS-TM: A Comprehensive Benchmark Suite for Transactional Memory Systems
    • G. Kestor et al. RMS-TM: A Comprehensive Benchmark Suite for Transactional Memory Systems. In ICPE '11, 2011.
    • (2011) ICPE '11
    • Kestor, G.1
  • 31
    • 84858783752
    • Khronos Group. OpenCL. http://www.khronos.org/opencl/.
    • OpenCL
  • 32
    • 35948996973
    • Time-Out Bloom Filter: A New Sampling Method for Recording More Flows
    • S. Kong et al. Time-Out Bloom Filter: A New Sampling Method for Recording More Flows. In ICOIN, 2006.
    • (2006) ICOIN
    • Kong, S.1
  • 33
    • 33646892173
    • The Problem with Threads
    • May
    • E. A. Lee. The Problem with Threads. Computer, 39, May 2006.
    • (2006) Computer , vol.39
    • Lee, E.A.1
  • 34
    • 0002225708
    • Chap - A SIMD Graphics Processor
    • A. Levinthal and T. Porter. Chap - A SIMD Graphics Processor. In SIGGRAPH, 1984.
    • (1984) SIGGRAPH
    • Levinthal, A.1    Porter, T.2
  • 35
    • 44849137198
    • NVIDIA Tesla: A Unified Graphics and Computing Architecture
    • E. Lindholm et al. NVIDIA Tesla: A Unified Graphics and Computing Architecture. Micro, IEEE, 2008.
    • (2008) Micro, IEEE
    • Lindholm, E.1
  • 36
    • 35348853739
    • An Effective Hybrid Transactional Memory System with Strong Isolation Guarantees
    • C. C. Minh et al. An Effective Hybrid Transactional Memory System with Strong Isolation Guarantees. In ISCA, 2007.
    • (2007) ISCA
    • Minh, C.C.1
  • 37
    • 0026886033
    • PixelFlow: High-Speed Rendering Using Image Composition
    • S. Molnar, J. Eyles, and J. Poulton. PixelFlow: High-Speed Rendering Using Image Composition. In SIGGRAPH, 1992.
    • (1992) SIGGRAPH
    • Molnar, S.1    Eyles, J.2    Poulton, J.3
  • 38
    • 33748873605
    • LogTM: Log-Based Transactional Memory
    • K. Moore et al. LogTM: Log-Based Transactional Memory. In HPCA, 2006.
    • (2006) HPCA
    • Moore, K.1
  • 39
    • 78651550268
    • Scalable Parallel Programming with CUDA
    • Mar.-Apr.
    • J. Nickolls et al. Scalable Parallel Programming with CUDA. ACM Queue, 6(2):40-53, Mar.-Apr. 2008.
    • (2008) ACM Queue , vol.6 , Issue.2 , pp. 40-53
    • Nickolls, J.1
  • 42
    • 47849112591
    • JudoSTM: A Dynamic Binary-Rewriting Approach to Software Transactional Memory
    • M. Olszewski et al. JudoSTM: A Dynamic Binary-Rewriting Approach to Software Transactional Memory. In PACT, 2007.
    • (2007) PACT
    • Olszewski, M.1
  • 44
    • 34548354208
    • Architectural Support for Software Transactional Memory
    • B. Saha et al. Architectural Support for Software Transactional Memory. In MICRO, 2006.
    • (2006) MICRO
    • Saha, B.1
  • 45
    • 47349104267
    • Implementing Signatures for Transactional Memory
    • D. Sanchez et al. Implementing Signatures for Transactional Memory. In MICRO, 2007.
    • (2007) MICRO
    • Sanchez, D.1
  • 46
    • 57649106258
    • Larrabee: A Many-Core x86 Architecture for Visual Computing
    • L. Seiler et al. Larrabee: A Many-Core x86 Architecture for Visual Computing. In SIGGRAPH, 2008.
    • (2008) SIGGRAPH
    • Seiler, L.1
  • 47
    • 77955084127
    • Technical Report HPL-2007-167. HP Laboratories
    • P. Shivakumar and N. Jouppi. CACTI 5.0. Technical Report HPL-2007-167. HP Laboratories, 2007.
    • (2007) CACTI 5.0
    • Shivakumar, P.1    Jouppi, N.2
  • 48
    • 52649096071
    • Flexible Decoupled Transactional Memory Support
    • A. Shriraman et al. Flexible Decoupled Transactional Memory Support. In ISCA, 2008.
    • (2008) ISCA
    • Shriraman, A.1
  • 49
    • 57349198226
    • RingSTM: Scalable Transactions with a Single Atomic Instruction
    • M. F. Spear et al. RingSTM: Scalable Transactions with a Single Atomic Instruction. In SPAA, 2008.
    • (2008) SPAA
    • Spear, M.F.1
  • 50
    • 79959620582
    • Transactional Conflict Decoupling and Value Prediction
    • F. Tabba et al. Transactional Conflict Decoupling and Value Prediction. ICS '11, 2011.
    • (2011) ICS '11
    • Tabba, F.1
  • 51
    • 78650838281
    • The Sharing Tracker: Using Ideas from Cache Coherence Hardware to Reduce Off-Chip Memory Traffic with Non-Coherent Caches
    • D. Tarjan and K. Skadron. The Sharing Tracker: Using Ideas from Cache Coherence Hardware to Reduce Off-Chip Memory Traffic with Non-Coherent Caches. SC '10, 2010.
    • (2010) SC '10
    • Tarjan, D.1    Skadron, K.2
  • 52
    • 76749151001
    • EazyHTM: Eager-Lazy Hardware Transactional Memory
    • S. Tomić et al. EazyHTM: Eager-Lazy Hardware Transactional Memory. In MICRO, 2009.
    • (2009) MICRO
    • Tomić, S.1
  • 53
    • 51849086874
    • CudaCuts: Fast Graph Cuts on the GPU
    • V. Vineet and P. Narayanan. CudaCuts: Fast Graph Cuts on the GPU. In CVPRW '08, 2008.
    • (2008) CVPRW '08
    • Vineet, V.1    Narayanan, P.2
  • 54
    • 85004920430
    • Merging and Transformation of Raster Images for Cartoon Animation
    • B. A. Wallace. Merging and Transformation of Raster Images for Cartoon Animation. In SIGGRAPH, 1981.
    • (1981) SIGGRAPH
    • Wallace, B.A.1
  • 55
    • 77952579552
    • Demystifying GPU microarchitecture through microbenchmarking
    • H. Wong et al. Demystifying GPU microarchitecture through microbenchmarking. In ISPASS, 2010.
    • (2010) ISPASS
    • Wong, H.1
  • 56
    • 34547683554
    • LogTM-SE: Decoupling Hardware Transactional Memory from Caches
    • L. Yen et al. LogTM-SE: Decoupling Hardware Transactional Memory from Caches. In HPCA, 2007.
    • (2007) HPCA
    • Yen, L.1
  • 57
    • 57349100347
    • Adaptive Transaction Scheduling for Transactional Memory Systems
    • R. M. Yoo and H.-H. S. Lee. Adaptive Transaction Scheduling for Transactional Memory Systems. In SPAA, 2008.
    • (2008) SPAA
    • Yoo, R.M.1    Lee, H.-H.S.2
  • 58
    • 78149246470
    • SPACE: Sharing Pattern-based Directory Coherence for Multicore Scalability
    • H. Zhao et al. SPACE: Sharing Pattern-based Directory Coherence for Multicore Scalability. In PACT, 2010.
    • (2010) PACT
    • Zhao, H.1
  • 59
    • 78149246223
    • Discovering and understanding performance bottlenecks in transactional applications
    • F. Zyulkyarov et al. Discovering and understanding performance bottlenecks in transactional applications. In PACT, 2010.
    • (2010) PACT
    • Zyulkyarov, F.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.