SCOPUS Information Search Platform

Proceedings - International Symposium on Computer Architecture

Volume , Issue , 2010, Pages 441-450

Data marshaling for multi-core architectures

(5) Suleman, M. Aater (a); Mutlu, Onur (b); Joao, José A. (a); Khubaib (a); Patt, Yale N. (a)

a UNIVERSITY OF TEXAS AT AUSTIN (United States)

b CARNEGIE MELLON UNIVERSITY (United States)

Author keywords

CMP; Critical sections; Pipelining; Staged execution

Indexed keywords

CACHE MISS; CMP; CRITICAL SECTIONS; EXECUTION MODEL; HETEROGENEOUS MULTICORE; MULTICORE ARCHITECTURES; PERFORMANCE BENEFITS; STORAGE OVERHEAD;

COMPUTER CONTROL SYSTEMS; COMPUTERS; MICROPROCESSOR CHIPS;

COMPUTER ARCHITECTURE;

EID: 77954973999 PISSN: 10636897 EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/1815961.1816020 Document Type: Conference Paper

Times cited : (33)

References (46)

1
- 67650080017
- MySQL database engine 5.0.1. http://www.mysql.com, 2008.
- (2008) MySQL Database Engine 5.0.1.

2
- 67650033145
- SQLite database engine version 3.5.8. 2008.
- (2008) SQLite Database Engine Version 3.5.8.

3
- 85018355183
- SysBench: a system performance benchmark v0.4.8. 2008.
- (2008) SysBench: A System Performance Benchmark v0.4.8.

4
- 27544493676
- Mitigating Amdahl's law through EPI throttling
- M. Annavaram, E. Grochowski, and J. Shen. Mitigating Amdahl's law through EPI throttling. In ISCA-32, 2005.
- (2005) ISCA-32
- Annavaram, M.¹ Grochowski, E.² Shen, J.³

5
- 77954985762
- Tech. Brief
- Apple. Grand Central Dispatch. Tech. Brief, 2009.
- (2009) Apple. Grand Central Dispatch

6
- 0003605996
- Technical Report RNR-94-1007, NASA Ames Research Center
- D. H. Bailey et al. NAS parallel benchmarks. Technical Report RNR-94-1007, NASA Ames Research Center, 1994.
- (1994) NAS Parallel Benchmarks
- Bailey, D.H.¹

7
- 70450245578
- Thread criticality predictors for dynamic performance, power, and resource management in chip multiprocessors
- A. Bhattacharjee and M. Martonosi. Thread criticality predictors for dynamic performance, power, and resource management in chip multiprocessors. In ISCA, 2009.
- (2009) ISCA
- Bhattacharjee, A.¹ Martonosi, M.²

8
- 63549095070
- The PARSEC benchmark suite: Characterization and architectural implications
- C. Bienia et al. The PARSEC benchmark suite: Characterization and architectural implications. In PACT, 2008.
- (2008) PACT
- Bienia, C.¹

9
- 84976783312
- Implementing remote procedure calls
- A. D. Birrell and B. J. Nelson. Implementing remote procedure calls. ACM TOCS, 2(1):39-59, 1984.
- (1984) ACM TOCS , vol.2 , Issue.1 , pp. 39-59
- Birrell, A.D.¹ Nelson, B.J.²

10
- 0029191296
- Cilk: An efficient multithreaded runtime system
- R. D. Blumofe et al. Cilk: an efficient multithreaded runtime system. In PPoPP, 1995.
- (1995) PPoPP
- Blumofe, R.D.¹

11
- 84872973735
- Reinventing scheduling for multicore systems
- S. Boyd-Wickizer et al. Reinventing scheduling for multicore systems. In HotOS-XII, 2009.
- (2009) HotOS-XII
- Boyd-Wickizer, S.¹

12
- 57549118941
- The shared-thread multiprocessor
- J. A. Brown and D. M. Tullsen. The shared-thread multiprocessor. In ICS, 2008.
- (2008) ICS
- Brown, J.A.¹ Tullsen, D.M.²

13
- 34547473118
- Computation spreading: Employing hardware migration to specialize CMP cores on-the-fly
- K. Chakraborty, P. M. Wells, and G. S. Sohi. Computation spreading: Employing hardware migration to specialize CMP cores on-the-fly. In ASPLOS-XII, 2006.
- (2006) ASPLOS-XII
- Chakraborty, K.¹ Wells, P.M.² Sohi, G.S.³

14
- 0036949391
- A stateless, content-directed data prefetching mechanism
- R. Cooksey et al. A stateless, content-directed data prefetching mechanism. In ASPLOS, 2002.
- (2002) ASPLOS
- Cooksey, R.¹

15
- 33646144623
- The OpenMP source code repository
- A. J. Dorta et al. The OpenMP source code repository. In Euromicro, 2005.
- (2005) Euromicro
- Dorta, A.J.¹

16
- 64949179220
- Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems
- E. Ebrahimi et al. Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems. HPCA, 2009.
- (2009) HPCA
- Ebrahimi, E.¹

17
- 34547423880
- Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
- M. Gordon et al. Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. In ASPLOS, 2006.
- (2006) ASPLOS
- Gordon, M.¹

18
- 67349122208
- StagedDB: Designing database servers for modern hardware
- S. Harizopoulos and A. Ailamaki. StagedDB: Designing database servers for modern hardware. IEEE Data Eng. Bull., June 2005.
- (2005) IEEE Data Eng. Bull., June
- Harizopoulos, S.¹ Ailamaki, A.²

19
- 48249118853
- Amdahl's law in the multicore era
- M. Hill and M. Marty. Amdahl's law in the multicore era. IEEE Computer, 41(7), 2008.
- (2008) IEEE Computer , vol.41 , Issue.7
- Hill, M.¹ Marty, M.²

20
- 70449669476
- DDCache: Decoupled and delegable cache data and metadata
- H. Hossain et al. DDCache: Decoupled and delegable cache data and metadata. In PACT, 2009.
- (2009) PACT
- Hossain, H.¹

21
- 67650075062
- Intel
- Intel. Source code for Intel threading building blocks.
- Source Code for Intel Threading Building Blocks

22
- 77954964332
- Intel
- Intel. Getting Started with Intel Parallel Studio, 2009.
- (2009) Getting Started with Intel Parallel Studio

23
- 0030677583
- Prefetching using Markov predictors
- D. Joseph and D. Grunwald. Prefetching using Markov predictors. In ISCA, 1997.
- (1997) ISCA
- Joseph, D.¹ Grunwald, D.²

24
- 0025429331
- Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers
- N. P. Jouppi. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. In ISCA-17, 1990.
- (1990) ISCA-17
- Jouppi, N.P.¹

25
- 0036949388
- An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
- C. Kim et al. An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches. In ASPLOS, 2002.
- (2002) ASPLOS
- Kim, C.¹

26
- 67650831862
- H. Kredel. Source code for traveling salesman problem (tsp). http://krum.rz.uni-mannheim.de/ba-pp-2007/java/index.html.
- Source Code for Traveling Salesman Problem (Tsp)
- Kredel, H.¹

27
- 85084163748
- Using cohort scheduling to enhance server performance
- J. R. Larus and M. Parkes. Using cohort scheduling to enhance server performance. In USENIX, 2002.
- (2002) USENIX
- Larus, J.R.¹ Parkes, M.²

28
- 0030685588
- The SGI Origin: A ccNUMA highly scalable server
- J. Laudon and D. Lenoski. The SGI Origin: A ccNUMA Highly Scalable Server. In ISCA, 1997.
- (1997) ISCA
- Laudon, J.¹ Lenoski, D.²

29
- 0033705677
- Push vs. pull: Data movement for linked data structures
- C.-L. Yang and A. R. Lebeck. Push vs. pull: Data movement for linked data structures. In ICS, 2000.
- (2000) ICS
- Yang, C.-L.¹ Lebeck, A.R.²

30
- 66749153837
- PhD thesis
- M. R. Marty. Cache coherence techniques for multicore processors. PhD thesis, 2008.
- (2008) Cache Coherence Techniques for Multicore Processors
- Marty, M.R.¹

31
- 33947328378
- Performance, power efficiency and scalability of asymmetric cluster chip multiprocessors
- T. Morad et al. Performance, power efficiency and scalability of asymmetric cluster chip multiprocessors. Comp Arch Letters, 2006.
- (2006) Comp Arch Letters
- Morad, T.¹

32
- 47349098275
- MineBench: A benchmark suite for data mining workloads
- R. Narayanan et al. MineBench: A benchmark suite for data mining workloads. In IISWC, 2006.
- (2006) IISWC
- Narayanan, R.¹

33
- 77954988455
- NVIDIA Corporation
- NVIDIA Corporation. CUDA SDK code samples, 2009.
- (2009) CUDA SDK Code Samples

34
- 64949187933
- Adaptive spill-receive for robust high-performance caching in CMPs
- M. K. Qureshi. Adaptive spill-receive for robust high-performance caching in CMPs. HPCA, 2009.
- (2009) HPCA
- Qureshi, M.K.¹

35
- 70450253535
- Thread motion: Fine-grained power management for multi-core systems
- K. K. Rangan et al. Thread motion: Fine-grained power management for multi-core systems. In ISCA, 2009.
- (2009) ISCA
- Rangan, K.K.¹

36
- 0030672607
- The interaction of software prefetching with ILP processors in shared-memory systems
- P. Ranganathan et al. The interaction of software prefetching with ILP processors in shared-memory systems. ISCA, 1997.
- (1997) ISCA
- Ranganathan, P.¹

37
- 70450279104
- Spatio-temporal memory streaming
- S. Somogyi et al. Spatio-temporal memory streaming. ISCA, 2009.
- (2009) ISCA
- Somogyi, S.¹

38
- 77952284721
- Fast switching of threads between cores
- R. Strong et al. Fast switching of threads between cores. SIGOPS Oper. Syst. Rev., 43(2), 2009.
- (2009) SIGOPS Oper. Syst. Rev. , vol.43 , Issue.2
- Strong, R.¹

39
- 48249131157
- Technical Report TR-HPS-2007-3001, Univ. of Texas at Austin
- M. A. Suleman et al. ACMP: Balancing hardware efficiency and programmer efficiency. Technical Report TR-HPS-2007-3001, Univ. of Texas at Austin, 2007.
- (2007) ACMP: Balancing Hardware Efficiency and Programmer Efficiency
- Suleman, M.A.¹

40
- 67650085373
- Technical Report TR-HPS-2008-3003, Univ. of Texas at Austin
- M. A. Suleman et al. An asymmetric multi-core architecture for accelerating critical sections. Technical Report TR-HPS-2008-3003, Univ. of Texas at Austin, 2008.
- (2008) An Asymmetric Multi-core Architecture for Accelerating Critical Sections
- Suleman, M.A.¹

41
- 67650033098
- Accelerating critical section execution with asymmetric multi-core architectures
- M. A. Suleman et al. Accelerating critical section execution with asymmetric multi-core architectures. ASPLOS, 2009.
- (2009) ASPLOS
- Suleman, M.A.¹

42
- 0036298603
- POWER4 system microarchitecture
- J. M. Tendler et al. POWER4 system microarchitecture. IBM Journal of Research and Development, 46(1):5-26, 2002.
- (2002) IBM Journal of Research and Development , vol.46 , Issue.1 , pp. 5-26
- Tendler, J.M.¹

43
- 0037521913
- StreamIt: A language for streaming applications
- W. Thies et al. StreamIt: A language for streaming applications. In 11th Conf. on Compiler Construction, 2002.
- (2002) 11th Conf. on Compiler Construction
- Thies, W.¹

44
- 84957872108
- The impact of speeding up critical sections with data prefetching and forwarding
- P. Trancoso and J. Torrellas. The impact of speeding up critical sections with data prefetching and forwarding. In ICPP, 1996.
- (1996) ICPP
- Trancoso, P.¹ Torrellas, J.²

45
- 0002002496
- Scalable high speed IP routing lookups
- M. Waldvogel, G. Varghese, J. Turner, and B. Plattner. Scalable high speed IP routing lookups. In SIGCOMM, 1997.
- (1997) SIGCOMM
- Waldvogel, M.¹ Varghese, G.² Turner, J.³ Plattner, B.⁴

46
- 27544495466
- Victim replication: Maximizing capacity while hiding wire delay in tiled chip multiprocessors
- M. Zhang and K. Asanovic. Victim replication: Maximizing capacity while hiding wire delay in tiled chip multiprocessors. In ISCA, 2005.
- (2005) ISCA
- Zhang, M.¹ Asanovic, K.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.