SCOPUS Information Search Platform

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP

Volume , Issue , 2013, Pages 219-228

Scheduling parallel programs by work stealing with private deques

(3) Acar, Umut A. (a); Charguéraud, Arthur (b); Rainey, Mike (c)

a Carnegie Mellon University (United States)

b INRIA SACLAY (France)

c MAX PLANCK INSTITUTE FOR SOFTWARE SYSTEMS (Germany)

Author keywords

dynamic load balancing; nested parallelism; work stealing

Indexed keywords

DEPTH FIRST SEARCH; DESIGN AND IMPLEMENTATIONS; DISTRIBUTION STRATEGIES; DIVIDE AND CONQUER; NESTED PARALLELISM; PROBABILISTIC MODELS; THEORETICAL GUARANTEES; WORK STEALING;

ALGORITHMS; COMPUTER PROGRAMMING LANGUAGES; DYNAMIC LOADS; FENCES; FLOCCULATION; NETWORK MANAGEMENT; PARALLEL PROGRAMMING; SCHEDULING;

PARALLEL ARCHITECTURES;

EID: 84875141794 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/2442516.2442538 Document Type: Conference Paper

Times cited : (80)

References (44)

1
- 84875130758
- Umut A. Acar, Arthur Charguéraud, and Mike Rainey. Technical report associated with the present paper. http://arthur.chargueraud.org/research/2013/ppopp/full.pdf
- Technical Report Associated with the Present Paper
- Acar, U.A.¹ Charguéraud, A.² Rainey, M.³

2
- 0036590708
- The data locality of work stealing
- Umut A. Acar, Guy E. Blelloch, and Robert D. Blumofe. The data locality of work stealing. Theory of Computing Systems (TOCS), 35(3):321-347, 2002.
- (2002) Theory of Computing Systems (TOCS) , vol.35 , Issue.3 , pp. 321-347
- Acar, U.A.¹ Blelloch, G.E.² Blumofe, R.D.³

3
- 0031628001
- Thread scheduling for multiprogrammed multiprocessors
- ACM Press
- Nimar S. Arora, Robert D. Blumofe, and C. Greg Plaxton. Thread scheduling for multiprogrammed multiprocessors. In SPAA '98, pages 119-129. ACM Press, 1998.
- (1998) SPAA '98 , pp. 119-129
- Arora, N.S.¹ Blumofe, R.D.² Plaxton, C.G.³

4
- 0344584867
- The natural work-stealing algorithm is stable
- May
- Petra Berenbrink, Tom Friedetzky, and Leslie Ann Goldberg. The natural work-stealing algorithm is stable. SIAM J. Comput., 32:1260-1279, May 2003.
- (2003) SIAM J. Comput. , vol.32 , pp. 1260-1279
- Berenbrink, P.¹ Friedetzky, T.² Goldberg, L.A.³

5
- 58449090994
- Provably good multicore cache performance for divide-and-conquer algorithms
- Guy E. Blelloch, Rezaul A. Chowdhury, Phillip B. Gibbons, Vijaya Ramachandran, Shimin Chen, and Michael Kozuch. Provably good multicore cache performance for divide-and-conquer algorithms. In Proceedings of the 19th ACM-SIAM Symposium on Discrete Algorithms, pages 501-510, 2008.
- (2008) Proceedings of the 19th ACM-SIAM Symposium on Discrete Algorithms , pp. 501-510
- Blelloch, G.E.¹ Chowdhury, R.A.² Gibbons, P.B.³ Ramachandran, V.⁴ Chen, S.⁵ Kozuch, M.⁶

6
- 84858427811
- Internally deterministic parallel algorithms can be fast
- NY, USA, ACM
- Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, and Julian Shun. Internally deterministic parallel algorithms can be fast. In PPoPP '12, pages 181-192, NY, USA, 2012. ACM.
- (2012) PPoPP '12 , pp. 181-192
- Blelloch, G.E.¹ Fineman, J.T.² Gibbons, P.B.³ Shun, J.⁴

7
- 0029696091
- A provable time and space efficient implementation of NESL
- ACM
- Guy E. Blelloch and John Greiner. A provable time and space efficient implementation of NESL. In ICFP '96, pages 213-225. ACM, 1996.
- (1996) ICFP '96 , pp. 213-225
- Blelloch, G.E.¹ Greiner, J.²

8
- 0002634823
- Scheduling multithreaded computations by work stealing
- 0
- R.D. Blumofe and C.E. Leiserson. Scheduling multithreaded computations by work stealing. Foundations of Computer Science, IEEE Annual Symposium on, 0:356-368, 1994.
- (1994) Foundations of Computer Science, IEEE Annual Symposium on , pp. 356-368
- Blumofe, R.D.¹ Leiserson, C.E.²

9
- 0029191296
- Cilk: An efficient multithreaded runtime system
- Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. Cilk: an efficient multithreaded runtime system. In PPoPP, pages 207-216, 1995.
- (1995) PPoPP , pp. 207-216
- Blumofe, R.D.¹ Joerg, C.F.² Kuszmaul, B.C.³ Leiserson, C.E.⁴ Randall, K.H.⁵ Zhou, Y.⁶

10
- 0000269759
- Scheduling multithreaded computations by work stealing
- September
- Robert D. Blumofe and Charles E. Leiserson. Scheduling multithreaded computations by work stealing. J. ACM, 46:720-748, September 1999.
- (1999) J. ACM , vol.46 , pp. 720-748
- Blumofe, R.D.¹ Leiserson, C.E.²

11
- 85035595949
- Executing functional programs on a virtual tree of processors
- ACM Press, October
- F. Warren Burton and M. Ronan Sleep. Executing functional programs on a virtual tree of processors. In Functional Programming Languages and Computer Architecture (FPCA '81), pages 187-194. ACM Press, October 1981.
- (1981) Functional Programming Languages and Computer Architecture (FPCA '81) , pp. 187-194
- Burton, F.W.¹ Sleep, M.R.²

12
- 32144435090
- Dynamic circular work-stealing deque
- David Chase and Yossi Lev. Dynamic circular work-stealing deque. In SPAA '05, pages 21-28, 2005.
- (2005) SPAA '05 , pp. 21-28
- Chase, D.¹ Lev, Y.²

13
- 84875199402
- Technical report, Sun Microsystems
- David Chase and Yossi Lev. Dynamic circular work-stealing deque. Technical report, Sun Microsystems, 2005.
- (2005) Dynamic Circular Work-stealing Deque
- Chase, D.¹ Lev, Y.²

14
- 55849100059
- Solving large, irregular graph problems using adaptive work-stealing
- Guojing Cong, Sreedhar B. Kodali, Sriram Krishnamoorthy, Doug Lea, Vijay A. Saraswat, and Tong Wen. Solving large, irregular graph problems using adaptive work-stealing. In ICPP, pages 536-545, 2008.
- (2008) ICPP , pp. 536-545
- Cong, G.¹ Kodali, S.B.² Krishnamoorthy, S.³ Lea, D.⁴ Saraswat, V.A.⁵ Wen, T.⁶

15
- 0030786221
- The effect of scheduling discipline on dynamic load sharing in heterogeneous distributed systems
- 0
- Sivarama P. Dandamudi. The effect of scheduling discipline on dynamic load sharing in heterogeneous distributed systems. Modeling, Analysis, and Simulation of Computer Systems, International Symposium on, 0:17, 1997.
- (1997) Modeling, Analysis, and Simulation of Computer Systems, International Symposium on , pp. 17
- Dandamudi, S.P.¹

16
- 34548771395
- Dynamic load balancing of unbalanced computations using message passing
- J. Dinan, S. Olivier, G. Sabin, J. Prins, P. Sadayappan, and C.-W. Tseng. Dynamic load balancing of unbalanced computations using message passing. In IPDPS '07. IEEE International, March 2007.
- IPDPS '07. IEEE International, March 2007
- Dinan, J.¹ Olivier, S.² Sabin, G.³ Prins, J.⁴ Sadayappan, P.⁵ Tseng, C.-W.⁶

17
- 74049140383
- Scalable work stealing
- ACM
- James Dinan, D. Brian Larkins, P. Sadayappan, Sriram Krishnamoorthy, and Jarek Nieplocha. Scalable work stealing. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09, pages 53:1-53:11. ACM, 2009.
- (2009) Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09
- Dinan, J.¹ Larkins, D.B.² Sadayappan, P.³ Krishnamoorthy, S.⁴ Nieplocha, J.⁵

18
- 0022676728
- A comparison of receiver-initiated and sender-initiated adaptive load sharing
- DOI 10.1016/0166-5316(86)90008-8
- Derek L. Eager, Edward D. Lazowska, and John Zahorjan. A comparison of receiver-initiated and sender-initiated adaptive load sharing. Perform. Eval., 6(1):53-68, 1986. (Pubitemid 16538292)
- (1986) Performance Evaluation , vol.6 , Issue.1 , pp. 53-68
- Eager, D.L.¹ Lazowska, E.D.² Zahorjan, J.³

19
- 85028891596
- A message passing implementation of lazy task creation
- Marc Feeley. A message passing implementation of lazy task creation. In Parallel Symbolic Computing, pages 94-107, 1992.
- (1992) Parallel Symbolic Computing , pp. 94-107
- Feeley, M.¹

20
- 0008802085
- PhD thesis, Brandeis University, MA, USA, UMI Order No. GAX93-22348
- Marc Feeley. An efficient and general implementation of futures on large scale shared-memory multiprocessors. PhD thesis, Brandeis University, MA, USA, 1993. UMI Order No. GAX93-22348.
- (1993) An Efficient and General Implementation of Futures on Large Scale Shared-memory Multiprocessors
- Feeley, M.¹

21
- 0027844215
- Polling efficiently on stock hardware
- NY, USA, ACM
- Marc Feeley. Polling efficiently on stock hardware. In Proceedings of the conference on Functional programming languages and computer architecture, FPCA '93, pages 179-187, NY, USA, 1993. ACM.
- (1993) Proceedings of the Conference on Functional Programming Languages and Computer Architecture, FPCA '93 , pp. 179-187
- Feeley, M.¹

22
- 80054926287
- Implicitly threaded parallelism in Manticore
- Matthew Fluet, Mike Rainey, John Reppy, and Adam Shaw. Implicitly threaded parallelism in Manticore. Journal of Functional Programming, 20(5-6):1-40, 2011.
- (2011) Journal of Functional Programming , vol.20 , Issue.5-6 , pp. 1-40
- Fluet, M.¹ Rainey, M.² Reppy, J.³ Shaw, A.⁴

23
- 0347507496
- The implementation of the Cilk-5 multithreaded language
- Matteo Frigo, Charles E. Leiserson, and Keith H. Randall. The implementation of the Cilk-5 multithreaded language. In PLDI, pages 212-223, 1998.
- (1998) PLDI , pp. 212-223
- Frigo, M.¹ Leiserson, C.E.² Randall, K.H.³

24
- 84862601360
- A performance model for x10 applications: What's going on under the hood?
- NY, USA, ACM
- David Grove, Olivier Tardieu, David Cunningham, Ben Herta, Igor Peshansky, and Vijay Saraswat. A performance model for X10 applications: what's going on under the hood? In Proceedings of the 2011 ACM SIGPLAN X10 Workshop, pages 1:1-1:8, NY, USA, 2011. ACM.
- (2011) Proceedings of the 2011 ACM SIGPLAN X10 Workshop
- Grove, D.¹ Tardieu, O.² Cunningham, D.³ Herta, B.⁴ Peshansky, I.⁵ Saraswat, V.⁶

25
- 0021658497
- Implementation of multilisp: Lisp on a multiprocessor
- ACM
- Robert H. Halstead, Jr. Implementation of multilisp: Lisp on a multiprocessor. In Proceedings of the 1984 ACM Symposium on LISP and functional programming, LFP '84, pages 9-17. ACM, 1984.
- (1984) Proceedings of the 1984 ACM Symposium on LISP and Functional Programming, LFP '84 , pp. 9-17
- Halstead Jr., R.H.¹

26
- 32844466488
- A dynamic-sized nonblocking work stealing deque
- February
- Danny Hendler, Yossi Lev, Mark Moir, and Nir Shavit. A dynamic-sized nonblocking work stealing deque. Distrib. Comput., 18:189-207, February 2006.
- (2006) Distrib. Comput. , vol.18 , pp. 189-207
- Hendler, D.¹ Lev, Y.² Moir, M.³ Shavit, N.⁴

27
- 0036954275
- Non-blocking steal-half work queues
- Danny Hendler and Nir Shavit. Non-blocking steal-half work queues. In PODC, pages 280-289, 2002.
- (2002) PODC , pp. 280-289
- Hendler, D.¹ Shavit, N.²

28
- 0036954486
- Work dealing
- ACM
- Danny Hendler and Nir Shavit. Work dealing. In SPAA '02, pages 164-172. ACM, 2002.
- (2002) SPAA '02 , pp. 164-172
- Hendler, D.¹ Shavit, N.²

29
- 67650093461
- Backtracking-based load balancing
- ACM
- Tasuku Hiraishi, Masahiro Yasugi, Seiji Umatani, and Taiichi Yuasa. Backtracking-based load balancing. In PPoPP '09, pages 55-64. ACM, 2009.
- (2009) PPoPP '09 , pp. 55-64
- Hiraishi, T.¹ Yasugi, M.² Umatani, S.³ Yuasa, T.⁴

30
- 84875186606
- Intel. Cilk Plus. http://software.intel.com/en-us/articles/intel-cilk-plus/.
- Cilk Plus

31
- 84875155436
- Specifications at
- Intel. Intel Xeon Processor X7550. Specifications at http://ark.intel.com/products/46498/Intel-Xeon-Processor-X7550-(18M-Cache-2-00-GHz-6-40-GTs-Intel-QPI).
- Intel Xeon Processor X7550

32
- 78249264449
- Regular, shape-polymorphic, parallel arrays in haskell
- Gabriele Keller, Manuel M.T. Chakravarty, Roman Leshchinskiy, Simon Peyton Jones, and Ben Lippmeier. Regular, shape-polymorphic, parallel arrays in haskell. In ICFP '10, pages 261-272, 2010.
- (2010) ICFP '10 , pp. 261-272
- Keller, G.¹ Chakravarty, M.M.T.² Leshchinskiy, R.³ Peyton Jones, S.⁴ Lippmeier, B.⁵

33
- 35348855586
- Carbon: Architectural support for fine-grained parallelism on chip multiprocessors
- June
- Sanjeev Kumar, Christopher J. Hughes, and Anthony Nguyen. Carbon: architectural support for fine-grained parallelism on chip multiprocessors. SIGARCH Computer Architecture News, 35:162-173, June 2007.
- (2007) SIGARCH Computer Architecture News , vol.35 , pp. 162-173
- Kumar, S.¹ Hughes, C.J.² Nguyen, A.³

34
- 67650093463
- Idempotent work stealing
- Maged M. Michael, Martin T. Vechev, and Vijay A. Saraswat. Idempotent work stealing. In PPoPP '09, pages 45-54, 2009.
- (2009) PPoPP '09 , pp. 45-54
- Michael, M.M.¹ Vechev, M.T.² Saraswat, V.A.³

35
- 0024771473
- Analysis of the effects of delays on load sharing
- November
- R. Mirchandaney, D. Towsley, and J.A. Stankovic. Analysis of the effects of delays on load sharing. Computers, IEEE Transactions on, 38(11):1513-1525, November 1989.
- (1989) Computers, IEEE Transactions on , vol.38 , Issue.11 , pp. 1513-1525
- Mirchandaney, R.¹ Towsley, D.² Stankovic, J.A.³

36
- 0031635830
- Analyses of load stealing models based on differential equations
- NY, USA, ACM
- Michael Mitzenmacher. Analyses of load stealing models based on differential equations. In SPAA '98, pages 212-221, NY, USA, 1998. ACM.
- (1998) SPAA '98 , pp. 212-221
- Mitzenmacher, M.¹

37
- 84987792525
- A simple load balancing scheme for task allocation in parallel machines
- NY, USA, ACM
- Larry Rudolph, Miriam Slivkin-Allalouf, and Eli Upfal. A simple load balancing scheme for task allocation in parallel machines. In SPAA '91, pages 237-245, NY, USA, 1991. ACM.
- (1991) SPAA '91 , pp. 237-245
- Rudolph, L.¹ Slivkin-Allalouf, M.² Upfal, E.³

38
- 84875199468
- Technical Report. Intel Corp.
- Bratin Saha, Ali-Reza Adl-Tabatabai, Anwar Ghuloum, Mohan Rajagopalan, Richard L. Hudson, Leaf Petersen, Vijay Menon, Brian Murphy, Tatiana Shpeisman, Jesse Fang, Eric Sprangle, Anwar Rohillah, and Doug Carmean. Enabling scalability and performance in a large scale chip multiprocessor environment. Technical Report. Intel Corp., 2006.
- (2006) Enabling Scalability and Performance in a Large Scale Chip Multiprocessor Environment
- Saha, B.¹ Adl-Tabatabai, A.-R.² Ghuloum, A.³ Rajagopalan, M.⁴ Hudson, R.L.⁵ Petersen, L.⁶ Menon, V.⁷ Murphy, B.⁸ Shpeisman, T.⁹ Fang, J.¹⁰ Sprangle, E.¹¹ Rohillah, A.¹² Carmean, D.¹³

39
- 77952259532
- Flexible architectural support for fine-grain scheduling
- NY, USA, ACM
- Daniel Sanchez, Richard M. Yoo, and Christos Kozyrakis. Flexible architectural support for fine-grain scheduling. In ASPLOS '10, pages 311-322, NY, USA, 2010. ACM.
- (2010) ASPLOS '10 , pp. 311-322
- Sanchez, D.¹ Yoo, R.M.² Kozyrakis, C.³

40
- 0036395865
- Randomized receiver initiated load-balancing algorithms for tree-shaped computations
- Peter Sanders. Randomized receiver initiated load-balancing algorithms for tree-shaped computations. Comput. J., 45(5):561-573, 2002.
- (2002) Comput. J. , vol.45 , Issue.5 , pp. 561-573
- Sanders, P.¹

41
- 84875158398
- Miser - A dynamically loadable memory allocator for multi-threaded applications
- Barry Tannenbaum. Miser - a dynamically loadable memory allocator for multi-threaded applications. Intel Software Network, 2009.
- (2009) Intel Software Network
- Tannenbaum, B.¹

42
- 78650866403
- A tighter analysis of work stealing
- Algorithms and Computation - 21st International Symposium, ISAAC 2010, Springer
- Marc Tchiboukdjian, Nicolas Gast, Denis Trystram, Jean-Louis Roch, and Julien Bernard. A tighter analysis of work stealing. In Algorithms and Computation - 21st International Symposium, ISAAC 2010, volume 6507 of LNCS, pages 291-302. Springer, 2010.
- (2010) LNCS , vol.6507 , pp. 291-302
- Tchiboukdjian, M.¹ Gast, N.² Trystram, D.³ Roch, J.-L.⁴ Bernard, J.⁵

43
- 84875136438
- PhD thesis, University of Maryland
- Alexandros Tzannes. Enhancing Productivity and Performance Portability of General-Purpose Parallel Programming. PhD thesis, University of Maryland, 2012.
- (2012) Enhancing Productivity and Performance Portability of General-Purpose Parallel Programming
- Tzannes, A.¹

44
- 0242276122
- Pursuing laziness for efficient implementation of modern multithreaded languages
- Seiji Umatani, Masahiro Yasugi, Tsuneyasu Komiya, and Taiichi Yuasa. Pursuing laziness for efficient implementation of modern multithreaded languages. In ISHPC, pages 174-188, 2003.
- (2003) ISHPC , pp. 174-188
- Umatani, S.¹ Yasugi, M.² Komiya, T.³ Yuasa, T.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.