-
1
-
-
0036590708
-
The data locality of work stealing
-
DOI 10.1007/s00224-002-1057-3
-
U. Acar, G. Blelloch, and R. Blumofe, "The Data Locality of Work Stealing," Theory of Computing Systems, vol. 35, no. 3, pp. 321-347, 2002. (Pubitemid 36156138)
-
(2002)
Theory of Computing Systems
, vol.35
, Issue.3
, pp. 321-347
-
-
Acar, U.A.1
Blelloch, G.E.2
Blumofe, R.D.3
-
2
-
-
60449097203
-
The design of OpenMP tasks
-
Mar
-
E. Ayguadé, N. Copty, A. Duran, J. Hoeflinger, Y. Lin, F. Massaioli, X. Teruel, P. Unnikrishnan, and G. Zhang, "The Design of OpenMP Tasks," IEEE Trans. Parallel and Distributed Systems, vol. 20, no. 3, pp. 404-418, Mar. 2009.
-
(2009)
IEEE Trans. Parallel and Distributed Systems
, vol.20
, Issue.3
, pp. 404-418
-
-
Ayguadé, E.1
Copty, N.2
Duran, A.3
Hoeflinger, J.4
Lin, Y.5
Massaioli, F.6
Teruel, X.7
Unnikrishnan, P.8
Zhang, G.9
-
3
-
-
32844456410
-
Online performance analysis by statistical sampling of microprocessor performance counters
-
DOI 10.1145/1088149.1088163, ICS05 - Proceedings of the 19th ACM International Conference on Supercomputing
-
R. Azimi, M. Stumm, and R. Wisniewski, "Online Performance Analysis by Statistical Sampling of Microprocessor Performance Counters," Proc. 19th Ann. Int'l Conf. Supercomputing, pp. 101-110, 2005. (Pubitemid 43251314)
-
(2005)
Proceedings of the International Conference on Supercomputing
, pp. 101-110
-
-
Azimi, R.1
Stumm, M.2
Wisniewski, R.W.3
-
4
-
-
48749141209
-
Adaptive mesh refinement for hyperbolic partial differential equations
-
M. Berger and J. Oliger, "Adaptive Mesh Refinement for Hyperbolic Partial Differential Equations," J. Computational Physics, vol. 53, no. 3, pp. 484-512, 1984.
-
(1984)
J. Computational Physics
, vol.53
, Issue.3
, pp. 484-512
-
-
Berger, M.1
Oliger, J.2
-
5
-
-
58449090994
-
Provably good multicore cache performance for divide-and-conquer algorithms
-
G. Blelloch, R. Chowdhury, P. Gibbons, V. Ramachandran, S. Chen, and M. Kozuch, "Provably Good Multicore Cache Performance for Divide-and-Conquer Algorithms," Proc. 19th Ann. ACM-SIAM Symp. Discrete Algorithms, pp. 501-510, 2008.
-
(2008)
Proc. 19th Ann. ACM-SIAM Symp. Discrete Algorithms
, pp. 501-510
-
-
Blelloch, G.1
Chowdhury, R.2
Gibbons, P.3
Ramachandran, V.4
Chen, S.5
Kozuch, M.6
-
6
-
-
79959676391
-
Scheduling irregular parallel computations on hierarchical caches
-
June
-
G. Blelloch, J. Fineman, P. Gibbons, and H.V. Simhadri, "Scheduling Irregular Parallel Computations on Hierarchical Caches," Proc. 20th ACM Symp. Parallel Algorithms and Architectures, June 2011.
-
(2011)
Proc. 20th ACM Symp. Parallel Algorithms and Architectures
-
-
Blelloch, G.1
Fineman, J.2
Gibbons, P.3
Simhadri, H.V.4
-
7
-
-
77954942935
-
Low depth cache-oblivious algorithms
-
G. Blelloch, P. Gibbons, and H. Simhadri, "Low Depth Cache-Oblivious Algorithms," Proc. 22nd ACM Symp. Parallelism in Algorithms and Architectures, pp. 189-199, 2010.
-
(2010)
Proc. 22nd ACM Symp. Parallelism in Algorithms and Architectures
, pp. 189-199
-
-
Blelloch, G.1
Gibbons, P.2
Simhadri, H.3
-
8
-
-
0003459808
-
-
PhD thesis, Dept. of Electrical Eng. and Computer Science, Massachusetts Inst. of Technology, Technical Report MIT/LCS/TR-677, MIT Laboratory for Computer Science, Sept
-
R.D. Blumofe, "Executing Multithreaded Programs Efficiently," PhD thesis, Dept. of Electrical Eng. and Computer Science, Massachusetts Inst. of Technology, Technical Report MIT/LCS/TR-677, MIT Laboratory for Computer Science, Sept. 1995.
-
(1995)
Executing Multithreaded Programs Efficiently
-
-
Blumofe, R.D.1
-
9
-
-
0030601279
-
Cilk: An efficient multithreaded runtime system
-
DOI 10.1006/jpdc.1996.0107
-
R.D. Blumofe, C.F. Joerg, B.C. Kuszmaul, C.E. Leiserson, K.H. Randall, and Y. Zhou, "Cilk: An Efficient Multithreaded Runtime System," J. Parallel and Distributed Computing, vol. 37, no. 1, pp. 55-69, Aug. 1996. (Pubitemid 126167766)
-
(1996)
Journal of Parallel and Distributed Computing
, vol.37
, Issue.1
, pp. 55-69
-
-
Blumofe, R.D.1
Joerg, C.F.2
Kuszmaul, B.C.3
Leiserson, C.E.4
Randall, K.H.5
Zhou, Y.6
-
11
-
-
32144435090
-
Dynamic circular work-stealing deque
-
DOI 10.1145/1073970.1073974, SPAA 2005 - Seventeenth Annual ACM Symposium on Parallelism in Algorithms and Architectures
-
D. Chase and Y. Lev, "Dynamic Circular Work-Stealing Deque," Proc. 17th Ann. ACM Symp. Parallelism Algorithms and Architectures, pp. 21-28, 2005. (Pubitemid 43206548)
-
(2005)
Annual ACM Symposium on Parallelism in Algorithms and Architectures
, pp. 21-28
-
-
Chase, D.1
Lev, Y.2
-
12
-
-
84864066885
-
Cats: Cache aware task-stealing based on online profiling in multi-socket multi-core architectures
-
Q. Chen, M. Guo, and Z. Huang, "Cats: Cache Aware Task-Stealing Based on Online Profiling in Multi-Socket Multi-Core Architectures," Proc. 26th Int'l Conf. Supercomputing, pp. 163-172, 2012.
-
(2012)
Proc. 26th Int'l Conf. Supercomputing
, pp. 163-172
-
-
Chen, Q.1
Guo, M.2
Huang, Z.3
-
13
-
-
80155183145
-
CAB: CachE-Aware bi-tier task-stealing in multi-socket multi-core architecture
-
Q. Chen, Z. Huang, M. Guo, and J. Zhou, "CAB: CachE-Aware Bi-Tier Task-Stealing In Multi-Socket Multi-Core Architecture," Proc. 40th Int'l Conf. Parallel Processing, pp. 722-732, 2011.
-
(2011)
Proc. 40th Int'l Conf. Parallel Processing
, pp. 722-732
-
-
Chen, Q.1
Huang, Z.2
Guo, M.3
Zhou, J.4
-
14
-
-
35248852476
-
Scheduling threads for constructive cache sharing on CMPs
-
DOI 10.1145/1248377.1248396, SPAA'07: Proceedings of the Nineteenth Annual Symposium on Parallelism in Algorithms and Architectures
-
S. Chen et al., "Scheduling Threads for Constructive Cache Sharing on CMPs," Proc. 19th Ann. ACM Symp. Parallel Algorithms and Architectures, pp. 105-115, 2007. (Pubitemid 47568559)
-
(2007)
Annual ACM Symposium on Parallelism in Algorithms and Architectures
, pp. 105-115
-
-
Chen, S.1
Gibbons, P.B.2
Kozuch, M.3
Liaskovitis, V.4
Ailamaki, A.5
Blelloch, G.E.6
Falsafi, B.7
Fix, L.8
Hardavellas, N.9
Mowry, T.C.10
Wilkerson, C.11
-
15
-
-
84864050289
-
Analysis of randomized work stealing with false sharing
-
Mar
-
R. Cole and V. Ramachandran, "Analysis of Randomized Work Stealing with False Sharing," ArXiv e-prints, Mar. 2011.
-
(2011)
ArXiv E-prints
-
-
Cole, R.1
Ramachandran, V.2
-
16
-
-
79952789017
-
ULCC: A user-level facility for optimizing shared cache performance on multicores
-
X. Ding, K. Wang, and X. Zhang, "ULCC: A User-Level Facility for Optimizing Shared Cache Performance on Multicores," Proc. ACM SIGPLAN Symp. Principles and Practice Parallel Programming, pp. 103-112, 2011.
-
(2011)
Proc. ACM SIGPLAN Symp. Principles and Practice Parallel Programming
, pp. 103-112
-
-
Ding, X.1
Wang, K.2
Zhang, X.3
-
17
-
-
44049113422
-
A comparison of clustering heuristics for scheduling directed acyclic graphs on multiprocessors
-
A. Gerasoulis and T. Yang, "A Comparison of Clustering Heuristics for Scheduling Directed Acyclic Graphs on Multiprocessors," J. Parallel and Distributed Computing, vol. 16, no. 4, pp. 276-291, 1992.
-
(1992)
J. Parallel and Distributed Computing
, vol.16
, Issue.4
, pp. 276-291
-
-
Gerasoulis, A.1
Yang, T.2
-
19
-
-
70450029262
-
Work-first and help-first scheduling policies for async-finish task parallelism
-
Y. Guo, R. Barik, R. Raman, and V. Sarkar, "Work-First and Help-First Scheduling Policies for Async-Finish Task Parallelism," Proc. IEEE 23rd Int'l Parallel and Distributed Processing Symp., pp. 1-12, 2009.
-
(2009)
Proc. IEEE 23rd Int'l Parallel and Distributed Processing Symp.
, pp. 1-12
-
-
Guo, Y.1
Barik, R.2
Raman, R.3
Sarkar, V.4
-
20
-
-
77953967811
-
Slaw: A scalable locality-aware adaptive work-stealing scheduler
-
Y. Guo, J. Zhao, V. Cave, and V. Sarkar, "Slaw: A Scalable Locality-Aware Adaptive Work-Stealing Scheduler," Proc. IEEE 24th Int'l Parallel and Distributed Processing Symp., pp. 1-12, 2010.
-
(2010)
Proc. IEEE 24th Int'l Parallel and Distributed Processing Symp.
, pp. 1-12
-
-
Guo, Y.1
Zhao, J.2
Cave, V.3
Sarkar, V.4
-
21
-
-
32844470883
-
A dynamic-sized nonblocking work stealing deque
-
Sun Microsystems, Inc
-
D. Hendler, Y. Lev, M. Moir, and N. Shavit, "A Dynamic-Sized Nonblocking Work Stealing Deque," Technical Report TR-2005-144, Sun Microsystems, Inc., p. 69, 2005.
-
(2005)
Technical Report TR-2005-144
, p. 69
-
-
Hendler, D.1
Lev, Y.2
Moir, M.3
Shavit, N.4
-
23
-
-
0034593391
-
A Java fork/join framework
-
D. Lea, "A Java Fork/Join Framework," Proc. ACM Conf. Java Grande, pp. 36-43, 2000.
-
(2000)
Proc. ACM Conf. Java Grande
, pp. 36-43
-
-
Lea, D.1
-
25
-
-
72249096886
-
The design of a task parallel library
-
D. Leijen, W. Schulte, and S. Burckhardt, "The Design of a Task Parallel Library," ACM SIGPLAN Notices, vol. 44, no. 10, pp. 227-242, 2009.
-
(2009)
ACM SIGPLAN Notices
, vol.44
, Issue.10
, pp. 227-242
-
-
Leijen, D.1
Schulte, W.2
Burckhardt, S.3
-
27
-
-
67650093463
-
Idempotent work stealing
-
M.M. Michael, M.T. Vechev, and V.A. Saraswat, "Idempotent Work Stealing," Proc. 14th ACM SIGPLAN Symp. Principles and Practice Parallel Programming, pp. 45-54, 2009.
-
(2009)
Proc. 14th ACM SIGPLAN Symp. Principles and Practice Parallel Programming
, pp. 45-54
-
-
Michael, M.M.1
Vechev, M.T.2
Saraswat, V.A.3
-
30
-
-
77953990150
-
An adaptive task creation strategy for work-stealing scheduling
-
L. Wang, H. Cui, Y. Duan, F. Lu, X. Feng, and P. Yew, "An Adaptive Task Creation Strategy for Work-Stealing Scheduling," Proc. IEEE/ACM Eighth Ann. Int'l Symp. Code Generation and Optimization, pp. 266-277, 2010.
-
(2010)
Proc. IEEE/ACM Eighth Ann. Int'l Symp. Code Generation and Optimization
, pp. 266-277
-
-
Wang, L.1
Cui, H.2
Duan, Y.3
Lu, F.4
Feng, X.5
Yew, P.6
-
31
-
-
55849143328
-
Maotai: View-oriented parallel programming on CMT processors
-
J. Zhang, Z. Huang, W. Chen, Q. Huang, and W. Zheng, "Maotai: View-Oriented Parallel Programming on CMT Processors," Proc. 37th Int'l Conf. Parallel Processing, pp. 636-643, 2008.
-
(2008)
Proc. 37th Int'l Conf. Parallel Processing
, pp. 636-643
-
-
Zhang, J.1
Huang, Z.2
Chen, W.3
Huang, Q.4
Zheng, W.5