



2010, Pages 145-156

The Cilkview scalability analyzer

Author keywords

Burdened parallelism; Cilk++; Cilkview; Dag model; Multicore programming; Multithreading; Parallel programming; Parallelism; Performance; Scalability; Software tools; Span; Speedup; Work

Indexed keywords

CILKVIEW; MULTI CORE; MULTI-THREADING; PERFORMANCE SCALABILITY; SOFTWARE TOOL

EID: 77954942121     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1810479.1810509     Document Type: Conference Paper
Times cited: 84

References (51)
  • 3
    • 0025567275
    • Quartz: A tool for tuning parallel program performance
    • T. E. Anderson and E. D. Lazowska. Quartz: a tool for tuning parallel program performance. SIGMETRICS Perform. Eval. Rev., 18(1):115-125, 1990.
    • (1990) SIGMETRICS Perform. Eval. Rev. , vol.18 , Issue.1 , pp. 115-125
    • Anderson, T.E.1    Lazowska, E.D.2
  • 4
    • 0038036149
    • Space-efficient scheduling of multithreaded computations
    • R. D. Blumofe and C. E. Leiserson. Space-efficient scheduling of multithreaded computations. SIAM J. Comput., 27(1):202-229, 1998.
    • (1998) SIAM J. Comput. , vol.27 , Issue.1 , pp. 202-229
    • Blumofe, R.D.1    Leiserson, C.E.2
  • 5
    • 0000269759
    • Scheduling multithreaded computations by work stealing
    • R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. JACM, 46(5):720-748, 1999.
    • (1999) JACM , vol.46 , Issue.5 , pp. 720-748
    • Blumofe, R.D.1    Leiserson, C.E.2
  • 7
    • 0016046965
    • The parallel evaluation of general arithmetic expressions
    • R. P. Brent. The parallel evaluation of general arithmetic expressions. JACM, 21(2):201-206, 1974.
    • (1974) JACM , vol.21 , Issue.2 , pp. 201-206
    • Brent, R.P.1
  • 10
    • 77954949865
    • A parallel bzip2
    • J. Carr. A parallel bzip2. Available from http://software.intel.com/en-us/articles/a-parallel-bzip2/, 2009.
    • (2009) A Parallel
    • Carr, J.1
  • 11
    • 84974695561
    • A dynamic tracing mechanism for performance analysis of OpenMP applications
    • J. Caubet, J. Gimenez, J. Labarta, L. D. Rose, and J. S. Vetter. A dynamic tracing mechanism for performance analysis of OpenMP applications. In WOMPAT, pp. 53-67, 2001.
    • (2001) WOMPAT , pp. 53-67
    • Caubet, J.1    Gimenez, J.2    Labarta, J.3    Rose, L.D.4    Vetter, J.S.5
  • 14
    • 0001162786
    • On the partial difference equations of mathematical physics
    • R. Courant, K. Friedrichs, and H. Lewy. On the partial difference equations of mathematical physics. IBM J. R&D, 11(2):215-234, 1967.
    • (1967) IBM J. R&D , vol.11 , Issue.2 , pp. 215-234
    • Courant, R.1    Friedrichs, K.2    Lewy, H.3
  • 15
    • 84981167256
    • The Dynamic Probe Class Library - An infrastructure for developing instrumentation for performance tools
    • L. DeRose, T. Hoover Jr., and J. K. Hollingsworth. The Dynamic Probe Class Library - an infrastructure for developing instrumentation for performance tools. In IPDPS, p. 10066b, 2001.
    • (2001) IPDPS
    • DeRose, L.1    Hoover Jr., T.2    Hollingsworth, J.K.3
  • 16
    • 0001801746
    • Protocol verification as a hardware design aid
    • D. L. Dill, A. J. Drexler, A. J. Hu, and C. H. Yang. Protocol verification as a hardware design aid. In ICCD, pp. 522-525, 1992.
    • (1992) ICCD , pp. 522-525
    • Dill, D.L.1    Drexler, A.J.2    Hu, A.J.3    Yang, C.H.4
  • 17
    • 0024627264
    • Speedup versus efficiency in parallel systems
    • D. L. Eager, J. Zahorjan, and E. D. Lazowska. Speedup versus efficiency in parallel systems. IEEE Trans. Comput., 38(3):408-423, 1989.
    • (1989) IEEE Trans. Comput. , vol.38 , Issue.3 , pp. 408-423
    • Eager, D.L.1    Zahorjan, J.2    Lazowska, E.D.3
  • 20
    • 0031622953
    • The implementation of the Cilk-5 multithreaded language
    • M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. In PLDI, pp. 212-223, 1998.
    • (1998) PLDI , pp. 212-223
    • Frigo, M.1    Leiserson, C.E.2    Randall, K.H.3
  • 21
    • 32844463802
    • Cache oblivious stencil computations
    • M. Frigo and V. Strumpen. Cache oblivious stencil computations. In ICS, pp. 361-366, 2005.
    • (2005) ICS , pp. 361-366
    • Frigo, M.1    Strumpen, V.2
  • 22
    • 33749564381
    • The cache complexity of multithreaded cache oblivious algorithms
    • M. Frigo and V. Strumpen. The cache complexity of multithreaded cache oblivious algorithms. In SPAA, pp. 271-280, 2006.
    • (2006) SPAA , pp. 271-280
    • Frigo, M.1    Strumpen, V.2
  • 24
    • 0026284572
    • Performance debugging shared memory multiprocessor programs with MTOOL
    • A. J. Goldberg and J. L. Hennessy. Performance debugging shared memory multiprocessor programs with MTOOL. In SC'91, pp. 481-490, 1991.
    • (1991) SC'91 , pp. 481-490
    • Goldberg, A.J.1    Hennessy, J.L.2
  • 25
    • 84944813080
    • Bounds for certain multiprocessing anomalies
    • R. L. Graham. Bounds for certain multiprocessing anomalies. Bell System Technical Journal, 45:1563-1581, 1966.
    • (1966) Bell System Technical Journal , vol.45 , pp. 1563-1581
    • Graham, R.L.1
  • 28
    • 77954004251
    • Intel Cilk++ SDK Programmer's Guide
    • Intel Corp. Intel Cilk++ SDK Programmer's Guide, 2009. Available from http://software.intel.com/en-us/articles/download-intel-cilk-sdk/. Document No. 322581-001US.
    • (2009) Intel Cilk++ SDK Programmer's Guide
  • 29
    • 77954930852
    • Intel Parallel Amplifier
    • Intel Corp. Intel Parallel Amplifier. Available from http://software.intel.com/sites/products/documentation/studio/amplifier/en-us/2009/ug-docs/index.htm. Document No. 320486-003US, 2009.
    • (2009) Intel Parallel Amplifier
  • 30
    • 77954936577
    • Intel Thread Profiler
    • Intel Corp. Intel Thread Profiler. Available from http://software.intel.com/en-us/articles/intel-thread-profiler-for-windows-documentation/, 2010.
    • (2010) Intel Thread Profiler
  • 31
    • 0034593391
    • A Java fork/join framework
    • D. Lea. A Java fork/join framework. In Java Grande, pp. 36-43, 2000.
    • (2000) Java Grande , pp. 36-43
    • Lea, D.1
  • 32
    • 72249096886
    • The design of a task parallel library
    • D. Leijen, W. Schulte, and S. Burckhardt. The design of a task parallel library. In OOPSLA, pp. 227-242, 2009.
    • (2009) OOPSLA , pp. 227-242
    • Leijen, D.1    Schulte, W.2    Burckhardt, S.3
  • 33
    • 77951240770
    • The Cilk++ concurrency platform
    • C. E. Leiserson. The Cilk++ concurrency platform. J. Supercomput., 51(3):244-257, 2010.
    • (2010) J. Supercomput. , vol.51 , Issue.3 , pp. 244-257
    • Leiserson, C.E.1
  • 34
    • 77954929696
    • A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers)
    • C. E. Leiserson and T. B. Schardl. A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers). In SPAA, 2010.
    • (2010) SPAA
    • Leiserson, C.E.1    Schardl, T.B.2
  • 36
    • 77950985865
    • Balanced dense polynomial multiplication on multi-cores
    • M. M. Maza and Y. Xie. Balanced dense polynomial multiplication on multi-cores. In PDCAT, pp. 1-9, 2009.
    • (2009) PDCAT , pp. 1-9
    • Maza, M.M.1    Xie, Y.2
  • 37
    • 77952351456
    • FFT-based dense polynomial arithmetic on multi-cores
    • M. M. Maza and Y. Xie. FFT-based dense polynomial arithmetic on multi-cores. In HPCS, pp. 378-399, 2009.
    • (2009) HPCS , pp. 378-399
    • Maza, M.M.1    Xie, Y.2
  • 40
    • 0036679605
    • Design and prototype of a performance tool interface for OpenMP
    • B. Mohr, A. D. Malony, S. Shende, and F. Wolf. Design and prototype of a performance tool interface for OpenMP. J. Supercomput., 23(1):105-128, 2002.
    • (2002) J. Supercomput. , vol.23 , Issue.1 , pp. 105-128
    • Mohr, B.1    Malony, A.D.2    Shende, S.3    Wolf, F.4
  • 42
    • 33745612838
    • OpenMP application program interface, version 3.0
    • OpenMP Architecture Review Board. OpenMP application program interface, version 3.0. http://www.openmp.org/mp-documents/spec30.pdf, 2008.
    • (2008) OpenMP Application Program Interface
  • 47
    • 67650034867
    • Effective performance measurement and analysis of multithreaded applications
    • N. R. Tallent and J. M. Mellor-Crummey. Effective performance measurement and analysis of multithreaded applications. In PPoPP, pp. 229-240, 2009.
    • (2009) PPoPP , pp. 229-240
    • Tallent, N.R.1    Mellor-Crummey, J.M.2
  • 48
    • 67650837951
    • Binary analysis for measurement and attribution of program performance
    • N. R. Tallent, J. M. Mellor-Crummey, and M. W. Fagan. Binary analysis for measurement and attribution of program performance. In PLDI, pp. 441-452, 2009.
    • (2009) PLDI , pp. 441-452
    • Tallent, N.R.1    Mellor-Crummey, J.M.2    Fagan, M.W.3
  • 49
    • 0036036949
    • Dynamic statistical profiling of communication activity in distributed applications
    • J. Vetter. Dynamic statistical profiling of communication activity in distributed applications. In SIGMETRICS, pp. 240-250, 2002.
    • (2002) SIGMETRICS , pp. 240-250
    • Vetter, J.1
  • 50
    • 0034819519
    • Statistical scalability analysis of communication operations in distributed applications
    • J. S. Vetter and M. O. McCracken. Statistical scalability analysis of communication operations in distributed applications. SIGPLAN Not., 36(7):123-132, 2001.
    • (2001) SIGPLAN Not. , vol.36 , Issue.7 , pp. 123-132
    • Vetter, J.S.1    McCracken, M.O.2
  • 51
    • 33750427372
    • From trace generation to visualization: A performance framework for distributed parallel systems
    • C. E. Wu, A. Bolmarcich, M. Snir, D. Wootton, F. Parpia, A. Chan, and E. Lusk. From trace generation to visualization: A performance framework for distributed parallel systems. In SC'00, p. 50, 2000.
    • (2000) SC'00 , pp. 50
    • Wu, C.E.1    Bolmarcich, A.2    Snir, M.3    Wootton, D.4    Parpia, F.5    Chan, A.6    Lusk, E.7


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.