Volume , Issue , 2010, Pages 307-318

Partitioning streaming parallelism for multi-cores: A machine learning based approach

Author keywords

compiler optimization; machine learning; partitioning streaming parallelism

Indexed keywords

ITERATIVE METHODS; LEARNING SYSTEMS; MACHINE LEARNING; PARALLEL ARCHITECTURES; PROGRAM COMPILERS;

EID: 78149235736     PISSN: 1089795X     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1854273.1854313     Document Type: Conference Paper
Times cited: 89

References (41)
  • 1
    • 77749249219
    • Input-driven dynamic execution prediction of streaming applications
    • F. Aleen, M. Sharif, and S. Pande. Input-driven dynamic execution prediction of streaming applications. In PPoPP, 2010.
    • (2010) PPoPP
    • Aleen, F.1    Sharif, M.2    Pande, S.3
  • 2
    • 70349694201
    • A view of the parallel computing landscape
    • K. Asanovic, R. Bodik, and J. Demmel et al. A view of the parallel computing landscape. Commun. ACM, 52(10), 2009.
    • (2009) Commun. ACM , vol.52 , Issue.10
    • Asanovic, K.1    Bodik, R.2    Demmel, J.3
  • 3
    • 33751022080
    • Programming for parallelism and locality with hierarchically tiled arrays
    • G. Bikshandi, J. Guo, and D. Hoeflinger et al. Programming for parallelism and locality with hierarchically tiled arrays. In PPoPP, 2006.
    • (2006) PPoPP
    • Bikshandi, G.1    Guo, J.2    Hoeflinger, D.3
  • 5
    • 0000301477
    • Finding good approximate vertex and edge partitions is NP-hard
    • T. N. Bui and C. Jones. Finding good approximate vertex and edge partitions is NP-hard. Inf. Process. Lett., 42(3), 1992.
    • (1992) Inf. Process. Lett. , vol.42 , Issue.3
    • Bui, T.N.1    Jones, C.2
  • 6
    • 70350705613
    • Computer generation of fast Fourier transforms for the Cell Broadband Engine
    • S. Chellappa, F. Franchetti, and M. Püschel. Computer generation of fast Fourier transforms for the Cell Broadband Engine. In ICS, 2009.
    • (2009) ICS
    • Chellappa, S.1    Franchetti, F.2    Püschel, M.3
  • 7
    • 0030287932
    • LogP: A practical model of parallel computation
    • D. E. Culler, R. M. Karp, and D. Patterson et al. LogP: a practical model of parallel computation. Commun. ACM, 39(11), 1996.
    • (1996) Commun. ACM , vol.39 , Issue.11
    • Culler, D.E.1    Karp, R.M.2    Patterson, D.3
  • 8
    • 78149278555
    • A performance analysis of the Berkeley UPC compiler
    • W. C. Dan, W. yu Chen, and Parry Husbands et al. A performance analysis of the Berkeley UPC compiler. In ICS, 2003.
    • (2003) ICS
    • Dan, W.C.1    Yu Chen, W.2    Husbands, P.3
  • 10
    • 34547423880
    • Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
    • M. I. Gordon, W. Thies, and S. Amarasinghe. Exploiting coarse-grained task, data, and pipeline parallelism in stream programs. In ASPLOS, 2006.
    • (2006) ASPLOS
    • Gordon, M.I.1    Thies, W.2    Amarasinghe, S.3
  • 11
    • 0036959649
    • A stream compiler for communication-exposed architectures
    • M. I. Gordon, W. Thies, and M. Karczmarek et al. A stream compiler for communication-exposed architectures. In ASPLOS, 2002.
    • (2002) ASPLOS
    • Gordon, M.I.1    Thies, W.2    Karczmarek, M.3
  • 13
    • 77952252026
    • MacroSS: Macro-simdization of streaming applications
    • A. Hormati, Y. Choi, and M. Woh et al. MacroSS: Macro-simdization of streaming applications. In ASPLOS, 2010.
    • (2010) ASPLOS
    • Hormati, A.1    Choi, Y.2    Woh, M.3
  • 14
    • 70449669477
    • Flextream: Adaptive compilation of streaming applications for heterogeneous architectures
    • A. H. Hormati, Y. Choi, and M. Kudlur et al. Flextream: Adaptive compilation of streaming applications for heterogeneous architectures. In PACT, 2009.
    • (2009) PACT
    • Hormati, A.H.1    Choi, Y.2    Kudlur, M.3
  • 15
    • 43449119850
    • COLE: Compiler optimization level exploration
    • K. Hoste and L. Eeckhout. COLE: compiler optimization level exploration. In CGO, 2008.
    • (2008) CGO
    • Hoste, K.1    Eeckhout, L.2
  • 16
    • 57349172999
    • Orchestrating the execution of stream programs on multicore platforms
    • M. Kudlur and S. Mahlke. Orchestrating the execution of stream programs on multicore platforms. In PLDI, 2008.
    • (2008) PLDI
    • Kudlur, M.1    Mahlke, S.2
  • 17
    • 35448941890
    • Optimistic parallelism requires abstractions
    • M. Kulkarni, K. Pingali, and B. Walter et al. Optimistic parallelism requires abstractions. In PLDI, 2007.
    • (2007) PLDI
    • Kulkarni, M.1    Pingali, K.2    Walter, B.3
  • 18
    • 0002050141
    • Static scheduling algorithms for allocating directed task graphs to multiprocessors
    • Y. Kwok and I. Ahmad. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Comput. Surv., 31(4), 1999.
    • (1999) ACM Comput. Surv. , vol.31 , Issue.4
    • Kwok, Y.1    Ahmad, I.2
  • 20
    • 57349101237
    • Data and computation transformations for brook streaming applications on multiprocessors
    • S. Liao, Z. Du, and G. Wu et al. Data and computation transformations for brook streaming applications on multiprocessors. In CGO, 2006.
    • (2006) CGO
    • Liao, S.1    Du, Z.2    Wu, G.3
  • 21
    • 76749140917
    • Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping
    • C. Luk, S. Hong, and H. Kim. Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping. In MICRO, 2009.
    • (2009) MICRO
    • Luk, C.1    Hong, S.2    Kim, H.3
  • 22
    • 33646834588
    • Predicting unroll factors using supervised classification
    • M. Stephenson and S. Amarasinghe. Predicting unroll factors using supervised classification. In CGO, 2005.
    • (2005) CGO
    • Stephenson, M.1    Amarasinghe, S.2
  • 23
    • 33846471288
    • Learning to schedule straight-line code
    • E. Moss, P. Utgoff, and J. Cavazos et al. Learning to schedule straight-line code. In NIPS, 1997.
    • (1997) NIPS
    • Moss, E.1    Utgoff, P.2    Cavazos, J.3
  • 25
    • 70449687057
    • Analytical modeling of pipeline parallelism
    • A. Navarro, R. Asenjo, and S. Tabik et al. Analytical modeling of pipeline parallelism. In PACT, 2009.
    • (2009) PACT
    • Navarro, A.1    Asenjo, R.2    Tabik, S.3
  • 26
    • 0001820920
    • X-means: Extending K-means with Efficient Estimation of the Number of Clusters
    • D. Pelleg and A. W. Moore. X-means: Extending K-means with Efficient Estimation of the Number of Clusters. In ICML, 2000.
    • (2000) ICML
    • Pelleg, D.1    Moore, A.W.2
  • 27
    • 57349167317
    • Iterative Optimization in the Polyhedral Model: Part II, Multidimensional Time
    • L. Pouchet and C. Bastoul et al. Iterative Optimization in the Polyhedral Model: Part II, Multidimensional Time. In PLDI, 2008.
    • (2008) PLDI
    • Pouchet, L.1    Bastoul, C.2
  • 28
    • 0021458524
    • Dynamic task scheduling in hard real-time distributed systems
    • K. Ramamritham and J. A. Stankovic. Dynamic task scheduling in hard real-time distributed systems. IEEE Softw., 1(3), 1984.
    • (1984) IEEE Softw. , vol.1 , Issue.3
    • Ramamritham, K.1    Stankovic, J.A.2
  • 29
    • 34748894221
    • X10: Concurrent programming for modern architectures
    • V. A. Saraswat, V. Sarkar, and C. von Praun. X10: concurrent programming for modern architectures. In PPoPP, 2007.
    • (2007) PPoPP
    • Saraswat, V.A.1    Sarkar, V.2    Von Praun, C.3
  • 30
    • 0026213832
    • Automatic partitioning of a program dependence graph into parallel tasks
    • V. Sarkar. Automatic partitioning of a program dependence graph into parallel tasks. IBM J. Res. Dev., 35(5-6), 1991.
    • (1991) IBM J. Res. Dev. , vol.35 , Issue.5-6
    • Sarkar, V.1
  • 31
    • 0000120766
    • Estimating the dimension of a model
    • G. Schwarz. Estimating the dimension of a model. Ann. Statist., 6(2), 1978.
    • (1978) Ann. Statist. , vol.6 , Issue.2
    • Schwarz, G.1
  • 32
    • 0036953769
    • Automatically characterizing large scale program behavior
    • T. Sherwood, E. Perelman, and G. Hamerly et al. Automatically characterizing large scale program behavior. In ASPLOS, 2002.
    • (2002) ASPLOS
    • Sherwood, T.1    Perelman, E.2    Hamerly, G.3
  • 33
    • 0031295210
    • A survey of stream processing
    • R. Stephens. A survey of stream processing. Acta Informatica, 34(7):491-541, 1997.
    • (1997) Acta Informatica , vol.34 , Issue.7 , pp. 491-541
    • Stephens, R.1
  • 34
    • 0038035143
    • Meta optimization: Improving compiler heuristics with machine learning
    • M. Stephenson, S. Amarasinghe, and M. Martin et al. Meta optimization: improving compiler heuristics with machine learning. SIGPLAN Not., 2003.
    • (2003) SIGPLAN Not
    • Stephenson, M.1    Amarasinghe, S.2    Martin, M.3
  • 35
    • 0027845715
    • Exploiting task and data parallelism on a multicomputer
    • J. Subhlok, J. M. Stichnoth, and D. R. O'Hallaron et al. Exploiting task and data parallelism on a multicomputer. In PPoPP, 1993.
    • (1993) PPoPP
    • Subhlok, J.1    Stichnoth, J.M.2    O'Hallaron, D.R.3
  • 36
    • 0008917430
    • StreamIt: A language for streaming applications
    • B. Thies, M. Karczmarek, and S. Amarasinghe. StreamIt: A language for streaming applications. In CC, 2001.
    • (2001) CC
    • Thies, B.1    Karczmarek, M.2    Amarasinghe, S.3
  • 38
    • 70450278773
    • Towards a holistic approach to auto-parallelization - Integrating profile-driven parallelism detection and machine-learning based mapping
    • G. Tournavitis, Z. Wang, and B. Franke et al. Towards a holistic approach to auto-parallelization - integrating profile-driven parallelism detection and machine-learning based mapping. In PLDI, 2009.
    • (2009) PLDI
    • Tournavitis, G.1    Wang, Z.2    Franke, B.3
  • 39
    • 67650563116
    • Software Pipelined Execution of Stream Programs on GPUs
    • A. Udupa, R. Govindarajan, and M. J. Thazhuthaveetil. Software Pipelined Execution of Stream Programs on GPUs. In CGO, 2009.
    • (2009) CGO
    • Udupa, A.1    Govindarajan, R.2    Thazhuthaveetil, M.J.3
  • 40
    • 77957793058
    • Dispersing proprietary applications as benchmarks through code mutation
    • L. Van Ertvelde and L. Eeckhout. Dispersing proprietary applications as benchmarks through code mutation. In ASPLOS, 2008.
    • (2008) ASPLOS
    • Van Ertvelde, L.1    Eeckhout, L.2
  • 41
    • 67650088253
    • Mapping parallelism to multi-cores: A machine learning based approach
    • Z. Wang and M. F. O'Boyle. Mapping parallelism to multi-cores: a machine learning based approach. In PPoPP, 2009.
    • (2009) PPoPP
    • Wang, Z.1    O'Boyle, M.F.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.