



Volume 36, Issue 2, 2012, Pages 78-87

Optimization strategies in different CUDA architectures using llCoMP

Author keywords

Code optimization; Coding effort; CUDA; GPGPU; llc; Performance portability

Indexed keywords

CODE OPTIMIZATION; CODING EFFORT; CUDA; GPGPU; LLC; PERFORMANCE PORTABILITY;

EID: 84857311397     PISSN: 01419331     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.micpro.2011.05.006     Document Type: Conference Paper
Times cited : (12)

References (35)
  • 1
    • 77956773183
    • Extending OpenMP to survive the heterogeneous multi-core era
    • E. Ayguadé Extending OpenMP to survive the heterogeneous multi-core era Int. J. Parallel Progr. 38 5-6 2010 440 459 (Cited by (since 1996) 0). < http://www.scopus.com/inward/record.url?eid=2-s2.0-77956773183&partnerID=40&md5=4fd3a3df6ed4dc78ac2046b6366b5a9c >
    • (2010) Int. J. Parallel Progr. , vol.38 , Issue.5-6 , pp. 440-459
    • Ayguadé, E.1
  • 4
    • 77951931560
    • State-of-the-art in heterogeneous computing
    • A.R. Brodtkorb, C. Dyken, T.R. Hagen, J.M. Hjelmervik, and O.O. Storaasli State-of-the-art in heterogeneous computing Scientific Progr 18 2010 1 33 < http://babrodtk.at.ifi.uio.no/files/publications/brodtkorb-etal-star-heterocomp-final.pdf >
    • (2010) Scientific Progr , vol.18 , pp. 1-33
    • Brodtkorb, A.R.1    Dyken, C.2    Hagen, T.R.3    Hjelmervik, J.M.4    Storaasli, O.O.5
  • 5
    • 73449130892
    • Cetus: A source-to-source compiler infrastructure for multicores
    • C. Dave, H. Bae, S.-J. Min, S. Lee, R. Eigenmann, and S. Midkiff Cetus: a source-to-source compiler infrastructure for multicores Computer 42 12 2009 36 42
    • (2009) Computer , vol.42 , Issue.12 , pp. 36-42
    • Dave, C.1    Bae, H.2    Min, S.-J.3    Lee, S.4    Eigenmann, R.5    Midkiff, S.6
  • 8
    • 33750341021
    • Basic skeletons in llc
    • DOI 10.1016/j.parco.2006.07.001, PII S0167819106000342
    • A.J. Dorta, P. López, and F. de Sande Basic skeletons in llc Parallel Comput. 32 7-8 2006 491 506 (Pubitemid 44634782)
    • (2006) Parallel Computing , vol.32 , Issue.7-8 , pp. 491-506
    • Dorta, A.1    Lopez, P.2    De Sande, F.3
  • 10
    • 53349090243
    • A closer look at GPUs
    • K. Fatahalian, and M. Houston A closer look at GPUs Commun. ACM 51 10 2008 50 57
    • (2008) Commun. ACM , vol.51 , Issue.10 , pp. 50-57
    • Fatahalian, K.1    Houston, M.2
  • 11
    • 9744247646
    • Measuring high performance computing productivity
    • S. Faulk et al. Measuring high performance computing productivity Int. J. High Perform. Comput. Appl. 18 4 2004 459 473
    • (2004) Int. J. High Perform. Comput. Appl. , vol.18 , Issue.4 , pp. 459-473
    • Faulk, S.1    et al.
  • 15
    • 0347937585
    • An asynchronous approach to efficient execution of programs on adaptive architectures utilizing FPGAs
    • S. Ghosh An asynchronous approach to efficient execution of programs on adaptive architectures utilizing FPGAs J. Netw. Comput. Appl. 20 3 1997 223 252 (Pubitemid 127452098)
    • (1997) Journal of Network and Computer Applications , vol.20 , Issue.3 , pp. 223-252
    • Ghosh, S.1
  • 18
    • 70450231944
    • An analytical model for a gpu architecture with memory-level and thread-level parallelism awareness
    • S. Hong, and H. Kim An analytical model for a gpu architecture with memory-level and thread-level parallelism awareness SIGARCH Comput. Archit. News 37 2009 152 163
    • (2009) SIGARCH Comput. Archit. News , vol.37 , pp. 152-163
    • Hong, S.1    Kim, H.2
  • 20
    • 9744274567
    • High performance computing productivity model synthesis
    • J. Kepner High performance computing productivity model synthesis Int. J. High Perform. Comput. Appl. 18 2004 505 516 < http://portal.acm.org/citation.cfm?id=1057159.1057169 >
    • (2004) Int. J. High Perform. Comput. Appl. , vol.18 , pp. 505-516
    • Kepner, J.1
  • 23
    • 77955364968
    • A ROSE-Based OpenMP 3.0 research compiler supporting multiple runtime libraries
    • C. Liao, D.J. Quinlan, T. Panas, B.R. de Supinski, A ROSE-Based OpenMP 3.0 research compiler supporting multiple runtime libraries, in: IWOMP, 2010, pp. 15-28.
    • (2010) IWOMP , pp. 15-28
    • Liao, C.1    Quinlan, D.J.2    Panas, T.3    De Supinski, B.R.4
  • 25
    • 84857258418
    • llc
    • llc, 2011. llc Home Page. < http://llc.pcg.ull.es >.
    • (2011) Llc Home Page
  • 27
    • 34147164008
    • Languages for high-productivity computing: The DARPA HPCS language project
    • DOI 10.1142/S0129626407002892, PII S0129626407002892
    • E. Lusk, and K. Yelick Languages for high-productivity computing: the DARPA HPCS language project Parallel Process. Lett. 17 1 2007 89 102 doi:10.1142/S0129626407002892 (Pubitemid 46573928)
    • (2007) Parallel Processing Letters , vol.17 , Issue.1 , pp. 89-102
    • Lusk, E.1    Yelick, K.2
  • 28
    • 78651550268
    • Scalable parallel programming with CUDA
    • J. Nickolls, I. Buck, M. Garland, and K. Skadron Scalable parallel programming with CUDA Queue 6 2 2008 40 53
    • (2008) Queue , vol.6 , Issue.2 , pp. 40-53
    • Nickolls, J.1    Buck, I.2    Garland, M.3    Skadron, K.4
  • 29
    • 63549150883
    • NVIDIA Corp.
    • NVIDIA Corp., 2007. Cuda Occupancy Calculator, 2007. < http://developer.download.nvidia.com/compute/cuda/CUDA-Occupancy-calculator.xls >.
    • (2007) Cuda Occupancy Calculator, 2007
  • 30
    • 73449144074
    • OpenMP Application Program Interface v. 3.0
    • OpenMP Architecture Review Board, OpenMP Application Program Interface v. 3.0, May 2008. < http://www.openmp.org/drupal/mp-documents/spec30.pdf >.
    • (2008) OpenMP Application Program Interface V. 3.0
  • 31
    • 33947588048
    • A survey of general-purpose computation on graphics hardware
    • DOI 10.1111/j.1467-8659.2007.01012.x
    • J.D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Krüger, A.E. Lefohn, and T. Purcell A survey of general-purpose computation on graphics hardware Comput. Graph. Forum 26 1 2007 80 113 < http://graphics.idav.ucdavis.edu/publications/print-pub?pub-id=907 > (Pubitemid 46481097)
    • (2007) Computer Graphics Forum , vol.26 , Issue.1 , pp. 80-113
    • Owens, J.D.1    Luebke, D.2    Govindaraju, N.3    Harris, M.4    Kruger, J.5    Lefohn, A.E.6    Purcell, T.J.7
  • 32
    • 70350454023
    • Automatic hybrid MPI + OpenMP code generation with llc
    • M. Ropo, J. Westerholm, J. Dongarra, Lecture Notes in Computer Science Springer-Verlag Espoo, Finland
    • R. Reyes, A.J. Dorta, F. Almeida, and F. de Sande Automatic hybrid MPI + OpenMP code generation with llc M. Ropo, J. Westerholm, J. Dongarra, Proceedings of the 16th European PVM/MPI Users' Group Meeting Lecture Notes in Computer Science vol. 5759 2009 Springer-Verlag Espoo, Finland 185 195
    • (2009) Proceedings of the 16th European PVM/MPI Users' Group Meeting , vol.5759 , pp. 185-195
    • Reyes, R.1    Dorta, A.J.2    Almeida, F.3    De Sande, F.4
  • 35
    • 33947317857
    • D. Wheeler, Sloccount, 2009. < http://www.dwheeler.com/sloccount/ >.
    • (2009) Sloccount
    • Wheeler, D.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.