2014, Pages 123-132

Energy efficient HPC on embedded SoCs: Optimization techniques for Mali GPU

Author keywords

Embedded GPUs; High performance computing; Optimization; Performance analysis

Indexed keywords

ARM PROCESSORS; DIGITAL ARITHMETIC; DISTRIBUTED PARAMETER NETWORKS; ENERGY EFFICIENCY; MASKS; OPTIMIZATION; PARALLEL PROGRAMMING; PROGRAM PROCESSORS; SYSTEM-ON-CHIP

EID: 84906653007     PISSN: 15302075     EISSN: 23321237     Source Type: Conference Proceeding    
DOI: 10.1109/IPDPS.2014.24     Document Type: Conference Paper
Times cited: 44

References (33)
  • 2
    • 40749160036
    • Overview of the IBM Blue Gene/P project
    • IBM Blue Gene Team
    • IBM Blue Gene Team, "Overview of the IBM Blue Gene/P project," IBM Journal of Research and Development, vol. 52, no. 1/2, 2008.
    • (2008) IBM Journal of Research and Development, vol. 52, no. 1-2
  • 6
    • 84906660896
    • "Mali-T604 GPU Architecture," http://www.arm.com/products/multimedia/mali-graphicsplus-gpu-compute/mali-t604.php, 2013.
    • (2013) Mali-T604 GPU Architecture
  • 7
    • 84906710752
    • "Tegra K1," http://www.nvidia.com/object/tegra-k1-processor.html, 2014.
    • (2014) Tegra K1
  • 8
    • 84906677333
    • "PowerVR Graphics," http://www.imgtec.com/powervr/powervr-graphics.asp, 2013.
    • (2013) PowerVR Graphics
  • 9
    • 84875205174
    • Khronos OpenCL Working Group
    • Khronos OpenCL Working Group, "The OpenCL 1.2 specification," http://www.khronos.org/opencl, 2012.
    • (2012) The OpenCL 1.2 Specification
  • 10
    • 77952342828
    • Khronos OpenCL Working Group
    • Khronos OpenCL Working Group, "The OpenCL Specification. Version 2.0," http://www.khronos.org/registry/cl/specs/opencl-2.0.pdf, 2013.
    • (2013) The OpenCL Specification. Version 2.0
  • 11
    • 80052317182
    • Automatic OpenCL device characterization: Guiding optimized kernel design
    • P. Thoman, K. Kofler, H. Studt, J. Thomson, and T. Fahringer, "Automatic OpenCL device characterization: guiding optimized kernel design," in Euro-Par, 2011, pp. 438-452.
    • (2011) Euro-Par, pp. 438-452
    • Thoman, P.; Kofler, K.; Studt, H.; Thomson, J.; Fahringer, T.
  • 15
    • 84885597732
    • Experiences with mobile processors for energy efficient HPC
    • N. Rajovic, A. Rico, J. Vipond, I. Gelado, N. Puzovic, and A. Ramirez, "Experiences with mobile processors for energy efficient HPC," in DATE, 2013, pp. 464-468.
    • (2013) DATE, pp. 464-468
    • Rajovic, N.; Rico, A.; Vipond, J.; Gelado, I.; Puzovic, N.; Ramirez, A.
  • 17
    • 84879835573
    • Efficient sparse matrix-vector multiplication on x86-based many-core processors
    • X. Liu, M. Smelyanskiy, E. Chow, and P. Dubey, "Efficient sparse matrix-vector multiplication on x86-based many-core processors," in ICS, 2013, pp. 273-282.
    • (2013) ICS, pp. 273-282
    • Liu, X.; Smelyanskiy, M.; Chow, E.; Dubey, P.
  • 18
    • 79953274591
    • Data layout transformation for stencil computations on short-vector SIMD architectures
    • T. Henretty, K. Stock, L.-N. Pouchet, F. Franchetti, J. Ramanujam, and P. Sadayappan, "Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures," in CC, 2011, pp. 225-245.
    • (2011) CC, pp. 225-245
    • Henretty, T.; Stock, K.; Pouchet, L.-N.; Franchetti, F.; Ramanujam, J.; Sadayappan, P.
  • 19
    • 79957493885
    • Where is the data? Why you cannot debate CPU vs. GPU performance without the answer
    • C. Gregg and K. Hazelwood, "Where is the data? Why you cannot debate CPU vs. GPU performance without the answer," in ISPASS, 2011, pp. 134-144.
    • (2011) ISPASS, pp. 134-144
    • Gregg, C.; Hazelwood, K.
  • 21
    • 84906684730
    • Khronos OpenGL Working Group
    • Khronos OpenGL Working Group, "The OpenGL ES 3.0 specification," http://www.khronos.org/opengles/, 2013.
    • (2013) The OpenGL ES 3.0 Specification
  • 22
    • 78651093230
    • Implementation and optimization of image processing algorithms on handheld GPU
    • N. Singhal, I. K. Park, and S. Cho, "Implementation and Optimization of Image Processing Algorithms on Handheld GPU," in ICIP, 2010.
    • (2010) ICIP
    • Singhal, N.; Park, I.K.; Cho, S.
  • 24
    • 79959493727
    • Using mobile GPU for general-purpose computing: A case study of face recognition on smartphones
    • K.-T. T. Cheng and Y.-C. Wang, "Using Mobile GPU for General-Purpose Computing: A Case Study of Face Recognition on Smartphones," in VLSI-DAT, 2011.
    • (2011) VLSI-DAT
    • Cheng, K.-T.T.; Wang, Y.-C.
  • 26
    • 84890502474
    • A fast and efficient SIFT detector using the mobile GPU
    • B. Rister, G. Wang, M. Wu, and J. R. Cavallaro, "A fast and efficient SIFT detector using the mobile GPU," in ICASSP, 2013.
    • (2013) ICASSP
    • Rister, B.; Wang, G.; Wu, M.; Cavallaro, J.R.
  • 27
    • 74549183455
    • OpenCL embedded profile prototype in mobile device
    • J. Leskelä, J. Nikula, and M. Salmela, "OpenCL embedded profile prototype in mobile device," in SiPS, 2009, pp. 279-284.
    • (2009) SiPS, pp. 279-284
    • Leskelä, J.; Nikula, J.; Salmela, M.
  • 28
    • 84890545340
    • Accelerating computer vision algorithms using OpenCL on the mobile GPU: A case study
    • G. Wang, Y. Xiong, J. Yun, and J. R. Cavallaro, "Accelerating Computer Vision Algorithms Using OpenCL on the Mobile GPU: A Case Study," in ICASSP, 2013.
    • (2013) ICASSP
    • Wang, G.; Xiong, Y.; Yun, J.; Cavallaro, J.R.
  • 30
    • 84879822173
    • TEAPOT: A toolset for evaluating performance, power and image quality on mobile graphics systems
    • J.-M. Arnau, J.-M. Parcerisa, and P. Xekalakis, "TEAPOT: a toolset for evaluating performance, power and image quality on mobile graphics systems," in ICS, 2013, pp. 37-46.
    • (2013) ICS, pp. 37-46
    • Arnau, J.-M.; Parcerisa, J.-M.; Xekalakis, P.
  • 31
    • 84887445167
    • Parallel frame rendering: Trading responsiveness for energy on a mobile GPU
    • J.-M. Arnau, J.-M. Parcerisa, and P. Xekalakis, "Parallel Frame Rendering: Trading Responsiveness for Energy on a Mobile GPU," in PACT, 2013.
    • (2013) PACT
    • Arnau, J.-M.; Parcerisa, J.-M.; Xekalakis, P.
  • 32
    • 84864858885
    • Boosting mobile GPU performance with a decoupled access/execute fragment processor
    • J.-M. Arnau, J.-M. Parcerisa, and P. Xekalakis, "Boosting mobile GPU performance with a decoupled access/execute fragment processor," in ISCA, 2012, pp. 84-93.
    • (2012) ISCA, pp. 84-93
    • Arnau, J.-M.; Parcerisa, J.-M.; Xekalakis, P.
  • 33
    • 84888873765
    • General purpose computing on low-power embedded GPUs: Has it come of age?
    • A. Maghazeh, U. D. Bordoloi, P. Eles, and Z. Peng, "General Purpose Computing on Low-Power Embedded GPUs: Has It Come of Age?" in SAMOS, 2013.
    • (2013) SAMOS
    • Maghazeh, A.; Bordoloi, U.D.; Eles, P.; Peng, Z.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.