



Volume 69, Issue 9, 2009, Pages 762-777

Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms

Author keywords

Auto tuning; Cell broadband engine; Lattice Boltzmann; Multicore; Niagara

Indexed keywords

AUTO-TUNING; CELL BROADBAND ENGINE; LATTICE BOLTZMANN; MULTICORE; NIAGARA;

EID: 67650998701     PISSN: 07437315     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.jpdc.2009.04.002     Document Type: Article
Times cited: 40

References (28)
  • 1
    • 67650797545
    • K. Asanovic, R. Bodik, B. Catanzaro, et al., The landscape of parallel computing research: A view from Berkeley, Technical Report UCB/EECS-2006-183, EECS, University of California, Berkeley, 2006
  • 2
    • 26344468007
    • A model for collisional processes in gases I: Small amplitude processes in charged and neutral one-component systems
    • Bhatnagar P., Gross E., and Krook M. A model for collisional processes in gases I: Small amplitude processes in charged and neutral one-component systems. Phys. Rev. 94 (1954) 511
    • (1954) Phys. Rev. , vol.94 , pp. 511
    • Bhatnagar, P.1    Gross, E.2    Krook, M.3
  • 5
    • 0037054259
    • Lattice kinetic schemes for magnetohydrodynamics
    • Dellar P. Lattice kinetic schemes for magnetohydrodynamics. J. Comput. Phys. 79 (2002)
    • (2002) J. Comput. Phys. , vol.79
    • Dellar, P.1
  • 7
    • 34247376580
    • Chip multiprocessing and the cell broadband engine
    • New York, NY, USA
    • M. Gschwind, Chip multiprocessing and the cell broadband engine, in: CF '06: Computing Frontiers, New York, NY, USA, 2006, pp. 1-8
    • (2006) CF '06: Computing Frontiers , pp. 1-8
    • Gschwind, M.1
  • 8
    • 0024903997
    • Evaluating associativity in CPU caches
    • Hill M.D., and Smith A.J. Evaluating associativity in CPU caches. IEEE Trans. Comput. 38 12 (1989) 1612-1630
    • (1989) IEEE Trans. Comput. , vol.38 , Issue.12 , pp. 1612-1630
    • Hill, M.D.1    Smith, A.J.2
  • 14
    • 67650800575
    • OpenMP, 1997. http://openmp.org
    • (1997)
  • 16
    • 1242352441
    • Optimization and profiling of the cache performance of parallel lattice Boltzmann codes
    • Pohl T., Kowarschik M., Wilke J., Iglberger K., and Rüde U. Optimization and profiling of the cache performance of parallel lattice Boltzmann codes. Parallel Process. Lett. 13 4 (2003) S: 549
    • (2003) Parallel Process. Lett. , vol.13 , Issue.4
    • Pohl, T.1    Kowarschik, M.2    Wilke, J.3    Iglberger, K.4    Rüde, U.5
  • 18
    • 33646924323
    • Microarchitectures for systems on a chip in small process geometries
    • Sylvester D., and Keutzer K. Microarchitectures for systems on a chip in small process geometries. Proc. IEEE Apr. (2001) 467-489
    • (2001) Proc. IEEE , Issue.Apr , pp. 467-489
    • Sylvester, D.1    Keutzer, K.2
  • 19
    • 67650845618
    • The IEEE and The Open Group, The Open Group Base Specifications Issue 6
    • The IEEE and The Open Group, The Open Group Base Specifications Issue 6, 2004
    • (2004)
  • 20
    • 24344485098
    • OSKI: A library of automatically tuned sparse matrix kernels, in: Proc. of SciDAC 2005
    • Vuduc R., Demmel J., and Yelick K. OSKI: A library of automatically tuned sparse matrix kernels, in: Proc. of SciDAC 2005. J. Phys.: Conf. Ser. June (2005) 521-530
    • (2005) J. Phys.: Conf. Ser. , Issue.June , pp. 521-530
    • Vuduc, R.1    Demmel, J.2    Yelick, K.3
  • 21
    • 33646809359
    • On the single processor performance of simple lattice Boltzmann kernels
    • Wellein G., Zeiser T., Donath S., and Hager G. On the single processor performance of simple lattice Boltzmann kernels. Comput. Fluids 35 910 (2005)
    • (2005) Comput. Fluids , vol.35 , Issue.910
    • Wellein, G.1    Zeiser, T.2    Donath, S.3    Hager, G.4
  • 22
    • 0343462141
    • Automated empirical optimization of software and the ATLAS project
    • Whaley R.C., Petitet A., and Dongarra J. Automated empirical optimization of software and the ATLAS project. Parallel Comput. 27 1-2 (2001) 3-35
    • (2001) Parallel Comput. , vol.27 , Issue.1-2 , pp. 3-35
    • Whaley, R.C.1    Petitet, A.2    Dongarra, J.3
  • 23
    • 65649090648
    • Ph.D. Thesis, EECS Department, University of California, Berkeley, December
    • S. Williams, Auto-tuning performance on multicore computers, Ph.D. Thesis, EECS Department, University of California, Berkeley, December 2008
    • (2008) Auto-tuning performance on multicore computers
    • Williams, S.1
  • 26
    • 67650797544
    • Roofline: An insightful visual performance model for floating-point programs and multicore architectures
    • Williams S., Waterman A., and Patterson D. Roofline: An insightful visual performance model for floating-point programs and multicore architectures. Commun. ACM April (2009)
    • (2009) Commun. ACM , Issue.April
    • Williams, S.1    Waterman, A.2    Patterson, D.3
  • 27
    • 0000331979
    • Lattice Boltzmann method for 3D flows with curved boundary
    • Yu D., Mei R., Shyy W., and Luo L. Lattice Boltzmann method for 3D flows with curved boundary. J. Comput. Phys. 161 (2000) 680-699
    • (2000) J. Comput. Phys. , vol.161 , pp. 680-699
    • Yu, D.1    Mei, R.2    Shyy, W.3    Luo, L.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.