



Volume , Issue , 2010, Pages

Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures

Author keywords

[No Author keywords available]

Indexed keywords

BARCELONA; FAST MULTIPOLE METHOD; MULTI-CORE SYSTEMS; MULTICORE ARCHITECTURES; NUMERICAL APPROXIMATIONS; PARALLELIZATIONS; PERFORMANCE ENHANCEMENTS; POWER EFFICIENCY; SINGLE-NODE PERFORMANCE;

EID: 77953980209     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IPDPS.2010.5470415     Document Type: Conference Paper
Times cited : (40)

References (17)
  • 3
    • 35348871275
    • Hybrid MPI-thread parallelization of the fast multipole method
    • Hagenberg, Austria
    • O. Coulaud, P. Fortin, and J. Roman. Hybrid MPI-thread parallelization of the fast multipole method. In Proc. ISPDC, Hagenberg, Austria, 2007.
    • (2007) Proc. ISPDC
    • Coulaud, O.; Fortin, P.; Roman, J.
  • 4
    • 20744449792
    • The design and implementation of FFTW3
    • M. Frigo and S. G. Johnson. The design and implementation of FFTW3. Proc. IEEE, 93, 2005.
    • (2005) Proc. IEEE, vol. 93
    • Frigo, M.; Johnson, S.G.
  • 5
    • 0000396658
    • A fast algorithm for particle simulations
    • L. Greengard and V. Rokhlin. A fast algorithm for particle simulations. J. Comp. Phys., 73, 1987.
    • (1987) J. Comp. Phys., vol. 73
    • Greengard, L.; Rokhlin, V.
  • 6
    • 48149107858
    • Fast multipole methods on graphics processors
    • N. A. Gumerov and R. Duraiswami. Fast multipole methods on graphics processors. J. Comp. Phys., 227:8290-8313, 2008.
    • (2008) J. Comp. Phys., vol. 227, pp. 8290-8313
    • Gumerov, N.A.; Duraiswami, R.
  • 7
    • 18844402673
    • Efficient parallel algorithms and software for compressed octrees with applications to hierarchical methods
    • B. Hariharan and S. Aluru. Efficient parallel algorithms and software for compressed octrees with applications to hierarchical methods. Par. Co., 31(3-4):311-331, 2005.
    • (2005) Par. Co., vol. 31, Issue 3-4, pp. 311-331
    • Hariharan, B.; Aluru, S.
  • 8
    • 19944419779
    • Massively parallel implementation of a fast multipole method for distributed memory machines
    • J. Kurzak and B. M. Pettitt. Massively parallel implementation of a fast multipole method for distributed memory machines. J. Par. Distrib. Comput., 65:870-881, 2005.
    • (2005) J. Par. Distrib. Comput., vol. 65, pp. 870-881
    • Kurzak, J.; Pettitt, B.M.
  • 10
    • 33751225374
    • Performance tuning of n-body codes on modern microprocessors: I. Direct integration with a Hermite scheme on x86-64 architecture
    • arXiv:astro-ph/0511062v1
    • K. Nitadori, J. Makino, and P. Hut. Performance tuning of n-body codes on modern microprocessors: I. Direct integration with a Hermite scheme on x86-64 architecture. New Astron., 12:169-181, 2006. arXiv:astro-ph/0511062v1.
    • (2006) New Astron., vol. 12, pp. 169-181
    • Nitadori, K.; Makino, J.; Hut, P.
  • 11
    • 0038825209
    • Scalable and portable implementation of the fast multipole method on parallel computers
    • July
    • S. Ogata, T. J. Campbell, R. K. Kalia, A. Nakano, P. Vashishta, and S. Vemparala. Scalable and portable implementation of the fast multipole method on parallel computers. Computer Phys. Comm., 153(3):445-461, July 2003.
    • (2003) Computer Phys. Comm., vol. 153, Issue 3, pp. 445-461
    • Ogata, S.; Campbell, T.J.; Kalia, R.K.; Nakano, A.; Vashishta, P.; Vemparala, S.
  • 12
    • 79960575885
    • Adapting a message-driven parallel application to GPU-accelerated clusters
    • J. C. Phillips, J. E. Stone, and K. Schulten. Adapting a message-driven parallel application to GPU-accelerated clusters. In Proc. SC, 2008.
    • (2008) Proc. SC
    • Phillips, J.C.; Stone, J.E.; Schulten, K.
  • 13
    • 55349088898
    • Bottom-up construction and 2:1 balance refinement of linear octrees in parallel
    • H. Sundar, R. S. Sampath, and G. Biros. Bottom-up construction and 2:1 balance refinement of linear octrees in parallel. SIAM J. Sci. Comput., 30(5):2675-2708, 2008.
    • (2008) SIAM J. Sci. Comput., vol. 30, Issue 5, pp. 2675-2708
    • Sundar, H.; Sampath, R.S.; Biros, G.
  • 14
    • 0027747808
    • A parallel hashed octtree n-body algorithm
    • M. S. Warren and J. K. Salmon. A parallel hashed octtree n-body algorithm. In Proc. SC, 1993.
    • (1993) Proc. SC
    • Warren, M.S.; Salmon, J.K.
  • 15
    • 48249149028
    • A new parallel kernel-independent fast multipole method
    • L. Ying, G. Biros, D. Zorin, and H. Langston. A new parallel kernel-independent fast multipole method. In Proc. SC, 2003.
    • (2003) Proc. SC
    • Ying, L.; Biros, G.; Zorin, D.; Langston, H.
  • 16
    • 2442446356
    • A kernel-independent adaptive fast multipole method in two and three dimensions
    • May
    • L. Ying, D. Zorin, and G. Biros. A kernel-independent adaptive fast multipole method in two and three dimensions. J. Comp. Phys., 196:591-626, May 2004.
    • (2004) J. Comp. Phys., vol. 196, pp. 591-626
    • Ying, L.; Zorin, D.; Biros, G.
  • 17
    • 8344272049
    • Array regrouping and structure splitting using whole-program reference affinity
    • May
    • Y. Zhong, M. Orlovich, X. Shen, and C. Ding. Array regrouping and structure splitting using whole-program reference affinity. ACM SIGPLAN Notices, 39(6):255-266, May 2004.
    • (2004) ACM SIGPLAN Notices, vol. 39, Issue 6, pp. 255-266
    • Zhong, Y.; Orlovich, M.; Shen, X.; Ding, C.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.