



2012, Pages 205-214

On the communication complexity of 3D FFTs and its implications for exascale

Author keywords

Exascale; FFT; Performance model

Indexed keywords

ALL-TO-ALL COMMUNICATION; CO-PROCESSORS; COMMUNICATION COMPLEXITY; CURRENT TECHNOLOGY; EXASCALE; INTRA-NODE COMMUNICATION; MEMORY BANDWIDTHS; MEMORY HIERARCHY; NETWORK BANDWIDTH; NETWORK COMMUNICATIONS; PERFORMANCE IMPACT; PERFORMANCE MODEL; POTENTIAL SCALING; SOFTWARE IMPLEMENTATION

EID: 84864032930     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/2304576.2304604     Document Type: Conference Paper
Times cited: 65

References (43)
  • 3
    • 0036105874
    • Cellular supercomputing with system-on-a-chip
    • Digest of Technical Papers (Cat. No.02CH37315) IEEE
    • G. Almasi et al. Cellular supercomputing with system-on-a-chip. In 2002 IEEE International Solid-State Circuits Conference. Digest of Technical Papers (Cat. No.02CH37315), pages 196-197. IEEE, 2002.
    • (2002) 2002 IEEE International Solid-state Circuits Conference , pp. 196-197
    • Almasi, G.1
  • 7
    • 33746887354
    • SeaStar interconnect: Balanced bandwidth for scalable performance
    • DOI 10.1109/MM.2006.65
    • R. Brightwell, K. T. Pedretti, K. D. Underwood, and T. Hudson. SeaStar interconnect: Balanced bandwidth for scalable performance. IEEE Micro, 26:41-57, May 2006. (Pubitemid 44194067)
    • (2006) IEEE Micro , vol.26 , Issue.3 , pp. 41-57
    • Brightwell, R.1    Pedretti, K.T.2    Underwood, K.D.3    Hudson, T.4
  • 8
  • 10
    • 19344375178
    • The development and integration of a distributed 3D FFT for a cluster of workstations
    • Atlanta, GA, USA
    • C. E. Cramer and J. Board. The development and integration of a distributed 3D FFT for a cluster of workstations. In Proceedings of the 4th Annual Linux Showcase & Conference, Atlanta, GA, USA, 2000.
    • (2000) Proceedings of the 4th Annual Linux Showcase & Conference
    • Cramer, C.E.1    Board, J.2
  • 14
    • 0035980881
    • Scalable parallel FFT for spectral simulations on a Beowulf cluster
    • DOI 10.1016/S0167-8191(01)00120-X, PII S016781910100120X
    • P. Dmitruk, L.-P. Wang, W. H. Matthaeus, R. Zhang, and D. Seckel. Scalable parallel FFT for spectral simulations on a Beowulf cluster. Parallel Computing, 27(14):1921-1936, Dec. 2001. (Pubitemid 32997727)
    • (2001) Parallel Computing , vol.27 , Issue.14 , pp. 1921-1936
    • Dmitruk, P.1    Wang, L.-P.2    Matthaeus, W.H.3    Zhang, R.4    Seckel, D.5
  • 16
    • 79951595196
    • The international exascale software project roadmap
    • J. Dongarra et al. The international exascale software project roadmap. IJHPCA, 25(1):3-60, 2011.
    • (2011) IJHPCA , vol.25 , Issue.1 , pp. 3-60
    • Dongarra, J.1
  • 19
    • 19344378421
    • Scalable framework for 3D FFTs on the Blue Gene/L supercomputer: Implementation and early performance measurements
    • M. Eleftheriou, B. Fitch, A. Rayshubskiy, T. Ward, and R. Germain. Scalable framework for 3D FFTs on the Blue Gene/L supercomputer: Implementation and early performance measurements. IBM Journal of Research and Development, 49(2-3):457-464, 2005. (Pubitemid 40718146)
    • (2005) IBM Journal of Research and Development , vol.49 , Issue.2-3 , pp. 457-464
    • Eleftheriou, M.1    Fitch, B.G.2    Rayshubskiy, A.3    Ward, T.J.C.4    Germain, R.S.5
  • 20
    • 33947229391
    • Performance of the 3D FFT on the 6D network torus QCDOC parallel supercomputer
    • DOI 10.1016/j.cpc.2006.12.006, PII S0010465507000276
    • B. Fang, Y. Deng, and G. Martyna. Performance of the 3D FFT on the 6D network torus QCDOC parallel supercomputer. Computer Physics Communications, 176(8):531-538, Apr. 2007. (Pubitemid 46435804)
    • (2007) Computer Physics Communications , vol.176 , Issue.8 , pp. 531-538
    • Fang, B.1    Deng, Y.2    Martyna, G.3
  • 23
    • 78149258346
    • Understanding throughput-oriented architectures
    • Nov.
    • M. Garland and D. B. Kirk. Understanding throughput-oriented architectures. Communications of the ACM, 53(11):58, Nov. 2010.
    • (2010) Communications of the ACM , vol.53 , Issue.11 , pp. 58
    • Garland, M.1    Kirk, D.B.2
  • 24
    • 0035280950
    • Parallel distributed FFT-based solvers for 3-D Poisson problems in meso-scale atmospheric simulations
    • DOI 10.1177/109434200101500104
    • L. Giraud, R. Guivarch, and J. Stein. Parallel Distributed FFT-Based Solvers for 3-D Poisson Problems in Meso-Scale Atmospheric Simulations. International Journal of High Performance Computing Applications, 15(1):36-46, Feb. 2001. (Pubitemid 32252488)
    • (2001) International Journal of High Performance Computing Applications , vol.15 , Issue.1 , pp. 36-46
    • Giraud, L.1    Guivarch, R.2    Stein, J.3
  • 34
    • 52649125840
    • 3D-stacked memory architectures for multi-core processors
    • IEEE, June
    • G. H. Loh. 3D-Stacked Memory Architectures for Multi-core Processors. In 2008 International Symposium on Computer Architecture, pages 453-464. IEEE, June 2008.
    • (2008) 2008 International Symposium on Computer Architecture , pp. 453-464
    • Loh, G.H.1
  • 38
    • 79961071291
    • Web search using mobile cores: Quantifying and mitigating the price of efficiency
    • June
    • V. J. Reddi, B. C. Lee, T. Chilimbi, and K. Vaid. Web search using mobile cores: Quantifying and mitigating the price of efficiency. ACM SIGARCH Computer Architecture News, 38(3):215-314, June 2010.
    • (2010) ACM SIGARCH Computer Architecture News , vol.38 , Issue.3 , pp. 215-314
    • Reddi, V.J.1    Lee, B.C.2    Chilimbi, T.3    Vaid, K.4
  • 40
    • 0012776293
    • A parallel 3-D FFT algorithm on clusters of vector SMPs
    • Applied Parallel Computing New Paradigms for HPC in Industry and Academia 5th International Workshop, PARA 2000 Bergen, Norway, June 18-20, 2000 Proceedings
    • D. Takahashi. A Parallel 3-D FFT Algorithm on Clusters of Vector SMPs. In Proceedings of Applied Parallel Computing: New Paradigms for HPC in Industry and Academia, volume LNCS 1947, pages 316-323, 2001. (Pubitemid 33239312)
    • (2001) LECTURE NOTES IN COMPUTER SCIENCE , Issue.1947 , pp. 316-323
    • Takahashi, D.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.