Volume , Issue , 2010, Pages 1-395

Performance tuning of scientific applications

Author keywords

[No Author keywords available]

Indexed keywords

APPLICATION PROGRAMS; COMPUTER HARDWARE

EID: 85054469366     PISSN: None     EISSN: None     Source Type: Book    
DOI: 10.1201/b10509     Document Type: Book
Times cited : (14)

References (392)
  • 1
    • 85054422128
    • Frequently Asked Questions
    • ATLAS Frequently Asked Questions. http://math-atlas.sourceforge.net/faq.html
  • 2
    • 85054451643
    • BLAS: Basic linear algebra subprograms. http://www.netlib.org/blas
  • 3
    • 85054431715
    • CactusEinstein toolkit home page. http://www.cactuscode.org/Community/NumericalRelativity
  • 4
    • 85054462117
    • GEO 600.
  • 5
    • 85054448244
    • Gnu standard: Formatting error messages. http://www.gnu.org/prep/standards/html_node/Errors.html
  • 6
    • 85054445397
    • Kranc: Automated code generation.
  • 7
    • 85054463898
    • LIGO: Laser Interferometer Gravitational wave Observatory.
  • 8
    • 85054461122
    • LISA: Laser Interferometer Space Antenna.
  • 9
    • 85054468897
    • Mesh refinement with Carpet.
  • 10
    • 85054446144
    • Netlib repository. http://www.netlib.org
  • 11
    • 85054449561
    • Queen Bee, the core supercomputer of LONI
    • Queen Bee, the core supercomputer of LONI.
  • 12
    • 85054439712
    • Sun Constellation Linux Cluster: Ranger.
  • 13
    • 85054451362
    • Top500 Supercomputer Sites. http://www.top500.org
  • 14
    • 85054466880
    • Optimizing applications on the Cray X1TM system, 2009. http://docs.cray.com/books/S-2315-50/html-S-2315-50/z1055157958smg.html
    • (2009)
  • 15
  • 29
    • 85054451287
    • Alpaca: Cactus tools for application-level profiling and correctness analysis. http://www.cct.lsu.edu/~eschnett/Alpaca
  • 33
    • 85054464091
    • home page
    • Astrophysics Simulation Collaboratory (ASC) home page.
  • 39
    • 0041638552
    • Twelve ways to fool the masses when giving performance results on parallel computers
    • August
    • D.H. Bailey. Twelve ways to fool the masses when giving performance results on parallel computers. Supercomputing Review, pages 54-55, August 1991.
    • (1991) Supercomputing Review , pp. 54-55
    • Bailey, D.H.1
  • 40
    • 34147135028
    • Misleading performance reporting in the supercomputing field
    • D.H. Bailey. Misleading performance reporting in the supercomputing field. Scientific Programming, 1: 141-151, 1992.
    • (1992) Scientific Programming , vol.1 , pp. 141-151
    • Bailey, D.H.1
  • 47
    • 37549015666
    • Bell’s law for the birth and death of computer classes
    • January
    • G. Bell. Bell’s law for the birth and death of computer classes. Communications of the ACM, 51(1): 86-94, January 2008.
    • (2008) Communications of the ACM , vol.51 , Issue.1 , pp. 86-94
    • Bell, G.1
  • 51
    • 11744289966
    • Local adaptive mesh refinement for shock hydrodynamics
    • May
    • M.J. Berger and P. Colella. Local adaptive mesh refinement for shock hydrodynamics. Journal of Computational Physics, 82(1): 64-84, May 1989.
    • (1989) Journal of Computational Physics , vol.82 , Issue.1 , pp. 64-84
    • Berger, M.J.1    Colella, P.2
  • 52
    • 48749141209
    • Adaptive mesh refinement for hyperbolic partial differential equations
    • M.J. Berger and J. Oliger. Adaptive mesh refinement for hyperbolic partial differential equations. Journal of Computational Physics, 53: 484-512, 1984.
    • (1984) Journal of Computational Physics , vol.53 , pp. 484-512
    • Berger, M.J.1    Oliger, J.2
  • 55
    • 0030661485
    • Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
    • Vienna, Austria
    • J. Bilmes, K. Asanovic, C-W Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology. In International Conference on Supercomputing, pages 340-347, Vienna, Austria, 1997.
    • (1997) International Conference on Supercomputing , pp. 340-347
    • Bilmes, J.1    Asanovic, K.2    Chin, C.-W.3    Demmel, J.4
  • 57
    • 0033407555
    • An energy-conserving thermodynamic model of sea ice
    • C.M. Bitz and W.H. Lipscomb. An energy-conserving thermodynamic model of sea ice. Journal of Geophysical Research, 104: 15669-15677, 1999.
    • (1999) Journal of Geophysical Research , vol.104 , pp. 15669-15677
    • Bitz, C.M.1    Lipscomb, W.H.2
  • 71
    • 58149269099
    • A class of parallel tiled linear algebra algorithms for multicore architectures
    • A. Buttari, J. Langou, J. Kurzak, and J. Dongarra. A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Computing, 35(1): 38-53, 2009.
    • (2009) Parallel Computing , vol.35 , Issue.1 , pp. 38-53
    • Buttari, A.1    Langou, J.2    Kurzak, J.3    Dongarra, J.4
  • 72
    • 85054441479
    • home page
    • Cactus computational toolkit home page. http://www.cactuscode.org
  • 77
    • 0028549474
    • Improving the ratio of memory operations to floating-point operations in loops
    • S. Carr and K. Kennedy. Improving the ratio of memory operations to floating-point operations in loops. ACM Transactions on Programming Languages and Systems, 16(6): 1768-1810, 1994.
    • (1994) ACM Transactions on Programming Languages and Systems , vol.16 , Issue.6 , pp. 1768-1810
    • Carr, S.1    Kennedy, K.2
  • 79
    • 34250161860
    • Applying an automated framework to produce accurate blind performance predictions of full-scale HPC applications
    • June
    • L. Carrington, N. Wolter, A. Snavely, and C.B. Lee. Applying an automated framework to produce accurate blind performance predictions of full-scale HPC applications. DoD Users Group Conference (UGC2004), June 2004.
    • (2004) DoD Users Group Conference (UGC2004)
    • Carrington, L.1    Wolter, N.2    Snavely, A.3    Lee, C.B.4
  • 84
    • 85054462630
    • CCSM Software Engineering Group. http://www.ccsm.ucar.edu/cseg
  • 85
    • 85054427463
    • CCSM Software Engineering Working Group. http://www.ccsm.ucar.edu/csm/working_groups/Software
  • 86
    • 85054462150
    • National Energy Research Scientific Computing Center. Parallel total energy code, 2009.
    • (2009) Parallel total energy code
  • 96
    • 26844455510
    • Multidimensional Upwind Methods for Hyperbolic Conservation Laws
    • P. Colella. Multidimensional Upwind Methods for Hyperbolic Conservation Laws. Journal of Computational Physics, 87: 171-200, 1990.
    • (1990) Journal of Computational Physics , vol.87 , pp. 171-200
    • Colella, P.1
  • 100
    • 33947636363
    • The formulation and atmospheric simulation of the community atmosphere model: CAM3
    • W.D. Collins, et al. The formulation and atmospheric simulation of the community atmosphere model: CAM3. Journal of Climate, 2005.
    • (2005) Journal of Climate
    • Collins, W.D.1
  • 101
    • 85054469230
    • Community Climate System Model. http://www.ccsm.ucar.edu
  • 106
    • 0002806690
    • OpenMP: An industry-standard API for shared-memory programming
    • January/March
    • L. Dagum and R. Menon. OpenMP: an industry-standard API for shared-memory programming. IEEE Computational Science and Engineering, 5(1): 46-55, January/March 1998.
    • (1998) IEEE Computational Science and Engineering , vol.5 , Issue.1 , pp. 46-55
    • Dagum, L.1    Menon, R.2
  • 115
    • 34250840018
    • Optimized high-order derivative and dissipation operators satisfying summation by parts, and applications in three-dimensional multi-block evolutions
    • P. Diener, E.N. Dorband, E. Schnetter, and M. Tiglio. Optimized high-order derivative and dissipation operators satisfying summation by parts, and applications in three-dimensional multi-block evolutions. Journal of Scientific Computing, 32: 109-145, 2007.
    • (2007) Journal of Scientific Computing , vol.32 , pp. 109-145
    • Diener, P.1    Dorband, E.N.2    Schnetter, E.3    Tiglio, M.4
  • 120
  • 126
    • 67650793277
    • Introduction to FLASH 3.0, with application to supersonic turbulence
    • A. Dubey, L.B. Reid, and R. Fisher. Introduction to FLASH 3.0, with application to supersonic turbulence. Physica Scripta, 132: 014046, 2008.
    • (2008) Physica Scripta , vol.132 , pp. 014046
    • Dubey, A.1    Reid, L.B.2    Fisher, R.3
  • 128
    • 85054464329
    • Octave home page
    • J.W. Eaton. Octave home page. http://www.octave.org
    • Eaton, J.W.1
  • 131
    • 85054465022
    • FEAP.
  • 133
    • 34548715722
    • The importance of being low power in high-performance computing
    • W. Feng. The importance of being low power in high-performance computing. CTWatch Quarterly, 1(3): 12-20, 2005.
    • (2005) CTWatch Quarterly , vol.1 , Issue.3 , pp. 12-20
    • Feng, W.1
  • 134
    • 85054425042
    • March
    • Solaris memory placement optimization and Sun Fire servers. http://www.sun.com/software/solaris/performance.jsp, March 2003.
    • (2003)
  • 135
  • 148
  • 155
    • 40749124008
    • Architecture of Qbox: A scalable first-principles molecular dynamics code
    • January/March
    • F. Gygi. Architecture of Qbox: A scalable first-principles molecular dynamics code. IBM Journal of Research and Development, 52, January/March 2008.
    • (2008) IBM Journal of Research and Development , pp. 52
    • Gygi, F.1
  • 159
    • 84870211068
    • Loop transformation recipes for code generation and auto-tuning
    • October
    • M. Hall, J. Chame, J. Shin, C. Chen, G. Rudy, and M.M. Khan. Loop transformation recipes for code generation and auto-tuning. In LCPC, October, 2009.
    • (2009) LCPC
    • Hall, M.1    Chame, J.2    Shin, J.3    Chen, C.4    Rudy, G.5    Khan, M.M.6
  • 165
    • 0024903997
    • Evaluating Associativity in CPU Caches
    • M.D. Hill and A.J. Smith. Evaluating Associativity in CPU Caches. IEEE Transactions on Computers, 38(12): 1612-1630, 1989.
    • (1989) IEEE Transactions on Computers , vol.38 , Issue.12 , pp. 1612-1630
    • Hill, M.D.1    Smith, A.J.2
  • 166
    • 10644250257
    • Inhomogeneous electron gas
    • P. Hohenberg and W. Kohn. Inhomogeneous electron gas. Physical Review, 136: B864, 1964.
    • (1964) Physical Review , vol.136 , pp. B864
    • Hohenberg, P.1    Kohn, W.2
  • 167
    • 0034543848
    • Performance and scalability analysis of teraflop-scale parallel architectures using multidimensional wavefront applications
    • A. Hoisie, O. Lubeck, and H. Wasserman. Performance and scalability analysis of teraflop-scale parallel architectures using multidimensional wavefront applications. International Journal of High Performance Computing Applications, 14: 330-346, 2000.
    • (2000) International Journal of High Performance Computing Applications , vol.14 , pp. 330-346
    • Hoisie, A.1    Lubeck, O.2    Wasserman, H.3
  • 168
    • 12444335040
    • Prediction and adaptation in Active Harmony
    • J.K. Hollingsworth and P.J. Keleher. Prediction and adaptation in Active Harmony. Cluster Computing, 2(3): 195-205, 1999.
    • (1999) Cluster Computing , vol.2 , Issue.3 , pp. 195-205
    • Hollingsworth, J.K.1    Keleher, P.J.2
  • 170
    • 84938447945
    • Direct search solution of numerical and statistical problems
    • R. Hooke and T.A. Jeeves. Direct search solution of numerical and statistical problems. Journal of the ACM, 8(2): 212-229, 1961.
    • (1961) Journal of the ACM , vol.8 , Issue.2 , pp. 212-229
    • Hooke, R.1    Jeeves, T.A.2
  • 171
    • 85054424049
    • HPC challenge benchmark. http://icl.cs.utk.edu/hpcc/index.html
  • 173
    • 48849093309
    • Knowledge Support and Automation for Performance Analysis with PerfExplorer 2.0
    • (special issue on Large-Scale Programming Tools and Environments)
    • K. Huck, A. Malony, S. Shende, and A. Morris. Knowledge Support and Automation for Performance Analysis with PerfExplorer 2.0. The Journal of Scientific Programming, 16(2-3): 123-134, 2008. (special issue on Large-Scale Programming Tools and Environments).
    • (2008) The Journal of Scientific Programming , vol.16 , Issue.2-3 , pp. 123-134
    • Huck, K.1    Malony, A.2    Shende, S.3    Morris, A.4
  • 175
    • 0001439727
    • An elastic-viscous-plastic model for sea ice dynamics
    • E.C. Hunke and J.K. Dukowicz. An elastic-viscous-plastic model for sea ice dynamics. Journal of Physical Oceanography, 27: 1849-1867, 1997.
    • (1997) Journal of Physical Oceanography , vol.27 , pp. 1849-1867
    • Hunke, E.C.1    Dukowicz, J.K.2
  • 177
    • 33646765746
    • Kranc: A Mathematica application to generate numerical codes for tensorial evolution equations
    • S. Husa, I. Hinder, and C. Lechner. Kranc: A Mathematica application to generate numerical codes for tensorial evolution equations. Computer Physics Communications, 174: 983-1004, 2006.
    • (2006) Computer Physics Communications , vol.174 , pp. 983-1004
    • Husa, S.1    Hinder, I.2    Lechner, C.3
  • 181
    • 85054431030
    • ITER: International thermonuclear experimental reactor.
  • 182
    • 85054461393
    • HPC profiling with the Sun Studio(TM) performance tools
    • Dresden, Germany, September
    • M. Itzkowitz and Y. Maruyama. HPC profiling with the Sun Studio(TM) performance tools. In Third Parallel Tools Workshop, Dresden, Germany, September 2009.
    • (2009) Third Parallel Tools Workshop
    • Itzkowitz, M.1    Maruyama, Y.2
  • 183
    • 0037595554
    • Sheared poloidal flow driven by mode conversion in tokamak plasmas
    • E. Jaeger, L. Berry, and J. Myra, et al. Sheared poloidal flow driven by mode conversion in tokamak plasmas. Physical Review Letters, 90, 2003.
    • (2003) Physical Review Letters , pp. 90
    • Jaeger, E.1    Berry, L.2    Myra, J.3
  • 195
    • 4043140349
    • Density functional and density matrix method scaling linearly with the number of atoms
    • W. Kohn. Density functional and density matrix method scaling linearly with the number of atoms. Physical Review Letters, 76(17): 3168-3171, 1996.
    • (1996) Physical Review Letters , vol.76 , Issue.17 , pp. 3168-3171
    • Kohn, W.1
  • 196
    • 0042113153
    • Self-consistent equations including exchange and correlation effects
    • W. Kohn and L.J. Sham. Self-consistent equations including exchange and correlation effects. Physical Review, 140: A1133, 1965.
    • (1965) Physical Review , vol.140 , pp. A1133
    • Kohn, W.1    Sham, L.J.2
  • 197
    • 0242667172
    • Optimization by direct search: New perspectives on some classical and modern methods
    • T.G. Kolda, R.M. Lewis, and V. Torczon. Optimization by direct search: New perspectives on some classical and modern methods. SIAM Review, 45(3): 385-482, 2004.
    • (2004) SIAM Review , vol.45 , Issue.3 , pp. 385-482
    • Kolda, T.G.1    Lewis, R.M.2    Torczon, V.3
  • 198
    • 0029359304
    • Comparison of initial value and eigenvalue codes for kinetic toroidal plasma instabilities
    • August
    • M. Kotschenreuther, G. Rewoldt, and W.M. Tang. Comparison of initial value and eigenvalue codes for kinetic toroidal plasma instabilities. Computer Physics Communications, 88: 128-140, August 1995.
    • (1995) Computer Physics Communications , vol.88 , pp. 128-140
    • Kotschenreuther, M.1    Rewoldt, G.2    Tang, W.M.3
  • 199
    • 85054429973
    • Kranc: Automated code generation. http://www.cct.lsu.edu/~eschnett/Kranc
  • 200
    • 65549119644
    • Quantum chromodynamics with advanced computing
    • A.S. Kronfeld. Quantum chromodynamics with advanced computing. Journal of Physics: Conference Series, 125: 012067, 2008.
    • (2008) Journal of Physics: Conference Series , vol.125 , pp. 012067
    • Kronfeld, A.S.1
  • 201
  • 206
    • 0028380268
    • Rewriting executable files to measure program behavior
    • J.R. Larus and T. Ball. Rewriting executable files to measure program behavior. Software Practice and Experience, 24(2): 197-218, 1994.
    • (1994) Software Practice and Experience , vol.24 , Issue.2 , pp. 197-218
    • Larus, J.R.1    Ball, T.2
  • 210
    • 20844459296
    • Gyrokinetic particle simulation model
    • W.W. Lee. Gyrokinetic particle simulation model. Journal of Computational Physics, 72: 243-269, 1987.
    • (1987) Journal of Computational Physics , vol.72 , pp. 243-269
    • Lee, W.W.1
  • 212
    • 85054467691
    • Dyninst as a binary rewriter
    • M. Legendre. Dyninst as a binary rewriter. In Paradyn/Dyninst week, 2009. http://www.dyninst.org/pdWeek09/slides/legendre-binrewriter.pdf
    • (2009) Paradyn/Dyninst week
    • Legendre, M.1
  • 216
    • 0037071357
    • Size scaling of turbulent transport in magnetically confined plasmas
    • Z. Lin, S. Ethier, T.S. Hahm, and W.M. Tang. Size scaling of turbulent transport in magnetically confined plasmas. Physical Review Letters, 88, 2002.
    • (2002) Physical Review Letters , pp. 88
    • Lin, Z.1    Ethier, S.2    Hahm, T.S.3    Tang, W.M.4
  • 217
    • 0032544628
    • Turbulent transport reduction by zonal flows: Massively parallel simulations
    • September
    • Z. Lin, T.S. Hahm, W.W. Lee, W.M. Tang, and R.B. White. Turbulent transport reduction by zonal flows: Massively parallel simulations. Science, 281(5384): 1835-1837, September 1998.
    • (1998) Science , vol.281 , Issue.5384 , pp. 1835-1837
    • Lin, Z.1    Hahm, T.S.2    Lee, W.W.3    Tang, W.M.4    White, R.B.5
  • 220
    • 77954020714
    • On-line detection of large-scale parallel application’s structure
    • April
    • G. Llort, J. Gonzalez, H. Servat, J. Gimenez, and J. Labarta. On-line detection of large-scale parallel application’s structure. In IPDPS 2010, April 2010.
    • (2010) IPDPS 2010
    • Llort, G.1    Gonzalez, J.2    Servat, H.3    Gimenez, J.4    Labarta, J.5
  • 222
    • 0031164889
    • Increasing the efficiency of ideal solar cells by photon induced transitions at intermediate levels
    • A. Luque and A. Marti. Increasing the efficiency of ideal solar cells by photon induced transitions at intermediate levels. Physical Review Letters, 78: 5014, 1997.
    • (1997) Physical Review Letters , vol.78 , pp. 5014
    • Luque, A.1    Marti, A.2
  • 224
    • 33745859061
    • Spatial hypersurfaces in causal set cosmology
    • Jun
    • S. Major, D. Rideout, and S. Surya. Spatial hypersurfaces in causal set cosmology. Classical Quantum Gravity, 23: 4743-4752, Jun 2006.
    • (2006) Classical Quantum Gravity , vol.23 , pp. 4743-4752
    • Major, S.1    Rideout, D.2    Surya, S.3
  • 229
    • 70350727348
    • Measuring how fast computers really are
    • September
    • J. Markoff. Measuring how fast computers really are. New York Times, page 14F, September 1991.
    • (1991) New York Times , pp. 14
    • Markoff, J.1
  • 230
    • 85054455832
    • Performance Measurement of Applications with GPU Acceleration using CUDA
    • to appear
    • S. Mayanglambam, A. Malony, and M. Sottile. Performance Measurement of Applications with GPU Acceleration using CUDA. In Parallel Computing (ParCo), 2009. to appear.
    • (2009) Parallel Computing (ParCo)
    • Mayanglambam, S.1    Malony, A.2    Sottile, M.3
  • 232
    • 0032252855
    • Convergence of the Nelder-Mead simplex method to a nonstationary point
    • K.I.M. McKinnon. Convergence of the Nelder-Mead simplex method to a nonstationary point. SIAM Journal on Optimization, 9(1): 148-158, 1998.
    • (1998) SIAM Journal on Optimization , vol.9 , Issue.1 , pp. 148-158
    • McKinnon, K.I.M.1
  • 233
    • 85054461446
    • a public BSSN code
    • McLachlan, a public BSSN code.
    • McLachlan1
  • 236
    • 33846529179
    • Performance monitoring on the POWER5 microprocessor
    • L.K. John and L. Eeckhout, CRC PRESS
    • A. Mericas. Performance monitoring on the POWER5 microprocessor. In L.K. John and L. Eeckhout, editors, Performance Evaluation and Benchmarking, pages 247-266. CRC PRESS, 2006.
    • (2006) Performance Evaluation and Benchmarking , pp. 247-266
    • Mericas, A.1
  • 240
    • 0037146399
    • A Conservative Three-Dimensional Eulerian Method for Coupled Solid-Fluid Shock Capturing
    • G.H. Miller and P. Colella. A Conservative Three-Dimensional Eulerian Method for Coupled Solid-Fluid Shock Capturing. Journal of Computational Physics, 183: 26-82, 2002.
    • (2002) Journal of Computational Physics , vol.183 , pp. 26-82
    • Miller, G.H.1    Colella, P.2
  • 241
    • 83155177863
    • Coping at the user-level with resource limitations in the Cray message passing toolkit MPI at scale: How not to spend your summer vacation
    • R. Winget and K. Winget, editors, Eagan, MN, Cray User Group, Inc
    • R. Mills, F. Hoffman, P. Worley, K. Perumalla, A. Mirin, G. Hammond, and B. Smith. Coping at the user-level with resource limitations in the Cray message passing toolkit MPI at scale: How not to spend your summer vacation. In R. Winget and K. Winget, editors, Proceedings of the 51st Cray User Group Conference, May 4-7, 2009, Eagan, MN, 2009. Cray User Group, Inc.
    • (2009) Proceedings of the 51st Cray User Group Conference, May 4-7, 2009
    • Mills, R.1    Hoffman, F.2    Worley, P.3    Perumalla, K.4    Mirin, A.5    Hammond, G.6    Smith, B.7
  • 242
    • 35348840289
    • Block structured adaptive mesh and time refinement for hybrid, hyperbolic + n-body systems
    • F. Miniati and P. Colella. Block structured adaptive mesh and time refinement for hybrid, hyperbolic + n-body systems. Journal of Computational Physics, 227: 400-430, 2007.
    • (2007) Journal of Computational Physics , vol.227 , pp. 400-430
    • Miniati, F.1    Colella, P.2
  • 243
    • 36049045668
    • Extending scalability of the Community Atmosphere Model
    • A. Mirin and P. Worley. Extending scalability of the Community Atmosphere Model. Journal of Physics: Conference Series, 78, 2007. doi: 10.1088/1742-6596/78/1/012082
    • (2007) Journal of Physics: Conference Series , pp. 78
    • Mirin, A.1    Worley, P.2
  • 247
    • 0000793139
    • Cramming more components onto integrated circuits
    • April
    • G.E. Moore. Cramming more components onto integrated circuits. Electronics, 38(8), April 1965.
    • (1965) Electronics , vol.38 , Issue.8
    • Moore, G.E.1
  • 248
    • 51849091556
    • Observing performance dynamics using parallel profile snapshots
    • Canary Island, Spain, August, Springer
    • A. Morris, W. Spear, A. Malony, and S. Shende. Observing performance dynamics using parallel profile snapshots. In EuroPar 2008, volume LNCS 5168, pages 162-171, Canary Island, Spain, August 2008. Springer.
    • (2008) EuroPar 2008, volume LNCS 5168 , pp. 162-171
    • Morris, A.1    Spear, W.2    Malony, A.3    Shende, S.4
  • 257
    • 85054443810
    • National Center for Supercomputing Applications. Blue Waters hardware. http://www.ncsa.illinois.edu/BlueWaters/hardware.html
    • Blue Waters hardware
  • 258
    • 0000238336
    • A simplex method for function minimization
    • J.A. Nelder and R. Mead. A simplex method for function minimization. Computer Journal, 7: 308-313, 1965.
    • (1965) Computer Journal , vol.7 , pp. 308-313
    • Nelder, J.A.1    Mead, R.2
  • 261
  • 262
    • 0002081678
    • Co-Array Fortran for parallel programming
    • R.W. Numrich and J.K. Reid. Co-Array Fortran for parallel programming. ACM Fortran Forum, 17(2): 1-31, 1998.
    • (1998) ACM Fortran Forum , vol.17 , Issue.2 , pp. 1-31
    • Numrich, R.W.1    Reid, J.K.2
  • 267
    • 0031123703
    • From silicon to RNA: The coming of age of first-principle molecular dynamics
    • M. Parrinello. From silicon to RNA: The coming of age of first-principle molecular dynamics. Solid State Communications, 103, 107, 1997.
    • (1997) Solid State Communications , vol.103 , pp. 107
    • Parrinello, M.1
  • 269
    • 11944256577
    • Iterative minimization techniques for ab initio total-energy calculations: Molecular dynamics and conjugate gradients
    • M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias, and J.D. Joannopoulos. Iterative minimization techniques for ab initio total-energy calculations: Molecular dynamics and conjugate gradients. Reviews of Modern Physics, 64: 1045, 1992.
    • (1992) Reviews of Modern Physics , vol.64 , pp. 1045
    • Payne, M.C.1    Teter, M.P.2    Allan, D.C.3    Arias, T.A.4    Joannopoulos, J.D.5
  • 271
    • 85054451417
    • SciDAC Performance Engineering Research Institute (PERI).
  • 272
    • 85054459001
    • PETSc: Portable, extensible toolkit for scientific computation.
  • 273
    • 70649090070
    • Victoria Falls: Scaling highly-threaded processor cores
    • S. Phillips. Victoria Falls: Scaling highly-threaded processor cores. In HotChips 19, 2007.
    • (2007) HotChips 19
    • Phillips, S.1
  • 274
    • 0028409163
    • The NX message passing interface
    • April
    • P. Pierce. The NX message passing interface. Parallel Computing, 20(4): 463-480, April 1994.
    • (1994) Parallel Computing , vol.20 , Issue.4 , pp. 463-480
    • Pierce, P.1
  • 276
    • 33645436979
    • Technical Report UPC-CEPBA 95-3, European Center for Parallelism of Barcelona (CEPBA), Universitat Politècnica de Catalunya (UPC)
    • V. Pillet, J. Labarta, T. Cortes, and S. Girona. PARAVER: A tool to visualize and analyze parallel code. Technical Report UPC-CEPBA 95-3, European Center for Parallelism of Barcelona (CEPBA), Universitat Politècnica de Catalunya (UPC), 1995. http://tinyurl.com/paraver95
    • (1995) PARAVER: A tool to visualize and analyze parallel code
    • Pillet, V.1    Labarta, J.2    Cortes, T.3    Girona, S.4
  • 278
    • 85054461952
    • PLASMA project. http://icl.cs.utk.edu/plasma
  • 282
    • 85054441534
    • Coefficient of determination. mathbits.com/mathbits/tisection/statistics2/correlation.htm
  • 285
    • 33846164822
    • Evidence for an entropy bound from fundamentally discrete gravity
    • D. Rideout and S. Zohren. Evidence for an entropy bound from fundamentally discrete gravity. Classical Quantum Gravity, 2006.
    • (2006) Classical Quantum Gravity
    • Rideout, D.1    Zohren, S.2
  • 286
    • 84877034501
    • Mrnet: A software-based multicast/reduction network for scalable tools
    • IEEE Computer Society
    • P.C. Roth, D.C. Arnold, and B.P. Miller. Mrnet: A software-based multicast/reduction network for scalable tools. In International Conference on Supercomputing, pages 21-36. IEEE Computer Society, 2003.
    • (2003) International Conference on Supercomputing , pp. 21-36
    • Roth, P.C.1    Arnold, D.C.2    Miller, B.P.3
  • 290
    • 33746604824
    • A multi-block infrastructure for three-dimensional time-dependent numerical relativity
    • E. Schnetter, P. Diener, E.N. Dorband, and M. Tiglio. A multi-block infrastructure for three-dimensional time-dependent numerical relativity. Classical Quantum Gravity, 23: S553-S578, 2006.
    • (2006) Classical Quantum Gravity , vol.23 , pp. S553-S578
    • Schnetter, E.1    Diener, P.2    Dorband, E.N.3    Tiglio, M.4
  • 291
    • 1842479966
    • Evolutions in 3D numerical relativity using fixed mesh refinement
    • E. Schnetter, S.H. Hawley, and I. Hawke. Evolutions in 3D numerical relativity using fixed mesh refinement. Classical and Quantum Gravity, 21: 1465-1488, 2004.
    • (2004) Classical and Quantum Gravity , vol.21 , pp. 1465-1488
    • Schnetter, E.1    Hawley, S.H.2    Hawke, I.3
  • 292
    • 34548192076
    • Optical properties of ZnO/ZnS and ZnO/ZnTe heterostructures for photovoltaic applications
    • J. Schrier, D.O. Demchenko, L.-W. Wang, and A.P. Alivisatos. Optical properties of ZnO/ZnS and ZnO/ZnTe heterostructures for photovoltaic applications. Nano Lett., 7: 2377, 2007.
    • (2007) Nano Lett. , vol.7 , pp. 2377
    • Schrier, J.1    Demchenko, D.O.2    Wang, L.-W.3    Alivisatos, A.P.4
  • 293
    • 34547489425
    • A flexible and dynamic infrastructure for MPI tool interoperability
    • M. Schulz and B.R. de Supinski. A flexible and dynamic infrastructure for MPI tool interoperability. In Proceedings of ICPP 2006, pages 193-202, 2006.
    • (2006) Proceedings of ICPP 2006 , pp. 193-202
    • Schulz, M.1    De Supinski, B.R.2
  • 294
    • 56749160395
    • pnMPI tools: A whole lot greater than the sum of their parts
    • M. Schulz and B.R. de Supinski. pnMPI tools: A whole lot greater than the sum of their parts. In Proceedings of SC07, 2007.
    • (2007) Proceedings of SC07
    • Schulz, M.1    De Supinski, B.R.2
  • 295
    • 85054464259
    • Report of the High-End Computing Revitalization Task Force (HECRTF)
    • National Science and Technology Council Committee on Technology High-End Computing Revitalization Task Force. Report of the High-End Computing Revitalization Task Force (HECRTF). 2004.
    • (2004) On Technology High-End Computing Revitalization Task Force
  • 296
    • 33645982477
    • Technical Report ZHR-R-0304, Dresden University of Technology, Center for High-Performance Computing, Nov
    • S. Seidl. VTF3 - A fast Vampir trace file low-level management library. Technical Report ZHR-R-0304, Dresden University of Technology, Center for High-Performance Computing, Nov 2003.
    • (2003) VTF3 - A fast Vampir trace file low-level management library
    • Seidl, S.1
  • 309
    • 0017949328
    • A comparative study of set associative memory mapping algorithms and their use for cache and main memory
    • A.J. Smith. A comparative study of set associative memory mapping algorithms and their use for cache and main memory. IEEE Transactions on Software Engineering, (2): 121-130.
    • IEEE Transactions on Software Engineering , Issue.2 , pp. 121-130
    • Smith, A.J.1
  • 310
    • 44049110107
    • Parallel ocean general circulation modeling
    • R.D. Smith, J.K. Dukowicz, and R.C. Malone. Parallel ocean general circulation modeling. Phys. D, 60(1-4): 38-61, 1992.
    • (1992) Phys. D , vol.60 , Issue.1-4 , pp. 38-61
    • Smith, R.D.1    Dukowicz, J.K.2    Malone, R.C.3
  • 314
    • 85054454559
    • SPIRAL project. http://www.spiral.net
  • 315
    • 0036652569
    • Pentium 4 performance-monitoring features
    • B. Sprunt. Pentium 4 performance-monitoring features. IEEE Micro, 22(4): 72-82, 2002.
    • (2002) IEEE Micro , vol.22 , Issue.4 , pp. 72-82
    • Sprunt, B.1
  • 317
    • 85054430208
    • STREAM: Sustainable memory bandwidth in high performance computers. http://www.cs.virginia.edu/stream
  • 321
    • 85054452679
    • Sun Microsystems. Sun Studio Performance Analyzer. http://developers.sun.com/sunstudio/overview/topics/analyzing.jsp, 2009.
    • (2009) Sun Studio Performance Analyzer
  • 330
    • 23944471086
    • Prophesy: An infrastructure for performance analysis and modeling of parallel and grid applications
    • V. Taylor, X. Wu, and R. Stevens. Prophesy: An infrastructure for performance analysis and modeling of parallel and grid applications. SIGMETRICS Perform. Eval. Rev., 30(4): 13-18, 2003.
    • (2003) SIGMETRICS Perform. Eval. Rev. , vol.30 , Issue.4 , pp. 13-18
    • Taylor, V.1    Wu, X.2    Stevens, R.3
  • 331
    • 85054432382
    • The Parallel Ocean Program. http://climate.lanl.gov/Models/POP
  • 334
    • 0002862950
    • Gravitational Radiation - a New Window Onto the Universe. (Karl Schwarzschild Lecture 1996)
    • K.S. Thorne. Gravitational Radiation - a New Window Onto the Universe. (Karl Schwarzschild Lecture 1996). Reviews of Modern Astronomy, 10: 1-28, 1997.
    • (1997) Reviews of Modern Astronomy , vol.10 , pp. 1-28
    • Thorne, K.S.1
  • 336
    • 0034375534
    • The accuracy, consistency, and speed of an electron-positron equation of state based on table interpolation of the Helmholtz free energy
    • F.X. Timmes and F.D. Swesty. The accuracy, consistency, and speed of an electron-positron equation of state based on table interpolation of the Helmholtz free energy. Astrophysical Journal, Supplement, 126: 501-516, 2000.
    • (2000) Astrophysical Journal, Supplement , vol.126 , pp. 501-516
    • Timmes, F.X.1    Swesty, F.D.2
  • 340
    • 85054446997
    • University of Oregon. TAU Portal. http://tau.nic.uoregon.edu
    • TAU Portal
  • 344
    • 70350771131
    • Benchmarking GPUs to tune dense linear algebra
    • IEEE, to appear
    • V. Volkov and J. Demmel. Benchmarking GPUs to tune dense linear algebra. In Supercomputing 08. IEEE, 2008. to appear.
    • (2008) Supercomputing 08
    • Volkov, V.1    Demmel, J.2
  • 347
    • 24344485098
    • OSKI: A library of automatically tuned sparse matrix kernels
    • June
    • R. Vuduc, J. Demmel, and K. Yelick. OSKI: A library of automatically tuned sparse matrix kernels. Journal of Physics: Conference Series, 16: 521-530, June 2005.
    • (2005) Journal of Physics: Conference Series , vol.16 , pp. 521-530
    • Vuduc, R.1    Demmel, J.2    Yelick, K.3
  • 351
    • 42749103540
    • First-principles thousand-atoms quantum dot calculations
    • L.-W. Wang and J. Li. First-principles thousand-atoms quantum dot calculations. Physical Review B, 69: 153302, 2004.
    • (2004) Physical Review B , vol.69 , pp. 153302
    • Wang, L.-W.1    Li, J.2
  • 352
    • 42049097273
    • Linear scaling three-dimensional fragment method for large-scale electronic structure calculations
    • L.-W. Wang, Z. Zhao, and J. Meza. Linear scaling three-dimensional fragment method for large-scale electronic structure calculations. Physical Review B, 77: 165113, 2008.
    • (2008) Physical Review B , vol.77 , pp. 165113
    • Wang, L.-W.1    Zhao, Z.2    Meza, J.3
  • 353
    • 0001604458
    • Solving Schrödinger’s equation around a desired energy: Application to silicon quantum dots
    • L.-W. Wang and A. Zunger. Solving Schrödinger’s equation around a desired energy: Application to silicon quantum dots. Journal of Chemical Physics, 100: 2394, 1994.
    • (1994) Journal of Chemical Physics , vol.100 , pp. 2394
    • Wang, L.-W.1    Zunger, A.2
  • 360
    • 84943297310
    • Automatically tuned linear algebra software
    • R.C. Whaley and J.J. Dongarra. Automatically tuned linear algebra software. In SuperComputing, 1998.
    • (1998) SuperComputing
    • Whaley, R.C.1    Dongarra, J.J.2
  • 361
    • 0343462141
    • Automated empirical optimization of software and the ATLAS project
    • R.C. Whaley, A. Petitet, and J. Dongarra. Automated empirical optimization of software and the ATLAS project. Parallel Computing, 27(1-2): 3-35, 2001.
    • (2001) Parallel Computing , vol.27 , Issue.1-2 , pp. 3-35
    • Whaley, R.C.1    Petitet, A.2    Dongarra, J.3
  • 368
    • 67650797544
    • Roofline: An insightful visual performance model for floating-point programs and multicore architectures
    • April
    • S. Williams, A. Waterman, and D. Patterson. Roofline: An insightful visual performance model for floating-point programs and multicore architectures. Communications of the ACM, April 2009.
    • (2009) Communications of the ACM
    • Williams, S.1    Waterman, A.2    Patterson, D.3
  • 369
    • 0004039521
    • NCAR Tech. Note NCAR/TN-210+STR, NTIS PB83 231068, National Center for Atmospheric Research, Boulder, Colo
    • D. L. Williamson. Description of NCAR Community Climate Model (CCM0B). NCAR Tech. Note NCAR/TN-210+STR, NTIS PB83 231068, National Center for Atmospheric Research, Boulder, Colo., 1983.
    • (1983) Description of NCAR Community Climate Model (CCM0B)
    • Williamson, D.L.1
  • 374
    • 85054458024
    • Performance of the Community Atmosphere Model on the Cray X1E and XT3
    • R. Winget and K. Winget, editors, Eagan, MN, Cray User Group, Inc
    • P. Worley. Performance of the Community Atmosphere Model on the Cray X1E and XT3. In R. Winget and K. Winget, editors, Proceedings of the 48th Cray User Group Conference, May 8-11, 2006, Eagan, MN, 2006. Cray User Group, Inc.
    • (2006) Proceedings of the 48th Cray User Group Conference, May 8-11, 2006
    • Worley, P.1
  • 375
    • 85054459662
    • June, Poster Presentation at the 13th Annual CCSM Workshop, June 17-19, 2008, Breckenridge, CO
    • P. Worley and A. Mirin. Performance Results for the new CAM Benchmark Suite, June 2008. Poster Presentation at the 13th Annual CCSM Workshop, June 17-19, 2008, Breckenridge, CO.
    • (2008) Performance Results for the new CAM Benchmark Suite
    • Worley, P.1    Mirin, A.2
  • 377
    • 0346941076
    • MPI performance evaluation and characterization using a compact application benchmark code
    • IEEE Computer Society Press, Los Alamitos, CA
    • P.H. Worley. MPI performance evaluation and characterization using a compact application benchmark code. In Proceedings of the Second MPI Developers Conference and Users’ Meeting, pages 170-177. IEEE Computer Society Press, Los Alamitos, CA, 1996.
    • (1996) Proceedings of the Second MPI Developers Conference and Users’ Meeting , pp. 170-177
    • Worley, P.H.1
  • 379
    • 84883330114
    • Benchmarking using the Community Atmosphere Model
    • Warrenton, VA, The Standard Performance Evaluation Corp
    • P.H. Worley. Benchmarking using the Community Atmosphere Model. In Proceedings of the 2006 SPEC Benchmark Workshop, January 23, 2006, Warrenton, VA, 2006. The Standard Performance Evaluation Corp.
    • (2006) Proceedings of the 2006 SPEC Benchmark Workshop, January 23, 2006
    • Worley, P.H.1
  • 381
    • 23844503894
    • Performance portability in the physical parameterizations of the Community Atmosphere Model
    • August
    • P.H. Worley and J.B. Drake. Performance portability in the physical parameterizations of the Community Atmosphere Model. International Journal of High Performance Computing Applications, 19(3): 1-15, August 2005.
    • (2005) International Journal of High Performance Computing Applications , vol.19 , Issue.3 , pp. 1-15
    • Worley, P.H.1    Drake, J.B.2
  • 382
    • 0028565417
    • Parallel spectral transform shallow water model: A runtime-tunable parallel benchmark code
    • J. J. Dongarra and D. W. Walker, editors, IEEE Computer Society Press, Los Alamitos, CA
    • P.H. Worley and I.T. Foster. Parallel spectral transform shallow water model: a runtime-tunable parallel benchmark code. In J. J. Dongarra and D. W. Walker, editors, Proceedings of the Scalable High Performance Computing Conference, pages 207-214. IEEE Computer Society Press, Los Alamitos, CA, 1994.
    • (1994) Proceedings of the Scalable High Performance Computing Conference , pp. 207-214
    • Worley, P.H.1    Foster, I.T.2
  • 383
    • 0002372482
    • Algorithm comparison and benchmarking using a parallel spectral transform shallow water model
    • G.-R. Hoffman and N. Kreitz, editors, World Scientific Publishing Co. Pte. Ltd., Singapore
    • P.H. Worley, I.T. Foster, and B. Toonen. Algorithm comparison and benchmarking using a parallel spectral transform shallow water model. In G.-R. Hoffman and N. Kreitz, editors, Coming of Age: Proceedings of the Sixth ECMWF Workshop on Use of Parallel Processors in Meteorology, pages 277-289. World Scientific Publishing Co. Pte. Ltd., Singapore, 1995.
    • (1995) Coming of Age: Proceedings of the Sixth ECMWF Workshop on Use of Parallel Processors in Meteorology , pp. 277-289
    • Worley, P.H.1    Foster, I.T.2    Toonen, B.3
  • 384
    • 25144441529
    • The performance evolution of the Parallel Ocean Program on the Cray X1
    • R. Winget and K. Winget, editors, Eagan, MN, Cray User Group, Inc
    • P.H. Worley and J. Levesque. The performance evolution of the Parallel Ocean Program on the Cray X1. In R. Winget and K. Winget, editors, Proceedings of the 46th Cray User Group Conference, May 17-21, 2004, Eagan, MN, 2004. Cray User Group, Inc.
    • (2004) Proceedings of the 46th Cray User Group Conference, May 17-21, 2004
    • Worley, P.H.1    Levesque, J.2
  • 390
    • 47249088157
    • A divide and conquer linear scaling three dimensional fragment method for large scale electronic structure calculations
    • Z. Zhao, J. Meza, and L.-W. Wang. A divide and conquer linear scaling three dimensional fragment method for large scale electronic structure calculations. Journal of Physics: Condensed Matter, 20(294203), 2008.
    • (2008) Journal of Physics: Condensed Matter , vol.20
    • Zhao, Z.1    Meza, J.2    Wang, L.-W.3
  • 392
    • 44649089660
    • Multipatch methods in general relativistic astrophysics - hydrodynamical flows on fixed backgrounds
    • B. Zink, E. Schnetter, and M. Tiglio. Multipatch methods in general relativistic astrophysics - hydrodynamical flows on fixed backgrounds. Physical Review D, 77: 103015, 2008.
    • (2008) Physical Review D , vol.77 , pp. 103015
    • Zink, B.1    Schnetter, E.2    Tiglio, M.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.