-
1
-
-
77950611743
-
HPCToolkit: Tools for performance analysis of optimized parallel programs
-
L. Adhianto, S. Banerjee, M. Fagan, M. Krentel, G. Marin, J. Mellor-Crummey, and N. R. Tallent, "HPCToolkit: Tools for performance analysis of optimized parallel programs," Concurrency and Computation: Practice and Experience, vol. 22, no. 6, pp. 685-701, 2010.
-
(2010)
Concurrency and Computation: Practice and Experience
, vol.22
, Issue.6
, pp. 685-701
-
-
Adhianto, L.1
Banerjee, S.2
Fagan, M.3
Krentel, M.4
Marin, G.5
Mellor-Crummey, J.6
Tallent, N.R.7
-
2
-
-
0030645124
-
Exploiting hardware performance counters with flow and context sensitive profiling
-
New York, NY, USA: ACM
-
G. Ammons, T. Ball, and J. R. Larus, "Exploiting hardware performance counters with flow and context sensitive profiling," in Proc. of the 1997 ACM SIGPLAN Conference on Programming Language Design and Implementation. New York, NY, USA: ACM, 1997, pp. 85-96.
-
(1997)
Proc. of the 1997 ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 85-96
-
-
Ammons, G.1
Ball, T.2
Larus, J.R.3
-
3
-
-
0003660984
-
PETSc users manual
-
S. Balay, K. Buschelman, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, B. F. Smith, and H. Zhang, "PETSc users manual," Argonne National Laboratory, Tech. Rep. ANL-95/11 - Revision 3.0.0, 2008.
-
(2008)
Argonne National Laboratory, Tech. Rep. ANL-95/11-Revision 3.0.0
-
-
Balay, S.1
Buschelman, K.2
Eijkhout, V.3
Gropp, W.D.4
Kaushik, D.5
Knepley, M.G.6
McInnes, L.C.7
Smith, B.F.8
Zhang, H.9
-
4
-
-
34548290422
-
MPI performance analysis tools on Blue Gene/L
-
New York, NY, USA: ACM
-
I.-H. Chung, R. E. Walkup, H.-F. Wen, and H. Yu, "MPI performance analysis tools on Blue Gene/L," in Proc. of the 2006 ACM/IEEE Conference on Supercomputing. New York, NY, USA: ACM, 2006, p. 123.
-
(2006)
Proc. of the 2006 ACM/IEEE Conference on Supercomputing
, pp. 123
-
-
Chung, I.-H.1
Walkup, R.E.2
Wen, H.-F.3
Yu, H.4
-
5
-
-
38049138575
-
Detecting application load imbalance on high end massively parallel systems
-
L. DeRose, B. Homer, and D. Johnson, "Detecting application load imbalance on high end massively parallel systems," Lecture Notes in Computer Science, vol. 4641/2007, pp. 150-159, 2007.
-
(2007)
Lecture Notes in Computer Science
, vol.4641
, Issue.2007
, pp. 150-159
-
-
DeRose, L.1
Homer, B.2
Johnson, D.3
-
6
-
-
74049085137
-
Cray performance analysis tools
-
Springer Berlin Heidelberg
-
L. DeRose, B. Homer, D. Johnson, S. Kaufmann, and H. Poxon, "Cray performance analysis tools," in Tools for High Performance Computing. Springer Berlin Heidelberg, 2008, pp. 191-199.
-
(2008)
Tools for High Performance Computing
, pp. 191-199
-
-
DeRose, L.1
Homer, B.2
Johnson, D.3
Kaufmann, S.4
Poxon, H.5
-
7
-
-
70350755747
-
Scalable load-balance measurement for SPMD codes
-
Piscataway, NJ, USA: IEEE Press
-
T. Gamblin, B. R. de Supinski, M. Schulz, R. Fowler, and D. A. Reed, "Scalable load-balance measurement for SPMD codes," in Proc. of the 2008 ACM/IEEE Conference on Supercomputing. Piscataway, NJ, USA: IEEE Press, 2008, pp. 1-12.
-
(2008)
Proc. of the 2008 ACM/IEEE Conference on Supercomputing
, pp. 1-12
-
-
Gamblin, T.1
De Supinski, B.R.2
Schulz, M.3
Fowler, R.4
Reed, D.A.5
-
8
-
-
77954747892
-
Clustering performance data efficiently at massive scales
-
New York, NY, USA: ACM
-
T. Gamblin, B. R. de Supinski, M. Schulz, R. Fowler, and D. A. Reed, "Clustering performance data efficiently at massive scales," in Proc. of the 24th ACM International Conference on Supercomputing. New York, NY, USA: ACM, 2010, pp. 243-252.
-
(2010)
Proc. of the 24th ACM International Conference on Supercomputing
, pp. 243-252
-
-
Gamblin, T.1
De Supinski, B.R.2
Schulz, M.3
Fowler, R.4
Reed, D.A.5
-
9
-
-
51049087079
-
Scalable methods for monitoring and detecting behavioral equivalence classes in scientific codes
-
April
-
T. Gamblin, R. Fowler, and D. A. Reed, "Scalable methods for monitoring and detecting behavioral equivalence classes in scientific codes," in Proc. of the 2008 IEEE International Symposium on Parallel and Distributed Processing, April 2008, pp. 1-12.
-
(2008)
Proc. of the 2008 IEEE International Symposium on Parallel and Distributed Processing
, pp. 1-12
-
-
Gamblin, T.1
Fowler, R.2
Reed, D.A.3
-
10
-
-
77950622295
-
The Scalasca performance toolset architecture
-
M. Geimer, F. Wolf, B. J. N. Wylie, E. Ábrahám, D. Becker, and B. Mohr, "The Scalasca performance toolset architecture," Concurrency and Computation: Practice and Experience, vol. 22, no. 6, pp. 702-719, 2010.
-
(2010)
Concurrency and Computation: Practice and Experience
, vol.22
, Issue.6
, pp. 702-719
-
-
Geimer, M.1
Wolf, F.2
Wylie, B.J.N.3
Ábrahám, E.4
Becker, D.5
Mohr, B.6
-
11
-
-
78650825004
-
Scalable techniques for performance analysis
-
Parallel Programming Laboratory, Department of Computer Science, Urbana-Champaign, 07-06, May
-
C. W. Lee and L. V. Kalé, "Scalable Techniques for Performance Analysis," Parallel Programming Laboratory, Department of Computer Science, University of Illinois, Urbana-Champaign, Tech. Rep. 07-06, May 2007.
-
(2007)
University of Illinois, Tech. Rep.
-
-
Lee, C.W.1
Kalé, L.V.2
-
12
-
-
65549142479
-
Towards scalable performance analysis and visualization through data reduction
-
Miami, Florida, USA, April
-
C. W. Lee, C. Mendes, and L. V. Kalé, "Towards Scalable Performance Analysis and Visualization through Data Reduction," in Proc. of the 13th International Workshop on High-Level Parallel Programming Models and Supportive Environments, Miami, Florida, USA, April 2008.
-
(2008)
Proc. of the 13th International Workshop on High-Level Parallel Programming Models and Supportive Environments
-
-
Lee, C.W.1
Mendes, C.2
Kalé, L.V.3
-
14
-
-
78650830233
-
Detecting load imbalance in massively parallel applications
-
December
-
J. C. Linford, M.-A. Hermanns, M. Geimer, D. Boehme, and F. Wolf, "Detecting load imbalance in massively parallel applications," Forschungszentrum Jülich, Tech. Rep. FZJ-JSC-IB-2008-09, December 2008.
-
(2008)
Forschungszentrum Jülich, Tech. Rep. FZJ-JSC-IB-2008-09
-
-
Linford, J.C.1
Hermanns, M.-A.2
Geimer, M.3
Boehme, D.4
Wolf, F.5
-
15
-
-
34247564559
-
Compensation of measurement overhead in parallel performance profiling
-
A. D. Malony, S. Shende, A. Morris, and F. Wolf, "Compensation of measurement overhead in parallel performance profiling," Int. J. High Perform. Comput. Appl., vol. 21, no. 2, pp. 174-194, 2007.
-
(2007)
Int. J. High Perform. Comput. Appl.
, vol.21
, Issue.2
, pp. 174-194
-
-
Malony, A.D.1
Shende, S.2
Morris, A.3
Wolf, F.4
-
17
-
-
0029408429
-
The Paradyn parallel performance measurement tool
-
B. P. Miller, M. D. Callaghan, J. M. Cargille, J. K. Hollingsworth, R. B. Irvin, K. L. Karavanic, K. Kunchithapadam, and T. Newhall, "The Paradyn parallel performance measurement tool," Computer, vol. 28, no. 11, pp. 37-46, 1995.
-
(1995)
Computer
, vol.28
, Issue.11
, pp. 37-46
-
-
Miller, B.P.1
Callaghan, M.D.2
Cargille, J.M.3
Hollingsworth, J.K.4
Irvin, R.B.5
Karavanic, K.L.6
Kunchithapadam, K.7
Newhall, T.8
-
18
-
-
36048966865
-
Simulating subsurface flow and transport on ultrascale computers using PFLOTRAN
-
no. 012051
-
R. T. Mills, C. Lu, P. C. Lichtner, and G. E. Hammond, "Simulating subsurface flow and transport on ultrascale computers using PFLOTRAN," Journal of Physics Conference Series, vol. 78, no. 012051, 2007.
-
(2007)
Journal of Physics Conference Series
, vol.78
-
-
Mills, R.T.1
Lu, C.2
Lichtner, P.C.3
Hammond, G.E.4
-
19
-
-
67349187344
-
Scalatrace: Scalable compression and replay of communication traces for high-performance computing
-
M. Noeth, P. Ratn, F. Mueller, M. Schulz, and B. R. de Supinski, "Scalatrace: Scalable compression and replay of communication traces for high-performance computing," J. Parallel Distrib. Comput., vol. 69, no. 8, pp. 696-710, 2009.
-
(2009)
J. Parallel Distrib. Comput.
, vol.69
, Issue.8
, pp. 696-710
-
-
Noeth, M.1
Ratn, P.2
Mueller, F.3
Schulz, M.4
De Supinski, B.R.5
-
22
-
-
84877034501
-
MRNet: A software-based multicast/reduction network for scalable tools
-
Washington, DC, USA: IEEE Computer Society
-
P. C. Roth, D. C. Arnold, and B. P. Miller, "MRNet: A software-based multicast/reduction network for scalable tools," in Proc. of the 2003 ACM/IEEE Conference on Supercomputing. Washington, DC, USA: IEEE Computer Society, 2003, p. 21.
-
(2003)
Proc. of the 2003 ACM/IEEE Conference on Supercomputing
, pp. 21
-
-
Roth, P.C.1
Arnold, D.C.2
Miller, B.P.3
-
24
-
-
70350597876
-
Effective performance measurement and analysis of multithreaded applications
-
New York, NY, USA: ACM
-
N. R. Tallent and J. Mellor-Crummey, "Effective performance measurement and analysis of multithreaded applications," in Proc. of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York, NY, USA: ACM, 2009, pp. 229-240.
-
(2009)
Proc. of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 229-240
-
-
Tallent, N.R.1
Mellor-Crummey, J.2
-
25
-
-
67650837951
-
Binary analysis for measurement and attribution of program performance
-
New York, NY, USA: ACM
-
N. R. Tallent, J. Mellor-Crummey, and M. W. Fagan, "Binary analysis for measurement and attribution of program performance," in Proc. of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation. New York, NY, USA: ACM, 2009, pp. 441-452.
-
(2009)
Proc. of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation
, pp. 441-452
-
-
Tallent, N.R.1
Mellor-Crummey, J.2
Fagan, M.W.3
-
26
-
-
74049095154
-
Diagnosing performance bottlenecks in emerging petascale applications
-
New York, NY, USA: ACM
-
N. R. Tallent, J. M. Mellor-Crummey, L. Adhianto, M. W. Fagan, and M. Krentel, "Diagnosing performance bottlenecks in emerging petascale applications," in Proc. of the 2009 ACM/IEEE Conference on Supercomputing. New York, NY, USA: ACM, 2009, pp. 1-11.
-
(2009)
Proc. of the 2009 ACM/IEEE Conference on Supercomputing
, pp. 1-11
-
-
Tallent, N.R.1
Mellor-Crummey, J.M.2
Adhianto, L.3
Fagan, M.W.4
Krentel, M.5
-
27
-
-
77957574504
-
Analyzing lock contention in multithreaded applications
-
New York, NY, USA: ACM
-
N. R. Tallent, J. M. Mellor-Crummey, and A. Porterfield, "Analyzing lock contention in multithreaded applications," in Proc. of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York, NY, USA: ACM, 2010, pp. 269-280.
-
(2010)
Proc. of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, pp. 269-280
-
-
Tallent, N.R.1
Mellor-Crummey, J.M.2
Porterfield, A.3
-
28
-
-
0033691589
-
Performance analysis of distributed applications using automatic classification of communication inefficiencies
-
New York, NY, USA: ACM
-
J. Vetter, "Performance analysis of distributed applications using automatic classification of communication inefficiencies," in Proc. of the 14th International Conference on Supercomputing. New York, NY, USA: ACM, 2000, pp. 245-254.
-
(2000)
Proc. of the 14th International Conference on Supercomputing
, pp. 245-254
-
-
Vetter, J.1
-
29
-
-
0029697880
-
Waiting time analysis and performance visualization in Carnival
-
New York, NY, USA: ACM
-
J. Wagner Meira, T. J. LeBlanc, and A. Poulos, "Waiting time analysis and performance visualization in Carnival," in Proc. of the 1996 SIGMETRICS Symposium on Parallel and Distributed Tools. New York, NY, USA: ACM, 1996, pp. 1-10.
-
(1996)
Proc. of the 1996 SIGMETRICS Symposium on Parallel and Distributed Tools
, pp. 1-10
-
-
Meira, J.W.1
LeBlanc, T.J.2
Poulos, A.3
-
30
-
-
34547616149
-
Automatic analysis of inefficiency patterns in parallel applications
-
DOI 10.1002/cpe.1128
-
F. Wolf, B. Mohr, J. Dongarra, and S. Moore, "Automatic analysis of inefficiency patterns in parallel applications," Concurrency and Computation: Practice and Experience, vol. 19, pp. 1481-1496, Feb. 2007, Special Issue: Automatic Performance Analysis. (Pubitemid 47204258)
-
(2007)
Concurrency and Computation: Practice and Experience
, vol.19
, Issue.11
, pp. 1481-1496
-
-
Wolf, F.1
Mohr, B.2
Dongarra, J.3
Moore, S.4
-
31
-
-
48849092945
-
Performance measurement and analysis of large-scale parallel applications on leadership computing systems
-
B. J. N. Wylie, M. Geimer, and F. Wolf, "Performance measurement and analysis of large-scale parallel applications on leadership computing systems," Sci. Program., vol. 16, no. 2-3, pp. 167-181, 2008.
-
(2008)
Sci. Program.
, vol.16
, Issue.2-3
, pp. 167-181
-
-
Wylie, B.J.N.1
Geimer, M.2
Wolf, F.3