-
4
-
-
4544221105
-
Finding effective compilation sequences
-
L. Almagor, K. D. Cooper, A. Grosul, T. J. Harvey, S. W. Reeves, D. Subramanian, L. Torczon, and T. Waterman. Finding effective compilation sequences. In LCTES '04: Proceedings of the 2004 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems, pages 231-239, 2004.
-
(2004)
LCTES '04: Proceedings of the 2004 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems
, pp. 231-239
-
-
Almagor, L.1
Cooper, K.D.2
Grosul, A.3
Harvey, T.J.4
Reeves, S.W.5
Subramanian, D.6
Torczon, L.7
Waterman, T.8
-
6
-
-
85117163262
-
A high-level approach to synthesis of high-performance codes for quantum chemistry
-
November
-
G. Baumgartner, D. Bernholdt, D. Cociorva, R. Harrison, S. Hirata, C. Lam, M. Nooijen, R. Pitzer, J. Ramanujam, and P. Sadayappan. A High-Level Approach to Synthesis of High-Performance Codes for Quantum Chemistry. In Proc. of Supercomputing 2002, November 2002.
-
(2002)
Proc. of Supercomputing 2002
-
-
Baumgartner, G.1
Bernholdt, D.2
Cociorva, D.3
Harrison, R.4
Hirata, S.5
Lam, C.6
Nooijen, M.7
Pitzer, R.8
Ramanujam, J.9
Sadayappan, P.10
-
7
-
-
0032676178
-
A tile selection algorithm for data locality and cache interference
-
ACM Press
-
J. Chame and S. Moon. A tile selection algorithm for data locality and cache interference. In ICS '99, pages 492-499. ACM Press, 1999.
-
(1999)
ICS '99
, pp. 492-499
-
-
Chame, J.1
Moon, S.2
-
8
-
-
0036041078
-
Space-time trade-off optimization for a class of electronic structure calculations
-
D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam, M. Nooijen, D. Bernholdt, and R. Harrison. Space-Time Trade-Off Optimization for a Class of Electronic Structure Calculations. In Proc. of ACM SIGPLAN PLDI, pages 177-186, 2002.
-
(2002)
Proc. of ACM SIGPLAN PLDI
, pp. 177-186
-
-
Cociorva, D.1
Baumgartner, G.2
Lam, C.3
Sadayappan, P.4
Ramanujam, J.5
Nooijen, M.6
Bernholdt, D.7
Harrison, R.8
-
9
-
-
84947277042
-
Global communication optimization for tensor contraction expressions under memory constraints
-
D. Cociorva, X. Gao, S. Krishnan, G. Baumgartner, C. Lam, P. Sadayappan, and J. Ramanujam. Global Communication Optimization for Tensor Contraction Expressions under Memory Constraints. In Proc. of IPDPS, 2003.
-
(2003)
Proc. of IPDPS
-
-
Cociorva, D.1
Gao, X.2
Krishnan, S.3
Baumgartner, G.4
Lam, C.5
Sadayappan, P.6
Ramanujam, J.7
-
10
-
-
84947707755
-
Towards automatic synthesis of high-performance codes for electronic structure calculations: Data locality optimization
-
D. Cociorva, J. Wilkins, G. Baumgartner, P. Sadayappan, J. Ramanujam, M. Nooijen, D. E. Bernholdt, and R. Harrison. Towards Automatic Synthesis of High-Performance Codes for Electronic Structure Calculations: Data Locality Optimization. In Proc. of HiPC, volume 2228, pages 237-248, 2001.
-
(2001)
Proc. of HiPC
, vol.2228
, pp. 237-248
-
-
Cociorva, D.1
Wilkins, J.2
Baumgartner, G.3
Sadayappan, P.4
Ramanujam, J.5
Nooijen, M.6
Bernholdt, D.E.7
Harrison, R.8
-
11
-
-
85009352487
-
Tile size selection using cache organization and data layout
-
S. Coleman and K. S. McKinley. Tile Size Selection Using Cache Organization and Data Layout. In Proc. of PLDI, 1995.
-
(1995)
Proc. of PLDI
-
-
Coleman, S.1
McKinley, K.S.2
-
12
-
-
0002419099
-
An introduction to coupled cluster theory for computational chemists
-
K. Lipkowitz and D. Boyd, editors, John Wiley & Sons, Ltd.
-
T. Crawford and H. F. Schaefer III. An Introduction to Coupled Cluster Theory for Computational Chemists. In K. Lipkowitz and D. Boyd, editors, Reviews in Computational Chemistry, volume 14, pages 33-136. John Wiley & Sons, Ltd., 2000.
-
(2000)
Reviews in Computational Chemistry
, vol.14
, pp. 33-136
-
-
Crawford, T.1
Schaefer III, H.F.2
-
13
-
-
1642502420
-
Improving effective bandwidth through compiler enhancement of global cache reuse
-
C. Ding and K. Kennedy. Improving effective bandwidth through compiler enhancement of global cache reuse. J. Parallel Distrib. Comput., 64(1):108-134, 2004.
-
(2004)
J. Parallel Distrib. Comput.
, vol.64
, Issue.1
, pp. 108-134
-
-
Ding, C.1
Kennedy, K.2
-
15
-
-
0031611719
-
Precise miss analysis for program transformations with caches of arbitrary associativity
-
S. Ghosh, M. Martonosi, and S. Malik. Precise miss analysis for program transformations with caches of arbitrary associativity. In Proc. of ASPLOS, pages 228-239, 1998.
-
(1998)
Proc. of ASPLOS
, pp. 228-239
-
-
Ghosh, S.1
Martonosi, M.2
Malik, S.3
-
16
-
-
0345566357
-
Tensor contraction engine: Abstraction and automated parallel implementation of configuration-interaction, coupled-cluster and many-body perturbation theories
-
S. Hirata. Tensor Contraction Engine: Abstraction and automated parallel implementation of configuration-interaction, coupled-cluster and many-body perturbation theories. J. Phys. Chem. A, 107:9887-9897, 2003.
-
(2003)
J. Phys. Chem. A
, vol.107
, pp. 9887-9897
-
-
Hirata, S.1
-
17
-
-
3142692593
-
Higher-order equation-of-motion coupled-cluster methods
-
S. Hirata. Higher-order equation-of-motion coupled-cluster methods. J. Chem. Phys., 121:51, 2004.
-
(2004)
J. Chem. Phys.
, vol.121
, pp. 51
-
-
Hirata, S.1
-
18
-
-
33845391019
-
-
Intel Math Kernel Library. http://www.intel.com/software/products/mkl/features.htm.
-
-
-
-
19
-
-
84858693885
-
Increasing temporal locality with skewing and recursive blocking
-
G. Jin, J. Mellor-Crummey, and R. Fowler. Increasing temporal locality with skewing and recursive blocking. In Supercomputing '01, pages 43-43, 2001.
-
(2001)
Supercomputing '01
, pp. 43-43
-
-
Jin, G.1
Mellor-Crummey, J.2
Fowler, R.3
-
21
-
-
0035707468
-
Telescoping languages: A strategy for automatic generation of scientific problem-solving systems from annotated libraries
-
K. Kennedy, B. Broom, K. D. Cooper, J. Dongarra, R. J. Fowler, D. Gannon, S. L. Johnsson, J. M. Mellor-Crummey, and L. Torczon. Telescoping Languages: A Strategy for Automatic Generation of Scientific Problem-Solving Systems from Annotated Libraries. JPDC, 61(12):1803-1826, 2001.
-
(2001)
JPDC
, vol.61
, Issue.12
, pp. 1803-1826
-
-
Kennedy, K.1
Broom, B.2
Cooper, K.D.3
Dongarra, J.4
Fowler, R.J.5
Gannon, D.6
Johnsson, S.L.7
Mellor-Crummey, J.M.8
Torczon, L.9
-
22
-
-
0001465739
-
Maximizing loop parallelism and improving data locality via loop fusion and distribution
-
K. Kennedy and K. S. McKinley. Maximizing loop parallelism and improving data locality via loop fusion and distribution. In Proc. of LCPC, pages 301-320, 1993.
-
(1993)
Proc. of LCPC
, pp. 301-320
-
-
Kennedy, K.1
McKinley, K.S.2
-
23
-
-
12444319714
-
-
Technical Report OSU-CIRSC-9/03-T52, The Ohio State University, Columbus, OH, September
-
S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C. Lam, and P. Sadayappan. On Efficient Out-of-core Matrix Transposition. Technical Report OSU-CIRSC-9/03-T52, The Ohio State University, Columbus, OH, September 2003.
-
(2003)
On Efficient Out-of-core Matrix Transposition
-
-
Krishnamoorthy, S.1
Baumgartner, G.2
Cociorva, D.3
Lam, C.4
Sadayappan, P.5
-
24
-
-
26444481049
-
Data locality optimization for synthesis of efficient out-of-core algorithms
-
S. Krishnan, S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C.-C. Lam, P. Sadayappan, J. Ramanujam, D. E. Bernholdt, and V. Choppella. Data Locality Optimization for Synthesis of Efficient Out-of-Core Algorithms. In Proc. of HiPC, 2003.
-
(2003)
Proc. of HiPC
-
-
Krishnan, S.1
Krishnamoorthy, S.2
Baumgartner, G.3
Cociorva, D.4
Lam, C.-C.5
Sadayappan, P.6
Ramanujam, J.7
Bernholdt, D.E.8
Choppella, V.9
-
26
-
-
0000523695
-
On optimizing a class of multi-dimensional loops with reductions for parallel execution
-
C. Lam, P. Sadayappan, and R. Wenger. On Optimizing a Class of Multi-Dimensional Loops with Reductions for Parallel Execution. Parallel Processing Letters, 7(2):157-168, 1997.
-
(1997)
Parallel Processing Letters
, vol.7
, Issue.2
, pp. 157-168
-
-
Lam, C.1
Sadayappan, P.2
Wenger, R.3
-
30
-
-
17644395320
-
Blocking and array contraction across arbitrarily nested loops using affine partitioning
-
ACM Press
-
A. W. Lim, S.-W. Liao, and M. S. Lam. Blocking and array contraction across arbitrarily nested loops using affine partitioning. In Proc. of the Eighth ACM SIGPLAN PPoPP, pages 103-112. ACM Press, 2001.
-
(2001)
Proc. of the Eighth ACM SIGPLAN PPoPP
, pp. 103-112
-
-
Lim, A.W.1
Liao, S.-W.2
Lam, M.S.3
-
31
-
-
0000533836
-
Benchmark studies on small molecules
-
P. v. R. Schleyer, P. R. Schreiner, N. L. Allinger, T. Clark, J. Gasteiger, P. Kollman, and H. F. Schaefer III, editors, John Wiley & Sons, Ltd.
-
J. M. L. Martin. Benchmark Studies on Small Molecules. In P. v. R. Schleyer, P. R. Schreiner, N. L. Allinger, T. Clark, J. Gasteiger, P. Kollman, and H. F. Schaefer III, editors, Encyclopedia of Computational Chemistry, volume 1, pages 115-128. John Wiley & Sons, Ltd., 1998.
-
(1998)
Encyclopedia of Computational Chemistry
, vol.1
, pp. 115-128
-
-
Martin, J.M.L.1
-
32
-
-
0032308685
-
Quantifying the multi-level nature of tiling interactions
-
June
-
N. Mitchell, K. Högstedt, L. Carter, and J. Ferrante. Quantifying the multi-level nature of tiling interactions. Intl. Journal of Parallel Programming, 26(6):641-670, June 1998.
-
(1998)
Intl. Journal of Parallel Programming
, vol.26
, Issue.6
, pp. 641-670
-
-
Mitchell, N.1
Högstedt, K.2
Carter, L.3
Ferrante, J.4
-
33
-
-
0035939372
-
Towards an internally contracted multireference CC method: Automated implementation of open-shell CCSD method for doublet states
-
M. Nooijen and V. L. Lotrich. Towards an internally contracted multireference CC method: Automated implementation of open-shell CCSD method for doublet states. J. Mol. Struct.-THEOCHEM, 547:253-267, 2001.
-
(2001)
J. Mol. Struct.-THEOCHEM
, vol.547
, pp. 253-267
-
-
Nooijen, M.1
Lotrich, V.L.2
-
34
-
-
34548789419
-
Better tiling and array contraction for compiling scientific programs
-
G. Pike and P. N. Hilfinger. Better tiling and array contraction for compiling scientific programs. In SC '02, pages 1-12, 2002.
-
(2002)
SC '02
, pp. 1-12
-
-
Pike, G.1
Hilfinger, P.N.2
-
35
-
-
33845574641
-
Tiling optimizations for 3D scientific computations
-
G. Rivera and C.-W. Tseng. Tiling optimizations for 3D scientific computations. In Supercomputing '00, page 32, 2000.
-
(2000)
Supercomputing '00
, pp. 32
-
-
Rivera, G.1
Tseng, C.-W.2
-
36
-
-
33746294217
-
Cache miss characterization and data locality optimization for imperfectly nested loops on shared memory multiprocessors
-
S. K. Sahoo, R. Panuganti, S. Krishnamoorthy, and P. Sadayappan. Cache Miss Characterization and Data Locality Optimization for Imperfectly Nested Loops on Shared Memory Multiprocessors. In Proc. of IPDPS, 2005.
-
(2005)
Proc. of IPDPS
-
-
Sahoo, S.K.1
Panuganti, R.2
Krishnamoorthy, S.3
Sadayappan, P.4
-
37
-
-
33845466059
-
Loop fusion for data locality and parallelism
-
S. Singhai and K. S. McKinley. Loop Fusion for Data Locality and Parallelism. In Proc. of MASPLAS, 1996.
-
(1996)
Proc. of MASPLAS
-
-
Singhai, S.1
McKinley, K.S.2
-
38
-
-
0032635362
-
New tiling techniques to improve cache temporal locality
-
Y. Song and Z. Li. New Tiling Techniques to Improve Cache Temporal Locality. In Proc. of ACM SIGPLAN PLDI, 1999.
-
(1999)
Proc. of ACM SIGPLAN PLDI
-
-
Song, Y.1
Li, Z.2