Volume 23, Issue 3, 1997, Pages 336-361

Compiler Blockability of Dense Matrix Factorizations

Author keywords

D.3.4 Programming Languages: Processors, compilers; F.2.1 Analysis of Algorithms and Problem Complexity: Numerical Algorithms and Problems, computations on matrices; Optimization

Indexed keywords

ALGORITHMS; CODES (SYMBOLS); MATRIX ALGEBRA; OPTIMIZATION

EID: 0031223129     PISSN: 00983500     EISSN: None     Source Type: Journal    
DOI: 10.1145/275323.275325     Document Type: Article
Times cited: 17

References (37)
  • 1
    • 0023438847
    • Automatic translation of FORTRAN programs to vector form
    • ALLEN, R. AND KENNEDY, K. 1987. Automatic translation of FORTRAN programs to vector form. ACM Trans. Program. Lang. Syst. 9, 4 (Oct.), 491-542.
    • (1987) ACM Trans. Program. Lang. Syst. , vol.9 , Issue.4 OCT , pp. 491-542
    • Allen, R.1    Kennedy, K.2
  • 3
    • 0345897544
    • PHiPAC: A portable, high-performance, ANSI C coding methodology and its application to matrix multiply
    • University of Tennessee, Knoxville, TN.
    • BILMES, J., ASANOVIĆ, K., DEMMEL, J., LAM, D., AND CHIN, C. W. 1996. PHiPAC: A portable, high-performance, ANSI C coding methodology and its application to matrix multiply. LAPACK Working Note 111, University of Tennessee, Knoxville, TN.
    • (1996) LAPACK Working Note 111
    • Bilmes, J.1    Asanović, K.2    Demmel, J.3    Lam, D.4    Chin, C.W.5
  • 4
    • 0345897543
    • Analysis of interprocedural side effects in a parallel programming environment
    • (Athens, Greece, June 1987), E. N. Houstis, T. S. Papatheodorou, and C. D. Polychronopoulos, Eds. Lecture Notes in Computer Science, Springer-Verlag New York, Inc., New York, NY
    • CALLAHAN, D. AND KENNEDY, K. 1988. Analysis of interprocedural side effects in a parallel programming environment. In Proceedings of the 1st International Conference on Supercomputing (Athens, Greece, June 1987), E. N. Houstis, T. S. Papatheodorou, and C. D. Polychronopoulos, Eds. Lecture Notes in Computer Science, vol. 297. Springer-Verlag New York, Inc., New York, NY, 138-171.
    • (1988) Proceedings of the 1st International Conference on Supercomputing , vol.297 , pp. 138-171
    • Callahan, D.1    Kennedy, K.2
  • 6
  • 7
    • 84964748976
    • Compiler blockability of numerical algorithms
    • (Minneapolis, MN, Nov. 16-20, 1992), R. Werner, Ed. IEEE Computer Society Press, Los Alamitos, CA
    • CARR, S. AND KENNEDY, K. 1992. Compiler blockability of numerical algorithms. In Proceedings of Supercomputing '92 (Minneapolis, MN, Nov. 16-20, 1992), R. Werner, Ed. IEEE Computer Society Press, Los Alamitos, CA, 114-124.
    • (1992) Proceedings of Supercomputing '92 , pp. 114-124
    • Carr, S.1    Kennedy, K.2
  • 8
    • 0028549474
    • Improving the ratio of memory operations to floating-point operations in loops
    • CARR, S. AND KENNEDY, K. 1994. Improving the ratio of memory operations to floating-point operations in loops. ACM Trans. Program. Lang. Syst. 16, 6 (Nov.), 1768-1810.
    • (1994) ACM Trans. Program. Lang. Syst. , vol.16 , Issue.6 NOV , pp. 1768-1810
    • Carr, S.1    Kennedy, K.2
  • 10
    • 84976745804
    • Tile size selection using cache organization and data layout
    • COLEMAN, S. AND MCKINLEY, K. S. 1995. Tile size selection using cache organization and data layout. SIGPLAN Not. 30, 6 (June), 279-290.
    • (1995) SIGPLAN Not. , vol.30 , Issue.6 JUNE , pp. 279-290
    • Coleman, S.1    McKinley, K.S.2
  • 14
    • 0021310295
    • Implementing linear algebra algorithms for dense matrices on a vector pipeline machine
    • DONGARRA, J. J., GUSTAVSON, F. G., AND KARP, A. 1984. Implementing linear algebra algorithms for dense matrices on a vector pipeline machine. SIAM Rev. 26, 1 (Jan.), 91-112.
    • (1984) SIAM Rev. , vol.26 , Issue.1 JAN , pp. 91-112
    • Dongarra, J.J.1    Gustavson, F.G.2    Karp, A.3
  • 16
    • 0030688479
    • Auto-blocking matrix multiplication, or tracking BLAS3 performance from source code
    • FRENS, J. AND WISE, D. S. 1997. Auto-blocking matrix multiplication, or tracking BLAS3 performance from source code. SIGPLAN Not. 32, 7 (July), 206-216.
    • (1997) SIGPLAN Not. , vol.32 , Issue.7 JULY , pp. 206-216
    • Frens, J.1    Wise, D.S.2
  • 17
    • 0003100264
    • Parallel algorithms for dense linear algebra computations
    • GALLIVAN, K. A., PLEMMONS, R. J., AND SAMEH, A. H. 1990. Parallel algorithms for dense linear algebra computations. SIAM Rev. 32, 1 (Mar.), 54-135.
    • (1990) SIAM Rev. , vol.32 , Issue.1 MAR , pp. 54-135
    • Gallivan, K.A.1    Plemmons, R.J.2    Sameh, A.H.3
  • 18
    • 84976790479
    • Practical dependence testing
    • GOFF, G., KENNEDY, K., AND TSENG, C.-W. 1991. Practical dependence testing. SIGPLAN Not. 26, 6 (June), 15-29.
    • (1991) SIGPLAN Not. , vol.26 , Issue.6 JUNE , pp. 15-29
    • Goff, G.1    Kennedy, K.2    Tseng, C.-W.3
  • 20
    • 0026186967
    • An implementation of interprocedural bounded regular section analysis
    • HAVLAK, P. AND KENNEDY, K. 1991. An implementation of interprocedural bounded regular section analysis. IEEE Trans. Parallel Distrib. Syst. 2, 3 (July), 350-360.
    • (1991) IEEE Trans. Parallel Distrib. Syst. , vol.2 , Issue.3 JULY , pp. 350-360
    • Havlak, P.1    Kennedy, K.2
  • 22
    • 0040831411
    • GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark
    • Dept. of Computing Science, University of Umeå, Umeå, Sweden. Also available as LAPACK Working Note 107
    • KÅGSTRÖM, B., LING, P., AND VAN LOAN, C. 1995a. GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark. Tech. Rep. UMINF-95.18, Dept. of Computing Science, University of Umeå, Umeå, Sweden. Also available as LAPACK Working Note 107.
    • (1995) Tech. Rep. UMINF-95.18
    • Kågström, B.1    Ling, P.2    Van Loan, C.3
  • 23
    • 0040831411
    • GEMM-based level 3 BLAS: Installation, tuning, and use of the model implementations and the performance evaluation benchmark
    • Dept. of Computing Science, University of Umeå, Umeå, Sweden. Also available as LAPACK Working Note 108
    • KÅGSTRÖM, B., LING, P., AND VAN LOAN, C. 1995b. GEMM-based level 3 BLAS: Installation, tuning, and use of the model implementations and the performance evaluation benchmark. Tech. Rep. UMINF 95.19, Dept. of Computing Science, University of Umeå, Umeå, Sweden. Also available as LAPACK Working Note 108.
    • (1995) Tech. Rep. UMINF 95.19
    • Kågström, B.1    Ling, P.2    Van Loan, C.3
  • 24
    • 0028459839
    • DXML: A high-performance scientific subroutine library
    • KAMATH, C., HO, R., AND MANLEY, D. P. 1994. DXML: A high-performance scientific subroutine library. Digital Tech. J. 6, 3 (Summer), 44-56.
    • (1994) Digital Tech. J. , vol.6 , Issue.3 SUMMER , pp. 44-56
    • Kamath, C.1    Ho, R.2    Manley, D.P.3
  • 26
    • 0042650298
    • Software pipelining: An effective scheduling technique for VLIW machines
    • (Atlanta, Georgia, June 22-24, 1988), R. L. Wexelblat, Ed. ACM Press, New York, NY
    • LAM, M. 1988. Software pipelining: An effective scheduling technique for VLIW machines. In Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation (Atlanta, Georgia, June 22-24, 1988), R. L. Wexelblat, Ed. ACM Press, New York, NY, 318-328.
    • (1988) Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation , pp. 318-328
    • Lam, M.1
  • 27
    • 0347788751
    • The cache performance and optimizations of blocked algorithms
    • LAM, M. D., ROTHBERG, E. E., AND WOLF, M. E. 1991. The cache performance and optimizations of blocked algorithms. SIGARCH Comput. Archit. News 19, 2 (Apr.), 63-74.
    • (1991) SIGARCH Comput. Archit. News , vol.19 , Issue.2 APR , pp. 63-74
    • Lam, M.D.1    Rothberg, E.E.2    Wolf, M.E.3
  • 30
    • 0026407190
    • A comparative study of automatic vectorizing compilers
    • LEVINE, D., CALLAHAN, D., AND DONGARRA, J. 1991. A comparative study of automatic vectorizing compilers. Parallel Comput. 17, 1223-1244.
    • (1991) Parallel Comput. , vol.17 , pp. 1223-1244
    • Levine, D.1    Callahan, D.2    Dongarra, J.3
  • 32
    • 0026966702
    • Register allocation for software pipelined loops
    • RAU, B. R., LEE, M., TIRUMALAI, P. P., AND SCHLANSKER, M. S. 1992. Register allocation for software pipelined loops. SIGPLAN Not. 27, 7 (July), 283-299.
    • (1992) SIGPLAN Not. , vol.27 , Issue.7 JULY , pp. 283-299
    • Rau, B.R.1    Lee, M.2    Tirumalai, P.P.3    Schlansker, M.S.4
  • 33
    • 0031140581
    • Automatic selection of high order transformations in the IBM XL Fortran compilers
    • SARKAR, V. 1997. Automatic selection of high order transformations in the IBM XL Fortran compilers. IBM J. Res. Dev. 41, 3 (May).
    • (1997) IBM J. Res. Dev. , vol.41 , Issue.3 MAY
    • Sarkar, V.1
  • 37
    • 84976827033
    • A data locality optimizing algorithm
    • WOLF, M. E. AND LAM, M. S. 1991. A data locality optimizing algorithm. SIGPLAN Not. 26, 6 (June), 30-44.
    • (1991) SIGPLAN Not. , vol.26 , Issue.6 JUNE , pp. 30-44
    • Wolf, M.E.1    Lam, M.S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.