SCOPUS Information Retrieval Platform

ACM Transactions on Mathematical Software

Volume 23, Issue 3, 1997, Pages 336-361

Compiler Blockability of Dense Matrix Factorizations

a Michigan Technological University ^* (United States)

b ARGONNE NATIONAL LABORATORY (United States)

Author keywords

D.3.4 Programming Languages : Processors compilers; F.2.1 Analysis of Algorithms and Problem Complexity : Numerical Algorithms and Problems computations on matrices; Optimization

Indexed keywords

ALGORITHMS; CODES (SYMBOLS); MATRIX ALGEBRA; OPTIMIZATION;

BASIC LINEAR ALGEBRA SUBPROGRAMS (BLAS); DENSE MATRIX FACTORIZATIONS;

PROGRAM COMPILERS;

EID: 0031223129 PISSN: 00983500 EISSN: None Source Type: Journal
DOI: 10.1145/275323.275325 Document Type: Article

Times cited : (17)

References (37)

1
- 0023438847
- Automatic translation of FORTRAN programs to vector form
- ALLEN, R. AND KENNEDY, K. 1987. Automatic translation of FORTRAN programs to vector form. ACM Trans. Program. Lang. Syst. 9, 4 (Oct.), 491-542.
- (1987) ACM Trans. Program. Lang. Syst. , vol.9 , Issue.4 OCT , pp. 491-542
- Allen, R.¹ Kennedy, K.²

2
- 0003706460
- Society for Industrial and Applied Mathematics, Philadelphia, PA
- ANDERSON, E., BAI, Z., BISCHOF, C. H., DEMMEL, J., DONGARRA, J. J., DU CROZ, J., GREENBAUM, A., HAMMARLING, S., MCKENNEY, A., OSTROUCHOV, S., AND SORENSEN, D. C. 1995. LAPACK User's Guide. 2nd ed. Society for Industrial and Applied Mathematics, Philadelphia, PA.
- (1995) LAPACK User's Guide. 2nd Ed.
- Anderson, E.¹ Bai, Z.² Bischof, C.H.³ Demmel, J.⁴ Dongarra, J.J.⁵ Du Croz, J.⁶ Greenbaum, A.⁷ Hammarling, S.⁸ Mckenney, A.⁹ Ostrouchov, S.¹⁰ Sorensen, D.C.¹¹

3
- 0345897544
- PHiPAC: A portable, high-performance, ANSI C coding methodology and its application to matrix multiply
- University of Tennessee, Knoxville, TN.
- BILMES, J., ASANOVIĆ, K., DEMMEL, J., LAM, D., AND CHIN, C. W. 1996. PHiPAC: A portable, high-performance, ANSI C coding methodology and its application to matrix multiply. LAPACK Working Note 111, University of Tennessee, Knoxville, TN.
- (1996) LAPACK Working Note 111
- Bilmes, J.¹ Asanović, K.² Demmel, J.³ Lam, D.⁴ Chin, C.W.⁵

4
- 0345897543
- Analysis of interprocedural side effects in a parallel programming environment
- (Athens, Greece, June 1987), E. N. Houstis, T. S. Papatheodorou, and C. D. Polychronopoulos, Eds. Lecture Notes in Computer Science, Springer-Verlag New York, Inc., New York, NY
- CALLAHAN, D. AND KENNEDY, K. 1988. Analysis of interprocedural side effects in a parallel programming environment. In Proceedings of the 1st International Conference on Supercomputing (Athens, Greece, June 1987), E. N. Houstis, T. S. Papatheodorou, and C. D. Polychronopoulos, Eds. Lecture Notes in Computer Science, vol. 297. Springer-Verlag New York, Inc., New York, NY, 138-171.
- (1988) Proceedings of the 1st International Conference on Supercomputing , vol.297 , pp. 138-171
- Callahan, D.¹ Kennedy, K.²

5
- 0003303385
- ParaScope: A parallel programming environment
- (Athens, Greece, June 8-12, 1987). ACM Press, New York, NY.
- CALLAHAN, K., COOPER, R., KENNEDY, K., AND TORCZON, L. 1987. ParaScope: A parallel programming environment. In Proceedings of the 1st International Conference on Supercomputing (Athens, Greece, June 8-12, 1987). ACM Press, New York, NY.
- (1987) Proceedings of the 1st International Conference on Supercomputing
- Callahan, K.¹ Cooper, R.² Kennedy, K.³ Torczon, L.⁴

6
- 0012951882
- Ph.D. thesis, Rice University, Houston, TX.
- CARR, S. M. 1992. Memory-hierarchy management. Ph.D. thesis, Rice University, Houston, TX.
- (1992) Memory-hierarchy Management
- Carr, S.M.¹

7
- 84964748976
- Compiler blockability of numerical algorithms
- (Minneapolis, MN, Nov. 16-20, 1992), R. Werner, Ed. IEEE Computer Society Press, Los Alamitos, CA
- CARR, S. AND KENNEDY, K. 1992. Compiler blockability of numerical algorithms. In Proceedings of Supercomputing '92 (Minneapolis, MN, Nov. 16-20, 1992), R. Werner, Ed. IEEE Computer Society Press, Los Alamitos, CA, 114-124.
- (1992) Proceedings of Supercomputing '92 , pp. 114-124
- Carr, S.¹ Kennedy, K.²

8
- 0028549474
- Improving the ratio of memory operations to floating-point operations in loops
- CARR, S. AND KENNEDY, K. 1994. Improving the ratio of memory operations to floating-point operations in loops. ACM Trans. Program. Lang. Syst. 16, 6 (Nov.), 1768-1810.
- (1994) ACM Trans. Program. Lang. Syst. , vol.16 , Issue.6 NOV , pp. 1768-1810
- Carr, S.¹ Kennedy, K.²

9
- 84875571223
- Improving software pipelining with unroll-and-jam
- (Maui, Hawaii, Jan. 1996). IEEE Computer Society Press, Los Alamitos, CA.
- CARR, S., DING, C., AND SWEANY, P. 1996. Improving software pipelining with unroll-and-jam. In Proceedings of the 29th Annual Hawaii International Conference on System Sciences (Maui, Hawaii, Jan. 1996). IEEE Computer Society Press, Los Alamitos, CA.
- (1996) Proceedings of the 29th Annual Hawaii International Conference on System Sciences
- Carr, S.¹ Ding, C.² Sweany, P.³

10
- 84976745804
- Tile size selection using cache organization and data layout
- COLEMAN, S. AND MCKINLEY, K. S. 1995. Tile size selection using cache organization and data layout. SIGPLAN Not. 30, 6 (June), 279-290.
- (1995) SIGPLAN Not. , vol.30 , Issue.6 JUNE , pp. 279-290
- Coleman, S.¹ Mckinley, K.S.²

11
- 0025402476
- A set of level 3 Basic Linear Algebra Subprograms
- DONGARRA, J. J., DU CROZ, J., HAMMARLING, S., AND DUFF, I. 1990. A set of level 3 Basic Linear Algebra Subprograms. ACM Trans. Math. Softw. 16, 1 (Mar.), 1-17.
- (1990) ACM Trans. Math. Softw. , vol.16 , Issue.1 MAR , pp. 1-17
- Dongarra, J.J.¹ Du Croz, J.² Hammarling, S.³ Duff, I.⁴

12
- 0023983122
- An extended set of FORTRAN Basic Linear Algebra Subprograms
- DONGARRA, J. J., DU CROZ, J., HAMMARLING, S., AND HANSON, R. J. 1988. An extended set of FORTRAN Basic Linear Algebra Subprograms. ACM Trans. Math. Softw. 14, 1 (Mar.), 1-17.
- (1988) ACM Trans. Math. Softw. , vol.14 , Issue.1 MAR , pp. 1-17
- Dongarra, J.J.¹ Du Croz, J.² Hammarling, S.³ Hanson, R.J.⁴

13
- 0003793981
- Society for Industrial and Applied Mathematics, Philadelphia, PA.
- DONGARRA, J. J., DUFF, I. S., SORENSEN, D. C., AND VAN DER VORST, H. A. 1991. Solving Linear Systems on Vector and Shared Memory Computers. Society for Industrial and Applied Mathematics, Philadelphia, PA.
- (1991) Solving Linear Systems on Vector and Shared Memory Computers
- Dongarra, J.J.¹ Duff, I.S.² Sorensen, D.C.³ Van Der Vorst, H.A.⁴

14
- 0021310295
- Implementing linear algebra algorithms for dense matrices on a vector pipeline machine
- DONGARRA, J. J., GUSTAVSON, F. G., AND KARP, A. 1984. Implementing linear algebra algorithms for dense matrices on a vector pipeline machine. SIAM Rev. 26, 1 (Jan.), 91-112.
- (1984) SIAM Rev. , vol.26 , Issue.1 JAN , pp. 91-112
- Dongarra, J.J.¹ Gustavson, F.G.² Karp, A.³

15
- 0003555195
- Society for Industrial and Applied Mathematics, Philadelphia, PA.
- DONGARRA, J. J., MOLER, C. B., BUNCH, J. R., AND STEWART, G. W. 1979. LINPACK User's Guide. Society for Industrial and Applied Mathematics, Philadelphia, PA.
- (1979) LINPACK User's Guide
- Dongarra, J.J.¹ Moler, C.B.² Bunch, J.R.³ Stewart, G.W.⁴

16
- 0030688479
- Auto-blocking matrix multiplication, or tracking BLAS3 performance from source code
- FRENS, J. AND WISE, D. S. 1997. Auto-blocking matrix multiplication, or tracking BLAS3 performance from source code. SIGPLAN Not. 32, 7 (July), 206-216.
- (1997) SIGPLAN Not. , vol.32 , Issue.7 JULY , pp. 206-216
- Frens, J.¹ Wise, D.S.²

17
- 0003100264
- Parallel algorithms for dense linear algebra computations
- GALLIVAN, K. A., PLEMMONS, R. J., AND SAMEH, A. H. 1990. Parallel algorithms for dense linear algebra computations. SIAM Rev. 32, 1 (Mar.), 54-135.
- (1990) SIAM Rev. , vol.32 , Issue.1 MAR , pp. 54-135
- Gallivan, K.A.¹ Plemmons, R.J.² Sameh, A.H.³

18
- 84976790479
- Practical dependence testing
- GOFF, G., KENNEDY, K., AND TSENG, C.-W. 1991. Practical dependence testing. SIGPLAN Not. 26, 6 (June), 15-29.
- (1991) SIGPLAN Not. , vol.26 , Issue.6 JUNE , pp. 15-29
- Goff, G.¹ Kennedy, K.² Tseng, C.-W.³

19
- 0004236492
- Johns Hopkins University Press, Baltimore, MD.
- GOLUB, G. H. AND VAN LOAN, C. F. 1996. Matrix Computations. 3rd ed. Johns Hopkins University Press, Baltimore, MD.
- (1996) Matrix Computations. 3rd Ed.
- Golub, G.H.¹ Van Loan, C.F.²

20
- 0026186967
- An implementation of interprocedural bounded regular section analysis
- HAVLAK, P. AND KENNEDY, K. 1991. An implementation of interprocedural bounded regular section analysis. IEEE Trans. Parallel Distrib. Syst. 2, 3 (July), 350-360.
- (1991) IEEE Trans. Parallel Distrib. Syst. , vol.2 , Issue.3 JULY , pp. 350-360
- Havlak, P.¹ Kennedy, K.²

21
- 0012374769
- IBM Corp., Riverton, NJ.
- IBM. 1994. Engineering and Scientific Subroutine Library Version 2 Release 2, Guide and Reference. IBM Corp., Riverton, NJ.
- (1994) Engineering and Scientific Subroutine Library Version 2 Release 2, Guide and Reference

22
- 0040831411
- GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark
- Dept. of Computing Science, University of Umeå, Umeå, Sweden. Also available as LAPACK Working Note 107
- KÅGSTRÖM, B., LING, P., AND VAN LOAN, C. 1995a. GEMM-based level 3 BLAS: High-performance model implementations and performance evaluation benchmark. Tech. Rep. UMINF-95.18, Dept. of Computing Science, University of Umeå, Umeå, Sweden. Also available as LAPACK Working Note 107.
- (1995) Tech. Rep. UMINF-95.18
- Kågström, B.¹ Ling, P.² Van Loan, C.³

23
- 0040831411
- GEMM-based level 3 BLAS: Installation, tuning, and use of the model implementations and the performance evaluation benchmark
- Dept. of Computing Science, University of Umeå, Umeå, Sweden. Also available as LAPACK Working Note 108
- KÅGSTRÖM, B., LING, P., AND VAN LOAN, C. 1995b. GEMM-based level 3 BLAS: Installation, tuning, and use of the model implementations and the performance evaluation benchmark. Tech. Rep. UMINF 95.19, Dept. of Computing Science, University of Umeå, Umeå, Sweden. Also available as LAPACK Working Note 108.
- (1995) Tech. Rep. UMINF 95.19
- Kågström, B.¹ Ling, P.² Van Loan, C.³

24
- 0028459839
- DXML: A high-performance scientific subroutine library
- KAMATH, C., HO, R., AND MANLEY, D. P. 1994. DXML: A high-performance scientific subroutine library. Digital Tech. J. 6, 3 (Summer), 44-56.
- (1994) Digital Tech. J. , vol.6 , Issue.3 SUMMER , pp. 44-56
- Kamath, C.¹ Ho, R.² Manley, D.P.³

25
- 0003845230
- John Wiley & Sons, Inc., New York, NY
- KUCK, D. 1978. The Structure of Computers and Computations. Vol. 1. John Wiley & Sons, Inc., New York, NY.
- (1978) The Structure of Computers and Computations , vol.1
- Kuck, D.¹

26
- 0042650298
- Software pipelining. An effective scheduling technique for VLIW machines
- (Atlanta, Georgia, June 22-24, 1988), R. L. Wexelblat, Ed. ACM Press, New York, NY
- LAM, M. 1988. Software pipelining. An effective scheduling technique for VLIW machines. In Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation (Atlanta, Georgia, June 22-24, 1988), R. L. Wexelblat, Ed. ACM Press, New York, NY, 318-328.
- (1988) Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation , pp. 318-328
- Lam, M.¹

27
- 0347788751
- The cache performance and optimizations of blocked algorithms
- LAM, M. S., ROTHBERG, E. E., AND WOLF, M. E. 1991. The cache performance and optimizations of blocked algorithms. SIGARCH Comput. Archit. News 19, 2 (Apr.), 63-74.
- (1991) SIGARCH Comput. Archit. News , vol.19 , Issue.2 APR , pp. 63-74
- Lam, M.S.¹ Rothberg, E.E.² Wolf, M.E.³

28
- 0018515759
- Basic linear algebra subprograms for Fortran usage
- LAWSON, C. L., HANSON, R. J., KINCAID, D. R., AND KROGH, F. T. 1979. Basic linear algebra subprograms for Fortran usage. ACM Trans. Math. Softw. 5, 3, 308-323.
- (1979) ACM Trans. Math. Softw. , vol.5 , Issue.3 , pp. 308-323
- Lawson, C.L.¹ Hanson, R.J.² Kincaid, D.R.³ Krogh, F.T.⁴

29
- 0347762580
- Implementing efficient and portable dense matrix factorizations
- Society for Industrial and Applied Mathematics, Philadelphia, PA.
- LEHOUCQ, R. 1992. Implementing efficient and portable dense matrix factorizations. In Proceedings of the 5th SIAM Conference on Parallel Processing for Scientific Computing. Society for Industrial and Applied Mathematics, Philadelphia, PA.
- (1992) Proceedings of the 5th SIAM Conference on Parallel Processing for Scientific Computing
- Lehoucq, R.¹

30
- 0026407190
- A comparative study of automatic vectorizing compilers
- LEVINE, D., CALLAHAN, D., AND DONGARRA, J. 1991. A comparative study of automatic vectorizing compilers. Parallel Comput. 17, 1223-1244.
- (1991) Parallel Comput. , vol.17 , pp. 1223-1244
- Levine, D.¹ Callahan, D.² Dongarra, J.³

31
- 0004025755
- Plenum Press, New York, NY.
- ORTEGA, J. M. 1988. Introduction to Parallel and Vector Solution of Linear Systems. Plenum Press, New York, NY.
- (1988) Introduction to Parallel and Vector Solution of Linear Systems
- Ortega, J.M.¹

32
- 0026966702
- Register allocation for software pipelined loops
- RAU, B. R., LEE, M., TIRUMALAI, P. P., AND SCHLANSKER, M. S. 1992. Register allocation for software pipelined loops. SIGPLAN Not. 27, 7 (July), 283-299.
- (1992) SIGPLAN Not. , vol.27 , Issue.7 JULY , pp. 283-299
- Rau, B.R.¹ Lee, M.² Tirumalai, P.P.³ Schlansker, M.S.⁴

33
- 0031140581
- Automatic selection of high order transformations in the IBM XL Fortran compilers
- SARKAR, V. 1997. Automatic selection of high order transformations in the IBM XL Fortran compilers. IBM J. Res. Dev. 41, 3 (May).
- (1997) IBM J. Res. Dev. , vol.41 , Issue.3 MAY
- Sarkar, V.¹

34
- 0003595562
- Springer Lecture Notes in Computer Science, Springer-Verlag New York, Inc., New York, NY
- SMITH, B. T., BOYLE, J. M., DONGARRA, J. J., GARBOW, B. S., IKEBE, Y., KLEMA, V. C., AND MOLER, C. B. 1976. EISPACK Guide. 2nd ed. Springer Lecture Notes in Computer Science, vol. 6. Springer-Verlag New York, Inc., New York, NY.
- (1976) EISPACK Guide. 2nd Ed. , vol.6
- Smith, B.T.¹ Boyle, J.M.² Dongarra, J.J.³ Garbow, B.S.⁴ Ikebe, Y.⁵ Klema, V.C.⁶ Moler, C.B.⁷

35
- 0022876858
- Advanced loop interchange
- WOLFE, M. 1986. Advanced loop interchange. In Proceedings of the 1986 International Conference on Parallel Processing.
- (1986) Proceedings of the 1986 International Conference on Parallel Processing
- Wolfe, M.¹

36
- 0002433589
- Iteration space tiling for memory hierarchies
- Society for Industrial and Applied Mathematics, Philadelphia, PA.
- WOLFE, M. 1987. Iteration space tiling for memory hierarchies. In Proceedings of the 3rd SIAM Conference on Parallel Processing for Scientific Computing. Society for Industrial and Applied Mathematics, Philadelphia, PA.
- (1987) Proceedings of the 3rd SIAM Conference on Parallel Processing for Scientific Computing
- Wolfe, M.¹

37
- 84976827033
- A data locality optimizing algorithm
- WOLF, M. E. AND LAM, M. S. 1991. A data locality optimizing algorithm. SIGPLAN Not. 26, 6 (June), 30-44.
- (1991) SIGPLAN Not. , vol.26 , Issue.6 JUNE , pp. 30-44
- Wolf, M.E.¹ Lam, M.S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.