Volume 2006, 2006, Pages 5-12

Analyzing block locality in Morton-order and Morton-hybrid matrices

Author keywords

Cholesky factorization; Morton order; Quadtrees

Indexed keywords

ALGORITHMS; BANDWIDTH; COMPUTATION THEORY; ITERATIVE METHODS; MICROPROCESSOR CHIPS; PROGRAM PROCESSORS; RANDOM ACCESS STORAGE;

EID: 34248336283     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1166133.1166134     Document Type: Conference Paper
Times cited: 5

References (26)
  • 1
    • 33748317896
    • Fast additions on masked integers
    • M. D. Adams and D. S. Wise. Fast additions on masked integers. SIGPLAN Not., 41(5):39-45, May 2006. http://doi.acm.org/10.1145/1149982.1149987
    • (2006) SIGPLAN Not , vol.41 , Issue.5 , pp. 39-45
    • Adams, M.D.1    Wise, D.S.2
  • 2
    • 33745777806
    • Cache oblivious matrix multiplication using an element ordering based on the Peano curve
    • M. Bader and C. Zenger. Cache oblivious matrix multiplication using an element ordering based on the Peano curve. In Parallel Processing and Applied Mathematics, volume 3911 of Lecture Notes in Comput. Sci., pages 1042-1049, Berlin, 2006. Springer. http://dx.doi.org/10.1007/11752578_126
    • (2006) Lecture Notes in Comput. Sci , vol.3911 , pp. 1042-1049
    • Bader, M.1    Zenger, C.2
  • 3
    • 0036870763
    • Recursive array layouts and fast parallel matrix multiplication
    • S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. Thottenthodi. Recursive array layouts and fast parallel matrix multiplication. IEEE Trans. Parallel Distrib. Syst., 13(11):1105-1123, Nov. 2002. http://dx.doi.org/10.1109/TPDS.2002.1058095
    • (2002) IEEE Trans. Parallel Distrib. Syst , vol.13 , Issue.11 , pp. 1105-1123
    • Chatterjee, S.1    Lebeck, A.R.2    Patnala, P.K.3    Thottenthodi, M.4
  • 4
    • 0025402476
    • A set of level 3 Basic Linear Algebra Subprograms
    • J. J. Dongarra, J. Du Croz, S. Hammarling, and I. S. Duff. A set of level 3 Basic Linear Algebra Subprograms. ACM Trans. Math. Softw., 16(1):1-17, Mar. 1990. http://doi.acm.org/10.1145/77626.79170
    • (1990) ACM Trans. Math. Softw , vol.16 , Issue.1 , pp. 1-17
    • Dongarra, J.J.1    Du Croz, J.2    Hammarling, S.3    Duff, I.S.4
  • 5
    • 34248384701
    • G. C. Fox. A graphical approach to load balancing and sparse matrix-vector multiplication. In M. Schultz, editor, Numerical Algorithms for Modern Parallel Architectures, volume 13 of IMA Vol. in Math. & Appl., pages 37-61. Springer, New York, 1988.
  • 6
    • 77953929281
    • B. B. Fraguela, J. Guo, G. Bikshandi, M. J. Garzarán, G. Almási, J. Moreira, and D. Padua. The hierarchically tiled arrays programming approach. In LCR '04: Proc. 7th Wkshp. Languages, Compilers, and Run-Time Support for Scalable Systems, volume 81 of ACM Int. Conf. Proc. Series, pages 1-12. ACM Press, New York, 2004. http://doi.acm.org/10.1145/1066650.1066657
  • 7
    • 0033350255
    • M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms. In Proc. 40th Ann. Symp. Foundations of Computer Science, pages 285-298. IEEE Computer Soc. Press, Washington, DC, Oct. 1999. http://dx.doi.org/10.1109/SFFCS.1999.814600
  • 9
    • 77954450405
    • The Opie compiler from row-major source to Morton-ordered matrices
    • S. T. Gabriel and D. S. Wise. The Opie compiler from row-major source to Morton-ordered matrices. In J. Carter and L. Zhang, editors, Proc. 3rd Wkshp. on Memory Performance Issues, pages 136-144. ACM Press, New York, 2004. http://doi.acm.org/10.1145/1054943.1054962
    • (2004) Proc. 3rd Wkshp. on Memory Performance Issues , pp. 136-144
    • Gabriel, S.T.1    Wise, D.S.2
  • 10
    • 0020249952
    • An effective way to represent quadtrees
    • I. Gargantini. An effective way to represent quadtrees. Commun. ACM, 25(12):905-910, Dec. 1982. http://doi.acm.org/10.1145/358728.358741
    • (1982) Commun. ACM , vol.25 , Issue.12 , pp. 905-910
    • Gargantini, I.1
  • 11
    • 0004236492
    • G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins Univ. Press, Baltimore, third edition, 1996.
    • (1996) Matrix Computations
    • Golub, G.H.1    Van Loan, C.F.2
  • 13
    • 49149109685
    • Anatomy of high-performance matrix multiplication
    • K. Goto and R. A. van de Geijn. Anatomy of high-performance matrix multiplication. Technical report, Univ. of Texas, Austin. Submitted for publication. Visited Sept. 2006. http://www.cs.utexas.edu/users/flame/pubs/GOTO_TOMS.pdf
    • (2006)
    • Goto, K.1    van de Geijn, R.A.2
  • 14
    • 34248333297
    • Innovative Computing Laboratory, Univ. of Tennessee, Knoxville, TN. Performance Application Programming Interface (PAPI), Dec. 2005. http://icl.cs.utk.edu/papi/
  • 15
    • 2942630889
    • A theoretician's guide to the experimental analysis of algorithms
    • D. S. Johnson. A theoretician's guide to the experimental analysis of algorithms. In M. H. Goldwasser, D. S. Johnson, and C. C. McGeoch, editors, Data Structures, Near Neighbor Searches, and Methodology: 5th & 6th DIMACS Implementation Challenges, volume 59 of DIMACS Ser. Discrete Math. Theoret. Comput. Sci., pages 215-250. Amer. Math. Soc., Providence, 2002. http://www.research.att.com/~dsj/papers.html
    • (2002) DIMACS Ser. Discrete Math. Theoret. Comput. Sci , vol.59 , pp. 215-250
    • Johnson, D.S.1
  • 16
    • 0033907995
    • Scalable parallel matrix multiplication on distributed memory parallel computers
    • K. Li. Scalable parallel matrix multiplication on distributed memory parallel computers. In 14th Int. Parallel and Distributed Processing Symp. (IPDPS'00), pages 307-314. IEEE Computer Soc. Press, Washington, DC, May 2000. http://dx.doi.org/10.1109/IPDPS.2000.846000
    • (2000) 14th Int. Parallel and Distributed Processing Symp. (IPDPS'00) , pp. 307-314
    • Li, K.1
  • 17
    • 34248372349
    • J. Markoff. Writing the fastest code, by hand, for fun: A human computer keeps speeding up chips. The New York Times, CLV(53,412):C1, C6, 2005 Nov. 28. http://www.nytimes.com/2005/11/28/technology/28super.html
  • 18
    • 0003460690
    • A computer oriented geodetic data base and a new technique in file sequencing
    • G. M. Morton. A computer oriented geodetic data base and a new technique in file sequencing. Technical report, IBM Ltd., Ottawa, Ontario, Mar. 1966.
    • (1966)
    • Morton, G.M.1
  • 19
    • 0042235298
    • Tiling, block data layout, and memory hierarchy performance
    • N. Park, B. Hong, and V. K. Prasanna. Tiling, block data layout, and memory hierarchy performance. IEEE Trans. Parallel Distrib. Syst., 14(7):640-654, July 2003. http://dx.doi.org/10.1109/TPDS.2003.1214317
    • (2003) IEEE Trans. Parallel Distrib. Syst , vol.14 , Issue.7 , pp. 640-654
    • Park, N.1    Hong, B.2    Prasanna, V.K.3
  • 21
    • 4544352521
    • Optimizing graph algorithms for improved cache performance
    • J. Sang Park, M. Penner, and V. K. Prasanna. Optimizing graph algorithms for improved cache performance. IEEE Trans. Parallel Distrib. Syst., 15(9):769-782, Sept. 2004. http://dx.doi.org/10.1109/TPDS.2004.44
    • (2004) IEEE Trans. Parallel Distrib. Syst , vol.15 , Issue.9 , pp. 769-782
    • Sang Park, J.1    Penner, M.2    Prasanna, V.K.3
  • 22
    • 0000058088
    • Finding neighbors of equal size in linear quadtrees and octrees in constant time
    • G. Schrack. Finding neighbors of equal size in linear quadtrees and octrees in constant time. CVGIP: Image Underst., 55(3):221-230, May 1992.
    • (1992) CVGIP: Image Underst , vol.55 , Issue.3 , pp. 221-230
    • Schrack, G.1
  • 23
    • 0037173976
    • A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels
    • V. Valsalam and A. Skjellum. A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels. Concur. Comp. Prac. Exper., 14(10):805-839, 2002. http://dx.doi.org/10.1002/cpe.630
    • (2002) Concur. Comp. Prac. Exper , vol.14 , Issue.10 , pp. 805-839
    • Valsalam, V.1    Skjellum, A.2
  • 24
    • 84937431996
    • Ahnentafel indexing into Morton-ordered arrays, or matrix locality for free
    • D. S. Wise. Ahnentafel indexing into Morton-ordered arrays, or matrix locality for free. In A. Bode, T. Ludwig, W. Karl, and R. Wismüller, editors, Euro-Par 2000 - Parallel Processing, volume 1900 of Lecture Notes in Comput. Sci., pages 774-783. Springer, Heidelberg, 2000. http://www.springerlink.com/link.asp?id=0pc0e9gfk4x9j5fa
    • (2000) Lecture Notes in Comput. Sci , vol.1900 , pp. 774-783
    • Wise, D.S.1
  • 25
    • 27144518219
    • A paradigm for parallel matrix algorithms: Scalable Cholesky
    • D. S. Wise, C. L. Citro, J. J. Hursey, F. Liu, and M. A. Rainey. A paradigm for parallel matrix algorithms: Scalable Cholesky. In J. C. Cunha and P. D. Medeiros, editors, Euro-Par 2005 - Parallel Processing, number 3648 in Lecture Notes in Comput. Sci., pages 687-698. Springer, Berlin, Aug. 2005. http://dx.doi.org/10.1007/11549468_76
    • (2005) Lecture Notes in Comput. Sci , vol.3648 , pp. 687-698
    • Wise, D.S.1    Citro, C.L.2    Hursey, J.J.3    Liu, F.4    Rainey, M.A.5
  • 26
    • 0024935630
    • More iteration space tiling
    • M. Wolfe. More iteration space tiling. In Proc. Supercomputing '89, pages 655-664. ACM Press, New York, NY, USA, Nov. 1989.
    • (1989) Proc. Supercomputing '89 , pp. 655-664
    • Wolfe, M.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.