Volume 10, Issue 12, 1999, Pages 1201-1216

Algorithmic redistribution methods for block-cyclic decompositions

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMIC REDISTRIBUTION METHODS; BLOCK-CYCLIC DECOMPOSITION;

EID: 0033309435     PISSN: 10459219     EISSN: None     Source Type: Journal    
DOI: 10.1109/71.819944     Document Type: Article
Times cited : (36)

References (51)
  • 2
    • 0028545949
    • A High Performance Matrix Multiplication Algorithm on a Distributed-Memory Parallel Computer, Using Overlapped Communication
    • R. Agarwal, F. Gustavson, and M. Zubair, "A High Performance Matrix Multiplication Algorithm on a Distributed-Memory Parallel Computer, Using Overlapped Communication," IBM J. Research and Development, vol. 38, no. 6, pp. 673-681, 1994.
    • (1994) IBM J. Research and Development , vol.38 , Issue.6 , pp. 673-681
    • Agarwal, R.1    Gustavson, F.2    Zubair, M.3
  • 9
    • 84943678690
    • Parallel LU Decomposition on a Transputer Network
    • G. van Zee and J. van der Vorst, eds.
    • R. Bisseling and J. van der Vorst, "Parallel LU Decomposition on a Transputer Network," Lecture Notes in Computer Sciences, G. van Zee and J. van der Vorst, eds., vol. 384, pp. 61-77, 1989.
    • (1989) Lecture Notes in Computer Sciences , vol.384 , pp. 61-77
    • Bisseling, R.1    Van Der Vorst, J.2
  • 12
    • 0027558054
    • Implementation of BLAS Level 3 and LINPACK Benchmark on the AP1000
    • R. Brent and P. Strazdins, "Implementation of BLAS Level 3 and LINPACK Benchmark on the AP1000," Fujitsu Scientific and Technical J., vol. 5, no. 1, pp. 61-70, 1993.
    • (1993) Fujitsu Scientific and Technical J. , vol.5 , Issue.1 , pp. 61-70
    • Brent, R.1    Strazdins, P.2
  • 15
    • 0028530654
    • PUMMA: Parallel Universal Matrix Multiplication Algorithms on Distributed-Memory Concurrent Computers
    • J. Choi, J. Dongarra, and D. Walker, "PUMMA: Parallel Universal Matrix Multiplication Algorithms on Distributed-Memory Concurrent Computers," Concurrency: Practice and Experience, vol. 6, no. 7, pp. 543-570, 1994.
    • (1994) Concurrency: Practice and Experience , vol.6 , Issue.7 , pp. 543-570
    • Choi, J.1    Dongarra, J.2    Walker, D.3
  • 16
    • 0030241311
    • PB-BLAS: A Set of Parallel Block Basic Linear Algebra Subroutines
    • J. Choi, J. Dongarra, and D. Walker, "PB-BLAS: A Set of Parallel Block Basic Linear Algebra Subroutines," Concurrency: Practice and Experience, vol. 8, no. 7, pp. 517-535, 1996.
    • (1996) Concurrency: Practice and Experience , vol.8 , Issue.7 , pp. 517-535
    • Choi, J.1    Dongarra, J.2    Walker, D.3
  • 18
    • 0006488807
    • QR Factorization of a Dense Matrix on a Hypercube Multiprocessor
    • E. Chu and A. George, "QR Factorization of a Dense Matrix on a Hypercube Multiprocessor," SIAM J. Scientific and Statistical Computing, vol. 11, pp. 990-1,028, 1990.
    • (1990) SIAM J. Scientific and Statistical Computing , vol.11
    • Chu, E.1    George, A.2
  • 19
    • 0028443077
    • A Parallel Block Implementation of Level 3 BLAS for MIMD Vector Processors
    • M. Daydé, I. Duff, and A. Petitet, "A Parallel Block Implementation of Level 3 BLAS for MIMD Vector Processors," ACM Trans. Mathematical Software, vol. 20, no. 2, pp. 178-193, 1994.
    • (1994) ACM Trans. Mathematical Software , vol.20 , Issue.2 , pp. 178-193
    • Daydé, M.1    Duff, I.2    Petitet, A.3
  • 21
    • 0000778168
    • Scalability Issues in the Design of a Library for Dense Linear Algebra
    • J. Dongarra, R. van de Geijn, and D. Walker, "Scalability Issues in the Design of a Library for Dense Linear Algebra," J. Parallel and Distributed Computing, vol. 22, no. 3, pp. 523-537, 1994.
    • (1994) J. Parallel and Distributed Computing , vol.22 , Issue.3 , pp. 523-537
    • Dongarra, J.1    Van De Geijn, R.2    Walker, D.3
  • 22
    • 0029324485
    • Software Libraries for Linear Algebra Computations on High Performance Computers
    • J. Dongarra and D. Walker, "Software Libraries for Linear Algebra Computations on High Performance Computers," SIAM Review, vol. 37, no. 2, pp. 151-180, 1995.
    • (1995) SIAM Review , vol.37 , Issue.2 , pp. 151-180
    • Dongarra, J.1    Walker, D.2
  • 23
    • 0012493293
    • Technical Report UT CS-95-281, LAPACK Working Note 94, Univ. Tennessee
    • J. Dongarra and R.C. Whaley, "A User's Guide to the BLACS v1.0," Technical Report UT CS-95-281, LAPACK Working Note 94, Univ. Tennessee, 1995. (http://www.netlib.org/blacs/)
    • (1995) A User's Guide to the BLACS V1.0
    • Dongarra, J.1    Whaley, R.C.2
  • 25
    • 0023288009
    • Matrix Algorithms on a Hypercube I: Matrix Multiplication
    • G. Fox, S. Otto, and A. Hey, "Matrix Algorithms on a Hypercube I: Matrix Multiplication," Parallel Computing, vol. 3, pp. 17-31, 1987.
    • (1987) Parallel Computing , vol.3 , pp. 17-31
    • Fox, G.1    Otto, S.2    Hey, A.3
  • 26
    • 0039821547
    • LU Factorization Algorithms on Distributed-Memory Multiprocessor Architectures
    • G. Geist and C. Romine, "LU Factorization Algorithms on Distributed-Memory Multiprocessor Architectures," SIAM J. Scientific and Statistical Computing, vol. 9, pp. 639-649, 1988.
    • (1988) SIAM J. Scientific and Statistical Computing , vol.9 , pp. 639-649
    • Geist, G.1    Romine, C.2
  • 27
    • 0001615713
    • Parallel Solution of Triangular Systems on Distributed-Memory Multiprocessors
    • M. Heath and C. Romine, "Parallel Solution of Triangular Systems on Distributed-Memory Multiprocessors," SIAM J. Scientific and Statistical Computing, vol. 9, pp. 558-588, 1988.
    • (1988) SIAM J. Scientific and Statistical Computing , vol.9 , pp. 558-588
    • Heath, M.1    Romine, C.2
  • 29
    • 0000667923
    • The Torus-Wrap Mapping for Dense Matrix Calculations on Massively Parallel Computers
    • Sept.
    • B. Hendrickson and D. Womble, "The Torus-Wrap Mapping for Dense Matrix Calculations on Massively Parallel Computers," J. Scientific and Statistical Computing, vol. 15, no. 5, pp. 1,201-1,226, Sept. 1994.
    • (1994) J. Scientific and Statistical Computing , vol.15 , Issue.5
    • Hendrickson, B.1    Womble, D.2
  • 33
    • 0029484078
    • Processor Mapping Techniques towards Efficient Data Redistribution
    • E. Kalns and L. Ni, "Processor Mapping Techniques towards Efficient Data Redistribution," IEEE Trans. Parallel and Distributed Systems, vol. 6, no. 12, pp. 1,234-1,247, 1995.
    • (1995) IEEE Trans. Parallel and Distributed Systems , vol.6 , Issue.12
    • Kalns, E.1    Ni, L.2
  • 37
    • 0013317481
    • A New Method for Solving Triangular Systems on Distributed-Memory Message-Passing Multiprocessor
    • G. Li and T. Coleman, "A New Method for Solving Triangular Systems on Distributed-Memory Message-Passing Multiprocessor," SIAM J. Scientific and Statistical Computing, vol. 10, no. 2, pp. 382-396, 1989.
    • (1989) SIAM J. Scientific and Statistical Computing , vol.10 , Issue.2 , pp. 382-396
    • Li, G.1    Coleman, T.2
  • 39
  • 40
    • 0028464291
    • Multiplication of Matrices of Arbitrary Shapes on a Data Parallel Computer
    • K. Mathur and S.L. Johnsson, "Multiplication of Matrices of Arbitrary Shapes on a Data Parallel Computer," Parallel Computing, vol. 20, pp. 919-951, 1994.
    • (1994) Parallel Computing , vol.20 , pp. 919-951
    • Mathur, K.1    Johnsson, S.L.2
  • 42
  • 44
    • 33749948602
    • A High Performance Version of Parallel LAPACK: Preliminary Report
    • Fujitsu Parallel Computing Center
    • P. Strazdins and H. Koesmarno, "A High Performance Version of Parallel LAPACK: Preliminary Report," Proc. Sixth Parallel Computing Workshop, Fujitsu Parallel Computing Center, 1996.
    • (1996) Proc. Sixth Parallel Computing Workshop
    • Strazdins, P.1    Koesmarno, H.2
  • 47
    • 0031123769
    • SUMMA: Scalable Universal Matrix Multiplication Algorithm
    • R. van de Geijn and J. Watts, "SUMMA: Scalable Universal Matrix Multiplication Algorithm," Concurrency: Practice and Experience, vol. 9, no. 4, pp. 255-274, 1997.
    • (1997) Concurrency: Practice and Experience , vol.9 , Issue.4 , pp. 255-274
    • Van De Geijn, R.1    Watts, J.2
  • 48
    • 84990712105
    • Experiments with Multicomputer LU Decomposition
    • E. van de Velde, "Experiments with Multicomputer LU Decomposition," Concurrency: Practice and Experience, vol. 2, pp. 1-26, 1990.
    • (1990) Concurrency: Practice and Experience , vol.2 , pp. 1-26
    • Van De Velde, E.1
  • 49
    • 0030282238
    • Redistribution of Block-Cyclic Data Distributions Using MPI
    • D. Walker and S. Otto, "Redistribution of Block-Cyclic Data Distributions Using MPI," Concurrency: Practice and Experience, vol. 8, no. 9, pp. 707-728, 1996.
    • (1996) Concurrency: Practice and Experience , vol.8 , Issue.9 , pp. 707-728
    • Walker, D.1    Otto, S.2
  • 50
    • 0010224751
    • Runtime Performance of Parallel Array Assignment: An Empirical Study
    • L. Wang, J. Stichnoth, and S. Chatterjee, "Runtime Performance of Parallel Array Assignment: An Empirical Study," Proc. Supercomputing, 1996. (http://www.supercomp.org/sc96/proceedings/)
    • (1996) Proc. Supercomputing
    • Wang, L.1    Stichnoth, J.2    Chatterjee, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.