SCOPUS Information Retrieval Platform

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP

Volume , Issue , 2003, Pages 144-154

QR factorization with Morton-ordered quadtree matrices for memory re-use and parallelism

(2) Frens, Jeremy D. (a); Wise, David S. (b)

a CALVIN COLLEGE (United States)

b INDIANA UNIVERSITY (United States)

Author keywords

Cache misses; Indexing; Paging; Quadtrees; Storage management; Swapping

Indexed keywords

ALGORITHMS; CODES (SYMBOLS); DATA STORAGE EQUIPMENT; MEMORY HIERARCHY; PARALLEL PROCESSING SYSTEMS

EID: 0038716587 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (14)

References (27)

1
- 0029193089
- LogGP: Incorporating long messages into the LogP model - One step closer towards a realistic model for parallel computation
- ACM Press, New York, June
- A. Alexandrov, M. F. Ionescu, K. E. Schauser, and C. Scheiman. LogGP: Incorporating long messages into the LogP model - one step closer towards a realistic model for parallel computation. In Proc. 7th ACM Symp. on Parallel Algorithms and Architectures, pages 95-105. ACM Press, New York, June 1995. http://doi.acm.org/10.1145/215399.215427
- (1995) Proc. 7th ACM Symp. on Parallel Algorithms and Architectures , pp. 95-105
- Alexandrov, A.¹ Ionescu, M.F.² Schauser, K.E.³ Scheiman, C.⁴

2
- 0025536635
- LAPACK: A portable linear algebra library for high-performance computers
- SIAM, Philadelphia, Nov.
- E. Anderson, Z. Bai, J. Dongarra, A. Greenbaum, A. McKenney, J. Du Croz, S. Hammarling, J. Demmel, C. Bischof, and D. Sorensen. LAPACK: A portable linear algebra library for high-performance computers. In Proc. '90 Int. Conf. on Supercomputing, pages 2-11. SIAM, Philadelphia, Nov. 1990. http://www.acm.org/pubs/citation/proceeding/supercomputing/110382/p2-anderson/
- (1990) Proc. '90 Int. Conf. on Supercomputing , pp. 2-11
- Anderson, E.¹ Bai, Z.² Dongarra, J.³ Greenbaum, A.⁴ McKenney, A.⁵ Du Croz, J.⁶ Hammarling, S.⁷ Demmel, J.⁸ Bischof, C.⁹ Sorensen, D.¹⁰

3
- 1442335113
- chapter 2, pages 26-65. In Kronsjö and Shumsheruddin [19]
- T. Axford. The Divide-and-Conquer Paradigm as a Basis for Parallel Language Design, chapter 2, pages 26-65. In Kronsjö and Shumsheruddin [19], 1992.
- (1992) The Divide-and-Conquer Paradigm as a Basis for Parallel Language Design
- Axford, T.¹

4
- 0030661485
- Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
- ACM Press, New York, July
- J. Bilmes, K. Asanović, C.-W. Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology. In Proc. '97 Int. Conf. on Supercomputing, pages 340-347. ACM Press, New York, July 1997. http://doi.acm.org/10.1145/263580.263662
- (1997) Proc. '97 Int. Conf. on Supercomputing , pp. 340-347
- Bilmes, J.¹ Asanović, K.² Chin, C.-W.³ Demmel, J.⁴

5
- 0003615167
- SIAM, Philadelphia
- L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley. ScaLAPACK Users' Guide. SIAM, Philadelphia, 1997.
- (1997) ScaLAPACK Users' Guide
- Blackford, L.S.¹ Choi, J.² Cleary, A.³ D'Azevedo, E.⁴ Demmel, J.⁵ Dhillon, I.⁶ Dongarra, J.⁷ Hammarling, S.⁸ Henry, G.⁹ Petitet, A.¹⁰ Stanley, K.¹¹ Walker, D.¹² Whaley, R.C.¹³

6
- 0017419683
- A transformation system for developing recursive programs
- Jan.
- R. M. Burstall and J. Darlington. A transformation system for developing recursive programs. J. ACM, 24(1):44-67, Jan. 1977. http://doi.acm.org/10.1145/321992.321996
- (1977) J. ACM , vol.24 , Issue.1 , pp. 44-67
- Burstall, R.M.¹ Darlington, J.²

7
- 0036870763
- Recursive array layouts and fast parallel matrix multiplication
- Nov.
- S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. Thottethodi. Recursive array layouts and fast parallel matrix multiplication. IEEE Trans. Parallel Distrib. Syst., 13(11):1105-1123, Nov. 2002. http://www.computer.org/tpds/td2002/11009abs.htm
- (2002) IEEE Trans. Parallel Distrib. Syst. , vol.13 , Issue.11 , pp. 1105-1123
- Chatterjee, S.¹ Lebeck, A.R.² Patnala, P.K.³ Thottethodi, M.⁴

8
- 18844387390
- Exact analysis of the cache behavior of nested loops
- ACM Press, New York
- S. Chatterjee, E. Parker, P. J. Hanlon, and A. R. Lebeck. Exact analysis of the cache behavior of nested loops. In Proc. ACM SIGPLAN '01 Conf. on Program. Language Design and Implementation, pages 286-297. ACM Press, New York, 2001. http://doi.acm.org/10.1145/378796.378859.
- (2001) Proc. ACM SIGPLAN '01 Conf. on Program. Language Design and Implementation , pp. 286-297
- Chatterjee, S.¹ Parker, E.² Hanlon, P.J.³ Lebeck, A.R.⁴

9
- 0037660035
- chapter 1, pages 1-25. In Kronsjö and Shumsheruddin [19]
- M. Cole. Parallel Software Designs, chapter 1, pages 1-25. In Kronsjö and Shumsheruddin [19], 1992.
- (1992) Parallel Software Designs
- Cole, M.¹

10
- 0030287932
- LogP: A practical model of parallel computation
- Nov.
- D. E. Culler, R. M. Karp, D. Patterson, A. Sahay, E. E. Santos, K. E. Schauser, R. Subramonian, and T. von Eicken. LogP: A practical model of parallel computation. Commun. ACM, 39(11):78-85, Nov. 1996. http://doi.acm.org/10.1145/240455.240477.
- (1996) Commun. ACM , vol.39 , Issue.11 , pp. 78-85
- Culler, D.E.¹ Karp, R.M.² Patterson, D.³ Sahay, A.⁴ Santos, E.E.⁵ Schauser, K.E.⁶ Subramonian, R.⁷ Von Eicken, T.⁸

11
- 0034224207
- Applying recursion to serial and parallel QR factorization leads to better performance
- July
- E. Elmroth and F. Gustavson. Applying recursion to serial and parallel QR factorization leads to better performance. IBM J. Res. Develop., 44(4):605-624, July 2000. http://www.research.ibm.com/journal/rd/444/elmroth.html
- (2000) IBM J. Res. Develop. , vol.44 , Issue.4 , pp. 605-624
- Elmroth, E.¹ Gustavson, F.²

12
- 0037997839
- Matrix factorization using a block-recursive structure and block-recursive algorithms
- PhD thesis, Indiana University, Computer Science Department, Sept.
- J. D. Frens. Matrix Factorization Using a Block-Recursive Structure and Block-Recursive Algorithms. PhD thesis, Indiana University, Computer Science Department, Sept. 2002. Available as Technical Report # 568. http://cs.indiana.edu/Research/techreport/TR588.shtml.
- (2002) Technical Report # 568 , vol.568
- Frens, J.D.¹

13
- 0030688479
- Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code
- ACM Press, New York, June
- J. D. Frens and D. S. Wise. Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code. In Proc. 6th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program., pages 206-216. ACM Press, New York, June 1997. http://doi.acm.org/10.1145/263764.263789
- (1997) Proc. 6th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program , pp. 206-216
- Frens, J.D.¹ Wise, D.S.²

14
- 0033350255
- Cache-oblivious algorithms
- IEEE Computer Society, Los Alamitos, CA, Oct.
- M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms. In Proc. 40th Symp. on Foundations of Computer Science, pages 285-298. IEEE Computer Society, Los Alamitos, CA, Oct. 1999. http://www.computer.org/proceedings/focs/0409/04090285abs.htm
- (1999) Proc. 40th Symp. on Foundations of Computer Science , pp. 285-298
- Frigo, M.¹ Leiserson, C.E.² Prokop, H.³ Ramachandran, S.⁴

15
- 0004236492
- The Johns Hopkins University Press, Baltimore, third edition
- G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, third edition, 1996.
- (1996) Matrix Computations
- Golub, G.H.¹ Van Loan, C.F.²

16
- 0004060561
- SIAM, Philadelphia
- N. J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, 1996.
- (1996) Accuracy and Stability of Numerical Algorithms
- Higham, N.J.¹

17
- 0037997838
- LogGPS: A parallel computational model for synchronization analysis
- ACM Press, New York, June
- F. Ino, N. Fujimoto, and K. Hagihara. LogGPS: A parallel computational model for synchronization analysis. In Proc. 8th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program., pages 133-142. ACM Press, New York, June 2001. http://doi.acm.org/10.1145/379539.379592.
- (2001) Proc. 8th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program , pp. 133-142
- Ino, F.¹ Fujimoto, N.² Hagihara, K.³

18
- 0003657590
- Addison Wesley Longman, Boston, New York, third edition
- D. E. Knuth. The Art of Computer Programming: Fundamental Algorithms, volume 1 of The Art of Computer Programming. Addison Wesley Longman, Boston, New York, third edition, 1997.
- (1997) The Art of Computer Programming: Fundamental Algorithms, Volume 1 of The Art of Computer Programming , vol.1
- Knuth, D.E.¹

19
- 0037660034
- John Wiley & Sons, New York
- L. Kronsjö and D. Shumsheruddin, editors. Advances in Parallel Algorithms. John Wiley & Sons, New York, 1992.
- (1992) Advances in Parallel Algorithms
- Kronsjö, L.¹ Shumsheruddin, D.²

20
- 0003502903
- Morgan Kaufmann, San Francisco
- S. S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, San Francisco, 1997.
- (1997) Advanced Compiler Design and Implementation
- Muchnick, S.S.¹

21
- 84958744342
- Recursion unrolling for divide and conquer programs
- S. Midkiff, J. Moreira, M. Gupta, S. Chatterjee, J. Ferrante, J. Prins, W. Pugh, and C.-W. Tseng, editors. Springer, Berlin
- R. Rugina and M. Rinard. Recursion unrolling for divide and conquer programs. In S. Midkiff, J. Moreira, M. Gupta, S. Chatterjee, J. Ferrante, J. Prins, W. Pugh, and C.-W. Tseng, editors, Languages and Compilers for Parallel Computing, volume 2017, pages 34-48. Springer, Berlin, 2001. http://link.springer.de/link/service/series/0558/bibs/2017/20170034.htm.
- (2001) Languages and Compilers for Parallel Computing , vol.2017 , pp. 34-48
- Rugina, R.¹ Rinard, M.²

22
- 0037997837
- R8000 microprocessor chip set
- Silicon Graphics, Inc. Silicon Graphics, Inc.
- Silicon Graphics, Inc. R8000 microprocessor chip set. Technical report, Silicon Graphics, Inc., 1994.
- (1994) Technical Report

23
- 84943297310
- Automatically tuned linear algebra software
- ACM Press, New York
- R. C. Whaley and J. J. Dongarra. Automatically tuned linear algebra software. In Proc. '98 Int. Conf. on Supercomputing. ACM Press, New York, 1998.
- (1998) Proc. '98 Int. Conf. on Supercomputing
- Whaley, R.C.¹ Dongarra, J.J.²

24
- 0037997834
- Undulant block elimination and integer-preserving matrix inversion
- Jan.
- D. S. Wise. Undulant block elimination and integer-preserving matrix inversion. Sci. Comp. Program., 33(1):29-85, Jan. 1999. http://www.cs.indiana.edu/ftp/techreports/TR418.html.
- (1999) Sci. Comp. Program. , vol.33 , Issue.1 , pp. 29-85
- Wise, D.S.¹

25
- 84937431996
- Ahnentafel indexing into Morton-ordered arrays, or matrix locality for free
- A. Bode, T. Ludwig, W. Karl, and R. Wismüller, editors, Euro-Par 2000 - Parallel Processing, Springer, Heidelberg
- D. S. Wise. Ahnentafel indexing into Morton-ordered arrays, or matrix locality for free. In A. Bode, T. Ludwig, W. Karl, and R. Wismüller, editors, Euro-Par 2000 - Parallel Processing, volume 1900 of Lecture Notes in Computer Science, pages 774-883. Springer, Heidelberg, 2000. http://link.springer.de/link/service/series/0558/bibs/1900/19000774.htm
- (2000) Lecture Notes in Computer Science , vol.1900 , pp. 774-883
- Wise, D.S.¹

26
- 0034819362
- Language support for Morton-order matrices
- ACM Press, New York, June
- D. S. Wise, J. D. Frens, Y. Gu, and G. A. Alexander. Language support for Morton-order matrices. In Proc. 8th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program., pages 24-33. ACM Press, New York, June 2001. http://doi.acm.org/10.1145/379539.379559
- (2001) Proc. 8th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program , pp. 24-33
- Wise, D.S.¹ Frens, J.D.² Gu, Y.³ Alexander, G.A.⁴

27
- 0003424922
- Computer Engineering. McGraw-Hill, New York
- A. Y. H. Zomaya, editor. Parallel & Distributed Computing Handbook. Computer Engineering. McGraw-Hill, New York, 1996.
- (1996) Parallel & Distributed Computing Handbook
- Zomaya, A.Y.H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.