-
1
-
-
0029193089
-
LogGP: Incorporating long messages into the LogP model - One step closer towards a realistic model for parallel computation
-
ACM Press, New York, June
-
A. Alexandrov, M. F. Ionescu, K. E. Schauser, and C. Scheiman. LogGP: Incorporating long messages into the LogP model - one step closer towards a realistic model for parallel computation. In Proc. 7th ACM Symp. on Parallel Algorithms and Architectures, pages 95-105. ACM Press, New York, June 1995. http://doi.acm.org/10.1145/215399.215427
-
(1995)
Proc. 7th ACM Symp. on Parallel Algorithms and Architectures
, pp. 95-105
-
-
Alexandrov, A.1
Ionescu, M.F.2
Schauser, K.E.3
Scheiman, C.4
-
2
-
-
0025536635
-
LAPACK: A portable linear algebra library for high-performance computers
-
SIAM, Philadelphia, Nov.
-
E. Anderson, Z. Bai, J. Dongarra, A. Greenbaum, A. McKenney, J. Du Croz, S. Hammarling, J. Demmel, C. Bischof, and D. Sorensen. LAPACK: A portable linear algebra library for high-performance computers. In Proc. '90 Int. Conf. on Supercomputing, pages 2-11. SIAM, Philadelphia, Nov. 1990. http://www.acm.org/pubs/citation/proceeding/supercomputing/110382/p2-anderson/
-
(1990)
Proc. '90 Int. Conf. on Supercomputing
, pp. 2-11
-
-
Anderson, E.1
Bai, Z.2
Dongarra, J.3
Greenbaum, A.4
McKenney, A.5
Du Croz, J.6
Hammarling, S.7
Demmel, J.8
Bischof, C.9
Sorensen, D.10
-
4
-
-
0030661485
-
Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology
-
ACM Press, New York, July
-
J. Bilmes, K. Asanović, C.-W. Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology. In Proc. '97 Int. Conf. on Supercomputing, pages 340-347. ACM Press, New York, July 1997. http://doi.acm.org/10.1145/263580.263662
-
(1997)
Proc. '97 Int. Conf. on Supercomputing
, pp. 340-347
-
-
Bilmes, J.1
Asanović, K.2
Chin, C.-W.3
Demmel, J.4
-
5
-
-
0003615167
-
-
SIAM, Philadelphia
-
L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley. ScaLAPACK Users' Guide. SIAM, Philadelphia, 1997.
-
(1997)
ScaLAPACK Users' Guide
-
-
Blackford, L.S.1
Choi, J.2
Cleary, A.3
D'Azevedo, E.4
Demmel, J.5
Dhillon, I.6
Dongarra, J.7
Hammarling, S.8
Henry, G.9
Petitet, A.10
Stanley, K.11
Walker, D.12
Whaley, R.C.13
-
6
-
-
0017419683
-
A transformation system for developing recursive programs
-
Jan.
-
R. M. Burstall and J. Darlington. A transformation system for developing recursive programs. J. ACM, 24(1):44-67, Jan. 1977. http://doi.acm.org/10.1145/321992.321996
-
(1977)
J. ACM
, vol.24
, Issue.1
, pp. 44-67
-
-
Burstall, R.M.1
Darlington, J.2
-
7
-
-
0036870763
-
Recursive array layouts and fast parallel matrix multiplication
-
Nov.
-
S. Chatterjee, A. R. Lebeck, P. K. Patnala, and M. Thottethodi. Recursive array layouts and fast parallel matrix multiplication. IEEE Trans. Parallel Distrib. Syst., 13(11):1105-1123, Nov. 2002. http://www.computer.org/tpds/td2002/11009abs.htm
-
(2002)
IEEE Trans. Parallel Distrib. Syst.
, vol.13
, Issue.11
, pp. 1105-1123
-
-
Chatterjee, S.1
Lebeck, A.R.2
Patnala, P.K.3
Thottethodi, M.4
-
8
-
-
18844387390
-
Exact analysis of the cache behavior of nested loops
-
ACM Press, New York
-
S. Chatterjee, E. Parker, P. J. Hanlon, and A. R. Lebeck. Exact analysis of the cache behavior of nested loops. In Proc. ACM SIGPLAN '01 Conf. on Program. Language Design and Implementation, pages 286-297. ACM Press, New York, 2001. http://doi.acm.org/10.1145/378796.378859.
-
(2001)
Proc. ACM SIGPLAN '01 Conf. on Program. Language Design and Implementation
, pp. 286-297
-
-
Chatterjee, S.1
Parker, E.2
Hanlon, P.J.3
Lebeck, A.R.4
-
9
-
-
0037660035
-
-
chapter 1, pages 1-25. In Kronsjö and Shumsheruddin [19]
-
M. Cole. Parallel Software Designs, chapter 1, pages 1-25. In Kronsjö and Shumsheruddin [19], 1992.
-
(1992)
Parallel Software Designs
-
-
Cole, M.1
-
10
-
-
0030287932
-
LogP: A practical model of parallel computation
-
Nov.
-
D. E. Culler, R. M. Karp, D. Patterson, A. Sahay, E. E. Santos, K. E. Schauser, R. Subramonian, and T. von Eicken. LogP: A practical model of parallel computation. Commun. ACM, 39(11):78-85, Nov. 1996. http://doi.acm.org/10.1145/240455.240477.
-
(1996)
Commun. ACM
, vol.39
, Issue.11
, pp. 78-85
-
-
Culler, D.E.1
Karp, R.M.2
Patterson, D.3
Sahay, A.4
Santos, E.E.5
Schauser, K.E.6
Subramonian, R.7
Von Eicken, T.8
-
11
-
-
0034224207
-
Applying recursion to serial and parallel QR factorization leads to better performance
-
July
-
E. Elmroth and F. Gustavson. Applying recursion to serial and parallel QR factorization leads to better performance. IBM J. Res. Develop., 44(4):605-624, July 2000. http://www.research.ibm.com/journal/rd/444/elmroth.html
-
(2000)
IBM J. Res. Develop.
, vol.44
, Issue.4
, pp. 605-624
-
-
Elmroth, E.1
Gustavson, F.2
-
12
-
-
0037997839
-
Matrix factorization using a block-recursive structure and block-recursive algorithms
-
PhD thesis, Indiana University, Computer Science Department, Sept.
-
J. D. Frens. Matrix Factorization Using a Block-Recursive Structure and Block-Recursive Algorithms. PhD thesis, Indiana University, Computer Science Department, Sept. 2002. Available as Technical Report # 568. http://cs.indiana.edu/Research/techreport/TR588.shtml.
-
(2002)
Technical Report # 568
, vol.568
-
-
Frens, J.D.1
-
13
-
-
0030688479
-
Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code
-
ACM Press, New York, June
-
J. D. Frens and D. S. Wise. Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code. In Proc. 6th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program., pages 206-216. ACM Press, New York, June 1997. http://doi.acm.org/10.1145/263764.263789
-
(1997)
Proc. 6th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program
, pp. 206-216
-
-
Frens, J.D.1
Wise, D.S.2
-
14
-
-
0033350255
-
Cache-oblivious algorithms
-
IEEE Computer Society, Los Alamitos, CA, Oct.
-
M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms. In Proc. 40th Symp. on Foundations of Computer Science, pages 285-298. IEEE Computer Society, Los Alamitos, CA, Oct. 1999. http://www.computer.org/proceedings/focs/0409/04090285abs.htm
-
(1999)
Proc. 40th Symp. on Foundations of Computer Science
, pp. 285-298
-
-
Frigo, M.1
Leiserson, C.E.2
Prokop, H.3
Ramachandran, S.4
-
15
-
-
0004236492
-
-
The Johns Hopkins University Press, Baltimore, third edition
-
G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, third edition, 1996.
-
(1996)
Matrix Computations
-
-
Golub, G.H.1
Van Loan, C.F.2
-
17
-
-
0037997838
-
LogGPS: A parallel computational model for synchronization analysis
-
ACM Press, New York, June
-
F. Ino, N. Fujimoto, and K. Hagihara. LogGPS: A parallel computational model for synchronization analysis. In Proc. 8th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program., pages 133-142. ACM Press, New York, June 2001. http://doi.acm.org/10.1145/379539.379592.
-
(2001)
Proc. 8th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program
, pp. 133-142
-
-
Ino, F.1
Fujimoto, N.2
Hagihara, K.3
-
18
-
-
0003657590
-
-
Addison Wesley Longman, Boston, New York, third edition
-
D. E. Knuth. The Art of Computer Programming: Fundamental Algorithms, volume 1 of The Art of Computer Programming. Addison Wesley Longman, Boston, New York, third edition, 1997.
-
(1997)
The Art of Computer Programming: Fundamental Algorithms, Volume 1 of The Art of Computer Programming
, vol.1
-
-
Knuth, D.E.1
-
21
-
-
84958744342
-
Recursion unrolling for divide and conquer programs
-
S. Midkiff, J. Moreira, M. Gupta, S. Chatterjee, J. Ferrante, J. Prins, W. Pugh, and C.-W. Tseng, editors. Springer, Berlin
-
R. Rugina and M. Rinard. Recursion unrolling for divide and conquer programs. In S. Midkiff, J. Moreira, M. Gupta, S. Chatterjee, J. Ferrante, J. Prins, W. Pugh, and C.-W. Tseng, editors, Languages and Compilers for Parallel Computing, volume 2017, pages 34-48. Springer, Berlin, 2001. http://link.springer.de/link/service/series/0558/bibs/2017/20170034.htm.
-
(2001)
Languages and Compilers for Parallel Computing
, vol.2017
, pp. 34-48
-
-
Rugina, R.1
Rinard, M.2
-
22
-
-
0037997837
-
R8000 microprocessor chip set
-
Silicon Graphics, Inc.
-
Silicon Graphics, Inc. R8000 microprocessor chip set. Technical report, Silicon Graphics, Inc., 1994.
-
(1994)
Technical Report
-
-
-
24
-
-
0037997834
-
Undulant block elimination and integer-preserving matrix inversion
-
Jan.
-
D. S. Wise. Undulant block elimination and integer-preserving matrix inversion. Sci. Comp. Program., 33(1):29-85, Jan. 1999. http://www.cs.indiana.edu/ftp/techreports/TR418.html.
-
(1999)
Sci. Comp. Program.
, vol.33
, Issue.1
, pp. 29-85
-
-
Wise, D.S.1
-
25
-
-
84937431996
-
Ahnentafel indexing into Morton-ordered arrays, or matrix locality for free
-
A. Bode, T. Ludwig, W. Karl, and R. Wismüller, editors, Euro-Par 2000 - Parallel Processing, Springer, Heidelberg
-
D. S. Wise. Ahnentafel indexing into Morton-ordered arrays, or matrix locality for free. In A. Bode, T. Ludwig, W. Karl, and R. Wismüller, editors, Euro-Par 2000 - Parallel Processing, volume 1900 of Lecture Notes in Computer Science, pages 774-883. Springer, Heidelberg, 2000. http://link.springer.de/link/service/series/0558/bibs/1900/19000774.htm
-
(2000)
Lecture Notes in Computer Science
, vol.1900
, pp. 774-883
-
-
Wise, D.S.1
-
26
-
-
0034819362
-
Language support for Morton-order matrices
-
ACM Press, New York, June
-
D. S. Wise, J. D. Frens, Y. Gu, and G. A. Alexander. Language support for Morton-order matrices. In Proc. 8th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program., pages 24-33. ACM Press, New York, June 2001. http://doi.acm.org/10.1145/379539.379559
-
(2001)
Proc. 8th ACM SIGPLAN Symp. on Principles and Practice of Parallel Program
, pp. 24-33
-
-
Wise, D.S.1
Frens, J.D.2
Gu, Y.3
Alexander, G.A.4