-
1
-
-
4644295630
-
Evaluating the imagine stream architecture
-
AHN, J. H., DALLY, W. J., KHAILANY, B., KAPASI, U. J., AND DAS, A. 2004. Evaluating the imagine stream architecture. In Proceedings of the 31st Annual International Symposium on Computer Architecture, Munich, Germany.
-
(2004)
Proceedings of the 31st Annual International Symposium on Computer Architecture, Munich, Germany
-
-
Ahn, J.H.1
Dally, W.J.2
Khailany, B.3
Kapasi, U.J.4
Das, A.5
-
2
-
-
0242533311
-
Sparse matrix solvers on the gpu: Conjugate gradients and multigrid
-
BOLZ, J., FARMER, I., GRINSPUN, E., AND SCHRÖDER, P. 2003. Sparse matrix solvers on the gpu: conjugate gradients and multigrid. ACM Trans. Graph. 22, 3, 917-924.
-
(2003)
ACM Trans. Graph.
, vol.22
, Issue.3
, pp. 917-924
-
-
Bolz, J.1
Farmer, I.2
Grinspun, E.3
Schröder, P.4
-
3
-
-
10644248153
-
Brook for gpus: Stream computing on graphics hardware
-
BUCK, I., FOLEY, T., HORN, D., SUGERMAN, J., FATAHALIAN, K., HOUSTON, M., AND HANRAHAN, P. 2004. Brook for gpus: stream computing on graphics hardware. ACM Trans. Graph. 23, 3, 777-786.
-
(2004)
ACM Trans. Graph.
, vol.23
, Issue.3
, pp. 777-786
-
-
Buck, I.1
Foley, T.2
Horn, D.3
Sugerman, J.4
Fatahalian, K.5
Houston, M.6
Hanrahan, P.7
-
4
-
-
0032659795
-
Recursive array layouts and fast parallel matrix multiplication
-
CHATTERJEE, S., LEBECK, A. R., PATNALA, P. K., AND THOTTETHODI, M. 1999. Recursive array layouts and fast parallel matrix multiplication. In ACM Symposium on Parallel Algorithms and Architectures, 222-231.
-
(1999)
ACM Symposium on Parallel Algorithms and Architectures
, pp. 222-231
-
-
Chatterjee, S.1
Lebeck, A.R.2
Patnala, P.K.3
Thottethodi, M.4
-
5
-
-
10444232320
-
Merrimac: Supercomputing with streams
-
DALLY, W. J., HANRAHAN, P., EREZ, M., KNIGHT, T. J., LABONTE, F., AHN, J.-H., JAYASENA, N., KAPASI, U. J., DAS, A., GUMMARAJU, J., AND BUCK, I. 2003. Merrimac: Supercomputing with streams. In SC'03.
-
(2003)
SC'03
-
-
Dally, W.J.1
Hanrahan, P.2
Erez, M.3
Knight, T.J.4
Labonte, F.5
Ahn, J.-H.6
Jayasena, N.7
Kapasi, U.J.8
Das, A.9
Gummaraju, J.10
Buck, I.11
-
7
-
-
0003851784
-
-
SIAM
-
DONGARRA, J. J., DUFF, I. S., SORENSEN, D. C., AND VAN DER VORST, H. A. 1998. Numerical Linear Algebra for High-Performance Computers. SIAM.
-
(1998)
Numerical Linear Algebra for High-performance Computers
-
-
Dongarra, J.J.1
Duff, I.S.2
Sorensen, D.C.3
Van Der Vorst, H.A.4
-
8
-
-
0042674307
-
The LINPACK benchmark: Past, present, and future
-
DONGARRA, J. J., LUSZCZEK, P., AND PETITET, A. 2003. The LINPACK benchmark: Past, present, and future. Concurrency and Computation: Practice and Experience 15, 1-18.
-
(2003)
Concurrency and Computation: Practice and Experience
, vol.15
, pp. 1-18
-
-
Dongarra, J.J.1
Luszczek, P.2
Petitet, A.3
-
9
-
-
84934343786
-
Analysis and performance results of a molecular modeling application on merrimac
-
EREZ, M., AHN, J., GARG, A., DALLY, W. J., AND DARVE, E. 2004. Analysis and Performance Results of a Molecular Modeling Application on Merrimac. In SC'04.
-
(2004)
SC'04
-
-
Erez, M.1
Ahn, J.2
Garg, A.3
Dally, W.J.4
Darve, E.5
-
10
-
-
23944462603
-
Gpu cluster for high performance computing
-
FAN, Z., QIU, F., KAUFMAN, A., AND YOAKUM-STOVER, S. 2004. Gpu cluster for high performance computing. In ACM/IEEE Supercomputing Conference 2004.
-
(2004)
ACM/IEEE Supercomputing Conference 2004
-
-
Fan, Z.1
Qiu, F.2
Kaufman, A.3
Yoakum-Stover, S.4
-
12
-
-
33845440618
-
-
Tech. rep., University of Dortmund, Germany
-
GÖDDEKE, D. 2005. Gpgpu performance tuning. Tech. rep., University of Dortmund, Germany. http://www.mathematik.uni-dortmund.de/~goeddeke/gpgpu/.
-
(2005)
Gpgpu Performance Tuning
-
-
Göddeke, D.1
-
13
-
-
84860038197
-
A GPU benchmarking suite
-
GPUBENCH, Los Angeles
-
GPUBENCH, 2004. A GPU benchmarking suite. GP2 Workshop, Los Angeles. Available online: http://graphics.stanford.edu/projects/gpubench/.
-
(2004)
GP2 Workshop
-
-
-
14
-
-
2342641297
-
-
Addison Wesley
-
GRAMA, A., GUPTA, A., KARYPIS, G., AND KUMAR, V. 2003. Introduction to Parallel Computing (2nd ed.). Addison Wesley.
-
(2003)
Introduction to Parallel Computing (2nd Ed.)
-
-
Grama, A.1
Gupta, A.2
Karypis, G.3
Kumar, V.4
-
16
-
-
10644280791
-
-
Technical Report UIUCDCS-R-2003-2328, University of Illinois at Urbana-Champaign
-
HALL, J. D., CARR, N., AND HART, J. 2003. Cache and bandwidth aware matrix multiplication on the gpu. Technical Report UIUCDCS-R-2003-2328, University of Illinois at Urbana-Champaign.
-
(2003)
Cache and Bandwidth Aware Matrix Multiplication on the Gpu
-
-
Hall, J.D.1
Carr, N.2
Hart, J.3
-
18
-
-
78651284090
-
Simulation of cloud dynamics on graphics hardware
-
Eurographics Association, Aire-la-Ville, Switzerland, Switzerland
-
HARRIS, M. J., BAXTER, W. V., SCHEUERMANN, T., AND LASTRA, A. 2003. Simulation of cloud dynamics on graphics hardware. In HWWS '03: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware, Eurographics Association, Aire-la-Ville, Switzerland, Switzerland, 92-101.
-
(2003)
HWWS '03: Proceedings of the ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware
, pp. 92-101
-
-
Harris, M.J.1
Baxter, W.V.2
Scheuermann, T.3
Lastra, A.4
-
20
-
-
0242533310
-
Linear algebra operators for gpu implementation of numerical algorithms
-
KRÜGER, J., AND WESTERMANN, R. 2003. Linear algebra operators for gpu implementation of numerical algorithms. ACM Trans. Graph. 22, 3, 908-916.
-
(2003)
ACM Trans. Graph.
, vol.22
, Issue.3
, pp. 908-916
-
-
Krüger, J.1
Westermann, R.2
-
23
-
-
10644238428
-
Shader algebra
-
MCCOOL, M., TOIT, S. D., POPA, T., CHAN, B., AND MOULE, K. 2004. Shader algebra. ACM Trans. Graph. 23, 3, 787-795.
-
(2004)
ACM Trans. Graph.
, vol.23
, Issue.3
, pp. 787-795
-
-
McCool, M.1
Toit, S.D.2
Popa, T.3
Chan, B.4
Moule, K.5
-
25
-
-
84860039921
-
-
Technical report
-
MCLEOD, I., AND YU, H. 2002. Timing comparisons of mathematica, matlab, r, s-plus, c & fortran. Technical report. Available online: http://fisher.stats.uwo.ca/faculty/aim/epubs/MatrixInverseTiming/default.htm.
-
(2002)
Timing Comparisons of Mathematica, Matlab, R, S-plus, C & Fortran
-
-
McLeod, I.1
Yu, H.2
-
26
-
-
33845385759
-
Closure models for the computation of dilute bubbly flows
-
Forschungszentrum Karlsruhe, April
-
MITRAN, S. 2000. Closure models for the computation of dilute bubbly flows. Wissenschaftliche Berichte FZKA 6357, Forschungszentrum Karlsruhe, April.
-
(2000)
Wissenschaftliche Berichte FZKA
, vol.6357
-
-
Mitran, S.1
-
28
-
-
84934325826
-
Scientific computations on modern parallel vector systems
-
OLIKER, L., CANNING, A., CARTER, J., AND SHALF, J. 2004. Scientific computations on modern parallel vector systems. In Supercomputing 2004.
-
(2004)
Supercomputing 2004
-
-
Oliker, L.1
Canning, A.2
Carter, J.3
Shalf, J.4
-
29
-
-
0029754520
-
Parallel algorithm and architecture for two-step division-free gaussian elimination
-
PENG, S., SEDUKHIN, S., AND SEDUKHIN, I. 1996. Parallel algorithm and architecture for two-step division-free gaussian elimination. In 1996 International Conference on Application-Specific Systems, Architectures and Processors (ASAP'96).
-
(1996)
1996 International Conference on Application-Specific Systems, Architectures and Processors (ASAP'96)
-
-
Peng, S.1
Sedukhin, S.2
Sedukhin, I.3
-
31
-
-
0030127191
-
Kinetic theory for bubbly flow i: Collisionless case
-
RUSSO, G., AND SMEREKA, P. 1996. Kinetic theory for bubbly flow i: Collisionless case. SIAM J. Appl. Math. 56, 2, 327-357.
-
(1996)
SIAM J. Appl. Math.
, vol.56
, Issue.2
, pp. 327-357
-
-
Russo, G.1
Smereka, P.2
-
32
-
-
0030127881
-
Kinetic theory for bubbly flow II: Fluid dynamic limit
-
RUSSO, G., AND SMEREKA, P. 1996. Kinetic theory for bubbly flow II: Fluid dynamic limit. SIAM J. Appl. Math. 56, 2, 358-371.
-
(1996)
SIAM J. Appl. Math.
, vol.56
, Issue.2
, pp. 358-371
-
-
Russo, G.1
Smereka, P.2
-
33
-
-
0038345686
-
A performance analysis of pim, stream processing, and tiled processing on memory-intensive signal processing kernels
-
SUH, J., KIM, E.-G., CRAGO, S. P., SRINIVASAN, L., AND FRENCH, M. C. 2003. A performance analysis of pim, stream processing, and tiled processing on memory-intensive signal processing kernels. In Proceedings of the International Symposium on Computer Architecture.
-
(2003)
Proceedings of the International Symposium on Computer Architecture
-
-
Suh, J.1
Kim, E.-G.2
Crago, S.P.3
Srinivasan, L.4
French, M.C.5
-
34
-
-
0036505033
-
The raw microprocessor: A computational fabric for software circuits and general purpose programs
-
TAYLOR, M. B., KIM, J., MILLER, J., WENTZLAFF, D., GHODRAT, F., GREENWALD, B., HOFFMANN, H., JOHNSON, P., LEE, J.-W., LEE, W., MA, A., SARAF, A., SENESKI, M., SHNIDMAN, N., FRANK, V. S. M., AMARASINGHE, S., AND AGARWAL, A. 2002. The raw microprocessor: A computational fabric for software circuits and general purpose programs. IEEE Micro.
-
(2002)
IEEE Micro
-
-
Taylor, M.B.1
Kim, J.2
Miller, J.3
Wentzlaff, D.4
Ghodrat, F.5
Greenwald, B.6
Hoffmann, H.7
Johnson, P.8
Lee, J.-W.9
Lee, W.10
Ma, A.11
Saraf, A.12
Seneski, M.13
Shnidman, N.14
Frank, V.S.M.15
Amarasinghe, S.16
Agarwal, A.17
-
36
-
-
0343462141
-
Automated empirical optimization of software and the ATLAS project
-
WHALEY, R. C., PETITET, A., AND DONGARRA, J. J. 2001. Automated empirical optimization of software and the ATLAS project. Parallel Computing 27, 1-2, 3-35.
-
(2001)
Parallel Computing
, vol.27
, Issue.1-2
, pp. 3-35
-
-
Whaley, R.C.1
Petitet, A.2
Dongarra, J.J.3