-
1
-
-
0003706460
-
-
SIAM, Philadelphia
-
Anderson, E., Bai, Z., Demmel, J., Dongarra, J.E., DuCroz, J., Greenbaum, A., Hammarling, S., McKenney, A.E., Ostrouchov, S., Sorensen, D.: LAPACK Users' Guide. SIAM, Philadelphia (1992)
-
(1992)
LAPACK Users' Guide
-
-
Anderson, E.1
Bai, Z.2
Demmel, J.3
Dongarra, J.E.4
DuCroz, J.5
Greenbaum, A.6
Hammarling, S.7
McKenney, A.E.8
Ostrouchov, S.9
Sorensen, D.10
-
2
-
-
77951980969
-
A proposal to extend the OpenMP tasking model for heterogeneous architectures
-
Evolving OpenMP in an Age of Extreme Parallelism. 5th International Workshop on OpenMP, IWOMP, Dresden, Germany, Springer, Heidelberg
-
Ayguade, E., Badia, R.M., Cabrera, D., Duran, A., Gonzalez, M., Igual, F.D., Jimenez, D., Labarta, J., Martorell, X., Mayo, R., Perez, J.M., Quintana-Ortí, E.S.: A proposal to extend the OpenMP tasking model for heterogeneous architectures. In: Evolving OpenMP in an Age of Extreme Parallelism. 5th International Workshop on OpenMP, IWOMP 2009, Dresden, Germany. LNCS. Springer, Heidelberg (2009)
-
(2009)
LNCS
-
-
Ayguade, E.1
Badia, R.M.2
Cabrera, D.3
Duran, A.4
Gonzalez, M.5
Igual, F.D.6
Jimenez, D.7
Labarta, J.8
Martorell, X.9
Mayo, R.10
Perez, J.M.11
Quintana-Ortí, E.S.12
-
3
-
-
51849144655
-
-
Barrachina, S., Castillo, M., Igual, F.D., Mayo, R., Quintana-Ortí, E.S.: Solving dense linear systems on graphics processors. In: Luque, E., Margalef, T., Benítez, D. (eds.) Euro-Par 2008. LNCS, vol. 5168, pp. 739-748. Springer, Heidelberg (2008)
-
Barrachina, S., Castillo, M., Igual, F.D., Mayo, R., Quintana-Ortí, E.S.: Solving dense linear systems on graphics processors. In: Luque, E., Margalef, T., Benítez, D. (eds.) Euro-Par 2008. LNCS, vol. 5168, pp. 739-748. Springer, Heidelberg (2008)
-
-
-
-
4
-
-
34548265764
-
CellSs: A programming model for the Cell BE architecture
-
ACM Press, New York
-
Bellens, P., Pérez, J.M., Badia, R.M., Labarta, J.: CellSs: a programming model for the Cell BE architecture. In: SC 2006: Proceedings of the 2006 ACM/IEEE conference on Supercomputing, p. 86. ACM Press, New York (2006)
-
(2006)
SC 2006: Proceedings of the 2006 ACM/IEEE conference on Supercomputing
, p. 86
-
-
Bellens, P.1
Pérez, J.M.2
Badia, R.M.3
Labarta, J.4
-
5
-
-
0036870763
-
Recursive array layouts and fast matrix multiplication
-
Chatterjee, S., Lebeck, A.R., Patnala, P.K., Thottethodi, M.: Recursive array layouts and fast matrix multiplication. IEEE Trans. on Parallel and Distributed Systems 13(11), 1105-1123 (2002)
-
(2002)
IEEE Trans. on Parallel and Distributed Systems
, vol.13
, Issue.11
, pp. 1105-1123
-
-
Chatterjee, S.1
Lebeck, A.R.2
Patnala, P.K.3
Thottethodi, M.4
-
6
-
-
0025402476
-
A set of level 3 basic linear algebra subprograms
-
Dongarra, J., Du Croz, J., Hammarling, S., Duff, I.: A set of level 3 basic linear algebra subprograms. ACM Trans. Math. Soft. 16(1), 1-17 (1990)
-
(1990)
ACM Trans. Math. Soft
, vol.16
, Issue.1
, pp. 1-17
-
-
Dongarra, J.1
Du Croz, J.2
Hammarling, S.3
Duff, I.4
-
7
-
-
67650081010
-
OpenMP to GPGPU: A compiler framework for automatic translation and optimization
-
ACM Press, New York
-
Lee, S., Min, S.-J., Eigenmann, R.: OpenMP to GPGPU: a compiler framework for automatic translation and optimization. In: PPoPP 2009: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, pp. 101-110. ACM Press, New York (2009)
-
(2009)
PPoPP 2009: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
, pp. 101-110
-
-
Lee, S.1
Min, S.-J.2
Eigenmann, R.3
-
8
-
-
70350636063
-
-
NVIDIA. NVIDIA CUDA Programming Guide 2.2 (2008)
-
NVIDIA. NVIDIA CUDA Programming Guide 2.2 (2008)
-
-
-
-
9
-
-
0042235298
-
Tiling, block data layout, and memory hierarchy performance
-
Park, N., Hong, B., Prasanna, V.K.: Tiling, block data layout, and memory hierarchy performance. IEEE Trans. on Parallel and Distributed Systems 14(7), 640-654 (2003)
-
(2003)
IEEE Trans. on Parallel and Distributed Systems
, vol.14
, Issue.7
, pp. 640-654
-
-
Park, N.1
Hong, B.2
Prasanna, V.K.3
-
10
-
-
35649006026
-
CellSs: Making it easier to program the Cell Broadband Engine processor
-
August
-
Perez, J.M., Bellens, P., Badia, R.M., Labarta, J.: CellSs: Making it easier to program the Cell Broadband Engine processor. IBM Journal of Research and Development 51(5) (August 2007)
-
(2007)
IBM Journal of Research and Development
, vol.51
, Issue.5
-
-
Perez, J.M.1
Bellens, P.2
Badia, R.M.3
Labarta, J.4
-
11
-
-
34548265331
-
Scalar-aware grid superscalar. DAC TR UPC-DAC-RR-CAP-2006-12
-
Technical report, Universitat Politècnica de Catalunya, Computer Architecture Department
-
Perez, J.M., Badia, R.M., Labarta, J.: Scalar-aware grid superscalar. DAC TR UPC-DAC-RR-CAP-2006-12. Technical report, Universitat Politècnica de Catalunya, Computer Architecture Department (2006)
-
(2006)
-
-
Perez, J.M.1
Badia, R.M.2
Labarta, J.3
-
12
-
-
70350666900
-
A flexible and portable programming model for SMP and multi-cores
-
Technical Report 03/2007, Barcelona Supercomputing Center, CNS, Barcelona, Spain
-
Pérez, J.M., Badia, R.M., Labarta, J.: A flexible and portable programming model for SMP and multi-cores. Technical Report 03/2007, Barcelona Supercomputing Center - CNS, Barcelona, Spain (2007)
-
(2007)
-
-
Pérez, J.M.1
Badia, R.M.2
Labarta, J.3
-
13
-
-
67650021816
-
Solving dense linear systems on platforms with multiple hardware accelerators
-
ACM, New York
-
Quintana-Ortí, G., Igual, F.D., Quintana-Ortí, E.S., van de Geijn, R.A.: Solving dense linear systems on platforms with multiple hardware accelerators. In: PPoPP 2009: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, pp. 121-130. ACM, New York (2009)
-
(2009)
PPoPP 2009: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
, pp. 121-130
-
-
Quintana-Ortí, G.1
Igual, F.D.2
Quintana-Ortí, E.S.3
van de Geijn, R.A.4