-
1
-
-
77954080759
-
Dense linear algebra solvers for multicore with GPU accelerators
-
S. Tomov, R. Nath, H. Ltaief, and J. Dongarra, "Dense linear algebra solvers for multicore with GPU accelerators," in Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on, 2010, pp. 1-8.
-
(2010)
Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on
, pp. 1-8
-
-
Tomov, S.1
Nath, R.2
Ltaief, H.3
Dongarra, J.4
-
2
-
-
78650883751
-
Interlacing bypass rings to torus networks for more efficient networks
-
P. Zhang, R. Powell, and Y. Deng, "Interlacing Bypass Rings to Torus Networks for More Efficient Networks," Parallel and Distributed Systems, IEEE Transactions on, vol. 22, pp. 287-295, 2011.
-
(2011)
Parallel and Distributed Systems, IEEE Transactions on
, vol.22
, pp. 287-295
-
-
Zhang, P.1
Powell, R.2
Deng, Y.3
-
4
-
-
84881127643
-
Sparse matrix-vector multiplication on NVIDIA GPU
-
H. Liu, S. Yu, Z. Chen, B. Hsieh, and L. Shao, "Sparse matrix-vector multiplication on NVIDIA GPU," Int. J. Numer. Anal. Model, vol. 3, pp. 185-191, 2012.
-
(2012)
Int. J. Numer. Anal. Model
, vol.3
, pp. 185-191
-
-
Liu, H.1
Yu, S.2
Chen, Z.3
Hsieh, B.4
Shao, L.5
-
5
-
-
79951616240
-
Revealing feasibility of FMM on ASIC: Efficient implementation of N-Body problem on FPGA
-
Z. Zheng, Y. Zhu, X. Wang, Z. Que, T. Huang, X. Yin, H. Wang, G. Rong, and M. Qiu, "Revealing feasibility of FMM on ASIC: Efficient implementation of N-Body problem on FPGA," in Computational Science and Engineering (CSE), 2010 IEEE 13th International Conference on, 2010, pp. 132-139.
-
(2010)
Computational Science and Engineering (CSE), 2010 IEEE 13th International Conference on
, pp. 132-139
-
-
Zheng, Z.1
Zhu, Y.2
Wang, X.3
Que, Z.4
Huang, T.5
Yin, X.6
Wang, H.7
Rong, G.8
Qiu, M.9
-
6
-
-
84969913998
-
BLAS for GPUs
-
J. Kurzak, D. A. Bader, and J. Dongarra, Eds. Boca Raton, FL: CRC Press
-
R. Nath, S. Tomov, and J. Dongarra, "BLAS for GPUs," in Scientific Computing with Multicore and Accelerators, J. Kurzak, D. A. Bader, and J. Dongarra, Eds. Boca Raton, FL: CRC Press, 2010.
-
(2010)
Scientific Computing with Multicore and Accelerators
-
-
Nath, R.1
Tomov, S.2
Dongarra, J.3
-
7
-
-
84928141293
-
GPU-based parallel reservoir simulation for large-scale simulation problems
-
S. Yu, H. Liu, Z. J. Chen, B. Hsieh, and L. Shao, "GPU-based parallel reservoir simulation for large-scale simulation problems," in SPE Europec/EAGE Annual Conference, 2012.
-
(2012)
SPE Europec/EAGE Annual Conference
-
-
Yu, S.1
Liu, H.2
Chen, Z.J.3
Hsieh, B.4
Shao, L.5
-
8
-
-
84983744589
-
Matrix multiplication on high-density multi-GPU architectures: Theoretical and experimental investigations
-
J. M. Kunkel and T. Ludwig, Eds. Springer International Publishing
-
P. Zhang and Y. Gao, "Matrix Multiplication on High-Density Multi-GPU Architectures: Theoretical and Experimental Investigations," in High Performance Computing, vol. 9137, J. M. Kunkel and T. Ludwig, Eds. Springer International Publishing, 2015, pp. 17-30.
-
(2015)
High Performance Computing
, vol.9137
, pp. 17-30
-
-
Zhang, P.1
Gao, Y.2
-
9
-
-
79951765394
-
Data-aware task scheduling on multi-accelerator based platforms
-
C. Augonnet, J. Clet-Ortega, S. Thibault, and R. Namyst, "Data-Aware Task Scheduling on Multi-accelerator Based Platforms," in Parallel and Distributed Systems (ICPADS), 2010 IEEE 16th International Conference on, 2010, pp. 291-298.
-
(2010)
Parallel and Distributed Systems (ICPADS), 2010 IEEE 16th International Conference on
, pp. 291-298
-
-
Augonnet, C.1
Clet-Ortega, J.2
Thibault, S.3
Namyst, R.4
-
10
-
-
84909993282
-
Eigenanalysis-based task mapping on parallel computers with cellular networks
-
P. Zhang, Y. Gao, J. Fierson, and Y. Deng, "Eigenanalysis-Based Task Mapping on Parallel Computers with Cellular Networks," Mathematics of Computation, vol. 83, pp. 1727-1756, 2014.
-
(2014)
Mathematics of Computation
, vol.83
, pp. 1727-1756
-
-
Zhang, P.1
Gao, Y.2
Fierson, J.3
Deng, Y.4
-
11
-
-
70350641505
-
StarPU: A unified platform for task scheduling on heterogeneous multicore architectures
-
Springer
-
C. Augonnet, S. Thibault, R. Namyst, and P.-A. Wacrenier, "StarPU: A unified platform for task scheduling on heterogeneous multicore architectures," in Euro-Par 2009 Parallel Processing. Springer, 2009, pp. 863-874.
-
(2009)
Euro-Par 2009 Parallel Processing
, pp. 863-874
-
-
Augonnet, C.1
Thibault, S.2
Namyst, R.3
Wacrenier, P.-A.4
-
12
-
-
84655174868
-
DAGuE: A generic distributed DAG engine for high performance computing
-
G. Bosilca, A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier, and J. Dongarra, "DAGuE: A generic distributed DAG engine for high performance computing," Parallel Computing, vol. 38, pp. 37-51, 2012.
-
(2012)
Parallel Computing
, vol.38
, pp. 37-51
-
-
Bosilca, G.1
Bouteiller, A.2
Danalis, A.3
Herault, T.4
Lemarinier, P.5
Dongarra, J.6
-
13
-
-
79959734507
-
OmpSs: A proposal for programming heterogeneous multi-core architectures
-
A. Duran, E. Ayguadé, R. M. Badia, J. Labarta, L. Martinell, X. Martorell, and J. Planas, "OmpSs: A proposal for programming heterogeneous multi-core architectures," Parallel Processing Letters, vol. 21, pp. 173-193, 2011.
-
(2011)
Parallel Processing Letters
, vol.21
, pp. 173-193
-
-
Duran, A.1
Ayguadé, E.2
Badia, R.M.3
Labarta, J.4
Martinell, L.5
Martorell, X.6
Planas, J.7
-
14
-
-
0027593363
-
Parallax: A tool for parallel program scheduling
-
T. Lewis and H. El-Rewini, "Parallax: A Tool for Parallel Program Scheduling," IEEE Parallel Distrib. Technol., vol. 1, pp. 62-72, 1993.
-
(1993)
IEEE Parallel Distrib. Technol.
, vol.1
, pp. 62-72
-
-
Lewis, T.1
El-Rewini, H.2
-
15
-
-
84936100422
-
A data-driven paradigm for mapping problems
-
P. Zhang, L. Liu, and Y. Deng, "A data-driven paradigm for mapping problems," Parallel Computing, vol. 48, pp. 108-124, 2015.
-
(2015)
Parallel Computing
, vol.48
, pp. 108-124
-
-
Zhang, P.1
Liu, L.2
Deng, Y.3
-
16
-
-
84874887986
-
Thermal-aware task scheduling in 3D chip multiprocessor with real-time constrained workloads
-
J. Li, M. Qiu, J.-W. Niu, L. T. Yang, Y. Zhu, and Z. Ming, "Thermal-aware task scheduling in 3D chip multiprocessor with real-time constrained workloads," ACM Transactions on Embedded Computing Systems (TECS), vol. 12, p. 24, 2013.
-
(2013)
ACM Transactions on Embedded Computing Systems (TECS)
, vol.12
, p. 24
-
-
Li, J.1
Qiu, M.2
Niu, J.-W.3
Yang, L.T.4
Zhu, Y.5
Ming, Z.6
-
18
-
-
84862806683
-
Online optimization for scheduling preemptable tasks on IaaS cloud systems
-
J. Li, M. Qiu, Z. Ming, G. Quan, X. Qin, and Z. Gu, "Online optimization for scheduling preemptable tasks on IaaS cloud systems," Journal of Parallel and Distributed Computing, vol. 72, pp. 666-677, 2012.
-
(2012)
Journal of Parallel and Distributed Computing
, vol.72
, pp. 666-677
-
-
Li, J.1
Qiu, M.2
Ming, Z.3
Quan, G.4
Qin, X.5
Gu, Z.6
-
19
-
-
80155187624
-
Optimal data allocation for scratch-pad memory on embedded multi-core systems
-
Y. Guo, Q. Zhuge, J. Hu, M. Qiu, and E. H.-M. Sha, "Optimal data allocation for scratch-pad memory on embedded multi-core systems," in Parallel Processing (ICPP), 2011 International Conference on, 2011, pp. 464-471.
-
(2011)
Parallel Processing (ICPP), 2011 International Conference on
, pp. 464-471
-
-
Guo, Y.1
Zhuge, Q.2
Hu, J.3
Qiu, M.4
Sha, E.H.-M.5
-
20
-
-
77953997924
-
Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects
-
E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, and S. Tomov, "Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects," in Journal of Physics: Conference Series, 2009, p. 012037.
-
(2009)
Journal of Physics: Conference Series
, p. 012037
-
-
Agullo, E.1
Demmel, J.2
Dongarra, J.3
Hadri, B.4
Kurzak, J.5
Langou, J.6
Ltaief, H.7
Luszczek, P.8
Tomov, S.9
-
22
-
-
0036467470
-
PASTIX: A high-performance parallel direct solver for sparse symmetric positive definite systems
-
P. Hénon, P. Ramet, and J. Roman, "PASTIX: A high-performance parallel direct solver for sparse symmetric positive definite systems," Parallel Computing, vol. 28, pp. 301-321, 2002.
-
(2002)
Parallel Computing
, vol.28
, pp. 301-321
-
-
Hénon, P.1
Ramet, P.2
Roman, J.3