-
1
-
-
35648995516
-
The landscape of parallel computing research: A view from Berkeley
-
Dec
-
K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands, K. Keutzer, D. A. Patterson, W. L. Plishker, J. Shalf, S. W. Williams, and K. A. Yelick. The landscape of parallel computing research: A view from Berkeley. Technical Report UCB/EECS-2006-183, Dec 2006.
-
(2006)
Technical Report UCB/EECS-2006-183
-
-
Asanovic, K.1
Bodik, R.2
Catanzaro, B.C.3
Gebis, J.J.4
Husbands, P.5
Keutzer, K.6
Patterson, D.A.7
Plishker, W.L.8
Shalf, J.9
Williams, S.W.10
Yelick, K.A.11
-
2
-
-
33745125067
-
On the architectural requirements for efficient execution of graph algorithms
-
Oslo, Norway Jun
-
D. A. Bader and G. Cong. On the architectural requirements for efficient execution of graph algorithms. In: Proc. of Int'l Conf. on Parallel Processing, pages 547-556, Oslo, Norway, Jun 2005.
-
(2005)
Proc. of Int'l Conf. on Parallel Processing
, pp. 547-556
-
-
Bader, D.A.1
Cong, G.2
-
4
-
-
0027541302
-
Automatic program parallelization
-
U. Banerjee, R. Eigenmann, A. Nicolau, and D. A. Padua. Automatic program parallelization. In: Proc. of the IEEE, 81(2):211-243, 1993.
-
(1993)
Proc. of the IEEE
, vol.81
, Issue.2
, pp. 211-243
-
-
Banerjee, U.1
Eigenmann, R.2
Nicolau, A.3
Padua, D.A.4
-
5
-
-
78650822594
-
Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method
-
A. Chandramowlishwaran, K. Madduri, and R. Vuduc. Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method. In: Proc. of 2010 ACM/IEEE Int'l Conf. on High Performance Computing, Networking, Storage and Analysis, New Orleans, LA, USA, Nov 2010.
-
(2010)
ACM/IEEE Int'l Conf. on High Performance Computing, Networking, Storage and Analysis, New Orleans, LA, USA, Nov 2010
-
-
Chandramowlishwaran, A.1
Madduri, K.2
Vuduc, R.3
-
6
-
-
84966560559
-
Parallel wavelet transform for large scale image processing
-
Ft. Lauderdale, FL, USA Apr
-
D. Chaver, M. Prieto, L. Pinuel, and F. Tirado. Parallel wavelet transform for large scale image processing. In: Proc. of the Int'l Parallel and Distributed Processing Symposium (IPDPS), pages 4-9, Ft. Lauderdale, FL, USA, Apr 2002.
-
(2002)
Proc. of the Int'l Parallel and Distributed Processing Symposium (IPDPS)
, pp. 4-9
-
-
Chaver, D.1
Prieto, M.2
Pinuel, L.3
Tirado, F.4
-
7
-
-
84945318819
-
2-D wavelet transform enhancement on general-purpose microprocessors: Memory hierarchy and SIMD parallelism exploitation
-
Bangalore, India Dec
-
D. Chaver, C. Tenllado, L. Piñuel, M. Prieto, and F. Tirado. 2-D Wavelet Transform Enhancement on General-Purpose Microprocessors: Memory Hierarchy and SIMD Parallelism Exploitation. In: Proc. of Int'l Conf. on High Performance Computing, LNCS 2552, pages 9-21, Bangalore, India, Dec 2002.
-
(2002)
Proc. of Int'l Conf. on High Performance Computing, LNCS 2552
, pp. 9-21
-
-
Chaver, D.1
Tenllado, C.2
Piñuel, L.3
Prieto, M.4
Tirado, F.5
-
8
-
-
70449975572
-
Understanding the design trade-offs among current multicore systems for numerical computations
-
May
-
S. Kang, D. A. Bader, and R. Vuduc. Understanding the design trade-offs among current multicore systems for numerical computations. In: Proc. of Int'l Symp. on Parallel and Distributed Processing, Rome, Italy, May 2009.
-
(2009)
Proc. of Int'l Symp. on Parallel and Distributed Processing, Rome, Italy
-
-
Kang, S.1
Bader, D.A.2
Vuduc, R.3
-
9
-
-
73649141632
-
A NUMA API for Linux
-
Apr
-
A. Kleen. A NUMA API for Linux. Technical Report, Apr 2005.
-
(2005)
Technical Report
-
-
Kleen, A.1
-
10
-
-
60649099576
-
Optimizing matrix multiplication for a short-vector SIMD architecture-CELL processor
-
J. Kurzak, W. Alvaro, and J. Dongarra. Optimizing matrix multiplication for a short-vector SIMD architecture-CELL processor. In: Parallel Computing, 35(3):138-150, 2009.
-
(2009)
Parallel Computing
, vol.35
, Issue.3
, pp. 138-150
-
-
Kurzak, J.1
Alvaro, W.2
Dongarra, J.3
-
11
-
-
33947328378
-
Performance, power efficiency and scalability of asymmetric cluster chip multiprocessors
-
T. Y. Morad, U. C. Weiser, A. Kolodny, M. Valero, and E. Ayguade. Performance, power efficiency and scalability of asymmetric cluster chip multiprocessors. In: Computer Architecture Letters, 5(1):14-17, 2006.
-
(2006)
Computer Architecture Letters
, vol.5
, Issue.1
, pp. 14-17
-
-
Morad, T.Y.1
Weiser, U.C.2
Kolodny, A.3
Valero, M.4
Ayguade, E.5
-
12
-
-
56749158843
-
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
-
Nov
-
S. Williams, L. Oliker, R. Vuduc, J. Shalf, K. Yelick, and J. Demmel. Optimization of sparse matrix-vector multiplication on emerging multicore platforms. In: Proc. of Int'l Conf. on Supercomputing, Reno, NV, Nov 2007.
-
(2007)
Proc. of Int'l Conf. on Supercomputing, Reno, NV
-
-
Williams, S.1
Oliker, L.2
Vuduc, R.3
Shalf, J.4
Yelick, K.5
Demmel, J.6
-
13
-
-
79551702326
-
Advanced MRI reconstruction toolbox with accelerating on GPU
-
Jan
-
X.-L. Wu, Y. Zhuo, J. Gai, F. Lam, M. Fu, J. P. Haldar, W.-M. Hwu, Z.-P. Liang, and B. P. Sutton. Advanced MRI reconstruction toolbox with accelerating on GPU. In: Proc. of Conf. on Parallel Processing for Imaging Applications, San Francisco, CA, Jan 2011.
-
(2011)
Proc. of Conf. on Parallel Processing for Imaging Applications, San Francisco, CA
-
-
Wu, X.-L.1
Zhuo, Y.2
Gai, J.3
Lam, F.4
Fu, M.5
Haldar, J.P.6
Hwu, W.-M.7
Liang, Z.-P.8
Sutton, B.P.9