-
1
-
-
84862107202
-
Parallel and cache-efficient in-place matrix storage format conversion
-
F. Gustavson, L. Karlsson, and B. Kågström. Parallel and cache-efficient in-place matrix storage format conversion. ACM Transactions on Mathematical Software, 38(3):1-32, Apr. 2012. doi: 10.1145/2168773.2168775.
-
(2012) ACM Transactions on Mathematical Software, vol. 38, issue 3, pp. 1-32
-
-
Gustavson, F.
Karlsson, L.
Kågström, B.
-
2
-
-
84896857784
-
-
Intel. Intel MKL, 2013. URL http://software.intel.com/en-us/intel-mkl.
-
(2013)
-
-
-
4
-
-
85049613651
-
Tight bounds on the complexity of parallel sorting
-
T. Leighton. Tight bounds on the complexity of parallel sorting. In Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, STOC '84, pages 71-80, New York, NY, USA, 1984. ACM. doi: 10.1145/800057.808667.
-
(1984) Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, STOC '84, pp. 71-80
-
-
Leighton, T.
-
5
-
-
78651550268
-
Scalable parallel programming with CUDA
-
J. Nickolls, I. Buck, M. Garland, and K. Skadron. Scalable parallel programming with CUDA. ACM Queue, pages 40-53, Mar./Apr. 2008. doi: 10.1145/1365490.1365500.
-
(2008) ACM Queue, pp. 40-53
-
-
Nickolls, J.
Buck, I.
Garland, M.
Skadron, K.
-
6
-
-
84896808494
-
-
I.-J. Sung. Data layout transformation through in-place transposition. PhD thesis, University of Illinois, Department of Electrical and Computer Engineering, May 2013. URL http://hdl.handle.net/2142/44300.
-
(2013) Data Layout Transformation Through In-place Transposition
-
-
Sung, I.-J.
-
7
-
-
84870691946
-
DL: A data layout transformation system for heterogeneous computing
-
I.-J. Sung, G. D. Liu, and W.-M. W. Hwu. DL: A data layout transformation system for heterogeneous computing. In Innovative Parallel Computing (InPar), May 2012. doi: 10.1109/InPar.2012.6339606.
-
(2012) Innovative Parallel Computing (InPar)
-
-
Sung, I.-J.
Liu, G.D.
Hwu, W.-M.W.
-
8
-
-
84896819561
-
In-place transposition of rectangular matrices on accelerators
-
I.-J. Sung, J. Gómez-Luna, J. M. González-Linares, N. Guil, and W.-M. W. Hwu. In-place transposition of rectangular matrices on accelerators. In Principles and Practices of Parallel Programming (PPoPP), PPoPP '14, 2014. doi: 10.1145/2555243.2555266.
-
(2014) Principles and Practices of Parallel Programming (PPoPP), PPoPP '14
-
-
Sung, I.-J.
Gómez-Luna, J.
González-Linares, J.M.
Guil, N.
Hwu, W.-M.W.
-
9
-
-
67649094126
-
Optimal in-place transposition of rectangular matrices
-
A. A. Tretyakov and E. E. Tyrtyshnikov. Optimal in-place transposition of rectangular matrices. Journal of Complexity, 25(4):377-384, Aug. 2009. doi: 10.1016/j.jco.2009.02.008.
-
(2009) Journal of Complexity, vol. 25, issue 4, pp. 377-384
-
-
Tretyakov, A.A.
Tyrtyshnikov, E.E.
-
10
-
-
8744284121
-
-
H. S. Warren. Hacker's Delight. Addison-Wesley Professional, 2002. ISBN 978-0-201-91465-8.
-
(2002) Hacker's Delight
-
-
Warren, H.S.
-
11
-
-
38049073987
-
Transposing matrices in a digital computer
-
P. F. Windley. Transposing matrices in a digital computer. The Computer Journal, 2(1):47-48, Jan. 1959. doi: 10.1093/comjnl/2.1.47.
-
(1959) The Computer Journal, vol. 2, issue 1, pp. 47-48
-
-
Windley, P.F.