-
1
-
-
58449123679
-
-
NVIDIA: NVIDIA CUDA, http://www.nvidia.com/cuda
-
-
-
-
2
-
-
44849137198
-
NVIDIA Tesla: A unified graphics and computing architecture
-
in press
-
Lindholm, E., Nickolls, J., Oberman, S., Montrym, J.: NVIDIA Tesla: A unified graphics and computing architecture. IEEE Micro 28(2) (in press, 2008)
-
(2008)
IEEE Micro
, vol.28
, Issue.2
-
-
Lindholm, E.1
Nickolls, J.2
Oberman, S.3
Montrym, J.4
-
3
-
-
30744459395
-
-
Woop, S., Schmittler, J., Slusallek, P.: RPU: A programmable ray processing unit for realtime ray tracing. ACM Trans. Graph. 24(3), 434-444 (2005)
-
-
-
-
4
-
-
58449094436
-
-
Intel: Intel 64 and IA-32 Architectures Software Developer's Manual (May 2007)
-
-
-
-
5
-
-
58449084520
-
-
Advanced Micro Devices: 3DNow! technology manual. Technical Report 21928, Advanced Micro Devices, Sunnyvale, CA (May 1998)
-
-
-
-
8
-
-
35048854568
-
-
Lee, S., Johnson, T., Eigenmann, R.: Cetus - An extensible compiler infrastructure for source-to-source transformation. In: Rauchwerger, L. (ed.) LCPC 2003. LNCS, vol. 2958, Springer, Heidelberg (2004)
-
-
-
-
9
-
-
35248845008
-
-
Ayguadé, E., Blainey, B., Duran, A., Labarta, J., Martínez, F., Martorell, X., Silvera, R.: Is the schedule clause really necessary in OpenMP? In: Proceedings of the International Workshop on OpenMP Applications and Tools, June 2003, pp. 147-159 (2003)
-
-
-
-
10
-
-
84876909872
-
-
Markatos, E.P., LeBlanc, T.J.: Using processor affinity in loop scheduling on shared-memory multiprocessors. In: Proceedings of the 1992 International Conference on Supercomputing, July 1992, pp. 104-113 (1992)
-
-
-
-
11
-
-
0026264626
-
-
Hummel, S.F., Schonberg, E., Flynn, L.E.: Factoring: A practical and robust method for scheduling parallel loops. In: Proceedings of the 1991 International Conference of Supercomputing, June 1991, pp. 610-632 (1991)
-
-
-
-
13
-
-
79959466764
-
-
Ryoo, S., Rodrigues, C.I., Baghsorkhi, S.S., Stone, S.S., Kirk, D., Hwu, W.W.: Optimization principles and application performance evaluation of a multithreaded GPU using CUDA. In: Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (February 2008)
-
-
-
-
14
-
-
43449094719
-
-
Ryoo, S., Rodrigues, C.I., Stone, S.S., Baghsorkhi, S.S., Ueng, S.Z., Stratton, J.A., Hwu, W.W.: Program optimization space pruning for a multithreaded GPU. In: Proceedings of the 2008 International Symposium on Code Generation and Optimization (April 2008)
-
-
-
-
15
-
-
58449113459
-
-
Volkov, V., Demmel, J.W.: LU, QR and Cholesky factorizations using vector capabilities of GPUs. Technical Report UCB/EECS-2008-49, EECS Department, University of California, Berkeley, CA (May 2008)
-
-
-
-
16
-
-
0032740603
-
Modeling the global internet
-
Cowie, J.H., Nicol, D.M., Ogielski, A.T.: Modeling the global internet. Computing in Science and Eng. 1(1), 42-50 (1999)
-
(1999)
Computing in Science and Eng
, vol.1
, Issue.1
, pp. 42-50
-
-
Cowie, J.H.1
Nicol, D.M.2
Ogielski, A.T.3
-
19
-
-
0003487728
-
High Performance Fortran language specification, version 1.0
-
Technical Report CRPC-TR92225, Rice University May
-
High Performance Fortran Forum: High Performance Fortran language specification, version 1.0. Technical Report CRPC-TR92225, Rice University (May 1993)
-
(1993)
-
-
Forum, H.P.F.1
-
20
-
-
57349101237
-
-
Liao, S.W., Du, Z., Wu, G., Lueh, G.Y.: Data and computation transformations for Brook streaming applications on multiprocessors. In: Proceedings of the 4th International Symposium on Code Generation and Optimization, March 2006, pp. 196-207 (2006)
-
-
-