-
2
-
-
20344374162
-
Niagara: A 32-way Multithreaded Sparc Processor
-
K. Aingaran, P. Kongetira, and K. Olukotun. Niagara: A 32-way Multithreaded Sparc Processor. IEEE Micro, 25:21-29, 2005.
-
(2005)
IEEE Micro
, vol.25
, pp. 21-29
-
-
Aingaran, K.1
Kongetira, P.2
Olukotun, K.3
-
3
-
-
43649092214
-
ATI CTM Guide: Technical reference manual
-
AMD. ATI CTM Guide: Technical reference manual. Technical report, AMD, 2006. Version 1.01.
-
-
-
-
5
-
-
41249087856
-
General Purpose Molecular Dynamics Simulations fully implemented on Graphics Processing Units
-
J. A. Anderson, C. D. Lorenz, and A. Travesset. General Purpose Molecular Dynamics Simulations fully implemented on Graphics Processing Units. J. of Computational Physics, 227(10):5342-5359, 2008.
-
(2008)
J. of Computational Physics
, vol.227
, Issue.10
, pp. 5342-5359
-
-
Anderson, J.A.1
Lorenz, C.D.2
Travesset, A.3
-
6
-
-
84944390453
-
Beating In-Order Stalls with "Flea-Flicker" Two-Pass Pipelining
-
R. D. Barnes, E. M. Nystrom, J. W. Sias, S. J. Patel, N. Navarro, and W.-m. W. Hwu. Beating In-Order Stalls with "Flea-Flicker" Two-Pass Pipelining. In Proc. 36th IEEE/ACM Int'l Symp. Microarchitecture (MICRO '03), pages 387-398, 2003.
-
(2003)
Proc. 36th IEEE/ACM Int'l Symp. Microarchitecture (MICRO '03)
, pp. 387-398
-
-
Barnes, R.D.1
Nystrom, E.M.2
Sias, J.W.3
Patel, S.J.4
Navarro, N.5
Hwu, W.-M.W.6
-
7
-
-
0033722744
-
Piranha: A Scalable Architecture based on Single-Chip Multiprocessing
-
L. A. Barroso, K. Gharachorloo, R. McNamara, A. Nowatzyk, S. Qadeer, B. Sano, S. Smith, R. Stets, and B. Verghese. Piranha: A Scalable Architecture based on Single-Chip Multiprocessing. In Proc. 27th Int'l Symp. Computer Architecture (ISCA '00), pages 282-293, 2000.
-
(2000)
Proc. 27th Int'l Symp. Computer Architecture (ISCA '00)
, pp. 282-293
-
-
Barroso, L.A.1
Gharachorloo, K.2
McNamara, R.3
Nowatzyk, A.4
Qadeer, S.5
Sano, B.6
Smith, S.7
Stets, R.8
Verghese, B.9
-
8
-
-
77953980486
-
The Direct3D 10 system
-
D. Blythe. The Direct3D 10 system. ACM Trans. Graphics, 25(3):724-734, 2006.
-
(2006)
ACM Trans. Graphics
, vol.25
, Issue.3
, pp. 724-734
-
-
Blythe, D.1
-
9
-
-
70450059008
-
Accelerating Leukocyte Tracking using CUDA: A Case Study in Leveraging Manycore Coprocessors
-
M. Boyer, D. Tarjan, S. T. Acton, and K. Skadron. Accelerating Leukocyte Tracking using CUDA: A Case Study in Leveraging Manycore Coprocessors. In Proc. 24th Int'l Parallel and Distributed Processing Symp. (IPDPS '09), pages 1-12, 2009.
-
(2009)
Proc. 24th Int'l Parallel and Distributed Processing Symp. (IPDPS '09)
, pp. 1-12
-
-
Boyer, M.1
Tarjan, D.2
Acton, S.T.3
Skadron, K.4
-
10
-
-
10644248153
-
Brook for GPUs: Stream Computing on Graphics Hardware
-
I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan. Brook for GPUs: Stream Computing on Graphics Hardware. ACM Trans. on Graphics, 23(3):777-786, 2004.
-
(2004)
ACM Trans. on Graphics
, vol.23
, Issue.3
, pp. 777-786
-
-
Buck, I.1
Foley, T.2
Horn, D.3
Sugerman, J.4
Fatahalian, K.5
Houston, M.6
Hanrahan, P.7
-
11
-
-
51449118065
-
A Performance Study of General-Purpose Applications on Graphics Processors using CUDA
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, and K. Skadron. A Performance Study of General-Purpose Applications on Graphics Processors using CUDA. J. of Parallel and Distributed Computing, 68(10):1370-1380, 2008.
-
(2008)
J. of Parallel and Distributed Computing
, vol.68
, Issue.10
, pp. 1370-1380
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.W.5
Skadron, K.6
-
12
-
-
84877083867
-
Merrimac: Supercomputing with Streams
-
W. J. Dally, F. Labonte, A. Das, P. Hanrahan, J.-H. Ahn, J. Gummaraju, M. Erez, N. Jayasena, I. Buck, T. J. Knight, and U. J. Kapasi. Merrimac: Supercomputing with Streams. In Proc. 15th ACM/IEEE Conf. Supercomputing (SC '03), page 35, 2003.
-
(2003)
Proc. 15th ACM/IEEE Conf. Supercomputing (SC '03)
, pp. 35
-
-
Dally, W.J.1
Labonte, F.2
Das, A.3
Hanrahan, P.4
Ahn, J.-H.5
Gummaraju, J.6
Erez, M.7
Jayasena, N.8
Buck, I.9
Knight, T.J.10
Kapasi, U.J.11
-
13
-
-
35649007366
-
-
Microarchitecture and Implementation of the Synergistic Processor in 65-nm and 90-nm SOI
-
B. K. Flachs, S. Asano, S. H. Dhong, H. P. Hofstee, G. Gervais, R. Kim, T. Le, P. Liu, J. Leenstra, J. S. Liberty, B. W. Michael, H.-J. Oh, S. M. Müller, O. Takahashi, K. Hirairi, A. Kawasumi, H. Murakami, H. Noro, S. Onishi, J. Pille, J. Silberman, S. Yong, A. Hatakeyama, Y. Watanabe, N. Yano, D. A. Brokenshire, M. Peyravian, V. To, and E. Iwata. Microarchitecture and Implementation of the Synergistic Processor in 65-nm and 90-nm SOI. IBM J. Research and Development, 51(5):529-544, 2007.
-
-
-
-
14
-
-
47349104432
-
Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow
-
W. W. L. Fung, I. Sham, G. Yuan, and T. M. Aamodt. Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow. In Proc. 40th IEEE/ACM Int'l Symp. Microarchitecture (MICRO '07), pages 407-420, 2007.
-
(2007)
Proc. 40th IEEE/ACM Int'l Symp. Microarchitecture (MICRO '07)
, pp. 407-420
-
-
Fung, W.W.L.1
Sham, I.2
Yuan, G.3
Aamodt, T.M.4
-
15
-
-
53749092570
-
Parallel Computing Experiences with CUDA
-
M. Garland, S. L. Grand, J. Nickolls, J. Anderson, J. Hardwick, S. Morton, E. Phillips, Y. Zhang, and V. Volkov. Parallel Computing Experiences with CUDA. IEEE Micro, 28(4):13-27, 2008.
-
(2008)
IEEE Micro
, vol.28
, Issue.4
, pp. 13-27
-
-
Garland, M.1
Grand, S.L.2
Nickolls, J.3
Anderson, J.4
Hardwick, J.5
Morton, S.6
Phillips, E.7
Zhang, Y.8
Volkov, V.9
-
17
-
-
44849137198
-
NVIDIA Tesla: A Unified Graphics and Computing Architecture
-
E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym. NVIDIA Tesla: A Unified Graphics and Computing Architecture. IEEE Micro, 28(2):39-55, 2008.
-
(2008)
IEEE Micro
, vol.28
, Issue.2
, pp. 39-55
-
-
Lindholm, E.1
Nickolls, J.2
Oberman, S.3
Montrym, J.4
-
19
-
-
84955506994
-
Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-Order Processors
-
O. Mutlu, J. Stark, C. Wilkerson, and Y. N. Patt. Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-Order Processors. In Proc. 9th Int'l Conf. High Performance Computer Architecture (HPCA '03), pages 129-140, 2003.
-
(2003)
Proc. 9th Int'l Conf. High Performance Computer Architecture (HPCA '03)
, pp. 129-140
-
-
Mutlu, O.1
Stark, J.2
Wilkerson, C.3
Patt, Y.N.4
-
20
-
-
47349098275
-
MineBench: A Benchmark Suite for Data Mining Workloads
-
R. Narayanan, B. Ozisikyilmaz, J. Zambreno, G. Memik, and A. Choudhary. MineBench: A Benchmark Suite for Data Mining Workloads. In Proc. 2006 IEEE Int'l Symposium on Workload Characterization (ISWC '06), pages 182-188, 2006.
-
(2006)
Proc. 2006 IEEE Int'l Symposium on Workload Characterization (ISWC '06)
, pp. 182-188
-
-
Narayanan, R.1
Ozisikyilmaz, B.2
Zambreno, J.3
Memik, G.4
Choudhary, A.5
-
21
-
-
78651550268
-
Scalable Parallel Programming with CUDA
-
J. Nickolls, I. Buck, M. Garland, and K. Skadron. Scalable Parallel Programming with CUDA. ACM Queue, 6(2):40-53, 2008.
-
(2008)
ACM Queue
, vol.6
, Issue.2
, pp. 40-53
-
-
Nickolls, J.1
Buck, I.2
Garland, M.3
Skadron, K.4
-
22
-
-
0030259458
-
The Case for a Single-Chip Multiprocessor
-
K. Olukotun, B. A. Nayfeh, L. Hammond, K. Wilson, and K. Chang. The Case for a Single-Chip Multiprocessor. In Proc. 7th Int'l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VII), pages 2-11, 1996.
-
(1996)
Proc. 7th Int'l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VII)
, pp. 2-11
-
-
Olukotun, K.1
Nayfeh, B.A.2
Hammond, L.3
Wilson, K.4
Chang, K.5
-
23
-
-
84868071789
-
-
bwfirt
-
M. Raab, L. Grünschloss, J. Hanika, M. Finckh, and A. Keller. bwfirt. http://bwfirt.sourceforge.net/.
-
-
-
-
24
-
-
38849131252
-
High-Throughput Sequence Alignment using Graphics Processing Units
-
M. Schatz, C. Trapnell, A. Delcher, and A. Varshney. High-Throughput Sequence Alignment using Graphics Processing Units. BMC Bioinformatics, 8(1):474, 2007.
-
(2007)
BMC Bioinformatics
, vol.8
, Issue.1
, pp. 474
-
-
Schatz, M.1
Trapnell, C.2
Delcher, A.3
Varshney, A.4
-
25
-
-
49249086142
-
Larrabee: A Many-Core x86 Architecture for Visual Computing
-
L. Seiler et al. Larrabee: A Many-Core x86 Architecture for Visual Computing. ACM Trans. on Graphics, 27(3):1-15, 2008.
-
(2008)
ACM Trans. on Graphics
, vol.27
, Issue.3
, pp. 1-15
-
-
Seiler, L.1
-
26
-
-
12844269176
-
Continual Flow Pipelines
-
S. T. Srinivasan, R. Rajwar, H. Akkary, A. Gandhi, and M. Upton. Continual Flow Pipelines. In Proc. 11th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS-XI), pages 107-119, 2004.
-
(2004)
Proc. 11th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS-XI)
, pp. 107-119
-
-
Srinivasan, S.T.1
Rajwar, R.2
Akkary, H.3
Gandhi, A.4
Upton, M.5