-
1
-
-
70349169075
-
Analyzing CUDA workloads using a detailed GPU simulator
-
A. Bakhoda, G. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt. Analyzing CUDA workloads using a detailed GPU simulator. In Proc. of the 2009 International Symposium on Performance Analysis of Systems and Software (ISPASS-2009), Apr 2009.
-
Proc. of the 2009 International Symposium on Performance Analysis of Systems and Software (ISPASS-2009), Apr 2009
-
-
Bakhoda, A.1
Yuan, G.2
Fung, W.W.L.3
Wong, H.4
Aamodt, T.M.5
-
4
-
-
70649092154
-
Rodinia: A benchmark suite for heterogeneous computing
-
S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S.-H. Lee, and K. Skadron. Rodinia: A benchmark suite for heterogeneous computing. In Proc. of the 2009 International Symposium on Workload Characterization (IISWC-2009), Oct 2009.
-
Proc. of the 2009 International Symposium on Workload Characterization (IISWC-2009), Oct 2009
-
-
Che, S.1
Boyer, M.2
Meng, J.3
Tarjan, D.4
Sheaffer, J.W.5
Lee, S.-H.6
Skadron, K.7
-
5
-
-
78751505898
-
A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads
-
S. Che, J. W. Sheaffer, M. Boyer, L. G. Szafaryn, L. Wang, and K. Skadron. A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads. In Proc. of the 2010 International Symposium on Workload Characterization (IISWC-2010), Dec 2010.
-
Proc. of the 2010 International Symposium on Workload Characterization (IISWC-2010), Dec 2010
-
-
Che, S.1
Sheaffer, J.W.2
Boyer, M.3
Szafaryn, L.G.4
Wang, L.5
Skadron, K.6
-
6
-
-
84863348772
-
Parallel application memory scheduling
-
E. Ebrahimi, R. Miftakhutdinov, C. Fallin, C. J. Lee, J. A. Joao, O. Mutlu, and Y. N. Patt. Parallel application memory scheduling. In Proc. of the 44th International Symposium on Microarchitecture (MICRO-44), Dec 2011.
-
Proc. of the 44th International Symposium on Microarchitecture (MICRO-44), Dec 2011
-
-
Ebrahimi, E.1
Miftakhutdinov, R.2
Fallin, C.3
Lee, C.J.4
Joao, J.A.5
Mutlu, O.6
Patt, Y.N.7
-
9
-
-
80052533471
-
Energy-efficient mechanisms for managing thread context in throughput processors
-
M. Gebhart, R. D. Johnson, D. Tarjan, S. W. Keckler, W. J. Dally, E. Lindholm, and K. Skadron. Energy-efficient mechanisms for managing thread context in throughput processors. In Proc. of the 38th International Symposium on Computer Architecture (ISCA-38), Jun 2011.
-
Proc. of the 38th International Symposium on Computer Architecture (ISCA-38), Jun 2011
-
-
Gebhart, M.1
Johnson, R.D.2
Tarjan, D.3
Keckler, S.W.4
Dally, W.J.5
Lindholm, E.6
Skadron, K.7
-
11
-
-
84875640178
-
OWL: Cooperative thread array aware scheduling techniques for improving GPGPU performance
-
A. Jog, O. Kayiran, N. Chidambaram, A. K. Mishra, M. T. Kandemir, O. Mutlu, R. Iyer, and C. R. Das. OWL: Cooperative thread array aware scheduling techniques for improving GPGPU performance. In Proc. of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-13), Mar 2013.
-
Proc. of the 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-13), Mar 2013
-
-
Jog, A.1
Kayiran, O.2
Chidambaram, N.3
Mishra, A.K.4
Kandemir, M.T.5
Mutlu, O.6
Iyer, R.7
Das, C.R.8
-
12
-
-
84881126240
-
Orchestrated scheduling and prefetching for GPGPUs
-
A. Jog, O. Kayiran, A. K. Mishra, M. T. Kandemir, O. Mutlu, R. Iyer, and C. R. Das. Orchestrated scheduling and prefetching for GPGPUs. In Proc. of the 40th International Symposium on Computer Architecture (ISCA-40), Jun 2013.
-
Proc. of the 40th International Symposium on Computer Architecture (ISCA-40), Jun 2013
-
-
Jog, A.1
Kayiran, O.2
Mishra, A.K.3
Kandemir, M.T.4
Mutlu, O.5
Iyer, R.6
Das, C.R.7
-
16
-
-
84863342255
-
Improving GPU performance via large warps and two-level warp scheduling
-
V. Narasiman, M. Shebanow, C. J. Lee, R. Miftakhutdinov, O. Mutlu, and Y. N. Patt. Improving GPU performance via large warps and two-level warp scheduling. In Proc. of the 44th International Symposium on Microarchitecture (MICRO-44), Dec 2011.
-
Proc. of the 44th International Symposium on Microarchitecture (MICRO-44), Dec 2011
-
-
Narasiman, V.1
Shebanow, M.2
Lee, C.J.3
Miftakhutdinov, R.4
Mutlu, O.5
Patt, Y.N.6
-
19
-
-
84921758691
-
-
Mar
-
J. A. Stratton, C. Rodrigues, I.-J. Sung, N. Obeid, L.-W. Chang, N. Anssari, G. D. Liu, and W.-M. W. Hwu. The Parboil technical report, Mar 2012.
-
(2012)
The Parboil Technical Report
-
-
Stratton, J.A.1
Rodrigues, C.2
Sung, I.-J.3
Obeid, N.4
Chang, L.-W.5
Anssari, N.6
Liu, G.D.7
Hwu, W.-M.W.8
-
20
-
-
84881183039
-
SIMD divergence optimization through intra-warp compaction
-
A. S. Vaidya, A. Shayesteh, D. H. Woo, R. Saharoy, and M. Azimi. SIMD divergence optimization through intra-warp compaction. In Proc. of the 40th International Symposium on Computer Architecture (ISCA-40), Jun 2013.
-
Proc. of the 40th International Symposium on Computer Architecture (ISCA-40), Jun 2013
-
-
Vaidya, A.S.1
Shayesteh, A.2
Woo, D.H.3
Saharoy, R.4
Azimi, M.5