-
1
-
-
84860351763
-
The case for GPGPU spatial multitasking, High Performance Computer Architecture (HPCA) 2012
-
J. T. Adriaens, K. Compton, N. S. Kim, and M. J. Schulte, "The case for GPGPU spatial multitasking," in High Performance Computer Architecture (HPCA), 2012 IEEE 18th International Symposium on. IEEE, 2012, pp. 1-12.
-
(2012)
IEEE 18th International Symposium On. IEEE
, pp. 1-12
-
-
Adriaens, J.T.1
Compton, K.2
Kim, N.S.3
Schulte, M.J.4
-
3
-
-
84905507634
-
-
AMD, "AMD A-Series Processor-in-a-Box," 2012. [Online]. Available: http://www.amd.com/us/products/desktop/processors/a-series/Pages/a-series-pib.aspx
-
(2012)
AMD AMD A-Series Processor-in-a-Box
-
-
-
5
-
-
84905507635
-
-
ARM, "ARM Mali," 2012. [Online]. Available: www.arm.com/products/multimedia/mali-graphics-plus-gpu-compute
-
(2012)
ARM ARM Mali
-
-
-
7
-
-
43649096256
-
Graphic engine resource management
-
M. Bautin, A. Dwarakinath, and T. Chiueh, "Graphic engine resource management," in SPIE 2008, vol. 6818, 2008, p. 68180O.
-
(2008)
SPIE
, vol.6818
, Issue.2008
-
-
Bautin, M.1
Dwarakinath, A.2
Chiueh, T.3
-
8
-
-
84859702950
-
AMD Fusion APU: Llano
-
A. Branover, D. Foley, and M. Steinman, "AMD Fusion APU: Llano," Micro, IEEE, vol. 32, no. 2, pp. 28-37, 2012.
-
(2012)
Micro IEEE
, vol.32
, Issue.2
, pp. 28-37
-
-
Branover, A.1
Foley, D.2
Steinman, M.3
-
9
-
-
79951697459
-
Task superscalar: An out-of-order task pipeline
-
Y. Etsion, F. Cabarcas, A. Rico, A. Ramirez, R. M. Badia, E. Ayguade, J. Labarta, and M. Valero, "Task superscalar: An out-of-order task pipeline," in Microarchitecture (MICRO), 2010 43rd Annual IEEE/ACM International Symposium on. IEEE, 2010, pp. 89-100.
-
(2010)
Microarchitecture (MICRO) 2010 43rd Annual IEEE/ACM International Symposium On. IEEE
, pp. 89-100
-
-
Etsion, Y.1
Cabarcas, F.2
Rico, A.3
Ramirez, A.4
Badia, R.M.5
Ayguade, E.6
Labarta, J.7
Valero, M.8
-
10
-
-
47249094055
-
System-level performance metrics for multiprogram workloads
-
S. Eyerman and L. Eeckhout, "System-level performance metrics for multiprogram workloads," Micro, IEEE, vol. 28, no. 3, pp. 42-53, 2008.
-
(2008)
Micro IEEE
, vol.28
, Issue.3
, pp. 42-53
-
-
Eyerman, S.1
Eeckhout, L.2
-
11
-
-
47349104432
-
Dynamic warp formation and scheduling for efficient GPU control flow
-
W. W. Fung, I. Sham, G. Yuan, and T. M. Aamodt, "Dynamic warp formation and scheduling for efficient GPU control flow," in Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture, 2007, pp. 407-420.
-
(2007)
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
, pp. 407-420
-
-
Fung, W.W.1
Sham, I.2
Yuan, G.3
Aamodt, T.M.4
-
12
-
-
84894883016
-
Fine-grained resource sharing for concurrent GPGPU kernels
-
C. Gregg, J. Dorn, K. Hazelwood, and K. Skadron, "Fine-grained resource sharing for concurrent GPGPU kernels," in Proceedings of the 4th USENIX conference on Hot Topics in Parallelism. USENIX Association, 2012, pp. 10-10.
-
(2012)
Proceedings of the 4th USENIX Conference on Hot Topics in Parallelism. USENIX Association
, pp. 10-10
-
-
Gregg, C.1
Dorn, J.2
Hazelwood, K.3
Skadron, K.4
-
13
-
-
79960526623
-
Enabling task parallelism in the CUDA scheduler
-
M. Guevara, C. Gregg, K. Hazelwood, and K. Skadron, "Enabling task parallelism in the CUDA scheduler," in Workshop on Programming Models for Emerging Architectures, 2009, pp. 69-76.
-
(2009)
Workshop on Programming Models for Emerging Architectures
, pp. 69-76
-
-
Guevara, M.1
Gregg, C.2
Hazelwood, K.3
Skadron, K.4
-
14
-
-
84870690379
-
A study of persistent threads style GPU programming for GPGPU workloads
-
K. Gupta, J. A. Stuart, and J. D. Owens, "A study of persistent threads style GPU programming for GPGPU workloads," in Innovative Parallel Computing (InPar), 2012. IEEE, 2012, pp. 1-14.
-
(2012)
Innovative Parallel Computing (InPar) 2012 IEEE
, pp. 1-14
-
-
Gupta, K.1
Stuart, J.A.2
Owens, J.D.3
-
16
-
-
84863015834
-
RGEM: A responsive GPGPU execution model for runtime engines
-
S. Kato, K. Lakshmanan, A. Kumar, M. Kelkar, Y. Ishikawa, and R. Rajkumar, "RGEM: A responsive GPGPU execution model for runtime engines," in Real-Time Systems Symposium (RTSS), 2011 IEEE 32nd. IEEE, 2011, pp. 57-66.
-
(2011)
Real-Time Systems Symposium (RTSS) 2011 IEEE 32nd. IEEE
, pp. 57-66
-
-
Kato, S.1
Lakshmanan, K.2
Kumar, A.3
Kelkar, M.4
Ishikawa, Y.5
Rajkumar, R.6
-
17
-
-
85077032008
-
Time-Graph: GPU scheduling for real-time multi-tasking environments
-
S. Kato, K. Lakshmanan, R. R. Rajkumar, and Y. Ishikawa, "Time-Graph: GPU scheduling for real-time multi-tasking environments," in 2011 USENIX Annual Technical Conference (USENIX ATC11), 2011, p. 17.
-
(2011)
2011 USENIX Annual Technical Conference (USENIX ATC11)
, p. 17
-
-
Kato, S.1
Lakshmanan, K.2
Rajkumar, R.R.3
Ishikawa, Y.4
-
18
-
-
84878156908
-
Gdev: First-class GPU resource management in the operating system
-
S. Kato, M. McThrow, C. Maltzahn, and S. Brandt, "Gdev: First-class GPU resource management in the operating system," in USENIX ATC, vol. 12, 2012, pp. 37-37.
-
(2012)
USENIX ATC
, vol.12
, pp. 37-37
-
-
Kato, S.1
McThrow, M.2
Maltzahn, C.3
Brandt, S.4
-
19
-
-
84888133920
-
Heterogeneous System Architecture: A technical review
-
G. Kyriazis, "Heterogeneous System Architecture: A technical review," AMD, 2012.
-
(2012)
AMD
-
-
Kyriazis, G.1
-
20
-
-
80155183121
-
GPU resource sharing and virtualization on high performance computing systems
-
T. Li, V. K. Narayana, E. El-Araby, and T. El-Ghazawi, "GPU resource sharing and virtualization on high performance computing systems," in Parallel Processing (ICPP), 2011 International Conference on. IEEE, 2011, pp. 733-742.
-
(2011)
Parallel Processing (ICPP), 2011 International Conference on IEEE
, pp. 733-742
-
-
Li, T.1
Narayana, V.K.2
El-Araby, E.3
El-Ghazawi, T.4
-
21
-
-
44849137198
-
NVIDIA Tesla: A unified graphics and computing architecture
-
E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, "NVIDIA Tesla: A unified graphics and computing architecture," Micro, IEEE, vol. 28, no. 2, pp. 39-55, 2008.
-
(2008)
Micro IEEE
, vol.28
, Issue.2
, pp. 39-55
-
-
Lindholm, E.1
Nickolls, J.2
Oberman, S.3
Montrym, J.4
-
22
-
-
84864857149
-
iGPU: Exception support and speculative execution on GPUs
-
J. Menon, M. De Kruijf, and K. Sankaralingam, "iGPU: Exception support and speculative execution on GPUs," in Proceedings of the 39th Annual International Symposium on Computer Architecture. IEEE, 2012, pp. 72-83.
-
(2012)
Proceedings of the 39th Annual International Symposium on Computer Architecture IEEE
, pp. 72-83
-
-
Menon, J.1
De Kruijf, M.2
Sankaralingam, K.3
-
26
-
-
49049088756
-
GPU computing
-
J. D. Owens, M. Houston, D. Luebke, S. Green, J. E. Stone, and J. C. Phillips, "GPU computing," Proceedings of the IEEE, vol. 96, no. 5, pp. 879-899, 2008.
-
(2008)
Proceedings of the IEEE
, vol.96
, Issue.5
, pp. 879-899
-
-
Owens, J.D.1
Houston, M.2
Luebke, D.3
Green, S.4
Stone, J.E.5
Phillips, J.C.6
-
27
-
-
84875669496
-
Improving GPGPU concurrency with elastic kernels
-
S. Pai, M. J. Thazhuthaveetil, and R. Govindarajan, "Improving GPGPU concurrency with elastic kernels," in Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems. ACM, 2013, pp. 407-418.
-
(2013)
Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems ACM
, pp. 407-418
-
-
Pai, S.1
Thazhuthaveetil, M.J.2
Govindarajan, R.3
-
28
-
-
84897759661
-
Architectural support for address translation on GPUs: Designing memory management units for CPU/GPUs with unified address spaces
-
B. Pichai, L. Hsu, and A. Bhattacharjee, "Architectural support for address translation on GPUs: Designing memory management units for CPU/GPUs with unified address spaces," in Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 2014, pp. 743-758.
-
(2014)
Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems ACM
, pp. 743-758
-
-
Pichai, B.1
Hsu, L.2
Bhattacharjee, A.3
-
29
-
-
79960506159
-
Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework
-
V. T. Ravi, M. Becchi, G. Agrawal, and S. Chakradhar, "Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework," in Proceedings of the 20th international symposium on High performance distributed computing. ACM, 2011, pp. 217-228.
-
(2011)
Proceedings of the 20th International Symposium on High Performance Distributed Computing ACM
, pp. 217-228
-
-
Ravi, V.T.1
Becchi, M.2
Agrawal, G.3
Chakradhar, S.4
-
30
-
-
82655162782
-
PTask: Operating system abstractions to manage GPUs as compute devices
-
C. J. Rossbach, J. Currey, M. Silberstein, B. Ray, and E. Witchel, "PTask: operating system abstractions to manage GPUs as compute devices," in Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles. ACM, 2011, pp. 233-248.
-
(2011)
Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles. ACM
, pp. 233-248
-
-
Rossbach, C.J.1
Currey, J.2
Silberstein, M.3
Ray, B.4
Witchel, E.5
-
31
-
-
84905507630
-
-
Samsung, "Samsung Exynos," 2012. [Online]. Available: www.samsung.com/exynos
-
(2012)
Samsung Samsung Exynos
-
-
-
33
-
-
84870188759
-
Softshell: Dynamic scheduling on GPUs
-
M. Steinberger, B. Kainz, B. Kerbl, S. Hauswiesner, M. Kenzel, and D. Schmalstieg, "Softshell: dynamic scheduling on GPUs," ACM Transactions on Graphics (TOG), vol. 31, no. 6, p. 161, 2012.
-
(2012)
ACM Transactions on Graphics (TOG)
, vol.31
, Issue.6
, p. 161
-
-
Steinberger, M.1
Kainz, B.2
Kerbl, B.3
Hauswiesner, S.4
Kenzel, M.5
Schmalstieg, D.6
-
34
-
-
84873470137
-
The parboil benchmarks
-
J. Stratton, C. Rodrigues, I. Sung, N. Obeid, L. Chang, G. Liu, and W. Hwu, "The Parboil benchmarks," Technical Report IMPACT-12-01, University of Illinois at Urbana-Champaign, Tech. Rep., 2012.
-
(2012)
Technical Report IMPACT-12-01
-
-
Stratton, J.1
Rodrigues, C.2
Sung, I.3
Obeid, N.4
Chang, L.5
Liu, G.6
Hwu, W.7
-
35
-
-
58449109179
-
MCUDA: An efficient implementation of CUDA kernels for multi-core CPUs
-
J. Stratton, S. Stone, and W.-m. Hwu, "MCUDA: An efficient implementation of CUDA kernels for multi-core CPUs," LCPC 2008, pp. 16-30, 2008.
-
(2008)
LCPC
, vol.2008
, pp. 16-30
-
-
Stratton, J.1
Stone, S.2
Hwu, W.-M.3
-
37
-
-
47249121916
-
FAME: Fairly measuring multithreaded architectures
-
J. Vera, F. J. Cazorla, A. Pajuelo, O. J. Santana, E. Fernandez, and M. Valero, "FAME: Fairly measuring multithreaded architectures," in Parallel Architecture and Compilation Techniques, 2007. PACT 2007. 16th International Conference on. IEEE, 2007, pp. 305-316.
-
(2007)
Parallel Architecture and Compilation Techniques 2007. PACT 2007. 16th International Conference on IEEE
, pp. 305-316
-
-
Vera, J.1
Cazorla, F.J.2
Pajuelo, A.3
Santana, O.J.4
Fernandez, E.5
Valero, M.6
-
38
-
-
79955435088
-
Fermi GF100 GPU architecture
-
C. M. Wittenbrink, E. Kilgariff, and A. Prabhu, "Fermi GF100 GPU architecture," Micro, IEEE, vol. 31, no. 2, pp. 50-59, 2011.
-
(2011)
Micro IEEE
, vol.31
, Issue.2
, pp. 50-59
-
-
Wittenbrink, C.M.1
Kilgariff, E.2
Prabhu, A.3