-
2
-
-
84934313382
-
-
Tesla K20 GPU accelerator. Board specification
-
"Tesla K20 GPU accelerator. Board specification," http://www.nvidia.com/content/PDF/kepler/Tesla-K20-Passive-BD-06455-001-v07.pdf.
-
-
-
-
3
-
-
84934313383
-
-
"Understanding Xid errors," http://docs.nvidia.com/deploy/xid-errors/index.html.
-
Understanding Xid Errors
-
-
-
4
-
-
84155167175
-
Temperature dependence of neutron-induced soft errors in SRAMs
-
M. Bagatin, S. Gerardin, A. Paccagnella, C. Andreani, G. Gorini, and C. Frost, "Temperature dependence of neutron-induced soft errors in SRAMs," Microelectronics Reliability, vol. 52, no. 1, pp. 289-293, 2012.
-
(2012)
Microelectronics Reliability
, vol.52
, Issue.1
, pp. 289-293
-
-
Bagatin, M.1
Gerardin, S.2
Paccagnella, A.3
Andreani, C.4
Gorini, G.5
Frost, C.6
-
5
-
-
29344472607
-
Radiation-induced soft errors in advanced semiconductor technologies
-
Sept
-
R. Baumann, "Radiation-induced soft errors in advanced semiconductor technologies," Device and Materials Reliability, IEEE Transactions on, vol. 5, no. 3, pp. 305-316, Sept 2005.
-
(2005)
Device and Materials Reliability, IEEE Transactions on
, vol.5
, Issue.3
, pp. 305-316
-
-
Baumann, R.1
-
7
-
-
77952273045
-
The scalable heterogeneous computing (SHOC) benchmark suite
-
New York, NY, USA: ACM
-
A. Danalis, G. Marin, C. McCurdy, J. S. Meredith, P. C. Roth, K. Spafford, V. Tipparaju, and J. S. Vetter, "The scalable heterogeneous computing (SHOC) benchmark suite," in Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, ser. GPGPU '10. New York, NY, USA: ACM, 2010, pp. 63-74. Available: http://doi.acm.org/10.1145/1735688.1735702
-
(2010)
Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, ser. GPGPU '10
, pp. 63-74
-
-
Danalis, A.1
Marin, G.2
McCurdy, C.3
Meredith, J.S.4
Roth, P.C.5
Spafford, K.6
Tipparaju, V.7
Vetter, J.S.8
-
8
-
-
84905100314
-
GPU Behavior on a Large HPC Cluster
-
Aachen, Germany, August 26-30
-
N. DeBardeleben, S. Blanchard, L. Monroe, P. Romero, D. Grunau, C. Idler, and C. Wright, "GPU Behavior on a Large HPC Cluster," 6th Workshop on Resiliency in High Performance Computing (Resilience) in Clusters, Clouds, and Grids in conjunction with the 19th International European Conference on Parallel and Distributed Computing (Euro-Par 2013), Aachen, Germany, August 26-30, 2013.
-
(2013)
6th Workshop on Resiliency in High Performance Computing (Resilience) in Clusters, Clouds, and Grids in Conjunction with the 19th International European Conference on Parallel and Distributed Computing (Euro-Par 2013)
-
-
Debardeleben, N.1
Blanchard, S.2
Monroe, L.3
Romero, P.4
Grunau, D.5
Idler, C.6
Wright, C.7
-
9
-
-
84912075762
-
Lessons learned from the analysis of system failures at petascale: The case of blue waters
-
C. Di Martino, F. Baccanico, W. Kramer, J. Fullop, Z. Kalbarczyk, and R. Iyer, "Lessons learned from the analysis of system failures at petascale: The case of blue waters," 44th international.
-
44th International
-
-
Di Martino, C.1
Baccanico, F.2
Kramer, W.3
Fullop, J.4
Kalbarczyk, Z.5
Iyer, R.6
-
12
-
-
84883367588
-
Reading between the lines of failure logs: Understanding how HPC systems fail
-
N. El-Sayed and B. Schroeder, "Reading between the lines of failure logs: Understanding how HPC systems fail," DSN, 2013.
-
(2013)
DSN
-
-
El-Sayed, N.1
Schroeder, B.2
-
14
-
-
84876585773
-
Poster: Evaluating error resiliency of GPGPU applications
-
Nov
-
B. Fang, J. Wei, K. Pattabiraman, and M. Ripeanu, "Poster: Evaluating error resiliency of GPGPU applications," in High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion:, Nov 2012, pp. 1504-1504.
-
(2012)
High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion
, pp. 1504-1504
-
-
Fang, B.1
Wei, J.2
Pattabiraman, K.3
Ripeanu, M.4
-
15
-
-
84903827331
-
Gpgpus: How to combine high computational power with high reliability
-
Dresden, Germany
-
L. A. B. Gomez, F. Cappello, L. Carro, N. DeBardeleben, B. Fang, S. Gurumurthi, S. Keckler, K. Pattabiraman, R. Rech, and M. S. Reorda, "Gpgpus: How to combine high computational power with high reliability," in 2014 Design Automation and Test in Europe Conference and Exhibition, Dresden, Germany, 2014.
-
(2014)
2014 Design Automation and Test in Europe Conference and Exhibition
-
-
Gomez, L.A.B.1
Cappello, F.2
Carro, L.3
Debardeleben, N.4
Fang, B.5
Gurumurthi, S.6
Keckler, S.7
Pattabiraman, K.8
Rech, R.9
Reorda, M.S.10
-
18
-
-
84858781341
-
Cosmic rays don't strike twice: Understanding the nature of dram errors and the implications for system design
-
A. A. Hwang, I. A. Stefanovici, and B. Schroeder, "Cosmic rays don't strike twice: understanding the nature of DRAM errors and the implications for system design," ACM SIGPLAN Notices, vol. 47, no. 4, pp. 111-122, 2012.
-
(2012)
ACM SIGPLAN Notices
, vol.47
, Issue.4
, pp. 111-122
-
-
Hwang, A.A.1
Stefanovici, I.A.2
Schroeder, B.3
-
19
-
-
39049112433
-
Spreading diversity in multi-cell neutron-induced upsets with device scaling
-
E. Ibe, S. Chung, S. Wen, H. Yamaguchi, Y. Yahagi, H. Kameyama, S. Yamamoto, and T. Akioka, "Spreading diversity in multi-cell neutron-induced upsets with device scaling," in Custom Integrated Circuits Conference, 2006. CICC'06. IEEE. IEEE, 2006, pp. 437-444.
-
(2006)
Custom Integrated Circuits Conference, 2006. CICC'06. IEEE. IEEE
, pp. 437-444
-
-
Ibe, E.1
Chung, S.2
Wen, S.3
Yamaguchi, H.4
Yahagi, Y.5
Kameyama, H.6
Yamamoto, S.7
Akioka, T.8
-
20
-
-
37249089512
-
Measurement and reporting of alpha particle and terrestrial cosmic ray-induced soft errors in semiconductor devices
-
JEDEC
-
JEDEC, "Measurement and Reporting of Alpha Particle and Terrestrial Cosmic Ray-Induced Soft Errors in Semiconductor Devices," JEDEC Standard, Tech. Rep. JESD89A, 2006.
-
(2006)
JEDEC Standard, Tech. Rep. JESD89A
-
-
-
21
-
-
33845589803
-
BlueGene/L failure analysis and prediction models
-
Y. Liang, Y. Zhang, M. Jette, A. Sivasubramaniam, and R. Sahoo, "BlueGene/L failure analysis and prediction models," in Dependable Systems and Networks, 2006. DSN 2006. International Conference on. IEEE, 2006, pp. 425-434.
-
(2006)
Dependable Systems and Networks, 2006. DSN 2006. International Conference On. IEEE
, pp. 425-434
-
-
Liang, Y.1
Zhang, Y.2
Jette, M.3
Sivasubramaniam, A.4
Sahoo, R.5
-
22
-
-
27544497222
-
Filtering failure logs for a BlueGene/L prototype
-
Y. Liang, Y. Zhang, A. Sivasubramaniam, R. K. Sahoo, J. Moreira, and M. Gupta, "Filtering failure logs for a BlueGene/L prototype," in Dependable Systems and Networks, 2005. DSN 2005. Proceedings. International Conference on. IEEE, 2005, pp. 476-485.
-
(2005)
Dependable Systems and Networks, 2005. DSN 2005. Proceedings. International Conference On. IEEE
, pp. 476-485
-
-
Liang, Y.1
Zhang, Y.2
Sivasubramaniam, A.3
Sahoo, R.K.4
Moreira, J.5
Gupta, M.6
-
24
-
-
84934284229
-
-
NVIDIA, "CUBLAS Library User Guide," http://docs.nvidia.com/cuda/pdf/CUBLAS-Library.pdf, 2014.
-
(2014)
CUBLAS Library User Guide
-
-
-
26
-
-
84914675584
-
Gpgpus ecc efficiency and efficacy
-
D. A. G. Oliveira, P. Rech, L. L. Pilla, P. O. A. Navaux, and L. Carro, "Gpgpus ecc efficiency and efficacy," in International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT 2014), 2014.
-
(2014)
International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT 2014)
-
-
Oliveira, D.A.G.1
Rech, P.2
Pilla, L.L.3
Navaux, P.O.A.4
Carro, L.5
-
28
-
-
80051915968
-
Improving log-based field failure data analysis of multi-node computing systems
-
A. Pecchia, D. Cotroneo, Z. Kalbarczyk, and R. K. Iyer, "Improving log-based field failure data analysis of multi-node computing systems," in Dependable Systems & Networks (DSN), 2011 IEEE/IFIP 41st International Conference on. IEEE, 2011, pp. 97-108.
-
(2011)
Dependable Systems & Networks (DSN), 2011 IEEE/IFIP 41st International Conference On. IEEE
, pp. 97-108
-
-
Pecchia, A.1
Cotroneo, D.2
Kalbarczyk, Z.3
Iyer, R.K.4
-
29
-
-
84906779993
-
Software-based hardening strategies for neutron sensitive FFT algorithms on GPUs
-
L. Pilla, P. Rech, F. Silvestri, C. Frost, P. Navaux, M. Reorda, and L. Carro, "Software-based hardening strategies for neutron sensitive FFT algorithms on GPUs," Nuclear Science, IEEE Transactions on, vol. PP, no. 99, pp. 1-7, 2014.
-
(2014)
Nuclear Science, IEEE Transactions on
, Issue.99
, pp. 1-7
-
-
Pilla, L.1
Rech, P.2
Silvestri, F.3
Frost, C.4
Navaux, P.5
Reorda, M.6
Carro, L.7
-
30
-
-
84869186078
-
Neutron radiation test of graphic processing units
-
June
-
P. Rech, C. Aguiar, R. Ferreira, C. Frost, and L. Carro, "Neutron radiation test of graphic processing units," in On-Line Testing Symposium (IOLTS), 2012 IEEE 18th International, June 2012, pp. 55-60.
-
(2012)
On-Line Testing Symposium (IOLTS), 2012 IEEE 18th International
, pp. 55-60
-
-
Rech, P.1
Aguiar, C.2
Ferreira, R.3
Frost, C.4
Carro, L.5
-
31
-
-
84882824302
-
An efficient and experimentally tuned software-based hardening strategy for matrix multiplication on GPUs
-
P. Rech, C. Aguiar, C. Frost, and L. Carro, "An Efficient and Experimentally Tuned Software-Based Hardening Strategy for Matrix Multiplication on GPUs," Nuclear Science, IEEE Transactions on, vol. 60, no. 4, pp. 2797-2804, 2013.
-
(2013)
Nuclear Science, IEEE Transactions on
, vol.60
, Issue.4
, pp. 2797-2804
-
-
Rech, P.1
Aguiar, C.2
Frost, C.3
Carro, L.4
-
32
-
-
84883407616
-
Experimental evaluation of thread distribution effects on multiple output errors in GPUs
-
P. Rech, C. Aguiar, C. Frost, and L. Carro, "Experimental evaluation of thread distribution effects on multiple output errors in GPUs," in Test Symposium (ETS), 2013 18th IEEE European. IEEE, 2013, pp. 1-6.
-
(2013)
Test Symposium (ETS), 2013 18th IEEE European. IEEE
, pp. 1-6
-
-
Rech, P.1
Aguiar, C.2
Frost, C.3
Carro, L.4
-
33
-
-
84912064052
-
Measuring the radiation reliability of SRAM structures in GPUs designed for HPC
-
P. Rech, L. Carro, N. Wang, T. Tsai, S. K. S. Hari, and S. W. Keckler, "Measuring the Radiation Reliability of SRAM Structures in GPUs Designed for HPC," in IEEE 10th Workshop on Silicon Errors in Logic-System Effects (SELSE), 2014.
-
(2014)
IEEE 10th Workshop on Silicon Errors in Logic-System Effects (SELSE)
-
-
Rech, P.1
Carro, L.2
Wang, N.3
Tsai, T.4
Hari, S.K.S.5
Keckler, S.W.6
-
34
-
-
84894420807
-
On the evaluation of soft-errors detection techniques for gpgpus
-
Dec
-
D. Sabena, M. Sonza Reorda, L. Sterpone, P. Rech, and L. Carro, "On the evaluation of soft-errors detection techniques for gpgpus," in Design and Test Symposium (IDT), 2013 8th International, Dec 2013, pp. 1-6.
-
(2013)
Design and Test Symposium (IDT), 2013 8th International
, pp. 1-6
-
-
Sabena, D.1
Sonza Reorda, M.2
Sterpone, L.3
Rech, P.4
Carro, L.5
-
35
-
-
4544382099
-
Failure data analysis of a large-scale heterogeneous server environment
-
R. K. Sahoo, M. S. Squillante, A. Sivasubramaniam, and Y. Zhang, "Failure data analysis of a large-scale heterogeneous server environment," in Dependable Systems and Networks, 2004 International Conference on. IEEE, 2004, pp. 772-781.
-
(2004)
Dependable Systems and Networks, 2004 International Conference On. IEEE
, pp. 772-781
-
-
Sahoo, R.K.1
Squillante, M.S.2
Sivasubramaniam, A.3
Zhang, Y.4
-
36
-
-
78149470110
-
A large-scale study of failures in high-performance computing systems
-
B. Schroeder and G. Gibson, "A large-scale study of failures in high-performance computing systems," Dependable and Secure Computing, IEEE Transactions on, vol. 7, no. 4, pp. 337-350, 2010.
-
(2010)
Dependable and Secure Computing, IEEE Transactions on
, vol.7
, Issue.4
, pp. 337-350
-
-
Schroeder, B.1
Gibson, G.2
-
37
-
-
85084160707
-
Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?
-
B. Schroeder and G. A. Gibson, "Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?" in FAST, vol. 7, 2007, pp. 1-16.
-
(2007)
FAST
, vol.7
, pp. 1-16
-
-
Schroeder, B.1
Gibson, G.A.2
-
38
-
-
70449657893
-
DRAM errors in the wild: A large-scale field study
-
ACM
-
B. Schroeder, E. Pinheiro, and W.-D. Weber, "DRAM errors in the wild: a large-scale field study," in ACM SIGMETRICS Performance Evaluation Review, vol. 37, no. 1. ACM, 2009, pp. 193-204.
-
(2009)
ACM SIGMETRICS Performance Evaluation Review
, vol.37
, Issue.1
, pp. 193-204
-
-
Schroeder, B.1
Pinheiro, E.2
Weber, W.-D.3
-
39
-
-
84900560822
-
Addressing failures in exascale computing
-
1094342014522573
-
M. Snir, R. W. Wisniewski, J. A. Abraham, S. V. Adve, S. Bagchi, P. Balaji, J. Belak, P. Bose, F. Cappello, B. Carlson et al., "Addressing failures in exascale computing," International Journal of High Performance Computing Applications, p. 1094342014522573, 2014.
-
(2014)
International Journal of High Performance Computing Applications
-
-
Snir, M.1
Wisniewski, R.W.2
Abraham, J.A.3
Adve, S.V.4
Bagchi, S.5
Balaji, P.6
Belak, J.7
Bose, P.8
Cappello, F.9
Carlson, B.10
-
40
-
-
84899689608
-
Feng shui of supercomputer memory: Positional effects in DRAM and SRAM faults
-
V. Sridharan, J. Stearley, N. DeBardeleben, S. Blanchard, and S. Gurumurthi, "Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults," in Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 2013, p. 22.
-
(2013)
Proceedings of SC13: International Conference for High Performance Computing, Networking, Storage and Analysis. ACM
, p. 22
-
-
Sridharan, V.1
Stearley, J.2
Debardeleben, N.3
Blanchard, S.4
Gurumurthi, S.5
-
41
-
-
84862974517
-
Analyzing soft-error vulnerability on gpgpu microarchitecture
-
Nov
-
J. Tan, N. Goswami, T. Li, and X. Fu, "Analyzing soft-error vulnerability on gpgpu microarchitecture," in Workload Characterization (IISWC), 2011 IEEE International Symposium on, Nov 2011, pp. 226-235.
-
(2011)
Workload Characterization (IISWC), 2011 IEEE International Symposium on
, pp. 226-235
-
-
Tan, J.1
Goswami, N.2
Li, T.3
Fu, X.4
-
43
-
-
34548090964
-
A New Hardware/Software Platform and a New 1/E Neutron Source for Soft Error Studies: Testing FPGAs at the ISIS Facility
-
M. Violante, L. Sterpone, A. Manuzzato, S. Gerardin, P. Rech, M. Bagatin, A. Paccagnella, C. Andreani, G. Gorini, A. Pietropaolo, G. Cardarilli, S. Pontarelli, and C. Frost, "A New Hardware/Software Platform and a New 1/E Neutron Source for Soft Error Studies: Testing FPGAs at the ISIS Facility," Nuclear Science, IEEE Transactions on, vol. 54, no. 4, pp. 1184-1189, 2007.
(44) J. F. Ziegler and H. Puchner, SER-History, Trends and Challenges: A Guide for Designing with Memory ICs. Cypress, 2010.
-
(2007)
Nuclear Science, IEEE Transactions on
, vol.54
, Issue.4
, pp. 1184-1189
-
-
Violante, M.1
Sterpone, L.2
Manuzzato, A.3
Gerardin, S.4
Rech, P.5
Bagatin, M.6
Paccagnella, A.7
Andreani, C.8
Gorini, G.9
Pietropaolo, A.10
Cardarilli, G.11
Pontarelli, S.12
Frost, C.13