1. Advanced configuration and power interface (ACPI). http://www.uefi.org/acpi/specs (2013)
3. Bernick, D., Bruckert, B., Vigna, P.D., Garcia, D., Jardine, R., Klecka, J., Smullen, J.: NonStop advanced architecture. In: Proceedings of the 2005 International Conference on Dependable Systems and Networks, DSN ’05, pp. 12–21 (2005)
4. Borkar, S.: Designing reliable systems from unreliable components: the challenges of transistor variability and degradation. IEEE Micro 25(6), 10–16 (2005)
5. Cheng, E., Mirkhani, S., Szafaryn, L.G., Cher, C.Y., Cho, H., Skadron, K., Stan, M.R., Lilja, K., Abraham, J.A., Bose, P., Mitra, S.: Clear: cross-layer exploration for architecting resilience - combining hardware and software techniques to tolerate soft errors in processor cores. In: Proceedings of the 53rd Annual Design Automation Conference, DAC ’16, pp. 68:1–68:6 (2016)
6. Dongarra, J., Beckman, P., Moore, T., et al.: The international exascale software project roadmap. Int. J. High Perform. Comput. Appl. 3–60 (2011)
7. Elnozahy, E., Bianchini, R., El-Ghazawi, T., et al.: System resilience at extreme scale. White Paper. Tech. rep., DARPA (2009)
8. Engelmann, C., Ong, H.H., Scott, S.L.: The case for modular redundancy in large-scale high performance computing systems. In: Proceedings of the 27th IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN), pp. 189–194 (2009)
9. Ferreira, K., Stearley, J., Laros III, J.H., et al.: Evaluating the viability of process replication reliability for exascale systems. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1–12 (2011)
11. Hoemmen, M., Heroux, M.A.: Fault-tolerant iterative methods via selective reliability. In: Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC). IEEE Computer Society, vol. 3, p. 9 (2011)
12. Hukerikar, S., Diniz, P.C., Lucas, R.F., Teranishi, K.: Opportunistic application-level fault detection through adaptive redundant multithreading. In: International Conference on High Performance Computing Simulation (HPCS), pp. 243–250 (2014). doi:10.1109/HPCSim.2014.6903692
13. Hukerikar, S., Lucas, R.F.: Rolex: resilience-oriented language extensions for extreme-scale systems. J. Supercomput. 72, 1–33 (2016). doi:10.1007/s11227-016-1752-5
14. Hukerikar, S., Teranishi, K., Diniz, P.C., Lucas, R.F.: An evaluation of lazy fault detection based on adaptive redundant multithreading. In: IEEE High Performance Extreme Computing Conference (HPEC), pp. 1–6 (2014). doi:10.1109/HPEC.2014.7040999
15. Kogge, P., Bergman, K., Borkar, S., et al.: Exascale computing study: technology challenges in achieving exascale systems. Tech. rep., DARPA (2008)
16. Liao, C., Quinlan, D.J., Vuduc, R., Panas, T.: Effective source-to-source outlining to support whole program empirical optimization, pp. 308–322 (2010)
17. Lidman, J., Quinlan, D.J., Liao, C., McKee, S.A.: ROSE::FTTransform - a source-to-source translation framework for exascale fault-tolerance research. In: Dependable Systems and Networks Workshops (DSN-W), 2012 IEEE/IFIP 42nd International Conference on, pp. 1–6 (2012). doi:10.1109/DSNW.2012.6264672
19. Mukherjee, S.S., Kontz, M., Reinhardt, S.K.: Detailed design and evaluation of redundant multithreading alternatives. In: SIGARCH Computer Architecture News, pp. 99–110. Wiley-Interscience, Hoboken, N.J. (2002)
20. Oh, N., Shirvani, P.P., McCluskey, E.J.: Error detection by duplicated instructions in super-scalar processors. IEEE Trans. Reliab. pp. 63–75 (2002)
21. Parashar, A., Sivasubramaniam, A., Gurumurthi, S.: Slick: slice-based locality exploitation for efficient redundant multithreading. SIGOPS Oper. Syst. Rev. 5, 95–105 (2006)
24. Reis, G., Chang, J., Vachharajani, N., et al.: SWIFT: software implemented fault tolerance. In: International Symposium on Code Generation and Optimization, pp. 243–254 (2005)
26. Shye, A., Blomstedt, J., Moseley, T., Reddi, V.J., Connors, D.A.: Plr: a software approach to transient fault tolerance for multicore architectures. IEEE Trans. Dependable Secure Comput. 6(2), 135–148 (2009)
28. Slegel, T., Averill III, R.M., Check, M., et al.: IBM’s S/390 G5 microprocessor design. In: IEEE Micro, pp. 12–23 (1999)
29. Somers, J.: Stratus ftServer - Intel fault tolerant platform. Intel Developer Forum (2002)
31. The Opportunities and Challenges of Exascale Computing. Tech. rep., Summary Report of the Advanced Scientific Computing Advisory Committee (ASCAC) Subcommittee (2010)
33. Vadlamani, R., Zhao, J., Burleson, W., Tessier, R.: Multicore soft error rate stabilization using adaptive dual modular redundancy. In: Proceedings of the Conference on Design, Automation and Test in Europe, DATE ’10, pp. 27–32 (2010)
34. Vijaykumar, T., Pomeranz, I., Cheng, K.: Transient-fault recovery using simultaneous multithreading. In: 29th Annual International Symposium on Computer Architecture, pp. 87–98 (2002)
35. von Neumann, J.: Probabilistic logics and the synthesis of reliable organisms from unreliable components. In: Automata Studies, pp. 43–98. ACM, New York, NY (1956)
36. Wang, C., Kim, H., Wu, Y., Ying, V.: Compiler-managed software-based redundant multi-threading for transient fault detection. In: International Symposium on Code Generation and Optimization, pp. 244–258 (2007). doi:10.1109/CGO.2007.7
37. Zhang, Y., Lee, J.W., Johnson, N.P., August, D.I.: DAFT: decoupled acyclic fault tolerance. In: Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT ’10, pp. 87–98 (2010)