SCOPUS Information Retrieval Platform

IFAC Proceedings Volumes (IFAC-PapersOnline)

Volume 8, Issue PART 1, 2012, Pages 660-665

Robust exploration/exploitation trade-offs in safety-critical applications

(5) Tokic, Michel a,d; Ertle, Philipp b,d; Palm, Günther a; Söffker, Dirk b; Voos, Holger c,d

a UNIVERSITY OF ULM (Germany)

b UNIVERSITY OF DUISBURG ESSEN (Germany)

c UNIVERSITY OF LUXEMBOURG (Luxembourg)

d UNIVERSITY OF APPLIED SCIENCES RAVENSBURG WEINGARTEN (Germany)

Author keywords

Autonomous systems; Learning; Safety; Temporal difference learning

Indexed keywords

ACCIDENT PREVENTION; ECONOMIC AND SOCIAL EFFECTS; FAULT DETECTION; PLANT MANAGEMENT; SAFETY ENGINEERING;

AUTONOMOUS SYSTEMS; DYNAMIC ENVIRONMENTS; EXPLORATION POLICY; EXPLORATION/EXPLOITATION; EXPLORATION/EXPLOITATION DILEMMAS; LEARNING; SAFETY CRITICAL APPLICATIONS; TEMPORAL DIFFERENCE LEARNING;

REINFORCEMENT LEARNING;

EID: 84867036077 PISSN: 14746670 EISSN: None Source Type: Conference Proceeding
DOI: 10.3182/20120829-3-MX-2028.00160 Document Type: Conference Paper

Times cited: (5)

References (18)

1
- 33745223257
- Cortical substrates for exploratory decisions in humans
- Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B., and Dolan, R.J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), 876-879.
- (2006) Nature , vol.441 , Issue.7095 , pp. 876-879
- Daw, N.D.¹ O'Doherty, J.P.² Dayan, P.³ Seymour, B.⁴ Dolan, R.J.⁵

2
- 84859146294
- Rowohlt, Reinbek bei Hamburg
- Dörner, D. (2000). Die Logik des Mißlingens: Strategisches Denken in komplexen Situationen. Rowohlt, Reinbek bei Hamburg.
- (2000) Die Logik des Mißlingens: Strategisches Denken in komplexen Situationen
- Dörner, D.¹

3
- 84867076681
- On risk formalization of on-line risk assessment for safe decision making in robotics
- Ertle, P., Voos, H., and Söffker, D. (2010). On risk formalization of on-line risk assessment for safe decision making in robotics. In 7th IARP Workshop on Technical Challenges for Dependable Robots in Human Environments, 15-22.
- (2010) 7th IARP Workshop on Technical Challenges for Dependable Robots in Human Environments , pp. 15-22
- Ertle, P.¹ Voos, H.² Söffker, D.³

4
- 13444290317
- Reinforcement learning with bounded risk
- Morgan Kaufmann Publishers Inc.
- Geibel, P. (2001). Reinforcement learning with bounded risk. In Proceedings of the 18th International Conference on Machine Learning, ICML'01, 162-169. Morgan Kaufmann Publishers Inc.
- (2001) Proceedings of the 18th International Conference on Machine Learning, ICML'01 , pp. 162-169
- Geibel, P.¹

5
- 33748998787
- Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
- George, A.P. and Powell, W.B. (2006). Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming. Machine Learning, 65(1), 167-198.
- (2006) Machine Learning , vol.65 , Issue.1 , pp. 167-198
- George, A.P.¹ Powell, W.B.²

6
- 79956136559
- Safe exploration for reinforcement learning
- Hans, A., Schneegaß, D., Schäfer, A.M., and Udluft, S. (2008). Safe exploration for reinforcement learning. In Proceedings of the 16th European Symposium on Artificial Neural Networks ESANN'08, 143-148.
- (2008) Proceedings of the 16th European Symposium on Artificial Neural Networks ESANN'08 , pp. 143-148
- Hans, A.¹ Schneegaß, D.² Schäfer, A.M.³ Udluft, S.⁴

7
- 85120861483
- Consideration of risk in reinforcement learning
- Morgan Kaufmann Publishers, Inc., San Francisco, CA, USA
- Heger, M. (1994). Consideration of risk in reinforcement learning. In Proceedings of the 11th International Conference on Machine Learning, 105-111. Morgan Kaufmann Publishers, Inc., San Francisco, CA, USA.
- (1994) Proceedings of the 11th International Conference on Machine Learning , pp. 105-111
- Heger, M.¹

8
- 51349102890
- On fault tolerance and robustness in autonomous systems
- Lussier, B., Chatila, R., Ingrand, F., Killijian, M.O., and Powell, D. (2004). On fault tolerance and robustness in autonomous systems. In 3rd IARP - IEEE/RAS - EURON Joint Workshop on Technical Challenges for Dependable Robots in Human Environments, 7-9.
- (2004) 3rd IARP - IEEE/RAS - EURON Joint Workshop on Technical Challenges for Dependable Robots in Human Environments , pp. 7-9
- Lussier, B.¹ Chatila, R.² Ingrand, F.³ Killijian, M.O.⁴ Powell, D.⁵

9
- 0036832952
- Risk-sensitive reinforcement learning
- Mihatsch, O. and Neuneier, R. (2002). Risk-sensitive reinforcement learning. Machine Learning, 49(2), 267-290.
- (2002) Machine Learning , vol.49 , Issue.2 , pp. 267-290
- Mihatsch, O.¹ Neuneier, R.²

10
- 84860700444
- Stable adaptive control with online learning
- Ng, A.Y. and Kim, H.J. (2004). Stable adaptive control with online learning. In Advances in Neural Information Processing Systems, 17, 13-18.
- (2004) Advances in Neural Information Processing Systems , vol.17 , pp. 13-18
- Ng, A.Y.¹ Kim, H.J.²

11
- 0141607826
- Lyapunov design for safe reinforcement learning
- Perkins, T.J. and Barto, A.G. (2003). Lyapunov design for safe reinforcement learning. Journal of Machine Learning Research, 3, 803-832.
- (2003) Journal of Machine Learning Research , vol.3 , pp. 803-832
- Perkins, T.J.¹ Barto, A.G.²

12
- 0003636089
- On-line Q-learning using connectionist systems
- Cambridge University
- Rummery, G.A. and Niranjan, M. (1994). On-line Q-learning using connectionist systems. Technical Report CUED/F-INFENG/TR 166, Cambridge University.
- (1994) Technical Report CUED/F-INFENG/TR 166
- Rummery, G.A.¹ Niranjan, M.²

13
- 0031172111
- Autonomy in robots and other agents
- Smithers, T. (1997). Autonomy in robots and other agents. Brain and Cognition, 34, 88-106.
- (1997) Brain and Cognition , vol.34 , pp. 88-106
- Smithers, T.¹

14
- 0004102479
- MIT Press, Cambridge, MA
- Sutton, R.S. and Barto, A.G. (1998). Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

15
- 78349245906
- Adaptive ε-greedy exploration in reinforcement learning based on value differences
- Springer Berlin / Heidelberg
- Tokic, M. (2010). Adaptive ε-greedy exploration in reinforcement learning based on value differences. In KI 2010: Advances in Artificial Intelligence, 203-210. Springer Berlin / Heidelberg.
- (2010) KI 2010: Advances in Artificial Intelligence , pp. 203-210
- Tokic, M.¹

16
- 80054004135
- Value-difference based exploration: Adaptive exploration between epsilon-greedy and softmax
- Springer Berlin / Heidelberg
- Tokic, M. and Palm, G. (2011). Value-difference based exploration: Adaptive exploration between epsilon-greedy and softmax. In KI 2011: Advances in Artificial Intelligence, 335-346. Springer Berlin / Heidelberg.
- (2011) KI 2011: Advances in Artificial Intelligence , pp. 335-346
- Tokic, M.¹ Palm, G.²

17
- 0004049893
- Ph.D. thesis, University of Cambridge, England
- Watkins, C. (1989). Learning from Delayed Rewards. Ph.D. thesis, University of Cambridge, England.
- (1989) Learning from Delayed Rewards
- Watkins, C.¹

18
- 34249833101
- Technical note: Q-learning
- Watkins, C. and Dayan, P. (1992). Technical note: Q-learning. Machine Learning, 8(3), 279-292.
- (1992) Machine Learning , vol.8 , Issue.3 , pp. 279-292
- Watkins, C.¹ Dayan, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.