



Volume 8, Issue PART 1, 2012, Pages 660-665

Robust exploration/exploitation trade-offs in safety-critical applications

Author keywords

Autonomous systems; Learning; Safety; Temporal difference learning

Indexed keywords

ACCIDENT PREVENTION; ECONOMIC AND SOCIAL EFFECTS; FAULT DETECTION; PLANT MANAGEMENT; SAFETY ENGINEERING;

EID: 84867036077     PISSN: 14746670     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.3182/20120829-3-MX-2028.00160     Document Type: Conference Paper
Times cited: 5

References (18)
  • 1
    • 33745223257
    • Cortical substrates for exploratory decisions in humans
    • Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B., and Dolan, R.J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), 876-879.
    • (2006) Nature , vol.441 , Issue.7095 , pp. 876-879
    • Daw, N.D.1    O'Doherty, J.P.2    Dayan, P.3    Seymour, B.4    Dolan, R.J.5
  • 5
    • 33748998787
    • Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
    • George, A.P. and Powell, W.B. (2006). Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming. Machine Learning, 65(1), 167-198.
    • (2006) Machine Learning , vol.65 , Issue.1 , pp. 167-198
    • George, A.P.1    Powell, W.B.2
  • 7
    • 85120861483
    • Consideration of risk in reinforcement learning
    • Morgan Kaufmann Publishers, Inc., San Francisco, CA, USA
    • Heger, M. (1994). Consideration of risk in reinforcement learning. In Proceedings of the 11th International Conference on Machine Learning, 105-111. Morgan Kaufmann Publishers, Inc., San Francisco, CA, USA.
    • (1994) Proceedings of the 11th International Conference on Machine Learning , pp. 105-111
    • Heger, M.1
  • 9
    • 0036832952
    • Risk-sensitive reinforcement learning
    • Mihatsch, O. and Neuneier, R. (2002). Risk-sensitive reinforcement learning. Machine Learning, 49(2), 267-290.
    • (2002) Machine Learning , vol.49 , Issue.2 , pp. 267-290
    • Mihatsch, O.1    Neuneier, R.2
  • 13
    • 0031172111
    • Autonomy in robots and other agents
    • Smithers, T. (1997). Autonomy in robots and other agents. Brain and Cognition, 34, 88-106.
    • (1997) Brain and Cognition , vol.34 , pp. 88-106
    • Smithers, T.1
  • 15
    • 78349245906
    • Adaptive ε-greedy exploration in reinforcement learning based on value differences
    • Springer Berlin / Heidelberg
    • Tokic, M. (2010). Adaptive ε-greedy exploration in reinforcement learning based on value differences. In KI 2010: Advances in Artificial Intelligence, 203-210. Springer Berlin / Heidelberg.
    • (2010) KI 2010: Advances in Artificial Intelligence , pp. 203-210
    • Tokic, M.1
  • 16
    • 80054004135
    • Value-difference based exploration: Adaptive exploration between epsilon-greedy and softmax
    • Springer Berlin / Heidelberg
    • Tokic, M. and Palm, G. (2011). Value-difference based exploration: Adaptive exploration between epsilon-greedy and softmax. In KI 2011: Advances in Artificial Intelligence, 335-346. Springer Berlin / Heidelberg.
    • (2011) KI 2011: Advances in Artificial Intelligence , pp. 335-346
    • Tokic, M.1    Palm, G.2
  • 17
  • 18
    • 34249833101
    • Technical note: Q-learning
    • Watkins, C. and Dayan, P. (1992). Technical note: Q-learning. Machine Learning, 8(3), 279-292.
    • (1992) Machine Learning , vol.8 , Issue.3 , pp. 279-292
    • Watkins, C.1    Dayan, P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.