-
1
-
-
33745223257
-
Cortical substrates for exploratory decisions in humans
-
Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B., and Dolan, R.J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), 876-879.
-
(2006)
Nature
, vol.441
, Issue.7095
, pp. 876-879
-
-
Daw, N.D.1
O'Doherty, J.P.2
Dayan, P.3
Seymour, B.4
Dolan, R.J.5
-
3
-
-
84867076681
-
On risk formalization of on-line risk assessment for safe decision making in robotics
-
Ertle, P., Voos, H., and Söffker, D. (2010). On risk formalization of on-line risk assessment for safe decision making in robotics. In 7th IARP Workshop on Technical Challenges for Dependable Robots in Human Environments, 15-22.
-
(2010)
7th IARP Workshop on Technical Challenges for Dependable Robots in Human Environments
, pp. 15-22
-
-
Ertle, P.1
Voos, H.2
Söffker, D.3
-
5
-
-
33748998787
-
Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
-
George, A.P. and Powell, W.B. (2006). Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming. Machine Learning, 65(1), 167-198.
-
(2006)
Machine Learning
, vol.65
, Issue.1
, pp. 167-198
-
-
George, A.P.1
Powell, W.B.2
-
6
-
-
79956136559
-
Safe exploration for reinforcement learning
-
Hans, A., Schneegaß, D., Schäfer, A.M., and Udluft, S. (2008). Safe exploration for reinforcement learning. In Proceedings of the 16th European Symposium on Artificial Neural Networks ESANN'08, 143-148.
-
(2008)
Proceedings of the 16th European Symposium on Artificial Neural Networks ESANN'08
, pp. 143-148
-
-
Hans, A.1
Schneegaß, D.2
Schäfer, A.M.3
Udluft, S.4
-
7
-
-
85120861483
-
Consideration of risk in reinforcement learning
-
Morgan Kaufmann Publishers, Inc., San Francisco, CA, USA
-
Heger, M. (1994). Consideration of risk in reinforcement learning. In Proceedings of the 11th International Conference on Machine Learning, 105-111. Morgan Kaufmann Publishers, Inc., San Francisco, CA, USA.
-
(1994)
Proceedings of the 11th International Conference on Machine Learning
, pp. 105-111
-
-
Heger, M.1
-
8
-
-
51349102890
-
On fault tolerance and robustness in autonomous systems
-
Lussier, B., Chatila, R., Ingrand, F., Killijian, M.O., and Powell, D. (2004). On fault tolerance and robustness in autonomous systems. In 3rd IARP - IEEE/RAS - EURON Joint Workshop on Technical Challenges for Dependable Robots in Human Environments, 7-9.
-
(2004)
3rd IARP - IEEE/RAS - EURON Joint Workshop on Technical Challenges for Dependable Robots in Human Environments
, pp. 7-9
-
-
Lussier, B.1
Chatila, R.2
Ingrand, F.3
Killijian, M.O.4
Powell, D.5
-
9
-
-
0036832952
-
Risk-sensitive reinforcement learning
-
Mihatsch, O. and Neuneier, R. (2002). Risk-sensitive reinforcement learning. Machine Learning, 49(2), 267-290.
-
(2002)
Machine Learning
, vol.49
, Issue.2
, pp. 267-290
-
-
Mihatsch, O.1
Neuneier, R.2
-
13
-
-
0031172111
-
Autonomy in robots and other agents
-
Smithers, T. (1997). Autonomy in robots and other agents. Brain and Cognition, 34, 88-106.
-
(1997)
Brain and Cognition
, vol.34
, pp. 88-106
-
-
Smithers, T.1
-
15
-
-
78349245906
-
Adaptive ε-greedy exploration in reinforcement learning based on value differences
-
Springer Berlin / Heidelberg
-
Tokic, M. (2010). Adaptive ε-greedy exploration in reinforcement learning based on value differences. In KI 2010: Advances in Artificial Intelligence, 203-210. Springer Berlin / Heidelberg.
-
(2010)
KI 2010: Advances in Artificial Intelligence
, pp. 203-210
-
-
Tokic, M.1
-
16
-
-
80054004135
-
Value-difference based exploration: Adaptive exploration between epsilon-greedy and softmax
-
Springer Berlin / Heidelberg
-
Tokic, M. and Palm, G. (2011). Value-difference based exploration: Adaptive exploration between epsilon-greedy and softmax. In KI 2011: Advances in Artificial Intelligence, 335-346. Springer Berlin / Heidelberg.
-
(2011)
KI 2011: Advances in Artificial Intelligence
, pp. 335-346
-
-
Tokic, M.1
Palm, G.2
-
17
-
-
0004049893
-
-
Ph.D. thesis, University of Cambridge, England
-
Watkins, C. (1989). Learning from Delayed Rewards. Ph.D. thesis, University of Cambridge, England.
-
(1989)
Learning from Delayed Rewards
-
-
Watkins, C.1
-
18
-
-
34249833101
-
Technical note: Q-learning
-
Watkins, C. and Dayan, P. (1992). Technical note: Q-learning. Machine Learning, 8(3), 279-292.
-
(1992)
Machine Learning
, vol.8
, Issue.3
, pp. 279-292
-
-
Watkins, C.1
Dayan, P.2