-
3
-
-
34948825740
-
Reinforcement Learning for Problems with Hidden State
-
Tech. Report, University of Toronto
-
Samuel W. Hasinoff, "Reinforcement Learning for Problems with Hidden State", Tech. Report, University of Toronto, 2002.
-
(2002)
-
-
Hasinoff, S.W.1
-
4
-
-
0000728324
-
Reinforcement Learning in Markovian and Non-Markovian Environments
-
Jürgen H. Schmidhuber, "Reinforcement Learning in Markovian and Non-Markovian Environments", Advances in Neural Information Processing Systems, 3, pp. 500-506, 1991.
-
(1991)
Advances in Neural Information Processing Systems
, vol.3
, pp. 500-506
-
-
Schmidhuber, J.H.1
-
5
-
-
0004049893
-
Learning from Delayed Rewards
-
PhD thesis, King's College, Cambridge, UK
-
Christopher J. C. H. Watkins, "Learning from Delayed Rewards", PhD thesis, King's College, Cambridge, UK, 1989.
-
(1989)
-
-
Watkins, C.J.C.H.1
-
6
-
-
0242552235
-
Using Suitable Action Selection Rule in Reinforcement Learning
-
-
Masayuki Ohta, Yoichiro Kumada, Itsuki Noda, "Using Suitable Action Selection Rule in Reinforcement Learning", IEEE International Conference on Systems, Man, and Cybernetics, pp. 4358-4363, 2003.
-
(2003)
IEEE International Conference on Systems, Man, and Cybernetics
, pp. 4358-4363
-
-
Ohta, M.1
Kumada, Y.2
Noda, I.3
-
7
-
-
56749151758
-
-
Memoryless Policies: Theoretical Limitations and Practical Results
-
M. Littman, "Memoryless Policies: Theoretical Limitations and Practical Results", From Animals to Animats 3: Proceedings of the 3rd International Conference on Simulation and Adaptive Behavior, 1994.
-
-
-
-
8
-
-
38149018611
-
Solving Deep Memory POMDPs with Recurrent Policy Gradients
-
Springer, Germany
-
Daan Wierstra, Alexander Foerster, Jan Peters, and Juergen Schmidhuber, "Solving Deep Memory POMDPs with Recurrent Policy Gradients", Artificial Neural Networks: ICANN 2007, pp. 697-706, Springer, Germany, 2007.
-
(2007)
Artificial Neural Networks: ICANN 2007
, pp. 697-706
-
-
Wierstra, D.1
Foerster, A.2
Peters, J.3
Schmidhuber, J.4
-
9
-
-
0003673017
-
Reinforcement Learning for Robots Using Neural Networks
-
PhD thesis
-
L.-J. Lin, "Reinforcement Learning for Robots Using Neural Networks", PhD thesis, 1993.
-
(1993)
-
-
Lin, L.-J.1
-
10
-
-
84899015857
-
Reinforcement learning with Long Short Term Memory
-
B. Bakker, "Reinforcement learning with Long Short Term Memory", NIPS 14, 2002.
-
(2002)
NIPS 14
-
-
Bakker, B.1
-
11
-
-
0346149797
-
A Robot that Reinforcement-Learns to Identify and Memorize Important Previous Observations
-
B. Bakker, V. Zhumatiy, G. Gruener, and J. Schmidhuber, "A Robot that Reinforcement-Learns to Identify and Memorize Important Previous Observations", Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2003.
-
(2003)
Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems
-
-
Bakker, B.1
Zhumatiy, V.2
Gruener, G.3
Schmidhuber, J.4
-
12
-
-
0028746933
-
Reinforcement Learning using a Recurrent Neural Network
-
F. Ho and M. Kamel, "Reinforcement Learning using a Recurrent Neural Network", IEEE World Congress on Computational Intelligence, Proceedings of the ICNN, Vol. 1, pp. 437-440, 1994.
-
(1994)
IEEE World Congress on Computational Intelligence, Proceedings of the ICNN
, vol.1
, pp. 437-440
-
-
Ho, F.1
Kamel, M.2
-
14
-
-
0031619061
-
Recurrent Neural Networks for Reinforcement Learning: Architecture, Learning Algorithms and Internal Representation
-
Ahmet Onat, Hajime Kita, and Yoshikazu Nishikawa, "Recurrent Neural Networks for Reinforcement Learning: Architecture, Learning Algorithms and Internal Representation", Proceedings of the IJCNN98, pp. 2010-2015, 1998.
-
(1998)
Proceedings of the IJCNN98
, pp. 2010-2015
-
-
Onat, A.1
Kita, H.2
Nishikawa, Y.3
-
16
-
-
34547097516
-
-
Robust Reinforcement Learning Control using Integral Quadratic Constraints for Recurrent Neural Networks
-
Charles W. Anderson, Peter Michael Young, Michael R. Buehner, James N. Knight, Keith A. Bush, and Douglas C. Hittle, "Robust Reinforcement Learning Control using Integral Quadratic Constraints for Recurrent Neural Networks", IEEE Transactions on Neural Networks: Special Issue on Neural Networks for Feedback Control Systems, 18(4), pp. 993-1002, 2007.
-
-
-
-
18
-
-
0033629916
-
Reinforcement Learning in Continuous Time and Space
-
Kenji Doya, "Reinforcement Learning in Continuous Time and Space", Neural Computation, 12, pp. 243-269, 2000.
-
(2000)
Neural Computation
, vol.12
, pp. 243-269
-
-
Doya, K.1
-
20
-
-
21844465127
-
Tree-Based Batch Mode Reinforcement Learning
-
D. Ernst, P. Geurts, and L. Wehenkel, "Tree-Based Batch Mode Reinforcement Learning", Journal of Machine Learning Research, 6, pp. 503-556, 2005.
-
(2005)
Journal of Machine Learning Research
, vol.6
, pp. 503-556
-
-
Ernst, D.1
Geurts, P.2
Wehenkel, L.3
-
21
-
-
34447537708
-
Application of SONQL for Real-Time Learning of Robot Behaviors
-
Marc Carreras, Junku Yuh, Joan Batlle, and Pere Ridao, "Application of SONQL for Real-Time Learning of Robot Behaviors", Robotics and Autonomous Systems, 55(8), pp. 628-642, 2007.
-
(2007)
Robotics and Autonomous Systems
, vol.55
, Issue.8
, pp. 628-642
-
-
Carreras, M.1
Yuh, J.2
Batlle, J.3
Ridao, P.4
-
22
-
-
0000123778
-
Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching
-
L.-J. Lin, "Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching", Machine Learning, 8, pp. 293-321, 1992.
-
(1992)
Machine Learning
, vol.8
, pp. 293-321
-
-
Lin, L.-J.1
-
23
-
-
33845607326
-
Quasi-Online Reinforcement Learning for Robots
-
USA
-
B. Bakker, V. Zhumatiy, G. Gruener, and J. Schmidhuber, "Quasi-Online Reinforcement Learning for Robots", Proceedings of the International Conference on Robotics and Automation, USA, 2006.
-
(2006)
Proceedings of the International Conference on Robotics and Automation
-
-
Bakker, B.1
Zhumatiy, V.2
Gruener, G.3
Schmidhuber, J.4
-
25
-
-
0034293152
-
Learning to Forget: Continual Prediction with LSTM
-
F. A. Gers, J. Schmidhuber, and F. Cummins, "Learning to Forget: Continual Prediction with LSTM", Neural Computation, vol. 12, pp. 2451-2471, 2000.
-
(2000)
Neural Computation
, vol.12
, pp. 2451-2471
-
-
Gers, F.A.1
Schmidhuber, J.2
Cummins, F.3
-
26
-
-
0041965934
-
Learning Precise Timing with LSTM Recurrent Networks
-
F. Gers, N. Schraudolph, J. Schmidhuber, "Learning Precise Timing with LSTM Recurrent Networks", Journal of Machine Learning Research, 3, pp. 115-143, 2002.
-
(2002)
Journal of Machine Learning Research
, vol.3
, pp. 115-143
-
-
Gers, F.1
Schraudolph, N.2
Schmidhuber, J.3
-
27
-
-
33847649288
-
Training Recurrent Networks by Evolino
-
J. Schmidhuber, D. Wierstra, M. Gagliolo, and F. Gomez, "Training Recurrent Networks by Evolino", Neural Computation, 19(3), pp. 757-779, 2007.
-
(2007)
Neural Computation
, vol.19
, Issue.3
, pp. 757-779
-
-
Schmidhuber, J.1
Wierstra, D.2
Gagliolo, M.3
Gomez, F.4
-
28
-
-
0141596576
-
Policy invariance under reward transformations: Theory and application to reward shaping
-
A. Y. Ng, D. Harada, and S. Russell, "Policy invariance under reward transformations: theory and application to reward shaping", Proceedings of the 16th International Conf. on Machine Learning, pp. 278-287, 1999.
-
(1999)
Proceedings of the 16th International Conf. on Machine Learning
, pp. 278-287
-
-
Ng, A.Y.1
Harada, D.2
Russell, S.3