-
1
-
-
0004140522
-
-
Barto, A.G., Sutton, R.S., Anderson, C.W.: Neuronlike adaptive elements that can solve difficult learning control problems, pp. 535-549 (1988)
-
(1988)
Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems
, pp. 535-549
-
-
Barto, A.G.1
Sutton, R.S.2
Anderson, C.W.3
-
2
-
-
0004049893
-
-
PhD thesis, Cambridge University, Cambridge, England
-
Watkins, C.: Learning from Delayed Rewards. PhD thesis, Cambridge University, Cambridge, England (1989)
-
(1989)
Learning from Delayed Rewards
-
-
Watkins, C.1
-
3
-
-
84898939480
-
Policy Gradient Methods for Reinforcement Learning with Function Approximation
-
Sutton, R.S., McAllester, D.A., Singh, S.P., Mansour, Y.: Policy Gradient Methods for Reinforcement Learning with Function Approximation. In: Advances in Neural Information Processing Systems (NIPS 12), pp. 1057-1063 (2000)
-
(2000)
Advances in Neural Information Processing Systems (NIPS 12)
, pp. 1057-1063
-
-
Sutton, R.S.1
McAllester, D.A.2
Singh, S.P.3
Mansour, Y.4
-
5
-
-
34447553096
-
Reinforcement Learning for Humanoid Robotics
-
Peters, J., Vijayakumar, S., Schaal, S.: Reinforcement Learning for Humanoid Robotics. In: Third IEEE-RAS International Conference on Humanoid Robots, Humanoids 2003 (2003)
-
Third IEEE-RAS International Conference on Humanoid Robots, Humanoids 2003 (2003)
-
-
Peters, J.1
Vijayakumar, S.2
Schaal, S.3
-
7
-
-
85162049326
-
Incremental Natural Actor-Critic Algorithms
-
Vancouver, Canada
-
Bhatnagar, S., Sutton, R.S., Ghavamzadeh, M., Lee, M.: Incremental Natural Actor-Critic Algorithms. In: Advances in Neural Information Processing Systems (NIPS 21), Vancouver, Canada (2007)
-
(2007)
Advances in Neural Information Processing Systems (NIPS 21)
-
-
Bhatnagar, S.1
Sutton, R.S.2
Ghavamzadeh, M.3
Lee, M.4
-
8
-
-
0000396062
-
Natural gradient works efficiently in learning
-
Amari, S.I.: Natural gradient works efficiently in learning. Neural Computation 10, 251-276 (1998)
-
(1998)
Neural Computation
, vol.10
, pp. 251-276
-
-
Amari, S.I.1
-
10
-
-
67650458797
-
Kalman Temporal Differences: The deterministic case
-
Geist, M., Pietquin, O., Fricout, G.: Kalman Temporal Differences: the deterministic case. In: Proceedings of the IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA (2009)
-
Proceedings of the IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA (2009)
-
-
Geist, M.1
Pietquin, O.2
Fricout, G.3
-
11
-
-
50849108789
-
Utilizing the Natural Gradient in Temporal Difference Reinforcement Learning with Eligibility Traces
-
Morimura, T., Uchibe, E., Doya, K.: Utilizing the Natural Gradient in Temporal Difference Reinforcement Learning with Eligibility Traces. In: 2nd International Symposium on Information Geometry and its Applications, Tokyo, Japan, pp. 256-263 (2005)
-
(2005)
2nd International Symposium on Information Geometry and Its Applications, Tokyo, Japan
, pp. 256-263
-
-
Morimura, T.1
Uchibe, E.2
Doya, K.3
-
12
-
-
67650505326
-
The QV Family Compared to Other Reinforcement Learning Algorithms
-
Wiering, M., van Hasselt, H.: The QV Family Compared to Other Reinforcement Learning Algorithms. In: IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA (2009)
-
IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA (2009)
-
-
Wiering, M.1
Van Hasselt, H.2
-
13
-
-
0001771345
-
Linear Least-Squares algorithms for temporal difference learning
-
Bradtke, S.J., Barto, A.G.: Linear Least-Squares algorithms for temporal difference learning. Machine Learning 22, 33-57 (1996)
-
(1996)
Machine Learning
, vol.22
, pp. 33-57
-
-
Bradtke, S.J.1
Barto, A.G.2
-
14
-
-
76649127744
-
Tracking in reinforcement learning
-
Leung, C.S., Lee, M., Chan, J.H. (eds.) ICONIP 2009. Springer, Heidelberg
-
Geist, M., Pietquin, O., Fricout, G.: Tracking in reinforcement learning. In: Leung, C.S., Lee, M., Chan, J.H. (eds.) ICONIP 2009. LNCS, vol. 5863, pp. 502-511. Springer, Heidelberg (2009)
-
(2009)
LNCS
, vol.5863
, pp. 502-511
-
-
Geist, M.1
Pietquin, O.2
Fricout, G.3
-
15
-
-
33646831159
-
An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm
-
Hao, Y., Liu, J., Wang, Y.-P., Cheung, Y.-m., Yin, H., Jiao, L., Ma, J., Jiao, Y.-C. (eds.) CIS 2005. Springer, Heidelberg
-
Park, J., Kim, J., Kang, D.: An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm. In: Hao, Y., Liu, J., Wang, Y.-P., Cheung, Y.-m., Yin, H., Jiao, L., Ma, J., Jiao, Y.-C. (eds.) CIS 2005. LNCS (LNAI), vol. 3801, pp. 65-72. Springer, Heidelberg (2005)
-
(2005)
LNCS (LNAI)
, vol.3801
, pp. 65-72
-
-
Park, J.1
Kim, J.2
Kang, D.3