2. Phua, C.W., Fitch, R.: Tracking Value Function Dynamics to Improve Reinforcement Learning with Piecewise Linear Function Approximation. In: International Conference on Machine Learning, ICML 2007 (2007)
3. Sutton, R.S., Koop, A., Silver, D.: On the role of tracking in stationary environments. In: Proceedings of the 24th international conference on Machine learning, pp. 871-878 (2007)
4. Geist, M., Pietquin, O., Fricout, G.: Kalman Temporal Differences: the deterministic case. In: Proceedings of the IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, TN, USA (April 2009)
5. Kalman, R.E.: A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME-Journal of Basic Engineering 82(Series D), 35-45 (1960)
6. Julier, S.J., Uhlmann, J.K.: Unscented filtering and nonlinear estimation. Proceedings of the IEEE 92(3), 401-422 (2004)
8. Bradtke, S.J., Barto, A.G.: Linear Least-Squares Algorithms for Temporal Difference Learning. Machine Learning 22(1-3), 33-57 (1996)
10. Antos, A., Szepesvári, C., Munos, R.: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning 71(1), 89-129 (2008)
11. Kakade, S.: A natural policy gradient. In: Advances in Neural Information Processing Systems 14 (NIPS 2001), Vancouver, British Columbia, Canada, pp. 1531-1538 (2001)
12. Peters, J., Vijayakumar, S., Schaal, S.: Natural actor-critic. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 280-291. Springer, Heidelberg (2005)
13. Boyan, J.A.: Technical Update: Least-Squares Temporal Difference Learning. Machine Learning 49(2-3), 233-246 (1999)
15. Jo, S., Kim, S.W.: Consistent Normalized Least Mean Square Filtering with Noisy Data Matrix. IEEE Transactions on Signal Processing 53(6), 2112-2123 (2005)
16. Engel, Y., Mannor, S., Meir, R.: Reinforcement Learning with Gaussian Processes. In: Proceedings of International Conference on Machine Learning, ICML 2005 (2005)
17. Bhatnagar, S., Sutton, R.S., Ghavamzadeh, M., Lee, M.: Incremental Natural Actor-Critic Algorithms. In: Advances in Neural Information Processing Systems, Vancouver, vol. 21 (2008)
18. Geist, M., Pietquin, O., Fricout, G.: Bayesian Reward Filtering. In: Girgin, S., Loth, M., Munos, R., Preux, P., Ryabko, D. (eds.) EWRL 2008. LNCS (LNAI), vol. 5323, pp. 96-109. Springer, Heidelberg (2008)