-
3
-
-
84921399937
-
-
J. Si, A. Barto, W. B. Powell, and D. Wunsch II, Handbook of Learning and Approximate Dynamic Programming. IEEE, 2004.
-
-
-
-
5
-
-
0031074521
-
Locally weighted learning
-
C. G. Atkeson, A. W. Moore, and S. Schaal, "Locally weighted learning," Artificial Intelligence Review, vol. 11, pp. 11-73, 1997.
-
(1997) Artificial Intelligence Review, vol. 11, pp. 11-73
-
-
Atkeson, C.G.
Moore, A.W.
Schaal, S.
-
9
-
-
28644446278
-
Evolutionary policy iteration for solving Markov decision processes
-
H. S. Chang, H. G. Lee, M. C. Fu, and S. I. Marcus, "Evolutionary policy iteration for solving Markov decision processes," IEEE Transactions on Automatic Control, vol. 50, pp. 1804-1808, 2005.
-
(2005) IEEE Transactions on Automatic Control, vol. 50, pp. 1804-1808
-
-
Chang, H.S.
Lee, H.G.
Fu, M.C.
Marcus, S.I.
-
10
-
-
41849132401
-
An evolutionary random policy search algorithm for solving Markov decision processes
-
J. Hu, M. C. Fu, V. R. Ramezani, and S. I. Marcus, "An evolutionary random policy search algorithm for solving Markov decision processes," INFORMS Journal on Computing, to appear, 2007.
-
(2007)
-
-
Hu, J.
Fu, M.C.
Ramezani, V.R.
Marcus, S.I.
-
11
-
-
14644444172
-
An adaptive sampling algorithm for solving Markov decision processes
-
H. S. Chang, M. C. Fu, J. Hu, and S. I. Marcus, "An adaptive sampling algorithm for solving Markov decision processes," Operations Research, vol. 53, pp. 126-139, 2005.
-
(2005) Operations Research, vol. 53, pp. 126-139
-
-
Chang, H.S.
Fu, M.C.
Hu, J.
Marcus, S.I.
-
15
-
-
0001509947
-
Using randomization to break the curse of dimensionality
-
J. Rust, "Using randomization to break the curse of dimensionality," Econometrica, vol. 65, no. 3, pp. 487-516, 1997. [Online]. Available: citeseer.ist.psu.edu/rust96using.html
-
(1997) Econometrica, vol. 65, no. 3, pp. 487-516
-
-
Rust, J.
-
18
-
-
34548751619
-
-
G. Gordon, "Approximate solutions to Markov decision processes," Ph.D. dissertation, Carnegie Mellon University, 1999. [Online]. Available: citeseer.ist.psu.edu/gordon99approximate.html
-
-
-
-
19
-
-
34548750791
-
-
R. J. Williams and L. C. Baird, III, "Analysis of some incremental variants of policy iteration: First steps toward understanding actor-critic learning systems," Northeastern University, Tech. Rep. NU-CCS-93-11, 1993. [Online]. Available: citeseer.ist.psu.edu/williams93analysis.html
-
-
-
-
20
-
-
34249833101
-
Q-learning
-
C. Watkins and P. Dayan, "Q-learning," Machine Learning, vol. 8, no. 3, pp. 279-292, 1992.
-
(1992) Machine Learning, vol. 8, no. 3, pp. 279-292
-
-
Watkins, C.
Dayan, P.
-
21
-
-
33750502578
-
From dynamic programming to RRTs: Algorithmic design of feasible trajectories
-
S. M. LaValle, "From dynamic programming to RRTs: Algorithmic design of feasible trajectories," in Control Problems in Robotics. Springer-Verlag, 2002, pp. 19-37.
-
(2002) Control Problems in Robotics, pp. 19-37
-
-
LaValle, S.M.