[1] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey," Journal of Artificial Intelligence Research, vol. 4, pp. 237-285, 1996.
[3] T. Jaakkola, S. P. Singh, and M. I. Jordan, "Reinforcement learning algorithm for partially observable Markov decision problems," in Advances in Neural Information Processing Systems 7. MIT Press, 1995, pp. 345-352.
[4] S. P. Singh, T. Jaakkola, and M. I. Jordan, "Learning without state-estimation in partially observable Markovian decision processes," in Proceedings of the Eleventh International Conference on Machine Learning. Morgan Kaufmann, 1994, pp. 284-292.
[5] S. D. Whitehead and L. J. Lin, "Reinforcement learning in non-Markov environments," Artificial Intelligence, vol. 8, pp. 3-4, 1992.
[7] M. Hauskrecht, "Value-function approximations for partially observable Markov decision processes," Journal of Artificial Intelligence Research, vol. 13, pp. 33-94, 2000.
[9] J. Si and Y.-T. Wang, "Online learning control by association and reinforcement," IEEE Transactions on Neural Networks, vol. 12, no. 2, pp. 264-276, Mar. 2001.
[10] T. Jaakkola, M. I. Jordan, and S. P. Singh, "Convergence of stochastic iterative dynamic programming algorithms," Neural Computation, vol. 6, pp. 1185-1201, 1994.
[11] J. Tsitsiklis, D. Bertsekas, and M. Athans, "Distributed asynchronous deterministic and stochastic gradient optimization algorithms," IEEE Transactions on Automatic Control, vol. 31, no. 9, pp. 803-812, Sep. 1986.
[14] E. Eweda and O. Macchi, "Convergence of an adaptive linear estimation algorithm," IEEE Transactions on Automatic Control, vol. 29, no. 2, pp. 119-127, Feb. 1984.
[15] V. R. Konda and J. N. Tsitsiklis, "Linear stochastic approximation driven by slowly varying Markov chains," Systems and Control Letters, vol. 50, pp. 95-102, 2003.
[16] L. Aguilar, I. Boiko, L. Fridman, and R. Iriarte, "Generating self-excited oscillations via two-relay controller," IEEE Transactions on Automatic Control, vol. 54, no. 2, pp. 416-420, Feb. 2009.