-
1
-
-
0036465258
-
Expertness based cooperative Q-learning
-
M. N. Ahmadabadi and M. Asadpour, "Expertness based cooperative Q-learning," IEEE Trans. on Systems, Man, and Cybernetics, Part B, vol. 32, no. 1, pp. 66-76, 2002.
-
(2002)
IEEE Trans. on Systems, Man, and Cybernetics, Part B
, vol.32
, Issue.1
, pp. 66-76
-
-
Ahmadabadi, M.N.1
Asadpour, M.2
-
2
-
-
0004007508
-
Reinforcement Learning
-
J. Si, A. G. Barto, W. B. Powell, and D. Wunsch (eds.), Wiley-IEEE Press, Piscataway, NJ
-
A. G. Barto, "Reinforcement Learning," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. G. Barto, W. B. Powell, and D. Wunsch (eds.), pp. 804-809, Wiley-IEEE Press, Piscataway, NJ, 2004.
-
(2004)
Handbook of Learning and Approximate Dynamic Programming
, pp. 804-809
-
-
Barto, A.G.1
-
3
-
-
84979715630
-
Supervised Actor-Critic Reinforcement Learning
-
J. Si, A. G. Barto, W. B. Powell, and D. Wunsch (eds.), Wiley-IEEE Press, Piscataway, NJ
-
A. G. Barto and M. T. Rosenstein, "Supervised Actor-Critic Reinforcement Learning," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. G. Barto, W. B. Powell, and D. Wunsch (eds.), pp. 359-380, Wiley-IEEE Press, Piscataway, NJ, 2004.
-
(2004)
Handbook of Learning and Approximate Dynamic Programming
, pp. 359-380
-
-
Barto, A.G.1
Rosenstein, M.T.2
-
5
-
-
15744397544
-
An Ant System Based Exploration-Exploitation for Reinforcement Learning
-
H. S. Chang, "An Ant System Based Exploration-Exploitation for Reinforcement Learning," in Proc. of the IEEE Conf. on Systems, Man, and Cybernetics, Vol. 4, 2004, pp. 3805-3810.
-
(2004)
Proc. of the IEEE Conf. on Systems, Man, and Cybernetics
, vol.4
, pp. 3805-3810
-
-
Chang, H.S.1
-
6
-
-
0004033139
-
-
D. Corne, F. Glover, and M. Dorigo (eds.), McGraw-Hill
-
D. Corne, F. Glover, and M. Dorigo (eds.), New Ideas in Optimization, McGraw-Hill, 1999.
-
(1999)
New Ideas in Optimization
-
-
-
7
-
-
0043247546
-
Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks
-
C. Drummond, "Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks," J. of Artificial Intelligence Research, vol. 16, 2002, pp. 59-104.
-
(2002)
J. of Artificial Intelligence Research
, vol.16
, pp. 59-104
-
-
Drummond, C.1
-
8
-
-
0002012598
-
The ant colony optimization metaheuristic
-
D. Corne and M. Dorigo (eds.), McGraw-Hill, NY, USA
-
M. Dorigo and G. Di Caro, "The ant colony optimization metaheuristic," in New Ideas in Optimization, D. Corne and M. Dorigo (eds.), pp. 11-32, McGraw-Hill, NY, USA, 1999.
-
(1999)
New Ideas in Optimization
, pp. 11-32
-
-
Dorigo, M.1
Di Caro, G.2
-
12
-
-
0002357911
-
Convergence of Indirect Adaptive Asynchronous Value Iteration Algorithms
-
J. D. Cowan, G. Tesauro, and J. Alspector (eds.), Morgan Kaufmann Publishers, Inc.
-
V. Gullapalli and A. G. Barto, "Convergence of Indirect Adaptive Asynchronous Value Iteration Algorithms," in Advances in Neural Information Processing Systems, J. D. Cowan, G. Tesauro, and J. Alspector (eds.), Morgan Kaufmann Publishers, Inc., vol. 6, 1994, pp. 695-702.
-
(1994)
Advances in Neural Information Processing Systems
, vol.6
, pp. 695-702
-
-
Gullapalli, V.1
Barto, A.G.2
-
13
-
-
0029679044
-
Reinforcement Learning: A Survey
-
L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement Learning: A Survey," J. of Artificial Intelligence Research, vol. 4, 1996, pp. 237-285.
-
(1996)
J. of Artificial Intelligence Research
, vol.4
, pp. 237-285
-
-
Kaelbling, L.P.1
Littman, M.L.2
Moore, A.W.3
-
15
-
-
0000123778
-
Self-improving reactive agents based on reinforcement learning, planning and teaching
-
L. J. Lin, "Self-improving reactive agents based on reinforcement learning, planning and teaching," Machine Learning, vol. 8, 1992, pp. 294-321.
-
(1992)
Machine Learning
, vol.8
, pp. 294-321
-
-
Lin, L.J.1
-
16
-
-
0029732210
-
Creating advice-taking reinforcement learners
-
R. Maclin and J.W. Shavlik, "Creating advice-taking reinforcement learners," Machine Learning, vol. 22, 1996, pp. 251-282.
-
(1996)
Machine Learning
, vol.22
, pp. 251-282
-
-
Maclin, R.1
Shavlik, J.W.2
-
19
-
-
0141596576
-
Policy invariance under reward transformations: Theory and application to reward shaping
-
A. Y. Ng, D. Harada, and S. Russell, "Policy invariance under reward transformations: theory and application to reward shaping," in Proc. of the 16th Int. Conf. on Machine Learning, 1999, pp. 278-287.
-
(1999)
Proc. of the 16th Int. Conf. on Machine Learning
, pp. 278-287
-
-
Ng, A.Y.1
Harada, D.2
Russell, S.3
-
21
-
-
8744269435
-
Reinforcement learning with supervision by a stable controller
-
M. Rosenstein and A. G. Barto, "Reinforcement learning with supervision by a stable controller," in Proc. of the American Control Conf., 2004, pp. 4517-4522.
-
(2004)
Proc. of the American Control Conf.
, pp. 4517-4522
-
-
Rosenstein, M.1
Barto, A.G.2
-
23
-
-
0033901602
-
Convergence results for single-step on-policy reinforcement learning algorithms
-
S. Singh, T. Jaakkola, M. Littman, and C. Szepesvari, "Convergence results for single-step on-policy reinforcement learning algorithms," Machine Learning, vol. 38, pp. 287-308, 2000.
-
(2000)
Machine Learning
, vol.38
, pp. 287-308
-
-
Singh, S.1
Jaakkola, T.2
Littman, M.3
Szepesvari, C.4
-
25
-
-
84947807317
-
Open theoretical questions in reinforcement learning
-
EuroCOLT'99, Nordkirchen, Germany
-
R. Sutton, "Open theoretical questions in reinforcement learning," in Proc. of the 4th European Conference on Computational Learning Theory, EuroCOLT'99, Nordkirchen, Germany, 1999, pp. 11-17.
-
(1999)
Proc. of the 4th European Conference on Computational Learning Theory
, pp. 11-17
-
-
Sutton, R.1
-
26
-
-
0028497630
-
Asynchronous stochastic approximation and Q-learning
-
J. N. Tsitsiklis, "Asynchronous stochastic approximation and Q-learning," Machine Learning, vol. 16, pp. 185-202, 1994.
-
(1994)
Machine Learning
, vol.16
, pp. 185-202
-
-
Tsitsiklis, J.N.1
-
27
-
-
0039967456
-
Analysis of some incremental variants of policy iteration: First steps toward understanding actor-critic learning systems
-
Tech. Rep. NU-CCS-93-11
-
R. J. Williams and L. C. Baird, "Analysis of some incremental variants of policy iteration: first steps toward understanding actor-critic learning systems," Tech. Rep. NU-CCS-93-11, 1993.
-
(1993)
-
-
Williams, R.J.1
Baird, L.C.2