-
3
-
-
0003787146
-
-
Princeton University Press, Princeton, New Jersey
-
R. Bellman. Dynamic Programming. Princeton University Press, Princeton, New Jersey, 1957.
-
(1957)
Dynamic Programming
-
-
Bellman, R.1
-
5
-
-
0001234682
-
Feudal reinforcement learning
-
San Mateo, CA, Morgan Kaufmann
-
P. Dayan and G. E. Hinton. Feudal reinforcement learning. In Advances in Neural Information Processing Systems, volume 5, pages 271-278, San Mateo, CA, 1993. Morgan Kaufmann.
-
(1993)
Advances in Neural Information Processing Systems
, vol.5
, pp. 271-278
-
-
Dayan, P.1
Hinton, G.E.2
-
6
-
-
0002278788
-
Hierarchical reinforcement learning with the MAXQ value function decomposition
-
T.G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227-303, 2000.
-
(2000)
Journal of Artificial Intelligence Research
, vol.13
, pp. 227-303
-
-
Dietterich, T.G.1
-
7
-
-
0004671869
-
Temporal difference learning in continuous time and space
-
D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, editors, Cambridge, MA, MIT Press
-
K. Doya. Temporal difference learning in continuous time and space. In D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, Cambridge, MA, 1996. MIT Press.
-
(1996)
Advances in Neural Information Processing Systems
, vol.8
-
-
Doya, K.1
-
8
-
-
0033629916
-
Reinforcement learning in continuous time and space
-
K. Doya. Reinforcement learning in continuous time and space. Neural Computation, 12:243-269, 2000.
-
(2000)
Neural Computation
, vol.12
, pp. 243-269
-
-
Doya, K.1
-
9
-
-
0030324990
-
Self-organizing multi-resolution grid for motion planning and control
-
T. Fomin, T. Rozgonyi, Cs. Szepesvári, and A. Lorincz. Self-organizing multi-resolution grid for motion planning and control. International Journal of Neural Systems, 7:757-776, 1997.
-
(1997)
International Journal of Neural Systems
, vol.7
, pp. 757-776
-
-
Fomin, T.1
Rozgonyi, T.2
Szepesvári, Cs.3
Lorincz, A.4
-
10
-
-
0034272032
-
Bounded-parameter Markov decision processes
-
R. Givan, S. M. Leach, and T. Dean. Bounded-parameter Markov decision processes. Artificial Intelligence, 122(1-2):71-109, 2000. URL citeseer.nj.nec.com/article/givan97bounded.html.
-
(2000)
Artificial Intelligence
, vol.122
, Issue.1-2
, pp. 71-109
-
-
Givan, R.1
Leach, S.M.2
Dean, T.3
-
11
-
-
0002357911
-
Convergence of indirect adaptive asynchronous value iteration algorithms
-
J. D. Cowan, G. Tesauro, and J. Alspector, editors, San Mateo, CA. Morgan Kaufmann
-
V. Gullapalli and A. G. Barto. Convergence of indirect adaptive asynchronous value iteration algorithms. In J. D. Cowan, G. Tesauro, and J. Alspector, editors, Advances in Neural Information Processing Systems, volume 6, pages 695-702, San Mateo, CA, 1994. Morgan Kaufmann.
-
(1994)
Advances in Neural Information Processing Systems
, vol.6
, pp. 695-702
-
-
Gullapalli, V.1
Barto, A.G.2
-
13
-
-
0026913154
-
Gross motion planning - A survey
-
Y. K. Hwang and N. Ahuja. Gross motion planning - a survey. ACM Computing Surveys, 24(3):219-291, 1992.
-
(1992)
ACM Computing Surveys
, vol.24
, Issue.3
, pp. 219-291
-
-
Hwang, Y.K.1
Ahuja, N.2
-
14
-
-
0000439891
-
On the convergence of stochastic iterative dynamic programming algorithms
-
November
-
T. Jaakkola, M. I. Jordan, and S. P. Singh. On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6):1185-1201, November 1994.
-
(1994)
Neural Computation
, vol.6
, Issue.6
, pp. 1185-1201
-
-
Jaakkola, T.1
Jordan, M.I.2
Singh, S.P.3
-
17
-
-
0032045145
-
Module-based reinforcement learning: Experiments with a real robot
-
Z. Kalmár, Cs. Szepesvári, and A. Lorincz. Module-based reinforcement learning: Experiments with a real robot. Machine Learning, 31:55-85, 1998.
-
(1998)
Machine Learning
, vol.31
, pp. 55-85
-
-
Kalmár, Z.1
Szepesvári, Cs.2
Lorincz, A.3
-
18
-
-
0042415660
-
Event-learning and robust policy heuristics
-
forthcoming
-
A. Lorincz, I. Pólik, and I. Szita. Event-learning and robust policy heuristics. Cognitive Systems Research, 2002, forthcoming. URL http://people.inf.elte.hu/lorincz/Files/NIPG-ELU-14-05-2001.ps.
-
(2002)
Cognitive Systems Research
-
-
Lorincz, A.1
Pólik, I.2
Szita, I.3
-
19
-
-
85149834820
-
Markov games as a framework for multi-agent reinforcement learning
-
San Francisco, CA. Morgan Kaufmann
-
M. L. Littman. Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the Eleventh International Conference on Machine Learning, pages 157-163, San Francisco, CA, 1994. Morgan Kaufmann.
-
(1994)
Proceedings of the Eleventh International Conference on Machine Learning
, pp. 157-163
-
-
Littman, M.L.1
-
20
-
-
0002434059
-
Learning behavior networks from experience
-
F. J. Varela and P. Bourgine, editors, Cambridge, MA. MIT Press, Cambridge
-
P. Maes. Learning behavior networks from experience. In F. J. Varela and P. Bourgine, editors, Toward a practice of autonomous systems: Proceedings of the First European Conf. on Artificial Life, Cambridge, MA, 1992. MIT Press, Cambridge.
-
(1992)
Toward a Practice of Autonomous Systems: Proceedings of the First European Conf. on Artificial Life
-
-
Maes, P.1
-
21
-
-
0026880130
-
Automatic programming of behavior-based robots using reinforcement learning
-
S. Mahadevan and J. Connell. Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence, 55:311-365, 1992.
-
(1992)
Artificial Intelligence
, vol.55
, pp. 311-365
-
-
Mahadevan, S.1
Connell, J.2
-
26
-
-
0002876837
-
Scaling reinforcement learning algorithms by learning variable temporal resolution models
-
San Mateo, CA. Morgan Kaufmann
-
S. P. Singh. Scaling reinforcement learning algorithms by learning variable temporal resolution models. In Proceedings of the Ninth International Conference on Machine Learning, MLC-92, San Mateo, CA, 1992. Morgan Kaufmann.
-
(1992)
Proceedings of the Ninth International Conference on Machine Learning
, vol.MLC-92
-
-
Singh, S.P.1
-
28
-
-
0009656873
-
Between MDPs and semi-MDPs: Learning, planning and representing knowledge at multiple temporal scales
-
R. Sutton, D. Precup, and S. Singh. Between MDPs and semi-MDPs: Learning, planning and representing knowledge at multiple temporal scales. Journal of Artificial Intelligence Research, 1:1-39, 1998.
-
(1998)
Journal of Artificial Intelligence Research
, vol.1
, pp. 1-39
-
-
Sutton, R.1
Precup, D.2
Singh, S.3
-
30
-
-
0031455549
-
Neurocontroller using dynamic state feedback for compensatory control
-
Cs. Szepesvári, Sz. Cimmer, and A. Lorincz. Neurocontroller using dynamic state feedback for compensatory control. Neural Networks, 10(9):1691-1708, 1997.
-
(1997)
Neural Networks
, vol.10
, Issue.9
, pp. 1691-1708
-
-
Szepesvári, Cs.1
Cimmer, Sz.2
Lorincz, A.3
-
31
-
-
0031455549
-
Dynamic state feedback neurocontroller for compensatory control
-
Cs. Szepesvári, Sz. Cimmer, and A. Lorincz. Dynamic state feedback neurocontroller for compensatory control. Neural Networks, 10:1691-1708, 1997.
-
(1997)
Neural Networks
, vol.10
, pp. 1691-1708
-
-
Szepesvári, Cs.1
Cimmer, Sz.2
Lorincz, A.3
-
33
-
-
0008572817
-
Approximate inverse-dynamics based robust control using static and dynamic feedback
-
J. Kalkkuhl, K. J. Hunt, R. Zbikowski, and A. Dzielinski, editors. World Scientific, Singapore
-
Cs. Szepesvári and A. Lorincz. Approximate inverse-dynamics based robust control using static and dynamic feedback. In J. Kalkkuhl, K. J. Hunt, R. Zbikowski, and A. Dzielinski, editors, Applications of Neural Adaptive Control Theory, volume 2, pages 151-179. World Scientific, Singapore, 1997.
-
(1997)
Applications of Neural Adaptive Control Theory
, vol.2
, pp. 151-179
-
-
Szepesvári, Cs.1
Lorincz, A.2
-
34
-
-
0031678862
-
An integrated architecture for motion-control and path-planning
-
Cs. Szepesvári and A. Lorincz. An integrated architecture for motion-control and path-planning. Journal of Robotic Systems, 15:1-15, 1998.
-
(1998)
Journal of Robotic Systems
, vol.15
, pp. 1-15
-
-
Szepesvári, Cs.1
Lorincz, A.2
-
35
-
-
0041413263
-
Event-learning with a non-Markovian controller
-
F. van Harmelen, editor, Lyon, IOS Press, Amsterdam
-
I. Szita, B. Takács, and A. Lorincz. Event-learning with a non-Markovian controller. In F. van Harmelen, editor, 15th European Conference on Artificial Intelligence, Lyon, pages 365-369. IOS Press, Amsterdam, 2002.
-
(2002)
15th European Conference on Artificial Intelligence
, pp. 365-369
-
-
Szita, I.1
Takács, B.2
Lorincz, A.3
-
37
-
-
0028497630
-
Asynchronous stochastic approximation and Q-learning
-
September
-
J. N. Tsitsiklis. Asynchronous stochastic approximation and Q-learning. Machine Learning, 16(3):185-202, September 1994.
-
(1994)
Machine Learning
, vol.16
, Issue.3
, pp. 185-202
-
-
Tsitsiklis, J.N.1