3. Boutilier, C., Dearden, R., & Goldszmidt, M. (1995). Exploiting structure in policy construction. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pp. 1104-1111.
4. Currie, K., & Tate, A. (1991). O-plan: The open planning architecture. Artificial Intelligence, 52(1), 49-86.
5. Dayan, P., & Hinton, G. (1993). Feudal reinforcement learning. In Advances in Neural Information Processing Systems, Vol. 5, pp. 271-278. Morgan Kaufmann, San Francisco, CA.
6. Dean, T., & Lin, S.-H. (1995). Decomposition techniques for planning in stochastic domains. Tech. rep. CS-95-10, Department of Computer Science, Brown University, Providence, Rhode Island.
8. Fikes, R. E., Hart, P. E., & Nilsson, N. J. (1972). Learning and executing generalized robot plans. Artificial Intelligence, 3, 251-288.
9. Forgy, C. L. (1982). Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial Intelligence, 19(1), 17-37.
10. Hauskrecht, M., Meuleau, N., Kaelbling, L. P., Dean, T., & Boutilier, C. (1998). Hierarchical solution of Markov decision processes using macro-actions. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98), pp. 220-229, San Francisco, CA. Morgan Kaufmann Publishers.
12. Jaakkola, T., Jordan, M. I., & Singh, S. P. (1994). On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6), 1185-1201.
14. Kalmár, Z., Szepesvári, C., & Lörincz, A. (1998). Module based reinforcement learning for a real robot. Machine Learning, 31, 55-85.
16. Korf, R. E. (1985). Macro-operators: A weak method for learning. Artificial Intelligence, 26(1), 35-77.
17. Lin, L.-J. (1993). Reinforcement learning for robots using neural networks. Ph.D. thesis, Carnegie Mellon University, Department of Computer Science, Pittsburgh, PA.
18. Moore, A. W., Baird, L., & Kaelbling, L. P. (1999). Multi-value-functions: Efficient automatic action hierarchies for multiple goal MDPs. In Proceedings of the International Joint Conference on Artificial Intelligence, pp. 1316-1323, San Francisco. Morgan Kaufmann.
21. Parr, R., & Russell, S. (1998). Reinforcement learning with hierarchies of machines. In Advances in Neural Information Processing Systems, Vol. 10, pp. 1043-1049, Cambridge, MA. MIT Press.
23. Rummery, G. A., & Niranjan, M. (1994). Online Q-learning using connectionist systems. Tech. rep. CUED/FINFENG/TR 166, Cambridge University Engineering Department, Cambridge, England.
24. Sacerdoti, E. D. (1974). Planning in a hierarchy of abstraction spaces. Artificial Intelligence, 5(2), 115-135.
25. Singh, S., Jaakkola, T., Littman, M. L., & Szepesvári, C. (1998). Convergence results for single-step on-policy reinforcement-learning algorithms. Tech. rep., University of Colorado, Department of Computer Science, Boulder, CO. To appear in Machine Learning.
26. Singh, S. P. (1992). Transfer of learning by composing solutions of elemental sequential tasks. Machine Learning, 8, 323-339.
27. Sutton, R. S., Singh, S., Precup, D., & Ravindran, B. (1999). Improved switching among temporally abstract actions. In Advances in Neural Information Processing Systems, Vol. 11, pp. 1066-1072. MIT Press.
29. Sutton, R. S., Precup, D., & Singh, S. (1998). Between MDPs and Semi-MDPs: Learning, planning, and representing knowledge at multiple temporal scales. Tech. rep., University of Massachusetts, Department of Computer and Information Sciences, Amherst, MA. To appear in Artificial Intelligence.
31. Tambe, M., & Rosenbloom, P. S. (1994). Investigating production system representations for non-combinatorial match. Artificial Intelligence, 68(1), 155-199.
32. Watkins, C. J. C. H. (1989). Learning from Delayed Rewards. Ph.D. thesis, King's College, Cambridge. (To be reprinted by MIT Press.)