-
1
-
-
0013535965
-
Infinite-horizon policy-gradient estimation
-
Baxter J., Bartlett P., «Infinite-Horizon Policy-Gradient Estimation », Journal of Artificial Intelligence Research, vol. 15, p. 319-350, 2001a.
-
(2001)
Journal of Artificial Intelligence Research
, vol.15
, pp. 319-350
-
-
Baxter, J.1
Bartlett, P.2
-
2
-
-
0013495368
-
Experiments with infinite-horizon, policy-gradient estimation
-
Baxter J., Bartlett P., Weaver L., «Experiments with Infinite-Horizon, Policy-Gradient Estimation », Journal of Artificial Intelligence Research, vol. 15, p. 351-381, 2001b.
-
(2001)
Journal of Artificial Intelligence Research
, vol.15
, pp. 351-381
-
-
Baxter, J.1
Bartlett, P.2
Weaver, L.3
-
4
-
-
84880891360
-
Symbolic dynamic programming for first-order MDPs
-
Boutilier C., Reiter R., Price B., «Symbolic Dynamic Programming for First-order MDPs », Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI'01), p. 690-697, 2001.
-
(2001)
Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI'01)
, pp. 690-697
-
-
Boutilier, C.1
Reiter, R.2
Price, B.3
-
8
-
-
0003989210
-
-
PhD thesis, Brown University, Department of Computer Science, Providence, RI
-
Cassandra A. R., Exact and Approximate Algorithms for Partially Observable Markov Decision Processes, PhD thesis, Brown University, Department of Computer Science, Providence, RI, 1998.
-
(1998)
Exact and Approximate Algorithms for Partially Observable Markov Decision Processes
-
-
Cassandra, A.R.1
-
10
-
-
0002278788
-
Hierarchical reinforcement learning with the MAXQ value function decomposition
-
Dietterich T., «Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition », Journal of Artificial Intelligence Research, vol. 13, p. 227-303, 2000.
-
(2000)
Journal of Artificial Intelligence Research
, vol.13
, pp. 227-303
-
-
Dietterich, T.1
-
13
-
-
0035312760
-
Relational reinforcement learning
-
Dzeroski S., De Raedt L., Driessens K., «Relational reinforcement learning », Machine Learning, vol. 43, p. 7-52, 2001.
-
(2001)
Machine Learning
, vol.43
, pp. 7-52
-
-
Dzeroski, S.1
De Raedt, L.2
Driessens, K.3
-
14
-
-
84972539429
-
Combining probability distributions: A critique and an annotated bibliography
-
February
-
Genest C., Zidek J., «Combining Probability Distributions : A Critique and an Annotated Bibliography », Statistical Science, vol. 1, no 1, p. 114-135, February, 1986.
-
(1986)
Statistical Science
, vol.1
, Issue.1
, pp. 114-135
-
-
Genest, C.1
Zidek, J.2
-
16
-
-
0006419533
-
Hierarchical solution of Markov decision processes using macro-actions
-
Hauskrecht M., Meuleau N., Kaelbling L., Dean T., Boutilier C., «Hierarchical Solution of Markov Decision Processes Using Macro-Actions », Proceedings of the Fourteenth International Conference on Uncertainty in Artificial Intelligence (UAI'98), p. 220-229, 1998.
-
(1998)
Proceedings of the Fourteenth International Conference on Uncertainty in Artificial Intelligence (UAI'98)
, pp. 220-229
-
-
Hauskrecht, M.1
Meuleau, N.2
Kaelbling, L.3
Dean, T.4
Boutilier, C.5
-
19
-
-
0000439891
-
On the convergence of stochastic iterative dynamic programming algorithms
-
Jaakkola T., Jordan M., Singh S., «On the Convergence of Stochastic Iterative Dynamic Programming Algorithms », Neural Computation, vol. 6, no 6, p. 1186-1201, 1994.
-
(1994)
Neural Computation
, vol.6
, Issue.6
, pp. 1186-1201
-
-
Jaakkola, T.1
Jordan, M.2
Singh, S.3
-
20
-
-
0032329151
-
A roadmap of agent research and development
-
Jennings N., Sycara K., Wooldridge M., «A Roadmap of Agent Research and Development », Autonomous Agents and Multi-Agent Systems, vol. 1, p. 7-38, 1998.
-
(1998)
Autonomous Agents and Multi-agent Systems
, vol.1
, pp. 7-38
-
-
Jennings, N.1
Sycara, K.2
Wooldridge, M.3
-
21
-
-
33645861652
-
Tileworld users' manual
-
August
-
Joslin D., Nunes A., Pollack M. E., TileWorld Users' Manual, Technical Report no TR 93-12, August, 1993.
-
(1993)
Technical Report No TR 93-12
-
-
Joslin, D.1
Nunes, A.2
Pollack, M.E.3
-
23
-
-
0029679044
-
Reinforcement learning: A survey
-
Kaelbling L., Littman M., Moore A., «Reinforcement Learning : A Survey », Journal of Artificial Intelligence Research, vol. 4, p. 237-285, 1996.
-
(1996)
Journal of Artificial Intelligence Research
, vol.4
, pp. 237-285
-
-
Kaelbling, L.1
Littman, M.2
Moore, A.3
-
24
-
-
0036832951
-
A sparse sampling algorithm for near-optimal planning in large Markov decision processes
-
Kearns M., Mansour Y., Ng A., «A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes », Machine Learning, vol. 49, p. 193-208, 2002.
-
(2002)
Machine Learning
, vol.49
, pp. 193-208
-
-
Kearns, M.1
Mansour, Y.2
Ng, A.3
-
25
-
-
0000123778
-
Self-improving reactive agent based on reinforcement learning, planning and teaching
-
Lin L.-J., «Self-improving reactive agent based on reinforcement learning, planning and teaching », Machine Learning, vol. 8, p. 293-321, 1992.
-
(1992)
Machine Learning
, vol.8
, pp. 293-321
-
-
Lin, L.-J.1
-
29
-
-
0026880130
-
Automatic programming of behavior-based robots using reinforcement learning
-
June
-
Mahadevan S., Connell J., «Automatic Programming of Behavior-based Robots using Reinforcement Learning », Artificial Intelligence, vol. 55, no 2-3, p. 311-365, June, 1992.
-
(1992)
Artificial Intelligence
, vol.55
, Issue.2-3
, pp. 311-365
-
-
Mahadevan, S.1
Connell, J.2
-
35
-
-
0003998452
-
-
John Wiley and Sons, Inc., New York, USA
-
Puterman M. L., Markov Decision Processes : Discrete Stochastic Dynamic Programming, John Wiley and Sons, Inc., New York, USA, 1994.
-
(1994)
Markov Decision Processes: Discrete Stochastic Dynamic Programming
-
-
Puterman, M.L.1
-
37
-
-
4544279348
-
Multi-agent reinforcement learning: A critical survey
-
Stanford
-
Shoham Y., Powers R., Grenager T., Multi-agent Reinforcement Learning : A Critical Survey, Technical report, Stanford, 2003.
-
(2003)
Technical Report
-
-
Shoham, Y.1
Powers, R.2
Grenager, T.3
-
39
-
-
0008321896
-
Reinforcement learning: An introduction
-
MIT Press, Cambridge, MA
-
Sutton R., Barto A. G., Reinforcement Learning : An Introduction, Bradford Book, MIT Press, Cambridge, MA, 1998.
-
(1998)
Bradford Book
-
-
Sutton, R.1
Barto, A.G.2
-
41
-
-
0032276461
-
Incremental robot shaping
-
Urzelai J., Floreano D., Dorigo M., Colombetti M., «Incremental Robot Shaping », Connection Science Journal, 1998.
-
(1998)
Connection Science Journal
-
-
Urzelai, J.1
Floreano, D.2
Dorigo, M.3
Colombetti, M.4