-
2
-
-
85163534824
-
Policy-gradient methods for planning
-
Aberdeen, D. 2006. Policy-gradient methods for planning. In Proc. NIPS'05.
-
(2006)
Proc. NIPS'05
-
-
Aberdeen, D.1
-
3
-
-
85163520533
-
-
Barto, A.; Bradtke, S.; and Singh, S. 1995. Learning to act using real-time dynamic programming. Artificial Intelligence 72.
-
-
-
-
4
-
-
85163440030
-
-
Baxter, J.; Bartlett, P.; and Weaver, L. 2001. Experiments with infinite-horizon, policy-gradient estimation. JAIR 15.
-
-
-
-
5
-
-
85163482395
-
-
Bonet, B., and Givan, R. 2006. Proc. of the 5th int. planning competition (IPC-5). See http://www.ldc.usb.ve/~bonet/ipc5 for all results and proceedings.
-
-
-
-
6
-
-
57749179024
-
FF+FPG: Guiding a policy-gradient planner
-
Buffet, O., and Aberdeen, D. 2007. FF+FPG: Guiding a policy-gradient planner. In Proc. ICAPS.
-
(2007)
Proc. ICAPS
-
-
Buffet, O.1
Aberdeen, D.2
-
7
-
-
0036377352
-
The FF planning system: Fast plan generation through heuristic search
-
Hoffmann, J., and Nebel, B. 2001. The FF planning system: Fast plan generation through heuristic search. JAIR 14:253-302.
-
(2001)
JAIR
, vol.14
, pp. 253-302
-
-
Hoffmann, J.1
Nebel, B.2
-
8
-
-
85163418620
-
Probabilistic planning vs replanning
-
Submitted for Publication
-
Little, I., and Thiébaux, S. Probabilistic planning vs replanning. Submitted for Publication.
-
-
-
Little, I.1
Thiébaux, S.2
-
9
-
-
33746077700
-
Concurrent probabilistic planning in the graphplan framework
-
Little, I., and Thiébaux, S. 2006. Concurrent probabilistic planning in the graphplan framework. In Proc. ICAPS.
-
(2006)
Proc. ICAPS
-
-
Little, I.1
Thiébaux, S.2
-
11
-
-
33746077967
-
Concurrent probabilistic temporal planning
-
Mausam, and Weld, D. S. 2005. Concurrent probabilistic temporal planning. In Proc. ICAPS.
-
(2005)
Proc. ICAPS
-
-
Mausam1
Weld, D.S.2
-
12
-
-
44449135985
-
Probabilistic temporal planning with uncertain durations
-
Mausam, and Weld, D. S. 2006. Probabilistic temporal planning with uncertain durations. In Proc. AAAI'06.
-
(2006)
Proc. AAAI'06
-
-
Mausam1
Weld, D.S.2
-
14
-
-
77957878103
-
Practical linear value-approximation techniques for first-order MDPs
-
Sanner, S., and Boutilier, C. 2006. Practical linear value-approximation techniques for first-order MDPs. In Proc. UAI.
-
(2006)
Proc. UAI
-
-
Sanner, S.1
Boutilier, C.2
-
15
-
-
84898939480
-
Policy gradient methods for reinforcement learning with function approximation
-
Sutton, R. S.; McAllester, D.; Singh, S.; and Mansour, Y. 2000. Policy gradient methods for reinforcement learning with function approximation. Proc. NIPS.
-
(2000)
Proc. NIPS
-
-
Sutton, R.S.1
McAllester, D.2
Singh, S.3
Mansour, Y.4
-
16
-
-
13444294406
-
A multi-agent, policy-gradient approach to network routing
-
Tao, N.; Baxter, J.; and Weaver, L. 2001. A multi-agent, policy-gradient approach to network routing. In Proc. ICML.
-
(2001)
Proc. ICML
-
-
Tao, N.1
Baxter, J.2
Weaver, L.3
-
17
-
-
0000337576
-
Simple statistical gradient-following algorithms for connectionist reinforcement learning
-
Williams, R. J. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8:229-256.
-
(1992)
Machine Learning
, vol.8
, pp. 229-256
-
-
Williams, R.J.1
-
18
-
-
58349118462
-
FF-replan, a baseline for probabilistic planning
-
Yoon, S.; Fern, A.; and Givan, R. 2007. FF-replan, a baseline for probabilistic planning. In Proc. ICAPS'07.
-
(2007)
Proc. ICAPS'07
-
-
Yoon, S.1
Fern, A.2
Givan, R.3
-
19
-
-
29344454922
-
PPDDL1.0: An extension to PDDL for expressing planning domains with probabilistic effects
-
Technical Report CMU-CS-04-167
-
Younes, H. L. S., and Littman, M. L. 2004. PPDDL1.0: An extension to PDDL for expressing planning domains with probabilistic effects. Technical Report CMU-CS-04-167.
-
(2004)
-
-
Younes, H.L.S.1
Littman, M.L.2
-
20
-
-
13444256700
-
Policy generation for continuous-time stochastic domains with concurrency
-
Younes, H. L. S., and Simmons, R. G. 2004. Policy generation for continuous-time stochastic domains with concurrency. In Proc. ICAPS.
-
(2004)
Proc. ICAPS
-
-
Younes, H.L.S.1
Simmons, R.G.2
-
21
-
-
9444281800
-
Extending PDDL to model stochastic decision processes
-
Younes, H. L. S. 2003. Extending PDDL to model stochastic decision processes. In Proc. ICAPS Workshop on PDDL.
-
(2003)
Proc. ICAPS Workshop on PDDL
-
-
Younes, H.L.S.1