Volume , Issue , 2007, Pages 10-17

Concurrent probabilistic temporal planning with policy-gradients

Author keywords

[No Author keywords available]

Indexed keywords

DIRECT POLICY SEARCH; FUNCTION APPROXIMATION; GRADIENT ASCENT; LOW MEMORY; MEMORY USE; PARAMETERIZED; POLICY GRADIENT; PROBABILISTIC; TEMPORAL PLANNING; UNCERTAINTY;

EID: 58349090838     PISSN: None     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 7

References (21)
  • 1
  • 2
    • Aberdeen, D. 2006. Policy-gradient methods for planning. In Proc. NIPS'05.
  • 3
    • Barto, A.; Bradtke, S.; and Singh, S. 1995. Learning to act using real-time dynamic programming. Artificial Intelligence 72.
  • 4
    • Baxter, J.; Bartlett, P.; and Weaver, L. 2001. Experiments with infinite-horizon, policy-gradient estimation. JAIR 15.
  • 5
    • Bonet, B., and Givan, R. 2006. Proc. of the 5th int. planning competition (IPC-5). See http://www.ldc.usb.ve/~bonet/ipc5 for all results and proceedings.
  • 6
    • Buffet, O., and Aberdeen, D. 2007. FF+FPG: Guiding a policy-gradient planner. In Proc. ICAPS.
  • 7
    • Hoffmann, J., and Nebel, B. 2001. The FF planning system: Fast plan generation through heuristic search. JAIR 14:253-302.
  • 8
    • Little, I., and Thiébaux, S. Probabilistic planning vs replanning. Submitted for publication.
  • 9
    • Little, I., and Thiébaux, S. 2006. Concurrent probabilistic planning in the graphplan framework. In Proc. ICAPS.
  • 11
    • Mausam, and Weld, D. S. 2005. Concurrent probabilistic temporal planning. In Proc. ICAPS.
  • 12
    • Mausam, and Weld, D. S. 2006. Probabilistic temporal planning with uncertain durations. In Proc. AAAI'06.
  • 14
    • Sanner, S., and Boutilier, C. 2006. Practical linear value-approximation techniques for first-order MDPs. In Proc. UAI.
  • 15
    • Sutton, R. S.; McAllester, D.; Singh, S.; and Mansour, Y. 2000. Policy gradient methods for reinforcement learning with function approximation. In Proc. NIPS.
  • 16
    • Tao, N.; Baxter, J.; and Weaver, L. 2001. A multi-agent, policy-gradient approach to network routing. In Proc. ICML.
  • 17
    • Williams, R. J. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8:229-256.
  • 18
    • Yoon, S.; Fern, A.; and Givan, R. 2007. FF-replan: A baseline for probabilistic planning. In Proc. ICAPS'07.
  • 19
    • Younes, H. L. S., and Littman, M. L. 2004. PPDDL1.0: An extension to PDDL for expressing planning domains with probabilistic effects. Technical Report CMU-CS-04-167.
  • 20
    • Younes, H. L. S., and Simmons, R. G. 2004. Policy generation for continuous-time stochastic domains with concurrency. In Proc. ICAPS.
  • 21
    • Younes, H. L. S. 2003. Extending PDDL to model stochastic decision processes. In Proc. ICAPS Workshop on PDDL.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.