SCOPUS Information Search Platform

ICAPS 2007, 17th International Conference on Automated Planning and Scheduling

Volume , Issue , 2007, Pages 10-17

Concurrent probabilistic temporal planning with policy-gradients

(2) Aberdeen, Douglas (a); Buffet, Olivier (b)

(a) Australian National University (Australia)

(b) Université de Toulouse (France)

Author keywords

[No Author keywords available]

Indexed keywords

DIRECT POLICY SEARCH; FUNCTIONS APPROXIMATIONS; GRADIENT ASCENT; LOW MEMORY; MEMORY USE; PARAMETERIZED; POLICY GRADIENT; PROBABILISTICS; TEMPORAL PLANNING; UNCERTAINTY

EID: 58349090838; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited: (7)

References (21)

1
- 13444288352
- Decision-theoretic military operations planning
- Aberdeen, D.; Thiébaux, S.; and Zhang, L. 2004. Decision-theoretic military operations planning. In Proc. ICAPS.
- (2004) Proc. ICAPS
- Aberdeen, D.¹ Thiébaux, S.² Zhang, L.³

2
- 85163534824
- Policy-gradient methods for planning
- Aberdeen, D. 2006. Policy-gradient methods for planning. In Proc. NIPS'05.
- (2006) Proc. NIPS'05
- Aberdeen, D.¹

3
- 85163520533
- Barto, A.; Bradtke, S.; and Singh, S. 1995. Learning to act using real-time dynamic programming. Artificial Intelligence 72.

4
- 85163440030
- Baxter, J.; Bartlett, P.; and Weaver, L. 2001. Experiments with infinite-horizon, policy-gradient estimation. JAIR 15.

5
- 85163482395
- Bonet, B., and Givan, R. 2006. Proc. of the 5th int. planning competition (IPC-5). See http://www.ldc.usb.ve/~bonet/ipc5 for all results and proceedings.

6
- 57749179024
- FF+FPG: Guiding a policy-gradient planner
- Buffet, O., and Aberdeen, D. 2007. FF+FPG: Guiding a policy-gradient planner. In Proc. ICAPS.
- (2007) Proc. ICAPS
- Buffet, O.¹ Aberdeen, D.²

7
- 0036377352
- The FF planning system: Fast plan generation through heuristic search
- Hoffmann, J., and Nebel, B. 2001. The FF planning system: Fast plan generation through heuristic search. JAIR 14:253-302.
- (2001) JAIR, vol. 14, pp. 253-302
- Hoffmann, J.¹ Nebel, B.²

8
- 85163418620
- Probabilistic planning vs replanning
- Submitted for Publication
- Little, I., and Thiébaux, S. Probabilistic planning vs replanning. Submitted for Publication.
- Little, I.¹ Thiébaux, S.²

9
- 33746077700
- Concurrent probabilistic planning in the Graphplan framework
- Little, I., and Thiébaux, S. 2006. Concurrent probabilistic planning in the Graphplan framework. In Proc. ICAPS.
- (2006) Proc. ICAPS
- Little, I.¹ Thiébaux, S.²

10
- 33746073960
- Prottle: A probabilistic temporal planner
- Little, I.; Aberdeen, D.; and Thiébaux, S. 2005. Prottle: A probabilistic temporal planner. In Proc. AAAI.
- (2005) Proc. AAAI
- Little, I.¹ Aberdeen, D.² Thiébaux, S.³

11
- 33746077967
- Concurrent probabilistic temporal planning
- Mausam, and Weld, D. S. 2005. Concurrent probabilistic temporal planning. In Proc. ICAPS.
- (2005) Proc. ICAPS
- Mausam¹ Weld, D.S.²

12
- 44449135985
- Probabilistic temporal planning with uncertain durations
- Mausam, and Weld, D. S. 2006. Probabilistic temporal planning with uncertain durations. In Proc. AAAI'06.
- (2006) Proc. AAAI'06
- Mausam¹ Weld, D.S.²

13
- 0012646255
- Learning to cooperate via policy search
- Peshkin, L.; Kim, K.-E.; Meuleau, N.; and Kaelbling, L. P. 2000. Learning to cooperate via policy search. In Proc. UAI.
- (2000) Proc. UAI
- Peshkin, L.¹ Kim, K.-E.² Meuleau, N.³ Kaelbling, L.P.⁴

14
- 77957878103
- Practical linear value-approximation techniques for first-order MDPs
- Sanner, S., and Boutilier, C. 2006. Practical linear value-approximation techniques for first-order MDPs. In Proc. UAI.
- (2006) Proc. UAI
- Sanner, S.¹ Boutilier, C.²

15
- 84898939480
- Policy gradient methods for reinforcement learning with function approximation
- Sutton, R. S.; McAllester, D.; Singh, S.; and Mansour, Y. 2000. Policy gradient methods for reinforcement learning with function approximation. Proc. NIPS.
- (2000) Proc. NIPS
- Sutton, R.S.¹ McAllester, D.² Singh, S.³ Mansour, Y.⁴

16
- 13444294406
- A multi-agent, policy-gradient approach to network routing
- Tao, N.; Baxter, J.; and Weaver, L. 2001. A multi-agent, policy-gradient approach to network routing. In Proc. ICML.
- (2001) Proc. ICML
- Tao, N.¹ Baxter, J.² Weaver, L.³

17
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Williams, R. J. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8:229-256.
- (1992) Machine Learning, vol. 8, pp. 229-256
- Williams, R.J.¹

18
- 58349118462
- FF-replan, a baseline for probabilistic planning
- Yoon, S.; Fern, A.; and Givan, R. 2007. FF-replan, a baseline for probabilistic planning. In Proc. ICAPS'07.
- (2007) Proc. ICAPS'07
- Yoon, S.¹ Fern, A.² Givan, R.³

19
- 29344454922
- PPDDL1.0: An extension to PDDL for expressing planning domains with probabilistic effects
- Technical Report CMU-CS-04-167
- Younes, H. L. S., and Littman, M. L. 2004. PPDDL1.0: An extension to PDDL for expressing planning domains with probabilistic effects. Technical Report CMU-CS-04-167.
- (2004)
- Younes, H.L.S.¹ Littman, M.L.²

20
- 13444256700
- Policy generation for continuous-time stochastic domains with concurrency
- Younes, H. L. S., and Simmons, R. G. 2004. Policy generation for continuous-time stochastic domains with concurrency. In Proc. ICAPS.
- (2004) Proc. ICAPS
- Younes, H.L.S.¹ Simmons, R.G.²

21
- 9444281800
- Extending PDDL to model stochastic decision processes
- Younes, H. L. S. 2003. Extending PDDL to model stochastic decision processes. In Proc. ICAPS Workshop on PDDL.
- (2003) Proc. ICAPS Workshop on PDDL
- Younes, H.L.S.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.