Volume 173, Issue 5-6, 2009, Pages 748-788

Practical solution techniques for first-order MDPs

Author keywords

First order logic; MDPs; Planning

Indexed keywords

FORMAL LOGIC; LINEARIZATION; SPECIFICATIONS

EID: 60549103706    PISSN: 0004-3702    EISSN: None    Source Type: Journal
DOI: 10.1016/j.artint.2008.11.003    Document Type: Article
Times cited: 97

References (91)
  • 3 Bacchus, F., and Kabanza, F. Using temporal logics to express search control knowledge for planning. Artificial Intelligence 116(1-2) (2000) 123-191.
  • 5 Barto, A.G., Bradtke, S.J., and Singh, S.P. Learning to act using real-time dynamic programming. Tech. Rep. UM-CS-1993-002, U. Mass. Amherst (1993).
  • 6 Bellman, R.E. Dynamic Programming (1957), Princeton University Press, Princeton, NJ.
  • 9 Blum, A.L., and Furst, M.L. Fast planning through graph analysis. In: IJCAI 95, Montreal (1995) 1636-1642.
  • 10 Bonet, B., and Geffner, H. mGPT: A probabilistic planner based on heuristic search. In: Online Proceedings for the Probabilistic Planning Track of IPC-04: http://www.cs.rutgers.edu/~mlittman/topics/ipc04-pt/proceedings/ (2004).
  • 11 Boutilier, C., Brafman, R.I., and Geib, C. Prioritized goal decomposition of Markov decision processes: Toward a synthesis of classical and decision theoretic planning. In: International Joint Conference on Artificial Intelligence (IJCAI-97), Nagoya (1997) 1156-1162.
  • 15 Boutilier, C., Reiter, R., Soutchanski, M., and Thrun, S. Decision-theoretic, high-level agent programming in the situation calculus. In: AAAI-00, Austin, TX (2000) 355-362.
  • 18 Buntine, W. Generalized subsumption and its application to induction and redundancy. Artificial Intelligence 36 (1988) 375-399.
  • 19 de Farias, D., and Van Roy, B. The linear programming approach to approximate dynamic programming. Operations Research 51(6) (2003) 850-865.
  • 22 Dearden, R., and Boutilier, C. Abstraction and approximate decision-theoretic planning. Artificial Intelligence 89(1-2) (1997) 219-283.
  • 23 Dechter, R. Bucket elimination: A unifying framework for reasoning. Artificial Intelligence 113 (1999) 41-85.
  • 27 Fern, A., Yoon, S., and Givan, R. Learning domain-specific control knowledge from random walks. In: International Conference on Planning and Scheduling (ICAPS-04) (June 2004) 191-199.
  • 29 Fikes, R.E., and Nilsson, N.J. STRIPS: A new approach to the application of theorem proving to problem solving. AI Journal 2 (1971) 189-208.
  • 31 Gartner, T., Driessens, K., and Ramon, J. Graph kernels and Gaussian processes for relational reinforcement learning. Machine Learning Journal (MLJ) 64 (2006) 91-119.
  • 33 Gretton, C., and Thiebaux, S. Exploiting first-order regression in inductive policy selection. In: Uncertainty in Artificial Intelligence (UAI-04), Banff, Canada (2004) 217-225.
  • 37 Hauskrecht, M., and Kveton, B. Linear program approximations for factored continuous-state Markov decision processes. In: Advances in Neural Information Processing Systems 16 (2004) 895-902.
  • 44 Khardon, R. Learning action strategies for planning domains. Artificial Intelligence 113(1-2) (1999) 125-148.
  • 45 Khardon, R. Learning to take actions. Machine Learning 35(1) (1999) 57-90.
  • 51 Mahadevan, S. Samuel meets Amarel: Automating value function approximation using global state space analysis. In: National Conference on Artificial Intelligence (AAAI-05), Pittsburgh (2005) 1000-1005.
  • 52 McCarthy, J. Situations, actions and causal laws. Tech. rep., Stanford University (1963); reprinted in: Minsky, M. (Ed.), Semantic Information Processing (1968), MIT Press, Cambridge, MA, 410-417.
  • 55 Ng, A.Y., Harada, D., and Russell, S. Policy invariance under reward transformations: Theory and application to reward shaping. In: Proc. 16th International Conf. on Machine Learning (1999), Morgan Kaufmann, San Francisco, CA, 278-287.
  • 56 Parr, R., and Russell, S. Reinforcement learning with hierarchies of machines. In: Jordan, M.I., Kearns, M.J., and Solla, S.A. (Eds.), Advances in Neural Information Processing Systems 10 (1998), MIT Press, Cambridge, MA, 1043-1049.
  • 59 Poole, D. The independent choice logic for modelling multiple agents under uncertainty. Artificial Intelligence 94(1-2) (1997) 7-56.
  • 63 Reiter, R. The frame problem in the situation calculus: A simple solution (sometimes) and a completeness result for goal regression. In: Lifschitz, V. (Ed.), Artificial Intelligence and Mathematical Theory of Computation (Papers in Honor of John McCarthy) (1991), Academic Press, San Diego, CA, 359-380.
  • 65 Riazanov, A., and Voronkov, A. The design and implementation of Vampire. AI Communications 15(2) (2002) 91-110.
  • 67 Sanner, S. First-order decision-theoretic planning in structured relational environments. Ph.D. thesis, University of Toronto, Toronto, ON, Canada (March 2008).
  • 78 Tsitsiklis, J.N., and Van Roy, B. Feature-based methods for large scale dynamic programming. Machine Learning 22 (1996) 59-94.
  • 84 Yoon, S., Fern, A., and Givan, R. Inductive policy selection for first-order Markov decision processes. In: Uncertainty in Artificial Intelligence (UAI-02), Edmonton (2002) 569-576.
  • 85 Yoon, S., Fern, A., and Givan, R. Learning reactive policies for probabilistic planning domains. In: Online Proceedings for the Probabilistic Planning Track of IPC-04: http://www.cs.rutgers.edu/mlittman/topics/ipc04-pt/proceedings/ (2004).
  • 86 Yoon, S., Fern, A., and Givan, R. Learning measures of progress for planning domains. In: 20th National Conference on Artificial Intelligence (July 2005) 1217-1222.
  • 87 Yoon, S., Fern, A., and Givan, R. Approximate policy iteration with a policy language bias: Learning to solve relational Markov decision processes. Journal of Artificial Intelligence Research (JAIR) 25 (2006) 85-118.


* This information was extracted by KISTI from Elsevier's SCOPUS database.