Volume 89, Issue 1-2, 1997, Pages 219-283

Abstraction and approximate decision-theoretic planning

Author keywords

Abstraction; Approximation; Decision theory; Execution; Heuristics; Markov decision processes; Planning; Search

Indexed keywords

ALGORITHMS; APPROXIMATION THEORY; DECISION THEORY; HEURISTIC METHODS; MARKOV PROCESSES; OPTIMIZATION;

EID: 0030697013     PISSN: 00043702     EISSN: None     Source Type: Journal    
DOI: 10.1016/s0004-3702(96)00023-9     Document Type: Article
Times cited: 94

References (56)
  • 1
    • 0020810556
    • The *-minimax search procedure for trees containing chance nodes
    • B.W. Ballard, The *-minimax search procedure for trees containing chance nodes, Artif. Intell. 21 (1983) 327-350.
    • (1983) Artif. Intell. , vol.21 , pp. 327-350
    • Ballard, B.W.1
  • 2
    • 0029210635
    • Learning to act using real-time dynamic programming
    • A.G. Barto, S.J. Bradtke and S.P. Singh, Learning to act using real-time dynamic programming, Artif. Intell. 72 (1995) 81-138.
    • (1995) Artif. Intell. , vol.72 , pp. 81-138
    • Barto, A.G.1    Bradtke, S.J.2    Singh, S.P.3
  • 3
    • 0003787146
    • Princeton University Press, Princeton, NJ
    • R.E. Bellman, Dynamic Programming (Princeton University Press, Princeton, NJ, 1957).
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 4
    • 0024680419
    • Adaptive aggregation for infinite horizon dynamic programming
    • D.P. Bertsekas and D.A. Castanon, Adaptive aggregation for infinite horizon dynamic programming, IEEE Trans. Automat. Control 34 (1989) 589-598.
    • (1989) IEEE Trans. Automat. Control , vol.34 , pp. 589-598
    • Bertsekas, D.P.1    Castanon, D.A.2
  • 6
    • 0001854509
    • Solving time-dependent planning problems
    • Detroit, MI
    • M. Boddy and T.L. Dean, Solving time-dependent planning problems, in: Proceedings IJCAI-89, Detroit, MI (1989) 979-984.
    • (1989) Proceedings IJCAI-89 , pp. 979-984
    • Boddy, M.1    Dean, T.L.2
  • 7
    • 0028447220
    • Deliberation scheduling for problem solving in time-constrained environments
    • M. Boddy and T.L. Dean, Deliberation scheduling for problem solving in time-constrained environments, Artif. Intell. 67 (1994) 245-285.
    • (1994) Artif. Intell. , vol.67 , pp. 245-285
    • Boddy, M.1    Dean, T.L.2
  • 10
    • 85166207010
    • Exploiting structure in policy construction
    • Montreal, Que.
    • C. Boutilier, R. Dearden and M. Goldszmidt, Exploiting structure in policy construction, in: Proceedings IJCAI-95, Montreal, Que. (1995) 1104-1111.
    • (1995) Proceedings IJCAI-95 , pp. 1104-1111
    • Boutilier, C.1    Dearden, R.2    Goldszmidt, M.3
  • 11
    • 0030349220
    • Computing optimal policies for partially observable decision processes using compact representations
    • Portland, OR
    • C. Boutilier and D. Poole, Computing optimal policies for partially observable decision processes using compact representations, in: Proceedings AAAI-96, Portland, OR (1996) 1168-1175.
    • (1996) Proceedings AAAI-96 , pp. 1168-1175
    • Boutilier, C.1    Poole, D.2
  • 12
    • 85168106990
    • Process-oriented planning and average-reward optimality
    • Montreal, Que.
    • C. Boutilier and M.L. Puterman, Process-oriented planning and average-reward optimality, in: Proceedings IJCAI-95, Montreal, Que. (1995) 1096-1103.
    • (1995) Proceedings IJCAI-95 , pp. 1096-1103
    • Boutilier, C.1    Puterman, M.L.2
  • 13
    • 0028564629
    • Acting optimally in partially observable stochastic domains
    • Seattle, WA
    • A.R. Cassandra, L.P. Kaelbling and M.L. Littman, Acting optimally in partially observable stochastic domains, in: Proceedings AAAI-94, Seattle, WA (1994) 1023-1028.
    • (1994) Proceedings AAAI-94 , pp. 1023-1028
    • Cassandra, A.R.1    Kaelbling, L.P.2    Littman, M.L.3
  • 14
    • 0002192119
    • Input generalization in delayed reinforcement learning: An algorithm and performance comparisons
    • Sydney
    • D. Chapman and L.P. Kaelbling, Input generalization in delayed reinforcement learning: an algorithm and performance comparisons, in: Proceedings IJCAI-91, Sydney (1991) 726-731.
    • (1991) Proceedings IJCAI-91 , pp. 726-731
    • Chapman, D.1    Kaelbling, L.P.2
  • 16
    • 84990553353
    • A model for reasoning about persistence and causation
    • T.L. Dean and K. Kanazawa, A model for reasoning about persistence and causation, Comput. Intell. 5 (1989) 142-150.
    • (1989) Comput. Intell. , vol.5 , pp. 142-150
    • Dean, T.L.1    Kanazawa, K.2
  • 19
    • 2842560201
    • Strips: A new approach to the application of theorem proving to problem solving
    • R.E. Fikes and N.J. Nilsson, Strips: a new approach to the application of theorem proving to problem solving, Artif. Intell. 2 (1971) 189-208.
    • (1971) Artif. Intell. , vol.2 , pp. 189-208
    • Fikes, R.E.1    Nilsson, N.J.2
  • 23
    • 0003676137
    • Computation and action under bounded resources
    • Stanford University, Stanford, CA
    • E.J. Horvitz, Computation and action under bounded resources, Tech. Rept. KSL-90-76, Stanford University, Stanford, CA (1990).
    • (1990) Tech. Rept. KSL-90-76
    • Horvitz, E.J.1
  • 29
    • 0028484996
    • Automatically generating abstractions for planning
    • C.A. Knoblock, Automatically generating abstractions for planning, Artif. Intell. 68 (1994) 243-302.
    • (1994) Artif. Intell. , vol.68 , pp. 243-302
    • Knoblock, C.A.1
  • 30
    • 0039936301
    • Characterizing abstraction hierarchies for planning
    • Anaheim, CA
    • C.A. Knoblock, J.D. Tenenberg and Q. Yang, Characterizing abstraction hierarchies for planning, in: Proceedings AAAI-91, Anaheim, CA (1991) 692-697.
    • (1991) Proceedings AAAI-91 , pp. 692-697
    • Knoblock, C.A.1    Tenenberg, J.D.2    Yang, Q.3
  • 31
    • 0025400088
    • Real-time heuristic search
    • R.E. Korf, Real-time heuristic search, Artif. Intell. 42 (1990) 189-211.
    • (1990) Artif. Intell. , vol.42 , pp. 189-211
    • Korf, R.E.1
  • 32
    • 0028566295
    • An algorithm for probabilistic least-commitment planning
    • Seattle, WA
    • N. Kushmerick, S. Hanks and D.S. Weld, An algorithm for probabilistic least-commitment planning, in: Proceedings AAAI-94, Seattle, WA (1994) 1073-1078.
    • (1994) Proceedings AAAI-94 , pp. 1073-1078
    • Kushmerick, N.1    Hanks, S.2    Weld, D.S.3
  • 35
    • 0002679852
    • A survey of algorithmic methods for partially observed Markov decision processes
    • W.S. Lovejoy, A survey of algorithmic methods for partially observed Markov decision processes, Ann. Oper. Res. 28 (1991) 47-66.
    • (1991) Ann. Oper. Res. , vol.28 , pp. 47-66
    • Lovejoy, W.S.1
  • 36
  • 37
    • 0029514510
    • The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces
    • to appear
    • A.W. Moore and C.G. Atkeson, The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces, Mach. Learn. (to appear).
    • Mach. Learn.
    • Moore, A.W.1    Atkeson, C.G.2
  • 39
    • 85168129602
    • Approximating optimal policies for partially observable stochastic domains
    • Montreal, Que.
    • R. Parr and S.J. Russell, Approximating optimal policies for partially observable stochastic domains, in: Proceedings IJCAI-95, Montreal, Que. (1995) 1088-1094.
    • (1995) Proceedings IJCAI-95 , pp. 1088-1094
    • Parr, R.1    Russell, S.J.2
  • 44
    • 0040831492
    • Exploiting the rule structure for decision making within the independent choice logic
    • Montreal, Que.
    • D. Poole, Exploiting the rule structure for decision making within the independent choice logic, in: Proceedings Eleventh Conference on Uncertainty in Artificial Intelligence, Montreal, Que. (1995) 454-463.
    • (1995) Proceedings Eleventh Conference on Uncertainty in Artificial Intelligence , pp. 454-463
    • Poole, D.1
  • 46
    • 0037581251
    • Modified policy iteration algorithms for discounted Markov decision problems
    • M.L. Puterman and M.C. Shin, Modified policy iteration algorithms for discounted Markov decision problems, Manage. Sci. 24 (1978) 1127-1137.
    • (1978) Manage. Sci. , vol.24 , pp. 1127-1137
    • Puterman, M.L.1    Shin, M.C.2
  • 49
    • 0016069798
    • Planning in a hierarchy of abstraction spaces
    • E.D. Sacerdoti, Planning in a hierarchy of abstraction spaces, Artif. Intell. 5 (1974) 115-135.
    • (1974) Artif. Intell. , vol.5 , pp. 115-135
    • Sacerdoti, E.D.1
  • 50
    • 85125003135
    • The nonlinear nature of plans
    • Tbilisi
    • E.D. Sacerdoti, The nonlinear nature of plans, in: Proceedings IJCAI-75, Tbilisi (1975) 206-214.
    • (1975) Proceedings IJCAI-75 , pp. 206-214
    • Sacerdoti, E.D.1
  • 51
    • 0001871991
    • Universal plans for reactive robots in unpredictable environments
    • Milan
    • M.J. Schoppers, Universal plans for reactive robots in unpredictable environments, in: Proceedings IJCAI-87, Milan (1987) 1039-1046.
    • (1987) Proceedings IJCAI-87 , pp. 1039-1046
    • Schoppers, M.J.1
  • 52
    • 0022059617
    • Iterative aggregation-disaggregation procedures for discounted semi-Markov reward processes
    • P.L. Schweitzer, M.L. Puterman and K.W. Kindle, Iterative aggregation-disaggregation procedures for discounted semi-Markov reward processes, Oper. Res. 33 (1985) 589-605.
    • (1985) Oper. Res. , vol.33 , pp. 589-605
    • Schweitzer, P.L.1    Puterman, M.L.2    Kindle, K.W.3
  • 53
    • 85153965130
    • Reinforcement learning with soft state aggregation
    • S.J. Hanson, J.D. Cowan and C.L. Giles, eds., Morgan Kaufmann, San Mateo, CA
    • S.P. Singh, T. Jaakkola and M.I. Jordan, Reinforcement learning with soft state aggregation, in: S.J. Hanson, J.D. Cowan and C.L. Giles, eds., Advances in Neural Information Processing Systems 7 (Morgan Kaufmann, San Mateo, CA, 1994).
    • (1994) Advances in Neural Information Processing Systems , vol.7
    • Singh, S.P.1    Jaakkola, T.2    Jordan, M.I.3
  • 54
    • 0015658957
    • The optimal control of partially observable Markov processes over a finite horizon
    • R.D. Smallwood and E.J. Sondik, The optimal control of partially observable Markov processes over a finite horizon, Oper. Res. 21 (1973) 1071-1088.
    • (1973) Oper. Res. , vol.21 , pp. 1071-1088
    • Smallwood, R.D.1    Sondik, E.J.2
  • 55
    • 0027709265
    • Postponing threats in partial-order planning
    • Washington, DC
    • D.E. Smith and M.A. Peot, Postponing threats in partial-order planning, in: Proceedings AAAI-93, Washington, DC (1993) 500-506.
    • (1993) Proceedings AAAI-93 , pp. 500-506
    • Smith, D.E.1    Peot, M.A.2
  • 56
    • 0028576345
    • Control strategies for a stochastic planner
    • Seattle, WA
    • J. Tash and S.J. Russell, Control strategies for a stochastic planner, in: Proceedings AAAI-94, Seattle, WA (1994) 1079-1085.
    • (1994) Proceedings AAAI-94 , pp. 1079-1085
    • Tash, J.1    Russell, S.J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.