



Volume 121, Issue 1, 2000, Pages 49-107

Stochastic dynamic programming with factored representations

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; APPROXIMATION THEORY; COMPUTATIONAL COMPLEXITY; DECISION THEORY; DYNAMIC PROGRAMMING; MARKOV PROCESSES; MATHEMATICAL MODELS; PROBABILITY DISTRIBUTIONS; PROBLEM SOLVING; THEOREM PROVING; TREES (MATHEMATICS);

EID: 0034248853     PISSN: 00043702     EISSN: None     Source Type: Journal    
DOI: 10.1016/S0004-3702(00)00033-3     Document Type: Article
Times cited: 320

References (84)
  • 3
    • 0003787146
    • Princeton, NJ: Princeton University Press
    • Bellman R.E. Dynamic Programming. 1957;Princeton University Press, Princeton, NJ.
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 4
    • 0024680419
    • Adaptive aggregation for infinite horizon dynamic programming
    • Bertsekas D.P., Castanon D.A. Adaptive aggregation for infinite horizon dynamic programming. IEEE Trans. Automat. Control. Vol. 34:1989;589-598.
    • (1989) IEEE Trans. Automat. Control , vol.34 , pp. 589-598
    • Bertsekas, D.P.1    Castanon, D.A.2
  • 7
    • 0028443644
    • Trading accuracy for simplicity in decision trees
    • Bohanec M., Bratko I. Trading accuracy for simplicity in decision trees. Machine Learning. Vol. 15:1994;223-250.
    • (1994) Machine Learning , vol.15 , pp. 223-250
    • Bohanec, M.1    Bratko, I.2
  • 9
    • 84880685295
    • Prioritized goal decomposition of Markov decision processes: Toward a synthesis of classical and decision theoretic planning
    • Boutilier C., Brafman R.I., Geib C. Prioritized goal decomposition of Markov decision processes: Toward a synthesis of classical and decision theoretic planning. Proc. IJCAI-97, Nagoya, Japan. 1997;1156-1162.
    • (1997) Proc. IJCAI-97, Nagoya, Japan , pp. 1156-1162
    • Boutilier, C.1    Brafman, R.I.2    Geib, C.3
  • 11
    • 0346942368
    • Decision theoretic planning: Structural assumptions and computational leverage
    • Boutilier C., Dean T., Hanks S. Decision theoretic planning: Structural assumptions and computational leverage. J. Artificial Intelligence Res. Vol. 11:1999;1-94.
    • (1999) J. Artificial Intelligence Res. , vol.11 , pp. 1-94
    • Boutilier, C.1    Dean, T.2    Hanks, S.3
  • 12
    • 0028572333
    • Using abstractions for decision-theoretic planning with time constraints
    • Boutilier C., Dearden R. Using abstractions for decision-theoretic planning with time constraints. Proc. AAAI-94, Seattle, WA. 1994;1016-1022.
    • (1994) Proc. AAAI-94, Seattle, WA , pp. 1016-1022
    • Boutilier, C.1    Dearden, R.2
  • 16
    • 0030349220
    • Computing optimal policies for partially observable decision processes using compact representations
    • Boutilier C., Poole D. Computing optimal policies for partially observable decision processes using compact representations. Proc. AAAI-96, Portland, OR. 1996;1168-1175.
    • (1996) Proc. AAAI-96, Portland, OR , pp. 1168-1175
    • Boutilier, C.1    Poole, D.2
  • 18
    • 0001133021
    • Generalization in reinforcement learning: Safely approximating the value function
    • G. Tesauro, D.S. Touretzky, & T.K. Leen. Cambridge, MA: MIT Press
    • Boyan J.A., Moore A.W. Generalization in reinforcement learning: Safely approximating the value function. Tesauro G., Touretzky D.S., Leen T.K. Advances in Neural Information Processing Systems 7. 1995;MIT Press, Cambridge, MA.
    • (1995) Advances in Neural Information Processing Systems 7
    • Boyan, J.A.1    Moore, A.W.2
  • 19
    • 0022769976
    • Graph-based algorithms for Boolean function manipulation
    • Bryant R.E. Graph-based algorithms for Boolean function manipulation. IEEE Trans. Comput. Vol. C-35:(8):1986;677-691.
    • (1986) IEEE Trans. Comput. , vol.C-35 , Issue.8 , pp. 677-691
    • Bryant, R.E.1
  • 22
    • 0023381915
    • Planning for conjunctive goals
    • Chapman D. Planning for conjunctive goals. Artificial Intelligence. Vol. 32:(3):1987;333-377.
    • (1987) Artificial Intelligence , vol.32 , Issue.3 , pp. 333-377
    • Chapman, D.1
  • 23
    • 0002192119
    • Input generalization in delayed reinforcement learning: An algorithm and performance comparisons
    • Chapman D., Kaelbling L.P. Input generalization in delayed reinforcement learning: An algorithm and performance comparisons. Proc. IJCAI-91, Sydney, Australia. 1991;726-731.
    • (1991) Proc. IJCAI-91, Sydney, Australia , pp. 726-731
    • Chapman, D.1    Kaelbling, L.P.2
  • 26
    • 0031370386
    • Model minimization in Markov decision processes
    • Dean T., Givan R. Model minimization in Markov decision processes. Proc. AAAI-97, Providence, RI. 1997;106-111.
    • (1997) Proc. AAAI-97, Providence, RI , pp. 106-111
    • Dean, T.1    Givan, R.2
  • 29
    • 84990553353
    • A model for reasoning about persistence and causation
    • Dean T., Kanazawa K. A model for reasoning about persistence and causation. Computational Intelligence. Vol. 5:(3):1989;142-150.
    • (1989) Computational Intelligence , vol.5 , Issue.3 , pp. 142-150
    • Dean, T.1    Kanazawa, K.2
  • 30
    • 0030697013
    • Abstraction and approximate decision theoretic planning
    • Dearden R., Boutilier C. Abstraction and approximate decision theoretic planning. Artificial Intelligence. Vol. 89:1997;219-283.
    • (1997) Artificial Intelligence , vol.89 , pp. 219-283
    • Dearden, R.1    Boutilier, C.2
  • 33
    • 0031208987
    • Explanation-based learning and reinforcement learning: A unified view
    • Dietterich T.G., Flann N.S. Explanation-based learning and reinforcement learning: A unified view. Machine Learning. Vol. 28:(2):1997;169-210.
    • (1997) Machine Learning , vol.28 , Issue.2 , pp. 169-210
    • Dietterich, T.G.1    Flann, N.S.2
  • 34
    • 0016543936
    • Guarded commands, nondeterminacy and formal derivation of programs
    • Dijkstra E.W. Guarded commands, nondeterminacy and formal derivation of programs. Comm. ACM. Vol. 18:(8):1975;453-457.
    • (1975) Comm. ACM , vol.18 , Issue.8 , pp. 453-457
    • Dijkstra, E.W.1
  • 35
    • 0017628839
    • Decision theory and artificial intelligence II: The hungry monkey
    • Feldman J.A., Sproull R.F. Decision theory and artificial intelligence II: The hungry monkey. Cognitive Sci. Vol. 1:1977;158-192.
    • (1977) Cognitive Sci. , vol.1 , pp. 158-192
    • Feldman, J.A.1    Sproull, R.F.2
  • 36
    • 2842560201
    • STRIPS: A new approach to the application of theorem proving to problem solving
    • Fikes R.E., Nilsson N.J. STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence. Vol. 2:1971;189-208.
    • (1971) Artificial Intelligence , vol.2 , pp. 189-208
    • Fikes, R.E.1    Nilsson, N.J.2
  • 39
    • 84880654869
    • Model minimization, regression, and propositional STRIPS planning
    • Givan R., Dean T. Model minimization, regression, and propositional STRIPS planning. Proc. IJCAI-97, Nagoya, Japan. 1997;1163-1168.
    • (1997) Proc. IJCAI-97, Nagoya, Japan , pp. 1163-1168
    • Givan, R.1    Dean, T.2
  • 40
    • 0028400910
    • Modeling a dynamic and uncertain world I: Symbolic and probabilistic reasoning about change
    • Hanks S., McDermott D.V. Modeling a dynamic and uncertain world I: Symbolic and probabilistic reasoning about change. Artificial Intelligence. Vol. 66:1994;1-55.
    • (1994) Artificial Intelligence , vol.66 , pp. 1-55
    • Hanks, S.1    McDermott, D.V.2
  • 46
    • 0001815269
    • Constructing optimal binary decision trees is NP-complete
    • Hyafil L., Rivest R.L. Constructing optimal binary decision trees is NP-complete. Inform. Process. Lett. Vol. 5:1976;15-17.
    • (1976) Inform. Process. Lett. , vol.5 , pp. 15-17
    • Hyafil, L.1    Rivest, R.L.2
  • 53
    • 0002679852
    • A survey of algorithmic methods for partially observed Markov decision processes
    • Lovejoy W.S. A survey of algorithmic methods for partially observed Markov decision processes. Ann. Oper. Res. Vol. 28:1991;47-66.
    • (1991) Ann. Oper. Res. , vol.28 , pp. 47-66
    • Lovejoy, W.S.1
  • 55
    • 0014638440
    • Some philosophical problems from the standpoint of artificial intelligence
    • B. Meltzer, & D. Michie. Edinburgh: Edinburgh University Press
    • McCarthy J., Hayes P.J. Some philosophical problems from the standpoint of artificial intelligence. Meltzer B., Michie D. Machine Intelligence 4. 1969;463-502 Edinburgh University Press, Edinburgh.
    • (1969) Machine Intelligence 4 , pp. 463-502
    • McCarthy, J.1    Hayes, P.J.2
  • 59
    • 0027702434
    • Probabilistic Horn abduction and Bayesian networks
    • Poole D. Probabilistic Horn abduction and Bayesian networks. Artificial Intelligence. Vol. 64:(1):1993;81-129.
    • (1993) Artificial Intelligence , vol.64 , Issue.1 , pp. 81-129
    • Poole, D.1
  • 60
    • 0031187203
    • The independent choice logic for modelling multiple agents under uncertainty
    • Poole D. The independent choice logic for modelling multiple agents under uncertainty. Artificial Intelligence. Vol. 94:(1-2):1997;7-56.
    • (1997) Artificial Intelligence , vol.94 , Issue.1-2 , pp. 7-56
    • Poole, D.1
  • 63
    • 0037581251
    • Modified policy iteration algorithms for discounted Markov decision problems
    • Puterman M.L., Shin M.C. Modified policy iteration algorithms for discounted Markov decision problems. Management Science. Vol. 24:1978;1127-1137.
    • (1978) Management Science , vol.24 , pp. 1127-1137
    • Puterman, M.L.1    Shin, M.C.2
  • 65
    • 1442267080
    • Learning decision lists
    • Rivest R.L. Learning decision lists. Machine Learning. Vol. 2:1987;229-246.
    • (1987) Machine Learning , vol.2 , pp. 229-246
    • Rivest, R.L.1
  • 67
    • 0001871991
    • Universal plans for reactive robots in unpredictable environments
    • Schoppers M.J. Universal plans for reactive robots in unpredictable environments. Proc. IJCAI-87, Milan, Italy. 1987;1039-1046.
    • (1987) Proc. IJCAI-87, Milan, Italy , pp. 1039-1046
    • Schoppers, M.J.1
  • 68
    • 0022059617
    • Iterative aggregation-disaggregation procedures for discounted semi-Markov reward processes
    • Schweitzer P.L., Puterman M.L., Kindle K.W. Iterative aggregation-disaggregation procedures for discounted semi-Markov reward processes. Oper. Res. Vol. 33:1985;589-605.
    • (1985) Oper. Res. , vol.33 , pp. 589-605
    • Schweitzer, P.L.1    Puterman, M.L.2    Kindle, K.W.3
  • 69
    • 0022818911
    • Evaluating influence diagrams
    • Shachter R.D. Evaluating influence diagrams. Oper. Res. Vol. 34:(6):1986;871-882.
    • (1986) Oper. Res. , vol.34 , Issue.6 , pp. 871-882
    • Shachter, R.D.1
  • 70
    • 43949170056
    • The role of relevance in explanation I: Irrelevance as statistical independence
    • Shimony S.E. The role of relevance in explanation I: Irrelevance as statistical independence. Internat. J. Approx. Reason. Vol. 8:(4):1993;281-324.
    • (1993) Internat. J. Approx. Reason. , vol.8 , Issue.4 , pp. 281-324
    • Shimony, S.E.1
  • 72
    • 0001027894
    • Transfer of learning by composing solutions of elemental sequential tasks
    • Singh S.P. Transfer of learning by composing solutions of elemental sequential tasks. Machine Learning. Vol. 8:1992;323-339.
    • (1992) Machine Learning , vol.8 , pp. 323-339
    • Singh, S.P.1
  • 73
    • 0015658957
    • The optimal control of partially observable Markov processes over a finite horizon
    • Smallwood R.D., Sondik E.J. The optimal control of partially observable Markov processes over a finite horizon. Oper. Res. Vol. 21:1973;1071-1088.
    • (1973) Oper. Res. , vol.21 , pp. 1071-1088
    • Smallwood, R.D.1    Sondik, E.J.2
  • 74
    • 0027561028
    • Structuring conditional relationships in influence diagrams
    • Smith J.E., Holtzman S., Matheson J.E. Structuring conditional relationships in influence diagrams. Oper. Res. Vol. 41:(2):1993;280-297.
    • (1993) Oper. Res. , vol.41 , Issue.2 , pp. 280-297
    • Smith, J.E.1    Holtzman, S.2    Matheson, J.E.3
  • 75
    • 0017943242
    • The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs
    • Sondik E.J. The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs. Oper. Res. Vol. 26:1978;282-304.
    • (1978) Oper. Res. , vol.26 , pp. 282-304
    • Sondik, E.J.1
  • 76
    • 85132026293
    • Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
    • Sutton R.S. Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. Proc. 7th International Conference on Machine Learning, Austin, TX. 1990;216-224.
    • (1990) Proc. 7th International Conference on Machine Learning, Austin, TX , pp. 216-224
    • Sutton, R.S.1
  • 78
    • 0028576345
    • Control strategies for a stochastic planner
    • Tash J., Russell S. Control strategies for a stochastic planner. Proc. AAAI-94, Seattle, WA. 1994;1079-1085.
    • (1994) Proc. AAAI-94, Seattle, WA , pp. 1079-1085
    • Tash, J.1    Russell, S.2
  • 80
    • 0000985504
    • TD-Gammon, a self-teaching backgammon program, achieves master-level play
    • Tesauro G.J. TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation. Vol. 6:1994;215-219.
    • (1994) Neural Computation , vol.6 , pp. 215-219
    • Tesauro, G.J.1
  • 81
    • 0029752470
    • Feature-based methods for large scale dynamic programming
    • Tsitsiklis J.H., Van Roy B. Feature-based methods for large scale dynamic programming. Machine Learning. Vol. 22:1996;59-94.
    • (1996) Machine Learning , vol.22 , pp. 59-94
    • Tsitsiklis, J.H.1    Van Roy, B.2
  • 83
    • 0042586698
    • Achieving several goals simultaneously
    • E. Elcock, & D. Michie. Chichester, England: Ellis Horwood
    • Waldinger R. Achieving several goals simultaneously. Elcock E., Michie D. Machine Intelligence 8: Machine Representations of Knowledge. 1977;94-136 Ellis Horwood, Chichester, England.
    • (1977) Machine Intelligence 8: Machine Representations of Knowledge , pp. 94-136
    • Waldinger, R.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.