Volume 11, 1999, Pages 1-94

Decision-Theoretic Planning: Structural Assumptions and Computational Leverage

EID: 0346942368     PISSN: 10769757     EISSN: None     Source Type: Journal    
DOI: 10.1613/jair.575     Document Type: Article
Times cited: 786

References (158)
  • 2
    • 50549213583
    • Optimal control of Markov decision processes with incomplete state estimation
    • Åström, K. J. (1965). Optimal control of Markov decision processes with incomplete state estimation. J. Math. Anal. Appl., 10, 174-205.
    • (1965) J. Math. Anal. Appl. , vol.10 , pp. 174-205
    • Åström, K.J.1
  • 5
    • 0001951408
    • Using temporal logic to control search in a forward chaining planner
    • Assisi, Italy
    • Bacchus, F., & Kabanza, F. (1995). Using temporal logic to control search in a forward chaining planner. In Proceedings of the Third European Workshop on Planning (EWSP'95) Assisi, Italy. Available via the URL ftp://logos.uwaterloo.ca:/pub/tlplan/tlplan.ps.Z.
    • (1995) Proceedings of the Third European Workshop on Planning (EWSP'95)
    • Bacchus, F.1    Kabanza, F.2
  • 8
    • 0026153773
    • Nonmonotonic reasoning in the framework of the situation calculus
    • Baker, A. B. (1991). Nonmonotonic reasoning in the framework of the situation calculus. Artificial Intelligence, 49, 5-23.
    • (1991) Artificial Intelligence , vol.49 , pp. 5-23
    • Baker, A.B.1
  • 9
    • 0029210635
    • Learning to act using real-time dynamic programming
    • Barto, A. G., Bradtke, S. J., & Singh, S. P. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1-2), 81-138.
    • (1995) Artificial Intelligence , vol.72 , Issue.1-2 , pp. 81-138
    • Barto, A.G.1    Bradtke, S.J.2    Singh, S.P.3
  • 10
    • 85012688561
    • Princeton University Press, Princeton, NJ
    • Bellman, R. (1957). Dynamic Programming. Princeton University Press, Princeton, NJ.
    • (1957) Dynamic Programming
    • Bellman, R.1
  • 11
    • 0024680419
    • Adaptive aggregation for infinite horizon dynamic programming
    • Bertsekas, D. P., & Castanon, D. A. (1989). Adaptive aggregation for infinite horizon dynamic programming. IEEE Transactions on Automatic Control, 34(6), 589-598.
    • (1989) IEEE Transactions on Automatic Control , vol.34 , Issue.6 , pp. 589-598
    • Bertsekas, D.P.1    Castanon, D.A.2
  • 27
  • 30
    • 0022769976
    • Graph-based algorithms for boolean function manipulation
    • Bryant, R. E. (1986). Graph-based algorithms for boolean function manipulation. IEEE Transactions on Computers, C-35(8), 677-691.
    • (1986) IEEE Transactions on Computers , vol.C-35 , Issue.8 , pp. 677-691
    • Bryant, R.E.1
  • 31
    • 0028498153
    • The computational complexity of propositional STRIPS planning
    • Bylander, T. (1994). The computational complexity of propositional STRIPS planning. Artificial Intelligence, 69, 161-204.
    • (1994) Artificial Intelligence , vol.69 , pp. 161-204
    • Bylander, T.1
  • 35
    • 0023381915
    • Planning for conjunctive goals
    • Chapman, D. (1987). Planning for conjunctive goals. Artificial Intelligence, 32(3), 333-377.
    • (1987) Artificial Intelligence , vol.32 , Issue.3 , pp. 333-377
    • Chapman, D.1
  • 37
    • 0001391104
    • Decomposition principle for linear programs
    • Dantzig, G., & Wolfe, P. (1960). Decomposition principle for linear programs. Operations Research, 8(1), 101-111.
    • (1960) Operations Research , vol.8 , Issue.1 , pp. 101-111
    • Dantzig, G.1    Wolfe, P.2
  • 44
    • 84990553353
    • A model for reasoning about persistence and causation
    • Dean, T., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5(3), 142-150.
    • (1989) Computational Intelligence , vol.5 , Issue.3 , pp. 142-150
    • Dean, T.1    Kanazawa, K.2
  • 48
    • 0030697013
    • Abstraction and approximate decision theoretic planning
    • Dearden, R., & Boutilier, C. (1997). Abstraction and approximate decision theoretic planning. Artificial Intelligence, 89, 219-283.
    • (1997) Artificial Intelligence , vol.89 , pp. 219-283
    • Dearden, R.1    Boutilier, C.2
  • 50
    • 84880665054
    • Mini-buckets: A general scheme for generating approximations in automated reasoning in probabilistic inference
    • Nagoya, Japan
    • Dechter, R. (1997). Mini-buckets: A general scheme for generating approximations in automated reasoning in probabilistic inference. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, pp. 1297-1302 Nagoya, Japan.
    • (1997) Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence , pp. 1297-1302
    • Dechter, R.1
  • 51
    • 0006464452
    • Sur un problème de production et de stockage dans l'aléatoire
    • D'Epenoux, F. (1963). Sur un problème de production et de stockage dans l'aléatoire. Management Science, 10, 98-108.
    • (1963) Management Science , vol.10 , pp. 98-108
    • D'Epenoux, F.1
  • 56
    • 0015440625
    • Learning and executing generalized robot plans
    • Fikes, R., Hart, P., & Nilsson, N. (1972). Learning and executing generalized robot plans. Artificial Intelligence, 3, 251-288.
    • (1972) Artificial Intelligence , vol.3 , pp. 251-288
    • Fikes, R.1    Hart, P.2    Nilsson, N.3
  • 57
    • 2842560201
    • STRIPS: A new approach to the application of theorem proving to problem solving
    • Fikes, R., & Nilsson, N. J. (1971). STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2, 189-208.
    • (1971) Artificial Intelligence , vol.2 , pp. 189-208
    • Fikes, R.1    Nilsson, N.J.2
  • 59
    • 84945709831
    • Algorithm 97 (shortest path)
    • Floyd, R. W. (1962). Algorithm 97 (shortest path). Communications of the ACM, 5(6), 345.
    • (1962) Communications of the ACM , vol.5 , Issue.6 , pp. 345
    • Floyd, R.W.1
  • 60
    • 0344030849
    • An algorithm for identifying the ergodic subchains and transient states of a stochastic matrix
    • Fox, B. L., & Landi, D. M. (1968). An algorithm for identifying the ergodic subchains and transient states of a stochastic matrix. Communications of the ACM, 11, 619-621.
    • (1968) Communications of the ACM , vol.11 , pp. 619-621
    • Fox, B.L.1    Landi, D.M.2
  • 61
  • 67
    • 0032137162
    • Utility Models for Goal-Directed Decision-Theoretic Planners
    • Haddawy, P., & Hanks, S. (1998). Utility Models for Goal-Directed Decision-Theoretic Planners. Computational Intelligence, 14(3).
    • (1998) Computational Intelligence , vol.14 , Issue.3
    • Haddawy, P.1    Hanks, S.2
  • 69
    • 0038256822
    • Ph.D. thesis 756, Yale University, Department of Computer Science, New Haven, CT
    • Hanks, S. (1990). Projecting plans for uncertain worlds. Ph.D. thesis 756, Yale University, Department of Computer Science, New Haven, CT.
    • (1990) Projecting Plans for Uncertain Worlds
    • Hanks, S.1
  • 70
    • 0028400910
    • Modeling a dynamic and uncertain world I: Symbolic and probabilistic reasoning about change
    • Hanks, S., & McDermott, D. V. (1994). Modeling a dynamic and uncertain world I: Symbolic and probabilistic reasoning about change. Artificial Intelligence, 66(1), 1-55.
    • (1994) Artificial Intelligence , vol.66 , Issue.1 , pp. 1-55
    • Hanks, S.1    McDermott, D.V.2
  • 78
    • 0000086731
    • Influence diagrams
    • Howard, R. A., & Matheson, J. E. (Eds.). Strategic Decisions Group, Menlo Park, CA
    • Howard, R. A., & Matheson, J. E. (1984). Influence diagrams. In Howard, R. A., & Matheson, J. E. (Eds.), The Principles and Applications of Decision Analysis. Strategic Decisions Group, Menlo Park, CA.
    • (1984) The Principles and Applications of Decision Analysis
    • Howard, R.A.1    Matheson, J.E.2
  • 79
    • 11544363000
    • Refinement planning as a unifying framework for plan synthesis
    • Kambhampati, S. (1997). Refinement planning as a unifying framework for plan synthesis. AI Magazine, Summer 1997, 67-97.
    • (1997) AI Magazine , vol.SUMMER 1997 , pp. 67-97
    • Kambhampati, S.1
  • 82
    • 0002874631
    • A computational scheme for reasoning in dynamic probabilistic networks
    • Stanford
    • Kjaerulff, U. (1992). A computational scheme for reasoning in dynamic probabilistic networks. In Proceedings of the Eighth Conference on Uncertainty in AI, pp. 121-129 Stanford.
    • (1992) Proceedings of the Eighth Conference on Uncertainty in AI , pp. 121-129
    • Kjaerulff, U.1
  • 87
    • 0022045044
    • Macro-operators: A weak method for learning
    • Korf, R. (1985). Macro-operators: A weak method for learning. Artificial Intelligence, 26, 35-77.
    • (1985) Artificial Intelligence , vol.26 , pp. 35-77
    • Korf, R.1
  • 88
    • 0025400088
    • Real-time heuristic search
    • Korf, R. E. (1990). Real-time heuristic search. Artificial Intelligence, 42, 189-211.
    • (1990) Artificial Intelligence , vol.42 , pp. 189-211
    • Korf, R.E.1
  • 97
    • 0003861655
    • Ph.D. thesis CS-96-09, Brown University, Department of Computer Science, Providence, RI
    • Littman, M. L. (1996). Algorithms for sequential decision making. Ph.D. thesis CS-96-09, Brown University, Department of Computer Science, Providence, RI.
    • (1996) Algorithms for Sequential Decision Making
    • Littman, M.L.1
  • 98
    • 0000494894
    • Computationally feasible bounds for partially observed Markov decision processes
    • Lovejoy, W. S. (1991a). Computationally feasible bounds for partially observed Markov decision processes. Operations Research, 39(1), 162-175.
    • (1991) Operations Research , vol.39 , Issue.1 , pp. 162-175
    • Lovejoy, W.S.1
  • 99
    • 0002679852
    • A survey of algorithmic methods for partially observed Markov decision processes
    • Lovejoy, W. S. (1991b). A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research, 28, 47-66.
    • (1991) Annals of Operations Research , vol.28 , pp. 47-66
    • Lovejoy, W.S.1
  • 102
    • 0032596468
    • On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems
    • Orlando, FL. To appear
    • Madani, O., Condon, A., & Hanks, S. (1999). On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems. In Proceedings of the Sixteenth National Conference on Artificial Intelligence Orlando, FL. To appear.
    • (1999) Proceedings of the Sixteenth National Conference on Artificial Intelligence
    • Madani, O.1    Condon, A.2    Hanks, S.3
  • 103
    • 0010853273
    • To discount or not to discount in reinforcement learning: A case study in comparing R-learning and Q-learning
    • New Brunswick, NJ
    • Mahadevan, S. (1994). To discount or not to discount in reinforcement learning: A case study in comparing R-learning and Q-learning. In Proceedings of the Eleventh International Conference on Machine Learning, pp. 164-172 New Brunswick, NJ.
    • (1994) Proceedings of the Eleventh International Conference on Machine Learning , pp. 164-172
    • Mahadevan, S.1
  • 105
  • 106
    • 0014638440
    • Some philosophical problems from the standpoint of artificial intelligence
    • McCarthy, J., & Hayes, P. J. (1969). Some philosophical problems from the standpoint of artificial intelligence. Machine Intelligence, 4, 463-502.
    • (1969) Machine Intelligence , vol.4 , pp. 463-502
    • McCarthy, J.1    Hayes, P.J.2
  • 108
    • 0029514510
    • The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces
    • Moore, A. W., & Atkeson, C. G. (1995). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces. Machine Learning, 21, 199-234.
    • (1995) Machine Learning , vol.21 , pp. 199-234
    • Moore, A.W.1    Atkeson, C.G.2
  • 112
    • 84898956770
    • Reinforcement learning with hierarchies of machines
    • Jordan, M., Kearns, M., & Solla, S. (Eds.). MIT Press, Cambridge
    • Parr, R., & Russell, S. (1998). Reinforcement learning with hierarchies of machines. In Jordan, M., Kearns, M., & Solla, S. (Eds.), Advances in Neural Information Processing Systems 10, pp. 1043-1049. MIT Press, Cambridge.
    • (1998) Advances in Neural Information Processing Systems 10 , pp. 1043-1049
    • Parr, R.1    Russell, S.2
  • 119
    • 0031187203
    • The independent choice logic for modelling multiple agents under uncertainty
    • Poole, D. (1997a). The independent choice logic for modelling multiple agents under uncertainty. Artificial Intelligence, 94(1-2), 7-56.
    • (1997) Artificial Intelligence , vol.94 , Issue.1-2 , pp. 7-56
    • Poole, D.1
  • 123
    • 0347999523
    • CASSANDRA: Planning for contingencies
    • Northwestern University, The Institute for the Learning Sciences
    • Pryor, L., & Collins, G. (1993). CASSANDRA: Planning for contingencies. Technical report 41, Northwestern University, The Institute for the Learning Sciences.
    • (1993) Technical Report 41
    • Pryor, L.1    Collins, G.2
  • 125
    • 0037581251
    • Modified policy iteration algorithms for discounted Markov decision problems
    • Puterman, M. L., & Shin, M. (1978). Modified policy iteration algorithms for discounted Markov decision problems. Management Science, 24, 1127-1137.
    • (1978) Management Science , vol.24 , pp. 1127-1137
    • Puterman, M.L.1    Shin, M.2
  • 126
    • 0001172487
    • Multichain Markov decision processes with a sample-path constraint: A decomposition approach
    • Ross, K. W., & Varadarajan, R. (1991). Multichain Markov decision processes with a sample-path constraint: A decomposition approach. Mathematics of Operations Research, 16(1), 195-207.
    • (1991) Mathematics of Operations Research , vol.16 , Issue.1 , pp. 195-207
    • Ross, K.W.1    Varadarajan, R.2
  • 128
    • 0016069798
    • Planning in a hierarchy of abstraction spaces
    • Sacerdoti, E. D. (1974). Planning in a hierarchy of abstraction spaces. Artificial Intelligence, 5, 115-135.
    • (1974) Artificial Intelligence , vol.5 , pp. 115-135
    • Sacerdoti, E.D.1
  • 132
    • 0022059617
    • Iterative aggregation-disaggregation procedures for discounted semi-Markov reward processes
    • Schweitzer, P. L., Puterman, M. L., & Kindle, K. W. (1985). Iterative aggregation-disaggregation procedures for discounted semi-Markov reward processes. Operations Research, 33, 589-605.
    • (1985) Operations Research , vol.33 , pp. 589-605
    • Schweitzer, P.L.1    Puterman, M.L.2    Kindle, K.W.3
  • 133
    • 0022818911
    • Evaluating influence diagrams
    • Shachter, R. D. (1986). Evaluating influence diagrams. Operations Research, 34(6), 871-882.
    • (1986) Operations Research , vol.34 , Issue.6 , pp. 871-882
    • Shachter, R.D.1
  • 134
    • 43949170056
    • The role of relevance in explanation I: Irrelevance as statistical independence
    • Shimony, S. E. (1993). The role of relevance in explanation I: Irrelevance as statistical independence. International Journal of Approximate Reasoning, 8(4), 281-324.
    • (1993) International Journal of Approximate Reasoning , vol.8 , Issue.4 , pp. 281-324
    • Shimony, S.E.1
  • 136
  • 137
    • 85153965130
    • Reinforcement learning with soft state aggregation
    • Hanson, S. J., Cowan, J. D., & Giles, C. L. (Eds.). Morgan-Kaufmann, San Mateo
    • Singh, S. P., Jaakkola, T., & Jordan, M. I. (1994). Reinforcement learning with soft state aggregation. In Hanson, S. J., Cowan, J. D., & Giles, C. L. (Eds.), Advances in Neural Information Processing Systems 7. Morgan-Kaufmann, San Mateo.
    • (1994) Advances in Neural Information Processing Systems 7
    • Singh, S.P.1    Jaakkola, T.2    Jordan, M.I.3
  • 138
    • 0015658957
    • The optimal control of partially observable Markov processes over a finite horizon
    • Smallwood, R. D., & Sondik, E. J. (1973). The optimal control of partially observable Markov processes over a finite horizon. Operations Research, 21, 1071-1088.
    • (1973) Operations Research , vol.21 , pp. 1071-1088
    • Smallwood, R.D.1    Sondik, E.J.2
  • 140
    • 0017943242
    • The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs
    • Sondik, E. J. (1978). The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs. Operations Research, 26, 282-304.
    • (1978) Operations Research , vol.26 , pp. 282-304
    • Sondik, E.J.1
  • 141
    • 0003328519
    • Team-partitioned, opaque-transition reinforcement learning
    • Asada, M. (Ed.). Springer Verlag, Berlin
    • Stone, P., & Veloso, M. (1999). Team-partitioned, opaque-transition reinforcement learning. In Asada, M. (Ed.), RoboCup-98: Robot Soccer World Cup II. Springer Verlag, Berlin.
    • (1999) RoboCup-98: Robot Soccer World Cup II
    • Stone, P.1    Veloso, M.2
  • 146
    • 0000985504
    • TD-Gammon, a self-teaching backgammon program, achieves master-level play
    • Tesauro, G. J. (1994). TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6, 215-219.
    • (1994) Neural Computation , vol.6 , pp. 215-219
    • Tesauro, G.J.1
  • 147
    • 0032044899
    • A probabilistic approach to concurrent mapping and localization for mobile robots
    • Thrun, S., Fox, D., & Burgard, W. (1998). A probabilistic approach to concurrent mapping and localization for mobile robots. Machine Learning, 31, 29-53.
    • (1998) Machine Learning , vol.31 , pp. 29-53
    • Thrun, S.1    Fox, D.2    Burgard, W.3
  • 148
    • 0000277836
    • Finding structure in reinforcement learning
    • Tesauro, G., Touretzky, D., & Leen, T. (Eds.), Cambridge, MA. MIT Press
    • Thrun, S., & Schwartz, A. (1995). Finding structure in reinforcement learning. In Tesauro, G., Touretzky, D., & Leen, T. (Eds.), Advances in Neural Information Processing Systems 7 Cambridge, MA. MIT Press.
    • (1995) Advances in Neural Information Processing Systems 7
    • Thrun, S.1    Schwartz, A.2
  • 149
    • 0010732426
    • Generating conditional plans and programs
    • University of Edinburgh
    • Warren, D. (1976). Generating conditional plans and programs. In Proceedings of AISB Summer Conference, pp. 344-354 University of Edinburgh.
    • (1976) Proceedings of AISB Summer Conference , pp. 344-354
    • Warren, D.1
  • 151
    • 0028750404
    • An introduction to least commitment planning
    • Weld, D. S. (1994). An introduction to least commitment planning. AI Magazine, Winter 1994, 27-61.
    • (1994) AI Magazine , vol.WINTER 1994 , pp. 27-61
    • Weld, D.S.1
  • 152
    • 0024739631
    • Solution procedures for partially observed Markov decision processes
    • White III, C. C., & Scherer, W. T. (1989). Solution procedures for partially observed Markov decision processes. Operations Research, 37(5), 791-797.
    • (1989) Operations Research , vol.37 , Issue.5 , pp. 791-797
    • White III, C.C.1    Scherer, W.T.2
  • 153
    • 0342732841
    • Ph.D. thesis 96-06-03, University of Washington, Department of Computer Science and Engineering
    • Williamson, M. (1996). A value-directed approach to planning. Ph.D. thesis 96-06-03, University of Washington, Department of Computer Science and Engineering.
    • (1996) A Value-directed Approach to Planning
    • Williamson, M.1
  • 157
    • 85016628903
    • A model approximation scheme for planning in partially observable stochastic domains
    • Zhang, N. L., & Liu, W. (1997). A model approximation scheme for planning in partially observable stochastic domains. Journal of Artificial Intelligence Research, 7, 199-230.
    • (1997) Journal of Artificial Intelligence Research , vol.7 , pp. 199-230
    • Zhang, N.L.1    Liu, W.2
  • 158


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.