Volume 101, Issue 1-2, 1998, Pages 99-134

Planning and acting in partially observable stochastic domains

Author keywords

Partially observable Markov decision processes; Planning; Uncertainty

Indexed keywords

ALGORITHMS; MARKOV PROCESSES; OBSERVABILITY; PROBLEM SOLVING

EID: 0032073263     PISSN: 00043702     EISSN: None     Source Type: Journal    
DOI: 10.1016/s0004-3702(98)00023-x     Document Type: Article
Times cited: 3689

References (70)
  • 1
    • K.J. Aström, Optimal control of Markov decision processes with incomplete state estimation, J. Math. Anal. Appl. 10 (1965) 174-205.
  • 4
    • A.L. Blum and M.L. Furst, Fast planning through planning graph analysis, Artificial Intelligence 90 (1-2) (1997) 279-298.
  • 6
    • C. Boutilier and D. Poole, Computing optimal policies for partially observable decision processes using compact representations, in: Proceedings Thirteenth National Conference on Artificial Intelligence (AAAI-96), Portland, OR, AAAI Press/MIT Press, Menlo Park, CA, 1996, pp. 1168-1175.
  • 11
    • L. Chrisman, Reinforcement learning with perceptual aliasing: The perceptual distinctions approach, in: Proceedings Tenth National Conference on Artificial Intelligence (AAAI-92), San Jose, CA, AAAI Press, San Jose, CA, 1992, pp. 183-188.
  • 12
    • A. Condon, The complexity of stochastic games, Inform. and Comput. 96 (2) (1992) 203-224.
  • 15
    • M. Drummond and J. Bresina, Anytime synthetic projection: maximizing the probability of goal satisfaction, in: Proceedings Eighth National Conference on Artificial Intelligence (AAAI-90), Boston, MA, Morgan Kaufmann, San Francisco, CA, 1990, pp. 138-144.
  • 16
    • J.N. Eagle, The optimal search for a moving target when the search path is constrained, Oper. Res. 32 (5) (1984) 1107-1115.
  • 17
    • E. Fernández-Gaucherand, A. Arapostathis and S.I. Marcus, On the average cost optimality equation and the structure of optimal policies for partially observable Markov processes, Ann. Oper. Res. 29 (1991) 471-512.
  • 31
    • S.-H. Lin and T. Dean, Generating optimal policies for high-level plans with conditional branches and loops, in: Proceedings Third European Workshop on Planning (1995) 205-218.
  • 33
    • M.L. Littman, A.R. Cassandra and L.P. Kaelbling, Learning policies for partially observable environments: scaling up, in: A. Prieditis and S. Russell (Eds.), Proceedings Twelfth International Conference on Machine Learning, Morgan Kaufmann, San Francisco, CA, 1995, pp. 362-370. Reprinted in: M.H. Huhns and M.P. Singh (Eds.), Readings in Agents, Morgan Kaufmann, San Francisco, CA, 1998.
  • 34
    • M.H. Huhns and M.P. Singh (Eds.), Readings in Agents, Morgan Kaufmann, San Francisco, CA, 1998.
  • 36
    • M.L. Littman, Algorithms for sequential decision making, Ph.D. Thesis, Department of Computer Science, Brown University, 1996; also Technical Report CS-96-09.
  • 37
    • W.S. Lovejoy, A survey of algorithmic methods for partially observable Markov decision processes, Ann. Oper. Res. 28 (1) (1991) 47-65.
  • 38
    • S.M. Majercik and M.L. Littman, MAXPLAN: a new approach to probabilistic planning, Technical Report CS-1998-01, Department of Computer Science, Duke University, Durham, NC, 1998; submitted for review.
  • 41
    • R.A. McCallum, Overcoming incomplete perception with utile distinction memory, in: Proceedings Tenth International Conference on Machine Learning, Morgan Kaufmann, Amherst, MA, 1993, pp. 190-196.
  • 42
    • R.A. McCallum, Instance-based utile distinctions for reinforcement learning with hidden state, in: Proceedings Twelfth International Conference on Machine Learning, Morgan Kaufmann, San Francisco, CA, 1995, pp. 387-395.
  • 43
    • G.E. Monahan, A survey of partially observable Markov decision processes: theory, models, and algorithms, Management Science 28 (1) (1982) 1-16.
  • 44
    • R.C. Moore, A formal theory of knowledge and action, in: J.R. Hobbs and R.C. Moore (Eds.), Formal Theories of the Commonsense World, Ablex Publishing, Norwood, NJ, 1985, pp. 319-358.
  • 49
    • L. Pryor and G. Collins, Planning for contingencies: a decision-based approach, J. Artif. Intell. Res. 4 (1996) 287-339.
  • 51
    • L.R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proc. IEEE 77 (2) (1989) 257-286.
  • 52
    • K. Sawaki and A. Ichikawa, Optimal control for partially observable Markov decision processes over an infinite horizon, J. Oper. Res. Soc. Japan 21 (1) (1978) 1-14.
  • 57
    • R.D. Smallwood and E.J. Sondik, The optimal control of partially observable Markov processes over a finite horizon, Oper. Res. 21 (1973) 1071-1088.
  • 60
    • E.J. Sondik, The optimal control of partially observable Markov processes over the infinite horizon: discounted costs, Oper. Res. 26 (2) (1978) 282-304.
  • 61
    • A. Stolcke and S. Omohundro, Hidden Markov model induction by Bayesian model merging, in: S.J. Hanson, J.D. Cowan and C.L. Giles (Eds.), Advances in Neural Information Processing Systems 5, Morgan Kaufmann, San Mateo, CA, 1993, pp. 11-18.
  • 63
    • P. Tseng, Solving H-horizon, stationary Markov decision problems in time proportional to log(H), Oper. Res. Lett. 9 (5) (1990) 287-297.
  • 64
    • C.C. White and D. Harrington, Application of Jensen's inequality for adaptive suboptimal design, J. Optim. Theory Appl. 32 (1) (1980) 89-99.
  • 65
    • C.C. White III, Partially observed Markov decision processes: a survey, Ann. Oper. Res. 32 (1991).
  • 66
    • C.C. White III and W.T. Scherer, Solution procedures for partially observed Markov decision processes, Oper. Res. 37 (5) (1989) 791-797.
  • 69
    • J. Zhao and J.H. Schmidhuber, Incremental self-improvement for life-time multi-agent reinforcement learning, in: P. Maes, M.J. Mataric, J.-A. Meyer, J. Pollack and S.W. Wilson (Eds.), From Animals to Animats: Proceedings Fourth International Conference on Simulation of Adaptive Behavior, MIT Press, Cambridge, MA, 1996, pp. 516-525.
  • 70
    • U. Zwick and M. Paterson, The complexity of mean payoff games on graphs, Theoret. Comput. Sci. 158 (1-2) (1996) 343-359.


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.