Volume 101, Issue 1-2, 1998, Pages 99-134

Planning and acting in partially observable stochastic domains

Author keywords

Partially observable Markov decision processes; Planning; Uncertainty

Indexed keywords

ALGORITHMS; MARKOV PROCESSES; OBSERVABILITY; PROBLEM SOLVING

EID: 0032073263     PISSN: 00043702     EISSN: None     Source Type: Journal    
DOI: 10.1016/s0004-3702(98)00023-x     Document Type: Article
Times cited: 3689

References (70)
  • 1
    • K.J. Aström, Optimal control of Markov decision processes with incomplete state estimation, J. Math. Anal. Appl. 10 (1965) 174-205.
  • 4
    • A.L. Blum and M.L. Furst, Fast planning through planning graph analysis, Artificial Intelligence 90 (1-2) (1997) 279-298.
  • 6
    • C. Boutilier and D. Poole, Computing optimal policies for partially observable decision processes using compact representations, in: Proceedings Thirteenth National Conference on Artificial Intelligence (AAAI-96), Portland, OR, AAAI Press/MIT Press, Menlo Park, CA, 1996, pp. 1168-1175.
  • 11
    • L. Chrisman, Reinforcement learning with perceptual aliasing: The perceptual distinctions approach, in: Proceedings Tenth National Conference on Artificial Intelligence (AAAI-92), San Jose, CA, AAAI Press, San Jose, CA, 1992, pp. 183-188.
  • 12
    • A. Condon, The complexity of stochastic games, Inform. and Comput. 96 (2) (1992) 203-224.
  • 15
    • M. Drummond and J. Bresina, Anytime synthetic projection: maximizing the probability of goal satisfaction, in: Proceedings Eighth National Conference on Artificial Intelligence (AAAI-90), Boston, MA, Morgan Kaufmann, San Francisco, CA, 1990, pp. 138-144.
  • 16
    • J.N. Eagle, The optimal search for a moving target when the search path is constrained, Oper. Res. 32 (5) (1984) 1107-1115.
  • 17
    • E. Fernández-Gaucherand, A. Arapostathis and S.I. Marcus, On the average cost optimality equation and the structure of optimal policies for partially observable Markov processes, Ann. Oper. Res. 29 (1991) 471-512.
  • 31
    • S.-H. Lin and T. Dean, Generating optimal policies for high-level plans with conditional branches and loops, in: Proceedings Third European Workshop on Planning (1995) 205-218.
  • 33
    • M.L. Littman, A.R. Cassandra and L.P. Kaelbling, Learning policies for partially observable environments: scaling up, in: A. Prieditis and S. Russell (Eds.), Proceedings Twelfth International Conference on Machine Learning, Morgan Kaufmann, San Francisco, CA, 1995, pp. 362-370. Reprinted in: M.H. Huhns and M.P. Singh (Eds.), Readings in Agents, Morgan Kaufmann, San Francisco, CA, 1998.
  • 34
    • M.H. Huhns and M.P. Singh (Eds.), Readings in Agents, Morgan Kaufmann, San Francisco, CA, 1998.
  • 36
    • M.L. Littman, Algorithms for sequential decision making, Ph.D. Thesis, Department of Computer Science, Brown University, 1996; also Technical Report CS-96-09.
  • 37
    • W.S. Lovejoy, A survey of algorithmic methods for partially observable Markov decision processes, Ann. Oper. Res. 28 (1) (1991) 47-65.
  • 38
    • S.M. Majercik and M.L. Littman, MAXPLAN: a new approach to probabilistic planning, Technical Report CS-1998-01, Department of Computer Science, Duke University, Durham, NC, 1998; submitted for review.
  • 41
    • R.A. McCallum, Overcoming incomplete perception with utile distinction memory, in: Proceedings Tenth International Conference on Machine Learning, Morgan Kaufmann, Amherst, MA, 1993, pp. 190-196.
  • 42
    • R.A. McCallum, Instance-based utile distinctions for reinforcement learning with hidden state, in: Proceedings Twelfth International Conference on Machine Learning, Morgan Kaufmann, San Francisco, CA, 1995, pp. 387-395.
  • 43
    • G.E. Monahan, A survey of partially observable Markov decision processes: theory, models, and algorithms, Management Science 28 (1) (1982) 1-16.
  • 44
    • R.C. Moore, A formal theory of knowledge and action, in: J.R. Hobbs and R.C. Moore (Eds.), Formal Theories of the Commonsense World, Ablex Publishing, Norwood, NJ, 1985, pp. 319-358.
  • 49
    • L. Pryor and G. Collins, Planning for contingencies: a decision-based approach, J. Artif. Intell. Res. 4 (1996) 287-339.
  • 51
    • L.R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proc. IEEE 77 (2) (1989) 257-286.
  • 52
    • K. Sawaki and A. Ichikawa, Optimal control for partially observable Markov decision processes over an infinite horizon, J. Oper. Res. Soc. Japan 21 (1) (1978) 1-14.
  • 57
    • R.D. Smallwood and E.J. Sondik, The optimal control of partially observable Markov processes over a finite horizon, Oper. Res. 21 (1973) 1071-1088.
  • 60
    • E.J. Sondik, The optimal control of partially observable Markov processes over the infinite horizon: discounted costs, Oper. Res. 26 (2) (1978) 282-304.
  • 61
    • A. Stolcke and S. Omohundro, Hidden Markov model induction by Bayesian model merging, in: S.J. Hanson, J.D. Cowan and C.L. Giles (Eds.), Advances in Neural Information Processing Systems 5, Morgan Kaufmann, San Mateo, CA, 1993, pp. 11-18.
  • 63
    • P. Tseng, Solving H-horizon, stationary Markov decision problems in time proportional to log(H), Oper. Res. Lett. 9 (5) (1990) 287-297.
  • 64
    • C.C. White and D. Harrington, Application of Jensen's inequality for adaptive suboptimal design, J. Optim. Theory Appl. 32 (1) (1980) 89-99.
  • 65
    • C.C. White III, Partially observed Markov decision processes: a survey, Ann. Oper. Res. 32 (1991).
  • 66
    • C.C. White III and W.T. Scherer, Solution procedures for partially observed Markov decision processes, Oper. Res. 37 (5) (1989) 791-797.
  • 69
    • J. Zhao and J.H. Schmidhuber, Incremental self-improvement for life-time multi-agent reinforcement learning, in: P. Maes, M.J. Mataric, J.-A. Meyer, J. Pollack and S.W. Wilson (Eds.), From Animals to Animats: Proceedings Fourth International Conference on Simulation of Adaptive Behavior, MIT Press, Cambridge, MA, 1996, pp. 516-525.
  • 70
    • U. Zwick and M. Paterson, The complexity of mean payoff games on graphs, Theoret. Comput. Sci. 158 (1-2) (1996) 343-359.


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.