



Volume , Issue , 1995, Pages 362-370

Learning policies for partially observable environments: Scaling up

Author keywords

[No Author keywords available]

Indexed keywords

ROBOTS;

EID: 85138579181     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (490)

References (27)
  • 1
    • Astrom, K. J. (1965). Optimal control of Markov decision processes with incomplete state estimation. J. Math. Anal. Appl., 10:174-205.
  • 7
    • Chrisman, L. (1992). Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In Proc. Tenth National Conference on AI (AAAI).
  • 8
    • Connell, J. and Mahadevan, S. (1993). Rapid task learning for real robots. In Robot Learning. Kluwer Academic Publishers.
  • 9
    • Jaakkola, T., Jordan, M. I., and Singh, S. P. (1994). On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6).
  • 10
    • Kushmerick, N., Hanks, S., and Weld, D. (1993). An Algorithm for Probabilistic Planning. Technical Report 93-06-03, University of Washington Department of Computer Science and Engineering. To appear in Artificial Intelligence.
  • 13
    • Lovejoy, W. S. (1991). A survey of algorithmic methods for partially observable Markov decision processes. Annals of Operations Research, 28:47-66.
  • 15
    • Moore, A. W. (1994). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces. In Advances in Neural Information Processing Systems 6, San Mateo, CA. Morgan Kaufmann.
  • 21
    • Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning internal representations by error backpropagation. In Rumelhart, D. E. and McClelland, J. L., editors, Parallel Distributed Processing: Explorations in the Microstructures of Cognition. Volume 1: Foundations, chapter 8. The MIT Press, Cambridge, MA.
  • 23
    • Smallwood, R. D. and Sondik, E. J. (1973). The optimal control of partially observable Markov processes over a finite horizon. Operations Research, 21:1071-1088.
  • 24
    • Sondik, E. J. (1978). The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs. Operations Research, 26(2).
  • 25
    • Tsitsiklis, J. N. (1994). Asynchronous stochastic approximation and Q-learning. Machine Learning, 16(3).


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.