



Volume , Issue , 1995, Pages 387-395

Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State

Author keywords

[No Author keywords available]

Indexed keywords

DYNAMIC PROGRAMMING; LEARNING ALGORITHMS; REINFORCEMENT LEARNING

EID: 2342482919     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (116)

References (27)
  • 1
    • 0000963039
    • Pengi: an implementation of a theory of activity
    • [Agre and Chapman, 1987] Philip E. Agre and David Chapman. Pengi: an implementation of a theory of activity. In AAAI, pages 268-272, 1987.
    • (1987) AAAI , pp. 268-272
    • Agre, Philip E.1    Chapman, David2
  • 3
    • 0003787146
    • [Bellman, 1957] R. E. Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.
    • (1957) Dynamic Programming
    • Bellman, R. E.1
  • 4
    • 0003923091
    • [Bertsekas and Shreve, 1978] Dimitri P. Bertsekas and Steven E. Shreve. Stochastic Optimal Control. Academic Press, 1978.
    • (1978) Stochastic Optimal Control
    • Bertsekas, Dimitri P.1    Shreve, Steven E.2
  • 7
    • 0026998041
    • Reinforcement learning with perceptual aliasing: The perceptual distinctions approach
    • [Chrisman, 1992] Lonnie Chrisman. Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In Tenth National Conference on AI, 1992.
    • (1992) Tenth National Conference on AI
    • Chrisman, Lonnie1
  • 15
    • 84916521733
    • Memory-based reinforcement learning: Efficient computation with prioritized sweeping
    • [Moore and Atkeson, 1993] Andrew W. Moore and Christopher G. Atkeson. Memory-based reinforcement learning: Efficient computation with prioritized sweeping. In Advances of Neural Information Processing Systems (NIPS 5). Morgan Kaufmann Publishers, Inc., 1993.
    • (1993) Advances of Neural Information Processing Systems (NIPS 5)
    • Moore, Andrew W.1    Atkeson, Christopher G.2
  • 16
    • 33747997674
    • Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state-spaces
    • [Moore, 1991] Andrew W. Moore. Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state-spaces. Proceedings of the Eighth International Workshop on Machine Learning, pages 333-337, 1991.
    • (1991) Proceedings of the Eighth International Workshop on Machine Learning , pp. 333-337
    • Moore, Andrew W.1
  • 17
    • 0006488247
    • The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces
    • [Moore, 1993] Andrew W. Moore. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces. In Advances of Neural Information Processing Systems (NIPS 6), pages 711-718. Morgan Kaufmann, 1993.
    • (1993) Advances of Neural Information Processing Systems (NIPS 6) , pp. 711-718
    • Moore, Andrew W.1
  • 20
    • 85013571397
    • Learning probabilistic automata with variable memory length
    • [Ron et al., 1994] Dana Ron, Yoram Singer, and Naftali Tishby. Learning probabilistic automata with variable memory length. In Proceedings Computational Learning Theory. Morgan Kaufmann Publishers, Inc., 1994.
    • (1994) Proceedings Computational Learning Theory
    • Ron, Dana1    Singer, Yoram2    Tishby, Naftali3
  • 21
    • 85132026293
    • Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
    • [Sutton, 1990] Richard S. Sutton. Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning, June 1990.
    • (1990) Proceedings of the Seventh International Conference on Machine Learning
    • Sutton, Richard S.1
  • 22
    • 0003362676
    • The evolution of mental models
    • [Teller, 1994] Astro Teller. The evolution of mental models. In Kim Kinnear, editor, Advances in Genetic Programming, chapter 9. MIT Press, 1994.
    • (1994) Advances in Genetic Programming
    • Teller, Astro1
  • 24
    • 0021700041
    • Visual routines
    • [Ullman, 1984] Shimon Ullman. Visual routines. Cognition, 18:97-159, 1984.
    • (1984) Cognition , vol.18 , pp. 97-159
    • Ullman, Shimon1
  • 26
    • 0005951145
    • Finite-memory suboptimal design for partially observed Markov decision processes
    • [White and Scherer, 1994] Chelsea C. White and William T. Scherer. Finite-memory suboptimal design for partially observed Markov decision processes. Operations Research, 42:439-455, 1994.
    • (1994) Operations Research , vol.42 , pp. 439-455
    • White, Chelsea C.1    Scherer, William T.2
  • 27
    • 0002557085
    • Learning to perceive and act by trial and error
    • [Whitehead and Ballard, 1991] Steven D. Whitehead and Dana H. Ballard. Learning to perceive and act by trial and error. Machine Learning, 7(1):45-83, 1991.
    • (1991) Machine Learning , vol.7 , Issue.1 , pp. 45-83
    • Whitehead, Steven D.1    Ballard, Dana H.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.