



Volume 2, 2011, Pages 713-720

Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction

Author keywords

Artificial intelligence; Knowledge representation; Off policy learning; Real time; Reinforcement learning; Robotics; Temporal difference learning; Value function approximation

Indexed keywords

ARTIFICIAL INTELLIGENCE; AUTONOMOUS AGENTS; KNOWLEDGE REPRESENTATION; MULTI AGENT SYSTEMS; ROBOTICS

EID: 84899464022     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 452

References (22)
  • 1
    • 0346859314
    • A model for the encoding of experiential information
    • Schank, R. C., Colby, K. M., Eds. W. H. Freeman and Company
    • Becker, J. D. (1973). A model for the encoding of experiential information. In Computer Models of Thought and Language, Schank, R. C., Colby, K. M., Eds. W. H. Freeman and Company.
    • (1973) Computer Models of Thought and Language
    • Becker, J.D.1
  • 2
    • 80054025121
    • PhD thesis, Dutch Research School for Information and Knowledge Systems
    • Chaslot, G. M. J.-B. (2010). Monte-Carlo tree search. PhD thesis, Dutch Research School for Information and Knowledge Systems.
    • (2010) Monte-Carlo Tree Search
    • Chaslot, G.M.J.-B.1
  • 3
    • 84867104859
    • Neo: Learning conceptual knowledge by sensorimotor interaction with an environment
    • Marina del Rey, CA. ACM
    • Cohen, P. R., Atkin, M. S., Oates, T., Beal, C. R. (1997). Neo: Learning conceptual knowledge by sensorimotor interaction with an environment. In Agents '97, Marina del Rey, CA. ACM.
    • (1997) Agents '97
    • Cohen, P.R.1    Atkin, M.S.2    Oates, T.3    Beal, C.R.4
  • 7
    • 77954101982
    • GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces
    • Lugano, Switzerland
    • Maei, H. R., Sutton, R. S. (2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In Proceedings of the Third Conference on Artificial General Intelligence, Lugano, Switzerland.
    • (2010) Proceedings of the Third Conference on Artificial General Intelligence
    • Maei, H.R.1    Sutton, R.S.2
  • 11
    • 84969135798
    • A method for clustering the experiences of a mobile robot that accords with human judgments
    • AAAI/MIT Press
    • Oates, T., Schmill, M. D., Cohen, P. R. (2000). A method for clustering the experiences of a mobile robot that accords with human judgments. Proceedings AAAI, 846-851, AAAI/MIT Press.
    • (2000) Proceedings AAAI , pp. 846-851
    • Oates, T.1    Schmill, M.D.2    Cohen, P.R.3
  • 13
    • 0031147214
    • Map learning with uninterpreted sensors and effectors
    • Pierce, D. M., Kuipers, B. J. (1997). Map learning with uninterpreted sensors and effectors. Artificial Intelligence 92:169-227.
    • (1997) Artificial Intelligence , vol.92 , pp. 169-227
    • Pierce, D.M.1    Kuipers, B.J.2
  • 14
    • 0031189347
    • CHILD: A first step toward continual learning
    • Ring, M. B. (1997). CHILD: A first step toward continual learning. Machine Learning, 28:77-104.
    • (1997) Machine Learning , vol.28 , pp. 77-104
    • Ring, M.B.1
  • 15
    • 33847202724
    • Learning to predict by the method of temporal differences
    • Sutton, R. S. (1988). Learning to predict by the method of temporal differences. Machine Learning 3:9-44.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 16
    • 85132026293
    • Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
    • Morgan Kaufmann, San Mateo, CA
    • Sutton, R. S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning, pp. 216-224. Morgan Kaufmann, San Mateo, CA.
    • (1990) Proceedings of the Seventh International Conference on Machine Learning , pp. 216-224
    • Sutton, R.S.1
  • 19
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton, R. S., Precup, D., Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112:181-211.
    • (1999) Artificial Intelligence , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3
  • 21
    • 77956513316
    • A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
    • Sutton, R. S., Szepesvári, Cs., Maei, H. R. (2008). A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. Advances in Neural Information Processing Systems 21.
    • (2008) Advances in Neural Information Processing Systems , vol.21
    • Sutton, R.S.1    Szepesvári, C.2    Maei, H.R.3
  • 22
    • 84867456688
    • A multimodal learning interface for grounding spoken language in sensory perceptions
    • Yu, C., Ballard, D. (2004). A multimodal learning interface for grounding spoken language in sensory perceptions. ACM Transactions on Applied Perception 1:57-80.
    • (2004) ACM Transactions on Applied Perception , vol.1 , pp. 57-80
    • Yu, C.1    Ballard, D.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.