SCOPUS Information Retrieval Platform

10th International Conference on Autonomous Agents and Multiagent Systems 2011, AAMAS 2011

Volume 2, 2011, Pages 713-720

Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction

(7) Sutton, Richard S. a; Modayil, Joseph a; Delp, Michael a; Degris, Thomas a; Pilarski, Patrick M. a; White, Adam a; Precup, Doina b

a UNIVERSITY OF ALBERTA (Canada)

b MCGILL UNIVERSITY (Canada)

Author keywords

Artificial intelligence; Knowledge representation; Off policy learning; Real time; Reinforcement learning; Robotics; Temporal difference learning; Value function approximation

Indexed keywords

ARTIFICIAL INTELLIGENCE; AUTONOMOUS AGENTS; KNOWLEDGE REPRESENTATION; MULTI AGENT SYSTEMS; ROBOTICS;

ARTIFICIAL INTELLIGENCE SYSTEMS; FUNCTION APPROXIMATION; GOAL-ORIENTED BEHAVIOR; OFF-POLICY LEARNING; REAL-TIME; REAL-TIME ARCHITECTURE; TEMPORAL DIFFERENCE LEARNING; VALUE FUNCTION APPROXIMATION;

REINFORCEMENT LEARNING;

EID: 84899464022 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited: 452

References (22)

1
- 0346859314
- A model for the encoding of experiential information
- Schank, R. C., Colby, K. M., Eds. W. H. Freeman and Company
- Becker, J. D. (1973). A model for the encoding of experiential information. In Computer Models of Thought and Language, Schank, R. C., Colby, K. M., Eds. W. H. Freeman and Company.
- (1973) Computer Models of Thought and Language
- Becker, J.D.¹

2
- 80054025121
- PhD thesis, Dutch Research School for Information and Knowledge Systems
- Chaslot, G. M. J-B. (2010). Monte-Carlo tree search. PhD thesis, Dutch Research School for Information and Knowledge Systems.
- (2010) Monte-Carlo Tree Search
- Chaslot, G.M.J.-B.¹

3
- 84867104859
- Neo: Learning conceptual knowledge by sensorimotor interaction with an environment
- Marina del Rey, CA. ACM
- Cohen, P. R., Atkin, M. S., Oates, T., Beal, C. R. (1997). Neo: Learning conceptual knowledge by sensorimotor interaction with an environment. In Agents '97, Marina del Rey, CA. ACM.
- (1997) Agents '97
- Cohen, P.R.¹ Atkin, M.S.² Oates, T.³ Beal, C.R.⁴

4
- 0001948734
- Academic Press
- Cunningham, M. (1972). Intelligence: Its Organization and Development. Academic Press.
- (1972) Intelligence: Its Organization and Development
- Cunningham, M.¹

5
- 0003977430
- MIT Press, Cambridge, MA
- Drescher, G. L. (1991). Made-Up Minds: A Constructivist Approach to Artificial Intelligence. MIT Press, Cambridge, MA.
- (1991) Made-Up Minds: A Constructivist Approach to Artificial Intelligence
- Drescher, G.L.¹

6
- 21144439055
- Learning in worlds with objects
- Kaelbling, L. P., Oates, T., Hernandez, N., Finney, S. (2001). Learning in worlds with objects. Working Notes of the AAAI Stanford Spring Symposium on Learning Grounded Representations.
- (2001) Working Notes of the AAAI Stanford Spring Symposium on Learning Grounded Representations
- Kaelbling, L.P.¹ Oates, T.² Hernandez, N.³ Finney, S.⁴

7
- 77954101982
- GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces
- Lugano, Switzerland
- Maei, H. R., Sutton, R. S. (2010). GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In Proceedings of the Third Conference on Artificial General Intelligence, Lugano, Switzerland.
- (2010) Proceedings of the Third Conference on Artificial General Intelligence
- Maei, H.R.¹ Sutton, R.S.²

8
- 79951481923
- Convergent temporal-difference learning with arbitrary smooth function approximation
- Vancouver, BC. MIT Press
- Maei, H. R., Szepesvári, Cs., Bhatnagar, S., Precup, D., Silver, D., Sutton, R. S. (2009). Convergent temporal-difference learning with arbitrary smooth function approximation. In Advances in Neural Information Processing Systems 22, Vancouver, BC. MIT Press.
- (2009) Advances in Neural Information Processing Systems , vol.22
- Maei, H.R.¹ Szepesvári, C.² Bhatnagar, S.³ Precup, D.⁴ Silver, D.⁵ Sutton, R.S.⁶

9
- 77956541799
- Toward off-policy learning control with function approximation
- Haifa, Israel
- Maei, H. R., Szepesvári, Cs., Bhatnagar, S., Sutton, R. S. (2010). Toward off-policy learning control with function approximation. In Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel.
- (2010) Proceedings of the 27th International Conference on Machine Learning
- Maei, H.R.¹ Szepesvári, C.² Bhatnagar, S.³ Sutton, R.S.⁴

10
- 33645575072
- MIT PhD thesis
- Natale, L. (2005). Linking action to perception in a humanoid robot: A developmental approach to grasping. MIT PhD thesis.
- (2005) Linking Action to Perception in a Humanoid Robot: A Developmental Approach to Grasping
- Natale, L.¹

11
- 84969135798
- A method for clustering the experiences of a mobile robot that accords with human judgments
- AAAI/MIT Press
- Oates, T., Schmill, M. D., Cohen, P. R. (2000). A method for clustering the experiences of a mobile robot that accords with human judgments. Proceedings AAAI, 846-851, AAAI/MIT Press.
- (2000) Proceedings AAAI , pp. 846-851
- Oates, T.¹ Schmill, M.D.² Cohen, P.R.³

12
- 34748875246
- Learning symbolic models of stochastic domains
- Pasula, H., Zettlemoyer, L., Kaelbling, L. (2007). Learning symbolic models of stochastic domains. Journal of Artificial Intelligence Research 29:309-352.
- (2007) Journal of Artificial Intelligence Research , vol.29 , pp. 309-352
- Pasula, H.¹ Zettlemoyer, L.² Kaelbling, L.³

13
- 0031147214
- Map learning with uninterpreted sensors and effectors
- Pierce, D. M., Kuipers, B. J. (1997). Map learning with uninterpreted sensors and effectors. Artificial Intelligence 92:169-227.
- (1997) Artificial Intelligence , vol.92 , pp. 169-227
- Pierce, D.M.¹ Kuipers, B.J.²

14
- 0031189347
- CHILD: A first step toward continual learning
- Ring, M. B. (1997). CHILD: A first step toward continual learning. Machine Learning, 28:77-104.
- (1997) Machine Learning , vol.28 , pp. 77-104
- Ring, M.B.¹

15
- 33847202724
- Learning to predict by the method of temporal differences
- Sutton, R. S. (1988). Learning to predict by the method of temporal differences. Machine Learning 3:9-44.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

16
- 85132026293
- Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
- Morgan Kaufmann, San Mateo, CA
- Sutton, R. S. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proceedings of the Seventh International Conference on Machine Learning, pp. 216-224. Morgan Kaufmann, San Mateo, CA.
- (1990) Proceedings of the Seventh International Conference on Machine Learning , pp. 216-224
- Sutton, R.S.¹

17
- 0004102479
- MIT Press
- Sutton, R. S., Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

18
- 71149099079
- Fast gradient-descent methods for temporal-difference learning with linear function approximation
- Montreal, Canada
- Sutton, R. S., Maei, H. R., Precup, D., Bhatnagar, S., Silver, D., Szepesvári, Cs., Wiewiora, E. (2009). Fast gradient-descent methods for temporal-difference learning with linear function approximation. In Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada.
- (2009) Proceedings of the 26th International Conference on Machine Learning
- Sutton, R.S.¹ Maei, H.R.² Precup, D.³ Bhatnagar, S.⁴ Silver, D.⁵ Szepesvári, C.⁶ Wiewiora, E.⁷

19
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton, R. S., Precup, D., Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112:181-211.
- (1999) Artificial Intelligence , vol.112 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.³

20
- 33749265408
- Temporal abstraction in temporal-difference networks
- Sutton, R. S., Rafols, E. J., Koop, A. (2006). Temporal abstraction in temporal-difference networks. Advances in Neural Information Processing Systems 18.
- (2006) Advances in Neural Information Processing Systems , vol.18
- Sutton, R.S.¹ Rafols, E.J.² Koop, A.³

21
- 77956513316
- A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
- Sutton, R. S., Szepesvári, Cs., Maei, H. R. (2008). A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. Advances in Neural Information Processing Systems 21.
- (2008) Advances in Neural Information Processing Systems , vol.21
- Sutton, R.S.¹ Szepesvári, C.² Maei, H.R.³

22
- 84867456688
- A multimodal learning interface for grounding spoken language in sensory perceptions
- Yu, C., Ballard, D. (2004). A multimodal learning interface for grounding spoken language in sensory perceptions. ACM Transactions on Applied Perception 1:57-80.
- (2004) ACM Transactions on Applied Perception , vol.1 , pp. 57-80
- Yu, C.¹ Ballard, D.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.