SCOPUS Information Retrieval Platform

Adaptive Behavior

Volume 22, Issue 2, 2014, Pages 146-160

Multi-timescale nexting in a reinforcement learning robot

Authors (3): Modayil, Joseph (a); White, Adam (a); Sutton, Richard S. (a)

(a) University of Alberta (Canada)

Author keywords

predictive knowledge; Reinforcement learning; robotics; temporal difference learning

EID: 84896357393 PISSN: 10597123 EISSN: 17412633 Source Type: Journal
DOI: 10.1177/1059712313511648 Document Type: Article

Times cited: 73

References (54)

1
- 0346859314
- Schank R. C., Colby K. M., eds. San Francisco, CA: W. H. Freeman and Company
- Becker J. D. Computer models of thought and language. Schank R. C., Colby K. M., eds. San Francisco, CA: W. H. Freeman and Company; 1973:396-434.
- (1973) Computer Models of Thought and Language, pp. 396-434
- Becker, J.D.¹

2
- 80052249260
- Closing the learning-planning loop with predictive state representations
- Boots B., Siddiqi S. M., Gordon G. J. Closing the learning-planning loop with predictive state representations. International Journal of Robotics Research. 2011;30(7):954-966.
- (2011) International Journal of Robotics Research, vol. 30, issue 7, pp. 954-966
- Boots, B.¹ Siddiqi, S.M.² Gordon, G.J.³

3
- 70349505264
- New York, NY: Wiley
- Box G. E., Jenkins G. M., Reinsel G. C. Time series analysis: Forecasting and control. New York, NY: Wiley; 2011.
- (2011) Time Series Analysis: Forecasting and Control
- Box, G.E.¹ Jenkins, G.M.² Reinsel, G.C.³

4
- 0038862801
- Sensory pre-conditioning
- Brogden W. Sensory pre-conditioning. Journal of Experimental Psychology. 1939;25(4):323-332.
- (1939) Journal of Experimental Psychology, vol. 25, issue 4, pp. 323-332
- Brogden, W.¹

5
- 4344689518
- New York, NY: Springer
- Butz M., Sigaud O., Gérard P. Anticipatory behaviour in adaptive learning systems: Foundations, theories, and systems. New York, NY: Springer; 2003.
- (2003) Anticipatory Behaviour in Adaptive Learning Systems: Foundations, Theories, and Systems
- Butz, M.¹ Sigaud, O.² Gérard, P.³

6
- 0003517858
- New York, NY: Springer
- Camacho E. F., Bordons C. Model predictive control. New York, NY: Springer; 2004.
- (2004) Model Predictive Control
- Camacho, E.F.¹ Bordons, C.²

7
- 0033855135
- Tickling expectations: Neural processing in anticipation of a sensory stimulus
- Carlsson K., Petrovic P., Skare S., Petersson K., Ingvar M. Tickling expectations: Neural processing in anticipation of a sensory stimulus. Journal of Cognitive Neuroscience. 2000;12(4):691-703.
- (2000) Journal of Cognitive Neuroscience, vol. 12, issue 4, pp. 691-703
- Carlsson, K.¹ Petrovic, P.² Skare, S.³ Petersson, K.⁴ Ingvar, M.⁵

8
- 84872566721
- Whatever next? Predictive brains, situated agents, and the future of cognitive science
- Clark A. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences. 2013;36(3):181-204.
- (2013) Behavioral and Brain Sciences, vol. 36, issue 3, pp. 181-204
- Clark, A.¹

9
- 0001948734
- New York, NY: Academic Press
- Cunningham M. Intelligence: Its organization and development. New York, NY: Academic Press; 1972.
- (1972) Intelligence: Its Organization and Development
- Cunningham, M.¹

10
- 0001234682
- Feudal reinforcement learning
- Dayan P., Hinton G. Feudal reinforcement learning. Advances in Neural Information Processing Systems. 1993;5:271-278.
- (1993) Advances in Neural Information Processing Systems, vol. 5, pp. 271-278
- Dayan, P.¹ Hinton, G.²

11
- 84869424969
- Model-free reinforcement learning with continuous action in practice
- Proceedings of the American Control Conference; 2177
- Degris T., Pilarski P. M., Sutton R. S. Model-free reinforcement learning with continuous action in practice. Proceedings of the American Control Conference; 2012:2177.
- (2012)
- Degris, T.¹ Pilarski, P.M.² Sutton, R.S.³

12
- 0003977430
- Cambridge, MA: MIT Press
- Drescher G. L. Made-up minds: A constructivist approach to artificial intelligence. Cambridge, MA: MIT Press; 1991.
- (1991) Made-up Minds: A Constructivist Approach to Artificial Intelligence
- Drescher, G.L.¹

13
- 33745621842
- New York: Knopf Press
- Gilbert D. Stumbling on happiness. New York: Knopf Press; 2006.
- (2006) Stumbling on Happiness
- Gilbert, D.¹

14
- 6344257187
- The emulation theory of representation: Motor control, imagery, and perception
- Grush R. The emulation theory of representation: Motor control, imagery, and perception. Behavioral and Brain Sciences. 2004;27:377-442.
- (2004) Behavioral and Brain Sciences, vol. 27, pp. 377-442
- Grush, R.¹

15
- 20844454983
- New York: Times Books
- Hawkins J., Blakeslee S. On intelligence. New York: Times Books; 2004.
- (2004) On Intelligence
- Hawkins, J.¹ Blakeslee, S.²

16
- 34250782801
- New York, NY: MIT Press
- Huron D. Sweet anticipation: Music and the psychology of expectation. New York, NY: MIT Press; 2006.
- (2006) Sweet Anticipation: Music and the Psychology of Expectation
- Huron, D.¹

17
- 0042545768
- Learning to achieve goals
- Proceedings of International Joint Conference on Artificial Intelligence; 1094
- Kaelbling L. Learning to achieve goals. Proceedings of International Joint Conference on Artificial Intelligence; 1993:1094.
- (1993)
- Kaelbling, L.¹

18
- 77952010176
- Cambridge: Cambridge University Press
- LaValle S. M. Planning algorithms. Cambridge: Cambridge University Press; 2006.
- (2006) Planning Algorithms
- LaValle, S.M.¹

19
- 36448971084
- New York: Dutton Books
- Levitin D. This is your brain on music. New York: Dutton Books; 2006.
- (2006) This is Your Brain on Music
- Levitin, D.¹

20
- 84898982129
- Predictive representations of state
- Littman M. L., Sutton R. S., Singh S. Predictive representations of state. Advances in Neural Information Processing Systems. 2002;14:1555-1561.
- (2002) Advances in Neural Information Processing Systems, vol. 14, pp. 1555-1561
- Littman, M.L.¹ Sutton, R.S.² Singh, S.³

21
- 0003473124
- Englewood Cliffs, NJ: Prentice-Hall
- Ljung L. System identification: Theory for the user. Englewood Cliffs, NJ: Prentice-Hall; 1998.
- (1998) System Identification: Theory for the User
- Ljung, L.¹

22
- 84864655352
- PhD Thesis, University of Alberta, Canada
- Maei H. R. Gradient temporal-difference learning algorithms. PhD Thesis, University of Alberta, Canada; 2011.
- (2011) Gradient temporal-difference learning algorithms
- Maei, H.R.¹

23
- 77954101982
- GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces
- Proceedings of the Third Conference on Artificial General Intelligence; 91
- Maei H., Sutton R. S. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. Proceedings of the Third Conference on Artificial General Intelligence; 2010:91.
- (2010)
- Maei, H.¹ Sutton, R.S.²

24
- 80051891791
- Google cars drive themselves, in traffic
- Markoff J. Google cars drive themselves, in traffic. The New York Times. 2010:A1.
- (2010) The New York Times
- Markoff, J.¹

25
- 84866006400
- Multi-timescale nexting in a reinforcement learning robot
- From Animals to Animats 12: 12th International Conference on Simulation of Adaptive Behavior; 299
- Modayil J., White A., Sutton R. S. Multi-timescale nexting in a reinforcement learning robot. From Animals to Animats 12: 12th International Conference on Simulation of Adaptive Behavior; 2012:299.
- (2012)
- Modayil, J.¹ White, A.² Sutton, R.S.³

26
- 0342721206
- A method for clustering the experiences of a mobile robot that accords with human judgments
- Proceedings of the Seventeenth Conference of the Association for the Advancement of Artificial Intelligence; 846
- Oates T., Schmill M. D., Cohen P. R. A method for clustering the experiences of a mobile robot that accords with human judgments. Proceedings of the Seventeenth Conference of the Association for the Advancement of Artificial Intelligence; 2000:846.
- (2000)
- Oates, T.¹ Schmill, M.D.² Cohen, P.R.³

27
- 34047267520
- Intrinsic motivation systems for autonomous mental development
- Oudeyer P.-Y., Kaplan F., Hafner V. Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation. 2007;11(2):265-286.
- (2007) IEEE Transactions on Evolutionary Computation, vol. 11, issue 2, pp. 265-286
- Oudeyer, P.-Y.¹ Kaplan, F.² Hafner, V.³

28
- 0003777045
- Oxford: Oxford University Press
- Pavlov I. Conditioned reflexes: An investigation of the physiological activity of the cerebral cortex. Oxford: Oxford University Press; 1927.
- (1927) Conditioned Reflexes: An Investigation of the Physiological Activity of the Cerebral Cortex
- Pavlov, I.¹

29
- 40649106649
- Natural actor-critic
- Peters J., Schaal S. Natural actor-critic. Neurocomputing. 2008;71(7):1180-1190.
- (2008) Neurocomputing, vol. 71, issue 7, pp. 1180-1190
- Peters, J.¹ Schaal, S.²

30
- 44349151557
- Coordinating with the future: The anticipatory nature of representation
- Pezzulo G. Coordinating with the future: The anticipatory nature of representation. Minds and Machines. 2008;18(2):179-225.
- (2008) Minds and Machines, vol. 18, issue 2, pp. 179-225
- Pezzulo, G.¹

31
- 0031147214
- Map learning with uninterpreted sensors and effectors
- Pierce D., Kuipers B. J. Map learning with uninterpreted sensors and effectors. Artificial Intelligence. 1997;92(1):169-227.
- (1997) Artificial Intelligence, vol. 92, issue 1, pp. 169-227
- Pierce, D.¹ Kuipers, B.J.²

32
- 0019039111
- Simultaneous and successive associations in sensory preconditioning
- Rescorla R. Simultaneous and successive associations in sensory preconditioning. Journal of Experimental Psychology: Animal Behavior Processes. 1980;6(3):207-216.
- (1980) Journal of Experimental Psychology: Animal Behavior Processes, vol. 6, issue 3, pp. 207-216
- Rescorla, R.¹

33
- 0026962175
- Reinforcement learning with a hierarchy of abstract models
- Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence; 202
- Singh S. Reinforcement learning with a hierarchy of abstract models. Proceedings of the Conference of the Association for the Advancement of Artificial Intelligence; 1992:202.
- (1992)
- Singh, S.¹

34
- 31844457132
- Predictive state representations: A new theory for modeling dynamical systems
- Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence; 512
- Singh S., James M. R., Rudary M. R. Predictive state representations: A new theory for modeling dynamical systems. Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence; 2004:512.
- (2004)
- Singh, S.¹ James, M.R.² Rudary, M.R.³

35
- 33847202724
- Learning to predict by the method of temporal differences
- Sutton R. S. Learning to predict by the method of temporal differences. Machine Learning. 1988;3:9-44.
- (1988) Machine Learning, vol. 3, pp. 9-44
- Sutton, R.S.¹

36
- 85132026293
- Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
- Proceedings of the Seventh International Conference on Machine Learning; 216
- Sutton R. S. Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. Proceedings of the Seventh International Conference on Machine Learning; 1990:216.
- (1990)
- Sutton, R.S.¹

37
- 84896385986
- TD models: Modeling the world at a mixture of time scales
- Proceedings of the International Conference on Machine Learning; 531
- Sutton R. S. TD models: Modeling the world at a mixture of time scales. Proceedings of the International Conference on Machine Learning; 1995:531.
- (1995)
- Sutton, R.S.¹

38
- 84896334746
- The grand challenge of predictive empirical abstract knowledge
- Working Notes of the IJCAI-09 Workshop on Grand Challenges for Reasoning from Experiences
- Sutton R. S. The grand challenge of predictive empirical abstract knowledge. Working Notes of the IJCAI-09 Workshop on Grand Challenges for Reasoning from Experiences; 2009.
- (2009)
- Sutton, R.S.¹

39
- 84864841464
- Beyond reward: The problem of knowledge and data
- Proceedings of the 21st International Conference on Inductive Logic Programming; 2
- Sutton R. S. Beyond reward: The problem of knowledge and data. Proceedings of the 21st International Conference on Inductive Logic Programming; 2012:2.
- (2012)
- Sutton, R.S.¹

40
- 0003066891
- Gabriel M., Moore J., eds. Cambridge, MA: MIT Press
- Sutton R. S., Barto A. G. Learning and computational neuroscience: Foundations of adaptive networks. Gabriel M., Moore J., eds. Cambridge, MA: MIT Press; 1990:497-537.
- (1990) Learning and Computational Neuroscience: Foundations of Adaptive Networks, pp. 497-537
- Sutton, R.S.¹ Barto, A.G.²

41
- 0004102479
- Cambridge, MA: MIT Press
- Sutton R. S., Barto A. G. Reinforcement learning: An introduction. Cambridge, MA: MIT Press; 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

42
- 84864885776
- Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction
- Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems; 761
- Sutton R. S., Modayil J., Delp M., Degris T., Pilarski P. M., White A., Precup D. Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction. Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems; 2011:761.
- (2011)
- Sutton, R.S.¹ Modayil, J.² Delp, M.³ Degris, T.⁴ Pilarski, P.M.⁵ White, A.⁶ Precup, D.⁷

43
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton R. S., Precup D., Singh S. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence. 1999;112:181-211.
- (1999) Artificial Intelligence, vol. 112, pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.³

44
- 77956513316
- A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
- Sutton R. S., Szepesvári Cs., Maei H. R. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. Advances in Neural Information Processing Systems 21. 2009.
- (2009) Advances in Neural Information Processing Systems 21
- Sutton, R.S.¹ Szepesvári, C.² Maei, H.R.³

45
- 71149099079
- Fast gradient-descent methods for temporal-difference learning with linear function approximation
- Proceedings of the 26th International Conference on Machine Learning; Montreal, Canada; 993
- Sutton R. S., Maei H. R., Precup D., Bhatnagar S., Silver D., Szepesvari Cs., Wiewiora E. Fast gradient-descent methods for temporal-difference learning with linear function approximation. Proceedings of the 26th International Conference on Machine Learning; Montreal, Canada; 2009:993.
- (2009)
- Sutton, R.S.¹ Maei, H.R.² Precup, D.³ Bhatnagar, S.⁴ Silver, D.⁵ Szepesvari, C.⁶ Wiewiora, E.⁷

46
- 84899003536
- Temporal-difference networks
- Sutton R. S., Tanner B. Temporal-difference networks. Advances in Neural Information Processing Systems 17. 2005:1377-1384.
- (2005) Advances in Neural Information Processing Systems 17, pp. 1377-1384
- Sutton, R.S.¹ Tanner, B.²

47
- 14044262287
- Stochastic policy gradient reinforcement learning on a simple 3D biped
- Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems; 2849
- Tedrake R., Zhang T., Seung H. Stochastic policy gradient reinforcement learning on a simple 3D biped. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems; 2005:2849.
- (2005)
- Tedrake, R.¹ Zhang, T.² Seung, H.³

48
- 33750024797
- Stanley: The robot that won the DARPA grand challenge
- Thrun S., Montemerlo M., et al. Stanley: The robot that won the DARPA grand challenge. Journal of Field Robotics. 2006;23(9):661-692.
- (2006) Journal of Field Robotics, vol. 23, issue 9, pp. 661-692
- Thrun, S.¹ Montemerlo, M.²

49
- 0003649763
- Berkeley, CA: University of California Press
- Tolman E. C. Purposive behavior in animals and men. Berkeley, CA: University of California Press; 1951.
- (1951) Purposive Behavior in Animals and Men
- Tolman, E.C.¹

50
- 0344876542
- Online simultaneous localization and mapping with detection and tracking of moving objects: Theory and results from a ground vehicle in crowded urban areas
- Proceedings of the IEEE International Conference on Robotics and Automation; 842
- Wang C. C., Thorpe C., Thrun S. Online simultaneous localization and mapping with detection and tracking of moving objects: Theory and results from a ground vehicle in crowded urban areas. Proceedings of the IEEE International Conference on Robotics and Automation; 2003:842.
- (2003)
- Wang, C.C.¹ Thorpe, C.² Thrun, S.³

51
- 79955750805
- Chapel Hill, NC: Computer Science Department, University of North Carolina
- Welch G., Bishop G. An Introduction to the Kalman filter. Chapel Hill, NC: Computer Science Department, University of North Carolina; 1995.
- (1995) An Introduction to the Kalman Filter
- Welch, G.¹ Bishop, G.²

52
- 84872849054
- Scaling life-long off-policy learning
- Proceedings of the Second Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics
- White A., Modayil J., Sutton R. S. Scaling life-long off-policy learning. Proceedings of the Second Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics; 2012.
- (2012)
- White, A.¹ Modayil, J.² Sutton, R.S.³

53
- 0028799979
- An internal model for sensori-motor integration
- Wolpert D., Ghahramani Z., Jordan M. An internal model for sensori-motor integration. Science. 1995;269(5232):1880-1882.
- (1995) Science, vol. 269, issue 5232, pp. 1880-1882
- Wolpert, D.¹ Ghahramani, Z.² Jordan, M.³

54
- 57149090913
- Emergence of functional hierarchy in a multiple timescale neural network model: A humanoid robot experiment
- Yamashita Y., Tani J. Emergence of functional hierarchy in a multiple timescale neural network model: A humanoid robot experiment. PLOS: Computational Biology. 2008;4(11):1-18.
- (2008) PLOS: Computational Biology, vol. 4, issue 11, pp. 1-18
- Yamashita, Y.¹ Tani, J.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.