2004, Pages 359-380

Supervised actor-critic reinforcement learning

Author keywords

Data structures; Learning; Optimization; Robots; Supervised learning; Training

Indexed keywords

DATA STRUCTURES; OPTIMIZATION; PERSONNEL TRAINING; ROBOTS; STOCHASTIC SYSTEMS; SUPERVISED LEARNING; SUPERVISORY PERSONNEL

EID: 84979715630    PISSN: None    EISSN: None    Source Type: Book
DOI: 10.1109/9780470544785.ch14    Document Type: Chapter
Times cited: 137
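
Since this record does not reproduce the chapter itself, the following is a minimal illustrative sketch of the kind of method the title and keywords suggest: an actor-critic learner whose action choices are blended with those of a hand-coded supervisor, with the critic trained by temporal-difference learning (cf. Sutton, ref. 30). The toy environment, the blending probability k, and every identifier below are assumptions made for illustration only, not details taken from the chapter.

```python
import numpy as np

# Illustrative sketch only: a tabular actor-critic on a toy 5-state chain,
# where a fraction of actions is taken from a hand-coded supervisor.
# The environment, the gain k, and all names here are assumptions for this
# example; they are not taken from the chapter itself.

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2               # chain: move left (0) or right (1)
V = np.zeros(n_states)                   # critic: state-value estimates
prefs = np.zeros((n_states, n_actions))  # actor: action preferences
alpha_v, alpha_p, gamma, k = 0.1, 0.1, 0.9, 0.5  # k = prob. of supervisor action

def supervisor(s):
    # Hand-coded "teacher": always move right, toward the goal.
    return 1

def actor(s):
    # Softmax over learned preferences, which also provides exploration.
    p = np.exp(prefs[s] - prefs[s].max())
    p /= p.sum()
    return rng.choice(n_actions, p=p)

for episode in range(200):
    s = 0
    while s < n_states - 1:
        # Blend control: with probability k follow the supervisor,
        # otherwise follow the learned actor.
        a = supervisor(s) if rng.random() < k else actor(s)
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Critic: TD(0) update (cf. Sutton 1988, ref. 30).
        target = r + (gamma * V[s_next] if s_next < n_states - 1 else 0.0)
        delta = target - V[s]
        V[s] += alpha_v * delta
        # Actor: reinforce the taken action in proportion to the TD error.
        prefs[s, a] += alpha_p * delta
        s = s_next

print("learned state values:", np.round(V, 2))
```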

References (33)
  • 2   A. G. Barto, Reinforcement learning in motor control, in M. A. Arbib (ed.), The Handbook of Brain Theory and Neural Networks, Second Edition, pp. 968-972, MIT Press, Cambridge, MA, 2003.
  • 4   H. Benbrahim and J. A. Franklin, Biped dynamic walking using reinforcement learning, Robotics and Autonomous Systems, vol. 22, pp. 283-302, 1997.
  • 10  M. Dorigo and M. Colombetti, Robot shaping: Developing autonomous agents through learning, Artificial Intelligence, vol. 71, no. 2, pp. 321-370, 1994.
  • 11  V. Gullapalli, A stochastic reinforcement learning algorithm for learning real-valued functions, Neural Networks, vol. 3, no. 6, pp. 671-692, 1990.
  • 12  M. Huber and R. A. Grupen, A feedback control structure for on-line learning tasks, Robotics and Autonomous Systems, vol. 22, no. 3-4, pp. 303-315, 1997.
  • 13  M. I. Jordan and D. E. Rumelhart, Forward models: Supervised learning with a distal teacher, Cognitive Science, vol. 16, no. 3, pp. 307-354, 1992.
  • 16  L.-J. Lin, Self-improving reactive agents based on reinforcement learning, planning and teaching, Machine Learning, vol. 8, no. 3-4, pp. 293-321, 1992.
  • 17  R. Maclin and J. W. Shavlik, Creating advice-taking reinforcement learners, Machine Learning, vol. 22, no. 1-3, pp. 251-281, 1996.
  • 19  M. J. Mataric, Sensory-motor primitives as a basis for imitation: Linking perception to action and biology to robotics, in C. Nehaniv and K. Dautenhahn (eds.), Imitation in Animals and Artifacts, MIT Press, Cambridge, MA, 2000.
  • 20  A. Y. Ng, D. Harada, and S. Russell, Policy invariance under reward transformations: Theory and applications to reward shaping, Proc. 16th International Conference on Machine Learning, pp. 278-287, Morgan Kaufmann, San Francisco, CA, 1999.
  • 21  T. J. Perkins and A. G. Barto, Lyapunov-constrained action sets for reinforcement learning, in C. Brodley and A. Danyluk (eds.), Proc. 18th International Conference on Machine Learning, pp. 409-416, Morgan Kaufmann, San Francisco, CA, 2001.
  • 23  B. Price and C. Boutilier, Implicit imitation in multiagent reinforcement learning, in I. Bratko and S. Dzeroski (eds.), Proc. 16th International Conference on Machine Learning, pp. 325-334, Morgan Kaufmann, San Francisco, CA, 1999.
  • 24  J. C. Santamaria, R. S. Sutton, and A. Ram, Experiments with reinforcement learning in problems with continuous state and action spaces, Adaptive Behavior, vol. 6, pp. 163-217, 1997.
  • 25  S. Schaal, Learning from demonstration, in M. C. Mozer, M. I. Jordan, and T. Petsche (eds.), Advances in Neural Information Processing Systems 9, pp. 1040-1046, MIT Press, Cambridge, MA, 1997.
  • 26  S. Schaal, Is imitation learning the route to humanoid robots? Trends in Cognitive Science, vol. 3, pp. 233-242, 1999.
  • 27  J. S. Shamma, Linearization and gain-scheduling, in W. S. Levine (ed.), The Control Handbook, pp. 388-396, CRC Press, Boca Raton, FL, 1996.
  • 30  R. S. Sutton, Learning to predict by the method of temporal differences, Machine Learning, vol. 3, pp. 9-44, 1988.
  • 33  R. J. Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning, vol. 8, pp. 229-256, 1992.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.