SCOPUS Information Search Platform

Advances in Neural Information Processing Systems

Volume , Issue , 2013, Pages

Projected natural actor-critic

(4) Thomas, Philip S. (a); Dabney, William (a); Mahadevan, Sridhar (a); Giguere, Stephen (a)

(a) Biologically Inspired Neural and Dynamical Systems Laboratory (United States)

Author keywords

[No Author keywords available]

Indexed keywords

MARKOV PROCESSES; REINFORCEMENT LEARNING;

ACTOR CRITIC; ACTOR-CRITIC ALGORITHM; CONSTRAINED DOMAIN; MARKOV DECISION PROCESSES; NATURAL GRADIENT; OPTIMAL POLICIES; POLICY SEARCH; SAFETY GUARANTEES;

ALGORITHMS;

EID: 84899017702
PISSN: 10495258
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (23)

References (32)

1
- 0000396062
- Natural gradient works efficiently in learning
- S. Amari. Natural gradient works efficiently in learning. Neural Computation, 10:251-276, 1998.
- (1998) Neural Computation , vol.10 , pp. 251-276
- Amari, S.¹

2
- 0003706925
- ISA: The Instrumentation, Systems, and Automation Society
- K. J. Åström and T. Hägglund. PID Controllers: Theory, Design, and Tuning. ISA: The Instrumentation, Systems, and Automation Society, 1995.
- (1995) PID Controllers: Theory, Design, and Tuning
- Åström, K.J.¹ Hägglund, T.²

3
- 0037211015
- Fast calculation of stabilizing PID controllers
- M. T. Söylemez, N. Munro, and H. Baki. Fast calculation of stabilizing PID controllers. Automatica, 39 (1):121-126, 2003.
- (2003) Automatica , vol.39 , Issue.1 , pp. 121-126
- Söylemez, M.T.¹ Munro, N.² Baki, H.³

4
- 41949097301
- Functional electrical stimulation
- C. L. Lynch and M. R. Popovic. Functional electrical stimulation. In IEEE Control Systems Magazine, volume 28, pages 40-50.
- IEEE Control Systems Magazine , vol.28 , pp. 40-50
- Lynch, C.L.¹ Popovic, M.R.²

5
- 67149094917
- A real-time 3-D musculoskeletal model for dynamic simulation of arm movements
- E. K. Chadwick, D. Blana, A. J. van den Bogert, and R. F. Kirsch. A real-time 3-D musculoskeletal model for dynamic simulation of arm movements. In IEEE Transactions on Biomedical Engineering, volume 56, pages 941-948, 2009.
- (2009) IEEE Transactions on Biomedical Engineering , vol.56 , pp. 941-948
- Chadwick, E.K.¹ Blana, D.² van den Bogert, A.J.³ Kirsch, R.F.⁴

6
- 74949105119
- A proportional derivative FES controller for planar arm movement
- Philadelphia, PA
- K. Jagodnik and A. van den Bogert. A proportional derivative FES controller for planar arm movement. In 12th Annual Conference International FES Society, Philadelphia, PA, 2007.
- (2007) 12th Annual Conference International FES Society
- Jagodnik, K.¹ van den Bogert, A.²

7
- 74949094130
- Application of the actor-critic architecture to functional electrical stimulation control of a human arm
- P. S. Thomas, M. S. Branicky, A. J. van den Bogert, and K. M. Jagodnik. Application of the actor-critic architecture to functional electrical stimulation control of a human arm. In Proceedings of the Twenty-First Innovative Applications of Artificial Intelligence, 2009.
- (2009) Proceedings of the Twenty-First Innovative Applications of Artificial Intelligence
- Thomas, P.S.¹ Branicky, M.S.² van den Bogert, A.J.³ Jagodnik, K.M.⁴

8
- 0141607826
- Lyapunov design for safe reinforcement learning
- T. J. Perkins and A. G. Barto. Lyapunov design for safe reinforcement learning. Journal of Machine Learning Research, 3:803-832, 2003.
- (2003) Journal of Machine Learning Research , vol.3 , pp. 803-832
- Perkins, T.J.¹ Barto, A.G.²

9
- 0031343491
- Biped dynamic walking using reinforcement learning
- H. Bendrahim and J. A. Franklin. Biped dynamic walking using reinforcement learning. Robotics and Autonomous Systems, 22:283-302, 1997.
- (1997) Robotics and Autonomous Systems , vol.22 , pp. 283-302
- Bendrahim, H.¹ Franklin, J.A.²

10
- 27644511603
- Control of Markov chains with safety bounds
- October
- A. Arapostathis, R. Kumar, and S. P. Hsu. Control of Markov chains with safety bounds. In IEEE Transactions on Automation Science and Engineering, volume 2, pages 333-343, October 2005.
- (2005) IEEE Transactions on Automation Science and Engineering , vol.2 , pp. 333-343
- Arapostathis, A.¹ Kumar, R.² Hsu, S.P.³

11
- 84898984859
- Control design for Markov chains under safety constraints: A convex approach
- abs/1209.2883
- E. Arvelo and N. C. Martins. Control design for Markov chains under safety constraints: A convex approach. CoRR, abs/1209.2883, 2012.
- (2012) CoRR
- Arvelo, E.¹ Martins, N.C.²

12
- 31144477417
- Risk-sensitive reinforcement learning applied to control under constraints
- P. Geibel and F. Wysotzki. Risk-sensitive reinforcement learning applied to control under constraints. Journal of Artificial Intelligence Research, 24:81-108, 2005.
- (2005) Journal of Artificial Intelligence Research , vol.24 , pp. 81-108
- Geibel, P.¹ Wysotzki, F.²

13
- 84959265213
- Variational Bayesian optimization for runtime risk-sensitive control
- S. Kuindersma, R. Grupen, and A. G. Barto. Variational Bayesian optimization for runtime risk-sensitive control. In Robotics: Science and Systems VIII, 2012.
- (2012) Robotics: Science and Systems , vol.8
- Kuindersma, S.¹ Grupen, R.² Barto, A.G.³

14
- 70349984547
- Natural actor-critic algorithms
- S. Bhatnagar, R. S. Sutton, M. Ghavamzadeh, and M. Lee. Natural actor-critic algorithms. Automatica, 45(11):2471-2482, 2009.
- (2009) Automatica , vol.45 , Issue.11 , pp. 2471-2482
- Bhatnagar, S.¹ Sutton, R.S.² Ghavamzadeh, M.³ Lee, M.⁴

15
- 80052393597
- Adaptive subgradient methods for online learning and stochastic optimization
- University of California at Berkeley, March
- J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Technical Report UCB/EECS-2010-24, Electrical Engineering and Computer Sciences, University of California at Berkeley, March 2010.
- (2010) Technical Report UCB/EECS-2010-24 Electrical Engineering and Computer Sciences
- Duchi, J.¹ Hazan, E.² Singer, Y.³

16
- 84892188881
- Why natural gradient?
- S. Amari and S. Douglas. Why natural gradient? In Proceedings of the 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 2, pages 1213-1216, 1998.
- (1998) Proceedings of the 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing , pp. 1213-1216
- Amari, S.¹ Douglas, S.²

17
- 0003692801
- Wiley, New York
- A. Nemirovski and D. Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley, New York, 1983.
- (1983) Problem Complexity and Method Efficiency in Optimization
- Nemirovski, A.¹ Yudin, D.²

18
- 0037403111
- Mirror descent and nonlinear projected subgradient methods for convex optimization
- A. Beck and M. Teboulle. Mirror descent and nonlinear projected subgradient methods for convex optimization. Operations Research Letters, 2003.
- (2003) Operations Research Letters
- Beck, A.¹ Teboulle, M.²

19
- 84886008156
- Sparse Q-learning with mirror descent
- S. Mahadevan and B. Liu. Sparse Q-learning with mirror descent. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, 2012.
- (2012) Proceedings of the Conference on Uncertainty in Artificial Intelligence
- Mahadevan, S.¹ Liu, B.²

20
- 84893402754
- Basis adaptation for sparse nonlinear reinforcement learning
- S. Mahadevan, S. Giguere, and N. Jacek. Basis adaptation for sparse nonlinear reinforcement learning. In Proceedings of the Conference on Artificial Intelligence, 2013.
- (2013) Proceedings of the Conference on Artificial Intelligence
- Mahadevan, S.¹ Giguere, S.² Jacek, N.³

21
- 0004031920
- Princeton University Press, Princeton, New Jersey
- R. Tyrrell Rockafellar. Convex Analysis. Princeton University Press, Princeton, New Jersey, 1970.
- (1970) Convex Analysis
- Rockafellar, R.T.¹

22
- 0003982971
- Springer, second edition
- J. Nocedal and S. Wright. Numerical Optimization. Springer, second edition, 2006.
- (2006) Numerical Optimization
- Nocedal, J.¹ Wright, S.²

23
- 84898930479
- A natural policy gradient
- S. Kakade. A natural policy gradient. In Advances in Neural Information Processing Systems, volume 14, pages 1531-1538, 2002.
- (2002) Advances in Neural Information Processing Systems , vol.14 , pp. 1531-1538
- Kakade, S.¹

24
- 84898939480
- Policy gradient methods for reinforcement learning with function approximation
- R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12, pages 1057-1063, 2000.
- (2000) Advances in Neural Information Processing Systems , vol.12 , pp. 1057-1063
- Sutton, R.S.¹ McAllester, D.² Singh, S.³ Mansour, Y.⁴

25
- 50849108789
- Utilizing the natural gradient in temporal difference reinforcement learning with eligibility traces
- T. Morimura, E. Uchibe, and K. Doya. Utilizing the natural gradient in temporal difference reinforcement learning with eligibility traces. In International Symposium on Information Geometry and its Application, 2005.
- (2005) International Symposium on Information Geometry and Its Application
- Morimura, T.¹ Uchibe, E.² Doya, K.³

26
- 40649106649
- Natural actor-critic
- J. Peters and S. Schaal. Natural actor-critic. Neurocomputing, 71:1180-1190, 2008.
- (2008) Neurocomputing , vol.71 , pp. 1180-1190
- Peters, J.¹ Schaal, S.²

27
- 84872863511
- Motor primitive discovery
- P. S. Thomas and A. G. Barto. Motor primitive discovery. In Proceedings of the IEEE Conference on Development and Learning and Epigenetic Robotics, 2012.
- (2012) Proceedings of the IEEE Conference on Development and Learning and Epigenetic Robotics
- Thomas, P.S.¹ Barto, A.G.²

28
- 84869424969
- Model-free reinforcement learning with continuous action in practice
- T. Degris, P. M. Pilarski, and R. S. Sutton. Model-free reinforcement learning with continuous action in practice. In Proceedings of the 2012 American Control Conference, 2012.
- (2012) Proceedings of the 2012 American Control Conference
- Degris, T.¹ Pilarski, P.M.² Sutton, R.S.³

29
- 85035116867
- Bias in natural actor-critic algorithms
- University of Massachusetts at Amherst
- P. S. Thomas. Bias in natural actor-critic algorithms. Technical Report UM-CS-2012-018, Department of Computer Science, University of Massachusetts at Amherst, 2012.
- (2012) Technical Report UM-CS-2012-018, Department of Computer Science
- Thomas, P.S.¹

30
- 67349216631
- Combined feedforward and feedback control of a redundant, nonlinear, dynamic musculoskeletal system
- D. Blana, R. F. Kirsch, and E. K. Chadwick. Combined feedforward and feedback control of a redundant, nonlinear, dynamic musculoskeletal system. Medical and Biological Engineering and Computing, 47: 533-542, 2009.
- (2009) Medical and Biological Engineering and Computing , vol.47 , pp. 533-542
- Blana, D.¹ Kirsch, R.F.² Chadwick, E.K.³

31
- 84879911566
- PhD thesis, University of Massachusetts Amherst
- P. Deegan. Whole-Body Strategies for Mobility and Manipulation. PhD thesis, University of Massachusetts Amherst, 2010.
- (2010) Whole-Body Strategies for Mobility and Manipulation
- Deegan, P.¹

32
- 70449434818
- Dexterous mobility with the uBot-5 mobile manipulator
- S. R. Kuindersma, E. Hannigan, D. Ruiken, and R. A. Grupen. Dexterous mobility with the uBot-5 mobile manipulator. In Proceedings of the 14th International Conference on Advanced Robotics, 2009.
- (2009) Proceedings of the 14th International Conference on Advanced Robotics
- Kuindersma, S.R.¹ Hannigan, E.² Ruiken, D.³ Grupen, R.A.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.