SCOPUS Information Retrieval Platform

Volume 1, Issue 3, 2014, Pages 323-336

Continuous action reinforcement learning for control-affine systems with unknown dynamics

Authors (5): Faust, Aleksandra (a); Ruymgaart, Peter (a); Salman, Molly (b); Fierro, Rafael (a); Tapia, Lydia (a)

Author keywords

approximate value iteration; continuous action spaces; control affine nonlinear systems; fitted value iteration; policy approximation; Reinforcement learning

Indexed keywords

ANTENNAS; BALANCING; DECISION MAKING; DIFFERENTIAL EQUATIONS; ITERATIVE METHODS; NONLINEAR EQUATIONS; NONLINEAR SYSTEMS; REINFORCEMENT LEARNING;

AFFINE NONLINEAR SYSTEMS; COMPUTATIONALLY EFFICIENT; CONTINUOUS ACTIONS; CONTROL OF NONLINEAR SYSTEM; CONTROL-AFFINE SYSTEMS; FITTED VALUE ITERATION; SYSTEM OF DIFFERENTIAL EQUATIONS; VALUE ITERATION;

LEARNING SYSTEMS;

EID: 84969983915; PISSN: 23299266; EISSN: 23299274; Source Type: Journal
DOI: 10.1109/JAS.2014.7004690; Document Type: Article

Times cited: (28)

References (33)

1
- 68849115332
- Analysis and control of nonlinear systems: A flatness-based approach
- New York: Springer
- Levine J. Analysis and control of nonlinear systems: a flatness-based approach. Mathematical Engineering. New York: Springer, 2009.
- (2009) Mathematical Engineering
- Levine, J.¹

2
- 0004178386
- New Jersey: Prentice Hall
- Khalil H K. Nonlinear Systems. New Jersey: Prentice Hall, 1996.
- (1996) Nonlinear Systems
- Khalil, H.K.¹

3
- 85046476577
- Boca Raton, Florida: CRC Press
- Busoniu L, Babuska R, De Schutter B, Ernst D. Reinforcement Learning and Dynamic Programming Using Function Approximators. Boca Raton, Florida: CRC Press, 2010.
- (2010) Reinforcement Learning and Dynamic Programming Using Function Approximators
- Busoniu, L.¹ Babuska, R.² De Schutter, B.³ Ernst, D.⁴

4
- 0003487482
- Belmont, MA: Athena Scientific
- Bertsekas D P, Tsitsiklis J N. Neuro-Dynamic Programming. Belmont, MA: Athena Scientific, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

5
- 81355166317
- Approximate value iteration in the reinforcement learning context. Application to electrical power system control
- Ernst D, Glavic M, Geurts P, Wehenkel L. Approximate value iteration in the reinforcement learning context. Application to electrical power system control. International Journal of Emerging Electric Power Systems, 2005, 3(1): 10661-106637
- (2005) International Journal of Emerging Electric Power Systems , vol.3 , Issue.1 , pp. 10661-106637
- Ernst, D.¹ Glavic, M.² Geurts, P.³ Wehenkel, L.⁴

6
- 84898818933
- Parsing indoor scenes using RGB-D imagery
- Sydney, Australia
- Taylor C J, Cowley A. Parsing indoor scenes using RGB-D imagery. In: Proceeding of Robotics: Sci. Sys. (RSS), Sydney, Australia, 2012.
- (2012) Proceeding of Robotics: Sci. Sys. (RSS)
- Taylor, C.J.¹ Cowley, A.²

7
- 34548331001
- Cambridge, U.K.: Cambridge University Press
- LaValle S M. Planning Algorithms. Cambridge, U.K.: Cambridge University Press, 2006.
- (2006) Planning Algorithms
- LaValle, S.M.¹

8
- 84860701744
- Derivative-free decentralized adaptive control of large-scale interconnected uncertain systems
- Orlando, USA: IEEE
- Yucelen T, Yang B-J, Calise A J. Derivative-free decentralized adaptive control of large-scale interconnected uncertain systems. In: Proceeding of the 50th Conference on Decision and Control and European Control Conference (CDC-ECC). Orlando, USA: IEEE, 2011. 1104-1109
- (2011) Proceeding of the 50th Conference on Decision and Control and European Control Conference (CDC-ECC) , pp. 1104-1109
- Yucelen, T.¹ Yang, B.-J.² Calise, A.J.³

9
- 80455160265
- Decentralized optimal control of a class of interconnected nonlinear discrete-time systems by using online Hamilton-Jacobi-Bellman formulation
- Mehraeen S, Jagannathan S. Decentralized optimal control of a class of interconnected nonlinear discrete-time systems by using online Hamilton-Jacobi-Bellman formulation. IEEE Transactions on Neural Networks, 2011, 22(11): 1757-1769
- (2011) IEEE Transactions on Neural Networks , vol.22 , Issue.11 , pp. 1757-1769
- Mehraeen, S.¹ Jagannathan, S.²

10
- 84875270081
- Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update
- Dierks T, Jagannathan S. Online optimal control of affine nonlinear discrete-time systems with unknown internal dynamics by using time-based policy update. IEEE Transactions on Neural Networks and Learning Systems, 2012, 23(7): 1118-1129
- (2012) IEEE Transactions on Neural Networks and Learning Systems , vol.23 , Issue.7 , pp. 1118-1129
- Dierks, T.¹ Jagannathan, S.²

11
- 84939463960
- Online adaptive algorithm for optimal control with integral reinforcement learning
- to be published
- Vamvoudakis K G, Vrabie D, Lewis F L. Online adaptive algorithm for optimal control with integral reinforcement learning. International Journal of Robust and Nonlinear Control, to be published
- International Journal of Robust and Nonlinear Control
- Vamvoudakis, K.G.¹ Vrabie, D.² Lewis, F.L.³

12
- 79959473178
- Decentralized nearly optimal control of a class of interconnected nonlinear discrete-time systems by using online Hamilton-Bellman-Jacobi formulation
- Barcelona: IEEE
- Mehraeen S, Jagannathan S. Decentralized nearly optimal control of a class of interconnected nonlinear discrete-time systems by using online Hamilton-Bellman-Jacobi formulation. In: Proceeding of the 2010 International Joint Conference on Neural Networks (IJCNN). Barcelona: IEEE, 2010. 1-8
- (2010) Proceeding of the 2010 International Joint Conference on Neural Networks (IJCNN) , pp. 1-8
- Mehraeen, S.¹ Jagannathan, S.²

13
- 79960468564
- Asymptotic tracking by a reinforcement learning-based adaptive critic controller
- Bhasin S, Sharma N, Patre P, Dixon W. Asymptotic tracking by a reinforcement learning-based adaptive critic controller. Journal of Control Theory and Applications, 2011, 9(3): 400-409
- (2011) Journal of Control Theory and Applications , vol.9 , Issue.3 , pp. 400-409
- Bhasin, S.¹ Sharma, N.² Patre, P.³ Dixon, W.⁴

14
- 84881373865
- A policy iteration approach to online optimal control of continuous-time constrained-input systems
- Modares H, Sistani M B N, Lewis F L. A policy iteration approach to online optimal control of continuous-time constrained-input systems. ISA Transactions, 2013, 52(5): 611-621
- (2013) ISA Transactions , vol.52 , Issue.5 , pp. 611-621
- Modares, H.¹ Sistani, M.B.N.² Lewis, F.L.³

15
- 39549085591
- Generalized Hamilton-Jacobi-Bellman formulation-based neural network control of affine nonlinear discrete time systems
- Chen Z, Jagannathan S. Generalized Hamilton-Jacobi-Bellman formulation-based neural network control of affine nonlinear discrete time systems. IEEE Transactions on Neural Networks, 2008, 19(1): 90-106
- (2008) IEEE Transactions on Neural Networks , vol.19 , Issue.1 , pp. 90-106
- Chen, Z.¹ Jagannathan, S.²

16
- 84865467087
- Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics
- Jiang Y, Jiang Z P. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica, 2012, 48(10): 2699-2704
- (2012) Automatica , vol.48 , Issue.10 , pp. 2699-2704
- Jiang, Y.¹ Jiang, Z.P.²

17
- 49049089962
- Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
- Al-Tamimi A, Lewis F, Abu-Khalaf M. Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2008, 38(4): 943-949
- (2008) IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics , vol.38 , Issue.4 , pp. 943-949
- Al-Tamimi, A.¹ Lewis, F.² Abu-Khalaf, M.³

18
- 33846781133
- A neural network solution for fixed-final time optimal control of nonlinear systems
- Cheng T, Lewis F L, Abu-Khalaf M. A neural network solution for fixed-final time optimal control of nonlinear systems. Automatica, 2007, 43(3): 482-490
- (2007) Automatica , vol.43 , Issue.3 , pp. 482-490
- Cheng, T.¹ Lewis, F.L.² Abu-Khalaf, M.³

19
- 84884276459
- Reinforcement learning in robotics: A survey
- Kober J, Bagnell D, Peters J. Reinforcement learning in robotics: a survey. International Journal of Robotics Research, 2013, 32(11): 1236-1274
- (2013) International Journal of Robotics Research , vol.32 , Issue.11 , pp. 1236-1274
- Kober, J.¹ Bagnell, D.² Peters, J.³

20
- 85042095332
- Reinforcement learning in continuous state and action spaces
- Berlin Heidelberg: Springer
- Hasselt H. Reinforcement learning in continuous state and action spaces. Adaptation, Learning, and Optimization. Berlin Heidelberg: Springer, 2012. 207-251
- (2012) Adaptation, Learning, and Optimization , pp. 207-251
- Hasselt, H.¹

21
- 84871756682
- A survey of actor-critic reinforcement learning: Standard and natural policy gradients
- Grondman I, Busoniu L, Lopes G A D, Babuska R. A survey of actor-critic reinforcement learning: standard and natural policy gradients. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2012, 42(6): 1291-1307
- (2012) IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews , vol.42 , Issue.6 , pp. 1291-1307
- Grondman, I.¹ Busoniu, L.² Lopes, G.A.D.³ Babuska, R.⁴

22
- 51349128679
- Reinforcement learning in multi-dimensional state-action space using random rectangular coarse coding and Gibbs sampling
- San Diego, CA: IEEE
- Kimura H. Reinforcement learning in multi-dimensional state-action space using random rectangular coarse coding and Gibbs sampling. In: Proceeding of the 2007 IEEE International Conference on Intelligent Robots and Systems (IROS). San Diego, CA: IEEE, 2007. 88-95
- (2007) Proceeding of the 2007 IEEE International Conference on Intelligent Robots and Systems (IROS) , pp. 88-95
- Kimura, H.¹

23
- 85015154191
- Reinforcement learning in continuous action spaces through sequential Monte Carlo methods
- Lazaric A, Restelli M, Bonarini A. Reinforcement learning in continuous action spaces through sequential Monte Carlo methods. Advances in Neural Information Processing Systems, 2008, 20: 833-840
- (2008) Advances in Neural Information Processing Systems , vol.20 , pp. 833-840
- Lazaric, A.¹ Restelli, M.² Bonarini, A.³

24
- 85161978146
- Fitted Q-iteration in continuous action-space MDPs
- Cambridge, MA: MIT Press
- Antos A, Munos R, Szepesvári C. Fitted Q-iteration in continuous action-space MDPs. Advances in Neural Information Processing Systems 20. Cambridge, MA: MIT Press, 2007. 9-16
- (2007) Advances in Neural Information Processing Systems , vol.20 , pp. 9-16
- Antos, A.¹ Munos, R.² Szepesvári, C.³

25
- 79960128338
- X-armed bandits
- Bubeck S, Munos R, Stoltz G, Szepesvari C. X-armed bandits. The Journal of Machine Learning Research, 2011, 12: 1655-1695
- (2011) The Journal of Machine Learning Research , vol.12 , pp. 1655-1695
- Bubeck, S.¹ Munos, R.² Stoltz, G.³ Szepesvari, C.⁴

26
- 84891503761
- Optimistic planning for continuous-action deterministic systems
- Singapore: IEEE
- Buşoniu L, Daniels A, Munos R, Babuška R. Optimistic planning for continuous-action deterministic systems. In: Proceeding of the 2013 Symposium on Adaptive Dynamic Programming and Reinforcement Learning. Singapore: IEEE, 2013. 69-76
- (2013) Proceeding of the 2013 Symposium on Adaptive Dynamic Programming and Reinforcement Learning , pp. 69-76
- Buşoniu, L.¹ Daniels, A.² Munos, R.³ Babuška, R.⁴

27
- 80054835987
- Sample-based planning for continuous action Markov decision processes
- Piscataway, NJ, USA: ICML
- Mansley C, Weinstein A, Littman M L. Sample-based planning for continuous action Markov decision processes. In: Proceeding of the 21st International Conference on Automated Planning and Scheduling. Piscataway, NJ, USA: ICML, 2011. 335-338
- (2011) Proceeding of the 21st International Conference on Automated Planning and Scheduling , pp. 335-338
- Mansley, C.¹ Weinstein, A.² Littman, M.L.³

28
- 77958578580
- Integrating sample-based planning and model-based reinforcement learning
- Atlanta, Georgia, USA: AAAI Press
- Walsh T J, Goschin S, Littman M L. Integrating sample-based planning and model-based reinforcement learning. In: Proceedings of the 24th AAAI Conference on Artificial Intelligence. Atlanta, Georgia, USA: AAAI Press, 2010. 612-617
- (2010) Proceedings of the 24th AAAI Conference on Artificial Intelligence , pp. 612-617
- Walsh, T.J.¹ Goschin, S.² Littman, M.L.³

29
- 84887295086
- Learning swing-free trajectories for UAVs with a suspended load
- Karlsruhe, Germany: IEEE
- Faust A, Palunko I, Cruz P, Fierro R, Tapia L. Learning swing-free trajectories for UAVs with a suspended load. In: Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA). Karlsruhe, Germany: IEEE, 2013. 4902-4909
- (2013) Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA) , pp. 4902-4909
- Faust, A.¹ Palunko, I.² Cruz, P.³ Fierro, R.⁴ Tapia, L.⁵

30
- 84963735888
- Automated aerial suspended cargo delivery through reinforcement learning
- in press
- Faust A, Palunko I, Cruz P, Fierro R, Tapia L. Automated aerial suspended cargo delivery through reinforcement learning. In: Artificial Intelligence, 2015, in press.
- (2015) Artificial Intelligence
- Faust, A.¹ Palunko, I.² Cruz, P.³ Fierro, R.⁴ Tapia, L.⁵

31
- 0003787146
- Mineola, NY: Dover Publications, Incorporated
- Bellman R E. Dynamic Programming. Mineola, NY: Dover Publications, Incorporated, 1957.
- (1957) Dynamic Programming
- Bellman, R.E.¹

32
- 0004102479
- Cambridge, MA: MIT Press
- Sutton R S, Barto A G. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

33
- 44649189852
- Finite-time bounds for Fitted Value Iteration
- Munos R, Szepesvári C. Finite-time bounds for Fitted Value Iteration. Journal of Machine Learning Research, 2008, 9: 815-857
- (2008) Journal of Machine Learning Research , vol.9 , pp. 815-857
- Munos, R.¹ Szepesvári, C.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.