SCOPUS Information Retrieval Platform

International Journal of Machine Learning and Cybernetics

Volume 7, Issue 6, 2016, Pages 967-980

Reinforcement learning and neural networks for multi-agent nonzero-sum games of nonlinear constrained-input systems

(3) Yasini, Sholeh a Naghibi Sistani, Mohammad Bagher a Karimpour, Ali a

Author keywords

Concurrent reinforcement learning; Coupled Hamilton Jacobi equations; Input constraints; Multi agent nonzero sum games; Neural networks

Indexed keywords

ADAPTIVE CONTROL SYSTEMS; CLOSED LOOP SYSTEMS; CONTINUOUS TIME SYSTEMS; GAME THEORY; NEURAL NETWORKS; ONLINE SYSTEMS; OPTIMAL CONTROL SYSTEMS; REINFORCEMENT LEARNING;

ACTOR-CRITIC NEURAL NETWORK; ADAPTIVE OPTIMAL CONTROL; CONSTRAINED INPUT SYSTEMS; INPUT CONSTRAINTS; JACOBI EQUATION; MULTI AGENT; PERSISTENCE OF EXCITATION; SATURATION NONLINEARITY;

MULTI AGENT SYSTEMS;

EID: 84994504422
PISSN: 18688071
EISSN: 1868808X
Source Type: Journal
DOI: 10.1007/s13042-014-0300-y
Document Type: Article

Times cited : (40)

References (40)

1
- 4243352182
- Dissertation: Rutgers University
- Shah V (1998) Power control for wireless data services based on utility and pricing. Dissertation, Rutgers University
- (1998) Power control for wireless data services based on utility and pricing
- Shah, V.¹

2
- 34247618255
- Newton’s method for solving cross-coupled sign-indefinite algebraic Riccati equations for weakly coupled large-scale systems
- Mukaidani H (2007) Newton’s method for solving cross-coupled sign-indefinite algebraic Riccati equations for weakly coupled large-scale systems. J Appl Math Comput 188(1):103–115
- (2007) J Appl Math Comput , vol.188 , Issue.1 , pp. 103-115
- Mukaidani, H.¹

3
- 0004251759
- Wiley, New York
- Isaacs R (1965) Differential Games. Wiley, New York
- (1965) Differential Games
- Isaacs, R.¹

4
- 34250487269
- Nonzero-sum differential games
- Starr A, Ho Y (1969) Nonzero-sum differential games. J Optim Theory Appl 3(3):184–206
- (1969) J Optim Theory Appl , vol.3 , Issue.3 , pp. 184-206
- Starr, A.¹ Ho, Y.²

5
- 0003981511
- SIAM, Philadelphia
- Basar T, Olsder GJ (1998) Dynamic Noncooperative Game Theory, 2nd edn. SIAM, Philadelphia
- (1998) Dynamic Noncooperative Game Theory
- Basar, T.¹ Olsder, G.J.²

6
- 0000672181
- Lyapunov iterations for solving coupled algebraic Lyapunov equations of Nash differential games and algebraic Riccati equations of zero-sum games
- Birkhäuser, Boston
- Li T, Gajic Z (1994) Lyapunov iterations for solving coupled algebraic Lyapunov equations of Nash differential games and algebraic Riccati equations of zero-sum games. New Trends Dynam Appl. Birkhäuser, Boston, pp 489–494
- (1994) New Trends Dynam Appl , pp. 489-494
- Li, T.¹ Gajic, Z.²

7
- 0030086666
- On global existence of solutions to coupled matrix Riccati equations in closed-loop Nash games
- Freiling G, Jank G, Abou-Kandil H (1996) On global existence of solutions to coupled matrix Riccati equations in closed-loop Nash games. IEEE Trans Autom Control 41(2):264–269
- (1996) IEEE Trans Autom Control , vol.41 , Issue.2 , pp. 264-269
- Freiling, G.¹ Jank, G.² Abou-Kandil, H.³

8
- 79953127250
- Solving coupled Riccati equations for closed-loop Nash strategy by lack of trust approach
- Jungers M, De Pieri E, Abou-Kandil H (2007) Solving coupled Riccati equations for closed-loop Nash strategy by lack of trust approach. Int J Tomography Stat 7:49–54
- (2007) Int J Tomography Stat , vol.7 , pp. 49-54
- Jungers, M.¹ De Pieri, E.² Abou-Kandil, H.³

9
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton R (1988) Learning to predict by the methods of temporal differences. Mach Learn 3(1):9–44
- (1988) Mach Learn , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.¹

10
- 70349116541
- Reinforcement learning and adaptive dynamic programming for feedback control
- Lewis FL, Vrabie D (2009) Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits Syst Mag 9(3):32–50
- (2009) IEEE Circuits Syst Mag , vol.9 , Issue.3 , pp. 32-50
- Lewis, F.L.¹ Vrabie, D.²

11
- 84883537695
- Reinforcement learning and feedback control
- Lewis FL, Vrabie D, Vamvoudakis K (2012) Reinforcement learning and feedback control. IEEE Control Syst 32(6):76–105
- (2012) IEEE Control Syst , vol.32 , Issue.6 , pp. 76-105
- Lewis, F.L.¹ Vrabie, D.² Vamvoudakis, K.³

12
- 0002031779
- Approximate dynamic programming for real-time control and neural modeling
- White DA, Sofge DA, (eds), Multiscience Press, Brentwood
- Werbos PJ (1992) Approximate dynamic programming for real-time control and neural modeling. In: White DA, Sofge DA (eds) Handbook of intelligent control. Multiscience Press, Brentwood
- (1992) Handbook of intelligent control
- Werbos, P.J.¹

13
- 0036588686
- Adaptive dynamic programming
- Murray JJ, Cox CJ, Lendaris GG, Saeks R (2002) Adaptive dynamic programming. IEEE Trans Syst Man Cybern Part C Appl Rev 32(2):140–153
- (2002) IEEE Trans Syst Man Cybern Part C Appl Rev , vol.32 , Issue.2 , pp. 140-153
- Murray, J.J.¹ Cox, C.J.² Lendaris, G.G.³ Saeks, R.⁴

14
- 0003487482
- Athena Scientific, MA
- Bertsekas DP, Tsitsiklis JN (1996) Neuro-dynamic Programming. Athena Scientific, MA
- (1996) Neuro-dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

15
- 67349145396
- Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
- Vrabie D, Lewis FL (2009) Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Netw 22(3):237–246
- (2009) Neural Netw , vol.22 , Issue.3 , pp. 237-246
- Vrabie, D.¹ Lewis, F.L.²

16
- 77950630017
- Online actor-critic algorithm to solve the continuous infinite time horizon optimal control problem
- Vamvoudakis K, Lewis FL (2010) Online actor-critic algorithm to solve the continuous infinite time horizon optimal control problem. Automatica 46(5):878–888
- (2010) Automatica , vol.46 , Issue.5 , pp. 878-888
- Vamvoudakis, K.¹ Lewis, F.L.²

17
- 84871319455
- A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems
- Bhasin S, Kamalapurkar R, Johnson M, Vamvoudakis K, Lewis FL, Dixon WE (2012) A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica 49(1):82–92
- (2012) Automatica , vol.49 , Issue.1 , pp. 82-92
- Bhasin, S.¹ Kamalapurkar, R.² Johnson, M.³ Vamvoudakis, K.⁴ Lewis, F.L.⁵ Dixon, W.E.⁶

18
- 84885176157
- Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks
- Modares H, Lewis FL, Naghibi Sistani MB (2013) Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks. IEEE Trans Neural Netw Learning Syst 24(10):1513–1525
- (2013) IEEE Trans Neural Netw Learning Syst , vol.24 , Issue.10 , pp. 1513-1525
- Modares, H.¹ Lewis, F.L.² Naghibi Sistani, M.B.³

19
- 79960443754
- Adaptive dynamic programming for online solution of a zero-sum differential game
- Vrabie D, Lewis FL (2011) Adaptive dynamic programming for online solution of a zero-sum differential game. J Control Theory Appl 9(3):353–360
- (2011) J Control Theory Appl , vol.9 , Issue.3 , pp. 353-360
- Vrabie, D.¹ Lewis, F.L.²

20
- 79953155097
- Online solution of nonlinear two-player zero-sum games using synchronous policy iteration
- Vamvoudakis K, Lewis FL (2010) Online solution of nonlinear two-player zero-sum games using synchronous policy iteration. In: Proc. 49th IEEE CDC, pp 3040–3047
- (2010) 49th IEEE CDC , pp. 3040-3047
- Vamvoudakis, K.¹ Lewis, F.L.²

21
- 84899093084
- … H∞ control of constrained input systems
- Modares H, Lewis FL, Naghibi Sistani MB (2014) … H∞ control of constrained input systems. Int J Adapt Cont Sig Proc 28(3–5):232–254
- (2014) Int J Adapt Cont Sig Proc , vol.28 , Issue.3-5 , pp. 232-254
- Modares, H.¹ Lewis, F.L.² Naghibi Sistani, M.B.³

22
- 84860670757
- Nonlinear two-player zero-sum game approximate solution using a policy iteration algorithm
- Johnson M, Bhasin S, Dixon WE (2011) Nonlinear two-player zero-sum game approximate solution using a policy iteration algorithm. In: Proc. IEEE CDC, pp 142–147
- (2011) Proc. IEEE CDC , pp. 142-147
- Johnson, M.¹ Bhasin, S.² Dixon, W.E.³

23
- 79953133535
- Integral reinforcement learning for online computation of feedback Nash strategies of nonzero-sum differential games
- Vrabie D, Lewis FL (2010) Integral reinforcement learning for online computation of feedback Nash strategies of nonzero-sum differential games. In: Proc. 49th IEEE CDC, pp 3066–3071
- (2010) Proc. 49th IEEE CDC , pp. 3066-3071
- Vrabie, D.¹ Lewis, F.L.²

24
- 79960897012
- Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations
- Vamvoudakis K, Lewis FL (2011) Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica 47(8):1556–1569
- (2011) Automatica , vol.47 , Issue.8 , pp. 1556-1569
- Vamvoudakis, K.¹ Lewis, F.L.²

25
- 84885835001
- Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP
- Zhang H, Cui L, Luo Y (2013) Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP. IEEE Trans Cybern 43(1):206–216
- (2013) IEEE Trans Cybern , vol.43 , Issue.1 , pp. 206-216
- Zhang, H.¹ Cui, L.² Luo, Y.³

26
- 14844340822
- Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
- Abu-Khalaf M, Lewis FL (2005) Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 41(5):779–791
- (2005) Automatica , vol.41 , Issue.5 , pp. 779-791
- Abu-Khalaf, M.¹ Lewis, F.L.²

27
- 48949116222
- Neurodynamic programming and zero-sum games for constrained control systems
- Abu-Khalaf M, Lewis FL, Huang J (2008) Neurodynamic programming and zero-sum games for constrained control systems. IEEE Trans Neural Netw 19(7):1243–1252
- (2008) IEEE Trans Neural Netw , vol.19 , Issue.7 , pp. 1243-1252
- Abu-Khalaf, M.¹ Lewis, F.L.² Huang, J.³

28
- 79953141961
- Dissertation: Georgia Institute of Technology
- Chowdhary GV (2010) Concurrent learning for convergence in adaptive control without persistency of excitation. Dissertation, Georgia Institute of Technology
- (2010) Concurrent learning for convergence in adaptive control without persistency of excitation
- Chowdhary, G.V.¹

29
- 84883680649
- Adaptive optimal control for the partially-unknown constrained-input using policy iteration with experience replay
- Boston, Massachusetts
- Modares H, Lewis FL, Naghibi Sistani MB, Chowdhary GV, Yucelen T (2013) Adaptive optimal control for the partially-unknown constrained-input using policy iteration with experience replay. AIAA Guidance Navigation and Control Conference, Boston, Massachusetts
- (2013) AIAA Guidance Navigation and Control Conference
- Modares, H.¹ Lewis, F.L.² Naghibi Sistani, M.B.³ Chowdhary, G.V.⁴ Yucelen, T.⁵

30
- 84893708995
- Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems
- Modares H, Lewis FL, Naghibi Sistani MB (2014) Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems. Automatica 50(1):193–202
- (2014) Automatica , vol.50 , Issue.1 , pp. 193-202
- Modares, H.¹ Lewis, F.L.² Naghibi Sistani, M.B.³

31
- 84927697693
- Online concurrent reinforcement learning algorithm to solve two-player zero-sum games for partially unknown nonlinear continuous-time systems
- Yasini S, Karimpour A, Naghibi Sistani MB, Modares H (2014) Online concurrent reinforcement learning algorithm to solve two-player zero-sum games for partially unknown nonlinear continuous-time systems. Int J Adapt Cont Sig Proc. doi:10.1002/acs.2485
- (2014) Int J Adapt Cont Sig Proc
- Yasini, S.¹ Karimpour, A.² Naghibi Sistani, M.B.³ Modares, H.⁴

32
- 0004163205
- Wiley, New York
- Lewis FL, Vrabie D, Syrmos VL (2012) Optimal control, 3rd edn. Wiley, New York
- (2012) Optimal control
- Lewis, F.L.¹ Vrabie, D.² Syrmos, V.L.³

33
- 84881324637
- Optimal control of nonlinear continuous-time systems: design of bounded controllers via generalized nonquadratic functionals
- Lyshevski SE (1998) Optimal control of nonlinear continuous-time systems: design of bounded controllers via generalized nonquadratic functionals. In: Proc. IEEE ACC, pp 205–209
- (1998) Proc. IEEE ACC , pp. 205-209
- Lyshevski, S.E.¹

34
- 0025627940
- Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks
- Hornik K, Stinchcombe M, White H (1990) Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Netw 3(5):551–560
- (1990) Neural Netw , vol.3 , Issue.5 , pp. 551-560
- Hornik, K.¹ Stinchcombe, M.² White, H.³

35
- 40649105766
- A definition of partial derivative of random functions and its application to RBFNN sensitivity analysis
- Wang XZ, Li CG, Yeung DS, Song S, Feng H (2008) A definition of partial derivative of random functions and its application to RBFNN sensitivity analysis. Neurocomputing 71(7–9):1515–1526
- (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1515-1526
- Wang, X.Z.¹ Li, C.G.² Yeung, D.S.³ Song, S.⁴ Feng, H.⁵

36
- 84961289486
- Online neural network model for non-stationary and imbalanced data stream classification
- Ghazikhani A, Monsefi R, Sadoghi Yazdi H (2014) Online neural network model for non-stationary and imbalanced data stream classification. Int J Mach Learn Cyber 5(1):51–62. doi:10.1007/s13042-013-0180-6
- (2014) Int J Mach Learn Cyber , vol.5 , Issue.1 , pp. 51-62
- Ghazikhani, A.¹ Monsefi, R.² Sadoghi Yazdi, H.³

37
- 84877744884
- Parameter selection algorithm with self adaptive growing neural network classifier for diagnosis issues
- Barakat M, Lefebvre D, Khalil M, Druaux F, Mustapha O (2013) Parameter selection algorithm with self adaptive growing neural network classifier for diagnosis issues. Int J Mach Learn Cyber 4(3):217–233. doi:10.1007/s13042-012-0089-5
- (2013) Int J Mach Learn Cyber , vol.4 , Issue.3 , pp. 217-233
- Barakat, M.¹ Lefebvre, D.² Khalil, M.³ Druaux, F.⁴ Mustapha, O.⁵

38
- 62949149213
- Constrained nonlinear optimal control: A converse HJB approach
- Nevistic V, Primbs JA (1996) Constrained nonlinear optimal control: A converse HJB approach. California Institute of Technology, Tech. Rep
- (1996) California Institute of Technology, Tech. Rep
- Nevistic, V.¹ Primbs, J.A.²

39
- 84933509471
- Dynamic analysis of discrete-time BAM neural networks with stochastic perturbations and impulses
- Raja R, Karthik Raja U, Samidurai R, Leelamani A (2014) Dynamic analysis of discrete-time BAM neural networks with stochastic perturbations and impulses. Int J Mach Learn Cyber 5(1):39–50. doi:10.1007/s13042-013-0199-8
- (2014) Int J Mach Learn Cyber , vol.5 , Issue.1 , pp. 39-50
- Raja, R.¹ Karthik Raja, U.² Samidurai, R.³ Leelamani, A.⁴

40
- 0003543733
- Cambridge University Press, Cambridge
- Hardy G, Littlewood J, Polya G (1998) Inequalities, 2nd edn. Cambridge University Press, Cambridge
- (1998) Inequalities
- Hardy, G.¹ Littlewood, J.² Polya, G.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.