[1] Abdallah S, Lesser V (2008) A multiagent reinforcement learning algorithm with non-linear dynamics. J Artif Intell Res 33:521–549
[2] Awheda MD, Schwartz HM (2013) Exponential moving average Q-learning algorithm. In: Adaptive dynamic programming and reinforcement learning (ADPRL), 2013 IEEE symposium on, IEEE, pp 31–38
[3] Awheda MD, Schwartz HM (2015) The residual gradient FACL algorithm for differential games. In: Electrical and computer engineering (CCECE), 2015 IEEE 28th Canadian conference on, IEEE, pp 1006–1011
[4] Banerjee B, Peng J (2007) Generalized multiagent learning with performance bound. Auton Agents Multi-Agent Syst 15(3):281–312
[5] Bellman R (1957) Dynamic programming. Princeton University Press, Princeton
[6] Bowling M (2005) Convergence and no-regret in multiagent learning. Adv Neural Inf Process Syst 17:209–216
[7] Bowling M, Veloso M (2001a) Convergence of gradient dynamics with a variable learning rate. In: ICML, pp 27–34
[8] Bowling M, Veloso M (2001b) Rational and convergent learning in stochastic games. In: International joint conference on artificial intelligence, vol 17. Lawrence Erlbaum Associates Ltd, pp 1021–1026
[9] Bowling M, Veloso M (2002) Multiagent learning using a variable learning rate. Artif Intell 136(2):215–250
[10] Burkov A, Chaib-draa B (2009) Effective learning in the presence of adaptive counterparts. J Algorithms 64(4):127–138
[11] Busoniu L, Babuska R, De Schutter B (2006) Multi-agent reinforcement learning: A survey. In: Control, automation, robotics and vision, 2006. ICARCV’06. 9th international conference on, IEEE, pp 1–6
[12] Busoniu L, Babuska R, De Schutter B (2008) A comprehensive survey of multiagent reinforcement learning. Syst Man Cybern Part C: Appl Rev, IEEE Trans 38(2):156–172
[13] Claus C, Boutilier C (1998) The dynamics of reinforcement learning in cooperative multiagent systems. In: AAAI/IAAI, pp 746–752
[14] Conitzer V, Sandholm T (2007) Awesome: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. Mach Learn 67(1–2):23–43
[15] Dai X, Li C-K, Rad AB (2005) An approach to tune fuzzy controllers based on reinforcement learning for autonomous vehicle control. Intell Transp Syst, IEEE Trans 6(3):285–293
[18] Dixon W (2014) Optimal adaptive control and differential games by reinforcement learning principles. J Guid Control Dyn 37(3):1048–1049
[19] Fulda N, Ventura D (2007) Predicting and preventing coordination problems in cooperative Q-learning systems. In: IJCAI, vol 2007, pp 780–785
[20] Gutnisky DA, Zanutto BS (2004) Learning obstacle avoidance with an operant behavior model. Artif Life 10(1):65–81
[21] Hinojosa W, Nefti S, Kaymak U (2011) Systems control with generalized probabilistic fuzzy-reinforcement learning. Fuzzy Syst, IEEE Trans 19(1):51–64
[23] Hu J, Wellman MP (2003) Nash Q-learning for general-sum stochastic games. J Mach Learn Res 4:1039–1069
[24] Hu J, Wellman MP (1998) Multiagent reinforcement learning: theoretical framework and an algorithm. In: ICML, vol 98, Citeseer, pp 242–250
[26] Kondo T, Ito K (2004) A reinforcement learning with evolutionary state recruitment strategy for autonomous mobile robots control. Robot Auton Syst 46(2):111–124
[27] Luo B, Wu H-N, Li H-X (2014a) Data-based suboptimal neuro-control design with reinforcement learning for dissipative spatially distributed processes. Ind Eng Chem Res 53(19):8106–8119
[28] Luo B, Wu H-N, Huang T, Liu D (2014b) Data-based approximate policy iteration for nonlinear continuous-time optimal control design. Automatica 50(12):3281–3290
[29] Luo B, Wu H-N, Huang T (2015a) Off-policy reinforcement learning for H∞ control design. Cybern, IEEE Trans 45(1):65–76
[30] Luo B, Wu H-N, Li H-X (2015b) Adaptive optimal control of highly dissipative nonlinear spatially distributed processes with neuro-dynamic programming. Neural Netw Learn Syst, IEEE Trans 26(4):684–696
[31] Luo B, Huang T, Wu H-N, Yang X (2015c) Data-driven H∞ control for nonlinear distributed parameter systems. Neural Netw Learn Syst, IEEE Trans 26(11):2949–2961
[32] Luo B, Wu H-N, Huang T, Liu D (2015d) Reinforcement learning solution for HJB equation arising in constrained optimal control problem. Neural Netw 71:150–158
[33] Modares H, Lewis FL, Naghibi-Sistani M-B (2014) Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems. Automatica 50(1):193–202
[34] Rodríguez M, Iglesias R, Regueiro CV, Correa J, Barro S (2007) Autonomous and fast robot learning through motivation. Robot Auton Syst 55(9):735–740
[36] Sen S, Sekaran M, Hale J (1994) Learning to coordinate without sharing information. In: AAAI, pp 426–431
[37] Singh S, Kearns M, Mansour Y (2000) Nash convergence of gradient dynamics in general-sum games. In: Proceedings of the sixteenth conference on Uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc., pp 541–548
[38] Smart WD, Kaelbling LP (2002) Effective reinforcement learning for mobile robots. In: Robotics and automation. Proceedings. ICRA’02. IEEE international conference on, vol 4, IEEE, pp 3404–3410
[40] Tan M (1993) Multi-agent reinforcement learning: Independent vs. cooperative agents. In: Proceedings of the tenth international conference on machine learning, pp 330–337
[41] Tesauro G (2004) Extending Q-learning to general adaptive multi-agent systems. In: Advances in neural information processing systems, vol 16. MIT Press, pp 871–878
[45] Weiss G (1999) Multiagent systems: a modern approach to distributed artificial intelligence. MIT Press
[46] Wu H-N, Luo B (2012) Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear control. Neural Netw Learn Syst, IEEE Trans 23(12):1884–1895
[47] Ye C, Yung NH, Wang D (2003) A fuzzy controller with supervised learning assisted reinforcement learning algorithm for obstacle avoidance. Syst Man Cybern Part B: Cybern, IEEE Trans 33(1):17–27
[48] Zhang C, Lesser VR (2010) Multi-agent learning with policy prediction. In: AAAI