Artificial Intelligence Review, Volume 45, Issue 3, 2016, Pages 299–332

Exponential moving average based multiagent reinforcement learning algorithms

Author keywords

Markov decision processes; Multi agent learning systems; Nash equilibrium; Reinforcement learning

Indexed keywords

ALGORITHMS; COMPUTATION THEORY; GAME THEORY; ITERATIVE METHODS; MARKOV PROCESSES; MULTI AGENT SYSTEMS; REINFORCEMENT LEARNING; SOFTWARE AGENTS; STOCHASTIC SYSTEMS; TELECOMMUNICATION NETWORKS;

EID: 84957842278     PISSN: 0269-2821     EISSN: 1573-7462     Source Type: Journal
DOI: 10.1007/s10462-015-9447-5     Document Type: Article
Times cited: 15
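The record carries no abstract, but the title and keywords point to the Q-learning family of update rules. The snippet below is a minimal, purely illustrative sketch (not the algorithm of the indexed article, which is not reproduced in this record): it combines a standard tabular Q-learning update (Watkins and Dayan 1992, reference 43) with an exponential-moving-average style policy update in the spirit of EMA Q-learning (reference 2). The table sizes and the rates alpha, gamma, and eta are assumed placeholder values.

```python
import numpy as np

# Illustrative sketch only: a standard tabular Q-learning step followed by an
# exponential-moving-average (EMA) policy update. This is NOT the algorithm
# from the indexed article; alpha, gamma, and eta are assumed placeholders.

n_states, n_actions = 5, 2
alpha, gamma, eta = 0.1, 0.95, 0.05   # step size, discount factor, EMA rate (assumed)

Q = np.zeros((n_states, n_actions))                     # action-value table
pi = np.full((n_states, n_actions), 1.0 / n_actions)    # stochastic policy

def step(s, a, r, s_next):
    """One Q-learning update followed by an EMA-style policy update."""
    # Standard Q-learning target (Watkins and Dayan 1992).
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

    # EMA policy update: move the policy toward the greedy action indicator.
    greedy = np.zeros(n_actions)
    greedy[Q[s].argmax()] = 1.0
    pi[s] = (1.0 - eta) * pi[s] + eta * greedy   # exponential moving average
    pi[s] /= pi[s].sum()                         # keep it a valid distribution

# Example transition: state 0, action 1, reward 1.0, next state 2.
step(0, 1, 1.0, 2)
```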

References (48)
  • 1
    • Abdallah S, Lesser V (2008) A multiagent reinforcement learning algorithm with non-linear dynamics. J Artif Intell Res 33:521–549
  • 2
    • Awheda MD, Schwartz HM (2013) Exponential moving average Q-learning algorithm. In: Adaptive dynamic programming and reinforcement learning (ADPRL), 2013 IEEE symposium on, IEEE, pp 31–38
  • 3
    • Awheda MD, Schwartz HM (2015) The residual gradient FACL algorithm for differential games. In: Electrical and computer engineering (CCECE), 2015 IEEE 28th Canadian conference on, IEEE, pp 1006–1011
  • 4
    • Banerjee B, Peng J (2007) Generalized multiagent learning with performance bound. Auton Agents Multi-Agent Syst 15(3):281–312
  • 5
    • Bellman R (1957) Dynamic programming. Princeton University Press, Princeton
  • 6
    • Bowling M (2005) Convergence and no-regret in multiagent learning. Adv Neural Inf Process Syst 17:209–216
  • 7
    • Bowling M, Veloso M (2001a) Convergence of gradient dynamics with a variable learning rate. In: ICML, pp 27–34
  • 8
    • Bowling M, Veloso M (2001b) Rational and convergent learning in stochastic games. In: International joint conference on artificial intelligence, vol. 17. Lawrence Erlbaum Associates Ltd, pp 1021–1026
  • 9
    • Bowling M, Veloso M (2002) Multiagent learning using a variable learning rate. Artif Intell 136(2):215–250
  • 10
    • Burkov A, Chaib-draa B (2009) Effective learning in the presence of adaptive counterparts. J Algorithms 64(4):127–138
  • 11
    • Busoniu L, Babuska R, De Schutter B (2006) Multi-agent reinforcement learning: a survey. In: Control, automation, robotics and vision (ICARCV'06), 9th international conference on, IEEE, pp 1–6
  • 13
    • Claus C, Boutilier C (1998) The dynamics of reinforcement learning in cooperative multiagent systems. In: AAAI/IAAI, pp 746–752
  • 14
    • Conitzer V, Sandholm T (2007) AWESOME: a general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. Mach Learn 67(1–2):23–43
  • 15
    • Dai X, Li C-K, Rad AB (2005) An approach to tune fuzzy controllers based on reinforcement learning for autonomous vehicle control. IEEE Trans Intell Transp Syst 6(3):285–293
  • 18
    • Dixon W (2014) Optimal adaptive control and differential games by reinforcement learning principles. J Guid Control Dyn 37(3):1048–1049
  • 19
    • Fulda N, Ventura D (2007) Predicting and preventing coordination problems in cooperative Q-learning systems. In: IJCAI, pp 780–785
  • 20
    • Gutnisky DA, Zanutto BS (2004) Learning obstacle avoidance with an operant behavior model. Artif Life 10(1):65–81
  • 21
    • Hinojosa W, Nefti S, Kaymak U (2011) Systems control with generalized probabilistic fuzzy-reinforcement learning. IEEE Trans Fuzzy Syst 19(1):51–64
  • 23
    • Hu J, Wellman MP (2003) Nash Q-learning for general-sum stochastic games. J Mach Learn Res 4:1039–1069
  • 24
    • Hu J, Wellman MP (1998) Multiagent reinforcement learning: theoretical framework and an algorithm. In: ICML, vol. 98, pp 242–250
  • 26
    • Kondo T, Ito K (2004) A reinforcement learning with evolutionary state recruitment strategy for autonomous mobile robots control. Robot Auton Syst 46(2):111–124
  • 27
    • Luo B, Wu H-N, Li H-X (2014a) Data-based suboptimal neuro-control design with reinforcement learning for dissipative spatially distributed processes. Ind Eng Chem Res 53(19):8106–8119
  • 28
    • Luo B, Wu H-N, Huang T, Liu D (2014b) Data-based approximate policy iteration for nonlinear continuous-time optimal control design. Automatica 50(12):3281–3290
  • 29
    • Luo B, Wu H-N, Huang T (2015a) Off-policy reinforcement learning for H∞ control design. IEEE Trans Cybern 45(1):65–76
  • 30
    • Luo B, Wu H-N, Li H-X (2015b) Adaptive optimal control of highly dissipative nonlinear spatially distributed processes with neuro-dynamic programming. IEEE Trans Neural Netw Learn Syst 26(4):684–696
  • 31
    • Luo B, Huang T, Wu H-N, Yang X (2015c) Data-driven H∞ control for nonlinear distributed parameter systems. IEEE Trans Neural Netw Learn Syst 26(11):2949–2961
  • 32
    • Luo B, Wu H-N, Huang T, Liu D (2015d) Reinforcement learning solution for HJB equation arising in constrained optimal control problem. Neural Netw 71:150–158
  • 33
    • Modares H, Lewis FL, Naghibi-Sistani M-B (2014) Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems. Automatica 50(1):193–202
  • 36
    • Sen S, Sekaran M, Hale J (1994) Learning to coordinate without sharing information. In: AAAI, pp 426–431
  • 37
    • Singh S, Kearns M, Mansour Y (2000) Nash convergence of gradient dynamics in general-sum games. In: Proceedings of the sixteenth conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc., pp 541–548
  • 38
    • Smart WD, Kaelbling LP (2002) Effective reinforcement learning for mobile robots. In: Robotics and automation (ICRA'02), IEEE international conference on, vol. 4, IEEE, pp 3404–3410
  • 40
    • Tan M (1993) Multi-agent reinforcement learning: independent vs. cooperative agents. In: Proceedings of the tenth international conference on machine learning, pp 330–337
  • 41
    • Tesauro G (2004) Extending Q-learning to general adaptive multi-agent systems. In: Advances in neural information processing systems, vol. 16. MIT Press, pp 871–878
  • 43
    • Watkins CJ, Dayan P (1992) Q-learning. Mach Learn 8(3–4):279–292
  • 45
    • Weiss G (1999) Multiagent systems: a modern approach to distributed artificial intelligence. MIT Press
  • 46
    • Wu H-N, Luo B (2012) Neural network based online simultaneous policy update algorithm for solving the HJI equation in nonlinear control. IEEE Trans Neural Netw Learn Syst 23(12):1884–1895
  • 47
    • Ye C, Yung NH, Wang D (2003) A fuzzy controller with supervised learning assisted reinforcement learning algorithm for obstacle avoidance. IEEE Trans Syst Man Cybern Part B Cybern 33(1):17–27
  • 48
    • Zhang C, Lesser VR (2010) Multi-agent learning with policy prediction. In: AAAI


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.