1. Baird, L.C., Moore, A.W.: Gradient descent for general reinforcement learning. In: Advances in Neural Information Processing Systems, vol. 11. MIT Press, Cambridge, MA (1995)
2. Barto, A.G., Sutton, R.S., Anderson, C.W.: Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. 13(5), 834-846 (1983)
3. Baxter, J., Bartlett, P.L.: Infinite-horizon policy-gradient estimation. J. Artif. Intell. Res. 15, 319-350 (2001)
4. Berenji, H.R., Khedkar, P.: Learning and tuning fuzzy logic controllers through reinforcements. IEEE Trans. Neural Netw. 3(5), 724-740 (1992)
5. Berenji, H.R., Vengerov, D.: A convergent actor-critic-based fuzzy reinforcement learning algorithm with application to power management of wireless transmitters. IEEE Trans. Fuzzy Syst. 11(4), 478-485 (2003)
6. Bowling, M., Veloso, M.: Multiagent learning using a variable learning rate. Artif. Intell. 136(2), 215-250 (2002)
7. Grudic, G.Z., Kumar, V., Ungar, L.: Using policy gradient reinforcement learning on autonomous robot controllers. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, Nevada, pp. 406-411 (2003)
8. Hu, J., Wellman, M.P.: Nash Q-learning for general-sum stochastic games. J. Mach. Learn. Res. 4, 1039-1069 (2003)
9. Kimura, H., Yamamura, M., Kobayashi, S.: Reinforcement learning by stochastic hill climbing on discounted reward. In: Proceedings of the 12th International Conference on Machine Learning, California, pp. 152-160 (1995)
13. Littman, M.L.: Value-function reinforcement learning in Markov games. Cogn. Syst. Res. 2(1), 55-66 (2000)
14. Olfati-Saber, R.: Flocking for multi-agent dynamic systems: Algorithms and theory. IEEE Trans. Automat. Contr. 51(3), 401-420 (2006)
15. Peshkin, L., Kim, K., Meuleau, N., Kaelbling, L.P.: Learning to cooperate via policy search. In: Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 307-314 (2000)
16. Reynolds, C.W.: Flocks, herds, and schools: A distributed behavioural model. Comput. Graph. 21(4), 25-34 (1987)
17. Singh, S., Kearns, M., Mansour, Y.: Nash convergence of gradient dynamics in general-sum games. In: Proceedings of the 16th Annual Conference on Uncertainty in Artificial Intelligence (UAI), Stanford University, Stanford, CA, pp. 541-548 (2000)
18. Sutton, R.S., McAllester, D., Singh, S., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation. In: Advances in Neural Information Processing Systems, vol. 12, pp. 1057-1063. MIT Press (2000)
20. Tao, N., Baxter, J., Weaver, L.: A multi-agent policy-gradient approach to network routing. In: Proceedings of the 18th International Conference on Machine Learning, Williamstown, MA, pp. 553-560 (2001)
21. Tedrake, R., Zhang, T., Seung, H.: Stochastic policy gradient reinforcement learning on a simple 3D biped. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sendai, Japan, pp. 2849-2854 (2004)
22. Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229-256 (1992)
23. Yang, E., Gu, D., Hu, H.: Nonsingular formation control of cooperative mobile robots via feedback linearization. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Edmonton, Canada, pp. 3652-3657 (2005)