Volume 48, Issue 1, 2007, Pages 7-22

Fuzzy policy reinforcement learning in cooperative multi-robot systems

Author keywords

Cooperative control; Flocking behavior; Multi agent reinforcement learning; Policy gradient reinforcement learning

Indexed keywords

COMPUTER SIMULATION; COMPUTER SUPPORTED COOPERATIVE WORK; FUZZY SETS; LEARNING ALGORITHMS; PARAMETER ESTIMATION;

EID: 33846038724     PISSN: 09210296     EISSN: 15730409     Source Type: Journal    
DOI: 10.1007/s10846-006-9103-z     Document Type: Conference Paper
Times cited : (19)

References (23)
  • 1
    • 33846054812
    • Gradient descent for general reinforcement learning
    • MIT, Cambridge, MA
    • Baird, L.C., Moore, A.W.: Gradient descent for general reinforcement learning. In: Advances in Neural Information Processing Systems, vol. 11, MIT, Cambridge, MA (1995)
    • (1995) Advances in Neural Information Processing Systems , vol.11
    • Baird, L.C.1    Moore, A.W.2
  • 2
    • 0020970738
    • Neuronlike adaptive elements that can solve difficult learning control problems
    • Barto, A.G., Sutton, R.S., Anderson, C.W.: Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. SMC 13(5), 834-846 (1983)
    • (1983) IEEE Trans. SMC , vol.13 , Issue.5 , pp. 834-846
    • Barto, A.G.1    Sutton, R.S.2    Anderson, C.W.3
  • 3
    • 0013535965
    • Infinite-horizon policy-gradient estimation
    • Baxter, J., Bartlett, P.L.: Infinite-horizon policy-gradient estimation. J. Artif. Intell. Res. 15, 319-350 (2001)
    • (2001) J. Artif. Intell. Res , vol.15 , pp. 319-350
    • Baxter, J.1    Bartlett, P.L.2
  • 4
    • 0026923465
    • Learning and tuning fuzzy logic controllers through reinforcements
    • Berenji, H.R., Khedkar, P.: Learning and tuning fuzzy logic controllers through reinforcements. IEEE Trans. Neural Netw. 3(5), 724-740 (1992)
    • (1992) IEEE Trans. Neural Netw , vol.3 , Issue.5 , pp. 724-740
    • Berenji, H.R.1    Khedkar, P.2
  • 5
    • 0041877717
    • A convergent actor critic based fuzzy reinforcement learning algorithm with application to power management of wireless transmitters
    • Berenji, H.R., Vengerov, D.: A convergent actor critic based fuzzy reinforcement learning algorithm with application to power management of wireless transmitters. IEEE Trans. Fuzzy Systems. 11(4), 478-485 (2003)
    • (2003) IEEE Trans. Fuzzy Systems , vol.11 , Issue.4 , pp. 478-485
    • Berenji, H.R.1    Vengerov, D.2
  • 6
    • 0036531878
    • Multiagent learning using a variable learning rate
    • Bowling, M., Veloso, M.: Multiagent learning using a variable learning rate. Artif. Intell. 136, 215-250 (2002)
    • (2002) Artif. Intell , vol.136 , pp. 215-250
    • Bowling, M.1    Veloso, M.2
  • 8
    • 4644369748
    • Nash Q-learning for general-sum stochastic games
    • Hu, J., Wellman, M.P.: Nash Q-learning for general-sum stochastic games. J. Mach. Learn. Res. 4, 1039-1069 (2003)
    • (2003) J. Mach. Learn. Res , vol.4 , pp. 1039-1069
    • Hu, J.1    Wellman, M.P.2
  • 13
    • 0001547175
    • Value-function reinforcement learning in Markov games
    • Littman, M.L.: Value-function reinforcement learning in Markov games. Cogn. Syst. Res. 2(1), 55-66 (2000)
  • 14
    • 33846070940
    • Flocking for multi-agent dynamic systems: Algorithms and theory
    • Olfati-Saber, R.: Flocking for multi-agent dynamic systems: Algorithms and theory. IEEE Trans. Automat. Contr. 19(6), 933-941 (2006)
    • (2006) IEEE Trans. Automat. Contr , vol.19 , Issue.6 , pp. 933-941
    • Olfati-Saber, R.1
  • 16
    • 0023379184
    • Flocks, herds, and schools: A distributed behavioural model
    • Reynolds, C.W.: Flocks, herds, and schools: A distributed behavioural model. Comput. Graph. 21(4), 25-34 (1987)
    • (1987) Comput. Graph , vol.21 , Issue.4 , pp. 25-34
    • Reynolds, C.W.1
  • 18
    • 84898939480
    • Policy gradient methods for reinforcement learning with function approximation
    • Sutton, R.S., McAllester, D., Singh, S., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation. Adv. Neural Inf. Process. Syst. 12, 1057-1063 (2000) (MIT)
  • 22
    • 0000337576
    • Simple statistical gradient-following algorithms for connectionist reinforcement learning
    • Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229-256 (1992)
    • (1992) Mach. Learn , vol.8 , pp. 229-256
    • Williams, R.J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.