SCOPUS Information Search Platform

IEEE 2005 Symposium on Computational Intelligence and Games, CIG'05

2005, Pages 292-299

A survey on multiagent reinforcement learning towards multi-robot systems

(2) Yang, Erfu (a); Gu, Dongbing (a)

(a) UNIVERSITY OF ESSEX (United Kingdom)

Author keywords

[No Author keywords available]

Indexed keywords

MULTI-AGENT REINFORCEMENT LEARNING; MULTI-ROBOT SYSTEMS; SCALING-UP; THEORETICAL RESEARCH;

INDUSTRIAL ROBOTS; INTELLIGENT ROBOTS; MULTI AGENT SYSTEMS; MULTIPURPOSE ROBOTS; REINFORCEMENT; ROBOT LEARNING; SURVEYS;

REINFORCEMENT LEARNING;

EID: 65149099581 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (16)

References (63)

1
- 0030674885
- Cooperative mobile robotics: Antecedents and directions
- Y. U. Cao, A. S. Fukunaga, and A. B. Kahng, "Cooperative mobile robotics: antecedents and directions, " Auton. Robot., vol. 4, pp. 1-23, 1997.
- (1997) Auton. Robot. , vol.4 , pp. 1-23
- Cao, Y.U.¹ Fukunaga, A.S.² Kahng, A.B.³

2
- 0030647149
- Reinforcement learning in the multi-robot domain
- M. J. Matarić, "Reinforcement learning in the multi-robot domain, " Auton. Robots, vol. 4, pp. 73-83, 1997. (Pubitemid 127508276)
- (1997) Autonomous Robots , vol.4 , Issue.1 , pp. 73-83
- Mataric, M.J.¹

3
- 0032117054
- Learning from history for behavior-based mobile robots in non-stationary conditions
- F. Michaud and M. Matarić, "Learning from history for behavior-based mobile robots in non-stationary conditions, " Auton. Robots, vol. 5, pp. 335-354, 1998. (Pubitemid 128512026)
- (1998) Autonomous Robots , vol.5 , Issue.3-4 , pp. 335-354
- Michaud, F.¹ Mataric, M.J.²

4
- 0032308533
- Behavior-based formation control for multirobot teams
- PII S1042296X98094464
- T. Balch and R. C. Arkin, "Behavior-based formation control for multirobot teams, " IEEE Trans. Robot. Automat., vol. 14, no. 6, pp. 926-939, 1998. (Pubitemid 128743571)
- (1998) IEEE Transactions on Robotics and Automation , vol.14 , Issue.6 , pp. 926-939
- Balch, T.¹ Arkin, R.C.²

5
- 0033148990
- Co-operative behaviour acquisition for mobile robots in dynamically changing real worlds via vision-based reinforcement learning and development
- M. Asada, E. Uchibe, and K. Hosoda, "Co-operative behaviour acquisition for mobile robots in dynamically changing real worlds via vision-based reinforcement learning and development, " Art. Intel., vol. 110, pp. 275-292, 1999.
- (1999) Art. Intel , vol.110 , pp. 275-292
- Asada, M.¹ Uchibe, E.² Hosoda, K.³

6
- 0345073177
- Reinforcement learning soccer teams with incomplete world models
- M. Wiering, R. Salustowicz, and J. Schmidhuber, "Reinforcement learning soccer teams with incomplete world models, " Auton. Robots, vol. 7, pp. 77-88, 1999.
- (1999) Auton. Robots , vol.7 , pp. 77-88
- Wiering, M.¹ Salustowicz, R.² Schmidhuber, J.³

7
- 65149084371
- [Online]. Available: citeseer.nj.nec.com/299778.html
- S. V. Zwaan, J. A. A. Moreira, and P. U. Lima, "Cooperative learning and planning for multiple robots, " 2000. [Online]. Available: citeseer.nj.nec.com/299778.html.
- (2000) Cooperative Learning and Planning for Multiple Robots
- Zwaan, S.V.¹ Moreira, J.A.A.² Lima, P.U.³

8
- 0003481349
- Robot awareness in cooperative mobile robot learning
- C. F. Touzet, "Robot awareness in cooperative mobile robot learning, " Auton. Robots, vol. 8, pp. 87-97, 2000.
- (2000) Auton. Robots , vol.8 , pp. 87-97
- Touzet, C.F.¹

9
- 5644261272
- Learning in large cooperative multi-robot domains
- F. Fernandez and L. E. Parker, "Learning in large co-operative multi-robot domains, " Int. J. Robot. Automat., vol. 16, no. 4, pp. 217-226, 2001. (Pubitemid 34133699)
- (2001) International Journal of Robotics and Automation , vol.16 , Issue.4 , pp. 217-226
- Fernandez, F.¹ Parker, L.E.²

10
- 85056917745
- CRC Press
- J. Liu and J. Wu, Multi-Agent Robotic Systems. CRC Press, 2001.
- (2001) Multi-Agent Robotic Systems
- Liu, J.¹ Wu, J.²

11
- 0002797521
- Learning in behavior-based multi-robot systems: Policies, models, and other agents
- PII S1389041701000171
- M. J. Matarić, "Learning in behavior-based multi-robot systems: policies, models, and other agents, " J. Cogn. Syst. Res., vol. 2, pp. 81-93, 2001. (Pubitemid 33718552)
- (2001) Cognitive Systems Research , vol.2 , Issue.1 , pp. 81-93
- Mataric, M.J.¹

12
- 84880818689
- Simultaneous adversarial multi-robot learning
- August
- M. Bowling and M. Veloso, "Simultaneous adversarial multi-robot learning, " in Proc. 18th Int. Joint Conf. Artificial Intelligence, August 2003.
- (2003) Proc. 18th Int. Joint Conf. Artificial Intelligence
- Bowling, M.¹ Veloso, M.²

13
- 0344465292
- Design and analysis of internet-based tele-coordinated multi-robot systems
- I. H. Elhajj, A. Goradia, N. Xi, et al., "Design and analysis of internet-based tele-coordinated multi-robot systems, " Auton. Robots, vol. 15, pp. 237-254, 2003.
- (2003) Auton. Robots , vol.15 , pp. 237-254
- Elhajj, I.H.¹ Goradia, A.² Xi, N.³

14
- 0141973893
- Distributed coordination in heterogeneous multi-robot systems
- L. Iocchi, D. Nardi, M. Piaggio, et al., "Distributed coordination in heterogeneous multi-robot systems, " Auton. Robots, vol. 15, pp. 155-168, 2003.
- (2003) Auton. Robots , vol.15 , pp. 155-168
- Iocchi, L.¹ Nardi, D.² Piaggio, M.³

15
- 0037338242
- Multi-robot task allocation in uncertain environments
- M. J. Mataric, G. S. Sukhatme, and E. H. Østergaard, "Multi-robot task allocation in uncertain environments, " Auton. Robots, vol. 14, pp. 255-263, 2003.
- (2003) Auton. Robots , vol.14 , pp. 255-263
- Mataric, M.J.¹ Sukhatme, G.S.² Østergaard, E.H.³

16
- 33745834823
- Distributed lazy Q-learning for cooperative mobile robots
- C. F. Touzet, "Distributed lazy Q-learning for cooperative mobile robots, " Int. J. Advanced Robot. Syst., vol. 1, no. 1, pp. 5-13, 2004.
- (2004) Int. J. Advanced Robot. Syst. , vol.1 , Issue.1 , pp. 5-13
- Touzet, C.F.¹

17
- 85149834820
- Markov games as a framework for multi-agent learning
- San Francisco
- M. L. Littman, "Markov games as a framework for multi-agent learning, " in Proc. 11th Int. Conf. Machine Learning, San Francisco, 1994, pp. 157-163.
- (1994) Proc. 11th Int. Conf. Machine Learning , pp. 157-163
- Littman, M.L.¹

18
- 0001961616
- A generalized reinforcement-learning model: Convergence and applications
- Bari, Italy, July 3-6
- M. L. Littman and C. Szepesvári, "A generalized reinforcement-learning model: convergence and applications, " in Proc. 13th Int. Conf. Machine Learning, Bari, Italy, July 3-6 1996, pp. 310-318.
- (1996) Proc. 13th Int. Conf. Machine Learning , pp. 310-318
- Littman, M.L.¹ Szepesvári, C.²

19
- 0003629453
- Generalized markov decision processes: Dynamic-programming and reinforcement-learning algorithms
- Department of Computer Science
- C. Szepesvári and M. L. Littman, "Generalized markov decision processes: dynamic-programming and reinforcement-learning algorithms, " Department of Computer Science, Brown University, Technical report CS-96-11, 1996.
- (1996) Brown University, Technical report CS-96-11
- Szepesvári, C.¹ Littman, M.L.²

20
- 0031630561
- The dynamics of reinforcement learning in cooperative multiagent systems
- Madison, WI
- C. Claus and C. Boutilier, "The dynamics of reinforcement learning in cooperative multiagent systems, " in Proc. 15th National Conf. Artificial Intelligence, Madison, WI, 1998, pp. 746-752.
- (1998) Proc. 15th National Conf. Artificial Intelligence , pp. 746-752
- Claus, C.¹ Boutilier, C.²

21
- 1642321450
- [Online]. Available: citeseer.ist.psu.edu/hu99multiagent.html
- J. Hu and M. P. Wellman, "Multiagent reinforcement learning in stochastic games, " 1999. [Online]. Available: citeseer.ist.psu.edu/hu99multiagent.html.
- (1999) Multiagent Reinforcement Learning in Stochastic Games
- Hu, J.¹ Wellman, M.P.²

22
- 84943276328
- Rationality assumptions and optimality of co-learning
- Springer
- R. Sun and D. Qi, "Rationality assumptions and optimality of co-learning, " Lecture Notes in Computer Science, vol. 1881. Springer, 2000, pp. 61-75.
- (2000) Lecture Notes in Computer Science , vol.1881 , pp. 61-75
- Sun, R.¹ Qi, D.²

23
- 0033570798
- A unified analysis of value-function-based reinforcement learning algorithms
- C. Szepesvári and M. L. Littman, "A unified analysis of value-function-based reinforcement learning algorithms, " Neur. Comput., vol. 11, no. 8, pp. 2017-2059, 1999.
- (1999) Neur. Comput , vol.11 , Issue.8 , pp. 2017-2059
- Szepesvári, C.¹ Littman, M.L.²

24
- 0034205975
- Multiagent systems: A survey from a machine learning perspective
- P. Stone and M. Veloso, "Multiagent systems: a survey from a machine learning perspective, " Auton. Robots, vol. 8, pp. 345-383, 2000.
- (2000) Auton. Robots , vol.8 , pp. 345-383
- Stone, P.¹ Veloso, M.²

25
- 0002550841
- Learning about other agents in a dynamic multiagent system
- PII S138904170100016X
- J. Hu and M. P. Wellman, "Learning about other agents in a dynamic multiagent system, " J. Cogn. Syst. Res., vol. 2, pp. 67-79, 2001. (Pubitemid 33718551)
- (2001) Cognitive Systems Research , vol.2 , Issue.1 , pp. 67-79
- Hu, J.¹ Wellman, M.P.²

26
- 0001547175
- Value-function reinforcement learning in Markov games
- PII S1389041701000158
- M. L. Littman, "Value-function reinforcement learning in markov games, " J. Cogn. Syst. Res., vol. 2, pp. 55-66, 2001. (Pubitemid 33718550)
- (2001) Cognitive Systems Research , vol.2 , Issue.1 , pp. 55-66
- Littman, M.L.¹

27
- 22344437403
- Leading best-response strategies in repeated games
- M. L. Littman and P. Stone, "Leading best-response strategies in repeated games, " in 17th Ann. Int. Joint Conf. Artificial Intelligence Work. Econ. Agents, Models, and Mechanism, 2001.
- (2001) 17th Ann. Int. Joint Conf. Artificial Intelligence Work. Econ. Agents, Models, and Mechanism
- Littman, M.L.¹ Stone, P.²

28
- 0242466944
- Friend-or-foe Q-learning in general-sum games
- Morgan Kaufmann
- M. L. Littman, "Friend-or-foe Q-learning in general-sum games, " in Proc. 18th Int. Conf. Machine Learning, Morgan Kaufmann, 2001, pp. 322-328.
- (2001) Proc. 18th Int. Conf. Machine Learning , pp. 322-328
- Littman, M.L.¹

29
- 0036778915
- The lagging anchor algorithm: Reinforcement learning in two-player zero-sum games with imperfect information
- DOI 10.1023/A:1014063505958
- F. A. Dahl, "The lagging anchor algorithm: reinforcement learning in two-player zero-sum games with imperfect information, " Mach. Learn., vol. 49, pp. 5-37, 2002. (Pubitemid 34325693)
- (2002) Machine Learning , vol.49 , Issue.1 , pp. 5-37
- Dahl, F.A.¹

30
- 0036465258
- Expertness based cooperative Q-learning
- February
- M. N. Ahmadabadi and M. Asadpour, "Expertness based cooperative Q-learning, " IEEE Trans. Syst., Man, and Cyber.-Part B: Cybernetics, vol. 32, no. 1, pp. 66-76, February 2002.
- (2002) IEEE Trans. Syst. Man, and Cyber.-Part B: Cybernetics , vol.32 , Issue.1 , pp. 66-76
- Ahmadabadi, M.N.¹ Asadpour, M.²

31
- 65149097468
- Multiagent reinforcement learning: Stochastic games with multiple learning players
- Department of Computer Science
- G. Chalkiadakis, "Multiagent reinforcement learning: stochastic games with multiple learning players, " Department of Computer Science, University of Toronto, Technical report, 2003.
- (2003) University of Toronto, Technical report
- Chalkiadakis, G.¹

32
- 0036355732
- A multiagent reinforcement learning algorithm using extended optimal response
- N. Suematsu and A. Hayashi, "A multiagent reinforcement learning algorithm using extended optimal response, " in Proc. 1st Int. Joint Conf. Auton. Agents & Multiagent Syst., Bologna, Italy, July 15-19 2002, pp. 370-377. (Pubitemid 34975488)
- (2002) Proceedings of the International Conference on Autonomous Agents , Issue.2 , pp. 370-377
- Suematsu, N.¹ Hayashi, A.²

33
- 1142280924
- Multiagent reinforcement learning: Theoretical framework and an algorithm
- Melbourne, Australia, July 14-18
- G. Chalkiadakis and C. Boutilier, "Multiagent reinforcement learning: theoretical framework and an algorithm, " in 2nd Int. Joint Conf. Auton. Agents & Multiagent Syst., Melbourne, Australia, July 14-18 2003, pp. 709-716.
- (2003) 2nd Int. Joint Conf. Auton. Agents & Multiagent Syst. , pp. 709-716
- Chalkiadakis, G.¹ Boutilier, C.²

34
- 4644369748
- Nash Q-learning for general-sum stochastic games
- J. Hu and M. P. Wellman, "Nash Q-learning for general-sum stochastic games, " J. Mach. Learn. Res., vol. 4, pp. 1039-1069, 2003.
- (2003) J. Mach. Learn. Res. , vol.4 , pp. 1039-1069
- Hu, J.¹ Wellman, M.P.²

35
- 65149098041
- [Online]
- M. Wahab, "Reinforcement learning in multiagent systems, " 2003. [Online]. Available: http://www.cs.mcgill.ca/~mwahab/RL%20in%20MAS.pdf.
- (2003) Reinforcement Learning in Multiagent Systems
- Wahab, M.¹

36
- 0004102479
- Cambridge, Massachusetts: MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, Massachusetts: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

37
- 34249833101
- Q-learning
- C. J. Watkins and P. Dayan, "Q-learning, " Mach. Learn., vol. 8, pp. 279-292, 1992.
- (1992) Mach. Learn. , vol.8 , pp. 279-292
- Watkins, C.J.¹ Dayan, P.²

38
- 0004071782
- London: Academic Press
- T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory. London: Academic Press, 1982.
- (1982) Dynamic Noncooperative Game Theory
- Basar, T.¹ Olsder, G.J.²

39
- 84880690163
- Sequential optimality and coordination in multiagent systems
- C. Boutilier, "Sequential optimality and coordination in multiagent systems, " in Proc. 16th Int. Joint Conf. Artificial Intelligence, 1999, pp. 478-485.
- (1999) Proc. 16th Int. Joint Conf. Artificial Intelligence , pp. 478-485
- Boutilier, C.¹

40
- 34047197046
- [Online]
- M. G. Lagoudakis and R. Parr, "Value function approximation in zero-sum markov games, " 2002. [Online]. Available: http://www.cs.duke.edu/~mgl/papers/PDF/uai2002.pdf.
- (2002) Value Function Approximation in Zero-sum Markov Games
- Lagoudakis, M.G.¹ Parr, R.²

41
- 65149094534
- [Online]
- X. Li, "Refining basis functions in least-square approximation of zero-sum markov games, " 2003. [Online]. Available: http://www.xiaolei.org/research/li03basis.pdf.
- (2003) Refining Basis Functions in Least-square Approximation of Zero-sum Markov Games
- Li, X.¹

42
- 4544279348
- [Online]
- Y. Shoham and R. Powers, "Multi-agent reinforcement learning: a critical survey, " 2003. [Online]. Available: http://www.stanford.edu/~grenager/MALearning_ACriticalSurvey_2003_0516.pdf.
- (2003) Multi-agent Reinforcement Learning: A Critical Survey
- Shoham, Y.¹ Powers, R.²

43
- 0000929496
- Multiagent reinforcement learning: Theoretical framework and an algorithm
- San Francisco, California
- J. Hu and M. P. Wellman, "Multiagent reinforcement learning: theoretical framework and an algorithm, " in Proc. the 15th Int. Conf. Machine Learning, San Francisco, California, 1998, pp. 242-250.
- (1998) Proc. The 15th Int. Conf. Machine Learning , pp. 242-250
- Hu, J.¹ Wellman, M.P.²

44
- 22944447799
- Ph.D. dissertation, School of Computer Science, Carnegie Mellon University, Pittsburgh, May
- M. Bowling, "Multiagent learning in the presence of agents with limitations, " Ph.D. dissertation, School of Computer Science, Carnegie Mellon University, Pittsburgh, May 2003.
- (2003) Multiagent Learning in the Presence of Agents with Limitations
- Bowling, M.¹

45
- 0036531878
- Multiagent learning using a variable learning rate
- M. H. Bowling and M. M. Veloso, "Multiagent learning using a variable learning rate, " Art. Intell., vol. 136, no. 2, pp. 215-250, 2002.
- (2002) Art. Intell. , vol.136 , Issue.2 , pp. 215-250
- Bowling, M.H.¹ Veloso, M.M.²

46
- 1142305722
- Convergent gradient ascent in general-sum games
- August 13-19
- B. Banerjee and J. Peng, "Convergent gradient ascent in general-sum games, " in Proc. 13th Europ. Conf. Mach. Learn., August 13-19 2002, pp. 686-692.
- (2002) Proc. 13th Europ. Conf. Mach. Learn. , pp. 686-692
- Banerjee, B.¹ Peng, J.²

47
- 1142280919
- Adaptive policy gradient in multiagent learning
- ACM Press
- B. Banerjee and J. Peng, "Adaptive policy gradient in multiagent learning, " in Proc. 2nd int. joint conf. Auton. agents and multiagent systems. ACM Press, 2003, pp. 686-692.
- (2003) Proc. 2nd Int. Joint Conf. Auton. Agents and Multiagent Systems , pp. 686-692
- Banerjee, B.¹ Peng, J.²

48
- 84898939480
- Policy gradient methods for reinforcement learning with function approximation
- MIT Press
- R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation, " in Advanc. Neur. Inf. Proc. Syst., vol. 12. MIT Press, pp. 1057-1063.
- Advanc. Neur. Inf. Proc. Syst. , vol.12 , pp. 1057-1063
- Sutton, R.S.¹ McAllester, D.² Singh, S.³ Mansour, Y.⁴

49
- 4444326434
- Scaling up reinforcement learning with a relational representation
- Sydney
- E. F. Morales, "Scaling up reinforcement learning with a relational representation, " in Workshop Adaptabil. Multi-Agent Syst., Sydney, 2003.
- (2003) Workshop Adaptabil. Multi-Agent Syst.
- Morales, E.F.¹

50
- 0004247096
- Cambridge, Massachusetts: MIT Press
- D. Fudenberg and D. K. Levine, The Theory of Learning in Games. Cambridge, Massachusetts: MIT Press, 1999.
- (1999) The Theory of Learning in Games
- Fudenberg, D.¹ Levine, D.K.²

51
- 78649498458
- [Online]. Available: citeseer.ist.psu.edu/sen96multiagent.html
- S. Sen and M. Sekaran, "Multiagent coordination with learning classifier systems, " [Online]. Available: citeseer.ist.psu.edu/sen96multiagent.html.
- Multiagent Coordination with Learning Classifier Systems
- Sen, S.¹ Sekaran, M.²

52
- 65149102375
- [Online]
- S. Valluri and S. Babu, "Reinforcement learning for keepaway soccer problem, " 2002. [Online]. Available: http://www.cis.ksu.edu/~babu/final/html/ProjectReport.htm.
- (2002) Reinforcement Learning for Keepaway Soccer Problem
- Valluri, S.¹ Babu, S.²

53
- 0035558808
- KaBaGe-RL: Kanerva-based generalisation and reinforcement learning for possession football
- Hawaii
- K. Kostiadis and H. Hu, "KaBaGe-RL: kanerva-based generalisation and reinforcement learning for possession football, " in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, Hawaii, 2001.
- (2001) Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems
- Kostiadis, K.¹ Hu, H.²

54
- 0035978635
- Modular Q-learning based multi-agent cooperation for robot soccer
- DOI 10.1016/S0921-8890(01)00114-2, PII S0921889001001142
- K.-H. Park, Y.-J. Kim, and J.-H. Kim, "Modular Q-learning based multi-agent cooperation for robot soccer, " Robot. Auton. Syst., vol. 35, pp. 109-122, 2001. (Pubitemid 32427408)
- (2001) Robotics and Autonomous Systems , vol.35 , Issue.2 , pp. 109-122
- Park, K.-H.¹ Kim, Y.-J.² Kim, J.-H.³

55
- 0033280134
- Cooperation and coordination between fuzzy reinforcement learning agents in continuous-state partially observable markov decision processes
- H. R. Berenji and D. A. Vengerov, "Cooperation and coordination between fuzzy reinforcement learning agents in continuous-state partially observable markov decision processes, " in Proc. 8th IEEE Int. Conf. Fuzzy Systems, 2000.
- (2000) Proc. 8th IEEE Int. Conf. Fuzzy Systems
- Berenji, H.R.¹ Vengerov, D.A.²

56
- 0033685787
- Advantages of cooperation between reinforcement learning agents in difficult stochastic problems
- H. R. Berenji and D. A. Vengerov, "Advantages of cooperation between reinforcement learning agents in difficult stochastic problems, " in Proc. 9th IEEE Int. Conf. Fuzzy Systems, 2000.
- (2000) Proc. 9th IEEE Int. Conf. Fuzzy Systems
- Berenji, H.R.¹ Vengerov, D.A.²

57
- 80053643391
- [Online]
- H. R. Berenji and D. Vengerov, "Generalized markov decision processes: dynamic-programming and reinforcement-learning algorithms, " [Online]. Available: http://www.iiscorp.com/projects/multi-agent/tech_rep_iis_00_10.pdf.
- Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms
- Berenji, H.R.¹ Vengerov, D.²

58
- 80053654761
- Modular-fuzzy cooperative algorithm for multiagent systems
- Springer
- I. Gültekin and A. Arslan, "Modular-fuzzy cooperative algorithm for multiagent systems, " Lecture Notes in Computer Science, vol. 2457. Springer, 2002, pp. 255-263.
- (2002) Lecture Notes in Computer Science , vol.2457 , pp. 255-263
- Gültekin, I.¹ Arslan, A.²

59
- 80053654030
- Minimax fuzzy Q-learning in cooperative multi-agent systems
- Springer
- A. Kilic and A. Arslan, "Minimax fuzzy Q-learning in cooperative multi-agent systems, " Lecture Notes in Computer Science, vol. 2457. Springer, 2002, pp. 264-272.
- (2002) Lecture Notes in Computer Science , vol.2457 , pp. 264-272
- Kilic, A.¹ Arslan, A.²

60
- 0041877717
- A convergent actor-critic-based frl algorithm with application to power management of wireless transmitters
- August
- H. R. Berenji and D. Vengerov, "A convergent Actor-Critic-based FRL algorithm with application to power management of wireless transmitters, " IEEE Trans. Fuzz. Syst., vol. 11, no. 4, pp. 478-485, August 2003.
- (2003) IEEE Trans. Fuzz. Syst. , vol.11 , Issue.4 , pp. 478-485
- Berenji, H.R.¹ Vengerov, D.²

61
- 0034274415
- A study of reinforcement learning in the continuous case by the means of viscosity solutions
- R. Munos, "A study of reinforcement learning in the continuous case by the means of viscosity solutions, " Mach. Learn., vol. 40, pp. 265-299, 2000.
- (2000) Mach. Learn. , vol.40 , pp. 265-299
- Munos, R.¹

62
- 0034313638
- Multiagent reinforcement learning using function approximation
- DOI 10.1109/5326.897075
- O. Abul, F. Polat, and R. Alhajj, "Multiagent reinforcement learning using function approximation, " IEEE Trans. Syst., Man, and Cybern.-Part C: Appli.n and Rev., vol. 30, no. 4, pp. 485-497, 2000. (Pubitemid 32190905)
- (2000) IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews , vol.30 , Issue.4 , pp. 485-497
- Abul, O.¹ Polat, F.² Alhajj, R.³

63
- 0035283402
- On the convergence of temporal-difference learning with linear function approximation
- DOI 10.1023/A:1007609817671
- V. Tadić, "On the convergence of temporal-difference learning with linear function approximation, " Mach. Learn., vol. 42, pp. 241-267, 2001. (Pubitemid 32188797)
- (2001) Machine Learning , vol.42 , Issue.3 , pp. 241-267
- Tadic, V.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.