SCOPUS information retrieval platform

Studies in Computational Intelligence

Volume 310, 2010, Pages 183-221

Multi-agent reinforcement learning: An overview

Authors (3): Buşoniu, Lucian ᵃ; Babuška, Robert ᵃ; De Schutter, Bart ᵃ

ᵃ Delft University of Technology (Netherlands)

Author keywords

[No Author keywords available]

EID: 77956317028 PISSN: 1860949X EISSN: None Source Type: Book Series
DOI: 10.1007/978-3-642-14435-6_7 Document Type: Article

Times cited : (679)

References (144)

1
- 0034313638
- Multiagent reinforcement learning using function approximation
- Abul, O., Polat, F., Alhajj, R.: Multiagent reinforcement learning using function approximation. IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews 30(4), 485-497 (2000)
- (2000) IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews , vol.30 , Issue.4 , pp. 485-497
- Abul, O.¹ Polat, F.² Alhajj, R.³

2
- 0003435075
- Oxford University Press, Oxford
- Bäck, T.: Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. Oxford University Press, Oxford (1996)
- (1996) Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms
- Bäck, T.¹

3
- 0004071782
- 2nd edn. Society for Industrial and Applied Mathematics, SIAM
- Başar, T., Olsder, G.J.: Dynamic Noncooperative Game Theory, 2nd edn. Society for Industrial and Applied Mathematics, SIAM (1999)
- (1999) Dynamic Noncooperative Game Theory
- Başar, T.¹ Olsder, G.J.²

4
- 79955953465
- Cooperative multi-agent reinforcement learning of traffic lights
- Porto, Portugal
- Bakker, B., Steingrover, M., Schouten, R., Nijhuis, E., Kester, L.: Cooperative multi-agent reinforcement learning of traffic lights. In: Workshop on Cooperative Multi-Agent Learning, 16th European Conference on Machine Learning (ECML-2005), Porto, Portugal (2005)
- (2005) Workshop on Cooperative Multi-Agent Learning, 16th European Conference on Machine Learning (ECML-2005)
- Bakker, B.¹ Steingrover, M.² Schouten, R.³ Nijhuis, E.⁴ Kester, L.⁵

5
- 1142280919
- Adaptive policy gradient in multiagent learning
- Melbourne, Australia
- Banerjee, B., Peng, J.: Adaptive policy gradient in multiagent learning. In: Proceedings 2nd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2003), Melbourne, Australia, pp. 686-692 (2003)
- (2003) Proceedings 2nd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2003) , pp. 686-692
- Banerjee, B.¹ Peng, J.²

6
- 0020970738
- Neuronlike adaptive elements that can solve difficult learning control problems
- Barto, A.G., Sutton, R.S., Anderson, C.W.: Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics 13(5), 833-846 (1983)
- (1983) IEEE Transactions on Systems, Man, and Cybernetics , vol.13 , Issue.5 , pp. 833-846
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

7
- 0041877717
- A convergent actor-critic-based FRL algorithm with application to power management of wireless transmitters
- Berenji, H.R., Vengerov, D.: A convergent actor-critic-based FRL algorithm with application to power management of wireless transmitters. IEEE Transactions on Fuzzy Systems 11(4), 478-485 (2003)
- (2003) IEEE Transactions on Fuzzy Systems , vol.11 , Issue.4 , pp. 478-485
- Berenji, H.R.¹ Vengerov, D.²

8
- 0003565783
- 3rd edn. Athena Scientific
- Bertsekas, D.P.: Dynamic Programming and Optimal Control, 3rd edn., vol. 2. Athena Scientific (2007)
- (2007) Dynamic Programming and Optimal Control , vol.2
- Bertsekas, D.P.¹

9
- 0003487482
- Athena Scientific
- Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-Dynamic Programming. Athena Scientific (1996)
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

10
- 13244278201
- An actor-critic algorithm for constrained Markov decision processes
- Borkar, V.: An actor-critic algorithm for constrained Markov decision processes. Systems & Control Letters 54(3), 207-213 (2005)
- (2005) Systems & Control Letters , vol.54 , Issue.3 , pp. 207-213
- Borkar, V.¹

11
- 0002500351
- Planning, learning and coordination in multiagent decision processes
- De Zeeuwse Stromen, The Netherlands
- Boutilier, C.: Planning, learning and coordination in multiagent decision processes. In: Proceedings 6th Conference on Theoretical Aspects of Rationality and Knowledge (TARK-1996), pp. 195-210. De Zeeuwse Stromen, The Netherlands (1996)
- (1996) Proceedings 6th Conference on Theoretical Aspects of Rationality and Knowledge (TARK-1996) , pp. 195-210
- Boutilier, C.¹

12
- 0003091684
- Convergence problems of general-sum multiagent reinforcement learning
- Stanford University, US
- Bowling, M.: Convergence problems of general-sum multiagent reinforcement learning. In: Proceedings 17th International Conference on Machine Learning (ICML-2000), Stanford University, US, pp. 89-94 (2000)
- (2000) Proceedings 17th International Conference on Machine Learning (ICML-2000) , pp. 89-94
- Bowling, M.¹

13
- 22944447799
- Ph.D. thesis, Computer Science Dept., Carnegie Mellon University, Pittsburgh, US
- Bowling, M.: Multiagent learning in the presence of agents with limitations. Ph.D. thesis, Computer Science Dept., Carnegie Mellon University, Pittsburgh, US (2003)
- (2003) Multiagent Learning in the Presence of Agents with Limitations
- Bowling, M.¹

14
- 84899027977
- Convergence and no-regret in multiagent learning
- Saul, L.K., Weiss, Y., Bottou, L. (eds.), MIT Press, Cambridge
- Bowling, M.: Convergence and no-regret in multiagent learning. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems 17, pp. 209-216. MIT Press, Cambridge (2005)
- (2005) Advances in Neural Information Processing Systems , vol.17 , pp. 209-216
- Bowling, M.¹

15
- 0003863106
- An analysis of stochastic game theory for multiagent reinforcement learning
- Carnegie Mellon University, Pittsburgh, US
- Bowling, M., Veloso, M.: An analysis of stochastic game theory for multiagent reinforcement learning. Tech. rep., Computer Science Dept., Carnegie Mellon University, Pittsburgh, US (2000), http://www.cs.ualberta.ca/~bowling/papers/00tr.pdf
- (2000) Tech. Rep., Computer Science Dept.
- Bowling, M.¹ Veloso, M.²

16
- 84880865940
- Rational and convergent learning in stochastic games
- San Francisco, US
- Bowling, M., Veloso, M.: Rational and convergent learning in stochastic games. In: Proceedings 17th International Joint Conference on Artificial Intelligence (IJCAI-2001), San Francisco, US, pp. 1021-1026 (2001)
- (2001) Proceedings 17th International Joint Conference on Artificial Intelligence (IJCAI-2001) , pp. 1021-1026
- Bowling, M.¹ Veloso, M.²

17
- 0036531878
- Multiagent learning using a variable learning rate
- Bowling, M., Veloso, M.: Multiagent learning using a variable learning rate. Artificial Intelligence 136(2), 215-250 (2002)
- (2002) Artificial Intelligence , vol.136 , Issue.2 , pp. 215-250
- Bowling, M.¹ Veloso, M.²

18
- 0000719863
- Packet routing in dynamically changing networks: A reinforcement learning approach
- Moody, J. (ed.). Morgan Kaufmann, San Francisco
- Boyan, J.A., Littman, M.L.: Packet routing in dynamically changing networks: A reinforcement learning approach. In: Moody, J. (ed.) Advances in Neural Information Processing Systems 6, pp. 671-678. Morgan Kaufmann, San Francisco (1994)
- (1994) Advances in Neural Information Processing Systems , vol.6 , pp. 671-678
- Boyan, J.A.¹ Littman, M.L.²

19
- 0002672918
- Iterative solutions of games by fictitious play
- Koopmans, T.C. (ed.). ch. XXIV, Wiley, Chichester
- Brown, G.W.: Iterative solutions of games by fictitious play. In: Koopmans, T.C. (ed.) Activity Analysis of Production and Allocation, ch. XXIV, pp. 374-376. Wiley, Chichester (1951)
- (1951) Activity Analysis of Production and Allocation , pp. 374-376
- Brown, G.W.¹

20
- 40949147745
- A comprehensive survey of multi-agent reinforcement learning
- Buşoniu, L., Babuška, R., De Schutter, B.: A comprehensive survey of multi-agent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics. Part C: Applications and Reviews 38(2), 156-172 (2008)
- (2008) IEEE Transactions on Systems, Man, and Cybernetics. Part C: Applications and Reviews , vol.38 , Issue.2 , pp. 156-172
- Buşoniu, L.¹ Babuška, R.² De Schutter, B.³

21
- 84873428767
- Multiagent reinforcement learning with adaptive state focus
- Brussels, Belgium
- Buşoniu, L., De Schutter, B., Babuška, R.: Multiagent reinforcement learning with adaptive state focus. In: Proceedings 17th Belgian-Dutch Conference on Artificial Intelligence (BNAIC-2005), Brussels, Belgium, pp. 35-42 (2005)
- (2005) Proceedings 17th Belgian-Dutch Conference on Artificial Intelligence (BNAIC-2005) , pp. 35-42
- Buşoniu, L.¹ De Schutter, B.² Babuška, R.³

22
- 34547223380
- Decentralized reinforcement learning control of a robotic manipulator
- Singapore
- Buşoniu, L., De Schutter, B., Babuška, R.: Decentralized reinforcement learning control of a robotic manipulator. In: Proceedings 9th International Conference of Control, Automation, Robotics, and Vision (ICARCV-2006), Singapore, pp. 1347-1352 (2006)
- (2006) Proceedings 9th International Conference of Control, Automation, Robotics, and Vision (ICARCV-2006) , pp. 1347-1352
- Buşoniu, L.¹ De Schutter, B.² Babuška, R.³

23
- 77950350393
- Approximate dynamic programming and reinforcement learning
- Babuška, R., Groen, F.C.A. (eds.), Springer, Heidelberg
- Buşoniu, L., De Schutter, B., Babuška, R.: Approximate dynamic programming and reinforcement learning. In: Babuška, R., Groen, F.C.A. (eds.) Interactive Collaborative Information Systems. Studies in Computational Intelligence, vol. 281, pp. 3-44. Springer, Heidelberg (2010)
- (2010) Interactive Collaborative Information Systems. Studies in Computational Intelligence , vol.281 , pp. 3-44
- Buşoniu, L.¹ De Schutter, B.² Babuška, R.³

24
- 34548099216
- Shaping multi-agent systems with gradient reinforcement learning
- Buffet, O., Dutech, A., Charpillet, F.: Shaping multi-agent systems with gradient reinforcement learning. Autonomous Agents and Multi-Agent Systems 15(2), 197-220 (2007)
- (2007) Autonomous Agents and Multi-Agent Systems , vol.15 , Issue.2 , pp. 197-220
- Buffet, O.¹ Dutech, A.² Charpillet, F.³

25
- 2642545776
- Opponent modeling in multi-agent systems
- Weiß, G., Sen, S. (eds.), ch. 3, Springer, Heidelberg
- Carmel, D., Markovitch, S.: Opponent modeling in multi-agent systems. In: Weiß, G., Sen, S. (eds.) Adaptation and Learning in Multi-Agent Systems, ch. 3, pp. 40-52. Springer, Heidelberg (1996)
- (1996) Adaptation and Learning in Multi-Agent Systems , pp. 40-52
- Carmel, D.¹ Markovitch, S.²

26
- 65149097468
- Multiagent reinforcement learning: Stochastic games with multiple learning players
- University of Toronto, Canada
- Chalkiadakis, G.: Multiagent reinforcement learning: Stochastic games with multiple learning players. Tech. rep., Dept. of Computer Science, University of Toronto, Canada (2003), http://www.cs.toronto.edu/~gehalk/DepthReport/DepthReport.ps
- (2003) Tech. Rep., Dept. of Computer Science
- Chalkiadakis, G.¹

27
- 0003890671
- Wiley, Chichester
- Cherkassky, V., Mulier, F.: Learning from Data: Concepts, Theory, and Methods. Wiley, Chichester (1998)
- (1998) Learning from Data: Concepts, Theory, and Methods
- Cherkassky, V.¹ Mulier, F.²

28
- 85156238953
- Predictive Q-routing: A memory-based reinforcement learning approach to adaptive traffic control
- Touretzky, D.S., Mozer, M., Hasselmo, M.E. (eds.), MIT Press, Cambridge
- Choi, S.P.M., Yeung, D.Y.: Predictive Q-routing: A memory-based reinforcement learning approach to adaptive traffic control. In: Touretzky, D.S., Mozer, M., Hasselmo, M.E. (eds.) Advances in Neural Information Processing Systems 8, pp. 945-951. MIT Press, Cambridge (1995)
- (1995) Advances in Neural Information Processing Systems , vol.8 , pp. 945-951
- Choi, S.P.M.¹ Yeung, D.Y.²

29
- 0031630561
- The dynamics of reinforcement learning in cooperative multiagent systems
- Madison, US
- Claus, C., Boutilier, C.: The dynamics of reinforcement learning in cooperative multiagent systems. In: Proceedings 15th National Conference on Artificial Intelligence and 10th Conference on Innovative Applications of Artificial Intelligence (AAAI/IAAI-1998), Madison, US, pp. 746-752 (1998)
- (1998) Proceedings 15th National Conference on Artificial Intelligence and 10th Conference on Innovative Applications of Artificial Intelligence (AAAI/IAAI-1998) , pp. 746-752
- Claus, C.¹ Boutilier, C.²

30
- 77956330149
- Learning from an automated training agent
- Tahoe City, US
- Clouse, J.: Learning from an automated training agent. In: Working Notes Workshop on Agents that Learn from Other Agents, 12th International Conference on Machine Learning (ICML-1995), Tahoe City, US (1995)
- (1995) Working Notes Workshop on Agents That Learn from Other Agents, 12th International Conference on Machine Learning (ICML-1995)
- Clouse, J.¹

31
- 1942421183
- AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents
- Washington, US
- Conitzer, V., Sandholm, T.: AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents. In: Proceedings 20th International Conference on Machine Learning (ICML-2003), Washington, US, pp. 83-90 (2003)
- (2003) Proceedings 20th International Conference on Machine Learning (ICML-2003) , pp. 83-90
- Conitzer, V.¹ Sandholm, T.²

32
- 85156187730
- Improving elevator performance using reinforcement learning
- Touretzky, D.S., Mozer, M.C., Hasselmo, M.E. (eds.), MIT Press, Cambridge
- Crites, R.H., Barto, A.G.: Improving elevator performance using reinforcement learning. In: Touretzky, D.S., Mozer, M.C., Hasselmo, M.E. (eds.) Advances in Neural Information Processing Systems 8, pp. 1017-1023. MIT Press, Cambridge (1996)
- (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1017-1023
- Crites, R.H.¹ Barto, A.G.²

33
- 0032208335
- Elevator group control using multiple reinforcement learning agents
- Crites, R.H., Barto, A.G.: Elevator group control using multiple reinforcement learning agents. Machine Learning 33(2-3), 235-262 (1998)
- (1998) Machine Learning , vol.33 , Issue.2-3 , pp. 235-262
- Crites, R.H.¹ Barto, A.G.²

34
- 21844465127
- Tree-based batch mode reinforcement learning
- Ernst, D., Geurts, P., Wehenkel, L.: Tree-based batch mode reinforcement learning. Journal of Machine Learning Research 6, 503-556 (2005)
- (2005) Journal of Machine Learning Research , vol.6 , pp. 503-556
- Ernst, D.¹ Geurts, P.² Wehenkel, L.³

35
- 5644261272
- Learning in large cooperative multi-robot systems
- Fernández, F., Parker, L.E.: Learning in large cooperative multi-robot systems. International Journal of Robotics and Automation, Special Issue on Computational Intelligence Techniques in Cooperative Robots 16(4), 217-226 (2001)
- (2001) International Journal of Robotics and Automation, Special Issue on Computational Intelligence Techniques in Cooperative Robots , vol.16 , Issue.4 , pp. 217-226
- Fernández, F.¹ Parker, L.E.²

36
- 84947918810
- A game-theoretic approach to the simple coevolutionary algorithm
- Deb, K., Rudolph, G., Lutton, E., Merelo, J.J., Schoenauer, M., Schwefel, H.-P., Yao, X. (eds.) PPSN 2000. Springer, Heidelberg
- Ficici, S.G., Pollack, J.B.: A game-theoretic approach to the simple coevolutionary algorithm. In: Deb, K., Rudolph, G., Lutton, E., Merelo, J.J., Schoenauer, M., Schwefel, H.-P., Yao, X. (eds.) PPSN 2000. LNCS, vol. 1917, pp. 467-476. Springer, Heidelberg (2000)
- (2000) LNCS , vol.1917 , pp. 467-476
- Ficici, S.G.¹ Pollack, J.B.²

37
- 4544220380
- Hierarchical reinforcement learning in communication-mediated multiagent coordination
- New York, US
- Fischer, F., Rovatsos, M., Weiss, G.: Hierarchical reinforcement learning in communication-mediated multiagent coordination. In: Proceedings 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2004), New York, US, pp. 1334-1335 (2004)
- (2004) Proceedings 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2004) , pp. 1334-1335
- Fischer, F.¹ Rovatsos, M.² Weiss, G.³

38
- 33745586802
- Structural abstraction experiments in reinforcement learning
- Zhang, S., Jarvis, R.A. (eds.) AI 2005. Springer, Heidelberg
- Fitch, R., Hengst, B., Suc, D., Calbert, G., Scholz, J.B.: Structural abstraction experiments in reinforcement learning. In: Zhang, S., Jarvis, R.A. (eds.) AI 2005. LNCS (LNAI), vol. 3809, pp. 164-175. Springer, Heidelberg (2005)
- (2005) LNCS (LNAI) , vol.3809 , pp. 164-175
- Fitch, R.¹ Hengst, B.² Suc, D.³ Calbert, G.⁴ Scholz, J.B.⁵

39
- 0004247096
- MIT Press, Cambridge
- Fudenberg, D., Levine, D.K.: The Theory of Learning in Games. MIT Press, Cambridge (1998)
- (1998) The Theory of Learning in Games
- Fudenberg, D.¹ Levine, D.K.²

40
- 33846942607
- Hierarchical multi-agent reinforcement learning
- Ghavamzadeh, M., Mahadevan, S., Makar, R.: Hierarchical multi-agent reinforcement learning. Autonomous Agents and Multi-Agent Systems 13(2), 197-229 (2006)
- (2006) Autonomous Agents and Multi-Agent Systems , vol.13 , Issue.2 , pp. 197-229
- Ghavamzadeh, M.¹ Mahadevan, S.² Makar, R.³

41
- 33845529505
- Reinforcement learning: An overview
- Aachen, Germany
- Glorennec, P.Y.: Reinforcement learning: An overview. In: Proceedings European Symposium on Intelligent Techniques (ESIT-2000), Aachen, Germany, pp. 17-35 (2000)
- (2000) Proceedings European Symposium on Intelligent Techniques (ESIT-2000) , pp. 17-35
- Glorennec, P.Y.¹

42
- 1942517280
- Correlated-Q learning
- Washington, US
- Greenwald, A., Hall, K.: Correlated-Q learning. In: Proceedings 20th International Conference on Machine Learning (ICML-2003), Washington, US, pp. 242-249 (2003)
- (2003) Proceedings 20th International Conference on Machine Learning (ICML-2003) , pp. 242-249
- Greenwald, A.¹ Hall, K.²

43
- 4544236179
- Coordinated reinforcement learning
- Sydney, Australia
- Guestrin, C., Lagoudakis, M.G., Parr, R.: Coordinated reinforcement learning. In: Proceedings 19th International Conference on Machine Learning (ICML-2002), Sydney, Australia, pp. 227-234 (2002)
- (2002) Proceedings 19th International Conference on Machine Learning (ICML-2002) , pp. 227-234
- Guestrin, C.¹ Lagoudakis, M.G.² Parr, R.³

44
- 9444233318
- Dynamic programming for partially observable stochastic games
- San Jose, US
- Hansen, E.A., Bernstein, D.S., Zilberstein, S.: Dynamic programming for partially observable stochastic games. In: Proceedings 19th National Conference on Artificial Intelligence (AAAI-2004), San Jose, US, pp. 709-715 (2004)
- (2004) Proceedings 19th National Conference on Artificial Intelligence (AAAI-2004) , pp. 709-715
- Hansen, E.A.¹ Bernstein, D.S.² Zilberstein, S.³

45
- 0002363593
- Strongly typed genetic programming in evolving cooperation strategies
- Pittsburgh, US
- Haynes, T., Wainwright, R., Sen, S., Schoenefeld, D.: Strongly typed genetic programming in evolving cooperation strategies. In: Proceedings 6th International Conference on Genetic Algorithms (ICGA-1995), Pittsburgh, US, pp. 271-278 (1995)
- (1995) Proceedings 6th International Conference on Genetic Algorithms (ICGA-1995) , pp. 271-278
- Haynes, T.¹ Wainwright, R.² Sen, S.³ Schoenefeld, D.⁴

46
- 0032207350
- Learning coordination strategies for cooperative multiagent systems
- Ho, F., Kamel, M.: Learning coordination strategies for cooperative multiagent systems. Machine Learning 33(2-3), 155-177 (1998)
- (1998) Machine Learning , vol.33 , Issue.2-3 , pp. 155-177
- Ho, F.¹ Kamel, M.²

47
- 0030377615
- Fuzzy interpolation-based Q-learning with continuous states and actions
- New Orleans, US
- Horiuchi, T., Fujino, A., Katai, O., Sawaragi, T.: Fuzzy interpolation-based Q-learning with continuous states and actions. In: Proceedings 5th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE-1996), New Orleans, US, pp. 594-600 (1996)
- (1996) Proceedings 5th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE-1996) , pp. 594-600
- Horiuchi, T.¹ Fujino, A.² Katai, O.³ Sawaragi, T.⁴

48
- 34248683404
- Market performance of adaptive trading agents in synchronous double auctions
- Yuan, S.-T., Yokoo, M. (eds.) PRIMA 2001. Springer, Heidelberg
- Hsu, W.T., Soo, V.W.: Market performance of adaptive trading agents in synchronous double auctions. In: Yuan, S.-T., Yokoo, M. (eds.) PRIMA 2001. LNCS (LNAI), vol. 2132, pp. 108-121. Springer, Heidelberg (2001)
- (2001) LNCS (LNAI) , vol.2132 , pp. 108-121
- Hsu, W.T.¹ Soo, V.W.²

49
- 0000929496
- Multiagent reinforcement learning: Theoretical framework and an algorithm
- Madison, US
- Hu, J., Wellman, M.P.: Multiagent reinforcement learning: Theoretical framework and an algorithm. In: Proceedings 15th International Conference on Machine Learning (ICML-1998), Madison, US, pp. 242-250 (1998)
- (1998) Proceedings 15th International Conference on Machine Learning (ICML-1998) , pp. 242-250
- Hu, J.¹ Wellman, M.P.²

50
- 4644369748
- Nash Q-learning for general-sum stochastic games
- Hu, J., Wellman, M.P.: Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research 4, 1039-1069 (2003)
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1039-1069
- Hu, J.¹ Wellman, M.P.²

51
- 21244489639
- A reinforcement learning scheme for a partially-observable multi-agent game
- Ishii, S., Fujita, H., Mitsutake, M., Yamazaki, T., Matsuda, J., Matsuno, Y.: A reinforcement learning scheme for a partially-observable multi-agent game. Machine Learning 59(1-2), 31-54 (2005)
- (2005) Machine Learning , vol.59 , Issue.1-2 , pp. 31-54
- Ishii, S.¹ Fujita, H.² Mitsutake, M.³ Yamazaki, T.⁴ Matsuda, J.⁵ Matsuno, Y.⁶

52
- 0037843409
- An approach to the pursuit problem on a heterogeneous multiagent system using reinforcement learning
- Ishiwaka, Y., Sato, T., Kakazu, Y.: An approach to the pursuit problem on a heterogeneous multiagent system using reinforcement learning. Robotics and Autonomous Systems 43(4), 245-256 (2003)
- (2003) Robotics and Autonomous Systems , vol.43 , Issue.4 , pp. 245-256
- Ishiwaka, Y.¹ Sato, T.² Kakazu, Y.³

53
- 0000439891
- On the convergence of stochastic iterative dynamic programming algorithms
- Jaakkola, T., Jordan, M.I., Singh, S.P.: On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation 6(6), 1185-1201 (1994)
- (1994) Neural Computation , vol.6 , Issue.6 , pp. 1185-1201
- Jaakkola, T.¹ Jordan, M.I.² Singh, S.P.³

54
- 9444236608
- On no-regret learning, fictitious play, and Nash equilibrium
- Williams College, Williamstown, US
- Jafari, A., Greenwald, A.R., Gondek, D., Ercal, G.: On no-regret learning, fictitious play, and Nash equilibrium. In: Proceedings 18th International Conference on Machine Learning (ICML-2001), pp. 226-233. Williams College, Williamstown, US (2001)
- (2001) Proceedings 18th International Conference on Machine Learning (ICML-2001) , pp. 226-233
- Jafari, A.¹ Greenwald, A.R.² Gondek, D.³ Ercal, G.⁴

55
- 21444435294
- MIT Press, Cambridge
- Jong, K.D.: Evolutionary Computation: A Unified Approach. MIT Press, Cambridge (2005)
- (2005) Evolutionary Computation: A Unified Approach
- Jong, K.D.¹

56
- 0032140718
- Fuzzy inference system learning by reinforcement methods
- Jouffe, L.: Fuzzy inference system learning by reinforcement methods. IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews 28(3), 338-355 (1998)
- (1998) IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews , vol.28 , Issue.3 , pp. 338-355
- Jouffe, L.¹

57
- 34548765672
- Kernelizing LSPE(λ)
- Honolulu, US
- Jung, T., Polani, D.: Kernelizing LSPE(λ). In: Proceedings 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL-2007), Honolulu, US, pp. 338-345 (2007)
- (2007) Proceedings 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL-2007) , pp. 338-345
- Jung, T.¹ Polani, D.²

58
- 0029679044
- Reinforcement learning: A survey
- Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: A survey. Journal of Artificial Intelligence Research 4, 237-285 (1996)
- (1996) Journal of Artificial Intelligence Research , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

59
- 0036932299
- Reinforcement learning of coordination in cooperative multiagent systems
- Menlo Park, US
- Kapetanakis, S., Kudenko, D.: Reinforcement learning of coordination in cooperative multiagent systems. In: Proceedings 18th National Conference on Artificial Intelligence and 14th Conference on Innovative Applications of Artificial Intelligence (AAAI/IAAI-2002), Menlo Park, US, pp. 326-331 (2002)
- (2002) Proceedings 18th National Conference on Artificial Intelligence and 14th Conference on Innovative Applications of Artificial Intelligence (AAAI/IAAI-2002) , pp. 326-331
- Kapetanakis, S.¹ Kudenko, D.²

60
- 40949099898
- Utile coordination: Learning interdependencies among cooperative agents
- Colchester, United Kingdom
- Kok, J.R., 't Hoen, P.J., Bakker, B., Vlassis, N.: Utile coordination: Learning interdependencies among cooperative agents. In: Proceedings IEEE Symposium on Computational Intelligence and Games (CIG 2005), Colchester, United Kingdom, pp. 29-36 (2005)
- (2005) Proceedings IEEE Symposium on Computational Intelligence and Games (CIG 2005) , pp. 29-36
- Kok, J.R.¹ 't Hoen, P.J.² Bakker, B.³ Vlassis, N.⁴

61
- 12244304892
- Non-communicative multi-robot coordination in dynamic environment
- Kok, J.R., Spaan, M.T.J., Vlassis, N.: Non-communicative multi-robot coordination in dynamic environment. Robotics and Autonomous Systems 50(2-3), 99-114 (2005)
- (2005) Robotics and Autonomous Systems , vol.50 , Issue.2-3 , pp. 99-114
- Kok, J.R.¹ Spaan, M.T.J.² Vlassis, N.³

62
- 14344250637
- Sparse cooperative Q-learning
- Banff, Canada
- Kok, J.R., Vlassis, N.: Sparse cooperative Q-learning. In: Proceedings 21st International Conference on Machine Learning (ICML-2004), Banff, Canada, pp. 481-488 (2004)
- (2004) Proceedings 21st International Conference on Machine Learning (ICML-2004) , pp. 481-488
- Kok, J.R.¹ Vlassis, N.²

63
- 4043069840
- On actor-critic algorithms
- Konda, V.R., Tsitsiklis, J.N.: On actor-critic algorithms. SIAM Journal on Control and Optimization 42(4), 1143-1166 (2003)
- (2003) SIAM Journal on Control and Optimization , vol.42 , Issue.4 , pp. 1143-1166
- Konda, V.R.¹ Tsitsiklis, J.N.²

64
- 78649701299
- Asymmetric multiagent reinforcement learning
- Halifax, Canada
- Könönen, V.: Asymmetric multiagent reinforcement learning. In: Proceedings IEEE/WIC International Conference on Intelligent Agent Technology (IAT-2003), Halifax, Canada, pp. 336-342 (2003)
- (2003) Proceedings IEEE/WIC International Conference on Intelligent Agent Technology (IAT-2003) , pp. 336-342
- Könönen, V.¹

65
- 35048865922
- Gradient based method for symmetric and asymmetric multiagent reinforcement learning
- Liu, J., Cheung, Y.-m., Yin, H. (eds.) IDEAL 2003. Springer, Heidelberg
- Könönen, V.: Gradient based method for symmetric and asymmetric multiagent reinforcement learning. In: Liu, J., Cheung, Y.-m., Yin, H. (eds.) IDEAL 2003. LNCS, vol. 2690, pp. 68-75. Springer, Heidelberg (2003)
- (2003) LNCS , vol.2690 , pp. 68-75
- Könönen, V.¹

66
- 4644323293
- Least-squares policy iteration
- Lagoudakis, M.G., Parr, R.: Least-squares policy iteration. Journal of Machine Learning Research 4, 1107-1149 (2003)
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1107-1149
- Lagoudakis, M.G.¹ Parr, R.²

67
- 0012286079
- An algorithm for distributed reinforcement learning in cooperative multi-agent systems
- Stanford University, US
- Lauer, M., Riedmiller, M.: An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In: Proceedings 17th International Conference on Machine Learning (ICML-2000), Stanford University, US, pp. 535-542 (2000)
- (2000) Proceedings 17th International Conference on Machine Learning (ICML-2000) , pp. 535-542
- Lauer, M.¹ Riedmiller, M.²

68
- 84949746648
- A multi-agent Q-learning framework for optimizing stock trading systems
- Hameurlain, A., Cicchetti, R., Traunmüller, R. (eds.) DEXA 2002. Springer, Heidelberg
- Lee, J.-W., Jang Min, O.: A multi-agent Q-learning framework for optimizing stock trading systems. In: Hameurlain, A., Cicchetti, R., Traunmüller, R. (eds.) DEXA 2002. LNCS, vol. 2453, pp. 153-162. Springer, Heidelberg (2002)
- (2002) LNCS , vol.2453 , pp. 153-162
- Lee, J.-W.¹ Jang Min, O.²

69
- 85149834820
- Markov games as a framework for multi-agent reinforcement learning
- New Brunswick, US
- Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Proceedings 11th International Conference on Machine Learning (ICML-1994), New Brunswick, US, pp. 157-163 (1994)
- (1994) Proceedings 11th International Conference on Machine Learning (ICML-1994) , pp. 157-163
- Littman, M.L.¹

70
- 0001547175
- Value-function reinforcement learning in Markov games
- Littman, M.L.: Value-function reinforcement learning in Markov games. Journal of Cognitive Systems Research 2(1), 55-66 (2001)
- (2001) Journal of Cognitive Systems Research , vol.2 , Issue.1 , pp. 55-66
- Littman, M.L.¹

71
- 80053136974
- Implicit negotiation in repeated games
- Meyer, J.-J.C., Tambe, M. (eds.) ATAL 2001. Springer, Heidelberg
- Littman, M.L., Stone, P.: Implicit negotiation in repeated games. In: Meyer, J.-J.C., Tambe, M. (eds.) ATAL 2001. LNCS (LNAI), vol. 2333, pp. 96-105. Springer, Heidelberg (2002)
- (2002) LNCS (LNAI) , vol.2333 , pp. 96-105
- Littman, M.L.¹ Stone, P.²

72
- 0000494894
- Computationally feasible bounds for partially observed Markov decision processes
- Lovejoy, W.S.: Computationally feasible bounds for partially observed Markov decision processes. Operations Research 39(1), 162-175 (1991)
- (1991) Operations Research , vol.39 , Issue.1 , pp. 162-175
- Lovejoy, W.S.¹

73
- 84957895797
- Reward functions for accelerated learning
- New Brunswick, US
- Matarić, M.J.: Reward functions for accelerated learning. In: Proceedings 11th International Conference on Machine Learning (ICML-1994), New Brunswick, US, pp. 181-189 (1994)
- (1994) Proceedings 11th International Conference on Machine Learning (ICML-1994) , pp. 181-189
- Matarić, M.J.¹

74
- 84949949419
- Learning in multi-robot systems
- Weiß, G., Sen, S. (eds.), ch. 10, Springer, Heidelberg
- Matarić, M.J.: Learning in multi-robot systems. In: Weiß, G., Sen, S. (eds.) Adaptation and Learning in Multi-Agent Systems, ch. 10, pp. 152-163. Springer, Heidelberg (1996)
- (1996) Adaptation and Learning in Multi-Agent Systems , pp. 152-163
- Matarić, M.J.¹

75
- 0030647149
- Reinforcement learning in the multi-robot domain
- Matarić, M.J.: Reinforcement learning in the multi-robot domain. Autonomous Robots 4(1), 73-83 (1997)
- (1997) Autonomous Robots , vol.4 , Issue.1 , pp. 73-83
- Matarić, M.J.¹

76
- 56449091120
- An analysis of reinforcement learning with function approximation
- Helsinki, Finland
- Melo, F.S., Meyn, S.P., Ribeiro, M.I.: An analysis of reinforcement learning with function approximation. In: Proceedings 25th International Conference on Machine Learning (ICML-2008), Helsinki, Finland, pp. 664-671 (2008)
- (2008) Proceedings 25th International Conference on Machine Learning (ICML-2008) , pp. 664-671
- Melo, F.S.¹ Meyn, S.P.² Ribeiro, M.I.³

77
- 84867463287
- Karlsruhe brainstormers - A reinforcement learning approach to robotic soccer
- Birk, A., Coradeschi, S., Tadokoro, S. (eds.) RoboCup 2001. Springer, Heidelberg
- Merke, A., Riedmiller, M.A.: Karlsruhe brainstormers - A reinforcement learning approach to robotic soccer. In: Birk, A., Coradeschi, S., Tadokoro, S. (eds.) RoboCup 2001. LNCS (LNAI), vol. 2377, pp. 435-440. Springer, Heidelberg (2002)
- (2002) LNCS (LNAI) , vol.2377 , pp. 435-440
- Merke, A.¹ Riedmiller, M.A.²

78
- 84880763288
- When evolving populations is better than coevolving individuals: The blind mice problem
- Acapulco, Mexico
- Miconi, T.: When evolving populations is better than coevolving individuals: The blind mice problem. In: Proceedings 18th International Joint Conference on Artificial Intelligence (IJCAI 2003), Acapulco, Mexico, pp. 647-652 (2003)
- (2003) Proceedings 18th International Joint Conference on Artificial Intelligence (IJCAI 2003) , pp. 647-652
- Miconi, T.¹

79
- 0027684215
- Prioritized sweeping: Reinforcement learning with less data and less time
- Moore, A.W., Atkeson, C.G.: Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning 13, 103-130 (1993)
- (1993) Machine Learning , vol.13 , pp. 103-130
- Moore, A.W.¹ Atkeson, C.G.²

80
- 44649189852
- Finite time bounds for fitted value iteration
- Munos, R., Szepesvári, C.: Finite time bounds for fitted value iteration. Journal of Machine Learning Research 9, 815-857 (2008)
- (2008) Journal of Machine Learning Research , vol.9 , pp. 815-857
- Munos, R.¹ Szepesvári, C.²

81
- 0031701693
- Learning organizational roles for negotiated search in a multiagent system
- Nagendra Prasad, M.V., Lesser, V.R., Lander, S.E.: Learning organizational roles for negotiated search in a multiagent system. International Journal of Human-Computer Studies 48(1), 51-67 (1998)
- (1998) International Journal of Human-Computer Studies , vol.48 , Issue.1 , pp. 51-67
- Nagendra Prasad, M.V.¹ Lesser, V.R.² Lander, S.E.³

82
- 0004182779
- McGraw-Hill, New York
- Nash, S., Sofer, A.: Linear and Nonlinear Programming. McGraw-Hill, New York (1996)
- (1996) Linear and Nonlinear Programming
- Nash, S.¹ Sofer, A.²

83
- 0037288398
- Least-squares policy evaluation algorithms with linear function approximation
- Nedić, A., Bertsekas, D.P.: Least-squares policy evaluation algorithms with linear function approximation. Discrete Event Dynamic Systems: Theory and Applications 13(1-2), 79-110 (2003)
- (2003) Discrete Event Dynamic Systems: Theory and Applications , vol.13 , Issue.1-2 , pp. 79-110
- Nedić, A.¹ Bertsekas, D.P.²

84
- 43549124850
- Multi-agent model predictive control for transportation networks: Serial versus parallel schemes
- Negenborn, R.R., De Schutter, B., Hellendoorn, H.: Multi-agent model predictive control for transportation networks: Serial versus parallel schemes. Engineering Applications of Artificial Intelligence 21(3), 353-366 (2008)
- (2008) Engineering Applications of Artificial Intelligence , vol.21 , Issue.3 , pp. 353-366
- Negenborn, R.R.¹ De Schutter, B.² Hellendoorn, H.³

85
- 0036832956
- Kernel-based reinforcement learning
- Ormoneit, D., Sen, S.: Kernel-based reinforcement learning. Machine Learning 49(2-3), 161-178 (2002)
- (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 161-178
- Ormoneit, D.¹ Sen, S.²

86
- 26444601262
- Cooperative multi-agent learning: The state of the art
- Panait, L., Luke, S.: Cooperative multi-agent learning: The state of the art. Autonomous Agents and Multi-Agent Systems 11(3), 387-434 (2005)
- (2005) Autonomous Agents and Multi-Agent Systems , vol.11 , Issue.3 , pp. 387-434
- Panait, L.¹ Luke, S.²

87
- 84880814738
- Improving coevolutionary search for optimal multiagent behaviors
- Acapulco, Mexico
- Panait, L., Wiegand, R.P., Luke, S.: Improving coevolutionary search for optimal multiagent behaviors. In: Proceedings 18th International Joint Conference on Artificial Intelligence (IJCAI-2003), Acapulco, Mexico, pp. 653-660 (2003)
- (2003) Proceedings 18th International Joint Conference on Artificial Intelligence (IJCAI-2003) , pp. 653-660
- Panait, L.¹ Wiegand, R.P.² Luke, S.³

88
- 0001873336
- Industrial and practical applications of DAI
- Weiss, G. (ed.), ch. 9, MIT Press, Cambridge
- Parunak, H.V.D.: Industrial and practical applications of DAI. In: Weiss, G. (ed.) Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence, ch. 9, pp. 377-412. MIT Press, Cambridge (1999)
- (1999) Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence , pp. 377-412
- Parunak, H.V.D.¹

89
- 0000955979
- Incremental multi-step Q-learning
- Peng, J., Williams, R.J.: Incremental multi-step Q-learning. Machine Learning 22(1-3), 283-290 (1996)
- (1996) Machine Learning , vol.22 , Issue.1-3 , pp. 283-290
- Peng, J.¹ Williams, R.J.²

90
- 40649106649
- Natural actor-critic
- Peters, J., Schaal, S.: Natural actor-critic. Neurocomputing 71(7-9), 1180-1190 (2008)
- (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1180-1190
- Peters, J.¹ Schaal, S.²

91
- 85027492413
- A cooperative coevolutionary approach to function optimization
- Davidor, Y., Männer, R., Schwefel, H.-P. (eds.) PPSN 1994. Springer, Heidelberg
- Potter, M.A., Jong, K.A.D.: A cooperative coevolutionary approach to function optimization. In: Davidor, Y., Männer, R., Schwefel, H.-P. (eds.) PPSN 1994. LNCS, vol. 866, pp. 249-257. Springer, Heidelberg (1994)
- (1994) LNCS , vol.866 , pp. 249-257
- Potter, M.A.¹ Jong, K.A.D.²

92
- 47349092417
- Wiley, Chichester
- Powell, W.B.: Approximate Dynamic Programming: Solving the Curses of Dimensionality. Wiley, Chichester (2007)
- (2007) Approximate Dynamic Programming: Solving the Curses of Dimensionality
- Powell, W.B.¹

93
- 84898936075
- New criteria and a new algorithm for learning in multi-agent systems
- Saul, L.K., Weiss, Y., Bottou, L. (eds.), MIT Press, Cambridge
- Powers, R., Shoham, Y.: New criteria and a new algorithm for learning in multi-agent systems. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems 17, pp. 1089-1096. MIT Press, Cambridge (2005)
- (2005) Advances in Neural Information Processing Systems , vol.17 , pp. 1089-1096
- Powers, R.¹ Shoham, Y.²

94
- 0010276944
- Implicit imitation in multiagent reinforcement learning
- Bled, Slovenia
- Price, B., Boutilier, C.: Implicit imitation in multiagent reinforcement learning. In: Proceedings 16th International Conference on Machine Learning (ICML-1999), Bled, Slovenia, pp. 325-334 (1999)
- (1999) Proceedings 16th International Conference on Machine Learning (ICML-1999) , pp. 325-334
- Price, B.¹ Boutilier, C.²

95
- 27344432348
- Accelerating reinforcement learning through implicit imitation
- Price, B., Boutilier, C.: Accelerating reinforcement learning through implicit imitation. Journal of Artificial Intelligence Research 19, 569-629 (2003)
- (2003) Journal of Artificial Intelligence Research , vol.19 , pp. 569-629
- Price, B.¹ Boutilier, C.²

96
- 85102627959
- Wiley, Chichester
- Puterman, M.L.: Markov Decision Processes-Discrete Stochastic Dynamic Programming. Wiley, Chichester (1994)
- (1994) Markov Decision Processes-Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

97
- 1142292938
- The communicative multiagent team decision problem: Analyzing teamwork theories and models
- Pynadath, D.V., Tambe, M.: The communicative multiagent team decision problem: Analyzing teamwork theories and models. Journal of Artificial Intelligence Research 16, 389-423 (2002)
- (2002) Journal of Artificial Intelligence Research , vol.16 , pp. 389-423
- Pynadath, D.V.¹ Tambe, M.²

98
- 84944045450
- Reinforcement learning applications in dynamic pricing of retail markets
- Newport Beach, US
- Raju, C., Narahari, Y., Ravikumar, K.: Reinforcement learning applications in dynamic pricing of retail markets. In: Proceedings 2003 IEEE International Conference on E-Commerce (CEC-2003), Newport Beach, US, pp. 339-346 (2003)
- (2003) Proceedings 2003 IEEE International Conference on E-Commerce (CEC-2003) , pp. 339-346
- Raju, C.¹ Narahari, Y.² Ravikumar, K.³

99
- 33646398129
- Neural fitted Q iteration - First experiences with a data efficient neural reinforcement learning method
- Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. Springer, Heidelberg
- Riedmiller, M.: Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 317-328. Springer, Heidelberg (2005)
- (2005) LNCS (LNAI) , vol.3720 , pp. 317-328
- Riedmiller, M.¹

100
- 84864835333
- Reinforcement learning for cooperating and communicating reactive agents in electrical power grids
- Hannebauer, M., Wendler, J., Pagello, E. (eds.), Springer, Heidelberg
- Riedmiller, M.A., Moore, A.W., Schneider, J.G.: Reinforcement learning for cooperating and communicating reactive agents in electrical power grids. In: Hannebauer, M., Wendler, J., Pagello, E. (eds.) Balancing Reactivity and Social Deliberation in Multi-Agent Systems, pp. 137-149. Springer, Heidelberg (2000)
- (2000) Balancing Reactivity and Social Deliberation in Multi-Agent Systems , pp. 137-149
- Riedmiller, M.A.¹ Moore, A.W.² Schneider, J.G.³

101
- 0032208296
- Learning team strategies: Soccer case studies
- Salustowicz, R., Wiering, M., Schmidhuber, J.: Learning team strategies: Soccer case studies. Machine Learning 33(2-3), 263-282 (1998)
- (1998) Machine Learning , vol.33 , Issue.2-3 , pp. 263-282
- Salustowicz, R.¹ Wiering, M.² Schmidhuber, J.³

102
- 0001624494
- Adaptive load balancing: A study in multi-agent learning
- Schaerf, A., Shoham, Y., Tennenholtz, M.: Adaptive load balancing: A study in multi-agent learning. Journal of Artificial Intelligence Research 2, 475-500 (1995)
- (1995) Journal of Artificial Intelligence Research , vol.2 , pp. 475-500
- Schaerf, A.¹ Shoham, Y.² Tennenholtz, M.³

103
- 0007918330
- A general method for incremental self-improvement and multi-agent learning
- Yao, X. (ed.), ch. 3. World Scientific, Singapore
- Schmidhuber, J.: A general method for incremental self-improvement and multi-agent learning. In: Yao, X. (ed.) Evolutionary Computation: Theory and Applications, ch. 3, pp. 81-123. World Scientific, Singapore (1999)
- (1999) Evolutionary Computation: Theory and Applications , pp. 81-123
- Schmidhuber, J.¹

104
- 0003544514
- Sejnowski, T.J., Hinton, G.E. (eds.), MIT Press, Cambridge
- Sejnowski, T.J., Hinton, G.E. (eds.): Unsupervised Learning: Foundations of Neural Computation. MIT Press, Cambridge (1999)
- (1999) Unsupervised Learning: Foundations of Neural Computation

105
- 0028555752
- Learning to coordinate without sharing information
- Seattle, US
- Sen, S., Sekaran, M., Hale, J.: Learning to coordinate without sharing information. In: Proceedings 12th National Conference on Artificial Intelligence (AAAI-1994), Seattle, US, pp. 426-431 (1994)
- (1994) Proceedings 12th National Conference on Artificial Intelligence (AAAI-1994) , pp. 426-431
- Sen, S.¹ Sekaran, M.² Hale, J.³

106
- 0001842882
- Learning in multiagent systems
- Weiss, G. (ed.), ch. 6, MIT Press, Cambridge
- Sen, S., Weiss, G.: Learning in multiagent systems. In: Weiss, G. (ed.) Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence, ch. 6, pp. 259-298. MIT Press, Cambridge (1999)
- (1999) Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence , pp. 259-298
- Sen, S.¹ Weiss, G.²

107
- 84924111881
- Cambridge University Press, Cambridge
- Shoham, Y., Leyton-Brown, K.: Multiagent Systems: Algorithmic, Game Theoretic and Logical Foundations. Cambridge University Press, Cambridge (2008)
- (2008) Multiagent Systems: Algorithmic, Game Theoretic and Logical Foundations
- Shoham, Y.¹ Leyton-Brown, K.²

108
- 34147161536
- If multi-agent learning is the answer, what is the question?
- Shoham, Y., Powers, R., Grenager, T.: If multi-agent learning is the answer, what is the question? Artificial Intelligence 171(7), 365-377 (2007)
- (2007) Artificial Intelligence , vol.171 , Issue.7 , pp. 365-377
- Shoham, Y.¹ Powers, R.² Grenager, T.³

109
- 0001644761
- Nash convergence of gradient dynamics in general-sum games
- San Francisco, US
- Singh, S., Kearns, M., Mansour, Y.: Nash convergence of gradient dynamics in general-sum games. In: Proceedings 16th Conference on Uncertainty in Artificial Intelligence (UAI 2000), San Francisco, US, pp. 541-548 (2000)
- (2000) Proceedings 16th Conference on Uncertainty in Artificial Intelligence (UAI 2000) , pp. 541-548
- Singh, S.¹ Kearns, M.² Mansour, Y.³

110
- 85153965130
- Reinforcement learning with soft state aggregation
- Tesauro, G., Touretzky, D.S., Leen, T.K. (eds.), MIT Press, Cambridge
- Singh, S.P., Jaakkola, T., Jordan, M.I.: Reinforcement learning with soft state aggregation. In: Tesauro, G., Touretzky, D.S., Leen, T.K. (eds.) Advances in Neural Information Processing Systems 7, pp. 361-368. MIT Press, Cambridge (1995)
- (1995) Advances in Neural Information Processing Systems , vol.7 , pp. 361-368
- Singh, S.P.¹ Jaakkola, T.² Jordan, M.I.³

111
- 0004018184
- Cambridge University Press, Cambridge
- Smith, J.M.: Evolution and the Theory of Games. Cambridge University Press, Cambridge (1982)
- (1982) Evolution and the Theory of Games
- Smith, J.M.¹

112
- 12244265998
- High level coordination of agents based on multiagent Markov decision processes with roles
- Lausanne, Switzerland
- Spaan, M.T.J., Vlassis, N., Groen, F.C.A.: High level coordination of agents based on multiagent Markov decision processes with roles. In: Workshop on Cooperative Robotics, 2002 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2002), Lausanne, Switzerland, pp. 66-73 (2002)
- (2002) Workshop on Cooperative Robotics, 2002 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2002) , pp. 66-73
- Spaan, M.T.J.¹ Vlassis, N.² Groen, F.C.A.³

113
- 40949159247
- A reinforcement learning based neural multi-agent-system for control of a combustion process
- Como, Italy
- Stephan, V., Debes, K., Gross, H.M., Wintrich, F., Wintrich, H.: A reinforcement learning based neural multi-agent-system for control of a combustion process. In: Proceedings IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN-2000), Como, Italy, pp. 6217-6222 (2000)
- (2000) Proceedings IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN-2000) , pp. 6217-6222
- Stephan, V.¹ Debes, K.² Gross, H.M.³ Wintrich, F.⁴ Wintrich, H.⁵

114
- 0032645144
- Team-partitioned, opaque-transition reinforcement learning
- Seattle, US
- Stone, P., Veloso, M.: Team-partitioned, opaque-transition reinforcement learning. In: Proceedings 3rd International Conference on Autonomous Agents (Agents-1999), Seattle, US, pp. 206-212 (1999)
- (1999) Proceedings 3rd International Conference on Autonomous Agents (Agents-1999) , pp. 206-212
- Stone, P.¹ Veloso, M.²

115
- 0034205975
- Multiagent systems: A survey from the machine learning perspective
- Stone, P., Veloso, M.: Multiagent systems: A survey from the machine learning perspective. Autonomous Robots 8(3), 345-383 (2000)
- (2000) Autonomous Robots , vol.8 , Issue.3 , pp. 345-383
- Stone, P.¹ Veloso, M.²

116
- 0036355732
- A multiagent reinforcement learning algorithm using extended optimal response
- Bologna, Italy
- Suematsu, N., Hayashi, A.: A multiagent reinforcement learning algorithm using extended optimal response. In: Proceedings 1st International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2002), Bologna, Italy, pp. 370-377 (2002)
- (2002) Proceedings 1st International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2002) , pp. 370-377
- Suematsu, N.¹ Hayashi, A.²

117
- 36249003814
- An agent-based decision support system for wholesale electricity markets
- Sueyoshi, T., Tadiparthi, G.R.: An agent-based decision support system for wholesale electricity markets. Decision Support Systems 44, 425-446 (2008)
- (2008) Decision Support Systems , vol.44 , pp. 425-446
- Sueyoshi, T.¹ Tadiparthi, G.R.²

118
- 33847202724
- Learning to predict by the method of temporal differences
- Sutton, R.S.: Learning to predict by the method of temporal differences. Machine Learning 3, 9-44 (1988)
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

119
- 85132026293
- Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
- Austin, US
- Sutton, R.S.: Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In: Proceedings 7th International Conference on Machine Learning (ICML-1990), Austin, US, pp. 216-224 (1990)
- (1990) Proceedings 7th International Conference on Machine Learning (ICML-1990) , pp. 216-224
- Sutton, R.S.¹

120
- 0004102479
- MIT Press, Cambridge
- Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

121
- 14344263882
- Interpolation-based Q-learning
- Banff, Canada
- Szepesvári, C., Smart, W.D.: Interpolation-based Q-learning. In: Proceedings 21st International Conference on Machine Learning (ICML-2004), Banff, Canada, pp. 791-798 (2004)
- (2004) Proceedings 21st International Conference on Machine Learning (ICML-2004) , pp. 791-798
- Szepesvári, C.¹ Smart, W.D.²

122
- 40949126042
- Multiagent reinforcement learning applied to a chase problem in a continuous world
- Tamakoshi, H., Ishii, S.: Multiagent reinforcement learning applied to a chase problem in a continuous world. Artificial Life and Robotics 5(4), 202-206 (2001)
- (2001) Artificial Life and Robotics , vol.5 , Issue.4 , pp. 202-206
- Tamakoshi, H.¹ Ishii, S.²

123
- 85152198941
- Multi-agent reinforcement learning: Independent vs. cooperative agents
- Amherst, US
- Tan, M.: Multi-agent reinforcement learning: Independent vs. cooperative agents. In: Proceedings 10th International Conference on Machine Learning (ICML 1993), Amherst, US, pp. 330-337 (1993)
- (1993) Proceedings 10th International Conference on Machine Learning (ICML 1993) , pp. 330-337
- Tan, M.¹

124
- 84898941549
- Extending Q-learning to general adaptive multi-agent systems
- Thrun, S., Saul, L.K., Schölkopf, B. (eds.), MIT Press, Cambridge
- Tesauro, G.: Extending Q-learning to general adaptive multi-agent systems. In: Thrun, S., Saul, L.K., Schölkopf, B. (eds.) Advances in Neural Information Processing Systems 16, MIT Press, Cambridge (2004)
- (2004) Advances in Neural Information Processing Systems , vol.16
- Tesauro, G.¹

125
- 0036274424
- Pricing in agent economies using multi-agent Q-learning
- Tesauro, G., Kephart, J.O.: Pricing in agent economies using multi-agent Q-learning. Autonomous Agents and Multi-Agent Systems 5(3), 289-304 (2002)
- (2002) Autonomous Agents and Multi-Agent Systems , vol.5 , Issue.3 , pp. 289-304
- Tesauro, G.¹ Kephart, J.O.²

126
- 2042544751
- Multi-agent learning for routing control within an Internet environment
- Tillotson, P., Wu, Q., Hughes, P.: Multi-agent learning for routing control within an Internet environment. Engineering Applications of Artificial Intelligence 17(2), 179-185 (2004)
- (2004) Engineering Applications of Artificial Intelligence , vol.17 , Issue.2 , pp. 179-185
- Tillotson, P.¹ Wu, Q.² Hughes, P.³

127
- 0031341345
- Neural reinforcement learning for behaviour synthesis
- Touzet, C.F.: Neural reinforcement learning for behaviour synthesis. Robotics and Autonomous Systems 22(3-4), 251-281 (1997)
- (1997) Robotics and Autonomous Systems , vol.22 , Issue.3-4 , pp. 251-281
- Touzet, C.F.¹

128
- 0003481349
- Robot awareness in cooperative mobile robot learning
- Touzet, C.F.: Robot awareness in cooperative mobile robot learning. Autonomous Robots 8(1), 87-97 (2000)
- (2000) Autonomous Robots , vol.8 , Issue.1 , pp. 87-97
- Touzet, C.F.¹

129
- 0028497630
- Asynchronous stochastic approximation and Q-learning
- Tsitsiklis, J.N.: Asynchronous stochastic approximation and Q-learning. Machine Learning 16(1), 185-202 (1994)
- (1994) Machine Learning , vol.16 , Issue.1 , pp. 185-202
- Tsitsiklis, J.N.¹

130
- 31344450384
- An evolutionary dynamical analysis of multiagent learning in iterated games
- Tuyls, K., 't Hoen, P.J., Vanschoenwinkel, B.: An evolutionary dynamical analysis of multiagent learning in iterated games. Autonomous Agents and Multi-Agent Systems 12(1), 115-153 (2006)
- (2006) Autonomous Agents and Multi-Agent Systems , vol.12 , Issue.1 , pp. 115-153
- Tuyls, K.¹ 't Hoen, P.J.² Vanschoenwinkel, B.³

131
- 40949146053
- Q-learning in simulated robotic soccer - Large state spaces and incomplete information
- Las Vegas, US
- Tuyls, K., Maes, S., Manderick, B.: Q-learning in simulated robotic soccer - large state spaces and incomplete information. In: Proceedings 2002 International Conference on Machine Learning and Applications (ICMLA-2002), Las Vegas, US, pp. 226-232 (2002)
- (2002) Proceedings 2002 International Conference on Machine Learning and Applications (ICMLA-2002) , pp. 226-232
- Tuyls, K.¹ Maes, S.² Manderick, B.³

132
- 28544446213
- Evolutionary game theory and multi-agent reinforcement learning
- Tuyls, K., Nowé, A.: Evolutionary game theory and multi-agent reinforcement learning. The Knowledge Engineering Review 20(1), 63-90 (2005)
- (2005) The Knowledge Engineering Review , vol.20 , Issue.1 , pp. 63-90
- Tuyls, K.¹ Nowé, A.²

133
- 0004196515
- Adversarial reinforcement learning
- Carnegie Mellon University, Pittsburgh, US
- Uther, W.T., Veloso, M.: Adversarial reinforcement learning. Tech. rep., School of Computer Science, Carnegie Mellon University, Pittsburgh, US (1997), http://www.cs.cmu.edu/afs/cs/user/will/www/papers/Uther97a.ps
- (1997) Tech. Rep., School of Computer Science
- Uther, W.T.¹ Veloso, M.²

134
- 23144455713
- Learning in multiagent systems: An introduction from a game-theoretic perspective
- Alonso, E., Kudenko, D., Kazakov, D. (eds.) AAMAS 2000 and AAMAS 2002. Springer, Heidelberg
- Vidal, J.M.: Learning in multiagent systems: An introduction from a game-theoretic perspective. In: Alonso, E., Kudenko, D., Kazakov, D. (eds.) AAMAS 2000 and AAMAS 2002. LNCS (LNAI), vol. 2636, pp. 202-215. Springer, Heidelberg (2003)
- (2003) LNCS (LNAI) , vol.2636 , pp. 202-215
- Vidal, J.M.¹

135
- 52949118902
- A concise introduction to multiagent systems and distributed artificial intelligence
- Morgan & Claypool Publishers
- Vlassis, N.: A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence. Synthesis Lectures in Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers (2007)
- (2007) Synthesis Lectures in Artificial Intelligence and Machine Learning
- Vlassis, N.¹

136
- 27744448185
- Reinforcement learning to play an optimal Nash equilibrium in team Markov games
- Becker, S., Thrun, S., Obermayer, K. (eds.), MIT Press, Cambridge
- Wang, X., Sandholm, T.: Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In: Becker, S., Thrun, S., Obermayer, K. (eds.) Advances in Neural Information Processing Systems 15, pp. 1571-1578. MIT Press, Cambridge (2003)
- (2003) Advances in Neural Information Processing Systems , vol.15 , pp. 1571-1578
- Wang, X.¹ Sandholm, T.²

137
- 34249833101
- Q-learning
- Watkins, C.J.C.H., Dayan, P.: Q-learning. Machine Learning 8, 279-292 (1992)
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

138
- 4544231144
- Best-response multiagent learning in non-stationary environments
- New York, US
- Weinberg, M., Rosenschein, J.S.: Best-response multiagent learning in non-stationary environments. In: Proceedings 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2004), New York, US, pp. 506-513 (2004)
- (2004) Proceedings 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS-2004) , pp. 506-513
- Weinberg, M.¹ Rosenschein, J.S.²

139
- 0003744207
- Weiss, G. (ed.), MIT Press, Cambridge
- Weiss, G. (ed.): Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. MIT Press, Cambridge (1999)
- (1999) Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence

140
- 0242697601
- The 2001 trading agent competition
- Wellman, M.P., Greenwald, A.R., Stone, P., Wurman, P.R.: The 2001 Trading Agent Competition. Electronic Markets 13(1) (2003)
- (2003) Electronic Markets , vol.13 , Issue.1
- Wellman, M.P.¹ Greenwald, A.R.² Stone, P.³ Wurman, P.R.⁴

141
- 36249019659
- Multi-agent reinforcement learning for traffic light control
- Stanford University, US
- Wiering, M.: Multi-agent reinforcement learning for traffic light control. In: Proceedings 17th International Conference on Machine Learning (ICML-2000), pp. 1151-1158. Stanford University, US (2000)
- (2000) Proceedings 17th International Conference on Machine Learning (ICML-2000) , pp. 1151-1158
- Wiering, M.¹

142
- 0345073177
- Reinforcement learning soccer teams with incomplete world models
- Wiering, M., Salustowicz, R., Schmidhuber, J.: Reinforcement learning soccer teams with incomplete world models. Autonomous Robots 7(1), 77-88 (1999)
- (1999) Autonomous Robots , vol.7 , Issue.1 , pp. 77-88
- Wiering, M.¹ Salustowicz, R.² Schmidhuber, J.³

143
- 77956312600
- Limit behavior of no-regret dynamics
- Kyiv School of Economics, Kyiv, Ukraine
- Zapechelnyuk, A.: Limit behavior of no-regret dynamics. Discussion Papers 21, Kyiv School of Economics, Kyiv, Ukraine (2009)
- (2009) Discussion Papers , vol.21
- Zapechelnyuk, A.¹

144
- 1942484421
- Online convex programming and generalized infinitesimal gradient ascent
- Washington, US
- Zinkevich, M.: Online convex programming and generalized infinitesimal gradient ascent. In: Proceedings 20th International Conference on Machine Learning (ICML-2003), Washington, US, pp. 928-936 (2003)
- (2003) Proceedings 20th International Conference on Machine Learning (ICML-2003) , pp. 928-936
- Zinkevich, M.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.