SCOPUS Information Search Platform

10th International Conference on Autonomous Agents and Multiagent Systems 2011, AAMAS 2011

Volume 1, 2011, Pages 209-216

Theoretical considerations of potential-based reward shaping for multi-agent systems

(2) Devlin, Sam (a); Kudenko, Daniel (a)

a UNIVERSITY OF YORK (United Kingdom)

Author keywords

Multiagent learning; Reinforcement learning; Reward shaping; Reward structures for learning

Indexed keywords

AUTONOMOUS AGENTS; REINFORCEMENT LEARNING; VECTOR QUANTIZATION;

EMPIRICAL STUDIES; EQUIVALENCE TO; MULTI-AGENT LEARNING; MULTI-AGENT REINFORCEMENT LEARNING; NASH EQUILIBRIA; REWARD SHAPING; REWARD STRUCTURES FOR LEARNING; STOCHASTIC GAME;

MULTI AGENT SYSTEMS;

EID: 84899455116 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (120)

References (35)

1
- 57749177069
- Potential-based shaping in model-based reinforcement learning
- J. Asmuth, M. Littman, and R. Zinkov. Potential-based shaping in model-based reinforcement learning. In Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, pages 604-609, 2008.
- (2008) Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence , pp. 604-609
- Asmuth, J.¹ Littman, M.² Zinkov, R.³

2
- 84899963942
- Social reward shaping in the prisoner's dilemma
- M. Babes, E. de Cote, and M. Littman. Social reward shaping in the prisoner's dilemma. In Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems, volume 3, pages 1389-1392, 2008.
- (2008) 7th International Joint Conference on Autonomous Agents and Multiagent Systems , vol.3 , pp. 1389-1392
- Babes, M.¹ De Cote, E.² Littman, M.³

3
- 0003565783
- Athena Scientific, 3rd edition
- D. P. Bertsekas. Dynamic Programming and Optimal Control (2 Vol Set). Athena Scientific, 3rd edition, 2007.
- (2007) Dynamic Programming and Optimal Control (2 Vol Set)
- Bertsekas, D.P.¹

4
- 84880690163
- Sequential optimality and coordination in multiagent systems
- Citeseer
- C. Boutilier. Sequential optimality and coordination in multiagent systems. In International Joint Conference on Artificial Intelligence, volume 16, pages 478-485. Citeseer, 1999.
- (1999) International Joint Conference on Artificial Intelligence , vol.16 , pp. 478-485
- Boutilier, C.¹

5
- 40949147745
- A comprehensive survey of multiagent reinforcement learning
- L. Busoniu, R. Babuska, and B. De Schutter. A Comprehensive Survey of MultiAgent Reinforcement Learning. IEEE Transactions on Systems Man & Cybernetics Part C Applications and Reviews, 38(2):156, 2008.
- (2008) IEEE Transactions on Systems Man & Cybernetics Part C Applications and Reviews , vol.38 , Issue.2 , pp. 156
- Busoniu, L.¹ Babuska, R.² De Schutter, B.³

6
- 0031630561
- The dynamics of reinforcement learning in cooperative multiagent systems
- C. Claus and C. Boutilier. The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of the National Conference on Artificial Intelligence, pages 746-752, 1998.
- (1998) Proceedings of the National Conference on Artificial Intelligence , pp. 746-752
- Claus, C.¹ Boutilier, C.²

7
- 84899421933
- Multi-agent, potential-based reward shaping for RoboCup Keep Away
- S. Devlin, M. Grzes, and D. Kudenko. Multi-agent, potential-based reward shaping for RoboCup Keep Away. In Proceedings of The Tenth Annual International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2011.
- (2011) Proceedings of the Tenth Annual International Conference on Autonomous Agents and Multiagent Systems (AAMAS)
- Devlin, S.¹ Grzes, M.² Kudenko, D.³

8
- 1942484477
- Principled methods for advising reinforcement learning agents
- E. Wiewiora, G. Cottrell, and C. Elkan. Principled methods for advising reinforcement learning agents. In Proceedings of the Twentieth International Conference on Machine Learning, 2003.
- (2003) Proceedings of the Twentieth International Conference on Machine Learning
- Wiewiora, E.¹ Cottrell, G.² Elkan, C.³

9
- 0003989209
- Springer Verlag
- J. Filar and K. Vrieze. Competitive Markov decision processes. Springer Verlag, 1997.
- (1997) Competitive Markov Decision Processes
- Filar, J.¹ Vrieze, K.²

10
- 78650499444
- Plan-based reward shaping for reinforcement learning
- IEEE
- M. Grzes and D. Kudenko. Plan-based reward shaping for reinforcement learning. In Proceedings of the 4th IEEE International Conference on Intelligent Systems (IS'08), pages 22-29. IEEE, 2008.
- (2008) 4th IEEE International Conference on Intelligent Systems (IS'08) , pp. 22-29
- Grzes, M.¹ Kudenko, D.²

11
- 4644369748
- Nash Q-learning for general-sum stochastic games
- J. Hu and M. Wellman. Nash Q-learning for general-sum stochastic games. The Journal of Machine Learning Research, 4:1039-1069, 2003.
- (2003) The Journal of Machine Learning Research , vol.4 , pp. 1039-1069
- Hu, J.¹ Wellman, M.²

12
- 0036932299
- Reinforcement learning of coordination in cooperative multi-agent systems
- Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press
- S. Kapetanakis and D. Kudenko. Reinforcement learning of coordination in cooperative multi-agent systems. In Proceedings of the National Conference on Artificial Intelligence, pages 326-331. Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999-2002.
- (1999) Proceedings of the National Conference on Artificial Intelligence , pp. 326-331
- Kapetanakis, S.¹ Kudenko, D.²

13
- 49949094351
- S. Kapetanakis and D. Kudenko. Reinforcement learning of coordination in heterogeneous cooperative multi-agent systems, pages 119-131, 2004.
- (2004) Reinforcement Learning of Coordination in Heterogeneous Cooperative Multi-agent Systems , pp. 119-131
- Kapetanakis, S.¹ Kudenko, D.²

14
- 85149834820
- Markov games as a framework for multi-agent reinforcement learning
- Citeseer
- M. Littman. Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the eleventh international conference on machine learning, volume 157, page 163. Citeseer, 1994.
- (1994) Proceedings of the Eleventh International Conference on Machine Learning , vol.157 , pp. 163
- Littman, M.¹

15
- 0242466944
- Friend-or-foe Q-learning in general-sum games
- M. Littman. Friend-or-foe Q-learning in general-sum games. In Machine Learning - International Workshop then Conference, pages 322-328, 2001.
- (2001) Machine Learning - International Workshop Then Conference , pp. 322-328
- Littman, M.¹

16
- 34547964974
- Automatic shaping and decomposition of reward functions
- ACM
- B. Marthi. Automatic shaping and decomposition of reward functions. In Proceedings of the 24th International Conference on Machine learning, page 608. ACM, 2007.
- (2007) 24th International Conference on Machine Learning , pp. 608
- Marthi, B.¹

17
- 0030647149
- Reinforcement learning in the multi-robot domain
- M. Mataric. Reinforcement learning in the multi-robot domain. Autonomous Robots, 4(1):73-83, 1997.
- (1997) Autonomous Robots , vol.4 , Issue.1 , pp. 73-83
- Mataric, M.¹

18
- 77950915046
- Decentralized learning in wireless sensor networks
- M. Mihaylov, K. Tuyls, and A. Nowe. Decentralized Learning in Wireless Sensor Networks. Adaptive and Learning Agents, pages 60-73, 2009.
- (2009) Adaptive and Learning Agents , pp. 60-73
- Mihaylov, M.¹ Tuyls, K.² Nowe, A.³

19
- 0001730497
- Non-cooperative games
- J. Nash. Non-cooperative games. Annals of mathematics, 54(2):286-295, 1951.
- (1951) Annals of Mathematics , vol.54 , Issue.2 , pp. 286-295
- Nash, J.¹

20
- 0141596576
- Policy invariance under reward transformations: Theory and application to reward shaping
- A. Y. Ng, D. Harada, and S. J. Russell. Policy invariance under reward transformations: Theory and application to reward shaping. In Proceedings of the 16th International Conference on Machine Learning, pages 278-287, 1999.
- (1999) 16th International Conference on Machine Learning , pp. 278-287
- Ng, A.Y.¹ Harada, D.² Russell, S.J.³

21
- 34447553096
- Reinforcement learning for humanoid robotics
- J. Peters, S. Vijayakumar, and S. Schaal. Reinforcement learning for humanoid robotics. In Proceedings of Humanoids2003, Third IEEE-RAS International Conference on Humanoid Robots, 2003.
- (2003) Proceedings of Humanoids2003, Third IEEE-RAS International Conference on Humanoid Robots
- Peters, J.¹ Vijayakumar, S.² Schaal, S.³

22
- 0003998452
- John Wiley & Sons, Inc. New York, NY, USA
- M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York, NY, USA, 1994.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

23
- 1642401055
- Learning to drive a bicycle using reinforcement learning and shaping
- J. Randløv and P. Alstrøm. Learning to drive a bicycle using reinforcement learning and shaping. In Proceedings of the 15th International Conference on Machine Learning, pages 463-471, 1998.
- (1998) 15th International Conference on Machine Learning , pp. 463-471
- Randløv, J.¹ Alstrøm, P.²

24
- 34147161536
- If multi-agent learning is the answer, what is the question?
- Y. Shoham, R. Powers, and T. Grenager. If multi-agent learning is the answer, what is the question? Artificial Intelligence, 171(7):365-377, 2007.
- (2007) Artificial Intelligence , vol.171 , Issue.7 , pp. 365-377
- Shoham, Y.¹ Powers, R.² Grenager, T.³

25
- 0032645144
- Team-partitioned, opaque-transition reinforcement learning
- ACM
- P. Stone and M. Veloso. Team-partitioned, opaque-transition reinforcement learning. In Proceedings of the third annual conference on Autonomous Agents, pages 206-212. ACM, 1999.
- (1999) Proceedings of the Third Annual Conference on Autonomous Agents , pp. 206-212
- Stone, P.¹ Veloso, M.²

26
- 85156221438
- Generalization in reinforcement learning: Successful examples using sparse coarse coding
- R. Sutton. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding. Advances in Neural Information Processing Systems, pages 1038-1044, 1996.
- (1996) Advances in Neural Information Processing Systems , pp. 1038-1044
- Sutton, R.¹

27
- 0003617454
- PhD thesis, Department of Computer Science, University of Massachusetts, Amherst
- R. S. Sutton. Temporal credit assignment in reinforcement learning. PhD thesis, Department of Computer Science, University of Massachusetts, Amherst, 1984.
- (1984) Temporal Credit Assignment in Reinforcement Learning
- Sutton, R.S.¹

28
- 0004102479
- MIT Press
- R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

29
- 85152198941
- Multi-agent reinforcement learning: Independent vs. Cooperative agents
- M. Tan. Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents. In Proceedings of the Tenth International Conference on Machine Learning, volume 337, 1993.
- (1993) Proceedings of the Tenth International Conference on Machine Learning , vol.337
- Tan, M.¹

30
- 70349592320
- Learning from actions not taken in multiagent systems
- K. Turner and N. Khani. Learning from actions not taken in multiagent systems. Advances in Complex Systems (ACS), 12(04):455-473, 2009.
- (2009) Advances in Complex Systems (ACS) , vol.12 , Issue.4 , pp. 455-473
- Turner, K.¹ Khani, N.²

31
- 27744448185
- Reinforcement learning to play an optimal Nash equilibrium in team Markov games
- X. Wang and T. Sandholm. Reinforcement learning to play an optimal Nash equilibrium in team Markov games. Advances in neural information processing systems, pages 1603-1610, 2003.
- (2003) Advances in Neural Information Processing Systems , pp. 1603-1610
- Wang, X.¹ Sandholm, T.²

32
- 0032207451
- Conjectural equilibrium in multiagent learning
- M. Wellman and J. Hu. Conjectural equilibrium in multiagent learning. Machine Learning, 33(2):179-200, 1998.
- (1998) Machine Learning , vol.33 , Issue.2 , pp. 179-200
- Wellman, M.¹ Hu, J.²

33
- 27344453198
- Potential-based shaping and Q-value initialization are equivalent
- E. Wiewiora. Potential-based shaping and Q-value initialization are equivalent. Journal of Artificial Intelligence Research, 19(1):205-208, 2003.
- (2003) Journal of Artificial Intelligence Research , vol.19 , Issue.1 , pp. 205-208
- Wiewiora, E.¹

34
- 0004320981
- An introduction to collective intelligence
- D. Wolpert and K. Turner. An introduction to collective intelligence. Technical Report cs.LG/9908014, NASA Ames Research Center, 1999.
- (1999) Technical Report Cs.LG/9908014, NASA Ames Research Center
- Wolpert, D.¹ Turner, K.²

35
- 0038797355
- John Wiley and Sons
- M. Wooldridge. An Introduction to Multi Agent Systems. John Wiley and Sons, 2002.
- (2002) An Introduction to Multi Agent Systems
- Wooldridge, M.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.