SCOPUS Information Retrieval Platform

11th International Conference on Autonomous Agents and Multiagent Systems 2012, AAMAS 2012: Innovative Applications Track

Volume 1, 2012, Pages 328-335

Dynamic potential-based reward shaping

Authors (2): Devlin, Sam a; Kudenko, Daniel a

a UNIVERSITY OF YORK (United Kingdom)

Author keywords

Reinforcement Learning; Reward Shaping

Indexed keywords

AUTONOMOUS AGENTS; OPTIMIZATION; REINFORCEMENT LEARNING;

AGENT LEARNING; MULTIPLE AGENTS; NASH EQUILIBRIA; OPTIMAL POLICIES; REWARD SHAPING; SINGLE-AGENT;

MULTI AGENT SYSTEMS;

EID: 84899412758 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited (110)

References (24)

1
- 57749177069
- Potential-based shaping in model-based reinforcement learning
- J. Asmuth, M. Littman, and R. Zinkov. Potential-based shaping in model-based reinforcement learning. In Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, pages 604-609, 2008.
- (2008) Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence , pp. 604-609
- Asmuth, J.¹ Littman, M.² Zinkov, R.³

2
- 84899963942
- Social reward shaping in the prisoner's dilemma
- M. Babes, E. de Cote, and M. Littman. Social reward shaping in the prisoner's dilemma. In Proceedings of The Seventh Annual International Conference on Autonomous Agents and Multiagent Systems, Volume 3, pages 1389-1392, 2008.
- (2008) Proceedings of the Seventh Annual International Conference on Autonomous Agents and Multiagent Systems , vol.3 , pp. 1389-1392
- Babes, M.¹ De Cote, E.² Littman, M.³

3
- 0003565783
- Athena Scientific, 3rd edition
- D. P. Bertsekas. Dynamic Programming and Optimal Control (2 Vol Set). Athena Scientific, 3rd edition, 2007.
- (2007) Dynamic Programming and Optimal Control (2 Vol Set)
- Bertsekas, D.P.¹

4
- 84880690163
- Sequential optimality and coordination in multiagent systems
- C. Boutilier. Sequential optimality and coordination in multiagent systems. In International Joint Conference on Artificial Intelligence, Volume 16, pages 478-485, 1999.
- (1999) International Joint Conference on Artificial Intelligence , vol.16 , pp. 478-485
- Boutilier, C.¹

5
- 40949147745
- A comprehensive survey of MultiAgent reinforcement learning
- L. Busoniu, R. Babuska, and B. De Schutter. A Comprehensive Survey of MultiAgent Reinforcement Learning. IEEE Transactions on Systems Man & Cybernetics Part C Applications and Reviews, 38(2):156, 2008.
- (2008) IEEE Transactions on Systems Man & Cybernetics Part C Applications and Reviews , vol.38 , Issue.2 , pp. 156
- Busoniu, L.¹ Babuska, R.² De Schutter, B.³

6
- 0031630561
- The dynamics of reinforcement learning in cooperative multiagent systems
- C. Claus and C. Boutilier. The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of the National Conference on Artificial Intelligence, pages 746-752, 1998.
- (1998) Proceedings of the National Conference on Artificial Intelligence , pp. 746-752
- Claus, C.¹ Boutilier, C.²

7
- 79955403826
- An empirical study of potential-based reward shaping and advice in complex, multi-agent systems
- S. Devlin, M. Grzes, and D. Kudenko. An empirical study of potential-based reward shaping and advice in complex, multi-agent systems. Advances in Complex Systems, 2011.
- (2011) Advances in Complex Systems
- Devlin, S.¹ Grzes, M.² Kudenko, D.³

8
- 84899455116
- Theoretical considerations of potential-based reward shaping for multi-agent systems
- S. Devlin and D. Kudenko. Theoretical considerations of potential-based reward shaping for multi-agent systems. In Proceedings of The Tenth Annual International Conference on Autonomous Agents and Multiagent Systems, 2011.
- (2011) Proceedings of the Tenth Annual International Conference on Autonomous Agents and Multiagent Systems
- Devlin, S.¹ Kudenko, D.²

9
- 0003989209
- Springer Verlag
- J. Filar and K. Vrieze. Competitive Markov decision processes. Springer Verlag, 1997.
- (1997) Competitive Markov Decision Processes
- Filar, J.¹ Vrieze, K.²

10
- 78650499444
- Plan-based reward shaping for reinforcement learning
- IEEE
- M. Grzes and D. Kudenko. Plan-based reward shaping for reinforcement learning. In Proceedings of the 4th IEEE International Conference on Intelligent Systems (IS'08), pages 22-29. IEEE, 2008.
- (2008) Proceedings of the 4th IEEE International Conference on Intelligent Systems (IS'08) , pp. 22-29
- Grzes, M.¹ Kudenko, D.²

11
- 77950298151
- Online learning of shaping rewards in reinforcement learning
- M. Grzes and D. Kudenko. Online learning of shaping rewards in reinforcement learning. Artificial Neural Networks-ICANN 2010, pages 541-550, 2010.
- (2010) Artificial Neural Networks-ICANN 2010 , pp. 541-550
- Grzes, M.¹ Kudenko, D.²

12
- 34548090340
- PhD thesis, University of Illinois at Urbana-Champaign
- A. Laud. Theory and application of reward shaping in reinforcement learning. PhD thesis, University of Illinois at Urbana-Champaign, 2004.
- (2004) Theory and Application of Reward Shaping in Reinforcement Learning
- Laud, A.¹

13
- 34547964974
- Automatic shaping and decomposition of reward functions
- ACM
- B. Marthi. Automatic shaping and decomposition of reward functions. In Proceedings of the 24th International Conference on Machine learning, page 608. ACM, 2007.
- (2007) Proceedings of the 24th International Conference on Machine Learning , pp. 608
- Marthi, B.¹

14
- 0030647149
- Reinforcement learning in the multi-robot domain
- M. Mataric. Reinforcement learning in the multi-robot domain. Autonomous Robots, 4(1):73-83, 1997.
- (1997) Autonomous Robots , vol.4 , Issue.1 , pp. 73-83
- Mataric, M.¹

15
- 0141596576
- Policy invariance under reward transformations: Theory and application to reward shaping
- A. Y. Ng, D. Harada, and S. J. Russell. Policy invariance under reward transformations: Theory and application to reward shaping. In Proceedings of the 16th International Conference on Machine Learning, pages 278-287, 1999.
- (1999) Proceedings of the 16th International Conference on Machine Learning , pp. 278-287
- Ng, A.Y.¹ Harada, D.² Russell, S.J.³

16
- 0003998452
- John Wiley & Sons, Inc., New York, NY, USA
- M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York, NY, USA, 1994.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

17
- 1642401055
- Learning to drive a bicycle using reinforcement learning and shaping
- J. Randløv and P. Alstrom. Learning to drive a bicycle using reinforcement learning and shaping. In Proceedings of the 15th International Conference on Machine Learning, pages 463-471, 1998.
- (1998) Proceedings of the 15th International Conference on Machine Learning , pp. 463-471
- Randløv, J.¹ Alstrom, P.²

18
- 34147161536
- If multi-agent learning is the answer, what is the question?
- Y. Shoham, R. Powers, and T. Grenager. If multi-agent learning is the answer, what is the question? Artificial Intelligence, 171(7):365-377, 2007.
- (2007) Artificial Intelligence , vol.171 , Issue.7 , pp. 365-377
- Shoham, Y.¹ Powers, R.² Grenager, T.³

19
- 0032645144
- Team-partitioned, opaque-transition reinforcement learning
- ACM
- P. Stone and M. Veloso. Team-partitioned, opaque-transition reinforcement learning. In Proceedings of the third annual conference on Autonomous Agents, pages 206-212. ACM, 1999.
- (1999) Proceedings of the Third Annual Conference on Autonomous Agents , pp. 206-212
- Stone, P.¹ Veloso, M.²

20
- 0003617454
- PhD thesis, Department of Computer Science, University of Massachusetts, Amherst
- R. S. Sutton. Temporal credit assignment in reinforcement learning. PhD thesis, Department of Computer Science, University of Massachusetts, Amherst, 1984.
- (1984) Temporal Credit Assignment in Reinforcement Learning
- Sutton, R.S.¹

21
- 0004102479
- MIT Press
- R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

22
- 34249833101
- Q-learning
- C. Watkins and P. Dayan. Q-learning. Machine learning, 8(3):279-292, 1992.
- (1992) Machine Learning , vol.8 , Issue.3 , pp. 279-292
- Watkins, C.¹ Dayan, P.²

23
- 27344453198
- Potential-based shaping and Q-value initialization are equivalent
- E. Wiewiora. Potential-based shaping and Q-value initialization are equivalent. Journal of Artificial Intelligence Research, 19(1):205-208, 2003.
- (2003) Journal of Artificial Intelligence Research , vol.19 , Issue.1 , pp. 205-208
- Wiewiora, E.¹

24
- 1942484477
- Principled methods for advising reinforcement learning agents
- E. Wiewiora, G. Cottrell, and C. Elkan. Principled methods for advising reinforcement learning agents. In Proceedings of the Twentieth International Conference on Machine Learning, 2003.
- (2003) Proceedings of the Twentieth International Conference on Machine Learning
- Wiewiora, E.¹ Cottrell, G.² Elkan, C.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.