SCOPUS 정보 검색 플랫폼

Volumn 14, Issue 2, 2011, Pages 279-305

Action discovery for single and multi-agent reinforcement learning

Author keywords

Multi agent reinforcement learning; multi agent systems

Indexed keywords

EID: 79955402817 PISSN: 02195259 EISSN: None Source Type: Journal
DOI: 10.1142/S0219525911002937 Document Type: Conference Paper

Times cited : (6)

References (17)

1
- 84864030941
- An application of reinforcement learning to aerobatic helicopter flight
- Abbeel, P., Coates, A., Quigley, M. and Ng, A. Y., An application of reinforcement learning to aerobatic helicopter flight, NIPS 19 (2007).
- (2007) NIPS , vol.19
- Abbeel, P.¹ Coates, A.² Quigley, M.³ Ng, A.Y.⁴

2
- 57349106030
- Multiagent reinforcement learning and self-organization in a network of agents
- Abdallah, S. and Lesser, V., Multiagent reinforcement learning and self-organization in a network of agents, in Proceedings of 6th International Conference on Autonomous Agents and Multiagent Systems (AAMAS) (2007).
- (2007) Proceedings of 6th International Conference on Autonomous Agents and Multiagent Systems (AAMAS)
- Abdallah, S.¹ Lesser, V.²

4
- 35248823118
- Generalized multiagent learning with performance bound
- Banerjee, B. and Peng, J., Generalized multiagent learning with performance bound, Autonomous Agents and Multiagent Systems 15(3) (2007) 281-312.
- (2007) Autonomous Agents and Multiagent Systems , vol.15 , Issue.3 , pp. 281-312
- Banerjee, B.¹ Peng, J.²

5
- 84880690163
- Sequential optimality and coordination in multiagent systems
- Boutilier, C., Sequential optimality and coordination in multiagent systems, in Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (1999), pp. 478-485.
- (1999) Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence , pp. 478-485
- Boutilier, C.¹

7
- 85156187730
- Improving elevator performance using reinforcement learning
- Crites, R. H. and Barto, A. G., Improving elevator performance using reinforcement learning, in Advances in Neural Information Processing Systems 8, 8 (1996) 1017-1023.
- (1996) Advances in Neural Information Processing Systems , vol.8 , Issue.8 , pp. 1017-1023
- Crites, R.H.¹ Barto, A.G.²

8
- 4644369748
- Nash Q-learning for general-sum stochastic games
- Hu, J. and Wellman, M. P., Nash Q-learning for general-sum stochastic games, J. Mach. Learn. Res. 4 (2003) 1039-1069.
- (2003) J. Mach. Learn. Res. , vol.4 , pp. 1039-1069
- Hu, J.¹ Wellman, M.P.²

12
- 0013530450
- Lyapunov-constrained action sets for reinforcement learning
- Perkins, T. J. and Barto, A. G., Lyapunov-constrained action sets for reinforcement learning, in Proceedings of the ICML (2001), pp. 409-416.
- (2001) Proceedings of the ICML , pp. 409-416
- Perkins, T.J.¹ Barto, A.G.²

14
- 0004102479
- MIT Press
- Sutton, R. and Barto, A. G., Reinforcement Learning: An Introduction, MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.G.²

16
- 0029276036
- Temporal difference learning and TD-gammon
- Tesauro, G., Temporal difference learning and TD-gammon, Commun. ACM 38(3) (1995) 58-68.
- (1995) Commun. ACM , vol.38 , Issue.3 , pp. 58-68
- Tesauro, G.¹

17
- 27344453198
- Potential-based shaping and Q-value initialization are equivalent
- Wiewiora, E., Potential based shaping and Q-value initialization are equivalent, J. Artif. Intel. Res. 19 (2003) 205-208. (Pubitemid 41525920)
- (2003) Journal of Artificial Intelligence Research , vol.19 , pp. 205-208
- Wiewiora, E.¹

* 이 정보는 Elsevier사의 SCOPUS DB에서 KISTI가 분석하여 추출한 것입니다.