[1] L. Busoniu, R. Babuska, and B. De Schutter. A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, 38(2):156-172, 2008.
[2] Y. Cao, W. Yu, W. Ren, and G. Chen. An overview of recent progress in the study of distributed multi-agent coordination. IEEE Transactions on Industrial Informatics, 9(1):427-438, 2013.
[3] R. H. Crites and A. G. Barto. Elevator group control using multiple reinforcement learning agents. Machine Learning, 33(2):235-262, 1998.
[4] J. N. Foerster, Y. M. Assael, N. de Freitas, and S. Whiteson. Learning to communicate to solve riddles with deep distributed recurrent Q-networks. arXiv:1602.02672, 2016.
[5] D. Fox, W. Burgard, H. Kruppa, and S. Thrun. Probabilistic approach to collaborative multi-robot localization. Autonomous Robots, 8(3):325-344, 2000.
[7] C. Guestrin, D. Koller, and R. Parr. Multiagent planning with factored MDPs. In NIPS, 2001.
[8] X. Guo, S. Singh, H. Lee, R. L. Lewis, and X. Wang. Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning. In NIPS, 2014.
[9] L. Kaiser and I. Sutskever. Neural GPUs learn algorithms. In ICLR, 2016.
[11] D. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[12] M. Lauer and M. A. Riedmiller. An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In ICML, 2000.
[13] S. Levine, C. Finn, T. Darrell, and P. Abbeel. End-to-end training of deep visuomotor policies. Journal of Machine Learning Research, 17(39):1-40, 2016.
[15] M. L. Littman. Value-function reinforcement learning in Markov games. Cognitive Systems Research, 2(1):55-66, 2001.
[16] C. J. Maddison, A. Huang, I. Sutskever, and D. Silver. Move evaluation in Go using deep convolutional neural networks. In ICLR, 2015.
[17] D. Maravall, J. de Lope, and R. Domínguez. Coordination of communication in robot teams by reinforcement learning. Robotics and Autonomous Systems, 61(7):661-666, 2013.
[18] M. Matarić. Reinforcement learning in the multi-robot domain. Autonomous Robots, 4(1):73-83, 1997.
[19] F. S. Melo, M. Spaan, and S. J. Witwicki. QueryPOMDP: POMDP-based communication in multiagent systems. In Multi-Agent Systems, pages 189-204, 2011.
[20] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, D. Wierstra, S. Legg, and D. Hassabis. Human-level control through deep reinforcement learning. Nature, 518(7540):529-533, 2015.
[21] R. Olfati-Saber, J. Fax, and R. Murray. Consensus and cooperation in networked multi-agent systems. Proceedings of the IEEE, 95(1):215-233, 2007.
[22] J. Pearl. Reverend Bayes on inference engines: A distributed hierarchical approach. In AAAI, 1982.
[23] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini. The graph neural network model. IEEE Trans. Neural Networks, 20(1):61-80, 2009.
[24] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484-489, 2016.
[25] P. Stone and M. Veloso. Towards collaborative and adversarial learning: A case study in robotic soccer. International Journal of Human-Computer Studies, 48, 1998.
[26] S. Sukhbaatar, A. Szlam, G. Synnaeve, S. Chintala, and R. Fergus. MazeBase: A sandbox for learning from games. CoRR, abs/1511.07401, 2015.
[29] A. Tampuu, T. Matiisen, D. Kodelja, I. Kuzovkin, K. Korjus, J. Aru, and R. Vicente. Multiagent cooperation and competition with deep reinforcement learning. arXiv:1511.08779, 2015.
[30] M. Tan. Multi-agent reinforcement learning: Independent vs. cooperative agents. In ICML, 1993.
[33] X. Wang and T. Sandholm. Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In NIPS, pages 1571-1578, 2002.
[34] J. Weston, A. Bordes, S. Chopra, and T. Mikolov. Towards AI-complete question answering: A set of prerequisite toy tasks. In ICLR, 2016.
[35] R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, pages 229-256, 1992.
[36] C. Xiong, S. Merity, and R. Socher. Dynamic memory networks for visual and textual question answering. In ICML, 2016.
[37] C. Zhang and V. Lesser. Coordinating multi-agent reinforcement learning with limited communication. In Proc. AAMAS, pages 1101-1108, 2013.