-
2
-
-
0020970738
-
Neuron-like adaptive elements that can solve difficult learning control problems
-
Barto, A, Sutton, R and Anderson, C, 1983, Neuron-like adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man and Cybernetics 13(5), 834-846.
-
(1983)
IEEE Transactions on Systems, Man and Cybernetics
, vol.13
, Issue.5
, pp. 834-846
-
-
Barto, A.1
Sutton, R.2
Anderson, C.3
-
3
-
-
33751170995
-
Learning to behave socially and avoid the Braess paradox in a commuting scenario
-
Bazzan, ALC and Klugl, F, 2003, Learning to behave socially and avoid the Braess paradox in a commuting scenario. In Proceedings of the 1st International Workshop on Evolutionary Game Theory for Learning in MAS, Melbourne Australia, 14 July 2003.
-
(2003)
Proceedings of the 1st International Workshop on Evolutionary Game Theory for Learning in MAS, Melbourne Australia, 14 July 2003
-
-
Bazzan, A.L.C.1
Klugl, F.2
-
8
-
-
0031281590
-
Learning through reinforcement and replicator dynamics
-
November
-
Borgers, T and Sarin, R, 1997, Learning through reinforcement and replicator dynamics. Journal of Economic Theory 77(1), November.
-
(1997)
Journal of Economic Theory
, vol.77
, Issue.1
-
-
Borgers, T.1
Sarin, R.2
-
10
-
-
0028564629
-
Acting optimally in partially observable stochastic domains
-
Cassandra, AR, Kaelbling, LP and Littman, ML, 1994, Acting optimally in partially observable stochastic domains. In Proceedings of the 12th National Conference on Artificial Intelligence, Seattle, WA.
-
(1994)
Proceedings of the 12th National Conference on Artificial Intelligence, Seattle, WA
-
-
Cassandra, A.R.1
Kaelbling, L.P.2
Littman, M.L.3
-
12
-
-
0003860985
-
-
Princeton, NJ: Princeton University Press
-
Gintis, CM, 2000, Game Theory Evolving. Princeton, NJ: Princeton University Press.
-
(2000)
Game Theory Evolving
-
-
Gintis, C.M.1
-
13
-
-
28544446120
-
-
Menlo Park, CA: AAAI Press
-
Grenager, T, Powers, R and Shoham, Y, 2002, Dispersion Games: General Definitions and Some Specific Learning Results. Menlo Park, CA: AAAI Press.
-
(2002)
Dispersion Games: General Definitions and Some Specific Learning Results
-
-
Grenager, T.1
Powers, R.2
Shoham, Y.3
-
14
-
-
0003779190
-
-
Reading, MA: Academic Press
-
Hirsch, MW and Smale, S, 1974, Differential Equations, Dynamical Systems and Linear Algebra. Reading, MA: Academic Press.
-
(1974)
Differential Equations, Dynamical Systems and Linear Algebra
-
-
Hirsch, M.W.1
Smale, S.2
-
15
-
-
22944478374
-
Engineering multi-agent reinforcement learning using evolutionary dynamics
-
Berlin: Springer
-
Hoen, PJ and Tuyls, K, 2004, Engineering multi-agent reinforcement learning using evolutionary dynamics. In Proceedings of the 15th European Conference on Machine Learning (ECML'04) (Lecture Notes in Artificial Intelligence, 3201), Pisa, Italy, 20-24 September 2004. Berlin: Springer.
-
(2004)
Proceedings of the 15th European Conference on Machine Learning (ECML'04) (Lecture Notes in Artificial Intelligence, 3201), Pisa, Italy, 20-24 September 2004
, vol.3201
-
-
Hoen, P.J.1
Tuyls, K.2
-
18
-
-
9444236608
-
On no-regret learning, fictitious play, and Nash equilibrium
-
Jafari, C, Greenwald, A, Gondek, D and Ercal, G, 2001, On no-regret learning, fictitious play, and Nash equilibrium. In Proceedings of the 18th International Conference on Machine Learning, pp. 223-226.
-
(2001)
Proceedings of the 18th International Conference on Machine Learning
, pp. 223-226
-
-
Jafari, C.1
Greenwald, A.2
Gondek, D.3
Ercal, G.4
-
25
-
-
0012327484
-
Using eligibility traces to find the best memoryless policy in a partially observable Markov process
-
Loch, J and Singh, S, 1998, Using eligibility traces to find the best memoryless policy in a partially observable Markov process. In Proceedings of the 15th International Conference on Machine Learning, San Francisco, CA.
-
(1998)
Proceedings of the 15th International Conference on Machine Learning, San Francisco, CA
-
-
Loch, J.1
Singh, S.2
-
27
-
-
34548719708
-
The logic of animal conflict
-
Maynard Smith, J and Price, GR, 1973, The logic of animal conflict. Nature 246, 15-18.
-
(1973)
Nature
, vol.246
, pp. 15-18
-
-
Maynard Smith, J.1
Price, G.R.2
-
29
-
-
0016082525
-
Learning automata: A survey
-
Narendra, K and Thathachar, M, 1974, Learning automata: a survey. IEEE Transactions on Systems, Man, and Cybernetics 4, 323-334.
-
(1974)
IEEE Transactions on Systems, Man, and Cybernetics
, vol.4
, pp. 323-334
-
-
Narendra, K.1
Thathachar, M.2
-
31
-
-
84948131383
-
Social agents playing a periodical policy
-
Nowé, A, Parent, J and Verbeeck, K, 2001, Social agents playing a periodical policy. In Proceedings of the 12th European Conference on Machine Learning, pp. 382-393.
-
(2001)
Proceedings of the 12th European Conference on Machine Learning
, pp. 382-393
-
-
Nowé, A.1
Parent, J.2
Verbeeck, K.3
-
32
-
-
0011847654
-
Distributed reinforcement learning, load-based routing a case study
-
Nowé, A and Verbeeck, K, 1999, Distributed reinforcement learning, load-based routing a case study. Notes of the Neural, Symbolic and Reinforcement Methods for Sequence Learning Workshop at IJCAI99, Stockholm, Sweden.
-
(1999)
Notes of the Neural, Symbolic and Reinforcement Methods for Sequence Learning Workshop at IJCAI99, Stockholm, Sweden
-
-
Nowé, A.1
Verbeeck, K.2
-
41
-
-
1142293590
-
-
Institute for Theoretical Physics, Köln, Euroland
-
Stauffer, D, 1999, Life, Love and Death: Models of Biological Reproduction and Aging. Institute for Theoretical Physics, Köln, Euroland.
-
(1999)
Life, Love and Death: Models of Biological Reproduction and Aging
-
-
Stauffer, D.1
-
42
-
-
28544444638
-
Towards a hardware implementation of reinforcement learning for call admission control in networks for integrated services
-
Steenhaut, K, Nowé, A, Fakir, M and Dirkx, E, 1997, Towards a hardware implementation of reinforcement learning for call admission control in networks for integrated services. In Proceedings of the International Workshop on Applications of Neural Networks and other Intelligent Techniques to Telecommunications, 3, Melbourne.
-
(1997)
Proceedings of the International Workshop on Applications of Neural Networks and Other Intelligent Techniques to Telecommunications, 3, Melbourne
-
-
Steenhaut, K.1
Nowé, A.2
Fakir, M.3
Dirkx, E.4
-
43
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
Boston, MA: Kluwer Academic
-
Sutton, RS, 1988, Learning to predict by the methods of temporal differences. Machine Learning, vol. 3. Boston, MA: Kluwer Academic, pp. 9-44.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
46
-
-
0036894214
-
Varieties of learning automata: An overview
-
Thathachar, MAL and Sastry, PS, 2002, Varieties of learning automata: an overview. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics 32(6).
-
(2002)
IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics
, vol.32
, Issue.6
-
-
Thathachar, M.A.L.1
Sastry, P.S.2
-
47
-
-
0000502181
-
On the behavior of finite automata in random media
-
Tsetlin, ML, 1962, On the behavior of finite automata in random media. Automation and Remote Control 22, 1210-1219.
-
(1962)
Automation and Remote Control
, vol.22
, pp. 1210-1219
-
-
Tsetlin, M.L.1
-
49
-
-
27144547178
-
Asynchronous stochastic approximation and Q-learning
-
Laboratory for Information and Decision Systems and the Operation Research Center, MIT, Cambridge, MA
-
Tsitsiklis, JN, 1993, Asynchronous stochastic approximation and Q-learning. Internal Report, Laboratory for Information and Decision Systems and the Operation Research Center, MIT, Cambridge, MA.
-
(1993)
Internal Report
-
-
Tsitsiklis, J.N.1
-
50
-
-
1142305721
-
Towards a relation between learning agents and evolutionary dynamics
-
Belgium: KU Leuven
-
Tuyls, K, Lenaerts, T, Verbeeck, K, Maes, S and Manderick, B, 2002, Towards a relation between learning agents and evolutionary dynamics. In Proceedings of the Belgium-Netherlands Artificial Intelligence Conference 2002 (BNAIC). Belgium: KU Leuven.
-
(2002)
Proceedings of the Belgium-Netherlands Artificial Intelligence Conference 2002 (BNAIC)
-
-
Tuyls, K.1
Lenaerts, T.2
Verbeeck, K.3
Maes, S.4
Manderick, B.5
-
51
-
-
8344263004
-
On a dynamical analysis of reinforcement learning in games: Emergence of Occam's Razor
-
Berlin: Springer
-
Tuyls, K, Verbeeck, K and Maes, S, 2003, On a dynamical analysis of reinforcement learning in games: emergence of Occam's Razor. Multi-agent Systems and Applications III (Central and Eastern European Conference on Multi-Agent Systems 2003), Prague, 16-18 June 2003, Czech Republic (Lecture Notes in Artificial Intelligence, 2691). Berlin: Springer.
-
Multi-agent Systems and Applications III (Central and Eastern European Conference on Multi-Agent Systems 2003), Prague, 16-18 June 2003, Czech Republic (Lecture Notes in Artificial Intelligence, 2691)
, vol.2691
-
-
Tuyls, K.1
Verbeeck, K.2
Maes, S.3
-
52
-
-
26444437242
-
A selection-mutation model for Q-learning in multi-agent systems
-
New York: ACM Press
-
Tuyls, K, Verbeeck, K and Lenaerts, T, 2003, A selection-mutation model for Q-learning in multi-agent systems. In The ACM International Conference Proceedings Series, Autonomous Agents and Multi-agent Systems 2003, Melbourne, Australia 14-18 July 2003. New York: ACM Press.
-
(2003)
The ACM International Conference Proceedings Series, Autonomous Agents and Multi-agent Systems 2003, Melbourne, Australia 14-18 July 2003
-
-
Tuyls, K.1
Verbeeck, K.2
Lenaerts, T.3
-
53
-
-
9444229990
-
Extended replicator dynamics as a key to reinforcement learning in multi-agent systems
-
Berlin: Springer
-
Tuyls, K, Heytens, D, Nowé, A and Manderick, B, 2003, Extended replicator dynamics as a key to reinforcement learning in multi-agent systems. In Proceedings of the European Conference on Machine Learning '03 (Lecture Notes in Artificial Intelligence), Cavtat-Dubrovnik, Croatia 22-26 September 2003. Berlin: Springer.
-
(2003)
Proceedings of the European Conference on Machine Learning '03 (Lecture Notes in Artificial Intelligence), Cavtat-Dubrovnik, Croatia 22-26 September 2003
-
-
Tuyls, K.1
Heytens, D.2
Nowé, A.3
Manderick, B.4
-
54
-
-
84943265381
-
Learning to reach the Pareto optimal Nash equilibrium as a team
-
Berlin: Springer
-
Verbeeck, K, Nowé, A, Lenaerts, T and Parent, J, 2002, Learning to reach the Pareto optimal Nash equilibrium as a team. In Proceedings of the 15th Australian Joint Conference on Artificial Intelligence (Lecture Notes in Artificial Intelligence, 2557). Berlin: Springer, pp. 407-418.
-
(2002)
Proceedings of the 15th Australian Joint Conference on Artificial Intelligence (Lecture Notes in Artificial Intelligence, 2557)
, vol.2557
, pp. 407-418
-
-
Verbeeck, K.1
Nowé, A.2
Lenaerts, T.3
Parent, J.4
-
59
-
-
84899033169
-
Using collective intelligence to route internet traffic
-
Wolpert, DH, Tumer, K and Frank, J, 1998, Using collective intelligence to route internet traffic. Advances in Neural Information Processing Systems, Denver, CO, 1998, pp. 952-958.
-
(1998)
Advances in Neural Information Processing Systems, Denver, CO, 1998
, pp. 952-958
-
-
Wolpert, D.H.1
Tumer, K.2
Frank, J.3
-
60
-
-
0032691530
-
General principles of learning-based multi-agent systems
-
New York: ACM Press
-
Wolpert, DH, Wheeler, KR and Tumer, K, 1999, General principles of learning-based multi-agent systems. In Proceedings of the 3rd International Conference on Autonomous Agents (Agents'99), Seattle, WA. New York: ACM Press.
-
(1999)
Proceedings of the 3rd International Conference on Autonomous Agents (Agents'99), Seattle, WA
-
-
Wolpert, D.H.1
Wheeler, K.R.2
Tumer, K.3