[2] N. Vlassis. (2003, Sep.). "A concise introduction to multiagent systems and distributed AI," Fac. Sci., Univ. Amsterdam, Amsterdam, The Netherlands, Tech. Rep. [Online]. Available: http://www.science.uva.nl/~vlassis/cimasdai/cimasdai.pdf
[3] H. V. D. Parunak, "Industrial and practical applications of DAI," in Multi-Agent Systems: A Modern Approach to Distributed Artificial Intelligence, G. Weiss, Ed. Cambridge, MA: MIT Press, 1999, ch. 9, pp. 377-412.

[4] P. Stone and M. Veloso, "Multiagent systems: A survey from the machine learning perspective," Auton. Robots, vol. 8, no. 3, pp. 345-383, 2000.

[5] R. H. Crites and A. G. Barto, "Elevator group control using multiple reinforcement learning agents," Mach. Learn., vol. 33, no. 2-3, pp. 235-262, 1998.

[6] S. Sen and G. Weiss, "Learning in multiagent systems," in Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence, G. Weiss, Ed. Cambridge, MA: MIT Press, 1999, ch. 6, pp. 259-298.
[8] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey," J. Artif. Intell. Res., vol. 4, pp. 237-285, 1996.

[13] M. Bowling and M. Veloso, "Multiagent learning using a variable learning rate," Artif. Intell., vol. 136, no. 2, pp. 215-250, 2002.
[14] Y. Shoham, R. Powers, and T. Grenager. (2003, May). "Multi-agent reinforcement learning: A critical survey," Comput. Sci. Dept., Stanford Univ., Stanford, CA, Tech. Rep. [Online]. Available: http://multiagent.stanford.edu/papers/MALearning_ACriticalSurvey_2003_0516.pdf
[15] T. Bäck, Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. London, U.K.: Oxford Univ. Press, 1996.

[17] L. Panait and S. Luke, "Cooperative multi-agent learning: The state of the art," Auton. Agents Multi-Agent Syst., vol. 11, no. 3, pp. 387-434, Nov. 2005.
[18] M. A. Potter and K. A. De Jong, "A cooperative coevolutionary approach to function optimization," in Proc. 3rd Conf. Parallel Probl. Solving Nat. (PPSN-III), Jerusalem, Israel, Oct. 9-14, 1994, pp. 249-257.

[19] S. G. Ficici and J. B. Pollack, "A game-theoretic approach to the simple coevolutionary algorithm," in Proc. 6th Int. Conf. Parallel Probl. Solving Nat. (PPSN-VI), Paris, France, Sep. 18-20, 2000, pp. 467-476.

[20] L. Panait, R. P. Wiegand, and S. Luke, "Improving coevolutionary search for optimal multiagent behaviors," in Proc. 18th Int. Joint Conf. Artif. Intell. (IJCAI-03), Acapulco, Mexico, Aug. 9-15, 2003, pp. 653-660.

[21] T. Haynes, R. Wainwright, S. Sen, and D. Schoenefeld, "Strongly typed genetic programming in evolving cooperation strategies," in Proc. 6th Int. Conf. Genet. Algorithms (ICGA-95), Pittsburgh, PA, Jul. 15-19, 1995, pp. 271-278.
[22] R. Salustowicz, M. Wiering, and J. Schmidhuber, "Learning team strategies: Soccer case studies," Mach. Learn., vol. 33, no. 2-3, pp. 263-282, 1998.
[23] T. Miconi, "When evolving populations is better than coevolving individuals: The blind mice problem," in Proc. 18th Int. Joint Conf. Artif. Intell. (IJCAI-03), Acapulco, Mexico, Aug. 9-15, 2003, pp. 647-652.

[24] V. Könönen, "Gradient based method for symmetric and asymmetric multiagent reinforcement learning," in Proc. 4th Int. Conf. Intell. Data Eng. Autom. Learn. (IDEAL-03), Hong Kong, China, Mar. 21-23, 2003, pp. 68-75.
[25] F. Ho and M. Kamel, "Learning coordination strategies for cooperative multiagent systems," Mach. Learn., vol. 33, no. 2-3, pp. 155-177, 1998.

[26] J. Schmidhuber, "A general method for incremental self-improvement and multi-agent learning," in Evolutionary Computation: Theory and Applications, X. Yao, Ed. Singapore: World Scientific, 1999, ch. 3, pp. 81-123.

[28] K. Tuyls and A. Nowé, "Evolutionary game theory and multi-agent reinforcement learning," Knowl. Eng. Rev., vol. 20, no. 1, pp. 63-90, 2005.
[33] J. Peng and R. J. Williams, "Incremental multi-step Q-learning," Mach. Learn., vol. 22, no. 1-3, pp. 283-290, 1996.

[34] R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, pp. 9-44, 1988.
[35] A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. SMC-13, no. 5, pp. 834-846, Sep./Oct. 1983.

[36] R. S. Sutton, "Integrated architectures for learning, planning, and reacting based on approximating dynamic programming," in Proc. 7th Int. Conf. Mach. Learn. (ICML-90), Austin, TX, Jun. 21-23, 1990, pp. 216-224.
[37] A. W. Moore and C. G. Atkeson, "Prioritized sweeping: Reinforcement learning with less data and less time," Mach. Learn., vol. 13, pp. 103-130, 1993.

[38] M. L. Littman, "Value-function reinforcement learning in Markov games," J. Cogn. Syst. Res., vol. 2, no. 1, pp. 55-66, 2001.
[39] M. L. Littman, "Markov games as a framework for multi-agent reinforcement learning," in Proc. 11th Int. Conf. Mach. Learn. (ICML-94), New Brunswick, NJ, Jul. 10-13, 1994, pp. 157-163.

[40] J. Hu and M. P. Wellman, "Multiagent reinforcement learning: Theoretical framework and an algorithm," in Proc. 15th Int. Conf. Mach. Learn. (ICML-98), Madison, WI, Jul. 24-27, 1998, pp. 242-250.

[41] M. Lauer and M. Riedmiller, "An algorithm for distributed reinforcement learning in cooperative multi-agent systems," in Proc. 17th Int. Conf. Mach. Learn. (ICML-00), Stanford Univ., Stanford, CA, Jun. 29-Jul. 2, 2000, pp. 535-542.

[42] A. Greenwald and K. Hall, "Correlated-Q learning," in Proc. 20th Int. Conf. Mach. Learn. (ICML-03), Washington, DC, Aug. 21-24, 2003, pp. 242-249.
[43] T. Jaakkola, M. I. Jordan, and S. P. Singh, "On the convergence of stochastic iterative dynamic programming algorithms," Neural Comput., vol. 6, no. 6, pp. 1185-1201, 1994.

[44] J. N. Tsitsiklis, "Asynchronous stochastic approximation and Q-learning," Mach. Learn., vol. 16, no. 1, pp. 185-202, 1994.
[45] C. Guestrin, M. G. Lagoudakis, and R. Parr, "Coordinated reinforcement learning," in Proc. 19th Int. Conf. Mach. Learn. (ICML-02), Sydney, Australia, Jul. 8-12, 2002, pp. 227-234.
[46] J. R. Kok, M. T. J. Spaan, and N. Vlassis, "Non-communicative multi-robot coordination in dynamic environment," Robot. Auton. Syst., vol. 50, no. 2-3, pp. 99-114, 2005.
[47] J. R. Kok and N. Vlassis, "Sparse cooperative Q-learning," in Proc. 21st Int. Conf. Mach. Learn. (ICML-04), Banff, AB, Canada, Jul. 4-8, 2004, pp. 481-488.

[48] J. R. Kok and N. Vlassis, "Using the max-plus algorithm for multiagent decision making in coordination graphs," in Robot Soccer World Cup IX (RoboCup 2005), Lecture Notes in Computer Science, vol. 4020, Osaka, Japan, Jul. 13-19, 2005.

[49] R. Fitch, B. Hengst, D. Sue, G. Calbert, and J. B. Scholz, "Structural abstraction experiments in reinforcement learning," in Proc. 18th Aust. Joint Conf. Artif. Intell. (AI-05), Lecture Notes in Computer Science, vol. 3809, Sydney, Australia, Dec. 5-9, 2005, pp. 164-175.

[50] L. Buşoniu, B. De Schutter, and R. Babuška, "Multiagent reinforcement learning with adaptive state focus," in Proc. 17th Belgian-Dutch Conf. Artif. Intell. (BNAIC-05), Brussels, Belgium, Oct. 17-18, 2005, pp. 35-42.

[51] M. Tan, "Multi-agent reinforcement learning: Independent vs. cooperative agents," in Proc. 10th Int. Conf. Mach. Learn. (ICML-93), Amherst, MA, Jun. 27-29, 1993, pp. 330-337.
[52] J. Clouse, "Learning from an automated training agent," presented at the Workshop Agents that Learn from Other Agents, 12th Int. Conf. Mach. Learn. (ICML-95), Tahoe City, CA, Jul. 9-12, 1995.
[53] B. Price and C. Boutilier, "Accelerating reinforcement learning through implicit imitation," J. Artif. Intell. Res., vol. 19, pp. 569-629, 2003.

[54] J. Hu and M. P. Wellman, "Nash Q-learning for general-sum stochastic games," J. Mach. Learn. Res., vol. 4, pp. 1039-1069, 2003.
[55] R. Powers and Y. Shoham, "New criteria and a new algorithm for learning in multi-agent systems," in Proc. Adv. Neural Inf. Process. Syst. (NIPS-04), Vancouver, BC, Canada, Dec. 13-18, 2004, vol. 17, pp. 1089-1096.

[56] M. Bowling and M. Veloso, "Rational and convergent learning in stochastic games," in Proc. 17th Int. Joint Conf. Artif. Intell. (IJCAI-01), San Francisco, CA, Aug. 4-10, 2001, pp. 1021-1026.

[57] M. Bowling, "Convergence and no-regret in multiagent learning," in Proc. Adv. Neural Inf. Process. Syst. (NIPS-04), Vancouver, BC, Canada, Dec. 13-18, 2004, vol. 17, pp. 209-216.
[58] G. Chalkiadakis. (2003, Mar.). "Multiagent reinforcement learning: Stochastic games with multiple learning players," Dept. Comput. Sci., Univ. Toronto, Toronto, ON, Canada, Tech. Rep. [Online]. Available: http://www.cs.toronto.edu/~gehalk/DepthReport/DepthReport.ps
[59] M. Bowling and M. Veloso. (2000, Oct.). "An analysis of stochastic game theory for multiagent reinforcement learning," Dept. Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. [Online]. Available: http://www.cs.ualberta.ca/bowling/papers/00tr.pdf
[60] V. Conitzer and T. Sandholm, "AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents," in Proc. 20th Int. Conf. Mach. Learn. (ICML-03), Washington, DC, Aug. 21-24, 2003, pp. 83-90.
[61] M. Bowling, "Multiagent learning in the presence of agents with limitations," Ph.D. dissertation, Dept. Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, May 2003.
[62] C. Claus and C. Boutilier, "The dynamics of reinforcement learning in cooperative multiagent systems," in Proc. 15th Nat. Conf. Artif. Intell. 10th Conf. Innov. Appl. Artif. Intell. (AAAI/IAAI-98), Madison, WI, Jul. 26-30, 1998, pp. 746-752.

[63] S. Kapetanakis and D. Kudenko, "Reinforcement learning of coordination in cooperative multi-agent systems," in Proc. 18th Nat. Conf. Artif. Intell. 14th Conf. Innov. Appl. Artif. Intell. (AAAI/IAAI-02), Menlo Park, CA, Jul. 28-Aug. 1, 2002, pp. 326-331.

[64] X. Wang and T. Sandholm, "Reinforcement learning to play an optimal Nash equilibrium in team Markov games," in Proc. Adv. Neural Inf. Process. Syst. (NIPS-02), Vancouver, BC, Canada, Dec. 9-14, 2002, vol. 15, pp. 1571-1578.
[65] G. W. Brown, "Iterative solutions of games by fictitious play," in Activity Analysis of Production and Allocation, T. C. Koopmans, Ed. New York: Wiley, 1951, ch. XXIV, pp. 374-376.

[66] S. Singh, M. Kearns, and Y. Mansour, "Nash convergence of gradient dynamics in general-sum games," in Proc. 16th Conf. Uncertainty Artif. Intell. (UAI-00), San Francisco, CA, Jun. 30-Jul. 3, 2000, pp. 541-548.

[67] M. Zinkevich, "Online convex programming and generalized infinitesimal gradient ascent," in Proc. 20th Int. Conf. Mach. Learn. (ICML-03), Washington, DC, Aug. 21-24, 2003, pp. 928-936.

[68] G. Tesauro, "Extending Q-learning to general adaptive multi-agent systems," in Proc. Adv. Neural Inf. Process. Syst. (NIPS-03), Vancouver, BC, Canada, Dec. 8-13, 2003, vol. 16.
[69] S. Sen, M. Sekaran, and J. Hale, "Learning to coordinate without sharing information," in Proc. 12th Nat. Conf. Artif. Intell. (AAAI-94), Seattle, WA, Jul. 31-Aug. 4, 1994, pp. 426-431.
[70] M. J. Matarić, "Reinforcement learning in the multi-robot domain," Auton. Robots, vol. 4, no. 1, pp. 73-83, 1997.

[71] R. H. Crites and A. G. Barto, "Improving elevator performance using reinforcement learning," in Proc. Adv. Neural Inf. Process. Syst. (NIPS-95), Denver, CO, Nov. 27-30, 1996, vol. 8, pp. 1017-1023.
[72] V. Könönen, "Asymmetric multiagent reinforcement learning," in Proc. IEEE/WIC Int. Conf. Intell. Agent Technol. (IAT-03), Halifax, NS, Canada, Oct. 13-17, 2003, pp. 336-342.

[73] M. Weinberg and J. S. Rosenschein, "Best-response multiagent learning in non-stationary environments," in Proc. 3rd Int. Joint Conf. Auton. Agents Multiagent Syst. (AAMAS-04), New York, NY, Aug. 19-23, 2004, pp. 506-513.

[74] B. Banerjee and J. Peng, "Adaptive policy gradient in multiagent learning," in Proc. 2nd Int. Joint Conf. Auton. Agents Multiagent Syst. (AAMAS-03), Melbourne, Australia, Jul. 14-18, 2003, pp. 686-692.

[75] N. Suematsu and A. Hayashi, "A multiagent reinforcement learning algorithm using extended optimal response," in Proc. 1st Int. Joint Conf. Auton. Agents Multiagent Syst. (AAMAS-02), Bologna, Italy, Jul. 15-19, 2002, pp. 370-377.
[76] D. Carmel and S. Markovitch, "Opponent modeling in multi-agent systems," in Adaptation and Learning in Multi-Agent Systems, G. Weiss and S. Sen, Eds. New York: Springer-Verlag, 1996, ch. 3, pp. 40-52.
[77] W. T. Uther and M. Veloso. (1997, Apr.). "Adversarial reinforcement learning," School Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. [Online]. Available: http://www.cs.cmu.edu/afs/cs/user/will/www/papers/Uther97a.ps
[78] D. V. Pynadath and M. Tambe, "The communicative multiagent team decision problem: Analyzing teamwork theories and models," J. Artif. Intell. Res., vol. 16, pp. 389-423, 2002.
[79] M. T. J. Spaan, N. Vlassis, and F. C. A. Groen, "High level coordination of agents based on multiagent Markov decision processes with roles," in Proc. Workshop Coop. Robot., 2002 IEEE/RSJ Int. Conf. Intell. Robots Syst. (IROS-02), Lausanne, Switzerland, Oct. 1, 2002, pp. 66-73.

[80] C. Boutilier, "Planning, learning and coordination in multiagent decision processes," in Proc. 6th Conf. Theor. Aspects Rationality Knowl. (TARK-96), De Zeeuwse Stromen, The Netherlands, Mar. 17-20, 1996, pp. 195-210.

[81] F. Fischer, M. Rovatsos, and G. Weiss, "Hierarchical reinforcement learning in communication-mediated multiagent coordination," in Proc. 3rd Int. Joint Conf. Auton. Agents Multiagent Syst. (AAMAS-04), New York, NY, Aug. 19-23, 2004, pp. 1334-1335.
[82] M. V. Nagendra Prasad, V. R. Lesser, and S. E. Lander, "Learning organizational roles for negotiated search in a multiagent system," Int. J. Hum. Comput. Stud., vol. 48, no. 1, pp. 51-67, 1998.
[83] J. R. Kok, P. J. 't Hoen, B. Bakker, and N. Vlassis, "Utile coordination: Learning interdependencies among cooperative agents," in Proc. IEEE Symp. Comput. Intell. Games (CIG-05), Colchester, U.K., Apr. 4-6, 2005, pp. 29-36.

[85] M. J. Matarić, "Reward functions for accelerated learning," in Proc. 11th Int. Conf. Mach. Learn. (ICML-94), New Brunswick, NJ, Jul. 10-13, 1994, pp. 181-189.
[86] M. J. Matarić, "Learning in multi-robot systems," in Adaptation and Learning in Multi-Agent Systems, G. Weiss and S. Sen, Eds. New York: Springer-Verlag, 1996, ch. 10, pp. 152-163.

[87] K. Tuyls, P. J. 't Hoen, and B. Vanschoenwinkel, "An evolutionary dynamical analysis of multi-agent learning in iterated games," Auton. Agents Multi-Agent Syst., vol. 12, no. 1, pp. 115-153, 2006.
[88] M. Bowling, "Convergence problems of general-sum multiagent reinforcement learning," in Proc. 17th Int. Conf. Mach. Learn. (ICML-00), Stanford Univ., Stanford, CA, Jun. 29-Jul. 2, 2000, pp. 89-94.

[89] M. L. Littman and P. Stone, "Implicit negotiation in repeated games," in Proc. 8th Int. Workshop Agent Theories Arch. Lang. (ATAL-2001), Seattle, WA, Aug. 21-24, 2001, pp. 96-105.

[90] V. Stephan, K. Debes, H.-M. Gross, F. Wintrich, and H. Wintrich, "A reinforcement learning based neural multi-agent-system for control of a combustion process," in Proc. IEEE-INNS-ENNS Int. Joint Conf. Neural Netw. (IJCNN-00), Como, Italy, Jul. 24-27, 2000, pp. 6217-6222.

[91] M. Wiering, "Multi-agent reinforcement learning for traffic light control," in Proc. 17th Int. Conf. Mach. Learn. (ICML-00), Stanford Univ., Stanford, CA, Jun. 29-Jul. 2, 2000, pp. 1151-1158.

[92] B. Bakker, M. Steingrover, R. Schouten, E. Nijhuis, and L. Kester, "Cooperative multi-agent reinforcement learning of traffic lights," presented at the Workshop Coop. Multi-Agent Learn., 16th Eur. Conf. Mach. Learn. (ECML-05), Porto, Portugal, Oct. 3, 2005.
[93] M. A. Riedmiller, A. W. Moore, and J. G. Schneider, "Reinforcement learning for cooperating and communicating reactive agents in electrical power grids," in Balancing Reactivity and Social Deliberation in Multi-Agent Systems, M. Hannebauer, J. Wendler, and E. Pagello, Eds. New York: Springer, 2000, pp. 137-149.
[94] C. F. Touzet, "Robot awareness in cooperative mobile robot learning," Auton. Robots, vol. 8, no. 1, pp. 87-97, 2000.
[95] F. Fernández and L. E. Parker, "Learning in large cooperative multirobot systems," Int. J. Robot. Autom., vol. 16, no. 4, pp. 217-226, 2001.
[96] Y. Ishiwaka, T. Sato, and Y. Kakazu, "An approach to the pursuit problem on a heterogeneous multiagent system using reinforcement learning," Robot. Auton. Syst., vol. 43, no. 4, pp. 245-256, 2003.
[97] P. Stone and M. Veloso, "Team-partitioned, opaque-transition reinforcement learning," in Proc. 3rd Int. Conf. Auton. Agents (Agents-99), Seattle, WA, May 1-5, 1999, pp. 206-212.
[98] M. Wiering, R. Salustowicz, and J. Schmidhuber, "Reinforcement learning soccer teams with incomplete world models," Auton. Robots, vol. 7, no. 1, pp. 77-88, 1999.
[99] K. Tuyls, S. Maes, and B. Manderick, "Q-learning in simulated robotic soccer - large state spaces and incomplete information," in Proc. 2002 Int. Conf. Mach. Learn. Appl. (ICMLA-02), Las Vegas, NV, Jun. 24-27, 2002, pp. 226-232.
[100] A. Merke and M. A. Riedmiller, "Karlsruhe brainstormers - A reinforcement learning approach to robotic soccer," in Robot Soccer World Cup V (RoboCup 2001), Lecture Notes in Computer Science, vol. 2377, Washington, DC, Aug. 2-10, 2001, pp. 435-440.
[101] M. P. Wellman, A. R. Greenwald, P. Stone, and P. R. Wurman, "The 2001 trading agent competition," Electron. Markets, vol. 13, no. 1, pp. 4-12, 2003.
[102] W.-T. Hsu and V.-W. Soo, "Market performance of adaptive trading agents in synchronous double auctions," in Proc. 4th Pacific Rim Int. Workshop Multi-Agents: Intell. Agents, Specification, Model. Appl. (PRIMA-01), Lecture Notes in Computer Science, vol. 2132, Taipei, Taiwan, R.O.C., Jul. 28-29, 2001, pp. 108-121.
[103] J. W. Lee and J. O, "A multi-agent Q-learning framework for optimizing stock trading systems," in Proc. 13th Int. Conf. Database Expert Syst. Appl. (DEXA-02), Lecture Notes in Computer Science, vol. 2453, Aix-en-Provence, France, Sep. 2-6, 2002, pp. 153-162.
[104] J. O, J. W. Lee, and B.-T. Zhang, "Stock trading system using reinforcement learning with cooperative agents," in Proc. 19th Int. Conf. Mach. Learn. (ICML-02), Sydney, Australia, Jul. 8-12, 2002, pp. 451-458.
[105] G. Tesauro and J. O. Kephart, "Pricing in agent economies using multiagent Q-learning," Auton. Agents Multi-Agent Syst., vol. 5, no. 3, pp. 289-304, 2002.
[106] C. Raju, Y. Narahari, and K. Ravikumar, "Reinforcement learning applications in dynamic pricing of retail markets," in Proc. 2003 IEEE Int. Conf. E-Commerce (CEC-03), Newport Beach, CA, Jun. 24-27, 2003, pp. 339-346.
[107] A. Schaerf, Y. Shoham, and M. Tennenholtz, "Adaptive load balancing: A study in multi-agent learning," J. Artif. Intell. Res., vol. 2, pp. 475-500, 1995.
[108] J. A. Boyan and M. L. Littman, "Packet routing in dynamically changing networks: A reinforcement learning approach," in Proc. Adv. Neural Inf. Process. Syst. (NIPS-93), Denver, CO, Nov. 29-Dec. 2, 1993, vol. 6, pp. 671-678.
[109] S. P. M. Choi and D.-Y. Yeung, "Predictive Q-routing: A memory-based reinforcement learning approach to adaptive traffic control," in Proc. Adv. Neural Inf. Process. Syst. (NIPS-95), Denver, CO, Nov. 27-30, 1995, vol. 8, pp. 945-951.
[110] P. Tillotson, Q. Wu, and P. Hughes, "Multi-agent learning for routing control within an Internet environment," Eng. Appl. Artif. Intell., vol. 17, no. 2, pp. 179-185, 2004.
[112] G. Gordon, "Stable function approximation in dynamic programming," in Proc. 12th Int. Conf. Mach. Learn. (ICML-95), Tahoe City, CA, Jul. 9-12, 1995, pp. 261-268.
[113] J. N. Tsitsiklis and B. Van Roy, "Feature-based methods for large scale dynamic programming," Mach. Learn., vol. 22, no. 1-3, pp. 59-94, 1996.
[114] R. Munos and A. Moore, "Variable-resolution discretization in optimal control," Mach. Learn., vol. 49, no. 2-3, pp. 291-323, 2002.
[115] R. Munos, "Performance bounds in Lp-norm for approximate value iteration," SIAM J. Control Optim., vol. 46, no. 2, pp. 546-561, 2007.
[116] C. Szepesvári and R. Munos, "Finite time bounds for sampling based fitted value iteration," in Proc. 22nd Int. Conf. Mach. Learn. (ICML-05), Bonn, Germany, Aug. 7-11, 2005, pp. 880-887.
[117] J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal difference learning with function approximation," IEEE Trans. Autom. Control, vol. 42, no. 5, pp. 674-690, May 1997.
[118] D. Ormoneit and S. Sen, "Kernel-based reinforcement learning," Mach. Learn., vol. 49, no. 2-3, pp. 161-178, 2002.
[119] C. Szepesvári and W. D. Smart, "Interpolation-based Q-learning," in Proc. 21st Int. Conf. Mach. Learn. (ICML-04), Banff, AB, Canada, Jul. 4-8, 2004.
[120] D. Ernst, P. Geurts, and L. Wehenkel, "Tree-based batch mode reinforcement learning," J. Mach. Learn. Res., vol. 6, pp. 503-556, 2005.
[121] M. G. Lagoudakis and R. Parr, "Least-squares policy iteration," J. Mach. Learn. Res., vol. 4, pp. 1107-1149, 2003.
[122] S. Džeroski, L. De Raedt, and K. Driessens, "Relational reinforcement learning," Mach. Learn., vol. 43, no. 1-2, pp. 7-52, 2001.
[123] O. Abul, F. Polat, and R. Alhajj, "Multiagent reinforcement learning using function approximation," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 30, no. 4, pp. 485-497, Nov. 2000.
[124] L. Buşoniu, B. De Schutter, and R. Babuška, "Decentralized reinforcement learning control of a robotic manipulator," in Proc. 9th Int. Conf. Control Autom. Robot. Vis. (ICARCV-06), Singapore, Dec. 5-8, 2006, pp. 1347-1352.
[125] H. Tamakoshi and S. Ishii, "Multiagent reinforcement learning applied to a chase problem in a continuous world," Artif. Life Robot., vol. 5, no. 4, pp. 202-206, 2001.
[126] B. Price and C. Boutilier, "Implicit imitation in multiagent reinforcement learning," in Proc. 16th Int. Conf. Mach. Learn. (ICML-99), Bled, Slovenia, Jun. 27-30, 1999, pp. 325-334.
[127] O. Buffet, A. Dutech, and F. Charpillet, "Shaping multi-agent systems with gradient reinforcement learning," Auton. Agents Multi-Agent Syst., vol. 15, no. 2, pp. 197-220, 2007.
[128] M. Ghavamzadeh, S. Mahadevan, and R. Makar, "Hierarchical multiagent reinforcement learning," Auton. Agents Multi-Agent Syst., vol. 13, no. 2, pp. 197-229, 2006.
[129] W. S. Lovejoy, "Computationally feasible bounds for partially observed Markov decision processes," Oper. Res., vol. 39, no. 1, pp. 162-175, 1991.
[130] S. Ishii, H. Fujita, M. Mitsutake, T. Yamazaki, J. Matsuda, and Y. Matsuno, "A reinforcement learning scheme for a partially-observable multi-agent game," Mach. Learn., vol. 59, no. 1-2, pp. 31-54, 2005.
[131] E. A. Hansen, D. S. Bernstein, and S. Zilberstein, "Dynamic programming for partially observable stochastic games," in Proc. 19th Natl. Conf. Artif. Intell. (AAAI-04), San Jose, CA, Jul. 25-29, 2004, pp. 709-715.
[132] J. M. Vidal, "Learning in multiagent systems: An introduction from a game-theoretic perspective," in Adaptive Agents, Lecture Notes in Artificial Intelligence, vol. 2636, E. Alonso, Ed. New York: Springer-Verlag, 2003, pp. 202-215.