1. Asada, M., Noda, S., Tawaratsumida, S., & Hosoda, K. (1996). Purposive behavior acquisition for a real robot by vision-based reinforcement learning. Machine Learning, 23(2-3), 279-303.
2. Bartlett, P., & Baxter, J. (1999). Hebbian synaptic modifications in spiking neurons that learn. Technical report, Australian National University.
4. Baxter, J., Bartlett, P., & Weaver, L. (2001). Experiments with infinite-horizon, policy-gradient estimation. Journal of Artificial Intelligence Research, 15, 351-381.
5. Bernstein, D., Givan, R., Immerman, N., & Zilberstein, S. (2002). The complexity of decentralized control of Markov decision processes. Mathematics of Operations Research, 27(4), 819-840.
9. Buffet, O., & Aberdeen, D. (2006). The factored policy gradient planner (IPC-06 version). In A. Gerevini, B. Bonet, & B. Givan (Eds.), Proceedings of the fifth international planning competition (IPC-5) (pp. 69-71). Winner, probabilistic track of the 5th International Planning Competition.
10. Buffet, O., Dutech, A., & Charpillet, F. (2004). Self-growth of basic behaviors in an action selection based agent. In S. Schaal, A. Ijspeert, A. Billard, S. Vijayakumar, J. Hallam, & J.-A. Meyer (Eds.), From animals to animats 8: Proceedings of the eighth international conference on simulation of adaptive behavior (SAB'04) (pp. 223-232).
11. Buffet, O., Dutech, A., & Charpillet, F. (2005). Développement autonome des comportements de base d'un agent [Autonomous development of an agent's basic behaviors]. Revue d'Intelligence Artificielle, 19(4-5), 603-632.
12. Carmel, D., & Markovitch, S. (1996). Opponent modeling in multi-agent systems. In Adaption and learning in multi-agent systems, Lecture notes in artificial intelligence, Vol. 1042 (pp. 40-52). Springer-Verlag.
13. Cassandra, A. R. (1998). Exact and approximate algorithms for partially observable Markov decision processes. Ph.D. thesis, Brown University, Department of Computer Science, Providence, RI.
14. Dorigo, M., & Di Caro, G. (1999). Ant colony optimization: A new meta-heuristic. In P. Angeline, Z. Michalewicz, M. Schoenauer, X. Yao, & A. Zalzala (Eds.), Proceedings of the congress on evolutionary computation (CEC-99) (pp. 1470-1477).
16. Fernández, F., & Parker, L. (2001). Learning in large cooperative multi-robot domains. International Journal of Robotics and Automation, 16(4), 217-226.
17. Gerkey, B., & Matarić, M. (2004). A formal analysis and taxonomy of task allocation in multi-robot systems. International Journal of Robotics Research, 23(9), 939-954.
22. Jaakkola, T., Jordan, M., & Singh, S. (1994). On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6), 1186-1201.
23. Jong, E. D. (2000). Attractors in the development of communication. In J.-A. Meyer, A. Berthoz, D. Floreano, H. L. Roitblat, & S. W. Wilson (Eds.), From animals to animats 6: Proceedings of the sixth international conference on simulation of adaptive behavior (SAB-00).
24. Kaelbling, L., Littman, M., & Moore, A. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237-285.
26. Littman, M., Cassandra, A., & Kaelbling, L. (1995). Learning policies for partially observable environments: Scaling up. In A. Prieditis & S. Russell (Eds.), Proceedings of the twelfth international conference on machine learning (ICML'95) (pp. 362-370).
27. Matarić, M. (1997). Reinforcement learning in the multi-robot domain. Autonomous Robots, 4(1), 73-83.
29. Ng, A., Harada, D., & Russell, S. (1999). Policy invariance under reward transformations: Theory and application to reward shaping. In I. Bratko & S. Dzeroski (Eds.), Proceedings of the sixteenth international conference on machine learning (ICML'99) (pp. 278-287).
30. Peshkin, L., Kim, K., Meuleau, N., & Kaelbling, L. (2000). Learning to cooperate via policy search. In C. Boutilier & M. Goldszmidt (Eds.), Proceedings of the sixteenth conference on uncertainty in artificial intelligence (UAI'00) (pp. 489-496).
31. Peters, J., Vijayakumar, S., & Schaal, S. (2005). Natural actor-critic. In J. Gama, R. Camacho, P. Brazdil, A. Jorge, & L. Torgo (Eds.), Proceedings of the sixteenth European conference on machine learning (ECML'05), Lecture notes in computer science, Vol. 3720.
33. Pynadath, D., & Tambe, M. (2002). The communicative multiagent team decision problem: Analyzing teamwork theories and models. Journal of Artificial Intelligence Research, 16, 389-423.
37. Salustowicz, R., Wiering, M., & Schmidhuber, J. (1998). Learning team strategies: Soccer case studies. Machine Learning, 33, 263-282.
38. Shoham, Y., Powers, R., & Grenager, T. (2003). Multi-agent reinforcement learning: A critical survey. Technical report, Stanford University.
39. Singh, S., Jaakkola, T., & Jordan, M. (1994). Learning without state estimation in partially observable Markovian decision processes. In W. W. Cohen & H. Hirsh (Eds.), Proceedings of the eleventh international conference on machine learning (ICML'94).
42. Stone, P., & Veloso, M. (2000a). Layered learning. In R. L. de Mántaras & E. Plaza (Eds.), Proceedings of the eleventh European conference on machine learning (ECML'00), Lecture notes in computer science, Vol. 1810.
43. Stone, P., & Veloso, M. (2000b). Multiagent systems: A survey from a machine learning perspective. Autonomous Robots, 8(3).
46. Sutton, R., McAllester, D., Singh, S., & Mansour, Y. (1999). Policy gradient methods for reinforcement learning with function approximation. In S. A. Solla, T. K. Leen, & K.-R. Müller (Eds.), Advances in neural information processing systems 12 (NIPS'99) (pp. 1057-1063).
51. Wolpert, D., & Tumer, K. (1999). An introduction to collective intelligence. Technical Report NASA-ARC-IC-99-63, NASA Ames Research Center.
53. Xuan, P., Lesser, V., & Zilberstein, S. (2000). Communication in multi-agent Markov decision processes. In S. Parsons & P. Gmytrasiewicz (Eds.), Proceedings of ICMAS workshop on game theoretic and decision theoretic agents.