SCOPUS Information Retrieval Platform

Revue d'Intelligence Artificielle

Volume 19, Issue 4-5, 2005, Pages 603-632

Autonomous development of basic behaviors of an agent; Développement autonome des comportements de base d'un agent

(3) Buffet, Olivier a,b Dutech, Alain b Charpillet, François b

a AUSTRALIAN NATIONAL UNIVERSITY (Australia)

b LORIA (France)

Author keywords

Markov Decision Problems; Multiple Motivations; Reinforcement Learning

Indexed keywords

DECISION THEORY; LEARNING SYSTEMS; MARKOV PROCESSES;

MARKOV DECISION PROBLEMS; MULTIPLE MOTIVATIONS; REINFORCEMENT LEARNING;

AUTONOMOUS AGENTS;

EID: 33645896149 PISSN: 0992499X EISSN: None Source Type: Journal
DOI: 10.3166/ria.19.603-632 Document Type: Conference Paper

Times cited : (2)

References (43)

1
- 0013535965
- Infinite-horizon policy-gradient estimation
- Baxter J., Bartlett P., «Infinite-Horizon Policy-Gradient Estimation », Journal of Artificial Intelligence Research, vol. 15, p. 319-350, 2001a.
- (2001) Journal of Artificial Intelligence Research , vol.15 , pp. 319-350
- Baxter, J.¹ Bartlett, P.²

2
- 0013495368
- Experiments with infinite-horizon, policy-gradient estimation
- Baxter J., Bartlett P., Weaver L., «Experiments with Infinite-Horizon, Policy-Gradient Estimation », Journal of Artificial Intelligence Research, vol. 15, p. 351-381, 2001b.
- (2001) Journal of Artificial Intelligence Research , vol.15 , pp. 351-381
- Baxter, J.¹ Bartlett, P.² Weaver, L.³

3
- 0004211236
- Athena Scientific
- Bertsekas D., Tsitsiklis J., Neurodynamic Programming, Athena Scientific, 1996.
- (1996) Neurodynamic Programming
- Bertsekas, D.¹ Tsitsiklis, J.²

4
- 84880891360
- Symbolic dynamic programming for first-order MDPs
- Boutilier C., Reiter R., Price B., «Symbolic Dynamic Programming for First-order MDPs », Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI'01), p. 690-697, 2001.
- (2001) Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI'01) , pp. 690-697
- Boutilier, C.¹ Reiter, R.² Price, B.³

5
- 33645843610
- PhD thesis, Université Henri Poincaré, Nancy 1, September. Laboratoire Lorrain de recherche en informatique et ses applications (LORIA)
- Buffet O., Une double approche modulaire de l'apprentissage par renforcement pour des agents intelligents adaptatifs, PhD thesis, Université Henri Poincaré, Nancy 1, September, 2003. Laboratoire Lorrain de recherche en informatique et ses applications (LORIA).
- (2003) Une Double Approche Modulaire de l'Apprentissage par Renforcement pour des Agents Intelligents Adaptatifs
- Buffet, O.¹

6
- 0010220857
- Adaptive combination of behaviors in an agent
- Buffet O., Dutech A., Charpillet F., «Adaptive Combination of Behaviors in an Agent », Proceedings of the 15th European Conference on Artificial Intelligence (ECAI'02), 2002.
- (2002) Proceedings of the 15th European Conference on Artificial Intelligence (ECAI'02)
- Buffet, O.¹ Dutech, A.² Charpillet, F.³

7
- 1142292522
- Automatic generation of an agent's basic behaviors
- Buffet O., Dutech A., Charpillet F., «Automatic Generation of an Agent's Basic Behaviors », Proceedings of the 2nd International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS'03), 2003.
- (2003) Proceedings of the 2nd International Joint Conference on Autonomous Agents and Multi-agent Systems (AAMAS'03)
- Buffet, O.¹ Dutech, A.² Charpillet, F.³

8
- 0003989210
- PhD thesis, Brown University, Department of Computer Science, Providence, RI
- Cassandra A. R., Exact and Approximate Algorithms for Partially Observable Markov Decision Processes, PhD thesis, Brown University, Department of Computer Science, Providence, RI, 1998.
- (1998) Exact and Approximate Algorithms for Partially Observable Markov Decision Processes
- Cassandra, A.R.¹

9
- 0001234682
- Feudal reinforcement learning
- Dayan P., Hinton G., «Feudal Reinforcement Learning », Advances in Neural Information Processing Systems 5 (NIPS'93), 1993.
- (1993) Advances in Neural Information Processing Systems 5 (NIPS'93)
- Dayan, P.¹ Hinton, G.²

10
- 0002278788
- Hierarchical reinforcement learning with the MAXQ value function decomposition
- Dietterich T., «Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition », Journal of Artificial Intelligence Research, vol. 13, p. 227-303, 2000.
- (2000) Journal of Artificial Intelligence Research , vol.13 , pp. 227-303
- Dietterich, T.¹

11
- 0004782095
- Learning hierarchical control structure for multiple tasks and changing environments
- Digney B., «Learning Hierarchical Control Structure for Multiple Tasks and Changing Environments », Proceedings of the Fifth Conference on the Simulation of Adaptive Behavior (SAB'98), 1998.
- (1998) Proceedings of the Fifth Conference on the Simulation of Adaptive Behavior (SAB'98)
- Digney, B.¹

12
- 84859236073
- Multi-agent systems by incremental gradient reinforcement learning
- Dutech A., Buffet O., Charpillet F., «Multi-Agent Systems by Incremental Gradient Reinforcement Learning », Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI'01), 2001.
- (2001) Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI'01)
- Dutech, A.¹ Buffet, O.² Charpillet, F.³

13
- 0035312760
- Relational reinforcement learning
- Dzeroski S., Raedt L. D., Driessens K., «Relational reinforcement learning », Machine Learning, vol. 43, p. 7-52, 2001.
- (2001) Machine Learning , vol.43 , pp. 7-52
- Dzeroski, S.¹ Raedt, L.D.² Driessens, K.³

14
- 84972539429
- Combining probability distributions: A critique and an annotated bibliography
- February
- Genest C., Zidek J., «Combining Probability Distributions : A Critique and an Annotated Bibliography », Statistical Science, vol. 1, no 1, p. 114-135, February, 1986.
- (1986) Statistical Science , vol.1 , Issue.1 , pp. 114-135
- Genest, C.¹ Zidek, J.²

15
- 44449170889
- Exploiting first-order regression in inductive policy selection
- Gretton C., Thiébaux S., «Exploiting First-Order Regression in Inductive Policy Selection », Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI'04), 2004.
- (2004) Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI'04)
- Gretton, C.¹ Thiébaux, S.²

16
- 0006419533
- Hierarchical solution of Markov decision processes using macro-actions
- Hauskrecht M., Meuleau N., Kaelbling L., Dean T., Boutilier C., «Hierarchical Solution of Markov Decision Processes Using Macro-Actions », Proceedings of the Fourteenth International Conference on Uncertainty in Artificial Intelligence (UAI'98), p. 220-229, 1998.
- (1998) Proceedings of the Fourteenth International Conference on Uncertainty in Artificial Intelligence (UAI'98) , pp. 220-229
- Hauskrecht, M.¹ Meuleau, N.² Kaelbling, L.³ Dean, T.⁴ Boutilier, C.⁵

17
- 0013465036
- Discovering hierarchy in reinforcement learning with HEXQ
- Hengst B., «Discovering Hierarchy in Reinforcement Learning with HEXQ », Proceedings of the Nineteenth International Conference on Machine Learning (ICML'02), p. 243-250, 2002.
- (2002) Proceedings of the Nineteenth International Conference on Machine Learning (ICML'02) , pp. 243-250
- Hengst, B.¹

18
- 0007914441
- Action selection methods using reinforcement learning
- September
- Humphrys M., «Action Selection Methods using Reinforcement Learning », From Animals to Animats 4 : 4th International Conference on Simulation of Adaptive Behavior (SAB-96), September, 1996.
- (1996) From Animals to Animats 4: 4th International Conference on Simulation of Adaptive Behavior (SAB-96)
- Humphrys, M.¹

19
- 0000439891
- On the convergence of stochastic iterative dynamic programming algorithms
- Jaakkola T., Jordan M., Singh S., «On the Convergence of Stochastic Iterative Dynamic Programming Algorithms », Neural Computation, vol. 6, no 6, p. 1186-1201, 1994.
- (1994) Neural Computation , vol.6 , Issue.6 , pp. 1186-1201
- Jaakkola, T.¹ Jordan, M.² Singh, S.³

20
- 0032329151
- A roadmap of agent research and development
- Jennings N., Sycara K., Wooldridge M., «A Roadmap of Agent Research and Development », Autonomous Agents and Multi-Agent Systems, vol. 1, p. 7-38, 1998.
- (1998) Autonomous Agents and Multi-agent Systems , vol.1 , pp. 7-38
- Jennings, N.¹ Sycara, K.² Wooldridge, M.³

21
- 33645861652
- Tileworld users' manual
- August
- Joslin D., Nunes A., Pollack M. E., TileWorld Users' Manual, Technical Report no TR 93-12, August, 1993.
- (1993) Technical Report No TR 93-12 , vol.TR 93-12
- Joslin, D.¹ Nunes, A.² Pollack, M.E.³

22
- 85143168613
- Hierarchical learning in stochastic domains: Preliminary results
- Kaelbling L., «Hierarchical Learning in Stochastic Domains : Preliminary Results », Proceedings of the Tenth International Conference on Machine Learning (ICML'93), 1993.
- (1993) Proceedings of the Tenth International Conference on Machine Learning (ICML'93)
- Kaelbling, L.¹

23
- 0029679044
- Reinforcement learning: A survey
- Kaelbling L., Littman M., Moore A., «Reinforcement Learning : A Survey », Journal of Artificial Intelligence Research, vol. 4, p. 237-285, 1996.
- (1996) Journal of Artificial Intelligence Research , vol.4 , pp. 237-285
- Kaelbling, L.¹ Littman, M.² Moore, A.³

24
- 0036832951
- A sparse sampling algorithm for near-optimal planning in large Markov decision processes
- Kearns M., Mansour Y., Ng A., «A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes », Machine Learning, vol. 49, p. 193-208, 2002.
- (2002) Machine Learning , vol.49 , pp. 193-208
- Kearns, M.¹ Mansour, Y.² Ng, A.³

25
- 0000123778
- Self-improving reactive agent based on reinforcement learning, planning and teaching
- Lin L.-J., «Self-improving reactive agent based on reinforcement learning, planning and teaching », Machine Learning, vol. 8, p. 293-321, 1992.
- (1992) Machine Learning , vol.8 , pp. 293-321
- Lin, L.-J.¹

26
- 33645870136
- Hierarchical learning of robot skills
- Lin L.-J., «Hierarchical Learning of Robot Skills », Proceedings of the IEEE International Conference on Neural Networks (ICNN'93), 1993.
- (1993) Proceedings of the IEEE International Conference on Neural Networks (ICNN'93)
- Lin, L.-J.¹

27
- 85138579181
- Learning policies for partially observable environments: Scaling up
- Littman M., Cassandra A., Kaelbling L., «Learning policies for partially observable environments : scaling up », Proceedings of the 12th International Conference on Machine Learning (ICML'95), 1995.
- (1995) Proceedings of the 12th International Conference on Machine Learning (ICML'95)
- Littman, M.¹ Cassandra, A.² Kaelbling, L.³

28
- 0004272772
- Cambridge University Press
- MacKay D., Information Theory, Inference, and Learning Algorithms, Cambridge University Press, 2003.
- (2003) Information Theory, Inference, and Learning Algorithms
- MacKay, D.¹

29
- 0026880130
- Automatic programming of behavior-based robots using reinforcement learning
- June
- Mahadevan S., Connell J., «Automatic Programming of Behavior-based Robots using Reinforcement Learning », Artificial Intelligence, vol. 55, no 2-3, p. 311-365, June, 1992.
- (1992) Artificial Intelligence , vol.55 , Issue.2-3 , pp. 311-365
- Mahadevan, S.¹ Connell, J.²

30
- 0003932121
- PhD thesis, University of Rochester
- McCallum R. A., Reinforcement Learning with Selective Perception and Hidden State, PhD thesis, University of Rochester, 1995.
- (1995) Reinforcement Learning with Selective Perception and Hidden State
- McCallum, R.A.¹

31
- 2342662851
- Increasing behavioural repertoire in a mobile robot
- Nehmzow U., Smithers T., McGonigle B., «Increasing Behavioural Repertoire in a Mobile Robot », From Animals to Animats : Proceedings of the Second Conference on the Simulation of Adaptive Behavior (SAB'93), 1993.
- (1993) From Animals to Animats: Proceedings of the Second Conference on the Simulation of Adaptive Behavior (SAB'93)
- Nehmzow, U.¹ Smithers, T.² McGonigle, B.³

32
- 0003989214
- PhD thesis
- Parr R. E., Hierarchical Control and Learning for Markov Decision Processes, PhD thesis, 1998.
- (1998) Hierarchical Control and Learning for Markov Decision Processes
- Parr, R.E.¹

33
- 33750307958
- On-line search for solving Markov Decision Processes via heuristic sampling
- Peret L., Garcia F., «On-line search for solving Markov Decision Processes via heuristic sampling », Proceedings of the 16th European Conference on Artificial Intelligence (ECAI'2004), 2004.
- (2004) Proceedings of the 16th European Conference on Artificial Intelligence (ECAI'2004)
- Peret, L.¹ Garcia, F.²

34
- 0003464618
- Armand Colin
- Piaget J., La Psychologie de l'Intelligence, Armand Colin, 1967.
- (1967) La Psychologie de l'Intelligence
- Piaget, J.¹

35
- 0003998452
- John Wiley and Sons, Inc., New York, USA
- Puterman M. L., Markov Decision Processes-Discrete Stochastic Dynamic Programming, John Wiley and Sons, Inc., New York, USA, 1994.
- (1994) Markov Decision Processes-discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

36
- 0003584577
- Englewood Cliffs, NJ: Prentice Hall
- Russell S., Norvig P., Artificial Intelligence : A Modern Approach, Englewood Cliffs, NJ : Prentice Hall, 1995.
- (1995) Artificial Intelligence: A Modern Approach
- Russell, S.¹ Norvig, P.²

37
- 4544279348
- Multi-agent reinforcement learning: A critical survey
- Stanford
- Shoham Y., Powers R., Grenager T., Multi-agent reinforcement learning : a critical survey. Technical report, Stanford, 2003.
- (2003) Technical Report
- Shoham, Y.¹ Powers, R.² Grenager, T.³

38
- 2142812536
- Learning without state estimation in Partially Observable Markovian Decision Processes
- Singh S., Jaakkola T., Jordan M., «Learning without state estimation in Partially Observable Markovian Decision Processes », Proceedings of the Eleventh International Conference on Machine Learning (ICML'94), 1994.
- (1994) Proceedings of the Eleventh International Conference on Machine Learning (ICML'94)
- Singh, S.¹ Jaakkola, T.² Jordan, M.³

39
- 0008321896
- Reinforcement learning: An introduction
- MIT Press, Cambridge, MA
- Sutton R., Barto A. G., Reinforcement Learning : An Introduction, Bradford Book, MIT Press, Cambridge, MA, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.G.²

40
- 0003702006
- PhD thesis, University of Edinburgh
- Tyrrell T., Computational Mechanisms for Action Selection, PhD thesis, University of Edinburgh, 1993.
- (1993) Computational Mechanisms for Action Selection
- Tyrrell, T.¹

41
- 0032276461
- Incremental robot shaping
- Urzelai J., Floreano D., Dorigo M., Colombetti M., «Incremental Robot Shaping », Connection Science Journal, 1998.
- (1998) Connection Science Journal
- Urzelai, J.¹ Floreano, D.² Dorigo, M.³ Colombetti, M.⁴

42
- 0004049893
- PhD thesis, King's College of Cambridge, UK
- Watkins C., Learning from delayed rewards, PhD thesis, King's College, Cambridge, UK, 1989.
- (1989) Learning from Delayed Rewards
- Watkins, C.¹

43
- 84962016165
- A theory of mentally developing robots
- June
- Weng J., «A Theory of Mentally Developing Robots », Proceedings of the 2nd International Conference on Development and Learning (ICDL'02), June, 2002.
- (2002) Proceedings of the 2nd International Conference on Development and Learning (ICDL'02)
- Weng, J.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.