SCOPUS Information Retrieval Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 1545, 1998, Pages 29-45

Modular reinforcement learning: An application to a real robot task

(3) Kalmár, Zsolt (a); Szepesvári, Csaba (b); Lőrincz, András (c)

(a) Dept. of Informatics, JATE (Hungary)

(b) JATE (Hungary)

(c) Institute of Isotopes (Hungary)

Author keywords

[No Author keywords available]

Indexed keywords

CONTROLLERS; MACHINE LEARNING; MARKOV PROCESSES; ROBOTS;

ACTION SPACES; DESIGN CONTROLLERS; DISCRETE TIME; MODEL BASED APPROACH; MODULAR REINFORCEMENT LEARNING; OPERATING CONDITION; SWITCHING STRATEGIES; SYSTEMATIC DESIGN METHODS;

REINFORCEMENT LEARNING;

EID: 84878320217 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/3-540-49240-2_3 Document Type: Conference Paper

Times cited : (4)

References (25)

1
- 0030149709
- Purposive behavior acquisition for a real robot by vision-based reinforcement learning
- M. Asada, S. Noda, S. Tawaratsumida, and K. Hosoda. Purposive behavior acquisition for a real robot by vision-based reinforcement learning. Machine Learning, 23:279-303, 1996.
- (1996) Machine Learning , vol.23 , pp. 279-303
- Asada, M.¹ Noda, S.² Tawaratsumida, S.³ Hosoda, K.⁴

2
- 0029210635
- Learning to act using real-time dynamic programming
- A. Barto, S. J. Bradtke, and S. Singh. Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1):81-138, 1995.
- (1995) Artificial Intelligence , vol.72 , Issue.1 , pp. 81-138
- Barto, A.¹ Bradtke, S.J.² Singh, S.³

3
- 0003787146
- Princeton University Press, Princeton, New Jersey
- R. Bellman. Dynamic Programming. Princeton University Press, Princeton, New Jersey, 1957.
- (1957) Dynamic Programming
- Bellman, R.¹

4
- 0001937317
- Elephants don't play chess
- Bradford-MIT Press
- R. Brooks. Elephants don't play chess. In Designing Autonomous Agents. Bradford-MIT Press, 1991.
- (1991) Designing Autonomous Agents
- Brooks, R.¹

5
- 0000439891
- On the convergence of stochastic iterative dynamic programming algorithms
- T. Jaakkola, M. Jordan, and S. Singh. On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6):1185-1201, November 1994.
- (1994) Neural Computation , vol.6 , Issue.6 , pp. 1185-1201
- Jaakkola, T.¹ Jordan, M.² Singh, S.³

6
- 0028745065
- Z. Kalmár, C. Szepesvári, and A. Lőrincz. Generalization in an autonomous agent. In Proc. of IEEE WCCI ICNN'94, volume 3, pages 1815-1817, Orlando, Florida, June 1994. IEEE Inc.
- Kalmár, Z.¹ Szepesvári, C.² Lőrincz, A.³

7
- 0029205333
- Generalized dynamic concept model as a route to construct adaptive autonomous agents
- Z. Kalmár, C. Szepesvári, and A. Lőrincz. Generalized dynamic concept model as a route to construct adaptive autonomous agents. Neural Network World, 5:353-360, 1995.
- (1995) Neural Network World , vol.5 , pp. 353-360
- Kalmár, Z.¹ Szepesvári, C.² Lőrincz, A.³

8
- 0032045145
- Z. Kalmár, C. Szepesvári, and A. Lőrincz. Module based reinforcement learning: Experiments with a real robot. Machine Learning, 31:55-85, 1998. Joint special issue on "Learning Robots" with the J. of Autonomous Robots.
- Kalmár, Z.¹ Szepesvári, C.² Lőrincz, A.³

9
- 84947778935
- M. Littman. Algorithms for Sequential Decision Making. PhD thesis, Department of Computer Science, Brown University, February 1996. Also Technical Report CS-96-09.
- Littman, M.¹

10
- 0001961616
- A Generalized Reinforcement Learning Model: Convergence and applications
- M. Littman and C. Szepesvári. A Generalized Reinforcement Learning Model: Convergence and applications. In Int. Conf. on Machine Learning, pages 310-318, 1996.
- (1996) Int. Conf. on Machine Learning , pp. 310-318
- Littman, M.¹ Szepesvári, C.²

11
- 85138579181
- Learning policies for partially observable environments: Scaling up
- In A. Prieditis and S. Russell, editors, San Francisco, CA, Morgan Kaufmann
- M. L. Littman, A. Cassandra, and L. P. Kaelbling. Learning policies for partially observable environments: Scaling up. In A. Prieditis and S. Russell, editors, Proceedings of the Twelfth International Conference on Machine Learning, pages 362-370, San Francisco, CA, 1995. Morgan Kaufmann.
- (1995) Proceedings of the Twelfth International Conference on Machine Learning , pp. 362-370
- Littman, M.L.¹ Cassandra, A.² Kaelbling, L.P.³

12
- 0002765109
- A bottom-up mechanism for behavior selection in an artificial creature
- In J. Meyer and S. Wilson, editors, MIT Press
- P. Maes. A bottom-up mechanism for behavior selection in an artificial creature. In J. Meyer and S. Wilson, editors, Proc. of the First International Conference on Simulation of Adaptive Behavior. MIT Press, 1991.
- (1991) Proc. Of the First International Conference on Simulation of Adaptive Behavior
- Maes, P.¹

13
- 0026880130
- Automatic programming of behavior-based robots using reinforcement learning
- S. Mahadevan and J. Connell. Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence, 55:311-365, 1992.
- (1992) Artificial Intelligence , vol.55 , pp. 311-365
- Mahadevan, S.¹ Connell, J.²

14
- 0030647149
- Reinforcement learning in the multi-robot domain
- M. Mataric. Reinforcement learning in the multi-robot domain. Autonomous Robots, 4, 1997.
- (1997) Autonomous Robots , vol.4
- Mataric, M.¹

15
- 0003644137
- Holden Day, San Francisco, California
- S. Ross. Applied Probability Models with Optimization Applications. Holden Day, San Francisco, California, 1970.
- (1970) Applied Probability Models with Optimization Applications.
- Ross, S.¹

16
- 2142812536
- Learning without state-estimation in partially observable Markovian decision processes
- S. Singh, T. Jaakkola, and M. Jordan. Learning without state-estimation in partially observable Markovian decision processes. In Proc. of the Eleventh Machine Learning Conference, pages 284-292, 1995.
- (1995) Proc. Of the Eleventh Machine Learning Conference , pp. 284-292
- Singh, S.¹ Jaakkola, T.² Jordan, M.³

17
- 0003617454
- PhD thesis, University of Massachusetts, Amherst, MA
- R. Sutton. Temporal Credit Assignment in Reinforcement Learning. PhD thesis, University of Massachusetts, Amherst, MA, 1984.
- (1984) Temporal Credit Assignment in Reinforcement Learning.
- Sutton, R.¹

18
- 0000723997
- Generalization in reinforcement learning: Successful examples using sparse coarse coding
- R. S. Sutton. Generalization in reinforcement learning: Successful examples using sparse coarse coding. Advances in Neural Information Processing Systems, 8, 1996.
- (1996) Advances in Neural Information Processing Systems , vol.8
- Sutton, R.S.¹

19
- 30844447222
- A unified analysis of value-function-based reinforcement-learning algorithms
- submitted
- C. Szepesvári and M. Littman. A unified analysis of value-function-based reinforcement-learning algorithms. Neural Computation, 1997. submitted.
- (1997) Neural Computation
- Szepesvári, C.¹ Littman, M.²

20
- 84977014241
- Behavior of an adaptive self-organizing autonomous agent working with cues and competing concepts
- C. Szepesvári and A. Lőrincz. Behavior of an adaptive self-organizing autonomous agent working with cues and competing concepts. Adaptive Behavior, 2(2):131-160, 1994.
- (1994) Adaptive Behavior , vol.2 , Issue.2 , pp. 131-160
- Szepesvári, C.¹ Lőrincz, A.²

21
- 0002210775
- Van Nostrand Reinhold, Florence KY
- S. Thrun. The role of exploration in learning control. Van Nostrand Reinhold, Florence KY, 1992.
- (1992) The Role of Exploration in Learning Control
- Thrun, S.¹

22
- 0008813539
- Technical Report LIDS-P-2322, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology
- J. Tsitsiklis and B. Van Roy. An analysis of temporal difference learning with function approximation. Technical Report LIDS-P-2322, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, 1995.
- (1995) An Analysis of Temporal Difference Learning with Function Approximation
- Tsitsiklis, J.¹ Van Roy, B.²

23
- 0029752470
- Feature-based methods for large scale dynamic programming
- J. N. Tsitsiklis and B. Van Roy. Feature-based methods for large scale dynamic programming. Machine Learning, 22:59-94, 1996.
- (1996) Machine Learning , vol.22 , pp. 59-94
- Tsitsiklis, J.N.¹ Van Roy, B.²

24
- 0030418601
- Behavior coordination for a mobile robot using modular reinforcement learning
- E. Uchibe, M. Asada, and K. Hosoda. Behavior coordination for a mobile robot using modular reinforcement learning. In Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, pages 1329-1336, 1996.
- (1996) Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems , pp. 1329-1336
- Uchibe, E.¹ Asada, M.² Hosoda, K.³

25
- 34249833101
- Q-learning
- C. Watkins and P. Dayan. Q-learning. Machine Learning, 8(3):279-292, 1992.
- (1992) Machine Learning , vol.8 , Issue.3 , pp. 279-292
- Watkins, C.¹ Dayan, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.