SCOPUS Information Search Platform

Volume 34, Issue 5, 2004, Pages 2140-2143

A new Q-learning algorithm based on the Metropolis criterion

Author keywords

[No Author keywords available]

Indexed keywords

COMBINATORIAL MATHEMATICS; LEARNING ALGORITHMS; OPTIMIZATION; SIMULATED ANNEALING; THEOREM PROVING;

BOLTZMANN EXPLORATION; METROPOLIS CRITERION; Q-LEARNING ALGORITHM; Q-LEARNING EXPLOITATION; REINFORCEMENT LEARNING;

LEARNING SYSTEMS;

ALGORITHM; ARTIFICIAL INTELLIGENCE; COMPUTER SIMULATION; EVALUATION; INFORMATION RETRIEVAL; LETTER; METHODOLOGY; THEORETICAL MODEL;

ALGORITHMS; ARTIFICIAL INTELLIGENCE; COMPUTER SIMULATION; INFORMATION STORAGE AND RETRIEVAL; MODELS, THEORETICAL;

EID: 4844223639 PISSN: 10834419 EISSN: None Source Type: Journal
DOI: 10.1109/TSMCB.2004.832154 Document Type: Article

Times cited: 150

References (14)

1
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- R. S. Sutton, D. Precup, and S. Singh, "Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning," Artific. Intell., vol. 112, pp. 181-211, 1999.
- (1999) Artific. Intell. , vol.112 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.³

3
- 33847202724
- Learning to predict by the methods of temporal differences
- R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, no. 1, pp. 9-44, 1988.
- (1988) Mach. Learn. , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.S.¹

4
- 0004049893
- Ph.D. dissertation, Psychol. Dept., Cambridge Univ., Cambridge, U.K.
- C. J. C. H. Watkins, "Learning from delayed rewards," Ph.D. dissertation, Psychol. Dept., Cambridge Univ., Cambridge, U.K., 1989.
- (1989) Learning From Delayed Rewards
- Watkins, C.J.C.H.¹

5
- 34249833101
- Q-learning
- C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, no. 3, pp. 279-292, 1992.
- (1992) Mach. Learn. , vol.8 , Issue.3 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

7
- 0033148990
- Cooperative behavior acquisition for mobile robots in dynamically changing real worlds via vision-based reinforcement learning and development
- M. Asada, E. Uchibe, and K. Hosoda, "Cooperative behavior acquisition for mobile robots in dynamically changing real worlds via vision-based reinforcement learning and development," Artific. Intell., vol. 110, pp. 275-292, 1999.
- (1999) Artific. Intell. , vol.110 , pp. 275-292
- Asada, M.¹ Uchibe, E.² Hosoda, K.³

8
- 0004102479
- Cambridge, MA: MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
- (1998) Reinforcement Learning: an Introduction
- Sutton, R.S.¹ Barto, A.G.²

9
- 0029679044
- Reinforcement learning: A survey
- L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey," J. AI Res., vol. 4, pp. 237-285, 1996.
- (1996) J. AI Res. , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

10
- 5744249209
- Equation of state calculations by fast computing machines
- N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, "Equation of state calculations by fast computing machines," J. Chem. Phys., vol. 21, pp. 1087-1092, 1953.
- (1953) J. Chem. Phys. , vol.21 , pp. 1087-1092
- Metropolis, N.¹ Rosenbluth, A.W.² Rosenbluth, M.N.³ Teller, A.H.⁴ Teller, E.⁵

11
- 0033687233
- Nature's way of optimizing
- S. Boettcher and A. Percus, "Nature's way of optimizing," Artific. Intell., vol. 119, pp. 275-286, 2000.
- (2000) Artific. Intell. , vol.119 , pp. 275-286
- Boettcher, S.¹ Percus, A.²

12
- 26444479778
- Optimization by simulated annealing
- S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671-680, 1983.
- (1983) Science , vol.220 , pp. 671-680
- Kirkpatrick, S.¹ Gelatt, C.D.² Vecchi, M.P.³

13
- 0004255908
- New York: McGraw-Hill
- T. Mitchell, Machine Learning. New York: McGraw-Hill, 1997.
- (1997) Machine Learning
- Mitchell, T.¹

14
- 0031208987
- Explanation-based learning and reinforcement learning: A unified view
- T. G. Dietterich and N. S. Flann, "Explanation-based learning and reinforcement learning: A unified view," Mach. Learn., vol. 28, pp. 169-210, 1997.
- (1997) Mach. Learn. , vol.28 , pp. 169-210
- Dietterich, T.G.¹ Flann, N.S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.