SCOPUS Information Search Platform

Volume 7006 LNAI, 2011, Pages 335-346

Value-difference based exploration: Adaptive control between epsilon-greedy and softmax

Author keywords

[No Author keywords available]

Indexed keywords

ACTION SELECTION; ADAPTIVE CONTROL; Q-LEARNING; TEMPORAL DIFFERENCE LEARNING; INTELLIGENT AGENTS; LEARNING ALGORITHMS; MULTI AGENT SYSTEMS; POTASSIUM IODIDE; ADAPTIVE CONTROL SYSTEMS

EID: 80054004135 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-642-24455-1_33 Document Type: Conference Paper

Times cited: 204

References (14)

1
- 0004102479
- MIT Press, Cambridge
- Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

2
- 0004049893
- PhD thesis, University of Cambridge, Cambridge, England
- Watkins, C.: Learning from Delayed Rewards. PhD thesis, University of Cambridge, Cambridge, England (1989)
- (1989) Learning from Delayed Rewards
- Watkins, C.¹

3
- 0003411271
- Technical Report CMU-CS-92-102, Carnegie Mellon University, Pittsburgh, PA, USA
- Thrun, S.B.: Efficient exploration in reinforcement learning. Technical Report CMU-CS-92-102, Carnegie Mellon University, Pittsburgh, PA, USA (1992)
- (1992) Efficient Exploration in Reinforcement Learning
- Thrun, S.B.¹

4
- 0004280606
- MIT Press, Cambridge
- Kaelbling, L.P.: Learning in embedded systems. MIT Press, Cambridge (1993)
- (1993) Learning in Embedded Systems
- Kaelbling, L.P.¹

5
- 0041966002
- Using confidence bounds for exploitation-exploration trade-offs
- Auer, P.: Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research 3, 397-422 (2002)
- (2002) Journal of Machine Learning Research , vol.3 , pp. 397-422
- Auer, P.¹

7
- 78349266245
- Interview with Richard S. Sutton
- Heidrich-Meisner, V.: Interview with Richard S. Sutton. In: Künstliche Intelligenz, vol. 3, pp. 41-43 (2009)
- (2009) Künstliche Intelligenz , vol.3 , pp. 41-43
- Heidrich-Meisner, V.¹

9
- 84966203785
- Some aspects of the sequential design of experiments
- Robbins, H.: Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society 58, 527-535 (1952)
- (1952) Bulletin of the American Mathematical Society , vol.58 , pp. 527-535
- Robbins, H.¹

10
- 0345161977
- PhD thesis, University of Amsterdam, Amsterdam
- Wiering, M.: Explorations in Efficient Reinforcement Learning. PhD thesis, University of Amsterdam, Amsterdam (1999)
- (1999) Explorations in Efficient Reinforcement Learning
- Wiering, M.¹

11
- 0003636089
- Technical Report CUED/F-INFENG/TR 166, Cambridge University
- Rummery, G.A., Niranjan, M.: On-line Q-learning using connectionist systems. Technical Report CUED/F-INFENG/TR 166, Cambridge University (1994)
- (1994) On-line Q-learning Using Connectionist Systems
- Rummery, G.A.¹ Niranjan, M.²

12
- 33748998787
- Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
- George, A.P., Powell, W.B.: Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming. Machine Learning 65(1), 167-198 (2006)
- (2006) Machine Learning , vol.65 , Issue.1 , pp. 167-198
- George, A.P.¹ Powell, W.B.²

13
- 34249833101
- Technical note: Q-learning
- Watkins, C., Dayan, P.: Technical note: Q-learning. Machine Learning 8(3), 279-292 (1992)
- (1992) Machine Learning , vol.8 , Issue.3 , pp. 279-292
- Watkins, C.¹ Dayan, P.²

14
- 33745223257
- Cortical substrates for exploratory decisions in humans
- Daw, N.D., O'Doherty, J.P., Dayan, P., Seymour, B., Dolan, R.J.: Cortical substrates for exploratory decisions in humans. Nature 441, 876-879 (2006)
- (2006) Nature , vol.441 , pp. 876-879
- Daw, N.D.¹ O'Doherty, J.P.² Dayan, P.³ Seymour, B.⁴ Dolan, R.J.⁵

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.