SCOPUS Information Retrieval Platform

Volume 4131 LNCS - I, 2006, Pages 850-859

Nearly optimal exploration-exploitation decision thresholds

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; APPROXIMATION THEORY; DECISION MAKING; PROBLEM SOLVING; OPTIMAL DECISION THRESHOLDS; REINFORCEMENT LEARNING; LEARNING SYSTEMS

EID: 33749818313 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/11840817_88 Document Type: Conference Paper

Times cited: 10

References (14)

1
- 0003503387
- John Wiley & Sons Republished by Dover in 2004
- Wald, A.: Sequential Analysis. John Wiley & Sons (1947) Republished by Dover in 2004.
- (1947) Sequential Analysis
- Wald, A.¹

2
- 0003759417
- John Wiley & Sons Republished in 2004
- DeGroot, M.H.: Optimal Statistical Decisions. John Wiley & Sons (1970) Republished in 2004.
- (1970) Optimal Statistical Decisions
- DeGroot, M.H.¹

3
- 0004870746
- A problem in the sequential design of experiments
- Bellman, R.E.: A problem in the sequential design of experiments. Sankhya 16 (1957) 221-229
- (1957) Sankhya , vol.16 , pp. 221-229
- Bellman, R.E.¹

4
- 30044441333
- The sample complexity of exploration in the multiarmed bandit problem
- Mannor, S., Tsitsiklis, J.N.: The sample complexity of exploration in the multiarmed bandit problem. Journal of Machine Learning Research 5 (2004) 623-648
- (2004) Journal of Machine Learning Research , vol.5 , pp. 623-648
- Mannor, S.¹ Tsitsiklis, J.N.²

5
- 0031619316
- Bayesian Q-learning
- Dearden, R., Friedman, N., Russell, S.J.: Bayesian Q-learning. In: AAAI/IAAI. (1998) 761-768
- (1998) AAAI/IAAI , pp. 761-768
- Dearden, R.¹ Friedman, N.² Russell, S.J.³

7
- 0004102479
- MIT Press
- Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press (1998)
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

8
- 21444436092
- On the Lambert W function
- Corless, R.M., Gonnet, G.H., Hare, D.E.G., Jeffrey, D.J., Knuth, D.E.: On the Lambert W function. Advances in Computational Mathematics 5 (1996) 329-359
- (1996) Advances in Computational Mathematics , vol.5 , pp. 329-359
- Corless, R.M.¹ Gonnet, G.H.² Hare, D.E.G.³ Jeffrey, D.J.⁴ Knuth, D.E.⁵

9
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton, R.S., Precup, D., Singh, S.P.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112(1-2) (1999) 181-211
- (1999) Artificial Intelligence , vol.112 , Issue.1-2 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.P.³

14
- 33749856840
- IDIAP-RR 05-29, IDIAP
- Dimitrakakis, C., Bengio, S.: Gradient estimates of return. IDIAP-RR 05-29, IDIAP (2005)
- (2005) Gradient Estimates of Return
- Dimitrakakis, C.¹ Bengio, S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.