SCOPUS Information Retrieval Platform

Volume 5323 LNAI, 2008, Pages 27-40

Algorithms and bounds for rollout sampling approximate policy iteration

Author keywords

[No Author keywords available]

Indexed keywords

CLASSIFIERS; LEARNING SYSTEMS; REINFORCEMENT; REINFORCEMENT LEARNING;

ALLOCATION STRATEGIES; LEARNING PROBLEMS; POLICY ITERATION; SIMPLE METHODS; STATE SPACES; VALUE FUNCTIONS;

EDUCATION;

EID: 58449114139 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-540-89722-4_3 Document Type: Conference Paper

Times cited : (8)

References (10)

1
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine Learning Journal 47(2-3), 235-256 (2002)
- (2002) Machine Learning Journal , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

2
- 38049040954
- Auer, P., Ortner, R., Szepesvari, C.: Improved Rates for the Stochastic Continuum-Armed Bandit Problem. In: Bshouty, N.H., Gentile, C. (eds.) COLT 2007. LNCS, vol. 4539, pp. 454-468. Springer, Heidelberg (2007)

3
- 58449087518
- Bertsekas, D.: Dynamic programming and suboptimal control: From ADP to MFC. Fundamental Issues in Control, European Journal of Control 11(4-5) (2005): From 2005 CDC, Seville, Spain

5
- 33745295134
- Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
- Even-Dar, E., Mannor, S., Mansour, Y.: Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research 7, 1079-1105 (2006)
- (2006) Journal of Machine Learning Research , vol.7 , pp. 1079-1105
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

6
- 22944468731
- Approximate policy iteration with a policy language bias
- Fern, A., Yoon, S., Givan, R.: Approximate policy iteration with a policy language bias. Advances in Neural Information Processing Systems 16(3) (2004)
- (2004) Advances in Neural Information Processing Systems , vol.16 , Issue.3
- Fern, A.¹ Yoon, S.² Givan, R.³

7
- 33744466799
- Approximate policy iteration with a policy language bias: Solving relational Markov decision processes
- Fern, A., Yoon, S., Givan, R.: Approximate policy iteration with a policy language bias: Solving relational Markov decision processes. Journal of Artificial Intelligence Research 25, 75-118 (2006)
- (2006) Journal of Artificial Intelligence Research , vol.25 , pp. 75-118
- Fern, A.¹ Yoon, S.² Givan, R.³

8
- 33750293964
- Kocsis, L., Szepesvári, C.: Bandit based Monte-Carlo planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS, vol. 4212, pp. 282-293. Springer, Heidelberg (2006)

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.