Volume 1, 2005, Pages 81-88

Efficient exploration with latent structure

Author keywords

[No Author keywords available]

Indexed keywords


EID: 73549099066     PISSN: 2330-7668     EISSN: 2330-765X     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 4

References (22)
  • 1
    • Andrew Barto and Sridhar Mahadevan. Recent advances in hierarchical reinforcement learning. Discrete Event Systems, vol. 13 (Special Issue on Reinforcement Learning), pp. 41-77, 2003.
  • 3
    • Ronen I. Brafman and Moshe Tennenholtz. R-MAX - A general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, vol. 3, pp. 213-231, 2002.
  • 11
    • J. C. Gittins. Multi-Armed Bandit Allocation Indices. Wiley-Interscience Series in Systems and Optimization. Wiley, Chichester, NY, 1989.
  • 14
    • Michael J. Kearns and Satinder P. Singh. Near-optimal reinforcement learning in polynomial time. Machine Learning, vol. 49, issue 2-3, pp. 209-232, 2002.
  • 15
    • Shie Mannor and John N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research, vol. 5, pp. 623-648, 2004.
  • 20
    • Sebastian B. Thrun. The role of exploration in learning control. In David A. White and Donald A. Sofge, editors, Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, pp. 527-559. Van Nostrand Reinhold, New York, NY, 1992.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.