Volume , Issue , 2008, Pages 296-303

Active reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

LEARNING ALGORITHMS; MACHINE LEARNING; MARKOV PROCESSES; LEARNING SYSTEMS; REINFORCEMENT; ROBOT LEARNING;

EID: 56449114181     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1390156.1390194     Document Type: Conference Paper
Times cited : (33)

References (14)
  • 1
    • 31844444663
    • Exploration and apprenticeship learning in reinforcement learning
    • Abbeel, P., & Ng, A. Y. (2005). Exploration and apprenticeship learning in reinforcement learning. ICML.
    • (2005) ICML
    • Abbeel, P.1    Ng, A.Y.2
  • 2
    • 33749242451
    • Using inaccurate models in reinforcement learning
    • Abbeel, P., Quigley, M., & Ng, A. (2006). Using inaccurate models in reinforcement learning. ICML.
    • (2006) ICML
    • Abbeel, P.1    Quigley, M.2    Ng, A.3
  • 3
    • 0041965975
    • R-MAX- A general polynomial time algorithm for near-optimal reinforcement learning
    • Brafman, R. I., & Tennenholtz, M. (2002). R-MAX- A general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
    • (2002) Journal of Machine Learning Research , vol.3 , pp. 213-231
    • Brafman, R.I.1    Tennenholtz, M.2
  • 4
    • 0037289322
    • From perturbation analysis to Markov decision processes and reinforcement learning
    • Cao, X. (2003). From perturbation analysis to Markov decision processes and reinforcement learning. Discrete Event Dynamic Systems: Theory and Applications, 13, 9-39.
    • (2003) Discrete Event Dynamic Systems: Theory and Applications , vol.13 , pp. 9-39
    • Cao, X.1
  • 7
    • 0034272032
    • Bounded-parameter Markov decision processes
    • Givan, R., Leach, S., & Dean, T. (2000). Bounded-parameter Markov decision processes. Artificial Intelligence, 122, 71-109.
    • (2000) Artificial Intelligence , vol.122 , pp. 71-109
    • Givan, R.1    Leach, S.2    Dean, T.3
  • 8
    • 0036832954
    • Near optimal reinforcement learning in polynomial time
    • Kearns, M., & Singh, S. (2002). Near optimal reinforcement learning in polynomial time. Machine Learning, 49, 209-232.
    • (2002) Machine Learning , vol.49 , pp. 209-232
    • Kearns, M.1    Singh, S.2
  • 11
    • 56449109724
    • Robustness in Markov decision problems with uncertain transition matrices
    • Nilim, A., & Ghaoui, L. E. (2003). Robustness in Markov decision problems with uncertain transition matrices. NIPS.
    • (2003) NIPS
    • Nilim, A.1    Ghaoui, L.E.2
  • 13
    • 31844432138
    • A theoretical analysis of model-based interval estimation
    • Strehl, A. L., & Littman, M. L. (2005). A theoretical analysis of model-based interval estimation. ICML.
    • (2005) ICML
    • Strehl, A.L.1    Littman, M.L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.