Volume 148, 2006, Pages 305-312

Qualitative reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; COMPUTER SIMULATION; MARKOV PROCESSES; PROBABILITY; PROBLEM SOLVING; STOCHASTIC MODELS

EID: 34250717446     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1143844.1143883     Document Type: Conference Paper
Times cited: 7

References (10)
  • 1
    • 14344251217
    • Apprenticeship learning via inverse reinforcement learning
    • Abbeel, P., & Ng, A. Y. (2004). Apprenticeship learning via inverse reinforcement learning. ICML.
    • (2004) ICML
    • Abbeel, P.1    Ng, A.Y.2
  • 2
    • 31844444663
    • Exploration and apprenticeship learning in reinforcement learning
    • Abbeel, P., & Ng, A. Y. (2005). Exploration and apprenticeship learning in reinforcement learning. ICML.
    • (2005) ICML
    • Abbeel, P.1    Ng, A.Y.2
  • 3
    • 16244402846
    • Qualitative mdps and pomdps: An order-of-magnitude approximation
    • Bonet, B., & Pearl, J. (2002). Qualitative mdps and pomdps: An order-of-magnitude approximation. UAI.
    • (2002) UAI
    • Bonet, B.1    Pearl, J.2
  • 4
    • 34250717373
    • Epshteyn, A., & DeJong, G. (2006). Qualitative reinforcement learning (full paper). http://www.ews.uiuc.edu/~aepshtey/pubs/qual_rl.ps.
  • 5
    • 0034272032
    • Bounded-parameter markov decision processes
    • Givan, R., Leach, S., & Dean, T. (2000). Bounded-parameter markov decision processes. Artificial Intelligence, 122, 71-109.
    • (2000) Artificial Intelligence , vol.122 , pp. 71-109
    • Givan, R.1    Leach, S.2    Dean, T.3
  • 6
    • 1942484890
    • The influence of reward on the speed of reinforcement learning: An analysis of shaping
    • Laud, A., & DeJong, G. (2003). The influence of reward on the speed of reinforcement learning: An analysis of shaping. ICML.
    • (2003) ICML
    • Laud, A.1    DeJong, G.2
  • 7
    • 0141596576
    • Policy invariance under reward transformations: Theory and application to reward shaping
    • Ng, A. Y., Harada, D., & Russell, S. J. (1999). Policy invariance under reward transformations: Theory and application to reward shaping. ICML.
    • (1999) ICML
    • Ng, A.Y.1    Harada, D.2    Russell, S.J.3
  • 8
    • 34250706765
    • Sabbadin, R. (1999). A possibilistic model for qualitative sequential decision problems under uncertainty in partially observable environments. UAI


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.