Machine Learning, Volume 73, Issue 3, 2008, Pages 289-312

Transfer in variable-reward hierarchical reinforcement learning

Author keywords

Average reward learning; Hierarchical reinforcement learning; Multi-criteria learning; Transfer learning

Indexed keywords

EDUCATION; LEARNING SYSTEMS; PROBABILITY DENSITY FUNCTION; PROBLEM SOLVING; REINFORCEMENT; REINFORCEMENT LEARNING; SILVER;

EID: 55149090494     PISSN: 08856125     EISSN: 15730565     Source Type: Journal
DOI: 10.1007/s10994-008-5061-y     Document Type: Article
Times cited: 78

References (24)
  • 1. Abbeel, P., & Ng, A. (2004). Apprenticeship learning via inverse reinforcement learning. In Proceedings of the ICML.
  • 3. Dietterich, T. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13, 227-303.
  • 4. Feinberg, E., & Schwartz, A. (1995). Constrained Markov decision models with weighted discounted rewards. Mathematics of Operations Research, 20(2), 302-320.
  • 8. Kaelbling, L., Littman, M., & Cassandra, A. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence.
  • 12. Natarajan, S., & Tadepalli, P. (2005). Dynamic preferences in multi-criteria reinforcement learning. In Proceedings of the ICML.
  • 13. Parr, R. (1998). Flexible decomposition algorithms for weakly coupled Markov decision problems. In UAI.
  • 16. Russell, S., & Zimdars, A. (2003). Q-decomposition for reinforcement learning agents. In Proceedings of ICML-03.
  • 18. Seri, S., & Tadepalli, P. (2002). Model-based hierarchical average reward reinforcement learning. In Proceedings of the ICML (pp. 562-569).
  • 19. Sutton, R., Precup, D., & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1-2), 181-211.
  • 20. Tadepalli, P., & Ok, D. (1998). Model-based average reward reinforcement learning. Artificial Intelligence, 100, 177-224.
  • 24. White, D. (1982). Multi-objective infinite-horizon discounted Markov decision processes. Journal of Mathematical Analysis and Applications, 89, 639-647.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.