Volume 3, 2006, Pages 177-182

Reinforcement learning with hierarchical decision-making

Author keywords

[No Author keywords available]

Indexed keywords

DECISION MAKING; HIERARCHICAL SYSTEMS; LEARNING ALGORITHMS; MARKOV PROCESSES;

EID: 34547539260     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ISDA.2006.37     Document Type: Conference Paper
Times cited: 4

References (17)
  • 1
    • 0037288370
    • A.G. Barto and S. Mahadevan, "Recent Advances in Hierarchical Reinforcement Learning", Discrete Event Dynamic Systems: Theory and Applications 13, 2003, pp. 341-379.
  • 2
    • 0029210635
    • A.G. Barto, S.J. Bradtke, and S.P. Singh, "Learning to Act Using Real-Time Dynamic Programming", Artificial Intelligence 72, 1995, pp. 81-138.
  • 6
    • 0002278788
    • T.D. Dietterich, "Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition", Journal of Artificial Intelligence Research 13, 2000, pp. 227-303.
  • 12
    • 0033901602
    • S.P. Singh, T. Jaakkola, M.L. Littman, and C. Szepesvári, "Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms", Machine Learning 38, 2000, pp. 287-308.
  • 13
    • 85132026293
    • R.S. Sutton, "Integrated Architectures for Learning, Planning and Reacting Based on Approximating Dynamic Programming", Proceedings of the 7th International Conference on Machine Learning, 1990, pp. 216-224.
  • 15
    • 0033170372
    • R.S. Sutton, D. Precup, and S. Singh, "Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning", Artificial Intelligence 112, 1999, pp. 181-211.
  • 16
    • 34547533392
    • C.J.C.H. Watkins, Learning from Delayed Rewards, Ph.D. thesis, Cambridge University, Cambridge, England.
  • 17


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.