2005, Pages 401-408

A causal approach to hierarchical decomposition of factored MDPs

Author keywords

[No Author keywords available]

Indexed keywords

DECISION THEORY; HIERARCHICAL SYSTEMS; LEARNING SYSTEMS

EID: 31844455449    PISSN: None    EISSN: None    Source Type: Conference Proceeding
DOI: 10.1145/1102351.1102402    Document Type: Conference Paper
Times cited: 33

References (19)
  • 1. Boutilier, C., Dearden, R., & Goldszmidt, M. (1995). Exploiting structure in policy construction. IJCAI, 14: 1104-1113.
  • 2. Dean, T., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5(3): 142-150.
  • 3. Dietterich, T. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13: 227-303.
  • 4. Digney, B. (1996). Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments. From Animals to Animats, 4: 363-372.
  • 5. Feng, Z., Hansen, E., & Zilberstein, S. (2003). Symbolic generalization for on-line planning. UAI, 19: 209-216.
  • 6. Ghavamzadeh, M., & Mahadevan, S. (2001). Continuous-time hierarchical reinforcement learning. ICML, 18: 186-193.
  • 7. Guestrin, C., Koller, D., & Parr, R. (2001). Max-norm projections for factored MDPs. IJCAI, 17: 673-680.
  • 8. Helmert, M. (2004). A planning heuristic based on causal graph analysis. ICAPS, 16: 161-170.
  • 9. Hengst, B. (2002). Discovering hierarchy in reinforcement learning with HEXQ. ICML, 19: 243-250.
  • 10. Hoey, J., St-Aubin, R., Hu, A., & Boutilier, C. (1999). SPUDD: Stochastic planning using decision diagrams. UAI, 15: 279-288.
  • 11. Kearns, M., & Koller, D. (1999). Efficient reinforcement learning in factored MDPs. IJCAI, 16: 740-747.
  • 12. Mannor, S., Menache, I., Hoze, A., & Klein, U. (2004). Dynamic abstraction in reinforcement learning via clustering. ICML, 21: 560-567.
  • 13. McGovern, A., & Barto, A. (2001). Automatic discovery of subgoals in reinforcement learning using diverse density. ICML, 18: 361-368.
  • 14. Menache, I., Mannor, S., & Shimkin, N. (2002). Q-Cut - Dynamic discovery of sub-goals in reinforcement learning. ECML, 14: 295-306.
  • 15. Parr, R., & Russell, S. (1998). Reinforcement learning with hierarchies of machines. NIPS, 10: 1043-1049.
  • 16. Pickett, M., & Barto, A. (2002). PolicyBlocks: An algorithm for creating useful macro-actions in reinforcement learning. ICML, 19: 506-513.
  • 17. Şimşek, Ö., & Barto, A. (2004). Using relative novelty to identify useful temporal abstractions in reinforcement learning. ICML, 21: 751-758.
  • 18. Sutton, R., Precup, D., & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112: 181-211.
  • 19. Thrun, S., & Schwartz, A. (1995). Finding structure in reinforcement learning. NIPS, 8: 385-392.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.