1. Boutilier, C., Dearden, R., & Goldszmidt, M. (1995) Exploiting structure in policy construction. IJCAI, 14: 1104-1113.
2. Dean, T., & Kanazawa, K. (1989) A model for reasoning about persistence and causation. Computational Intelligence, 5(3): 142-150.
3. Dietterich, T. (2000) Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13: 227-303.
4. Digney, B. (1996) Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments. From Animals to Animats, 4: 363-372.
5. Feng, Z., Hansen, E., & Zilberstein, S. (2003) Symbolic generalization for on-line planning. UAI, 19: 209-216.
6. Ghavamzadeh, M., & Mahadevan, S. (2001) Continuous-time hierarchical reinforcement learning. ICML, 18: 186-193.
7. Guestrin, C., Koller, D., & Parr, R. (2001) Max-norm projections for factored MDPs. IJCAI, 17: 673-680.
8. Helmert, M. (2004) A planning heuristic based on causal graph analysis. ICAPS, 16: 161-170.
9. Hengst, B. (2002) Discovering hierarchy in reinforcement learning with HEXQ. ICML, 19: 243-250.
10. Hoey, J., St-Aubin, R., Hu, A., & Boutilier, C. (1999) SPUDD: Stochastic planning using decision diagrams. UAI, 15: 279-288.
11. Kearns, M., & Koller, D. (1999) Efficient reinforcement learning in factored MDPs. IJCAI, 16: 740-747.
12. Mannor, S., Menache, I., Hoze, A., & Klein, U. (2004) Dynamic abstraction in reinforcement learning via clustering. ICML, 21: 560-567.
13. McGovern, A., & Barto, A. (2001) Automatic discovery of subgoals in reinforcement learning using diverse density. ICML, 18: 361-368.
14. Menache, I., Mannor, S., & Shimkin, N. (2002) Q-Cut - Dynamic discovery of sub-goals in reinforcement learning. ECML, 14: 295-306.
15. Parr, R., & Russell, S. (1998) Reinforcement learning with hierarchies of machines. NIPS, 10: 1043-1049.
16. Pickett, M., & Barto, A. (2002) PolicyBlocks: An algorithm for creating useful macro-actions in reinforcement learning. ICML, 19: 506-513.
17. Şimşek, Ö., & Barto, A. (2004) Using relative novelty to identify useful temporal abstractions in reinforcement learning. ICML, 21: 751-758.
18. Sutton, R., Precup, D., & Singh, S. (1999) Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112: 181-211.
19. Thrun, S., & Schwartz, A. (1995) Finding structure in reinforcement learning. NIPS, 8: 385-392.