1. Boutilier, C., Dearden, R., & Goldszmidt, M. (1995) Exploiting structure in policy construction. IJCAI, 14: 1104-1113.
2. Dean, T., & Kanazawa, K. (1989) A model for reasoning about persistence and causation. Computational Intelligence, 5(3): 142-150.
3. Dietterich, T. (2000) Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13: 227-303.
4. Digney, B. (1996) Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments. From Animals to Animats, 4: 363-372.
5. Feng, Z., Hansen, E., & Zilberstein, S. (2003) Symbolic generalization for on-line planning. UAI, 19: 209-216.
6. Ghavamzadeh, M., & Mahadevan, S. (2001) Continuous-time hierarchical reinforcement learning. ICML, 18: 186-193.
7. Guestrin, C., Koller, D., & Parr, R. (2001) Max-norm projections for factored MDPs. IJCAI, 17: 673-680.
8. Helmert, M. (2004) A planning heuristic based on causal graph analysis. ICAPS, 16: 161-170.
9. Hengst, B. (2002) Discovering hierarchy in reinforcement learning with HEXQ. ICML, 19: 243-250.
10. Hoey, J., St-Aubin, R., Hu, A., & Boutilier, C. (1999) SPUDD: Stochastic planning using decision diagrams. UAI, 15: 279-288.
11. Kearns, M., & Koller, D. (1999) Efficient reinforcement learning in factored MDPs. IJCAI, 16: 740-747.
12. Mannor, S., Menache, I., Hoze, A., & Klein, U. (2004) Dynamic abstraction in reinforcement learning via clustering. ICML, 21: 560-567.
13. McGovern, A., & Barto, A. (2001) Automatic discovery of subgoals in reinforcement learning using diverse density. ICML, 18: 361-368.
14. Menache, I., Mannor, S., & Shimkin, N. (2002) Q-Cut - Dynamic discovery of sub-goals in reinforcement learning. ECML, 14: 295-306.
15. Parr, R., & Russell, S. (1998) Reinforcement learning with hierarchies of machines. NIPS, 10: 1043-1049.
16. Pickett, M., & Barto, A. (2002) PolicyBlocks: An algorithm for creating useful macro-actions in reinforcement learning. ICML, 19: 506-513.
17. Şimşek, Ö., & Barto, A. (2004) Using relative novelty to identify useful temporal abstractions in reinforcement learning. ICML, 21: 751-758.
18. Sutton, R., Precup, D., & Singh, S. (1999) Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112: 181-211.
19. Thrun, S., & Schwartz, A. (1995) Finding structure in reinforcement learning. NIPS, 8: 385-392.