[1] Abbeel, P., & Ng, A. (2004). Apprenticeship learning via inverse reinforcement learning. In Proceedings of the ICML.
[3] Dietterich, T. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 9, 227-303.
[4] Feinberg, E., & Schwartz, A. (1995). Constrained Markov decision models with weighted discounted rewards. Mathematics of Operations Research, 20(2), 302-320.
[7] Guestrin, C., Koller, D., Gearhart, C., & Kanodia, N. (2003). Generalizing plans to new environments in relational MDPs. In International Joint Conference on Artificial Intelligence.
[8] Kaelbling, L., Littman, M., & Cassandra, A. (1998). Planning and acting in partially observable stochastic domains. AI Journal.
[12] Natarajan, S., & Tadepalli, P. (2005). Dynamic preferences in multi-criteria reinforcement learning. In Proceedings of the ICML.
[13] Parr, R. (1998). Flexible decomposition algorithms for weakly coupled Markov decision problems. In UAI.
[18] Seri, S., & Tadepalli, P. (2002). Model-based hierarchical average reward reinforcement learning. In Proceedings of the ICML (pp. 562-569).
[19] Sutton, R., Precup, D., & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1-2), 181-211.
[20] Tadepalli, P., & Ok, D. (1998). Model-based average reward reinforcement learning. Artificial Intelligence, 100, 177-224.
[22] Torrey, L., Shavlik, J., Walker, T., & Maclin, R. (2007). Relational macros for transfer in reinforcement learning. In Proceedings of the 17th Conference on Inductive Logic Programming.
[24] White, D. (1982). Multi-objective infinite-horizon discounted Markov decision processes. Journal of Mathematical Analysis and Applications, 89, 639-647.