1. Sutton, R.S., Precup, D., Singh, S.P.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112(1-2), 181-211 (1999)
2. Stolle, M., Precup, D.: Learning options in reinforcement learning. In: Koenig, S., Holte, R.C. (eds.) SARA 2002. LNCS (LNAI), vol. 2371, pp. 212-223. Springer, Heidelberg (2002)
3. Boutilier, C.: Planning, learning and coordination in multiagent decision processes. In: Theoretical Aspects of Rationality and Knowledge, pp. 195-201 (1996)
4. Boutilier, C., Dearden, R., Goldszmidt, M.: Exploiting structure in policy construction. In: Mellish, C. (ed.) Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 1104-1111. Morgan Kaufmann, San Francisco (1995)
6. Degris, T., Sigaud, O., Wuillemin, P.H.: Learning the structure of factored Markov decision processes in reinforcement learning problems. In: Proceedings of the 23rd International Conference on Machine Learning, New York, NY, USA, pp. 257-264 (2006)
7. Strehl, A.L., Diuk, C., Littman, M.L.: Efficient structure learning in factored-state MDPs. In: AAAI, pp. 645-650. AAAI Press, Menlo Park (2007)
8. Abbeel, P., Koller, D., Ng, A.Y.: Learning factor graphs in polynomial time and sample complexity. Journal of Machine Learning Research 7, 1743-1788 (2006)
10. Guestrin, C., Hauskrecht, M., Kveton, B.: Solving factored MDPs with continuous and discrete variables. In: Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pp. 235-242 (2004)
11. Williams, R.: Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning. Machine Learning 8, 229-256 (1992)
13. Phansalkar, V., Thathachar, M.: Local and global optimization algorithms for generalized learning automata. Neural Computation 7(5), 950-973 (1995)