2. P. Dayan and G. E. Hinton. Using Expectation-Maximization for Reinforcement Learning. Neural Computation, 9(2):271-278, 1997.
3. T. Furmston and D. Barber. Variational Methods for Reinforcement Learning. AISTATS, 9(13):241-248, 2010.
4. M. Hoffman, A. Doucet, N. de Freitas, and A. Jasra. Trans-dimensional MCMC for Bayesian Policy Learning. NIPS, 20:665-672, 2008.
5. M. Hoffman, N. de Freitas, A. Doucet, and J. Peters. An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards. AISTATS, 5(12):232-239, 2009.
6. L. P. Kaelbling, M. L. Littman, and A. R. Cassandra. Planning and Acting in Partially Observable Stochastic Domains. Artificial Intelligence, 101(1-2):99-134, 1998.
8. J. Kober and J. Peters. Policy Search for Motor Primitives in Robotics. NIPS, 21:849-856, 2009.
10. R. Salakhutdinov, S. Roweis, and Z. Ghahramani. Optimization with EM and Expectation-Conjugate-Gradient. ICML, 20:672-679, 2003.
12. R. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy Gradient Methods for Reinforcement Learning with Function Approximation. NIPS, 13, 2000.
15. M. Toussaint, S. Harmeling, and A. Storkey. Probabilistic Inference for Solving (PO)MDPs. Research Report EDI-INF-RR-0934, University of Edinburgh, School of Informatics, 2006.
17. M. J. Wainwright and M. I. Jordan. Graphical Models, Exponential Families, and Variational Inference. Foundations and Trends in Machine Learning, 1(1-2):1-305, 2008.