-
1
-
-
0012312949
-
State abstraction for programmable reinforcement learning agents
-
D. Andre and S. J. Russell. State Abstraction for Programmable Reinforcement Learning Agents. In AAAI, 2002.
-
(2002)
AAAI
-
-
Andre, D.1
Russell, S.J.2
-
4
-
-
0004142943
-
-
Technical report, Computer Sciences Laboratory, Australian National University
-
J. Baxter, L. Weaver, and P. Bartlett. Direct gradient-based reinforcement learning: II. Gradient ascent algorithms and experiments. Technical report, Computer Sciences Laboratory, Australian National University, 1999.
-
(1999)
Direct Gradient-Based Reinforcement Learning: II. Gradient Ascent Algorithms and Experiments
-
-
Baxter, J.1
Weaver, L.2
Bartlett, P.3
-
6
-
-
0029372831
-
The Helmholtz machine
-
P. Dayan, G. E. Hinton, R. M. Neal, and R. S. Zemel. The Helmholtz machine. Neural Computation, 7(5):889–904, 1995.
-
(1995)
Neural Computation
, vol.7
, Issue.5
, pp. 889-904
-
-
Dayan, P.1
Hinton, G.E.2
Neal, R.M.3
Zemel, R.S.4
-
7
-
-
84903590417
-
A survey on policy search for robotics
-
M. P. Deisenroth, G. Neumann, and J. Peters. A Survey on Policy Search for Robotics. Foundations and Trends in Robotics, 2(2011):1–142, 2011.
-
(2011)
Foundations and Trends in Robotics
, vol.2
, Issue.2011
, pp. 1-142
-
-
Deisenroth, M.P.1
Neumann, G.2
Peters, J.3
-
8
-
-
85167430664
-
High-quality policies for the Canadian traveler’s problem
-
P. Eyerich, T. Keller, and M. Helmert. High-quality policies for the Canadian traveler’s problem. In AAAI, 2010.
-
(2010)
AAAI
-
-
Eyerich, P.1
Keller, T.2
Helmert, M.3
-
9
-
-
84877577190
-
Complexity of Canadian traveler problem variants
-
D. Fried, S. E. Shimony, A. Benbassat, and C. Wenner. Complexity of Canadian traveler problem variants. Theor. Comput. Sci., 487:1–16, 2013.
-
(2013)
Theor. Comput. Sci.
, vol.487
, pp. 1-16
-
-
Fried, D.1
Shimony, S.E.2
Benbassat, A.3
Wenner, C.4
-
10
-
-
70049098573
-
Church: A language for generative models
-
N. Goodman, V. Mansinghka, D. M. Roy, K. Bonawitz, and J. B. Tenenbaum. Church: a language for generative models. In Uncertainty in Artificial Intelligence, pages 220–229, 2008.
-
(2008)
Uncertainty in Artificial Intelligence
, pp. 220-229
-
-
Goodman, N.1
Mansinghka, V.2
Roy, D.M.3
Bonawitz, K.4
Tenenbaum, J.B.5
-
11
-
-
84926688403
-
Probabilistic programming
-
A. D. Gordon, T. A. Henzinger, A. V. Nori, and S. K. Rajamani. Probabilistic programming. In International Conference on Software Engineering (ICSE, FOSE track), 2014.
-
(2014)
International Conference on Software Engineering (ICSE, FOSE Track)
-
-
Gordon, A.D.1
Henzinger, T.A.2
Nori, A.V.3
Rajamani, S.K.4
-
14
-
-
84878919168
-
Stochastic variational inference
-
M. D. Hoffman, D. M. Blei, C. Wang, and J. Paisley. Stochastic variational inference. Journal of Machine Learning Research, 14:1303–1347, 2013.
-
(2013)
Journal of Machine Learning Research
, vol.14
, pp. 1303-1347
-
-
Hoffman, M.D.1
Blei, D.M.2
Wang, C.3
Paisley, J.4
-
15
-
-
84862277035
-
An expectation maximization algorithm for continuous Markov decision processes with arbitrary reward
-
M. W. Hoffman, N. de Freitas, A. Doucet, and J. R. Peters. An expectation maximization algorithm for continuous Markov Decision Processes with arbitrary reward. In International Conference on Artificial Intelligence and Statistics, pages 232–239, 2009a.
-
(2009)
International Conference on Artificial Intelligence and Statistics
, pp. 232-239
-
-
Hoffman, M.W.1
de Freitas, N.2
Doucet, A.3
Peters, J.R.4
-
16
-
-
78751705157
-
New inference strategies for solving Markov decision processes using reversible jump MCMC
-
AUAI Press
-
M. W. Hoffman, H. Kueck, N. de Freitas, and A. Doucet. New inference strategies for solving Markov decision processes using reversible jump MCMC. In Uncertainty in Artificial Intelligence, pages 223–231. AUAI Press, 2009b.
-
(2009)
Uncertainty in Artificial Intelligence
, pp. 223-231
-
-
Hoffman, M.W.1
Kueck, H.2
de Freitas, N.3
Doucet, A.4
-
17
-
-
29044440299
-
Path integrals and symmetry breaking for optimal control theory
-
H. J. Kappen. Path integrals and symmetry breaking for optimal control theory. Journal of Statistical Mechanics: Theory and Experiment, 2005(11), 2005. P11011.
-
(2005)
Journal of Statistical Mechanics: Theory and Experiment
, vol.2005
, Issue.11
, pp. P11011
-
-
Kappen, H.J.1
-
23
-
-
55049091050
-
BLOG: Probabilistic models with unknown objects
-
B. Milch, B. Marthi, S. Russell, D. Sontag, D. L. Ong, and A. Kolobov. Blog: Probabilistic models with unknown objects. Statistical relational learning, page 373, 2007.
-
(2007)
Statistical Relational Learning
, pp. 373
-
-
Milch, B.1
Marthi, B.2
Russell, S.3
Sontag, D.4
Ong, D.L.5
Kolobov, A.6
-
24
-
-
79959874165
-
-
Microsoft Research Cambridge
-
T. Minka, J. Winn, J. Guiver, S. Webster, Y. Zaykov, B. Yangel, A. Spengler, and J. Bronskill. Infer.NET 2.6, 2014. Microsoft Research Cambridge. http://research.microsoft.com/infernet.
-
(2014)
Infer.NET 2.6
-
-
Minka, T.1
Winn, J.2
Guiver, J.3
Webster, S.4
Zaykov, Y.5
Yangel, B.6
Spengler, A.7
Bronskill, J.8
-
27
-
-
84959387419
-
Planning in discrete and continuous Markov decision processes by probabilistic programming
-
Cham, Springer International Publishing
-
D. Nitti, V. Belle, and L. De Raedt. Planning in Discrete and Continuous Markov Decision Processes by Probabilistic Programming. In ECML PKDD, Lecture Notes in Computer Science, pages 327–342, Cham, 2015. Springer International Publishing.
-
(2015)
ECML PKDD, Lecture Notes in Computer Science
, pp. 327-342
-
-
Nitti, D.1
Belle, V.2
De Raedt, L.3
-
29
-
-
59349087586
-
Using programming language theory to make automatic differentiation sound and efficient
-
B. A. Pearlmutter and J. M. Siskind. Using programming language theory to make automatic differentiation sound and efficient. Advances in Automatic Differentiation, pages 79–90, 2008.
-
(2008)
Advances in Automatic Differentiation
, pp. 79-90
-
-
Pearlmutter, B.A.1
Siskind, J.M.2
-
34
-
-
31144465830
-
Heuristic search value iteration for POMDPs
-
Arlington, Virginia, United States, AUAI Press
-
T. Smith and R. Simmons. Heuristic Search Value Iteration for POMDPs. In Uncertainty in Artificial Intelligence, pages 520–527, Arlington, Virginia, United States, 2004. AUAI Press.
-
(2004)
Uncertainty in Artificial Intelligence
, pp. 520-527
-
-
Smith, T.1
Simmons, R.2
-
36
-
-
84898939480
-
Policy gradient methods for reinforcement learning with function approximation
-
R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy Gradient Methods for Reinforcement Learning with Function Approximation. Neural Information Processing Systems, pages 1057–1063, 1999.
-
(1999)
Neural Information Processing Systems
, pp. 1057-1063
-
-
Sutton, R.S.1
McAllester, D.2
Singh, S.3
Mansour, Y.4
-
37
-
-
67650915125
-
Efficient computation of optimal actions
-
E. Todorov. Efficient computation of optimal actions. Proc. Nat. Acad. Sci. of America, 106(28):11478–11483, 2009.
-
(2009)
Proc. Nat. Acad. Sci. of America
, vol.106
, Issue.28
, pp. 11478-11483
-
-
Todorov, E.1
-
38
-
-
51349153274
-
Probabilistic inference for solving (PO)MDPs
-
M. Toussaint, S. Harmeling, and A. Storkey. Probabilistic inference for solving (PO)MDPs. Neural Computation, 31(December):357–373, 2006.
-
(2006)
Neural Computation
, vol.31
, Issue.December
, pp. 357-373
-
-
Toussaint, M.1
Harmeling, S.2
Storkey, A.3
-
39
-
-
65749118363
-
Graphical models, exponential families, and variational inference
-
M. J. Wainwright and M. I. Jordan. Graphical Models, Exponential Families, and Variational Inference. Foundations and Trends in Machine Learning, 1(1–2):1–305, 2008.
-
(2008)
Foundations and Trends in Machine Learning
, vol.1
, Issue.1-2
, pp. 1-305
-
-
Wainwright, M.J.1
Jordan, M.I.2
-
40
-
-
0000337576
-
Simple statistical gradient-following algorithms for connectionist reinforcement learning
-
R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3-4):229–256, 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 229-256
-
-
Williams, R.J.1
-
42
-
-
84881042664
-
Bayesian policy search with policy priors
-
D. Wingate, N. D. Goodman, D. M. Roy, L. P. Kaelbling, and J. B. Tenenbaum. Bayesian policy search with policy priors. In International Joint Conference on Artificial Intelligence, pages 1565–1570, 2011.
-
(2011)
International Joint Conference on Artificial Intelligence
, pp. 1565-1570
-
-
Wingate, D.1
Goodman, N.D.2
Roy, D.M.3
Kaelbling, L.P.4
Tenenbaum, J.B.5
-
43
-
-
84924316006
-
-
Technical report, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA
-
D. Wingate, C. Diuk, T. O. Donnell, J. Tenenbaum, S. Gershman, L. Labs, and J. B. Tenenbaum. Compositional Policy Priors. Technical report, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, 2013.
-
(2013)
Compositional Policy Priors
-
-
Wingate, D.1
Diuk, C.2
Donnell, T.O.3
Tenenbaum, J.4
Gershman, S.5
Labs, L.6
Tenenbaum, J.B.7