2011, Pages 221-229

Efficient inference in Markov control problems

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE

EID: 80053139999     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 6

References (17)
  • 2. P. Dayan and G. E. Hinton. Using Expectation-Maximization for Reinforcement Learning. Neural Computation, 9(2):271-278, 1997.
  • 3. T. Furmston and D. Barber. Variational Methods for Reinforcement Learning. AISTATS, 9(13):241-248, 2010.
  • 4. M. Hoffman, A. Doucet, N. de Freitas, and A. Jasra. Trans-dimensional MCMC for Bayesian Policy Learning. NIPS, 20:665-672, 2008.
  • 5. M. Hoffman, N. de Freitas, A. Doucet, and J. Peters. An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards. AISTATS, 5(12):232-239, 2009.
  • 6. L. P. Kaelbling, M. L. Littman, and A. R. Cassandra. Planning and Acting in Partially Observable Stochastic Domains. Artificial Intelligence, 101(1-2):99-134, 1998.
  • 8. J. Kober and J. Peters. Policy Search for Motor Primitives in Robotics. NIPS, 21:849-856, 2009.
  • 10. R. Salakhutdinov, S. Roweis, and Z. Ghahramani. Optimization with EM and Expectation-Conjugate-Gradient. ICML, 20:672-679, 2003.
  • 12. R. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy Gradient Methods for Reinforcement Learning with Function Approximation. NIPS, 13, 2000.
  • 17. M. J. Wainwright and M. I. Jordan. Graphical Models, Exponential Families, and Variational Inference. Foundations and Trends in Machine Learning, 1(1-2):1-305, 2008.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.