Volume 9, 2010, Pages 241-248

Variational methods for reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATE SOLUTION; BAYESIAN ALTERNATIVES; EXPECTATION PROPAGATION; MARKOV DECISION PROCESSES; OPTIMAL DECISIONS; POINT ESTIMATE; TRANSITION MATRICES; TRANSITION MODEL; VARIATIONAL BAYES; VARIATIONAL METHODS;

EID: 84862273812     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Conference Paper
Times cited : (46)

References (15)
  • 1
    • 84864030941
    • An application of reinforcement learning to aerobatic helicopter flight
    • P. Abbeel, A. Coates, M. Quigley, and A. Ng. An Application of Reinforcement Learning to Aerobatic Helicopter Flight. NIPS, 19:1-8, 2007.
    • (2007) NIPS , vol.19 , pp. 1-8
    • Abbeel, P.1    Coates, A.2    Quigley, M.3    Ng, A.4
  • 2
    • 13844295342
    • The variational Bayesian EM algorithm for incomplete data: With application to scoring graphical model structures
    • Oxford University Press
    • M. J. Beal and Z. Ghahramani. The Variational Bayesian EM Algorithm for Incomplete Data: with Application to Scoring Graphical Model Structures. In Bayesian Statistics, volume 7, pages 453-464. Oxford University Press, 2003.
    • (2003) Bayesian Statistics , vol.7 , pp. 453-464
    • Beal, M.J.1    Ghahramani, Z.2
  • 3
    • 85156187730
    • Improving elevator performance using reinforcement learning
    • R. Crites and A. Barto. Improving Elevator Performance Using Reinforcement Learning. NIPS, 8: 1017-1023, 1995.
    • (1995) NIPS , vol.8 , pp. 1017-1023
    • Crites, R.1    Barto, A.2
  • 4
    • 0346982426
    • Using expectation-maximization for reinforcement learning
    • P. Dayan and G. E. Hinton. Using Expectation-Maximization for Reinforcement Learning. Neural Computation, 9:271-278, 1997.
    • (1997) Neural Computation , vol.9 , pp. 271-278
    • Dayan, P.1    Hinton, G.E.2
  • 7
    • 84907554788
    • Solving deterministic policy (PO)MDPs using Expectation-Maximisation and Antifreeze
    • Workshop on Learning and data Mining for Robotics
    • T. Furmston and D. Barber. Solving deterministic policy (PO)MDPs using Expectation-Maximisation and Antifreeze. European Conference on Machine Learning (ECML), 1:50-65, 2009. Workshop on Learning and data Mining for Robotics.
    • (2009) European Conference on Machine Learning (ECML) , vol.1 , pp. 50-65
    • Furmston, T.1    Barber, D.2
  • 8
    • 85162074018
    • Trans-dimensional MCMC for Bayesian policy learning
    • M. Hoffman, A. Doucet, N. de Freitas, and A. Jasra. Trans-dimensional MCMC for Bayesian Policy Learning. NIPS, 20:665-672, 2008.
    • (2008) NIPS , vol.20 , pp. 665-672
    • Hoffman, M.1    Doucet, A.2    De Freitas, N.3    Jasra, A.4
  • 9
    • 84858754385
    • Policy search for motor primitives in robotics
    • J. Kober and J. Peters. Policy search for motor primitives in robotics. NIPS, 21:849-856, 2009.
    • (2009) NIPS , vol.21 , pp. 849-856
    • Kober, J.1    Peters, J.2
  • 12
    • 58449109750
    • Probabilistic inference for fast learning in control
    • S. Girgin, M. Loth, R. Munos, P. Preux, and D. Ryabko, editors
    • C. Rasmussen and M. Deisenroth. Probabilistic inference for fast learning in control. In S. Girgin, M. Loth, R. Munos, P. Preux, and D. Ryabko, editors, Recent Advances in Reinforcement Learning, pages 229-242, 2008.
    • (2008) Recent Advances in Reinforcement Learning , pp. 229-242
    • Rasmussen, C.1    Deisenroth, M.2
  • 15
    • 65749118363
    • Graphical models, exponential families, and variational inference
    • M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1-305, 2008.
    • (2008) Foundations and Trends in Machine Learning , vol.1 , Issue.1-2 , pp. 1-305
    • Wainwright, M.J.1    Jordan, M.I.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.