Volume 6792 LNCS, Issue PART 2, 2011, Pages 221-228

Improving Gaussian process value function approximation in policy gradient algorithms

Author keywords

control problems; Gaussian processes; policy gradient methods; reinforcement learning; value function estimation

Indexed keywords

basis vector; continuous domain; continuous state-action spaces; control problems; distance-based; dynamic system control; Gaussian process regression; Gaussian processes; gradient based; gradient estimates; gradient variance; Kullback-Leibler distance; policy gradient; policy gradient methods; policy search; time-dependent factors; value function approximation; value functions; value-based; Williams

EID: 79959344344     PISSN: 0302-9743     EISSN: 1611-3349     Source Type: Book Series
DOI: 10.1007/978-3-642-21738-8_29     Document Type: Conference Paper
Times cited: 3

References (15)
  • 1
    • Baird, L., Moore, A.: Gradient descent for general reinforcement learning. In: Kearns, M.S., Solla, S.A., Cohn, D.A. (eds.) NIPS 1998. Advances in Neural Information Processing Systems, vol. 11, pp. 968-974. MIT Press, Cambridge (1998)
  • 3
    • Csató, L., Opper, M.: Sparse representation for Gaussian process models. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) NIPS, vol. 13, pp. 444-450. MIT Press, Cambridge (2001)
  • 4
  • 6
    • Fan, Y., Xu, J., Shelton, C.R.: Importance sampling for continuous time Bayesian networks. Journal of Machine Learning Research 11, 2115-2140 (2010)
  • 7
    • Ghavamzadeh, M., Engel, Y.: Bayesian policy gradient algorithms. In: Schölkopf, B., Platt, J., Hoffman, T. (eds.) NIPS 2007, Advances in Neural Information Processing Systems, vol. 19, pp. 457-464. MIT Press, Cambridge (2007)
  • 9
    • Peters, J., Schaal, S.: Reinforcement learning of motor skills with policy gradients. Neural Networks 21(4), 682-697 (2008)
  • 11
    • Rasmussen, C.E., Kuss, M.: Gaussian processes in reinforcement learning. In: Saul, L.K., Thrun, S., Schölkopf, B. (eds.) NIPS 2003, Advances in Neural Information Processing Systems, pp. 751-759. MIT Press, Cambridge (2004)
  • 13
    • Sugiyama, M., Hachiya, H., Towell, C., Vijayakumar, S.: Geodesic Gaussian kernels for value function approximation. Auton. Robots 25, 287-304 (2008)
  • 14
    • Sutton, R.S., McAllester, D.A., Singh, S.P., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation. In: Solla, S.A., Leen, T.K., Müller, K.R. (eds.) NIPS 1999, Advances in Neural Information Processing Systems, pp. 1057-1063. MIT Press, Cambridge (1999)
  • 15
    • Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 229-256 (1992)


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.