Volume 72, Issue 7-9, 2009, Pages 1508-1524

Gaussian process dynamic programming

Author keywords

Bayesian active learning; Dynamic programming; Gaussian processes; Optimal control; Policy learning; Reinforcement learning

Indexed keywords

A-PRIORI; APPROXIMATE VALUE FUNCTIONS; APPROXIMATION TECHNIQUES; BAYESIAN ACTIVE LEARNING; CONTINUOUS STATE; GAUSSIAN PROCESSES; INITIAL STATE; ON THE FLIES; OPTIMAL CONTROL; OPTIMAL CONTROL PROBLEMS; POLICY LEARNING; PRIOR KNOWLEDGE; PROBABILISTIC MODELS; STATE SPACES; SWING-UP; TRANSITION DYNAMICS; UNKNOWN VALUES; VALUE FUNCTIONS;

EID: 61849173491     PISSN: 09252312     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.neucom.2008.12.019     Document Type: Article
Times cited: 186

References (56)
  • 1
    • 0039816976
    • Using local trajectory optimizers to speed up global optimization in dynamic programming
    • J.E. Hanson, S.J. Moody, R.P. Lippmann Eds, Morgan Kaufmann, Los Altos, CA
    • C.G. Atkeson, Using local trajectory optimizers to speed up global optimization in dynamic programming, in: J.E. Hanson, S.J. Moody, R.P. Lippmann (Eds.), Advances in Neural Information Processing Systems, vol. 6, Morgan Kaufmann, Los Altos, CA, 1994, pp. 503-521.
    • (1994) Advances in Neural Information Processing Systems , vol.6 , pp. 503-521
    • Atkeson, C.G.1
  • 4
    • 0003787146
    • Princeton University Press, Princeton, NJ, USA
    • Bellman R.E. Dynamic Programming (1957), Princeton University Press, Princeton, NJ, USA
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 5
    • 61849143544
    • Dynamic Programming and Optimal Control
    • third ed, Athena Scientific, Belmont, MA, USA
    • D.P. Bertsekas, Dynamic Programming and Optimal Control, Optimization and Computation Series, vol. 1, third ed., Athena Scientific, Belmont, MA, USA, 2005.
    • (2005) Optimization and Computation Series , vol.1
    • Bertsekas, D.P.1
  • 6
    • 61849185818
    • Dynamic Programming and Optimal Control
    • third ed, Athena Scientific, Belmont, MA, USA
    • D.P. Bertsekas, Dynamic Programming and Optimal Control, Optimization and Computation Series, vol. 2, third ed., Athena Scientific, Belmont, MA, USA, 2007.
    • (2007) Optimization and Computation Series , vol.2
    • Bertsekas, D.P.1
  • 9
    • 84972528615
    • Bayesian experimental design: a review
    • Chaloner K., and Verdinelli I. Bayesian experimental design: a review. Statistical Science 10 (1995) 273-304
    • (1995) Statistical Science , vol.10 , pp. 273-304
    • Chaloner, K.1    Verdinelli, I.2
  • 12
    • 0033629916
    • Reinforcement learning in continuous time and space
    • Doya K. Reinforcement learning in continuous time and space. Neural Computation 12 1 (2000) 219-245
    • (2000) Neural Computation , vol.12 , Issue.1 , pp. 219-245
    • Doya, K.1
  • 13
    • 1942421151
    • Bayes meets Bellman: The Gaussian process approach to temporal difference learning
    • Washington, DC, USA, August
    • Y. Engel, S. Mannor, R. Meir, Bayes meets Bellman: the Gaussian process approach to temporal difference learning, in: Proceedings of the 20th International Conference on Machine Learning, Washington, DC, USA, vol. 20, August 2003, pp. 154-161.
    • (2003) Proceedings of the 20th International Conference on Machine Learning , vol.20 , pp. 154-161
    • Engel, Y.1    Mannor, S.2    Meir, R.3
  • 16
    • 84864065133
    • Bayesian policy gradient algorithms
    • Schölkopf B., Platt J.C., and Hoffman T. (Eds), The MIT Press, Cambridge, MA, USA
    • Ghavamzadeh M., and Engel Y. Bayesian policy gradient algorithms. In: Schölkopf B., Platt J.C., and Hoffman T. (Eds). Advances in Neural Information Processing Systems, vol. 19 (2007), The MIT Press, Cambridge, MA, USA 457-464
    • (2007) Advances in Neural Information Processing Systems, vol. 19 , pp. 457-464
    • Ghavamzadeh, M.1    Engel, Y.2
  • 17
    • 84867040604
    • Gaussian process priors with uncertain inputs-application to multiple-step ahead time series forecasting
    • Becker S., Thrun S., and Obermayer K. (Eds), The MIT Press, Cambridge, MA, USA
    • Girard A., Rasmussen C.E., Quiñonero Candela J., and Murray-Smith R. Gaussian process priors with uncertain inputs-application to multiple-step ahead time series forecasting. In: Becker S., Thrun S., and Obermayer K. (Eds). Advances in Neural Information Processing Systems, vol. 15 (2003), The MIT Press, Cambridge, MA, USA 529-536
    • (2003) Advances in Neural Information Processing Systems, vol. 15 , pp. 529-536
    • Girard, A.1    Rasmussen, C.E.2    Quiñonero Candela, J.3    Murray-Smith, R.4
  • 18
    • 84880694195
    • Stable function approximation in dynamic programming
    • Prieditis A., and Russell S. (Eds), Morgan Kaufmann, San Francisco, CA, USA
    • Gordon G.J. Stable function approximation in dynamic programming. In: Prieditis A., and Russell S. (Eds). Proceedings of the 12th International Conference on Machine Learning (1995), Morgan Kaufmann, San Francisco, CA, USA 261-268
    • (1995) Proceedings of the 12th International Conference on Machine Learning , pp. 261-268
    • Gordon, G.J.1
  • 21
    • 21244437999
    • Unscented filtering and nonlinear estimation
    • Julier S.J., and Uhlmann J.K. Unscented filtering and nonlinear estimation. Proceedings of the IEEE 92 3 (2004) 401-422
    • (2004) Proceedings of the IEEE , vol.92 , Issue.3 , pp. 401-422
    • Julier, S.J.1    Uhlmann, J.K.2
  • 22
    • 85024429815
    • A new approach to linear filtering and prediction problems
    • Kalman R.E. A new approach to linear filtering and prediction problems. Transactions of the ASME-Journal of Basic Engineering 82 Series D (1960) 35-45
    • (1960) Transactions of the ASME-Journal of Basic Engineering , vol.82 , Issue.Series D , pp. 35-45
    • Kalman, R.E.1
  • 27
    • 41549146576
    • Near-optimal sensor placements in Gaussian processes: theory, efficient algorithms and empirical studies
    • Krause A., Singh A., and Guestrin C. Near-optimal sensor placements in Gaussian processes: theory, efficient algorithms and empirical studies. Journal of Machine Learning Research 9 (2008) 235-284
    • (2008) Journal of Machine Learning Research , vol.9 , pp. 235-284
    • Krause, A.1    Singh, A.2    Guestrin, C.3
  • 29
    • 0000695404
    • Information-based objective functions for active data selection
    • MacKay D.J.C. Information-based objective functions for active data selection. Neural Computation 4 (1992) 590-604
    • (1992) Neural Computation , vol.4 , pp. 590-604
    • MacKay, D.J.C.1
  • 30
    • 0000597408
    • Comparison of approximate methods for handling hyperparameters
    • MacKay D.J.C. Comparison of approximate methods for handling hyperparameters. Neural Computation 11 5 (1999) 1035-1068
    • (1999) Neural Computation , vol.11 , Issue.5 , pp. 1035-1068
    • MacKay, D.J.C.1
  • 33
    • 0015764255
    • The intrinsic random functions and their applications
    • Matheron G. The intrinsic random functions and their applications. Advances in Applied Probability 5 (1973) 439-468
    • (1973) Advances in Applied Probability , vol.5 , pp. 439-468
    • Matheron, G.1
  • 35
    • 84945582505
    • Nonlinear adaptive control using non-parametric Gaussian process prior models
    • Academic Press, Barcelona, Spain
    • Murray-Smith R., and Sbarbaro D. Nonlinear adaptive control using non-parametric Gaussian process prior models. Proceedings of the 15th IFAC World Congress vol. 15 (July 2002), Academic Press, Barcelona, Spain
    • (2002) Proceedings of the 15th IFAC World Congress , vol.15
    • Murray-Smith, R.1    Sbarbaro, D.2
  • 38
    • 0036832956
    • Kernel-based reinforcement learning
    • Ormoneit D., and Sen S. Kernel-based reinforcement learning. Machine Learning 49 2-3 (2002) 161-178
    • (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 161-178
    • Ormoneit, D.1    Sen, S.2
  • 40
    • 40649106649
    • Natural actor-critic
    • Peters J., and Schaal S. Natural actor-critic. Neurocomputing 71 7-9 (2008) 1180-1190
    • (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1180-1190
    • Peters, J.1    Schaal, S.2
  • 41
    • 44949241322
    • Reinforcement learning of motor skills with policy gradients
    • Peters J., and Schaal S. Reinforcement learning of motor skills with policy gradients. Neural Networks 21 (2008) 682-697
    • (2008) Neural Networks , vol.21 , pp. 682-697
    • Peters, J.1    Schaal, S.2
  • 42
    • 33750297394
    • Bayesian active learning for sensitivity analysis
    • T. Pfingsten, Bayesian active learning for sensitivity analysis, in: Proceedings of the 17th European Conference on Machine Learning, September 2006, pp. 353-364.
  • 45
    • 58449109750
    • Probabilistic inference for fast learning in control
    • S. Girgin, M. Loth, R. Munos, P. Preux, D. Ryabko Eds, Recent Advances in Reinforcement Learning, Springer, Berlin, November
    • C.E. Rasmussen, M.P. Deisenroth, Probabilistic inference for fast learning in control, in: S. Girgin, M. Loth, R. Munos, P. Preux, D. Ryabko (Eds.), Recent Advances in Reinforcement Learning, Lecture Notes in Computer Science, vol. 5323, Springer, Berlin, November 2008, pp. 229-242.
    • (2008) Lecture Notes in Computer Science , vol.5323 , pp. 229-242
    • Rasmussen, C.E.1    Deisenroth, M.P.2
  • 47
    • 84899026055
    • Gaussian processes in reinforcement learning
    • Thrun S., Saul L.K., and Schölkopf B. (Eds), The MIT Press, Cambridge, MA, USA
    • Rasmussen C.E., and Kuss M. Gaussian processes in reinforcement learning. In: Thrun S., Saul L.K., and Schölkopf B. (Eds). Advances in Neural Information Processing Systems, vol. 16 (2004), The MIT Press, Cambridge, MA, USA 751-759
    • (2004) Advances in Neural Information Processing Systems, vol. 16 , pp. 751-759
    • Rasmussen, C.E.1    Kuss, M.2
  • 48
    • 34247621089
    • Gaussian processes for machine learning
    • The MIT Press, Cambridge, MA, USA URL 〈http://www.gaussianprocess.org/gpml/〉
    • Rasmussen C.E., and Williams C.K.I. Gaussian processes for machine learning. Adaptive Computation and Machine Learning (2006), The MIT Press, Cambridge, MA, USA. URL 〈http://www.gaussianprocess.org/gpml/〉
    • (2006) Adaptive Computation and Machine Learning
    • Rasmussen, C.E.1    Williams, C.K.I.2
  • 49
    • 0033233953
    • Concepts and facilities of a neural reinforcement learning control architecture for technical process control
    • Riedmiller M. Concepts and facilities of a neural reinforcement learning control architecture for technical process control. Neural Computing and Applications 8 (2000) 323-338
    • (2000) Neural Computing and Applications , vol.8 , pp. 323-338
    • Riedmiller, M.1
  • 50
    • 33646687423
    • Neural Fitted Q iteration-first experiences with a data efficient neural reinforcement learning method
    • Porto, Portugal
    • M. Riedmiller, Neural Fitted Q iteration-first experiences with a data efficient neural reinforcement learning method, in: Proceedings of the 16th European Conference on Machine Learning, Porto, Portugal, 2005.
    • (2005) Proceedings of the 16th European Conference on Machine Learning
    • Riedmiller, M.1
  • 52
    • 84864038646
    • Sparse Gaussian processes using pseudo-inputs
    • Weiss Y., Schölkopf B., and Platt J.C. (Eds), The MIT Press, Cambridge, MA, USA
    • Snelson E., and Ghahramani Z. Sparse Gaussian processes using pseudo-inputs. In: Weiss Y., Schölkopf B., and Platt J.C. (Eds). Advances in Neural Information Processing Systems, vol. 18 (2006), The MIT Press, Cambridge, MA, USA 1257-1264
    • (2006) Advances in Neural Information Processing Systems, vol. 18 , pp. 1257-1264
    • Snelson, E.1    Ghahramani, Z.2
  • 56
    • 0002295913
    • Gaussian processes for regression
    • Touretzky D.S., Mozer M.C., and Hasselmo M.E. (Eds), The MIT Press, Cambridge, MA, USA
    • Williams C.K.I., and Rasmussen C.E. Gaussian processes for regression. In: Touretzky D.S., Mozer M.C., and Hasselmo M.E. (Eds). Advances in Neural Information Processing Systems, vol. 8 (1996), The MIT Press, Cambridge, MA, USA 598-604
    • (1996) Advances in Neural Information Processing Systems, vol. 8 , pp. 598-604
    • Williams, C.K.I.1    Rasmussen, C.E.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.