2009, Pages 1211-1218

Uncertainty handling CMA-ES for reinforcement learning

Author keywords

Covariance matrix adaptation evolution strategy; Direct policy search; Reinforcement learning; Uncertainty handling

Indexed keywords

COVARIANCE MATRIX ADAPTATION EVOLUTION STRATEGIES; DIRECT POLICY SEARCH; EVOLUTIONARY LEARNING; LEARNING SPEED; NOISY OBSERVATIONS; PARTIALLY OBSERVABLE MARKOV DECISION PROCESS; POLICY GRADIENT METHODS; RANDOM SEARCHES; UNCERTAINTY HANDLING

EID: 72749104931     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1569901.1570064     Document Type: Conference Paper
Times cited: 13

References (28)
  • 1
    • 72749092976
    • S. Amari and H. Nagaoka. Methods of Information Geometry. Number 191 in Translations of Mathematical Monographs. American Mathematical Society and Oxford University Press, 2000.
  • 3
    • 56449128627
    • Evolution strategies
    • H.-G. Beyer. Evolution strategies. Scholarpedia, 2(8):1965, 2007.
    • (2007) Scholarpedia , vol.2 , Issue.8 , pp. 1965
    • Beyer, H.-G.1
  • 6
    • 56449125243
    • Uncertainty handling in model selection for support vector machines
    • G. Rudolph, editor, Parallel Problem Solving from Nature (PPSN X), volume 5199 of LNCS, Springer-Verlag
    • T. Glasmachers and C. Igel. Uncertainty handling in model selection for support vector machines. In G. Rudolph, editor, Parallel Problem Solving from Nature (PPSN X), volume 5199 of LNCS, pages 185-194. Springer-Verlag, 2008.
    • (2008) LNCS , vol.5199 , pp. 185-194
    • Glasmachers, T.1    Igel, C.2
  • 8
    • 0042879997
    • Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES)
    • N. Hansen, S. D. Müller, and P. Koumoutsakos. Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES). Evolutionary Computation, 11(1):1-18, 2003.
    • (2003) Evolutionary Computation , vol.11 , Issue.1 , pp. 1-18
    • Hansen, N.1    Müller, S.D.2    Koumoutsakos, P.3
  • 9
    • 56449130836
    • Evolutionary optimization of feedback controllers for thermoacoustic instabilities
    • J. F. Morrison, D. M. Birch, and P. Lavoie, editors, Springer-Verlag
    • N. Hansen, A. S. P. Niederberger, L. Guzzella, and P. Koumoutsakos. Evolutionary optimization of feedback controllers for thermoacoustic instabilities. In J. F. Morrison, D. M. Birch, and P. Lavoie, editors, IUTAM Symposium on Flow Control and MEMS. Springer-Verlag, 2008.
    • (2008) IUTAM Symposium on Flow Control and MEMS
    • Hansen, N.1    Niederberger, A.S.P.2    Guzzella, L.3    Koumoutsakos, P.4
  • 10
    • 59749085404
    • A method for handling uncertainty in evolutionary optimization with an application to feedback control of combustion
    • N. Hansen, A. S. P. Niederberger, L. Guzzella, and P. Koumoutsakos. A method for handling uncertainty in evolutionary optimization with an application to feedback control of combustion. IEEE Transactions on Evolutionary Computation, 13(1):180-197, 2009.
    • (2009) IEEE Transactions on Evolutionary Computation , vol.13 , Issue.1 , pp. 180-197
    • Hansen, N.1    Niederberger, A.S.P.2    Guzzella, L.3    Koumoutsakos, P.4
  • 11
    • 0035377566
    • Completely derandomized self-adaptation in evolution strategies
    • N. Hansen and A. Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159-195, 2001.
    • (2001) Evolutionary Computation , vol.9 , Issue.2 , pp. 159-195
    • Hansen, N.1    Ostermeier, A.2
  • 12
    • 56449106904
    • Evolution strategies for direct policy search
    • G. Rudolph, editor, Parallel Problem Solving from Nature (PPSN X), number 5199 in LNCS, Springer-Verlag
    • V. Heidrich-Meisner and C. Igel. Evolution strategies for direct policy search. In G. Rudolph, editor, Parallel Problem Solving from Nature (PPSN X), number 5199 in LNCS, pages 428-437. Springer-Verlag, 2008.
    • (2008) LNCS , vol.5199 , pp. 428-437
    • Heidrich-Meisner, V.1    Igel, C.2
  • 14
    • 58449122813
    • Variable metric reinforcement learning methods applied to the noisy mountain car problem
    • S. Girgin et al., editors, European Workshop on Reinforcement Learning (EWRL 2008), number 5323 in LNAI, Springer-Verlag
    • V. Heidrich-Meisner and C. Igel. Variable metric reinforcement learning methods applied to the noisy mountain car problem. In S. Girgin et al., editors, European Workshop on Reinforcement Learning (EWRL 2008), number 5323 in LNAI, pages 136-150. Springer-Verlag, 2008.
    • (2008) LNAI , vol.5323 , pp. 136-150
    • Heidrich-Meisner, V.1    Igel, C.2
  • 15
    • 84901411269
    • Neuroevolution for reinforcement learning using evolution strategies
    • IEEE Press
    • C. Igel. Neuroevolution for reinforcement learning using evolution strategies. In Congress on Evolutionary Computation (CEC 2003), volume 4, pages 2588-2595. IEEE Press, 2003.
    • (2003) Congress on Evolutionary Computation (CEC 2003) , vol.4 , pp. 2588-2595
    • Igel, C.1
  • 17
    • 0032073263
    • Planning and acting in partially observable stochastic domains
    • L. Kaelbling, M. Littman, and A. Cassandra. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1-2):99-134, 1998.
    • (1998) Artificial Intelligence , vol.101 , Issue.1-2 , pp. 99-134
    • Kaelbling, L.1    Littman, M.2    Cassandra, A.3
  • 18
    • 84898930479
    • A natural policy gradient
    • T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, MIT Press
    • S. Kakade. A natural policy gradient. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems (NIPS14). MIT Press, 2002.
    • (2002) Advances in Neural Information Processing Systems (NIPS14)
    • Kakade, S.1
  • 20
    • 40649106649
    • Natural actor-critic
    • J. Peters and S. Schaal. Natural actor-critic. Neurocomputing, 71(7-9):1180-1190, 2008.
    • (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1180-1190
    • Peters, J.1    Schaal, S.2
  • 23
    • 33745783272
    • Integrating techniques from statistical ranking into evolutionary algorithms
    • Applications of Evolutionary Computing, volume 3907 of LNCS, Springer
    • C. Schmidt, J. Branke, and S. Chick. Integrating techniques from statistical ranking into evolutionary algorithms. In Applications of Evolutionary Computing, volume 3907 of LNCS, pages 752-763. Springer, 2006.
    • (2006) LNCS , vol.3907 , pp. 752-763
    • Schmidt, C.1    Branke, J.2    Chick, S.3
  • 27
    • 62149099100
    • Efficient covariance matrix update for variable metric evolution strategies
    • T. Suttorp, N. Hansen, and C. Igel. Efficient covariance matrix update for variable metric evolution strategies. Machine Learning, 75(2):167-197, 2009.
    • (2009) Machine Learning , vol.75 , Issue.2 , pp. 167-197
    • Suttorp, T.1    Hansen, N.2    Igel, C.3
  • 28
    • 33646714634
    • Evolutionary function approximation for reinforcement learning
    • S. Whiteson and P. Stone. Evolutionary function approximation for reinforcement learning. Journal of Machine Learning Research, 7:877-917, 2006.
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 877-917
    • Whiteson, S.1    Stone, P.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.