Volume 3, 2015, Pages 1889-1897

Trust region policy optimization

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION ALGORITHMS; ARTIFICIAL INTELLIGENCE; LEARNING SYSTEMS;

EID: 84969963490     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 6406

References (33)
  • 6
    • 33846679442
    • Simulation optimization: A review, new developments, and applications
    • Winter Simulation Conference
    • Fu, Michael C, Glover, Fred W, and April, Jay. Simulation optimization: a review, new developments, and applications. In Proceedings of the 37th conference on Winter simulation, pp. 83-95. Winter Simulation Conference, 2005.
    • (2005) Proceedings of the 37th Conference on Winter Simulation , pp. 83-95
    • Fu, M.C.; Glover, F.W.; April, J.
  • 11
    • 1342332031
    • A tutorial on MM algorithms
    • Hunter, David R and Lange, Kenneth. A tutorial on MM algorithms. The American Statistician, 58(1):30-37, 2004.
    • (2004) The American Statistician , vol.58 , Issue.1 , pp. 30-37
    • Hunter, D.R.; Lange, K.
  • 13
    • 1942514728
    • Approximately optimal approximate reinforcement learning
    • Kakade, Sham and Langford, John. Approximately optimal approximate reinforcement learning. In ICML, volume 2, pp. 267-274, 2002.
    • (2002) ICML , vol.2 , pp. 267-274
    • Kakade, S.; Langford, J.
  • 14
    • 1942420814
    • Reinforcement learning as classification: Leveraging modern classifiers
    • Lagoudakis, Michail G and Parr, Ronald. Reinforcement learning as classification: Leveraging modern classifiers. In ICML, volume 3, pp. 424-431, 2003.
    • (2003) ICML , vol.3 , pp. 424-431
    • Lagoudakis, M.G.; Parr, R.
  • 16
    • 84937822296
    • Learning neural network policies with guided policy search under unknown dynamics
    • Levine, Sergey and Abbeel, Pieter. Learning neural network policies with guided policy search under unknown dynamics. In Advances in Neural Information Processing Systems, pp. 1071-1079, 2014.
    • (2014) Advances in Neural Information Processing Systems , pp. 1071-1079
    • Levine, S.; Abbeel, P.
  • 17
    • 84872565347
    • Training deep and recurrent networks with hessian-free optimization
    • Springer
    • Martens, J. and Sutskever, I. Training deep and recurrent networks with hessian-free optimization. In Neural Networks: Tricks of the Trade, pp. 479-535. Springer, 2012.
    • (2012) Neural Networks: Tricks of the Trade , pp. 479-535
    • Martens, J.; Sutskever, I.
  • 23
    • 44949241322
    • Reinforcement learning of motor skills with policy gradients
    • Peters, J. and Schaal, S. Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4):682-697, 2008a.
    • (2008) Neural Networks , vol.21 , Issue.4 , pp. 682-697
    • Peters, J.; Schaal, S.
  • 26
    • 40649106649
    • Natural actor-critic
    • Peters, Jan and Schaal, Stefan. Natural actor-critic. Neurocomputing, 71(7):1180-1190, 2008b.
    • (2008) Neurocomputing , vol.71 , Issue.7 , pp. 1180-1190
    • Peters, J.; Schaal, S.
  • 29
    • 33845344721
    • Learning tetris using the noisy cross-entropy method
    • Szita, István and Lörincz, András. Learning tetris using the noisy cross-entropy method. Neural Computation, 18(12):2936-2941, 2006.
    • (2006) Neural Computation , vol.18 , Issue.12 , pp. 2936-2941
    • Szita, I.; Lörincz, A.
  • 32
    • 70349668763
    • Optimal gait and form for animal locomotion
    • ACM
    • Wampler, Kevin and Popović, Zoran. Optimal gait and form for animal locomotion. In ACM Transactions on Graphics (TOG), volume 28, pp. 60. ACM, 2009.
    • (2009) ACM Transactions on Graphics (TOG) , vol.28 , pp. 60
    • Wampler, K.; Popović, Z.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.