SCOPUS Information Search Platform

5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings

Volume , Issue , 2017, Pages

Epopt: Learning robust neural network policies using model ensembles

(4) Rajeswaran, Aravind a Ghotra, Sarvjeet b Ravindran, Balaraman c Levine, Sergey d

a UNIVERSITY OF WASHINGTON (United States)

b NATIONAL INSTITUTE OF TECHNOLOGY KARNATAKA (India)

c INDIAN INSTITUTE OF TECHNOLOGY MADRAS (India)

d UNIVERSITY OF CALIFORNIA (United States)

Author keywords

[No Author keywords available]

Indexed keywords

BAYESIAN NETWORKS; DEEP NEURAL NETWORKS; PROBABILITY DISTRIBUTIONS;

APPROXIMATE BAYESIAN; DOMAIN ADAPTATION; FUNCTION APPROXIMATORS; MODEL ENSEMBLES; MODEL-BASED METHOD; REAL-WORLD TASK; SAMPLE COMPLEXITY; SIMULATED TRAININGS;

REINFORCEMENT LEARNING;

EID: 85064811489 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (181)

References (39)

1
- 33749242451
- Using inaccurate models in reinforcement learning
- Pieter Abbeel, Morgan Quigley, and Andrew Y. Ng. Using inaccurate models in reinforcement learning. In ICML, 2006.
- (2006) ICML
- Abbeel, P.¹ Quigley, M.² Ng, A.Y.³

2
- 63149159130
- A survey of robot learning from demonstration
- Brenna D. Argall, Sonia Chernova, Manuela Veloso, and Brett Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5):469-483, 2009.
- (2009) Robotics and Autonomous Systems , vol.57 , Issue.5 , pp. 469-483
- Argall, B.D.¹ Chernova, S.² Veloso, M.³ Browning, B.⁴

3
- 85015444377
- Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. OpenAI Gym, 2016.
- (2016) OpenAI Gym
- Brockman, G.¹ Cheung, V.² Pettersson, L.³ Schneider, J.⁴ Schulman, J.⁵ Tang, J.⁶ Zaremba, W.⁷

4
- 84867115622
- Learning parameterized skills
- Bruno Castro da Silva, George Konidaris, and Andrew G. Barto. Learning parameterized skills. In ICML, 2012.
- (2012) ICML
- Da Silva, B.C.¹ Konidaris, G.² Barto, A.G.³

5
- 84903590417
- A survey on policy search for robotics
- Marc Peter Deisenroth, Gerhard Neumann, and Jan Peters. A survey on policy search for robotics. Foundations and Trends in Robotics, 2(1-2):1-142, 2013.
- (2013) Foundations and Trends in Robotics , vol.2 , Issue.1-2 , pp. 1-142
- Deisenroth, M.P.¹ Neumann, G.² Peters, J.³

6
- 77249117255
- Percentile optimization for Markov decision processes with parameter uncertainty
- Erick Delage and Shie Mannor. Percentile optimization for Markov decision processes with parameter uncertainty. Operations Research, 58(1):203-213, 2010.
- (2010) Operations Research , vol.58 , Issue.1 , pp. 203-213
- Delage, E.¹ Mannor, S.²

7
- 84999018287
- Benchmarking deep reinforcement learning for continuous control
- Yan Duan, Xi Chen, Rein Houthooft, John Schulman, and Pieter Abbeel. Benchmarking deep reinforcement learning for continuous control. In ICML, 2016.
- (2016) ICML
- Duan, Y.¹ Chen, X.² Houthooft, R.³ Schulman, J.⁴ Abbeel, P.⁵

8
- 1942421168
- Design for an optimal probe
- Michael O. Duff. Design for an optimal probe. In ICML, 2003.
- (2003) ICML
- Duff, M.O.¹

9
- 84866768817
- Infinite-horizon model predictive control for periodic tasks with contacts
- Tom Erez, Yuval Tassa, and Emanuel Todorov. Infinite-horizon model predictive control for periodic tasks with contacts. In Proceedings of Robotics: Science and Systems, 2011.
- (2011) Proceedings of Robotics: Science and Systems
- Erez, T.¹ Tassa, Y.² Todorov, E.³

10
- 84962321579
- A comprehensive survey on safe reinforcement learning
- Javier García and Fernando Fernández. A comprehensive survey on safe reinforcement learning. Journal of Machine Learning Research, 2015.
- (2015) Journal of Machine Learning Research
- García, J.¹ Fernández, F.²

11
- 84973621947
- Bayesian reinforcement learning: A survey
- Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, and Aviv Tamar. Bayesian reinforcement learning: A survey. Foundations and Trends in Machine Learning, 8(5-6):359-483, 2015.
- (2015) Foundations and Trends in Machine Learning , vol.8 , Issue.5-6 , pp. 359-483
- Ghavamzadeh, M.¹ Mannor, S.² Pineau, J.³ Tamar, A.⁴

12
- 33646243319
- A natural policy gradient
- Sham Kakade. A natural policy gradient. In NIPS, 2001.
- (2001) NIPS
- Kakade, S.¹

13
- 23244466805
- On the sample complexity of reinforcement learning
- Sham Kakade. On the Sample Complexity of Reinforcement Learning. PhD thesis, University College London, 2003.
- (2003) On the Sample Complexity of Reinforcement Learning
- Kakade, S.¹

14
- 1942514728
- Approximately optimal approximate reinforcement learning
- Sham Kakade and John Langford. Approximately optimal approximate reinforcement learning. In ICML, 2002.
- (2002) ICML
- Kakade, S.¹ Langford, J.²

15
- 84897529781
- Guided policy search
- Sergey Levine and Vladlen Koltun. Guided policy search. In ICML, 2013.
- (2013) ICML
- Levine, S.¹ Koltun, V.²

16
- 84965135289
- Continuous control with deep reinforcement learning
- T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra. Continuous control with deep reinforcement learning. ArXiv e-prints, September 2015.
- (2015) Continuous Control with Deep Reinforcement Learning
- Lillicrap, T.P.¹ Hunt, J.J.² Pritzel, A.³ Heess, N.⁴ Erez, T.⁵ Tassa, Y.⁶ Silver, D.⁷ Wierstra, D.⁸

17
- 84899014168
- Reinforcement learning in robust Markov decision processes
- Shiau Hong Lim, Huan Xu, and Shie Mannor. Reinforcement learning in robust Markov decision processes. In NIPS, 2013.
- (2013) NIPS
- Lim, S.H.¹ Xu, H.² Mannor, S.³

18
- 0003473124
- System identification
- Lennart Ljung. System Identification, pp. 163-173. Birkhäuser Boston, Boston, MA, 1998.
- (1998) System Identification , pp. 163-173
- Ljung, L.¹

19
- 84924051598
- Human-level control through deep reinforcement learning
- Volodymyr Mnih et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529-533, Feb 2015.
- (2015) Nature , vol.518 , Issue.7540 , pp. 529-533
- Mnih, V.¹

20
- 84958149573
- Ensemble-CIO: Full-body dynamic motion planning that transfers to physical humanoids
- I. Mordatch, K. Lowrey, and E. Todorov. Ensemble-CIO: Full-body dynamic motion planning that transfers to physical humanoids. In IROS, 2015a.
- (2015) IROS
- Mordatch, I.¹ Lowrey, K.² Todorov, E.³

21
- 84965182099
- Interactive control of diverse complex characters with neural networks
- Igor Mordatch, Kendall Lowrey, Galen Andrew, Zoran Popovic, and Emanuel V. Todorov. Interactive control of diverse complex characters with neural networks. In NIPS, 2015b.
- (2015) NIPS
- Mordatch, I.¹ Lowrey, K.² Andrew, G.³ Popovic, Z.⁴ Todorov, E.V.⁵

22
- 14344250395
- Robust control of Markov decision processes with uncertain transition matrices
- Arnab Nilim and Laurent El Ghaoui. Robust control of Markov decision processes with uncertain transition matrices. Operations Research, 53(5):780-798, 2005.
- (2005) Operations Research , vol.53 , Issue.5 , pp. 780-798
- Nilim, A.¹ El Ghaoui, L.²

23
- 84979992249
- Terrain-adaptive locomotion skills using deep reinforcement learning
- Xue Bin Peng, Glen Berseth, and Michiel van de Panne. Terrain-adaptive locomotion skills using deep reinforcement learning. ACM Transactions on Graphics (Proc. SIGGRAPH 2016), 2016.
- (2016) ACM Transactions on Graphics (Proc. SIGGRAPH 2016)
- Peng, X.B.¹ Berseth, G.² Van De Panne, M.³

24
- 33750724397
- Point-based value iteration for continuous POMDPs
- Josep M. Porta, Nikos A. Vlassis, Matthijs T. J. Spaan, and Pascal Poupart. Point-based value iteration for continuous POMDPs. Journal of Machine Learning Research, 7:2329-2367, 2006.
- (2006) Journal of Machine Learning Research , vol.7 , pp. 2329-2367
- Porta, J.M.¹ Vlassis, N.A.² Spaan, M.T.J.³ Poupart, P.⁴

25
- 33749251297
- An analytic solution to discrete Bayesian reinforcement learning
- Pascal Poupart, Nikos A. Vlassis, Jesse Hoey, and Kevin Regan. An analytic solution to discrete Bayesian reinforcement learning. In ICML, 2006.
- (2006) ICML
- Poupart, P.¹ Vlassis, N.A.² Hoey, J.³ Regan, K.⁴

26
- 51649091499
- Bayesian reinforcement learning in continuous POMDPs with application to robot navigation
- S. Ross, B. Chaib-draa, and J. Pineau. Bayesian reinforcement learning in continuous POMDPs with application to robot navigation. In ICRA, 2008.
- (2008) ICRA
- Ross, S.¹ Chaib-Draa, B.² Pineau, J.³

27
- 84867115891
- Agnostic system identification for model-based reinforcement learning
- Stephane Ross and Drew Bagnell. Agnostic system identification for model-based reinforcement learning. In ICML, 2012.
- (2012) ICML
- Ross, S.¹ Bagnell, D.²

28
- 84969963490
- Trust region policy optimization
- John Schulman, Sergey Levine, Philipp Moritz, Michael Jordan, and Pieter Abbeel. Trust region policy optimization. In ICML, 2015.
- (2015) ICML
- Schulman, J.¹ Levine, S.² Moritz, P.³ Jordan, M.⁴ Abbeel, P.⁵

29
- 84963949906
- Mastering the game of Go with deep neural networks and tree search
- David Silver et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484-489, Jan 2016.
- (2016) Nature , vol.529 , Issue.7587 , pp. 484-489
- Silver, D.¹

30
- 84864947498
- Integrating a partial model into model free reinforcement learning
- Aviv Tamar, Dotan Di Castro, and Ron Meir. Integrating a partial model into model free reinforcement learning. Journal of Machine Learning Research, 2012.
- (2012) Journal of Machine Learning Research
- Tamar, A.¹ Di Castro, D.² Meir, R.³

31
- 84960095426
- Optimizing the CVaR via sampling
- Aviv Tamar, Yonatan Glassner, and Shie Mannor. Optimizing the CVaR via sampling. In AAAI Conference on Artificial Intelligence, 2015.
- (2015) AAAI Conference on Artificial Intelligence
- Tamar, A.¹ Glassner, Y.² Mannor, S.³

32
- 68949157375
- Transfer learning for reinforcement learning domains: A survey
- Matthew E. Taylor and Peter Stone. Transfer learning for reinforcement learning domains: A survey. Journal of Machine Learning Research, 10:1633-1685, December 2009.
- (2009) Journal of Machine Learning Research , vol.10 , pp. 1633-1685
- Taylor, M.E.¹ Stone, P.²

33
- 84960191971
- High-confidence off-policy evaluation
- Philip Thomas, Georgios Theocharous, and Mohammad Ghavamzadeh. High-confidence off-policy evaluation. In AAAI Conference on Artificial Intelligence, 2015.
- (2015) AAAI Conference on Artificial Intelligence
- Thomas, P.¹ Theocharous, G.² Ghavamzadeh, M.³

34
- 84872292044
- MuJoCo: A physics engine for model-based control
- E. Todorov, T. Erez, and Y. Tassa. MuJoCo: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5026-5033, Oct 2012.
- (2012) 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems , pp. 5026-5033
- Todorov, E.¹ Erez, T.² Tassa, Y.³

35
- 85042936847
- Bayesian reinforcement learning
- Nikos Vlassis, Mohammad Ghavamzadeh, Shie Mannor, and Pascal Poupart. Bayesian Reinforcement Learning, pp. 359-386. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.
- (2012) Bayesian Reinforcement Learning , pp. 359-386
- Vlassis, N.¹ Ghavamzadeh, M.² Mannor, S.³ Poupart, P.⁴

36
- 84871677137
- Optimizing walking controllers for uncertain inputs and environments
- Jack M. Wang, David J. Fleet, and Aaron Hertzmann. Optimizing walking controllers for uncertain inputs and environments. ACM Trans. Graph., 2010.
- (2010) ACM Trans. Graph.
- Wang, J.M.¹ Fleet, D.J.² Hertzmann, A.³

37
- 71749106087
- Real-time reinforcement learning by sequential actor-critics and experience replay
- Pawel Wawrzynski. Real-time reinforcement learning by sequential actor-critics and experience replay. Neural Networks, 22:1484-1497, 2009.
- (2009) Neural Networks , vol.22 , pp. 1484-1497
- Wawrzynski, P.¹

38
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3):229-256, 1992.
- (1992) Machine Learning , vol.8 , Issue.3 , pp. 229-256
- Williams, R.J.¹

39
- 0003585352
- Robust and optimal control
- Kemin Zhou, John C. Doyle, and Keith Glover. Robust and Optimal Control. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1996. ISBN 0-13-456567-3.
- (1996) Robust and Optimal Control
- Zhou, K.¹ Doyle, J.C.² Glover, K.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.