SCOPUS Information Search Platform

Advances in Neural Information Processing Systems

Volume 2017-December, 2017, Pages 1088-1099

One-shot imitation learning

(8) Duan, Yan (a); Andrychowicz, Marcin (b); Stadie, Bradly (a, b); Ho, Jonathan (a); Schneider, Jonas (b); Sutskever, Ilya (b); Abbeel, Pieter (a); Zaremba, Wojciech (b)

a Berkeley Lab (United States)

b OpenAI LLC (United States)

Author keywords

[No Author keywords available]

Indexed keywords

NEURAL NETWORKS;

FEATURE ENGINEERINGS; GENERAL SYSTEMS; IMITATION LEARNING; INITIAL STATE; META-LEARNING FRAMEWORKS; NUMBER OF SAMPLES; TRAINING DATA; TRAINING TIME;

DEMONSTRATIONS;

EID: 85047012880 PISSN: 10495258 EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited: 571

References (66)

1
- 14344251217
- Apprenticeship learning via inverse reinforcement learning
- Pieter Abbeel and Andrew Ng. Apprenticeship learning via inverse reinforcement learning. In International Conference on Machine Learning (ICML), 2004.
- (2004) International Conference on Machine Learning (ICML)
- Abbeel, P.¹ Ng, A.²

2
- 85019172761
- Learning to learn by gradient descent by gradient descent
- Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W Hoffman, David Pfau, Tom Schaul, and Nando de Freitas. Learning to learn by gradient descent by gradient descent. In Neural Information Processing Systems (NIPS), 2016.
- (2016) Neural Information Processing Systems (NIPS)
- Andrychowicz, M.¹ Denil, M.² Gomez, S.³ Hoffman, M.W.⁴ Pfau, D.⁵ Schaul, T.⁶ De Freitas, N.⁷

3
- 63149159130
- A survey of robot learning from demonstration
- Brenna D Argall, Sonia Chernova, Manuela Veloso, and Brett Browning. A survey of robot learning from demonstration. Robotics and autonomous systems, 57(5):469-483, 2009.
- (2009) Robotics and Autonomous Systems , vol.57 , Issue.5 , pp. 469-483
- Argall, B.D.¹ Chernova, S.² Veloso, M.³ Browning, B.⁴

4
- 84856675275
- Tabula rasa: Model transfer for object category detection
- IEEE
- Yusuf Aytar and Andrew Zisserman. Tabula rasa: Model transfer for object category detection. In 2011 International Conference on Computer Vision, pages 2252-2259. IEEE, 2011.
- (2011) 2011 International Conference on Computer Vision , pp. 2252-2259
- Aytar, Y.¹ Zisserman, A.²

5
- 85019174179
- Using fast weights to attend to the recent past
- Jimmy Ba, Geoffrey E Hinton, Volodymyr Mnih, Joel Z Leibo, and Catalin Ionescu. Using fast weights to attend to the recent past. In Neural Information Processing Systems (NIPS), 2016.
- (2016) Neural Information Processing Systems (NIPS)
- Ba, J.¹ Hinton, G.E.² Mnih, V.³ Leibo, J.Z.⁴ Ionescu, C.⁵

6
- 84922389693
- arXiv preprint
- Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
- (2014) Neural Machine Translation by Jointly Learning to Align and Translate
- Bahdanau, D.¹ Cho, K.² Bengio, Y.³

7
- 85018882182
- Interaction networks for learning about objects, relations and physics
- Peter Battaglia, Razvan Pascanu, Matthew Lai, Danilo Jimenez Rezende, et al. Interaction networks for learning about objects, relations and physics. In Advances in Neural Information Processing Systems, pages 4502-4510, 2016.
- (2016) Advances in Neural Information Processing Systems , pp. 4502-4510
- Battaglia, P.¹ Pascanu, R.² Lai, M.³ Rezende, D.J.⁴

8
- 85047008902
- On the optimization of a synaptic learning rule
- Samy Bengio, Yoshua Bengio, Jocelyn Cloutier, and Jan Gecsei. On the optimization of a synaptic learning rule. In Optimality in Artificial and Biological Neural Networks, pages 6-8, 1992.
- (1992) Optimality in Artificial and Biological Neural Networks , pp. 6-8
- Bengio, S.¹ Bengio, Y.² Cloutier, J.³ Gecsei, J.⁴

9
- 84921824478
- Université de Montréal, Département d'informatique et de recherche opérationnelle
- Yoshua Bengio, Samy Bengio, and Jocelyn Cloutier. Learning a synaptic learning rule. Université de Montréal, Département d'informatique et de recherche opérationnelle, 1990.
- (1990) Learning a Synaptic Learning Rule
- Bengio, Y.¹ Bengio, S.² Cloutier, J.³

10
- 0029509952
- Neuro-dynamic programming: An overview
- Proceedings of the 34th IEEE Conference on Decision and Control, IEEE
- Dimitri P Bertsekas and John N Tsitsiklis. Neuro-dynamic programming: an overview. In Decision and Control, 1995., Proceedings of the 34th IEEE Conference on, Volume 1, pages 560-564. IEEE, 1995.
- (1995) Decision and Control, 1995 , vol.1 , pp. 560-564
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

11
- 70449601159
- EPFL Press
- Sylvain Calinon. Robot programming by demonstration. EPFL Press, 2009.
- (2009) Robot Programming by Demonstration
- Calinon, S.¹

12
- 85088231129
- A compositional object-based approach to learning physical dynamics
- Michael B Chang, Tomer Ullman, Antonio Torralba, and Joshua B Tenenbaum. A compositional object-based approach to learning physical dynamics. In Int. Conf. on Learning Representations (ICLR), 2017.
- (2017) Int. Conf. on Learning Representations (ICLR)
- Chang, M.B.¹ Ullman, T.² Torralba, A.³ Tenenbaum, J.B.⁴

13
- 84919728106
- arXiv preprint
- Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.
- (2014) Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation
- Cho, K.¹ Van Merriënboer, B.² Gulcehre, C.³ Bahdanau, D.⁴ Bougares, F.⁵ Schwenk, H.⁶ Bengio, Y.⁷

14
- 84906332834
- Decaf: A deep convolutional activation feature for generic visual recognition
- Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, and Trevor Darrell. Decaf: A deep convolutional activation feature for generic visual recognition. In ICML, pages 647-655, 2014.
- (2014) ICML , pp. 647-655
- Donahue, J.¹ Jia, Y.² Vinyals, O.³ Hoffman, J.⁴ Zhang, N.⁵ Tzeng, E.⁶ Darrell, T.⁷

15
- 84914177532
- arXiv preprint
- Lixin Duan, Dong Xu, and Ivor Tsang. Learning with augmented features for heterogeneous domain adaptation. arXiv preprint arXiv:1206.4660, 2012.
- (2012) Learning with Augmented Features for Heterogeneous Domain Adaptation
- Duan, L.¹ Xu, D.² Tsang, I.³

16
- 85028473985
- arXiv preprint
- Yan Duan, John Schulman, Xi Chen, Peter L Bartlett, Ilya Sutskever, and Pieter Abbeel. RL²: Fast reinforcement learning via slow reinforcement learning. arXiv preprint arXiv:1611.02779, 2016.
- (2016) RL²: Fast Reinforcement Learning Via Slow Reinforcement Learning
- Duan, Y.¹ Schulman, J.² Chen, X.³ Bartlett, P.L.⁴ Sutskever, I.⁵ Abbeel, P.⁶

17
- 85112095815
- Towards a neural statistician
- Harrison Edwards and Amos Storkey. Towards a neural statistician. International Conference on Learning Representations (ICLR), 2017.
- (2017) International Conference on Learning Representations (ICLR)
- Edwards, H.¹ Storkey, A.²

18
- 84989292118
- Guided cost learning: Deep inverse optimal control via policy optimization
- Chelsea Finn, Sergey Levine, and Pieter Abbeel. Guided cost learning: Deep inverse optimal control via policy optimization. In Proceedings of the 33rd International Conference on Machine Learning, Volume 48, 2016.
- (2016) Proceedings of the 33rd International Conference on Machine Learning , vol.48
- Finn, C.¹ Levine, S.² Abbeel, P.³

19
- 85046762258
- arXiv preprint
- Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400, 2017.
- (2017) Model-agnostic Meta-learning for Fast Adaptation of Deep Networks
- Finn, C.¹ Abbeel, P.² Levine, S.³

20
- 85048435317
- Learning invariant feature spaces to transfer skills with reinforcement learning
- Abhishek Gupta, Coline Devin, YuXuan Liu, Pieter Abbeel, and Sergey Levine. Learning invariant feature spaces to transfer skills with reinforcement learning. In Int. Conf. on Learning Representations (ICLR), 2017.
- (2017) Int. Conf. on Learning Representations (ICLR)
- Gupta, A.¹ Devin, C.² Liu, Y.³ Abbeel, P.⁴ Levine, S.⁵

21
- 84965103751
- Learning continuous control policies by stochastic value gradients
- Nicolas Heess, Gregory Wayne, David Silver, Tim Lillicrap, Tom Erez, and Yuval Tassa. Learning continuous control policies by stochastic value gradients. In Advances in Neural Information Processing Systems, pages 2944-2952, 2015.
- (2015) Advances in Neural Information Processing Systems , pp. 2944-2952
- Heess, N.¹ Wayne, G.² Silver, D.³ Lillicrap, T.⁴ Erez, T.⁵ Tassa, Y.⁶

22
- 85018872345
- Generative adversarial imitation learning
- Jonathan Ho and Stefano Ermon. Generative adversarial imitation learning. In Advances in Neural Information Processing Systems, pages 4565-4573, 2016.
- (2016) Advances in Neural Information Processing Systems , pp. 4565-4573
- Ho, J.¹ Ermon, S.²

23
- 0002771403
- Learning to learn using gradient descent
- Springer
- Sepp Hochreiter, A Steven Younger, and Peter R Conwell. Learning to learn using gradient descent. In International Conference on Artificial Neural Networks. Springer, 2001.
- (2001) International Conference on Artificial Neural Networks
- Hochreiter, S.¹ Younger, A.S.² Conwell, P.R.³

24
- 85083950659
- arXiv preprint
- Judy Hoffman, Erik Rodner, Jeff Donahue, Trevor Darrell, and Kate Saenko. Efficient learning of domain-invariant image representations. arXiv preprint arXiv:1301.3224, 2013.
- (2013) Efficient Learning of Domain-invariant Image Representations
- Hoffman, J.¹ Rodner, E.² Donahue, J.³ Darrell, T.⁴ Saenko, K.⁵

25
- 84941620184
- Adam: A method for stochastic optimization
- Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR), 2014.
- (2014) Proceedings of the 3rd International Conference on Learning Representations (ICLR)
- Kingma, D.P.¹ Ba, J.²

26
- 85020183301
- Siamese neural networks for one-shot image recognition
- Gregory Koch. Siamese neural networks for one-shot image recognition. ICML Deep Learning Workshop, 2015.
- (2015) ICML Deep Learning Workshop
- Koch, G.¹

27
- 85018911798
- arXiv preprint
- David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Hugo Larochelle, Aaron Courville, et al. Zoneout: Regularizing rnns by randomly preserving hidden activations. arXiv preprint arXiv:1606.01305, 2016.
- (2016) Zoneout: Regularizing Rnns by Randomly Preserving Hidden Activations
- Krueger, D.¹ Maharaj, T.² Kramár, J.³ Pezeshki, M.⁴ Ballas, N.⁵ Ke, N.R.⁶ Goyal, A.⁷ Bengio, Y.⁸ Larochelle, H.⁹ Courville, A.¹⁰

28
- 80052895155
- What you saw is not what you get: Domain adaptation using asymmetric kernel transforms
- IEEE
- Brian Kulis, Kate Saenko, and Trevor Darrell. What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 1785-1792. IEEE, 2011.
- (2011) Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference On , pp. 1785-1792
- Kulis, B.¹ Saenko, K.² Darrell, T.³

29
- 85162407671
- Nonlinear inverse reinforcement learning with Gaussian processes
- S. Levine, Z. Popovic, and V. Koltun. Nonlinear inverse reinforcement learning with Gaussian processes. In Advances in Neural Information Processing Systems (NIPS), 2011.
- (2011) Advances in Neural Information Processing Systems (NIPS)
- Levine, S.¹ Popovic, Z.² Koltun, V.³

30
- 84979924150
- End-to-end training of deep visuomotor policies
- Sergey Levine, Chelsea Finn, Trevor Darrell, and Pieter Abbeel. End-to-end training of deep visuomotor policies. Journal of Machine Learning Research, 17(39):1-40, 2016.
- (2016) Journal of Machine Learning Research , vol.17 , Issue.39 , pp. 1-40
- Levine, S.¹ Finn, C.² Darrell, T.³ Abbeel, P.⁴

31
- 85031093489
- arXiv preprint
- Ke Li and Jitendra Malik. Learning to optimize. arXiv preprint arXiv:1606.01885, 2016.
- (2016) Learning to Optimize
- Li, K.¹ Malik, J.²

32
- 84965135289
- arXiv preprint
- Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971, 2015.
- (2015) Continuous Control with Deep Reinforcement Learning
- Lillicrap, T.P.¹ Hunt, J.J.² Pritzel, A.³ Heess, N.⁴ Erez, T.⁵ Tassa, Y.⁶ Silver, D.⁷ Wierstra, D.⁸

33
- 84973924037
- Learning transferable features with deep adaptation networks
- Mingsheng Long and Jianmin Wang. Learning transferable features with deep adaptation networks. CoRR, abs/1502.02791, 1:2 2015.
- (2015) CoRR , vol.1 , pp. 2
- Long, M.¹ Wang, J.²

34
- 84898072330
- arXiv preprint
- Yishay Mansour, Mehryar Mohri, and Afshin Rostamizadeh. Domain adaptation: Learning bounds and algorithms. arXiv preprint arXiv:0902.3430, 2009.
- (2009) Domain Adaptation: Learning Bounds and Algorithms
- Mansour, Y.¹ Mohri, M.² Rostamizadeh, A.³

35
- 84924051598
- Human-level control through deep reinforcement learning
- Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529-533, 2015.
- (2015) Nature , vol.518 , Issue.7540 , pp. 529-533
- Mnih, V.¹ Kavukcuoglu, K.² Silver, D.³ Rusu, A.A.⁴ Veness, J.⁵ Bellemare, M.G.⁶ Graves, A.⁷ Riedmiller, M.⁸ Fidjeland, A.K.⁹ Ostrovski, G.¹⁰

36
- 84875556978
- Meta-neural networks that learn by learning
- Devang K Naik and RJ Mammone. Meta-neural networks that learn by learning. In International Joint Conference on Neural Networks (IJCNN), 1992.
- (1992) International Joint Conference on Neural Networks (IJCNN)
- Naik, D.K.¹ Mammone, R.²

37
- 0042547347
- Algorithms for inverse reinforcement learning
- Andrew Ng and Stuart Russell. Algorithms for inverse reinforcement learning. In International Conference on Machine Learning (ICML), 2000.
- (2000) International Conference on Machine Learning (ICML)
- Ng, A.¹ Russell, S.²

38
- 0141596576
- Policy invariance under reward transformations: Theory and application to reward shaping
- Andrew Y Ng, Daishi Harada, and Stuart Russell. Policy invariance under reward transformations: Theory and application to reward shaping. In ICML, Volume 99, pages 278-287, 1999.
- (1999) ICML , vol.99 , pp. 278-287
- Ng, A.Y.¹ Harada, D.² Russell, S.³

39
- 3042583887
- Autonomous helicopter flight via reinforcement learning
- Andrew Y Ng, H Jin Kim, Michael I Jordan, Shankar Sastry, and Shiv Ballianda. Autonomous helicopter flight via reinforcement learning. In NIPS, Volume 16, 2003.
- (2003) NIPS , vol.16
- Ng, A.Y.¹ Kim, H.J.² Jordan, M.I.³ Sastry, S.⁴ Ballianda, S.⁵

40
- 44949241322
- Reinforcement learning of motor skills with policy gradients
- Jan Peters and Stefan Schaal. Reinforcement learning of motor skills with policy gradients. Neural networks, 21(4):682-697, 2008.
- (2008) Neural Networks , vol.21 , Issue.4 , pp. 682-697
- Peters, J.¹ Schaal, S.²

41
- 0000796434
- Alvinn: An autonomous land vehicle in a neural network
- Dean A Pomerleau. Alvinn: An autonomous land vehicle in a neural network. In Advances in Neural Information Processing Systems, pages 305-313, 1989.
- (1989) Advances in Neural Information Processing Systems , pp. 305-313
- Pomerleau, D.A.¹

42
- 85041901997
- Optimization as a model for few-shot learning
- Sachin Ravi and Hugo Larochelle. Optimization as a model for few-shot learning. In Under Review, ICLR, 2017.
- (2017) Under Review, ICLR
- Ravi, S.¹ Larochelle, H.²

43
- 84998631632
- One-shot generalization in deep generative models
- Danilo Jimenez Rezende, Shakir Mohamed, Ivo Danihelka, Karol Gregor, and Daan Wierstra. One-shot generalization in deep generative models. International Conference on Machine Learning (ICML), 2016.
- (2016) International Conference on Machine Learning (ICML)
- Rezende, D.J.¹ Mohamed, S.² Danihelka, I.³ Gregor, K.⁴ Wierstra, D.⁵

44
- 84867135104
- A reduction of imitation learning and structured prediction to no-regret online learning
- Stéphane Ross, Geoffrey J Gordon, and Drew Bagnell. A reduction of imitation learning and structured prediction to no-regret online learning. In AISTATS, Volume 1, page 6, 2011.
- (2011) AISTATS , vol.1 , pp. 6
- Ross, S.¹ Gordon, G.J.² Bagnell, D.³

45
- 85016426881
- arXiv preprint
- Andrei A Rusu, Neil C Rabinowitz, Guillaume Desjardins, Hubert Soyer, James Kirkpatrick, Koray Kavukcuoglu, Razvan Pascanu, and Raia Hadsell. Progressive neural networks. arXiv preprint arXiv:1606.04671, 2016.
- (2016) Progressive Neural Networks
- Rusu, A.A.¹ Rabinowitz, N.C.² Desjardins, G.³ Soyer, H.⁴ Kirkpatrick, J.⁵ Kavukcuoglu, K.⁶ Pascanu, R.⁷ Hadsell, R.⁸

46
- 85041964417
- Fereshteh Sadeghi and Sergey Levine. (CAD)²RL: Real single-image flight without a single real image. 2016.
- (2016) (CAD)²RL: Real Single-Image Flight Without a Single Real Image
- Sadeghi, F.¹ Levine, S.²

47
- 84998717754
- Meta-learning with memory-augmented neural networks
- Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, and Timothy Lillicrap. Meta-learning with memory-augmented neural networks. In International Conference on Machine Learning (ICML), 2016.
- (2016) International Conference on Machine Learning (ICML)
- Santoro, A.¹ Bartunov, S.² Botvinick, M.³ Wierstra, D.⁴ Lillicrap, T.⁵

48
- 0033151712
- Is imitation learning the route to humanoid robots?
- Stefan Schaal. Is imitation learning the route to humanoid robots? Trends in cognitive sciences, 3(6):233-242, 1999.
- (1999) Trends in Cognitive Sciences , vol.3 , Issue.6 , pp. 233-242
- Schaal, S.¹

49
- 25944480439
- Diploma thesis, Institut f. Informatik, Tech. Univ. Munich
- Jürgen Schmidhuber. Evolutionary principles in self-referential learning. (On learning how to learn: The meta-meta-⋯ hook.) Diploma thesis, Institut f. Informatik, Tech. Univ. Munich, 1987.
- (1987) Evolutionary Principles in Self-Referential Learning
- Schmidhuber, J.¹

50
- 0346377064
- Learning to control fast-weight memories: An alternative to dynamic recurrent networks
- Jürgen Schmidhuber. Learning to control fast-weight memories: An alternative to dynamic recurrent networks. Neural Computation, 1992.
- (1992) Neural Computation
- Schmidhuber, J.¹

51
- 84969963490
- Trust region policy optimization
- John Schulman, Sergey Levine, Pieter Abbeel, Michael I Jordan, and Philipp Moritz. Trust region policy optimization. In ICML, pages 1889-1897, 2015.
- (2015) ICML , pp. 1889-1897
- Schulman, J.¹ Levine, S.² Abbeel, P.³ Jordan, M.I.⁴ Moritz, P.⁵

52
- 84963949906
- Mastering the game of go with deep neural networks and tree search
- David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. Mastering the game of go with deep neural networks and tree search. Nature, 529(7587):484-489, 2016.
- (2016) Nature , vol.529 , Issue.7587 , pp. 484-489
- Silver, D.¹ Huang, A.² Maddison, C.J.³ Guez, A.⁴ Sifre, L.⁵ Van Den Driessche, G.⁶ Schrittwieser, J.⁷ Antonoglou, I.⁸ Panneershelvam, V.⁹ Lanctot, M.¹⁰

53
- 84904163933
- Dropout: A simple way to prevent neural networks from overfitting
- Nitish Srivastava, Geoffrey E Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929-1958, 2014.
- (2014) Journal of Machine Learning Research , vol.15 , Issue.1 , pp. 1929-1958
- Srivastava, N.¹ Hinton, G.E.² Krizhevsky, A.³ Sutskever, I.⁴ Salakhutdinov, R.⁵

54
- 85146496818
- Third person imitation learning
- Bradly Stadie, Pieter Abbeel, and Ilya Sutskever. Third person imitation learning. In Int. Conf. on Learning Representations (ICLR), 2017.
- (2017) Int. Conf. on Learning Representations (ICLR)
- Stadie, B.¹ Abbeel, P.² Sutskever, I.³

55
- 84928547704
- Sequence to sequence learning with neural networks
- Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pages 3104-3112, 2014.
- (2014) Advances in Neural Information Processing Systems , pp. 3104-3112
- Sutskever, I.¹ Vinyals, O.² Le, Q.V.³

56
- 0004102479
- MIT Press, Cambridge
- Richard S Sutton and Andrew G Barto. Reinforcement learning: An introduction, Volume 1. MIT Press, Cambridge, 1998.
- (1998) Reinforcement Learning: An Introduction , vol.1
- Sutton, R.S.¹ Barto, A.G.²

57
- 0029276036
- Temporal difference learning and td-gammon
- Gerald Tesauro. Temporal difference learning and td-gammon. Communications of the ACM, 38(3):58-68, 1995.
- (1995) Communications of the ACM , vol.38 , Issue.3 , pp. 58-68
- Tesauro, G.¹

58
- 0003901612
- Springer Science & Business Media
- Sebastian Thrun and Lorien Pratt. Learning to learn. Springer Science & Business Media, 1998.
- (1998) Learning to Learn
- Thrun, S.¹ Pratt, L.²

59
- 84969568676
- arXiv preprint
- Eric Tzeng, Judy Hoffman, Ning Zhang, Kate Saenko, and Trevor Darrell. Deep domain confusion: Maximizing for domain invariance. arXiv preprint arXiv:1412.3474, 2014.
- (2014) Deep Domain Confusion: Maximizing for Domain Invariance
- Tzeng, E.¹ Hoffman, J.² Zhang, N.³ Saenko, K.⁴ Darrell, T.⁵

60
- 84990841560
- arXiv preprint
- Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Xingchao Peng, Pieter Abbeel, Sergey Levine, Kate Saenko, and Trevor Darrell. Towards adapting deep visuomotor representations from simulated to real environments. arXiv preprint arXiv:1511.07111, 2015.
- (2015) Towards Adapting Deep Visuomotor Representations from Simulated to Real Environments
- Tzeng, E.¹ Devin, C.² Hoffman, J.³ Finn, C.⁴ Peng, X.⁵ Abbeel, P.⁶ Levine, S.⁷ Saenko, K.⁸ Darrell, T.⁹

61
- 85018863845
- Matching networks for one shot learning
- Oriol Vinyals, Charles Blundell, Tim Lillicrap, Daan Wierstra, et al. Matching networks for one shot learning. In Neural Information Processing Systems (NIPS), 2016.
- (2016) Neural Information Processing Systems (NIPS)
- Vinyals, O.¹ Blundell, C.² Lillicrap, T.³ Wierstra, D.⁴

62
- 85028474927
- arXiv preprint
- Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, and Matt Botvinick. Learning to reinforcement learn. arXiv preprint arXiv:1611.05763, 2016.
- (2016) Learning to Reinforcement Learn
- Wang, J.X.¹ Kurth-Nelson, Z.² Tirumala, D.³ Soyer, H.⁴ Leibo, J.Z.⁵ Munos, R.⁶ Blundell, C.⁷ Kumaran, D.⁸ Botvinick, M.⁹

63
- 84939821074
- Show, attend and tell: Neural image caption generation with visual attention
- Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C Courville, Ruslan Salakhutdinov, Richard S Zemel, and Yoshua Bengio. Show, attend and tell: Neural image caption generation with visual attention. In ICML, Volume 14, pages 77-81, 2015.
- (2015) ICML , vol.14 , pp. 77-81
- Xu, K.¹ Ba, J.² Kiros, R.³ Cho, K.⁴ Courville, A.C.⁵ Salakhutdinov, R.⁶ Zemel, R.S.⁷ Bengio, Y.⁸

64
- 37849026107
- Cross-domain video concept detection using adaptive svms
- ACM
- Jun Yang, Rong Yan, and Alexander G Hauptmann. Cross-domain video concept detection using adaptive svms. In Proceedings of the 15th ACM international conference on Multimedia, pages 188-197. ACM, 2007.
- (2007) Proceedings of the 15th ACM International Conference on Multimedia , pp. 188-197
- Yang, J.¹ Yan, R.² Hauptmann, A.G.³

65
- 85083952059
- Multi-scale context aggregation by dilated convolutions
- Fisher Yu and Vladlen Koltun. Multi-scale context aggregation by dilated convolutions. In International Conference on Learning Representations (ICLR), 2016.
- (2016) International Conference on Learning Representations (ICLR)
- Yu, F.¹ Koltun, V.²

66
- 57749097473
- Maximum entropy inverse reinforcement learning
- B. Ziebart, A. Maas, J. A. Bagnell, and A. K. Dey. Maximum entropy inverse reinforcement learning. In AAAI Conference on Artificial Intelligence, 2008.
- (2008) AAAI Conference on Artificial Intelligence
- Ziebart, B.¹ Maas, A.² Bagnell, J.A.³ Dey, A.K.⁴

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.