Volume , Issue , 2016, Pages 3682-3690

Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation

Author keywords

[No Author keywords available]

Indexed keywords

DEEP LEARNING; LEARNING ALGORITHMS; STOCHASTIC SYSTEMS

EID: 85019246453     PISSN: 1049-5258     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 1131

References (35)
  • 1. A. Baranes and P.-Y. Oudeyer. Active learning of inverse models with intrinsically motivated goal exploration in robots. Robotics and Autonomous Systems, 61(1): 49-73, 2013.
  • 2. A. G. Barto and S. Mahadevan. Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems, 13(4): 341-379, 2003.
  • 4. L. C. Cobo, C. L. Isbell, and A. L. Thomaz. Object focused Q-learning for autonomous agents. In Proceedings of AAMAS, pages 1061-1068, 2013.
  • 5. P. Dayan. Improving generalization for temporal difference learning: The successor representation. Neural Computation, 5(4): 613-624, 1993.
  • 6. T. G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. J. Artif. Intell. Res. (JAIR), 13: 227-303, 2000.
  • 10. S. Goel and M. Huber. Subgoal discovery for hierarchical reinforcement learning using learned policies. In FLAIRS Conference, pages 346-350, 2003.
  • 17. S. Mohamed and D. J. Rezende. Variational information maximisation for intrinsically motivated reinforcement learning. In Advances in Neural Information Processing Systems, pages 2116-2124, 2015.
  • 20. P.-Y. Oudeyer and F. Kaplan. What is intrinsic motivation? A typology of computational approaches. Frontiers in Neurorobotics, 1: 6, 2009.
  • 22. J. Schmidhuber. Formal theory of creativity, fun, and intrinsic motivation (1990-2010). IEEE Transactions on Autonomous Mental Development, 2(3): 230-247, 2010.
  • 34. R. S. Sutton, D. Precup, and S. Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1): 181-211, 1999.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.