Volume , Issue , 2016, Pages 3682-3690

Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation

Author keywords

[No Author keywords available]

Indexed keywords

DEEP LEARNING; LEARNING ALGORITHMS; STOCHASTIC SYSTEMS

EID: 85019246453     PISSN: 1049-5258     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 1131

References (35)
  • 1. A. Baranes and P.-Y. Oudeyer. Active learning of inverse models with intrinsically motivated goal exploration in robots. Robotics and Autonomous Systems, 61(1): 49-73, 2013.
  • 2. A. G. Barto and S. Mahadevan. Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems, 13(4): 341-379, 2003.
  • 4. L. C. Cobo, C. L. Isbell, and A. L. Thomaz. Object focused Q-learning for autonomous agents. In Proceedings of AAMAS, pages 1061-1068, 2013.
  • 5. P. Dayan. Improving generalization for temporal difference learning: The successor representation. Neural Computation, 5(4): 613-624, 1993.
  • 6. T. G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. J. Artif. Intell. Res. (JAIR), 13: 227-303, 2000.
  • 10. S. Goel and M. Huber. Subgoal discovery for hierarchical reinforcement learning using learned policies. In FLAIRS Conference, pages 346-350, 2003.
  • 17. S. Mohamed and D. J. Rezende. Variational information maximisation for intrinsically motivated reinforcement learning. In Advances in Neural Information Processing Systems, pages 2116-2124, 2015.
  • 20. P.-Y. Oudeyer and F. Kaplan. What is intrinsic motivation? A typology of computational approaches. Frontiers in Neurorobotics, 1: 6, 2009.
  • 22. J. Schmidhuber. Formal theory of creativity, fun, and intrinsic motivation (1990-2010). IEEE Transactions on Autonomous Mental Development, 2(3): 230-247, 2010.
  • 34. R. S. Sutton, D. Precup, and S. Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1): 181-211, 1999.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.