Volume 13, 2012, Pages 1333-1371

Transfer in reinforcement learning via shared features

Author keywords

Reinforcement learning; Shaping; Skills; Transfer

Indexed keywords

KNOWLEDGE TRANSFER; SHAPING; SKILL TRANSFER; SKILLS; TRANSFER

EID: 84862001711     PISSN: 1532-4435     EISSN: 1533-7928     Source Type: Journal
DOI: None     Document Type: Article
Times cited: 86

References (55)
  • 3. A. G. Barto and S. Mahadevan. Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems, 13:41-77, 2003. Special Issue on Reinforcement Learning.
  • 4. D. S. Bernstein. Reusing old policies to accelerate learning on new MDPs. Technical Report UM-CS-1999-026, Department of Computer Science, University of Massachusetts at Amherst, April 1999.
  • 5. J. Boyan and A. W. Moore. Learning evaluation functions to improve optimization by local search. Journal of Machine Learning Research, 1:77-112, 2000.
  • 7. T. G. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227-303, 2000.
  • 8. B. L. Digney. Learning hierarchical control structures for multiple tasks and changing environments. In R. Pfeifer, B. Blumberg, J. Meyer, and S. W. Wilson, editors, From Animals to Animats 5: Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior, Zurich, Switzerland, August 1998. MIT Press.
  • 11. K. Ferguson and S. Mahadevan. Proto-transfer learning in Markov Decision Processes using spectral methods. Technical Report TR-08-23, University of Massachusetts Amherst, 2008.
  • 12. F. Fernández and M. Veloso. Probabilistic policy reuse in a reinforcement learning agent. In Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems, pages 720-727, 2006. DOI: 10.1145/1160633.1160762.
  • 14. L. Frommberger. Learning to behave in space: A qualitative spatial representation for robot navigation with reinforcement learning. International Journal on Artificial Intelligence Tools, 17(3):465-482, 2008.
  • 15. A. Guazzelli, F. J. Corbacho, M. Bota, and M. A. Arbib. Affordances, motivations, and the world graph theory. Adaptive Behavior, 6(3/4):433-471, 1998.
  • 17. A. Jonsson and A. G. Barto. Automated state abstraction for options using the U-Tree algorithm. In Advances in Neural Information Processing Systems 13, pages 1054-1060, 2001.
  • 23. M. J. Matarić. Reinforcement learning in the multi-robot domain. Autonomous Robots, 4(1):73-83, 1997.
  • 26. A. W. Moore and C. G. Atkeson. Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning, 13(1):103-130, 1993.
  • 29. T. J. Perkins and D. Precup. Using options for knowledge transfer in reinforcement learning. Technical Report UM-CS-1999-034, Department of Computer Science, University of Massachusetts, Amherst, 1999.
  • 37. Ö. Şimşek and A. G. Barto. Using relative novelty to identify useful temporal abstractions in reinforcement learning. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), pages 751-758, 2004.
  • 42. P. Stone, R. S. Sutton, and G. Kuhlmann. Reinforcement learning for RoboCup soccer keepaway. Adaptive Behavior, 13(3):165-188, 2005. DOI: 10.1177/105971230501300301.
  • 45. R. S. Sutton, D. Precup, and S. P. Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1-2):181-211, 1999. DOI: 10.1016/S0004-3702(99)00052-1.
  • 47. M. E. Taylor, P. Stone, and Y. Liu. Transfer learning via inter-task mappings for temporal difference learning. Journal of Machine Learning Research, 8:2125-2167, 2007.
  • 52. E. Wiewiora. Potential-based shaping and Q-value initialization are equivalent. Journal of Artificial Intelligence Research, 19:205-208, 2003.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.