Volume 2, 2015, Pages 1312-1320

Universal value function approximators

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; LEARNING ALGORITHMS; LEARNING SYSTEMS

EID: 84969760283     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 1188

References (26)
  • 2
    • 0031189914
    • Multitask learning
    • Caruana, Rich. Multitask learning. Machine Learning, 28(1):41-75, 1997.
    • (1997) Machine Learning , vol.28 , Issue.1 , pp. 41-75
    • Caruana, R.1
  • 7
    • 0036832959
    • Structure in the space of value functions
    • Foster, David and Dayan, Peter. Structure in the space of value functions. Machine Learning, 49(2-3):325-346, 2002.
    • (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 325-346
    • Foster, D.1    Dayan, P.2
  • 11
    • 84868358933
    • Reinforcement learning to adjust parametrized motor primitives to new situations
    • Kober, Jens, Wilhelm, Andreas, Oztop, Erhan, and Peters, Jan. Reinforcement learning to adjust parametrized motor primitives to new situations. Autonomous Robots, 33(4):361-379, 2012.
    • (2012) Autonomous Robots , vol.33 , Issue.4 , pp. 361-379
    • Kober, J.1    Wilhelm, A.2    Oztop, E.3    Peters, J.4
  • 15
    • 84896357393
    • Multi-timescale nexting in a reinforcement learning robot
    • Modayil, Joseph, White, Adam, and Sutton, Richard S. Multi-timescale nexting in a reinforcement learning robot. Adaptive Behavior, 22(2):146-160, 2014.
    • (2014) Adaptive Behavior , vol.22 , Issue.2 , pp. 146-160
    • Modayil, J.1    White, A.2    Sutton, R.S.3
  • 16
    • 4644328593
    • Off-policy temporal-difference learning with function approximation
    • Citeseer
    • Precup, Doina, Sutton, Richard S, and Dasgupta, Sanjoy. Off-policy temporal-difference learning with function approximation. In ICML, pp. 417-424. Citeseer, 2001.
    • (2001) ICML , pp. 417-424
    • Precup, D.1    Sutton, R.S.2    Dasgupta, S.3
  • 19
    • 1942452236
    • Learning predictive state representations
    • Singh, Satinder, Littman, Michael L, Jong, Nicholas K, Pardoe, David, and Stone, Peter. Learning predictive state representations. In ICML, pp. 712-719, 2003.
    • (2003) ICML , pp. 712-719
    • Singh, S.1    Littman, M.L.2    Jong, N.K.3    Pardoe, D.4    Stone, P.5
  • 21
    • 84899003536
    • Temporal-difference networks
    • Saul, L.K., Weiss, Y., and Bottou, L. (eds.), MIT Press
    • Sutton, Richard S and Tanner, Brian. Temporal-difference networks. In Saul, L.K., Weiss, Y., and Bottou, L. (eds.), Advances in Neural Information Processing Systems 17, pp. 1377-1384. MIT Press, 2005.
    • (2005) Advances in Neural Information Processing Systems , vol.17 , pp. 1377-1384
    • Sutton, R.S.1    Tanner, B.2
  • 22
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton, Richard S, Precup, Doina, and Singh, Satinder. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1):181-211, 1999.
    • (1999) Artificial Intelligence , vol.112 , Issue.1 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.