



Volume 5, Issue 3, 2012, Pages 293-311

A survey of inverse reinforcement learning techniques

Author keywords

Artificial intelligence; Inverse reinforcement learning; Learning methods; Reinforcement learning; Reward function

Indexed keywords

COMPLEX PROBLEMS; DESIGN/METHODOLOGY/APPROACH; DYNAMIC ENVIRONMENTS; FUNDAMENTAL THEORY; INVERSE REINFORCEMENT LEARNING; LATEST DEVELOPMENT; LEARNING METHODS; REINFORCEMENT LEARNING TECHNIQUES; REWARD FUNCTION; SEQUENTIAL DECISION MAKING; SUCCINCT REPRESENTATION

EID: 84865146660     PISSN: 1756378X     EISSN: 17563798     Source Type: Journal    
DOI: 10.1108/17563781211255862     Document Type: Article
Times cited: 102

References (58)
  • 2
    • 77955809093
    • Autonomous helicopter aerobatics through apprenticeship learning
    • Abbeel, P., Coates, A. and Ng, A. (2010), "Autonomous helicopter aerobatics through apprenticeship learning" in International Journal of Robotics Research, Vol. 29, No. 13, pp. 1608-39.
    • (2010) International Journal of Robotics Research , vol.29 , Issue.13 , pp. 1608-1639
    • Abbeel, P.1    Coates, A.2    Ng, A.3
  • 6
    • 0000396062
    • Natural gradient works efficiently in learning
    • Amari, S. (1998), "Natural gradient works efficiently in learning" in Neural Computation, Vol. 10, No. 2, pp. 251-76.
    • (1998) Neural Computation , vol.10 , Issue.2 , pp. 251-276
    • Amari, S.1
  • 18
    • 0000030684
    • The expected-utility hypothesis and the measurability of utility
    • Friedman, M. and Savage, L. (1952), "The expected-utility hypothesis and the measurability of utility" in The Journal of Political Economy, No. 6, pp. 463-74.
    • (1952) The Journal of Political Economy , Issue.6 , pp. 463-474
    • Friedman, M.1    Savage, L.2
  • 22
    • 2342632212
    • Solving a huge number of similar tasks: A combination of multi-task learning and a hierarchical Bayesian approach
    • Heskes, T. (1998), "Solving a huge number of similar tasks: a combination of multi-task learning and a hierarchical Bayesian approach" in Proceedings of the 15th International Conference on Machine Learning (ICML'98), pp. 233-41.
    • (1998) Proceedings of the 15th International Conference on Machine Learning (ICML'98) , pp. 233-241
    • Heskes, T.1
  • 23
    • 11944275853
    • Information theory and statistical mechanics
    • Jaynes, E. (1957), "Information theory and statistical mechanics" in Physical Review, Vol. 108, No. 2, p. 171.
    • (1957) Physical Review , vol.108 , Issue.2 , pp. 171
    • Jaynes, E.1
  • 27
    • 77953327625
    • Imitation and reinforcement learning, practical algorithms for motor primitives in robotics
    • Kober, J. and Peters, J. (2010), "Imitation and reinforcement learning, practical algorithms for motor primitives in robotics" in Robotics and Automation Magazine, IEEE, Vol. 17, No. 2, pp. 55-62.
    • (2010) Robotics and Automation Magazine, IEEE , vol.17 , Issue.2 , pp. 55-62
    • Kober, J.1    Peters, J.2
  • 28
    • 85162069513
    • Hierarchical apprenticeship learning with application to quadruped locomotion
    • MIT Press, Cambridge, MA
    • Kolter, J., Abbeel, P. and Ng, A. (2008), "Hierarchical apprenticeship learning with application to quadruped locomotion", Advances in Neural Information Processing Systems, MIT Press, Cambridge, MA.
    • (2008) Advances in Neural Information Processing Systems
    • Kolter, J.1    Abbeel, P.2    Ng, A.3
  • 31
  • 33
  • 34
    • 84865148144
    • A survey of POMDP solution techniques
    • Murphy, K. (2000), "A survey of POMDP solution techniques" in Environment, Vol. 2, p. X3.
    • (2000) Environment , vol.2
    • Murphy, K.1
  • 37
    • 0003212629
    • Efficient training of artificial neural networks for autonomous navigation
    • Pomerleau, D. (1991), "Efficient training of artificial neural networks for autonomous navigation" in Neural Computation, Vol. 3, No. 1, pp. 88-97.
    • (1991) Neural Computation , vol.3 , Issue.1 , pp. 88-97
    • Pomerleau, D.1
  • 39
    • 80053156567
    • Inverse reinforcement learning with Gaussian process
    • Qiao, Q. and Beling, P. (2011), "Inverse reinforcement learning with Gaussian process" in American Control Conference (ACC), pp. 113-18.
    • (2011) American Control Conference (ACC) , pp. 113-118
    • Qiao, Q.1    Beling, P.2
  • 42
    • 67650957592
    • Learning to search: Functional gradient techniques for imitation learning
    • Ratliff, N., Silver, D. and Bagnell, J. (2009), "Learning to search: functional gradient techniques for imitation learning" in Autonomous Robots, No. 1, pp. 25-53.
    • (2009) Autonomous Robots , Issue.1 , pp. 25-53
    • Ratliff, N.1    Silver, D.2    Bagnell, J.3
  • 46
    • 0033151712
    • Is imitation learning the route to humanoid robots?
    • Schaal, S. (1999), "Is imitation learning the route to humanoid robots?" in Trends in Cognitive Sciences, Vol. 3, No. 6, pp. 233-42.
    • (1999) Trends in Cognitive Sciences , vol.3 , Issue.6 , pp. 233-242
    • Schaal, S.1
  • 47
    • 78650179844
    • Modified reward function on abstract features in inverse reinforcement learning
    • Springer
    • Shen-yi, C., Hui, Q., Jia, F., Zhuo-jun, J. and Miao-liang, Z. (2010), "Modified reward function on abstract features in inverse reinforcement learning" in Journal of Zhejiang University - Science C, Vol. 11, No. 9, pp. 718-23.
    • (2010) Journal of Zhejiang University - Science C , vol.11 , Issue.9 , pp. 718-723
    • Shen-yi, C.1    Hui, Q.2    Jia, F.3    Zhuo-jun, J.4    Miao-liang, Z.5
  • 49
    • 77957947591
    • Learning from demonstration for autonomous navigation in complex unstructured terrain
    • Silver, D., Bagnell, J. and Stentz, A. (2010), "Learning from demonstration for autonomous navigation in complex unstructured terrain" in The International Journal of Robotics Research, Vol. 29, No. 12, p. 1565.
    • (2010) The International Journal of Robotics Research , vol.29 , Issue.12 , pp. 1565
    • Silver, D.1    Bagnell, J.2    Stentz, A.3
  • 50
    • 79957999943
    • Perceptual interpretation for autonomous navigation through dynamic imitation learning
    • Silver, D., Bagnell, J. and Stentz, A. (2011), "Perceptual interpretation for autonomous navigation through dynamic imitation learning" in International Symposium on Robotics Research, pp. 433-49.
    • (2011) International Symposium on Robotics Research , pp. 433-449
    • Silver, D.1    Bagnell, J.2    Stentz, A.3
  • 53
    • 85162012324
    • A game-theoretic approach to apprenticeship learning
    • MIT Press, Cambridge, MA
    • Syed, U. and Schapire, R. (2008), "A game-theoretic approach to apprenticeship learning" in Advances in Neural Information Processing Systems, MIT Press, Cambridge, MA, pp. 1449-56.
    • (2008) Advances in Neural Information Processing Systems , pp. 1449-1456
    • Syed, U.1    Schapire, R.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.