Volume 1, 2010, Pages 5-12

Combining manual feedback with subsequent MDP reward signals for reinforcement learning

Author keywords

Human teachers; Human agent interaction; Reinforcement learning; Shaping

Indexed keywords

FEEDBACK; INTELLIGENT AGENTS; LEARNING ALGORITHMS; MARKOV PROCESSES; MULTI AGENT SYSTEMS; PERSONNEL TRAINING; REINFORCEMENT LEARNING; TEACHING;

EID: 84884357468     PISSN: 1548-8403     EISSN: 1558-2914     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 200

References (19)
  • 3
    • J. Boyan and A. Moore. Generalization in reinforcement learning: Safely approximating the value function. NIPS, 1995.
  • 4
    • M. Dorigo and M. Colombetti. Robot shaping: Developing situated agents through learning. Artificial Intelligence, 70(2):321-370, 1994.
  • 5
    • F. Fernandez and M. Veloso. Probabilistic policy reuse in a reinforcement learning agent. AAMAS, 2006.
  • 11
    • M. Mataric. Reward functions for accelerated learning. ICML, 1994.
  • 12
    • A. Ng, D. Harada, and S. Russell. Policy invariance under reward transformations: Theory and application to reward shaping. ICML, 1999.
  • 13
    • B. Price and C. Boutilier. Accelerating reinforcement learning through implicit imitation. JAIR, 19:569-629, 2003.
  • 15
    • B. Tanner and A. White. RL-Glue: Language-independent software for reinforcement-learning experiments. JMLR, 10, 2009.
  • 16
    • M. E. Taylor, N. K. Jong, and P. Stone. Transferring instances for model-based reinforcement learning. ECML PKDD, 2008.
  • 17
    • M. E. Taylor and P. Stone. Cross-domain transfer for reinforcement learning. ICML, 2007.
  • 18
    • M. E. Taylor, P. Stone, and Y. Liu. Transfer learning via inter-task mappings for temporal difference learning. JMLR, 8(1):2125-2167, 2007.
  • 19
    • A. Thomaz and C. Breazeal. Reinforcement learning with human teachers: Evidence of feedback and guidance with implications for learning performance. AAAI, 2006.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.