Volume 20, Issue 10, 2005, Pages 1037-1052

Concurrent Q-learning: Reinforcement learning for dynamic goals and environments

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; INFORMATION ANALYSIS; PROBLEM SOLVING;

EID: 27844586782     PISSN: 08848173     EISSN: None     Source Type: Journal
DOI: 10.1002/int.20105     Document Type: Article
Times cited: 11

References (27)
  • 1. Sutton RS. Learning to predict by the methods of temporal differences. Mach Learn 1988;3:9-44.
  • 2. Foster DJ, Morris RGM, Dayan P. A model of hippocampally dependent navigation, using the temporal difference learning rule. Hippocampus 2000;10(1):1-16.
  • 4. Morris RGM. Spatial localization does not require the presence of local cues. Learn Motiv 1981;12:239-260.
  • 5. Steele RJ, Morris RGM. Delay-dependent impairment of a matching-to-place task with chronic and intrahippocampal infusion of the NMDA antagonist D-AP5. Hippocampus 1999;9(2):118-136.
  • 6. O'Keefe J. Place units in the hippocampus of the freely moving rat. Exp Neurol 1976;51:78-109.
  • 7. O'Keefe J, Dostrovsky J. The hippocampus as a spatial map. Preliminary evidence from the unit activity in the freely-moving rat. Brain Res 1971;34:171-175.
  • 8. Muller RU, Kubie JL. The firing of hippocampal place cells predicts the future position of freely moving rats. J Neurosci 1989;9(12):4101-4110.
  • 9. Joel D, Niv Y, Ruppin E. Actor-critic models of the basal ganglia: New anatomical and computational perspectives. Neural Netw 2002;15:535-547.
  • 10. Schultz W, Dickinson A. Neuronal coding of prediction errors. Annu Rev Neurosci 2000;23:473-500.
  • 11. Schultz W, Tremblay L, Hollerman JR. Reward processing in primate orbitofrontal cortex and basal ganglia. Cereb Cortex 2000;10:272-283.
  • 12. Barto AG, Sutton RS, Anderson CW. Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans Syst Man Cybern 1983;13(5):834-846.
  • 14. Dayan P, Hinton GE. Feudal reinforcement learning. In: Giles CL, Hanson SJ, Cowan JD, editors. Advances in neural information processing systems 5. San Mateo, CA: Morgan Kaufmann; 1993.
  • 15. Dietterich TG. The MAXQ method for hierarchical reinforcement learning. In: Proc Fifteenth Int Conf on Machine Learning. San Francisco, CA: Morgan Kaufmann; 1998. pp 118-126.
  • 16. Digney BL. Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments. In: Proc Fourth Conf on Simulation of Adaptive Behavior. Cambridge, MA: MIT Press; 1996.
  • 17. Drummond C. Accelerating reinforcement learning by composing solutions of automatically identified subtasks. J Artif Intell Res 2002;16:59-104.
  • 18. Parr R, Russell S. Reinforcement learning with hierarchies of machines. In: Jordan MI, Kearns MJ, Solla SA, editors. Advances in neural information processing systems. Cambridge, MA: The MIT Press; 1997.
  • 19. Singh SP. Reinforcement learning with a hierarchy of abstract models. In: Proc Tenth National Conf on Artificial Intelligence, San Jose, CA; 1992. pp 202-207.
  • 20. Ollington RB, Vamplew PW. Concurrent Q-learning for autonomous mapping and navigation. In: Vadakkepat P, Tan WW, Tan WC, Loh AP, editors. Proc 2nd Int Conf on Computational Intelligence, Robotics and Autonomous Systems. Singapore: National University of Singapore; 2003.
  • 22. Singh SP, Sutton RS. Reinforcement learning with replacing eligibility traces. Mach Learn 1996;22:123-158.
  • 25. Peng J, Williams RJ. Incremental multi-step Q-learning. Mach Learn 1996;22(1-3):283-290.
  • 27. Kaelbling LP. Hierarchical learning in stochastic domains: Preliminary results. In: Proc Tenth Int Conf on Machine Learning. San Francisco, CA: Morgan Kaufmann; 1993. pp 167-173.


* This information was extracted by KISTI from Elsevier's SCOPUS database.