Volume 16, Issue 1, 2003, Pages 5-9

Meta-learning in reinforcement learning

Author keywords

Dopamine; Dynamic environment; Meta learning; Meta parameters; Neuromodulation; Reinforcement learning; TD error

Indexed keywords

ADAPTIVE SYSTEMS; ALGORITHMS; MARKOV PROCESSES; ROBUSTNESS (CONTROL SYSTEMS); SIGNAL ENCODING;

EID: 0037258402     PISSN: 08936080     EISSN: None     Source Type: Journal    
DOI: 10.1016/S0893-6080(02)00228-9     Document Type: Article
Times cited: 216

References (17)
  • 1
    • 0019855733
    • Activity of norepinephrine-containing locus coeruleus neurons in behaving rats anticipates fluctuations in the sleep-waking cycle
    • Aston-Jones G., Bloom F.E. Activity of norepinephrine-containing locus coeruleus neurons in behaving rats anticipates fluctuations in the sleep-waking cycle. Journal of Neuroscience. 1:(8):1981;876-886.
    • (1981) Journal of Neuroscience , vol.1 , Issue.8 , pp. 876-886
    • Aston-Jones, G.1    Bloom, F.E.2
  • 2
    • 0036592008
    • Opponent interactions between serotonin and dopamine
    • Daw N.D., Kakade S., Dayan P. Opponent interactions between serotonin and dopamine. Neural Networks. 15:2002;603-616.
    • (2002) Neural Networks , vol.15 , pp. 603-616
    • Daw, N.D.1    Kakade, S.2    Dayan, P.3
  • 3
    • 0027299420
    • Dopaminergic regulation of cortical acetylcholine release: Effects of dopamine receptor agonists
    • Day J., Fibiger H.C. Dopaminergic regulation of cortical acetylcholine release: effects of dopamine receptor agonists. Neuroscience. 54:(3):1993;643-648.
    • (1993) Neuroscience , vol.54 , Issue.3 , pp. 643-648
    • Day, J.1    Fibiger, H.C.2
  • 4
    • 0033629916
    • Reinforcement learning in continuous time and space
    • Doya K. Reinforcement learning in continuous time and space. Neural Computation. 12:(1):2000;219-245.
    • (2000) Neural Computation , vol.12 , Issue.1 , pp. 219-245
    • Doya, K.1
  • 5
    • 0036592023
    • Metalearning and neuromodulation
    • Doya K. Metalearning and neuromodulation. Neural Networks. 15:2002;495-506.
    • (2002) Neural Networks , vol.15 , pp. 495-506
    • Doya, K.1
  • 6
    • 0025600638
    • A stochastic reinforcement learning algorithm for learning real-valued functions
    • Gullapalli V. A stochastic reinforcement learning algorithm for learning real-valued functions. Neural Networks. 3:1990;671-692.
    • (1990) Neural Networks , vol.3 , pp. 671-692
    • Gullapalli, V.1
  • 7
    • 0034742514
    • D2-like dopamine receptor activation excites rat dorsal raphe 5-HT neurons in vitro
    • Haj-Dahmane S. D2-like dopamine receptor activation excites rat dorsal raphe 5-HT neurons in vitro. European Journal of Neuroscience. 14:(1):2001;125-134.
    • (2001) European Journal of Neuroscience , vol.14 , Issue.1 , pp. 125-134
    • Haj-Dahmane, S.1
  • 8
    • 0036592028
    • Control of exploitation-exploration meta-parameters in reinforcement learning
    • Ishii S., Yoshida W., Yoshimoto J. Control of exploitation-exploration meta-parameters in reinforcement learning. Neural Networks. 15:2002;665-687.
    • (2002) Neural Networks , vol.15 , pp. 665-687
    • Ishii, S.1    Yoshida, W.2    Yoshimoto, J.3
  • 9
    • 0027250812
    • 5-HT and motor control: A hypothesis
    • Jacobs B.L., Fornal C.A. 5-HT and motor control: a hypothesis. Trends in Neurosciences. 16:(9):1993;346-352.
    • (1993) Trends in Neurosciences , vol.16 , Issue.9 , pp. 346-352
    • Jacobs, B.L.1    Fornal, C.A.2
  • 12
    • 0031867046
    • Predictive reward signal of dopamine neurons
    • Schultz W. Predictive reward signal of dopamine neurons. Journal of Neurophysiology. 80:(1):1998;1-27.
    • (1998) Journal of Neurophysiology , vol.80 , Issue.1 , pp. 1-27
    • Schultz, W.1
  • 13
    • 0031939094
    • A model of cerebellar metaplasticity
    • Schweighofer N., Arbib M.A. A model of cerebellar metaplasticity. Learning & Memory. (4):1998;421-428.
    • (1998) Learning & Memory , Issue.4 , pp. 421-428
    • Schweighofer, N.1    Arbib, M.A.2
  • 14
    • 0016045280
    • An opponent process theory of motivation. I. Temporal dynamics of affect
    • Solomon R.L., Corbit J.D. An opponent process theory of motivation. I. Temporal dynamics of affect. Psychological Review. 81:1974;119-145.
    • (1974) Psychological Review , vol.81 , pp. 119-145
    • Solomon, R.L.1    Corbit, J.D.2
  • 15
    • 0026971570
    • Adapting bias by gradient descent: An incremental version of the delta-bar-delta
    • Cambridge, MA: MIT Press
    • Sutton, R. (1992). Adapting bias by gradient descent: an incremental version of the delta-bar-delta. Tenth National Conference on Artificial Intelligence. Cambridge, MA: MIT Press.
    • (1992) Tenth National Conference on Artificial Intelligence
    • Sutton, R.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.