Volume 21, Issue 10, 2007, Pages 1177-1199

Incremental acquisition of behaviors and signs based on a reinforcement learning schemata model and a spike timing-dependent plasticity network

Author keywords

Modular learning; Operant conditioning; Reinforcement learning; Spike timing dependent plasticity; Symbol emergence

Indexed keywords

BRAIN; COMPUTER ARCHITECTURE; MATHEMATICAL MODELS; NEUROLOGY

EID: 34547521549     PISSN: 01691864     EISSN: 15685535     Source Type: Journal    
DOI: 10.1163/156855307781389374     Document Type: Article
Times cited : (18)

References (46)
  • 4
    • 34249833101
    • Technical note: Q-learning
    • C. Watkins and P. Dayan, Technical note: Q-learning, Machine Learn. 8, 279-292 (1992).
    • (1992) Machine Learn , vol.8 , pp. 279-292
    • Watkins, C.1    Dayan, P.2
  • 5
    • 0033629916
    • Reinforcement learning in continuous time and space
    • K. Doya, Reinforcement learning in continuous time and space, Neural Comput. 12, 219-245 (2000).
    • (2000) Neural Comput , vol.12 , pp. 219-245
    • Doya, K.1
  • 6
    • 0001027894
    • Transfer of learning by composing solutions of elemental sequential tasks
    • S. P. Singh, Transfer of learning by composing solutions of elemental sequential tasks, Machine Learning Arch. 8, 323-339 (1992).
    • (1992) Machine Learning Arch , vol.8 , pp. 323-339
    • Singh, S.P.1
  • 7
    • 0000541213
    • Adaptive critics and the basal ganglia
    • A. G. Barto, Adaptive critics and the basal ganglia, in: Models of Information Processing in the Basal Ganglia (J. C. Houk, J. L. Davis and D. G. Beiser, Eds), pp. 215-232. MIT Press, Cambridge, MA (1995).
    • (1995) Models of Information Processing in the Basal Ganglia , pp. 215-232
    • Barto, A.G.1
  • 8
    • 0033213819
    • What are the computations of the cerebellum, the basal ganglia, and the cerebral cortex?
    • K. Doya, What are the computations of the cerebellum, the basal ganglia, and the cerebral cortex?, Neural Networks 12, 961-974 (1999).
    • (1999) Neural Networks , vol.12 , pp. 961-974
    • Doya, K.1
  • 9
    • 3343026029
    • Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops
    • S. C. Tanaka, K. Doya, G. Okada, K. Ueda, Y. Okamoto and S. Yamawaki, Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops, Nat. Neurosci. 7, 887-893 (2004).
    • (2004) Nat. Neurosci , vol.7 , pp. 887-893
    • Tanaka, S.C.1    Doya, K.2    Okada, G.3    Ueda, K.4    Okamoto, Y.5    Yamawaki, S.6
  • 10
    • 0028302807
    • Responses of tonically active neurons in the primate's striatum undergo systematic changes during behavioral sensorimotor conditioning
    • T. Aosaki, H. Tsubokawa, A. Ishida, K. Watanabe, A. M. Graybiel and M. Kimura, Responses of tonically active neurons in the primate's striatum undergo systematic changes during behavioral sensorimotor conditioning, J. Neurosci. 14, 3969-3984 (1994).
    • (1994) J. Neurosci , vol.14 , pp. 3969-3984
    • Aosaki, T.1    Tsubokawa, H.2    Ishida, A.3    Watanabe, K.4    Graybiel, A.M.5    Kimura, M.6
  • 11
    • 0031867046
    • Predictive reward signal of dopamine neurons
    • W. Schultz, Predictive reward signal of dopamine neurons, J. Neurophysiol. 80, 1-27 (1998).
    • (1998) J. Neurophysiol , vol.80 , pp. 1-27
    • Schultz, W.1
  • 14
    • 34547500226
    • Incremental acquisition of behavioral concepts through social interactions with a caregiver
    • T. Taniguchi and T. Sawaragi, Incremental acquisition of behavioral concepts through social interactions with a caregiver, in: Proc. Artificial Life and Robotics (2006).
    • (2006) Proc. Artificial Life and Robotics
    • Taniguchi, T.1    Sawaragi, T.2
  • 17
    • 0035976319
    • Cognitive developmental robotics as a new paradigm for the design of humanoid robots
    • M. Asada, K. F. MacDorman, H. Ishiguro and Y. Kuniyoshi, Cognitive developmental robotics as a new paradigm for the design of humanoid robots, Robotics Autonomous Syst. 37, 185-193 (2001).
    • (2001) Robotics Autonomous Syst , vol.37 , pp. 185-193
    • Asada, M.1    MacDorman, K.F.2    Ishiguro, H.3    Kuniyoshi, Y.4
  • 18
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • R. S. Sutton, D. Precup and S. Singh, Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning, Artif. Intell. 112, 181-211 (1999).
    • (1999) Artif. Intell , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3
  • 21
    • 34547512431
    • Modular learning system and scheduling for behavior acquisition in multiagent environment
    • CD-ROM
    • Y. Takahashi et al., Modular learning system and scheduling for behavior acquisition in multiagent environment, in: RoboCup 2004 Symp. Papers and Team Description Papers, CD-ROM (2004).
    • (2004) RoboCup 2004 Symp. Papers and Team Description Papers
    • Takahashi, Y.1
  • 22
    • 0031012615
    • Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs
    • H. Markram et al., Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs, Science 275, 213-215 (1997).
    • (1997) Science , vol.275 , pp. 213-215
    • Markram, H.1
  • 23
    • 0033667165
    • Synaptic plasticity: Taming the beast
    • L. F. Abbott and S. B. Nelson, Synaptic plasticity: taming the beast, Nat. Neurosci. Suppl. 3, 1178-1182 (2000).
    • (2000) Nat. Neurosci. Suppl , vol.3 , pp. 1178-1182
    • Abbott, L.F.1    Nelson, S.B.2
  • 24
    • 2542497002
    • Synaptic regulation on various STDP rules
    • Y. Sakai, K. Nakano and S. Yoshizawa, Synaptic regulation on various STDP rules, Neurocomputing 58-60, 351-357 (2004).
    • (2004) Neurocomputing , vol.58-60 , pp. 351-357
    • Sakai, Y.1    Nakano, K.2    Yoshizawa, S.3
  • 25
    • 0032192424
    • Multiple paired forward and inverse models for motor control
    • D. M. Wolpert and M. Kawato, Multiple paired forward and inverse models for motor control, Neural Networks 11, 1317-1329 (1998).
    • (1998) Neural Networks , vol.11 , pp. 1317-1329
    • Wolpert, D.M.1    Kawato, M.2
  • 26
    • 9144234306
    • Self-organization of distributedly represented multiple behavior schemata in a mirror system: Reviews of robots using RNNPB
    • J. Tani, M. Ito and Y. Sugita, Self-organization of distributedly represented multiple behavior schemata in a mirror system: reviews of robots using RNNPB, Neural Networks 17, 1273-1289 (2004).
    • (2004) Neural Networks , vol.17 , pp. 1273-1289
    • Tani, J.1    Ito, M.2    Sugita, Y.3
  • 27
    • 0001940458
    • Adaptive mixtures of local experts
    • R. A. Jacobs, M. I. Jordan et al., Adaptive mixtures of local experts, Neural Comput. 3, 79-87 (1991).
    • (1991) Neural Comput , vol.3 , pp. 79-87
    • Jacobs, R.A.1    Jordan, M.I.2
  • 28
    • 20444493133
    • Design and performance of symbols self-organized within an autonomous agent interacting with varied environments
    • CD-ROM
    • T. Taniguchi and T. Sawaragi, Design and performance of symbols self-organized within an autonomous agent interacting with varied environments, in: Proc. IEEE Int. Workshop on ROMAN, CD-ROM (2004).
    • (2004) Proc. IEEE Int. Workshop on ROMAN
    • Taniguchi, T.1    Sawaragi, T.2
  • 29
    • 15744397350
    • Self-organization of inner symbols for chase: Symbol organization and embodiment
    • CD-ROM
    • T. Taniguchi and T. Sawaragi, Self-organization of inner symbols for chase: symbol organization and embodiment, in: Proc. IEEE Int. Conf. on SMC, CD-ROM (2004).
    • (2004) Proc. IEEE Int. Conf. on SMC
    • Taniguchi, T.1    Sawaragi, T.2
  • 30
    • 0035487297
    • Mosaic model for sensorimotor learning and control
    • M. Haruno, D. M. Wolpert and M. Kawato, Mosaic model for sensorimotor learning and control, Neural Comput. 13, 2201-2220 (2001).
    • (2001) Neural Comput , vol.13 , pp. 2201-2220
    • Haruno, M.1    Wolpert, D.M.2    Kawato, M.3
  • 31
    • 0036618011
    • Multiple model-based reinforcement learning
    • K. Doya et al., Multiple model-based reinforcement learning, Neural Comput. 14, 1347-1369 (2002).
    • (2002) Neural Comput , vol.14 , pp. 1347-1369
    • Doya, K.1
  • 32
    • 0033213813
    • Learning to perceive the world as articulated: An approach for hierarchical learning in sensory-motor systems
    • J. Tani and S. Nolfi, Learning to perceive the world as articulated: an approach for hierarchical learning in sensory-motor systems, Neural Networks 12, 1131-1141 (1999).
    • (1999) Neural Networks , vol.12 , pp. 1131-1141
    • Tani, J.1    Nolfi, S.2
  • 33
    • 0031619316
    • Bayesian Q-learning
    • R. Dearden et al., Bayesian Q-learning, in: Proc. AAAI-98, pp. 761-768 (1998).
    • (1998) Proc. AAAI-98 , pp. 761-768
    • Dearden, R.1
  • 34
    • 25144452832
    • What can a neuron learn with spike-timing-dependent plasticity?
    • R. Legenstein, C. Naeger and W. Maass, What can a neuron learn with spike-timing-dependent plasticity?, Neural Comput. 17, 2337-2382 (2005).
    • (2005) Neural Comput , vol.17 , pp. 2337-2382
    • Legenstein, R.1    Naeger, C.2    Maass, W.3
  • 36
    • 0037362828
    • A stochastic method to predict the consequence of arbitrary forms of spike-timing-dependent plasticity
    • H. Cateau and T. Fukai, A stochastic method to predict the consequence of arbitrary forms of spike-timing-dependent plasticity, Neural Comput. 15, 597-620 (2003).
    • (2003) Neural Comput , vol.15 , pp. 597-620
    • Cateau, H.1    Fukai, T.2
  • 37
    • 84898973877
    • Reducing spike train variability: A computational theory of spike-timing dependent plasticity
    • S. M. Bohte and M. C. Mozer, Reducing spike train variability: a computational theory of spike-timing dependent plasticity, Adv. Neural Inform. Process. Syst. 17, 201-208 (2005).
    • (2005) Adv. Neural Inform. Process. Syst , vol.17 , pp. 201-208
    • Bohte, S.M.1    Mozer, M.C.2
  • 38
    • 0032535029
    • Synaptic modification in cultured hippocampal neurons: Dependence on spike timing, synaptic strength, and postsynaptic cell type
    • G. Q. Bi and M. M. Poo, Synaptic modification in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type, J. Neurosci. 18, 10464-10472 (1998).
    • (1998) J. Neurosci , vol.18 , pp. 10464-10472
    • Bi, G.Q.1    Poo, M.M.2
  • 39
    • 0033860923
    • Competitive Hebbian learning through spike-timing-dependent synaptic plasticity
    • S. Song et al., Competitive Hebbian learning through spike-timing-dependent synaptic plasticity, Nature Neuroscience 3, 919-926 (2000).
    • (2000) Nature Neuroscience , vol.3 , pp. 919-926
    • Song, S.1
  • 40
    • 0242389641
    • Hierarchical mosaic for movement generation
    • M. Haruno, D. M. Wolpert and M. Kawato, Hierarchical mosaic for movement generation, Int. Congr. Ser. 1250, 575-590 (2003).
    • (2003) Int. Congr. Ser , vol.1250 , pp. 575-590
    • Haruno, M.1    Wolpert, D.M.2    Kawato, M.3
  • 41
  • 43
    • 34547497348
    • Webots, http://www.cyberbotics.com (commercial mobile robot simulation software).
  • 44
    • 25444519563
    • The symbol grounding problem
    • S. Harnad, The symbol grounding problem, Physica D 42, 335-346 (1990).
    • (1990) Physica D , vol.42 , pp. 335-346
    • Harnad, S.1
  • 45
    • 34547531313
    • Symbol emergence by combining a reinforcement learning schema model with asymmetric synaptic plasticity
    • T. Taniguchi and T. Sawaragi, Symbol emergence by combining a reinforcement learning schema model with asymmetric synaptic plasticity, in: Proc. 5th Int. Conf. on Development and Learning (2006).
    • (2006) Proc. 5th Int. Conf. on Development and Learning
    • Taniguchi, T.1    Sawaragi, T.2
  • 46
    • 0035950280
    • Cortical development and remapping through spike timing-dependent plasticity
    • S. Song and L. F. Abbott, Cortical development and remapping through spike timing-dependent plasticity, Neuron 32, 339-350 (2001).
    • (2001) Neuron , vol.32 , pp. 339-350
    • Song, S.1    Abbott, L.F.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.