SCOPUS Information Search Platform

Volume 21, Issue 10, 2007, Pages 1177-1199

Incremental acquisition of behaviors and signs based on a reinforcement learning schemata model and a spike timing-dependent plasticity network

(2) Taniguchi, Tadahiro (a); Sawaragi, Tetsuo (a)

(a) KYOTO UNIVERSITY (Japan)

Author keywords

Modular learning; Operant conditioning; Reinforcement learning; Spike timing dependent plasticity; Symbol emergence

Indexed keywords

BRAIN; COMPUTER ARCHITECTURE; MATHEMATICAL MODELS; NEUROLOGY; MODULAR LEARNING; OPERANT CONDITIONING; SPIKE TIMING-DEPENDENT PLASTICITY; SYMBOL EMERGENCE; REINFORCEMENT LEARNING

EID: 34547521549 PISSN: 01691864 EISSN: 15685535 Source Type: Journal
DOI: 10.1163/156855307781389374 Document Type: Article

Times cited : (18)

References (46)

1
- 0004183870
- Scott, Foresman, Glenview, IL
- G. S. Reynolds, A Primer of Operant Conditioning. Scott, Foresman, Glenview, IL (1975).
- (1975) A Primer of Operant Conditioning
- Reynolds, G.S.¹

2
- 0004102479
- MIT Press, Cambridge, MA
- R. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA (1998).
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.G.²

3
- 0003487482
- Athena Scientific, Nashua, NH
- D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming. Athena Scientific, Nashua, NH (1996).
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

4
- 34249833101
- Technical note: Q-learning
- C. Watkins and P. Dayan, Technical note: Q-learning, Machine Learn. 8, 279-292 (1992).
- (1992) Machine Learn , vol.8 , pp. 279-292
- Watkins, C.¹ Dayan, P.²

5
- 0033629916
- Reinforcement learning in continuous time and space
- K. Doya, Reinforcement learning in continuous time and space, Neural Comput. 12, 219-245 (2000).
- (2000) Neural Comput , vol.12 , pp. 219-245
- Doya, K.¹

6
- 0001027894
- Transfer of learning by composing solutions of elemental sequential tasks
- S. P. Singh, Transfer of learning by composing solutions of elemental sequential tasks, Machine Learning Arch. 8, 323-339 (1992).
- (1992) Machine Learning Arch , vol.8 , pp. 323-339
- Singh, S.P.¹

7
- 0000541213
- Adaptive critics and the basal ganglia
- J. C. Houk, J. L. Davis and D. G. Beiser, Eds, MIT Press, Cambridge, MA
- A. G. Barto, Adaptive critics and the basal ganglia, in: Models of Information Processing in the Basal Ganglia (J. C. Houk, J. L. Davis and D. G. Beiser, Eds), pp. 215-232. MIT Press, Cambridge, MA (1995).
- (1995) Models of Information Processing in the Basal Ganglia , pp. 215-232
- Barto, A.G.¹

8
- 0033213819
- What are the computations of the cerebellum, the basal ganglia, and the cerebral cortex?
- K. Doya, What are the computations of the cerebellum, the basal ganglia, and the cerebral cortex?, Neural Networks 12, 961-974 (1999).
- (1999) Neural Networks , vol.12 , pp. 961-974
- Doya, K.¹

9
- 3343026029
- Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops
- S. C. Tanaka, K. Doya, G. Okada, K. Ueda, Y. Okamoto and S. Yamawaki, Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops, Nat. Neurosci. 7, 887-893 (2004).
- (2004) Nat. Neurosci , vol.7 , pp. 887-893
- Tanaka, S.C.¹ Doya, K.² Okada, G.³ Ueda, K.⁴ Okamoto, Y.⁵ Yamawaki, S.⁶

10
- 0028302807
- Responses of tonically active neurons in the primate's striatum undergo systematic changes during behavioral sensorimotor conditioning
- T. Aosaki, H. Tsubokawa, A. Ishida, K. Watanabe, A. M. Graybiel and M. Kimura, Responses of tonically active neurons in the primate's striatum undergo systematic changes during behavioral sensorimotor conditioning, J. Neurosci. 14, 3969-3984 (1994).
- (1994) J. Neurosci , vol.14 , pp. 3969-3984
- Aosaki, T.¹ Tsubokawa, H.² Ishida, A.³ Watanabe, K.⁴ Graybiel, A.M.⁵ Kimura, M.⁶

11
- 0031867046
- Predictive reward signal of dopamine neurons
- W. Schultz, Predictive reward signal of dopamine neurons, J. Neurophysiol. 80, 1-27 (1998).
- (1998) J. Neurophysiol , vol.80 , pp. 1-27
- Schultz, W.¹

12
- 34547502130
- A neural substrate of prediction and reward
- W. Schultz, P. Dayan and P. R. Montague, A neural substrate of prediction and reward, Annu. Rev. Neurosci. 15, 353 (1992).
- (1992) Annu. Rev. Neurosci , vol.15 , pp. 353
- Schultz, W.¹ Dayan, P.² Montague, P.R.³

13
- 0028564629
- Acting optimally in partially observable stochastic domains
- Seattle, WA
- A. R. Cassandra, L. P. Kaelbling and M. L. Littman, Acting optimally in partially observable stochastic domains, in: Proc. 12th Natl. Conf. on Artificial Intelligence, Seattle, WA, vol. 2, pp. 1023-1028 (1994).
- (1994) Proc. 12th Natl. Conf. on Artificial Intelligence , vol.2 , pp. 1023-1028
- Cassandra, A.R.¹ Kaelbling, L.P.² Littman, M.L.³

14
- 34547500226
- Incremental acquisition of behavioral concepts through social interactions with a caregiver
- T. Taniguchi and T. Sawaragi, Incremental acquisition of behavioral concepts through social interactions with a caregiver, in: Proc. Artificial Life and Robotics (2006).
- (2006) Proc. Artificial Life and Robotics
- Taniguchi, T.¹ Sawaragi, T.²

15
- 0004230131
- Wiley, New York, NY
- D. O. Hebb, The Organization of Behavior. Wiley, New York, NY (1949).
- (1949) The Organization of Behavior
- Hebb, D.O.¹

16
- 34547511317
- Aibo official site: http://www.jp.aibo.com/.
- Aibo official site

17
- 0035976319
- Cognitive developmental robotics as a new paradigm for the design of humanoid robots
- M. Asada, K. F. MacDorman, H. Ishiguro and Y. Kuniyoshi, Cognitive developmental robotics as a new paradigm for the design of humanoid robots, Robotics Autonomous Syst. 37, 185-193 (2001).
- (2001) Robotics Autonomous Syst , vol.37 , pp. 185-193
- Asada, M.¹ MacDorman, K.F.² Ishiguro, H.³ Kuniyoshi, Y.⁴

18
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- R. S. Sutton, D. Precup and S. Singh, Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning, Artif. Intell. 112, 181-211 (1999).
- (1999) Artif. Intell , vol.112 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.³

19
- 33749651693
- Intrinsically motivated learning of hierarchical collections of skills
- A. G. Barto, S. Singh and N. Chentanez, Intrinsically motivated learning of hierarchical collections of skills, in: Proc. Int. Conf. on Development and Learning (2004).
- (2004) Proc. Int. Conf. on Development and Learning
- Barto, A.G.¹ Singh, S.² Chentanez, N.³

20
- 80055036368
- Reinforcement learning of hierarchical skills on the Sony Aibo robot
- V. Soni and S. Singh, Reinforcement learning of hierarchical skills on the Sony Aibo robot, in: Proc. Int. Conf. on Development and Learning (2006).
- (2006) Proc. Int. Conf. on Development and Learning
- Soni, V.¹ Singh, S.²

21
- 34547512431
- Modular learning system and scheduling for behavior acquisition in multiagent environment
- CD-ROM
- Y. Takahashi et al., Modular learning system and scheduling for behavior acquisition in multiagent environment, in: RoboCup 2004 Symp. Papers and Team Description Papers, CD-ROM (2004).
- (2004) RoboCup 2004 Symp. Papers and Team Description Papers
- Takahashi, Y.¹

22
- 0031012615
- Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs
- H. Markram et al., Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs, Science 275, 213-215 (1997).
- (1997) Science , vol.275 , pp. 213-215
- Markram, H.¹

23
- 0033667165
- Synaptic plasticity: Taming the beast
- L. F. Abbott and S. B. Nelson, Synaptic plasticity: taming the beast, Nat. Neurosci. Suppl. 3, 1178-1182 (2000).
- (2000) Nat. Neurosci. Suppl , vol.3 , pp. 1178-1182
- Abbott, L.F.¹ Nelson, S.B.²

24
- 2542497002
- Synaptic regulation on various STDP rules
- Y. Sakai, K. Nakano and S. Yoshizawa, Synaptic regulation on various STDP rules, Neurocomputing 58-60, 351-357 (2004).
- (2004) Neurocomputing , vol.58-60 , pp. 351-357
- Sakai, Y.¹ Nakano, K.² Yoshizawa, S.³

25
- 0032192424
- Multiple paired forward and inverse models for motor control
- D. M. Wolpert and M. Kawato, Multiple paired forward and inverse models for motor control, Neural Networks 11, 1317-1329 (1998).
- (1998) Neural Networks , vol.11 , pp. 1317-1329
- Wolpert, D.M.¹ Kawato, M.²

26
- 9144234306
- Self-organization of distributedly represented multiple behavior schemata in a mirror system: Reviews of robots using RNNPB
- J. Tani, M. Ito and Y. Sugita, Self-organization of distributedly represented multiple behavior schemata in a mirror system: reviews of robots using RNNPB, Neural Networks 17, 1273-1289 (2004).
- (2004) Neural Networks , vol.17 , pp. 1273-1289
- Tani, J.¹ Ito, M.² Sugita, Y.³

27
- 0001940458
- Adaptive mixtures of local experts
- R. A. Jacobs, M. I. Jordan et al., Adaptive mixtures of local experts, Neural Comput. 3, 79-87 (1991).
- (1991) Neural Comput , vol.3 , pp. 79-87
- Jacobs, R.A.¹ Jordan, M.I.²

28
- 20444493133
- Design and performance of symbols self-organized within an autonomous agent interacting with varied environments
- CD-ROM
- T. Taniguchi and T. Sawaragi, Design and performance of symbols self-organized within an autonomous agent interacting with varied environments, in: Proc. IEEE Int. Workshop on ROMAN, CD-ROM (2004).
- (2004) Proc. IEEE Int. Workshop on ROMAN
- Taniguchi, T.¹ Sawaragi, T.²

29
- 15744397350
- Self-organization of inner symbols for chase: Symbol organization and embodiment
- CD-ROM
- T. Taniguchi and T. Sawaragi, Self-organization of inner symbols for chase: symbol organization and embodiment, in: Proc. IEEE Int. Conf. on SMC, CD-ROM (2004).
- (2004) Proc. IEEE Int. Conf. on SMC
- Taniguchi, T.¹ Sawaragi, T.²

30
- 0035487297
- MOSAIC model for sensorimotor learning and control
- M. Haruno, D. M. Wolpert and M. Kawato, MOSAIC model for sensorimotor learning and control, Neural Comput. 13, 2201-2220 (2001).
- (2001) Neural Comput , vol.13 , pp. 2201-2220
- Haruno, M.¹ Wolpert, D.M.² Kawato, M.³

31
- 0036618011
- Multiple model-based reinforcement learning
- K. Doya et al., Multiple model-based reinforcement learning, Neural Comput. 14, 1347-1369 (2000).
- (2000) Neural Comput , vol.14 , pp. 1347-1369
- Doya, K.¹

32
- 0033213813
- Learning to perceive the world as articulated: An approach for hierarchical learning in sensory-motor systems
- J. Tani and S. Nolfi, Learning to perceive the world as articulated: an approach for hierarchical learning in sensory-motor systems, Neural Networks 12, 1131-1141 (1999).
- (1999) Neural Networks , vol.12 , pp. 1131-1141
- Tani, J.¹ Nolfi, S.²

33
- 0031619316
- Bayesian Q-learning
- R. Dearden et al., Bayesian Q-learning, in: Proc. AAAI-98, pp. 761-768 (1998).
- (1998) Proc. AAAI-98 , pp. 761-768
- Dearden, R.¹

34
- 25144452832
- What can a neuron learn with spike-timing-dependent plasticity?
- R. Legenstein, C. Naeger and W. Maass, What can a neuron learn with spike-timing-dependent plasticity?, Neural Comput. 17, 2337-2382 (2005).
- (2005) Neural Comput , vol.17 , pp. 2337-2382
- Legenstein, R.¹ Naeger, C.² Maass, W.³

35
- 0037162524
- Dynamical model of long-term synaptic plasticity
- H. D. I. Abarbanel, R. Huerta and M. I. Rabinovich, Dynamical model of long-term synaptic plasticity, Proc. Natl. Acad. Sci. USA 99, 10132-10137 (2002).
- (2002) Proc. Natl. Acad. Sci. USA , vol.99 , pp. 10132-10137
- Abarbanel, H.D.I.¹ Huerta, R.² Rabinovich, M.I.³

36
- 0037362828
- A stochastic method to predict the consequence of arbitrary forms of spike-timing-dependent plasticity
- H. Cateau and T. Fukai, A stochastic method to predict the consequence of arbitrary forms of spike-timing-dependent plasticity, Neural Comput. 15, 597-620 (2003).
- (2003) Neural Comput , vol.15 , pp. 597-620
- Cateau, H.¹ Fukai, T.²

37
- 84898973877
- Reducing spike train variability: A computational theory of spike-timing dependent plasticity
- S. M. Bohte and M. C. Mozer, Reducing spike train variability: a computational theory of spike-timing dependent plasticity, Adv. Neural Inform. Process. Syst. 17, 201-208 (2005).
- (2005) Adv. Neural Inform. Process. Syst , vol.17 , pp. 201-208
- Bohte, S.M.¹ Mozer, M.C.²

38
- 0032535029
- Synaptic modification in cultured hippocampal neurons: Dependence on spike timing, synaptic strength, and postsynaptic cell type
- G. Q. Bi and M. M. Poo, Synaptic modification in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type, J. Neurosci. 18, 10464-10472 (1998).
- (1998) J. Neurosci , vol.18 , pp. 10464-10472
- Bi, G.Q.¹ Poo, M.M.²

39
- 0033860923
- Competitive Hebbian learning through spike-timing-dependent synaptic plasticity
- S. Song et al., Competitive Hebbian learning through spike-timing-dependent synaptic plasticity, Nature Neuroscience 3, 919-926 (2000).
- (2000) Nature Neuroscience , vol.3 , pp. 919-926
- Song, S.¹

40
- 0242389641
- Hierarchical MOSAIC for movement generation
- M. Haruno, D. M. Wolpert and M. Kawato, Hierarchical MOSAIC for movement generation, Int. Congr. Ser. 1250, 575-590 (2003).
- (2003) Int. Congr. Ser , vol.1250 , pp. 575-590
- Haruno, M.¹ Wolpert, D.M.² Kawato, M.³

41
- 85156220253
- Constructive algorithms for hierarchical mixtures of experts
- S. R. Waterhouse and A. J. Robinson, Constructive algorithms for hierarchical mixtures of experts, Adv. Neural Inform. Process. Syst. 8, 584-590 (1996).
- (1996) Adv. Neural Inform. Process. Syst , vol.8 , pp. 584-590
- Waterhouse, S.R.¹ Robinson, A.J.²

42
- 33745434243
- Lexicon acquisition based on behavior learning
- S. Takamuku, Y. Takahashi and M. Asada, Lexicon acquisition based on behavior learning, in: Proc. 4th IEEE Int. Conf. on Development and Learning (2005).
- (2005) Proc. 4th IEEE Int. Conf. on Development and Learning
- Takamuku, S.¹ Takahashi, Y.² Asada, M.³

43
- 34547497348
- Webots, http://www.cyberbotics.com (commercial mobile robot simulation software).
- Webots, http://www.cyberbotics.com (commercial mobile robot simulation software).

44
- 25444519563
- The symbol grounding problem
- S. Harnad, The symbol grounding problem, Physica D 42, 335-346 (1990).
- (1990) Physica D , vol.42 , pp. 335-346
- Harnad, S.¹

45
- 34547531313
- Symbol emergence by combining a reinforcement learning schema model with asymmetric synaptic plasticity
- T. Taniguchi and T. Sawaragi, Symbol emergence by combining a reinforcement learning schema model with asymmetric synaptic plasticity, in: Proc. 5th Int. Conf. on Development and Learning (2006).
- (2006) Proc. 5th Int. Conf. on Development and Learning
- Taniguchi, T.¹ Sawaragi, T.²

46
- 0035950280
- Cortical development and remapping through spike timing-dependent plasticity
- S. Song and L. F. Abbott, Cortical development and remapping through spike timing-dependent plasticity, Neuron 32, 339-350 (2001).
- (2001) Neuron , vol.32 , pp. 339-350
- Song, S.¹ Abbott, L.F.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.