SCOPUS Information Retrieval Platform

Neural Computation

Volume 21, Issue 4, 2009, Pages 1173-1202

On the asymptotic equivalence between differential Hebbian and temporal difference learning

(3) Kolodziejski, Christoph (a); Porr, Bernd (b); Wörgötter, Florentin (a)

a UNIVERSITY OF GÖTTINGEN (Germany)

b UNIVERSITY OF GLASGOW (United Kingdom)

Author keywords

[No Author keywords available]

Indexed keywords

ANIMAL; ARTICLE; BIOLOGICAL MODEL; BIOPHYSICS; COMPUTER SIMULATION; LEARNING; NERVE CELL; PHYSIOLOGY; REINFORCEMENT; TIME;

ANIMALS; BIOPHYSICS; COMPUTER SIMULATION; LEARNING; MODELS, NEUROLOGICAL; NEURONS; REINFORCEMENT (PSYCHOLOGY); TIME FACTORS;

EID: 65549116541 PISSN: 08997667 EISSN: 1530888X Source Type: Journal
DOI: 10.1162/neco.2008.04-08-750 Document Type: Letter

Times cited : (6)

References (42)

1
- 0004370245
- (Tech. Rep. WL-TR-93-1146). Ohio: Wright Laboratory, Wright-Patterson Air Force Base
- Baird, L. (1993). Advantage updating (Tech. Rep. WL-TR-93-1146). Ohio: Wright Laboratory, Wright-Patterson Air Force Base.
- (1993) Advantage updating
- Baird, L.¹

2
- 0013495368
- Experiments with infinite-horizon, policy-gradient estimation
- Baxter, J., Bartlett, P. L., & Weaver, L. (2001). Experiments with infinite-horizon, policy-gradient estimation. Journal of Artificial Intelligence Research, 15, 351-381.
- (2001) Journal of Artificial Intelligence Research , vol.15 , pp. 351-381
- Baxter, J.¹ Bartlett, P.L.² Weaver, L.³

3
- 0036498662
- Matters temporal
- Dayan, P. (2002). Matters temporal. Trends in Cognitive Sciences, 6(3), 105-106.
- (2002) Trends in Cognitive Sciences , vol.6 , Issue.3 , pp. 105-106
- Dayan, P.¹

4
- 0028388685
- TD(λ) converges with probability 1
- Dayan, P., & Sejnowski, T. (1994). TD(λ) converges with probability 1. Mach. Learn., 14(3), 295-301.
- (1994) Mach. Learn. , vol.14 , Issue.3 , pp. 295-301
- Dayan, P.¹ Sejnowski, T.²

5
- 85156231814
- Temporal difference learning in continuous time and space
- D. S. Touretzky, M. C. Mozer, & M. E. Hasselmo (Eds.), Cambridge, MA: MIT Press
- Doya, K. (1996). Temporal difference learning in continuous time and space. In D. S. Touretzky, M. C. Mozer, & M. E. Hasselmo (Eds.), Advances in neural information processing systems, 8 (pp. 1073-1079). Cambridge, MA: MIT Press.
- (1996) Advances in neural information processing systems , vol.8 , pp. 1073-1079
- Doya, K.¹

6
- 0034524427
- Complementary roles of basal ganglia and cerebellum in learning and motor control
- Doya, K. (2000). Complementary roles of basal ganglia and cerebellum in learning and motor control. Current Opinion in Neurobiology, 10(6), 732-739.
- (2000) Current Opinion in Neurobiology , vol.10 , Issue.6 , pp. 732-739
- Doya, K.¹

7
- 0026579349
- Homosynaptic long-term depression in area CA1 of hippocampus and effects of N-methyl-D-aspartate receptor blockade
- Dudek, S., & Bear, M. (1992). Homosynaptic long-term depression in area CA1 of hippocampus and effects of N-methyl-D-aspartate receptor blockade. Proceedings of the National Academy of Sciences, 89(10), 4363-4367.
- (1992) Proceedings of the National Academy of Sciences , vol.89 , Issue.10 , pp. 4363-4367
- Dudek, S.¹ Bear, M.²

8
- 34249708388
- Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity
- Florian, R. V. (2007). Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Computation, 19, 1468-1502.
- (2007) Neural Computation , vol.19 , pp. 1468-1502
- Florian, R.V.¹

9
- 0029821128
- A neuronal learning rule for sub-millisecond temporal coding
- Gerstner, W., Kempter, R., van Hemmen, L., & Wagner, H. (1996). A neuronal learning rule for sub-millisecond temporal coding. Nature, 383, 76-78.
- (1996) Nature , vol.383 , pp. 76-78
- Gerstner, W.¹ Kempter, R.² van Hemmen, L.³ Wagner, H.⁴

10
- 0032123567
- The basal ganglia and chunking of action repertoires
- Graybiel, A. (1998). The basal ganglia and chunking of action repertoires. Neurobiol. Learn. Mem., 70(1-2), 119-136.
- (1998) Neurobiol. Learn. Mem. , vol.70 , Issue.1-2 , pp. 119-136
- Graybiel, A.¹

11
- 0035015792
- Influence of expectation of different rewards on behavior-related neuronal activity in the striatum
- Hassani, O. K., Cromwell, H. C., & Schultz, W. (2001). Influence of expectation of different rewards on behavior-related neuronal activity in the striatum. J. Neurophysiol., 85(6), 2477-2489.
- (2001) J. Neurophysiol. , vol.85 , Issue.6 , pp. 2477-2489
- Hassani, O.K.¹ Cromwell, H.C.² Schultz, W.³

12
- 0020118274
- Neural networks and physical systems with emergent collective computational abilities
- Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79, 2554-2558.
- (1982) Proceedings of the National Academy of Sciences , vol.79 , pp. 2554-2558
- Hopfield, J.J.¹

13
- 34948906745
- Solving the distal reward problem through linkage of STDP and dopamine signaling
- Izhikevich, E. (2007). Solving the distal reward problem through linkage of STDP and dopamine signaling. Cerebral Cortex, 17, 2443-2452.
- (2007) Cerebral Cortex , vol.17 , pp. 2443-2452
- Izhikevich, E.¹

14
- 0023878618
- A neuronal model of classical conditioning
- Klopf, A. H. (1988). A neuronal model of classical conditioning. Psychobiol., 16(2), 85-123.
- (1988) Psychobiol. , vol.16 , Issue.2 , pp. 85-123
- Klopf, A.H.¹

15
- 40149107540
- Mathematical properties of neuronal TD-rules and differential Hebbian learning: A comparison
- Kolodziejski, C., Porr, B., & Wörgötter, F. (2008). Mathematical properties of neuronal TD-rules and differential Hebbian learning: A comparison. Biological Cybernetics, 98(3), 259-272.
- (2008) Biological Cybernetics , vol.98 , Issue.3 , pp. 259-272
- Kolodziejski, C.¹ Porr, B.² Wörgötter, F.³

16
- 0042276165
- Differential Hebbian learning
- J. S. Denker (Ed.), New York: American Institute of Physics
- Kosko, B. (1986). Differential Hebbian learning. In J. S. Denker (Ed.), Neural networks for computing: AIP Conference Proceedings (Vol. 151). New York: American Institute of Physics.
- (1986) Neural networks for computing: AIP Conference Proceedings , vol.151
- Kosko, B.¹

17
- 0003452601
- Berlin: Springer-Verlag
- Kushner, H. J., & Clark, D. S. (1978). Stochastic approximation for constrained and unconstrained systems. Berlin: Springer-Verlag.
- (1978) Stochastic approximation for constrained and unconstrained systems
- Kushner, H.J.¹ Clark, D.S.²

18
- 0023981750
- Self-organisation in a perceptual network
- Linsker, R. (1988). Self-organisation in a perceptual network. Computer, 21(3), 105-117.
- (1988) Computer , vol.21 , Issue.3 , pp. 105-117
- Linsker, R.¹

19
- 0031012615
- Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs
- Markram, H., Lübke, J., Frotscher, M., & Sakmann, B. (1997). Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science, 275, 213-215.
- (1997) Science , vol.275 , pp. 213-215
- Markram, H.¹ Lübke, J.² Frotscher, M.³ Sakmann, B.⁴

20
- 0029981543
- A framework for mesencephalic dopamine systems based on predictive Hebbian learning
- Montague, P. R., Dayan, P., & Sejnowski, T. J. (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. Journal of Neuroscience, 16(5), 1936-1947.
- (1996) Journal of Neuroscience , vol.16 , Issue.5 , pp. 1936-1947
- Montague, P.R.¹ Dayan, P.² Sejnowski, T.J.³

21
- 3242673464
- Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons
- Morris, G., Arkadir, D., Nevet, A., Vaadia, E., & Bergman, H. (2004). Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron, 43(1), 133-143.
- (2004) Neuron , vol.43 , Issue.1 , pp. 133-143
- Morris, G.¹ Arkadir, D.² Nevet, A.³ Vaadia, E.⁴ Bergman, H.⁵

22
- 33747585633
- Midbrain dopamine neurons encode decisions for future action
- Morris, G., Nevet, A., Arkadir, D., Vaadia, E., & Bergman, H. (2006). Midbrain dopamine neurons encode decisions for future action. Nature Neuroscience, 9(8), 1057-1063.
- (2006) Nature Neuroscience , vol.9 , Issue.8 , pp. 1057-1063
- Morris, G.¹ Nevet, A.² Arkadir, D.³ Vaadia, E.⁴ Bergman, H.⁵

23
- 0020464111
- A simplified neuron model as a principal component analyzer
- Oja, E. (1982). A simplified neuron model as a principal component analyzer. J. Math. Biol., 15(3), 267-273.
- (1982) J. Math. Biol. , vol.15 , Issue.3 , pp. 267-273
- Oja, E.¹

24
- 40449100017
- Dopamine receptor activation is required for corticostriatal spike-timing-dependent plasticity
- Pawlak, V., & Kerr, J. N. D. (2008). Dopamine receptor activation is required for corticostriatal spike-timing-dependent plasticity. J. Neurosci., 28(10), 2435-2446.
- (2008) J. Neurosci. , vol.28 , Issue.10 , pp. 2435-2446
- Pawlak, V.¹ Kerr, J.N.D.²

25
- 0742301619
- Isotropic-sequence-order learning in a closed-loop behavioural system
- Porr, B., & Wörgötter, F. (2003). Isotropic-sequence-order learning in a closed-loop behavioural system. Phil. Trans. R. Soc. Lond. A, 361, 2225-2244.
- (2003) Phil. Trans. R. Soc. Lond. A , vol.361 , pp. 2225-2244
- Porr, B.¹ Wörgötter, F.²

26
- 35549002871
- Learning with "relevance": Using a third factor to stabilise Hebbian learning
- Porr, B., & Wörgötter, F. (2007). Learning with "relevance": Using a third factor to stabilise Hebbian learning. Neural Comp., 19, 2694-2719.
- (2007) Neural Comp. , vol.19 , pp. 2694-2719
- Porr, B.¹ Wörgötter, F.²

27
- 67650298948
- A spiking neural network model of an actor-critic learning agent
- Potjans, W., Morrison, A., & Diesmann, M. (2009). A spiking neural network model of an actor-critic learning agent. Neural Computation, 21(2), 301-339.
- (2009) Neural Computation , vol.21 , Issue.2 , pp. 301-339
- Potjans, W.¹ Morrison, A.² Diesmann, M.³

28
- 0035489925
- Spike-timing-dependent Hebbian plasticity as temporal difference learning
- Rao, R., & Sejnowski, T. (2001). Spike-timing-dependent Hebbian plasticity as temporal difference learning. Neural Computation, 13, 2221-2237.
- (2001) Neural Computation , vol.13 , pp. 2221-2237
- Rao, R.¹ Sejnowski, T.²

29
- 33751184634
- The short-latency dopamine signal: A role in discovering novel actions?
- Redgrave, P., & Gurney, K. (2006). The short-latency dopamine signal: A role in discovering novel actions? Nature Reviews Neuroscience, 7, 967-975.
- (2006) Nature Reviews Neuroscience , vol.7 , pp. 967-975
- Redgrave, P.¹ Gurney, K.²

30
- 0032696609
- Computational consequences of temporally asymmetric learning rules: I. Differential Hebbian learning
- Roberts, P. (1999). Computational consequences of temporally asymmetric learning rules: I. Differential Hebbian learning. J. Comput. Neurosci., 7(3), 235-246.
- (1999) J. Comput. Neurosci. , vol.7 , Issue.3 , pp. 235-246
- Roberts, P.¹

31
- 57049100874
- An implementation of reinforcement learning based on spike-timing dependent plasticity
- Roberts, P., Santiago, R., & Lafferriere, G. (2008). An implementation of reinforcement learning based on spike-timing dependent plasticity. Biological Cybernetics, 99(6), 517-523.
- (2008) Biological Cybernetics , vol.99 , Issue.6 , pp. 517-523
- Roberts, P.¹ Santiago, R.² Lafferriere, G.³

32
- 0031867046
- Predictive reward signal of dopamine neurons
- Schultz, W. (1998). Predictive reward signal of dopamine neurons. J. Neurophysiol., 80, 1-27.
- (1998) J. Neurophysiol. , vol.80 , pp. 1-27
- Schultz, W.¹

33
- 0026442752
- Neuronal activity in monkey ventral striatum related to the expectation of reward
- Schultz, W., Apicella, P., Scarnati, E., & Ljungberg, T. (1992). Neuronal activity in monkey ventral striatum related to the expectation of reward. J. Neurosci., 12(12), 4595-4610.
- (1992) J. Neurosci. , vol.12 , Issue.12 , pp. 4595-4610
- Schultz, W.¹ Apicella, P.² Scarnati, E.³ Ljungberg, T.⁴

34
- 0033901602
- Convergence results for single-step on-policy reinforcement-learning algorithms
- Singh, S. P., Jaakkola, T., Littman, M. L., & Szepesvári, C. (2000). Convergence results for single-step on-policy reinforcement-learning algorithms. Machine Learning, 38(3), 287-308.
- (2000) Machine Learning , vol.38 , Issue.3 , pp. 287-308
- Singh, S.P.¹ Jaakkola, T.² Littman, M.L.³ Szepesvári, C.⁴

35
- 33847202724
- Learning to predict by the method of temporal differences
- Sutton, R. S. (1988). Learning to predict by the method of temporal differences. Machine Learning, 3, 9-44.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

36
- 0019537951
- Towards a modern theory of adaptive networks: Expectation and prediction
- Sutton, R., & Barto, A. (1981). Towards a modern theory of adaptive networks: Expectation and prediction. Psychol. Review, 88, 135-170.
- (1981) Psychol. Review , vol.88 , pp. 135-170
- Sutton, R.¹ Barto, A.²

37
- 0004102479
- Cambridge, MA: MIT Press
- Sutton, R., & Barto, A. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
- (1998) Reinforcement learning: An introduction
- Sutton, R.¹ Barto, A.²

38
- 53149129441
- Path-finding in real and simulated rats: On the usefulness of forgetting and frustration for navigation learning
- Tamosiunaite, M., Ainge, J., Kulvicius, T., Porr, B., Dudchenko, P., & Wörgötter, F. (2008). Path-finding in real and simulated rats: On the usefulness of forgetting and frustration for navigation learning. J. Comp. Neuroscience, 25, 562-582.
- (2008) J. Comp. Neuroscience , vol.25 , pp. 562-582
- Tamosiunaite, M.¹ Ainge, J.² Kulvicius, T.³ Porr, B.⁴ Dudchenko, P.⁵ Wörgötter, F.⁶

39
- 0031143730
- An analysis of temporal-difference learning with function approximation
- Tsitsiklis, J. N., & Van Roy, B. (1997). An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 42(5), 674-690.
- (1997) IEEE Transactions on Automatic Control , vol.42 , Issue.5 , pp. 674-690
- Tsitsiklis, J.N.¹ Van Roy, B.²

40
- 34249833101
- Technical note: Q-learning
- Watkins, C., & Dayan, P. (1992). Technical note: Q-learning. Mach. Learn., 8, 279-292.
- (1992) Mach. Learn. , vol.8 , pp. 279-292
- Watkins, C.¹ Dayan, P.²

41
- 0002278965
- Adaptive switching circuits
- New York: Institute of Radio Engineers
- Widrow, B., & Hoff, M. E. (1960). Adaptive switching circuits. In IRE WESCON Convention Record (pp. 96-104). New York: Institute of Radio Engineers.
- (1960) IRE WESCON Convention Record , pp. 96-104
- Widrow, B.¹ Hoff, M.E.²

42
- 22944460232
- Convergence and divergence in standard averaging reinforcement learning
- J. Boulicaut, F. Esposito, F. Giannotti, & D. Pedreschi (Eds.), Berlin: Springer-Verlag
- Wiering, M. (2004). Convergence and divergence in standard averaging reinforcement learning. In J. Boulicaut, F. Esposito, F. Giannotti, & D. Pedreschi (Eds.), Proceedings of the 15th European Conference on Machine learning ECML'04 (pp. 477-488). Berlin: Springer-Verlag.
- (2004) Proceedings of the 15th European Conference on Machine learning ECML'04 , pp. 477-488
- Wiering, M.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.