Volume 6, Issue 5, 2011, Pages

Democratic population decisions result in robust policy-gradient learning: A parametric study with GPU simulations

Author keywords

[No Author keywords available]

Indexed keywords

ARTICLE; ARTIFICIAL NEURAL NETWORK; BIOINFORMATICS; COMPUTER; COMPUTER PROGRAM; COMPUTER SIMULATION; DECISION MAKING; GRAPHICS PROCESSING UNIT; MATHEMATICAL COMPUTING; MATHEMATICAL MODEL; NETWORK LEARNING; NEUROSCIENCE; PARAMETRIC TEST; REINFORCEMENT; SIGNAL NOISE RATIO; ALGORITHM; COMPUTER GRAPHICS;

EID: 79955761534     PISSN: None     EISSN: 19326203     Source Type: Journal    
DOI: 10.1371/journal.pone.0018539     Document Type: Article
Times cited: 13

References (83)
  • 2
    • 84877609547
    • Brook for gpus: stream computing on graphics hardware
    • ACM New York, NY, USA, doi:
    • Buck I, Foley T, Horn D, Sugerman J, Fatahalian K, et al. (2004) Brook for gpus: stream computing on graphics hardware. In: SIGGRAPH ′04: ACM SIGGRAPH 2004 Papers. ACM New York, NY, USA pp. 777-786 doi:http://doi.acm.org/10.1145/1186562.1015800.
    • (2004) SIGGRAPH ′04: ACM SIGGRAPH 2004 Papers , pp. 777-786
    • Buck, I.1    Foley, T.2    Horn, D.3    Sugerman, J.4    Fatahalian, K.5
  • 3
    • 0016640311
    • Homogeneous nets of neuron-like elements
    • Amari SI, (1975) Homogeneous nets of neuron-like elements. Biological Cybernetics 17: 211-220.
    • (1975) Biological Cybernetics , vol.17 , pp. 211-220
    • Amari, S.I.1
  • 4
    • 0017713690
    • Dynamics of pattern formation in lateral-inhibition type neural fields
    • Amari S, (1977) Dynamics of pattern formation in lateral-inhibition type neural fields. Biol Cybern 27: 77-87.
    • (1977) Biol Cybern , vol.27 , pp. 77-87
    • Amari, S.1
  • 5
    • 23044433348
    • Correlated firing in a feedforward network with mexican-hat-type connectivity
    • Hamaguchi K, Okada M, Yamana M, Aihara K, (2005) Correlated firing in a feedforward network with mexican-hat-type connectivity. Neural Comput 17: 2034-2059.
    • (2005) Neural Comput , vol.17 , pp. 2034-2059
    • Hamaguchi, K.1    Okada, M.2    Yamana, M.3    Aihara, K.4
  • 6
    • 0026476933
    • A cortical model of winner-take-all competition via lateral inhibition
    • Coultrip R, Granger R, Lynch G, (1992) A cortical model of winner-take-all competition via lateral inhibition. Neural Networks 5: 47-54.
    • (1992) Neural Networks , vol.5 , pp. 47-54
    • Coultrip, R.1    Granger, R.2    Lynch, G.3
  • 7
    • 0042408141
    • Effect of lateral connections on the accuracy of the population code for a network of spiking neurons
    • Spiridon M, Gerstner W, (2001) Effect of lateral connections on the accuracy of the population code for a network of spiking neurons. Network: Computation in Neural Systems 12: 409-421.
    • (2001) Network: Computation in Neural Systems , vol.12
    • Spiridon, M.1    Gerstner, W.2
  • 9
    • 0023034513
    • Neuronal population coding of movement direction
    • Georgopoulos AP, Schwartz A, Kettner RE, (1986) Neuronal population coding of movement direction. Science 233: 1416-1419.
    • (1986) Science , vol.233 , pp. 1416-1419
    • Georgopoulos, A.P.1    Schwartz, A.2    Kettner, R.E.3
  • 10
    • 33646801243
    • Optimal spike-timing dependent plasticity for precise action potential firing in supervised learning
    • Pfister JP, Toyoizumi T, Barber D, Gerstner W, (2006) Optimal spike-timing dependent plasticity for precise action potential firing in supervised learning. Neural Computation 18: 1309-1339.
    • (2006) Neural Computation , vol.18 , pp. 1309-1339
    • Pfister, J.P.1    Toyoizumi, T.2    Barber, D.3    Gerstner, W.4
  • 11
    • 34249708388
    • Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity
    • Florian RV, (2007) Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Computation 19: 1468-1502.
    • (2007) Neural Computation , vol.19 , pp. 1468-1502
    • Florian, R.V.1
  • 12
    • 74549209037
    • Spike-based reinforcement learning in continuous state and action space: When policy gradient methods fail
    • Vasilaki E, Frémaux N, Urbanczik R, Senn W, Gerstner W, (2009) Spike-based reinforcement learning in continuous state and action space: When policy gradient methods fail. PLoS Comput Biol 5: e1000586.
    • (2009) PLoS Comput Biol , vol.5
    • Vasilaki, E.1    Frémaux, N.2    Urbanczik, R.3    Senn, W.4    Gerstner, W.5
  • 13
    • 0000337576
    • Simple statistical gradient-following methods for connectionist reinforcement learning
    • Williams R, (1992) Simple statistical gradient-following methods for connectionist reinforcement learning. Machine Learning 8: 229-256.
    • (1992) Machine Learning , vol.8 , pp. 229-256
    • Williams, R.1
  • 15
    • 33644820226
    • A microcircuit model of prefrontal functions: ying and yang of reverberatory neurodynamics in cognition
    • The frontal lobes, development, function, and pathology
    • Wang X, (2006) A microcircuit model of prefrontal functions: ying and yang of reverberatory neurodynamics in cognition. The frontal lobes development, function, and pathology.
    • (2006)
    • Wang, X.1
  • 16
    • 63149146163
    • Learning flexible sensori-motor mappings in a complex network
    • Vasilaki E, Fusi S, Wang XJ, Senn W, (2009) Learning flexible sensori-motor mappings in a complex network. Biol Cybern 100: 147-158.
    • (2009) Biol Cybern , vol.100 , pp. 147-158
    • Vasilaki, E.1    Fusi, S.2    Wang, X.J.3    Senn, W.4
  • 17
    • 78049420686
    • Pavlovian-Instrumental Interaction in "Observing Behavior"
    • Beierholm U, Dayan P, (2010) Pavlovian-Instrumental Interaction in "Observing Behavior". PLoS Computational Biology 6.
    • (2010) PLoS Computational Biology , vol.6
    • Beierholm, U.1    Dayan, P.2
  • 18
    • 67349170462
    • Goal-directed control and its antipodes
    • Dayan P, (2009) Goal-directed control and its antipodes. Neural Networks 22: 213-219.
    • (2009) Neural Networks , vol.22 , pp. 213-219
    • Dayan, P.1
  • 21
    • 77954761462
    • The role of value systems in decision making
    • Better than conscious
    • Dayan P, (2008) The role of value systems in decision making. Better than conscious pp. 51-70.
    • (2008) , pp. 51-70
    • Dayan, P.1
  • 22
    • 0019537951
    • Towards a modern theory of adaptive networks: expectation and prediction
    • Sutton RS, Barto AG, (1981) Towards a modern theory of adaptive networks: expectation and prediction. Psychol Review 88: 135-171.
    • (1981) Psychol Review , vol.88 , pp. 135-171
    • Sutton, R.S.1    Barto, A.G.2
  • 25
    • 0019957779
    • Place navigation impaired in rats with hippocampal lesions
    • Morris R, Garrard P, Rawlins J, O'Keefe J, (1982) Place navigation impaired in rats with hippocampal lesions. Nature 297: 681-683.
    • (1982) Nature , vol.297 , pp. 681-683
    • Morris, R.1    Garrard, P.2    Rawlins, J.3    O'Keefe, J.4
  • 26
    • 0033968832
    • Models of hippocampally dependent navigation using the temporal difference learning rule
    • Foster D, Morris R, Dayan P, (2000) Models of hippocampally dependent navigation using the temporal difference learning rule. Hippocampus 10: 1-16.
    • (2000) Hippocampus , vol.10 , pp. 1-16
    • Foster, D.1    Morris, R.2    Dayan, P.3
  • 27
    • 0034276719
    • Spatial cognition and neuro-mimetic navigation: a model of hippocampal place cell activity
    • Arleo A, Gerstner W, (2000) Spatial cognition and neuro-mimetic navigation: a model of hippocampal place cell activity. Biological Cybernetics 83: 287-299.
    • (2000) Biological Cybernetics , vol.83 , pp. 287-299
    • Arleo, A.1    Gerstner, W.2
  • 30
    • 78049417739
    • Reinforcement learning on slow features of high-dimensional input streams
    • Legenstein R, Wilbert N, Wiskott L, (2010) Reinforcement learning on slow features of high-dimensional input streams. PLoS Comput Biol 6: e1000894.
    • (2010) PLoS Comput Biol , vol.6
    • Legenstein, R.1    Wilbert, N.2    Wiskott, L.3
  • 31
    • 0347362917
    • Learning in spiking neural networks by reinforcement of stochastic synaptic transmission
    • Seung HS, (2003) Learning in spiking neural networks by reinforcement of stochastic synaptic transmission. Neuron 40: 1063-1073.
    • (2003) Neuron , vol.40 , pp. 1063-1073
    • Seung, H.S.1
  • 32
    • 37649027755
    • Learning in neural networks by reinforcement of irregular spiking
    • Xie X, Seung H, (2004) Learning in neural networks by reinforcement of irregular spiking. Physical Review E 69: 41909.
    • (2004) Physical Review E , vol.69 , pp. 41909
    • Xie, X.1    Seung, H.2
  • 33
    • 33746652644
    • Gradient Learning in Spiking Neural Networks by Dynamic Perturbation of Conductances
    • Fiete I, Seung H, (2006) Gradient Learning in Spiking Neural Networks by Dynamic Perturbation of Conductances. Physical Review Letters 97: 48104.
    • (2006) Physical Review Letters , vol.97 , pp. 48104
    • Fiete, I.1    Seung, H.2
  • 34
    • 34548049545
    • Reinforcement learning, spike-time-dependent plasticity, and the bcm rule
    • Baras D, Meir R, (2007) Reinforcement learning, spike-time-dependent plasticity, and the bcm rule. Neural Computation 19: 2245-2279.
    • (2007) Neural Computation , vol.19 , pp. 2245-2279
    • Baras, D.1    Meir, R.2
  • 35
    • 77955988359
    • Learning spike-based population codes by reward and population feedback
    • Friedrich J, Urbanczik R, Senn W, (2010) Learning spike-based population codes by reward and population feedback. Neural computation 22: 1698-1717.
    • (2010) Neural Computation , vol.22 , pp. 1698-1717
    • Friedrich, J.1    Urbanczik, R.2    Senn, W.3
  • 36
    • 0004049893
    • Learning from delayed rewards
    • Cambridge, PhD-thesis, Cambridge University
    • Watkins C, (1989) Learning from delayed rewards. Cambridge PhD-thesis, Cambridge University.
    • (1989)
    • Watkins, C.1
  • 37
    • 0004007508
    • Reinforcement learning
    • MIT Press, Cambridge
    • Sutton R, Barto A, (1998) Reinforcement learning. MIT Press, Cambridge.
    • (1998)
    • Sutton, R.1    Barto, A.2
  • 38
    • 67650298948
    • A spiking neural network model of an actor-critic learning agent
    • Potjans W, Morrison A, Diesmann M, (2009) A spiking neural network model of an actor-critic learning agent. Neural Comp 21: 301-339.
    • (2009) Neural Comp , vol.21 , pp. 301-339
    • Potjans, W.1    Morrison, A.2    Diesmann, M.3
  • 39
    • 79959855306
    • Temporal difference based actor critic learning - convergence and neural implementation
    • Di Castro D, Volkinshtein S, Meir R, (2009) Temporal difference based actor critic learning - convergence and neural implementation. NIPS 22: 385-392.
    • (2009) NIPS , vol.22 , pp. 385-392
    • Di Castro, D.1    Volkinshtein, S.2    Meir, R.3
  • 40
    • 0035315989
    • Temporal difference model reproduces anticipatory neural activity
    • Suri R, Schultz W, (2001) Temporal difference model reproduces anticipatory neural activity. Neural Computation 13: 841-862.
    • (2001) Neural Computation , vol.13 , pp. 841-862
    • Suri, R.1    Schultz, W.2
  • 41
    • 0029821128
    • A neuronal learning rule for submillisecond temporal coding
    • Gerstner W, Kempter R, van Hemmen JL, Wagner H, (1996) A neuronal learning rule for submillisecond temporal coding. Nature 383: 76-78.
    • (1996) Nature , vol.383 , pp. 76-78
    • Gerstner, W.1    Kempter, R.2    van Hemmen, J.L.3    Wagner, H.4
  • 43
    • 0033667165
    • Synaptic plasticity: Taming the beast
    • Abbott LF, Nelson SB, (2000) Synaptic plasticity: Taming the beast. Nature Neuroscience 3: 1178-1183.
    • (2000) Nature Neuroscience , vol.3 , pp. 1178-1183
    • Abbott, L.F.1    Nelson, S.B.2
  • 44
    • 34948906745
    • Solving the distal reward problem through linkage of stdp and dopamine signaling
    • Izhikevich E, (2007) Solving the distal reward problem through linkage of stdp and dopamine signaling. Cerebral Cortex 17: 2443-2452.
    • (2007) Cerebral Cortex , vol.17 , pp. 2443-2452
    • Izhikevich, E.1
  • 45
    • 37549060355
    • Reinforcement Learning With Modulated Spike Timing Dependent Synaptic Plasticity
    • Farries MA, Fairhall AL, (2007) Reinforcement Learning With Modulated Spike Timing Dependent Synaptic Plasticity. J Neurophysiol 98: 3648-3665.
    • (2007) J Neurophysiol , vol.98 , pp. 3648-3665
    • Farries, M.A.1    Fairhall, A.L.2
  • 46
    • 55449121121
    • A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback
    • Legenstein R, Pecevski D, Maass W, (2008) A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback. PLoS Computational Biology 4 (10): e1000180.
    • (2008) PLoS Computational Biology , vol.4 , Issue.10
    • Legenstein, R.1    Pecevski, D.2    Maass, W.3
  • 47
    • 0015145985
    • The hippocampus as a spatial map. preliminary evidence from unit activity in the freely-moving rat
    • O'Keefe J, Dostrovsky J, (1971) The hippocampus as a spatial map. preliminary evidence from unit activity in the freely-moving rat. Brain Res 34: 171-175.
    • (1971) Brain Res , vol.34 , pp. 171-175
    • O'Keefe, J.1    Dostrovsky, J.2
  • 48
    • 0034072947
    • Position Reconstruction From an Ensemble of Hippocampal Place Cells: Contribution of Theta Phase Coding
    • Jensen O, Lisman JE, (2000) Position Reconstruction From an Ensemble of Hippocampal Place Cells: Contribution of Theta Phase Coding. J Neurophysiol 83: 2602-2609.
    • (2000) J Neurophysiol , vol.83 , pp. 2602-2609
    • Jensen, O.1    Lisman, J.E.2
  • 49
    • 11844290435
    • A theoretical analysis of neuronal variability
    • Stein RB, (1965) A theoretical analysis of neuronal variability. Biophys J 5: 173-194.
    • (1965) Biophys J , vol.5 , pp. 173-194
    • Stein, R.B.1
  • 50
    • 0004017463
    • Spiking Neuron Models
    • Cambridge UK, Cambridge University Press
    • Gerstner W, Kistler WM, (2002) Spiking Neuron Models. Cambridge UK, Cambridge University Press.
    • (2002)
    • Gerstner, W.1    Kistler, W.M.2
  • 51
    • 0022213383
    • Learning by statistical cooperation of self-interested neuron-like computing elements
    • Barto A, (1985) Learning by statistical cooperation of self-interested neuron-like computing elements. Human Neurobiology 4: 229-256.
    • (1985) Human Neurobiology , vol.4 , pp. 229-256
    • Barto, A.1
  • 52
    • 0034910789
    • The transient precision of integrate and fire neurons: Effect of background activity and noise
    • Van Rossum M, (2001) The transient precision of integrate and fire neurons: Effect of background activity and noise. Journal of Computational Neuroscience 10: 303-311.
    • (2001) Journal of Computational Neuroscience , vol.10 , pp. 303-311
    • van Rossum, M.1
  • 53
    • 0036523040
    • Fast propagation of firing rates through layered networks of noisy neurons
    • Van Rossum M, Turrigiano G, Nelson S, (2002) Fast propagation of firing rates through layered networks of noisy neurons. Journal of Neuroscience 22: 1956.
    • (2002) Journal of Neuroscience , vol.22 , pp. 1956
    • van Rossum, M.1    Turrigiano, G.2    Nelson, S.3
  • 54
    • 2542428511
    • Computation with populations codes in layered networks of integrate-and-fire neurons
    • Van Rossum M, Renart A, (2004) Computation with populations codes in layered networks of integrate-and-fire neurons. Neurocomputing 58: 265-270.
    • (2004) Neurocomputing , vol.58 , pp. 265-270
    • van Rossum, M.1    Renart, A.2
  • 55
    • 77957836774
    • CUDA by Example: An Introduction to General-Purpose GPU Programming
    • Addison-Wesley
    • Sanders J, Kandrot E, (2010) CUDA by Example: An Introduction to General-Purpose GPU Programming. Addison-Wesley.
    • (2010)
    • Sanders, J.1    Kandrot, E.2
  • 56
    • 70349668883
    • NVIDIA CUDA Programming Guide
    • NVIDIA Corporation
    • NVIDIA Corporation (2008) NVIDIA CUDA Programming Guide.
    • (2008)
  • 57
    • 79952429502
    • Neural networks on gpus: Restricted boltzmann machines
    • Technical report, University of Toronto
    • Ly DL, Paprotski V, Yen D, (2008) Neural networks on gpus: Restricted boltzmann machines. Technical report, University of Toronto.
    • (2008)
    • Ly, D.L.1    Paprotski, V.2    Yen, D.3
  • 60
    • 84885847922
    • Brian: a simulator for spiking neural networks in Python
    • Goodman D, Brette R, (2008) Brian: a simulator for spiking neural networks in Python. Frontiers in neuroinformatics 2.
    • (2008) Frontiers in Neuroinformatics , vol.2
    • Goodman, D.1    Brette, R.2
  • 63
    • 78049278797
    • Code generation: A strategy for neural network simulators
    • Goodman D, (2010) Code generation: A strategy for neural network simulators. Neuroinformatics 8: 183-196.
    • (2010) Neuroinformatics , vol.8 , pp. 183-196
    • Goodman, D.1
  • 64
    • 68149182671
    • A configurable simulation environment for the efficient simulation of large-scale spiking neural networks on graphics processors
    • Nageswaran J, Dutt N, Krichmar J, Nicolau A, Veidenbaum A, (2009) A configurable simulation environment for the efficient simulation of large-scale spiking neural networks on graphics processors. Neural Networks 22: 791-800.
    • (2009) Neural Networks , vol.22 , pp. 791-800
    • Nageswaran, J.1    Dutt, N.2    Krichmar, J.3    Nicolau, A.4    Veidenbaum, A.5
  • 69
    • 34548826240
    • Neuromimetic ICs with analog cores: an alternative for simulating spiking neural networks
    • ISCAS 2007. IEEE International Symposium on IEEE
    • Renaud S, Tomas J, Bornat Y, Daouzli A, Saighi S, (2007) Neuromimetic ICs with analog cores: an alternative for simulating spiking neural networks. In: Circuits and Systems, 2007. ISCAS 2007. IEEE International Symposium on. IEEE pp. 3355-3358.
    • (2007) Circuits and Systems, 2007 , pp. 3355-3358
    • Renaud, S.1    Tomas, J.2    Bornat, Y.3    Daouzli, A.4    Saighi, S.5
  • 76
    • 79953694010
    • Trends in programming languages for neuroscience simulations
    • Davison A, Hines M, Muller E, (2009) Trends in programming languages for neuroscience simulations. Frontiers in neuroscience 3: 374.
    • (2009) Frontiers in Neuroscience , vol.3 , pp. 374
    • Davison, A.1    Hines, M.2    Muller, E.3
  • 77
    • 79955751254
    • A common language for neuronal networks in software and hardware
    • The Neuromorphic Engineer
    • Davison A, Muller E, Bruederle D, Kremkow J, (2010) A common language for neuronal networks in software and hardware. The Neuromorphic Engineer.
    • (2010)
    • Davison, A.1    Muller, E.2    Bruederle, D.3    Kremkow, J.4
  • 79
    • 70350368872
    • Efficient sparse matrix-vector multiplication on CUDA
    • NVIDIA Technical Report NVR-2008-004, NVIDIA Corporation
    • Bell N, Garland M, (2008) Efficient sparse matrix-vector multiplication on CUDA. NVIDIA Technical Report NVR-2008-004, NVIDIA Corporation.
    • (2008)
    • Bell, N.1    Garland, M.2
  • 82
    • 84879298940
    • A high performance framework for agent based pedestrian dynamics on gpu hardware
    • Proceedings of EUROSIS ESM 2008
    • Richmond P, Romano D, (2008) A high performance framework for agent based pedestrian dynamics on gpu hardware. Proceedings of EUROSIS ESM 2008.
    • (2008)
    • Richmond, P.1    Romano, D.2
  • 83
    • 77953878345
    • High performance cellular level agent-based simulation with FLAME for the GPU
    • Richmond P, Walker D, Coakley S, Romano D, (2010) High performance cellular level agent-based simulation with FLAME for the GPU. Briefings in bioinformatics 11: 334.
    • (2010) Briefings in Bioinformatics , vol.11 , pp. 334
    • Richmond, P.1    Walker, D.2    Coakley, S.3    Romano, D.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.