SCOPUS Information Search Platform

PLoS Computational Biology

Volume 8, Issue 9, 2012, Pages

Spike-based Decision Learning of Nash Equilibria in Two-Player Games

(2) Friedrich, Johannes a Senn, Walter a

a UNIVERSITY OF BERN (Switzerland)

Author keywords

[No Author keywords available]

Indexed keywords

BEHAVIORAL RESEARCH; COMPUTATION THEORY; DECISION MAKING; GAME THEORY; MULTI AGENT SYSTEMS; NEURONS; POPULATION STATISTICS; STOCHASTIC SYSTEMS;

ADAPTIVE DECISION MAKING; CO-ADAPTATION; COMPUTATIONAL ALGORITHM; DECISION TASK; MULTI-AGENT ENVIRONMENT; NASH EQUILIBRIA; POPULATION CODING; REINFORCEMENT LEARNINGS; STOCHASTICS; TWO-PLAYER GAMES;

REINFORCEMENT LEARNING;

ARTICLE; COVARIANCE LEARNING; DECISION MAKING; GAME; HUMAN; LEARNING; NERVE CELL; PLAY; REINFORCEMENT; REINFORCEMENT LEARNING; REWARD; SPIKE WAVE; STOCHASTIC MODEL; TASK PERFORMANCE; TEMPORAL DIFFERENCE LEARNING;

ANIMALIA;

EID: 84866941777 PISSN: 1553734X EISSN: 15537358 Source Type: Journal
DOI: 10.1371/journal.pcbi.1002691 Document Type: Article

Times cited : (5)

References (51)

1
- 60749114870
- Decision theory, reinforcement learning, and the brain
- Dayan P, Daw ND, (2008) Decision theory, reinforcement learning, and the brain. Cogn Affect Behav Ne 8: 429-453.
- (2008) Cogn Affect Behav Ne , vol.8 , pp. 429-453
- Dayan, P.¹ Daw, N.D.²

2
- 0004102479
- Cambridge, MA: MIT Press
- Sutton RS, Barto AG (1998) Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

3
- 2942617032
- Temporal difference models describe higher-order learning in humans
- Seymour B, O'Doherty JP, Dayan P, Koltzenburg M, Jones AK, et al. (2004) Temporal difference models describe higher-order learning in humans. Nature 429: 664-7.
- (2004) Nature , vol.429 , pp. 664-667
- Seymour, B.¹ O'Doherty, J.P.² Dayan, P.³ Koltzenburg, M.⁴ Jones, A.K.⁵

4
- 67650298948
- A spiking neural network model of an actor-critic learning agent
- Potjans W, Morrison A, Diesmann M, (2009) A spiking neural network model of an actor-critic learning agent. Neural Comput 21: 301-339.
- (2009) Neural Comput , vol.21 , pp. 301-339
- Potjans, W.¹ Morrison, A.² Diesmann, M.³

5
- 0003644124
- Cambridge, MA: MIT Press
- Howard RA (1960) Dynamic programming and Markov processes. Cambridge, MA: MIT Press.
- (1960) Dynamic programming and Markov processes
- Howard, R.A.¹

6
- 0015658957
- The optimal control of partially observable markov processes over a finite horizon
- Smallwood RD, Sondik EJ, (1973) The optimal control of partially observable markov processes over a finite horizon. Oper Res 21: 1071-1088.
- (1973) Oper Res , vol.21 , pp. 1071-1088
- Smallwood, R.D.¹ Sondik, E.J.²

7
- 79959853243
- Spatio-temporal credit assignment in neuronal population learning
- Friedrich J, Urbanczik R, Senn W, (2011) Spatio-temporal credit assignment in neuronal population learning. PLoS Comput Biol 7: e1002092.
- (2011) PLoS Comput Biol , vol.7
- Friedrich, J.¹ Urbanczik, R.² Senn, W.³

8
- 0004247096
- Cambridge, MA: MIT Press
- Fudenberg D, Levine DK (1998) Theory of Learning in Games. Cambridge, MA: MIT Press.
- (1998) Theory of Learning in Games
- Fudenberg, D.¹ Levine, D.K.²

9
- 84884079276
- Princeton: Princeton University Press
- Von Neumann J, Morgenstern O (1944) Theory of games and economic behavior. Princeton: Princeton University Press.
- (1944) Theory of games and economic behavior
- von Neumann, J.¹ Morgenstern, O.²

10
- 33947232811
- Decision-making in blackjack: An electrophysiological analysis
- Hewig J, Trippe R, Hecht H, Coles GH, Holroyd CB, et al. (2007) Decision-making in blackjack: An electrophysiological analysis. Cereb Cortex 17: 865-877.
- (2007) Cereb Cortex , vol.17 , pp. 865-877
- Hewig, J.¹ Trippe, R.² Hecht, H.³ Coles, G.H.⁴ Holroyd, C.B.⁵

11
- 5144223501
- Activity in posterior parietal cortex is correlated with the relative subjective desirability of action
- Dorris MC, Glimcher PW, (2004) Activity in posterior parietal cortex is correlated with the relative subjective desirability of action. Neuron 44: 365-378.
- (2004) Neuron , vol.44 , pp. 365-378
- Dorris, M.C.¹ Glimcher, P.W.²

12
- 0002621983
- Animal Intelligence: An Experimental Study of the Associative Processes in Animals
- Thorndike EL, (1898) Animal Intelligence: An Experimental Study of the Associative Processes in Animals. Psychol Monogr 2: 321-330.
- (1898) Psychol Monogr , vol.2 , pp. 321-330
- Thorndike, E.L.¹

13
- 0002109138
- A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement
- In: Black AH, Prokasy WF, editors, New York: Appleton Century Crofts
- Rescorla R, Wagner A (1972) A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF, editors. Classical Conditioning II: current research and theory. New York: Appleton Century Crofts. pp. 64-99.
- (1972) Classical Conditioning II: Current research and theory , pp. 64-99
- Rescorla, R.¹ Wagner, A.²

14
- 84899031338
- Statistical models of conditioning
- Dayan P, Long T (1998) Statistical models of conditioning. Adv Neural Inf Process Syst 10. pp. 117-123.
- (1998) Adv Neural Inf Process Syst , vol.10 , pp. 117-123
- Dayan, P.¹ Long, T.²

15
- 33746652644
- Gradient learning in spiking neural networks by dynamic perturbation of conductances
- Fiete IR, Seung HS, (2006) Gradient learning in spiking neural networks by dynamic perturbation of conductances. Phys Rev Lett 97: 048104.
- (2006) Phys Rev Lett , vol.97 , pp. 048104
- Fiete, I.R.¹ Seung, H.S.²

16
- 77957731196
- Functional requirements for reward-modulated spike-timing-dependent plasticity
- Frémaux N, Sprekeler H, Gerstner W, (2010) Functional requirements for reward-modulated spike-timing-dependent plasticity. J Neurosci 30: 13326-13337.
- (2010) J Neurosci , vol.30 , pp. 13326-13337
- Frémaux, N.¹ Sprekeler, H.² Gerstner, W.³

17
- 0038829878
- Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibria
- Erev I, Roth AE, (1998) Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibria. Amer Econ Rev 88: 848-881.
- (1998) Amer Econ Rev , vol.88 , pp. 848-881
- Erev, I.¹ Roth, A.E.²

18
- 5644290841
- An integrated theory of the mind
- Anderson JR, Bothell D, Byrne MD, Douglass S, Lebiere C, et al. (2004) An integrated theory of the mind. Psychol Rev 111: 1036-1060.
- (2004) Psychol Rev , vol.111 , pp. 1036-1060
- Anderson, J.R.¹ Bothell, D.² Byrne, M.D.³ Douglass, S.⁴ Lebiere, C.⁵

19
- 0004017463
- Cambridge, UK: Cambridge University Press
- Gerstner W, Kistler WM (2002) Spiking Neuron Models. Cambridge, UK: Cambridge University Press.
- (2002) Spiking Neuron Models
- Gerstner, W.¹ Kistler, W.M.²

20
- 0030896968
- A neural substrate of prediction and reward
- Schultz W, Dayan P, Montague PR, (1997) A neural substrate of prediction and reward. Science 275: 1593-1599.
- (1997) Science , vol.275 , pp. 1593-1599
- Schultz, W.¹ Dayan, P.² Montague, P.R.³

21
- 53849125053
- Decision making in recurrent neuronal circuits
- Wang XJ, (2008) Decision making in recurrent neuronal circuits. Neuron 60: 215-234.
- (2008) Neuron , vol.60 , pp. 215-234
- Wang, X.J.¹

22
- 60749100305
- Reinforcement learning in populations of spiking neurons
- Urbanczik R, Senn W, (2009) Reinforcement learning in populations of spiking neurons. Nat Neurosci 12: 250-252.
- (2009) Nat Neurosci , vol.12 , pp. 250-252
- Urbanczik, R.¹ Senn, W.²

23
- 34948906745
- Solving the distal reward problem through linkage of STDP and dopamine signaling
- Izhikevich EM, (2007) Solving the distal reward problem through linkage of STDP and dopamine signaling. Cereb Cortex 17: 2443-2452.
- (2007) Cereb Cortex , vol.17 , pp. 2443-2452
- Izhikevich, E.M.¹

24
- 33745726849
- Neural correlations, population coding and computation
- Averbeck B, Latham PE, Pouget A, (2006) Neural correlations, population coding and computation. Nat Rev Neurosci 7: 358-366.
- (2006) Nat Rev Neurosci , vol.7 , pp. 358-366
- Averbeck, B.¹ Latham, P.E.² Pouget, A.³

25
- 21244466146
- Zur Theorie der Gesellschaftsspiele
- Von Neumann J, (1928) Zur Theorie der Gesellschaftsspiele. Math Ann 100: 295-320.
- (1928) Math Ann , vol.100 , pp. 295-320
- von Neumann, J.¹

26
- 50149108585
- An electrophysiological analysis of coaching in blackjack
- Hewig J, Trippe R, Hecht H, Coles GH, Holroyd CB, et al. (2008) An electrophysiological analysis of coaching in blackjack. Cortex 44: 1197-1205.
- (2008) Cortex , vol.44 , pp. 1197-1205
- Hewig, J.¹ Trippe, R.² Hecht, H.³ Coles, G.H.⁴ Holroyd, C.B.⁵

27
- 67649304665
- Inspection games
- In: Aumann RJ, Hart S, editors
- Avenhaus R, Von Stengel B, Zamir S (2002) Inspection games. In: Aumann RJ, Hart S, editors. Handbook of Game Theory with Economic Applications. pp. 1947-1987.
- (2002) Handbook of Game Theory with Economic Applications , pp. 1947-1987
- Avenhaus, R.¹ von Stengel, B.² Zamir, S.³

28
- 58249121629
- Amsterdam: Elsevier
- Glimcher PW, Camerer C, Fehr E, Poldrack R, editors (2008) Neuroeconomics: decision making and the brain. Amsterdam: Elsevier.
- (2008) Neuroeconomics: Decision making and the brain
- Glimcher, P.W.¹ Camerer, C.² Fehr, E.³ Poldrack, R.⁴

29
- 33646801243
- Optimal spike-timing-dependent plasticity for precise action potential firing in supervised learning
- Pfister J, Toyoizumi T, Barber D, Gerstner W, (2006) Optimal spike-timing-dependent plasticity for precise action potential firing in supervised learning. Neural Comput 18: 1318-1348.
- (2006) Neural Comput , vol.18 , pp. 1318-1348
- Pfister, J.¹ Toyoizumi, T.² Barber, D.³ Gerstner, W.⁴

30
- 34249708388
- Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity
- Florian RV, (2007) Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Comput 19: 1468-1502.
- (2007) Neural Comput , vol.19 , pp. 1468-1502
- Florian, R.V.¹

31
- 77955988359
- Learning spike-based population codes by reward and population feedback
- Friedrich J, Urbanczik R, Senn W, (2010) Learning spike-based population codes by reward and population feedback. Neural Comput 22: 1698-1717.
- (2010) Neural Comput , vol.22 , pp. 1698-1717
- Friedrich, J.¹ Urbanczik, R.² Senn, W.³

32
- 0347362917
- Learning in spiking neural networks by reinforcement of stochastic synaptic transmission
- Seung HS, (2003) Learning in spiking neural networks by reinforcement of stochastic synaptic transmission. Neuron 40: 1063-1073.
- (2003) Neuron , vol.40 , pp. 1063-1073
- Seung, H.S.¹

33
- 27144462270
- Learning curves for stochastic gradient descent in linear feedforward networks
- Werfel J, Xie X, Seung HS, (2005) Learning curves for stochastic gradient descent in linear feedforward networks. Neural Comput 17: 2699-2718.
- (2005) Neural Comput , vol.17 , pp. 2699-2718
- Werfel, J.¹ Xie, X.² Seung, H.S.³

34
- 0002070953
- Learning behavior and mixed-strategy Nash equilibria
- Crawford VP, (1985) Learning behavior and mixed-strategy Nash equilibria. J Econ Behav Organ 6: 69-78.
- (1985) J Econ Behav Organ , vol.6 , pp. 69-78
- Crawford, V.P.¹

35
- 38249030846
- On the instability of mixed-strategy Nash equilibria
- Stahl DO, (1988) On the instability of mixed-strategy Nash equilibria. J Econ Behav Organ 9: 59-69.
- (1988) J Econ Behav Organ , vol.9 , pp. 59-69
- Stahl, D.O.¹

36
- 0013315245
- A re-examination of probability matching and rational choice
- Shanks DR, Tunney RJ, McCarthy JD, (2002) A re-examination of probability matching and rational choice. J Behav Decis Making 15: 233-250.
- (2002) J Behav Decis Making , vol.15 , pp. 233-250
- Shanks, D.R.¹ Tunney, R.J.² McCarthy, J.D.³

37
- 33750041626
- Operant matching is a generic outcome of synaptic plasticity based on the covariance between reward and neural activity
- Loewenstein Y, Seung HS, (2006) Operant matching is a generic outcome of synaptic plasticity based on the covariance between reward and neural activity. Proc Natl Acad Sci U S A 103: 15224-15229.
- (2006) Proc Natl Acad Sci U S A , vol.103 , pp. 15224-15229
- Loewenstein, Y.¹ Seung, H.S.²

38
- 70449718877
- Operant matching as a Nash equilibrium of an intertemporal game
- Loewenstein Y, Prelec D, Seung HS, (2009) Operant matching as a Nash equilibrium of an intertemporal game. Neural Comput 21: 2755-2773.
- (2009) Neural Comput , vol.21 , pp. 2755-2773
- Loewenstein, Y.¹ Prelec, D.² Seung, H.S.³

39
- 37749023538
- The actor-critic learning is behind the matching law: matching versus optimal behaviors
- Sakai Y, Fukai T, (2008) The actor-critic learning is behind the matching law: matching versus optimal behaviors. Neural Comput 20: 227-251.
- (2008) Neural Comput , vol.20 , pp. 227-251
- Sakai, Y.¹ Fukai, T.²

40
- 27844539379
- Relative and absolute strength of response as a function of frequency of reinforcement
- Herrnstein RJ, (1961) Relative and absolute strength of response as a function of frequency of reinforcement. J Exp Anal Behav 4: 267-272.
- (1961) J Exp Anal Behav , vol.4 , pp. 267-272
- Herrnstein, R.J.¹

41
- 84866930679
- Synaptic theory of replicator-like melioration
- Loewenstein Y, (2010) Synaptic theory of replicator-like melioration. Front Comput Neurosci 4: 17.
- (2010) Front Comput Neurosci , vol.4 , pp. 17
- Loewenstein, Y.¹

42
- 33645566919
- A biophysically based neural model of matching law behavior: melioration by stochastic synapses
- Soltani A, Wang XJ, (2006) A biophysically based neural model of matching law behavior: melioration by stochastic synapses. J Neurosci 26: 3731-3744.
- (2006) J Neurosci , vol.26 , pp. 3731-3744
- Soltani, A.¹ Wang, X.J.²

43
- 0004005973
- New York: Wiley
- Luce R (1959) Individual Choice Behavior: A Theoretical Analysis. New York: Wiley.
- (1959) Individual Choice Behavior: A Theoretical Analysis
- Luce, R.¹

44
- 0001281582
- Do people play Nash equilibrium? Lessons from evolutionary game theory
- Shanks DR, Tunney RJ, McCarthy JD, (1998) Do people play Nash equilibrium? Lessons from evolutionary game theory. J Econ Lit 36: 1-28.
- (1998) J Econ Lit , vol.36 , pp. 1-28
- Shanks, D.R.¹ Tunney, R.J.² McCarthy, J.D.³

45
- 85149834820
- Markov games as a framework for multi-agent reinforcement learning
- San Francisco, CA
- Littman ML (1994) Markov games as a framework for multi-agent reinforcement learning. In: Proceedings of the International Conference on Machine Learning. San Francisco, CA. pp. 157-163.
- (1994) Proceedings of the International Conference on Machine Learning , pp. 157-163
- Littman, M.L.¹

46
- 4644369748
- Nash Q-learning for general-sum stochastic games
- Hu J, Wellman MP, (2003) Nash Q-learning for general-sum stochastic games. J Mach Learn Res 4: 1039-1069.
- (2003) J Mach Learn Res , vol.4 , pp. 1039-1069
- Hu, J.¹ Wellman, M.P.²

47
- 1642570323
- The Nash equilibrium: a perspective
- Holt CA, Roth AE, (2004) The Nash equilibrium: a perspective. Proc Natl Acad Sci U S A 101: 3999-4002.
- (2004) Proc Natl Acad Sci U S A , vol.101 , pp. 3999-4002
- Holt, C.A.¹ Roth, A.E.²

48
- 58149154664
- Game theory of mind
- Yoshida W, Dolan RJ, Friston KJ, (2008) Game theory of mind. PLoS Comput Biol 4: e1000254.
- (2008) PLoS Comput Biol , vol.4
- Yoshida, W.¹ Dolan, R.J.² Friston, K.J.³

49
- 84866950041
- London: S. Sonnenschein
- Schopenhauer A (1890) The wisdom of life. London: S. Sonnenschein. 147 pp.
- (1890) The wisdom of life , pp. 147
- Schopenhauer, A.¹

50
- 0029800695
- How the brain keeps the eyes still
- Seung HS, (1996) How the brain keeps the eyes still. Proc Natl Acad Sci U S A 93: 13339-13344.
- (1996) Proc Natl Acad Sci U S A , vol.93 , pp. 13339-13344
- Seung, H.S.¹

51
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Williams RJ, (1992) Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach Learn 8: 229-256.
- (1992) Mach Learn , vol.8 , pp. 229-256
- Williams, R.J.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.