Volume 20, Issue 8, 2009, Pages 1368-1371

Simple artificial neural networks that match probability and exploit and explore when confronting a multiarmed bandit

Author keywords

Instrumental learning; Multiarmed bandit; Operant conditioning; Perceptron; Probability matching

Indexed keywords

INSTRUMENTAL LEARNING; MULTIARMED BANDIT; OPERANT CONDITIONING; PERCEPTRON; PROBABILITY MATCHING

EID: 68949216971     PISSN: 10459227     EISSN: None     Source Type: Journal    
DOI: 10.1109/TNN.2009.2025588     Document Type: Article
Times cited: (20)

References (39)
  • 1
    • 27844539379
    • Relative and absolute strength of response as a function of frequency of reinforcement
    • R. J. Herrnstein, "Relative and absolute strength of response as a function of frequency of reinforcement," J. Exp. Anal. Behav., vol. 4, pp. 267-272, 1961.
    • (1961) J. Exp. Anal. Behav , vol.4 , pp. 267-272
    • Herrnstein, R.J.1
  • 2
    • 0034014181
    • An economist's perspective on probability matching
    • Feb
    • N. Vulkan, "An economist's perspective on probability matching," J. Econom. Surv., vol. 14, pp. 101-118, Feb. 2000.
    • (2000) J. Econom. Surv , vol.14 , pp. 101-118
    • Vulkan, N.1
  • 4
    • 23444461158
    • On the classic and modern theories of matching
    • Jul
    • J. J. McDowell, "On the classic and modern theories of matching," J. Exp. Anal. Behav., vol. 84, pp. 111-127, Jul. 2005.
    • (2005) J. Exp. Anal. Behav , vol.84 , pp. 111-127
    • McDowell, J.J.1
  • 5
    • 0001730110
    • Toward a law of response strength
    • P. de Villiers and R. J. Herrnstein, "Toward a law of response strength," Psychol. Bull., vol. 83, pp. 1131-1153, 1976.
    • (1976) Psychol. Bull , vol.83 , pp. 1131-1153
    • de Villiers, P.1    Herrnstein, R.J.2
  • 7
    • 85136499808
    • Choice in concurrent schedules and a quantitative formulation of the law of effect
    • W. K. Honig and J. E. R. Staddon, Eds. Englewood Cliffs, NJ: Prentice-Hall
    • P. de Villiers, "Choice in concurrent schedules and a quantitative formulation of the law of effect," in Handbook of Operant Behavior, W. K. Honig and J. E. R. Staddon, Eds. Englewood Cliffs, NJ: Prentice-Hall, 1977, pp. 233-287.
    • (1977) Handbook of Operant Behavior , pp. 233-287
    • de Villiers, P.1
  • 9
    • 84987278656
    • Maximizing and matching on concurrent ratio schedules
    • R. J. Herrnstein and D. H. Loveland, "Maximizing and matching on concurrent ratio schedules," J. Exp. Anal. Behav., vol. 24, pp. 107-116, 1975.
    • (1975) J. Exp. Anal. Behav , vol.24 , pp. 107-116
    • Herrnstein, R.J.1    Loveland, D.H.2
  • 10
    • 11144273669
    • The perceptron: A probabilistic model for information storage and organization in the brain
    • F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psychol. Rev., vol. 65, pp. 386-408, 1958.
    • (1958) Psychol. Rev , vol.65 , pp. 386-408
    • Rosenblatt, F.1
  • 13
    • 68949215727
    • Connectionism and classical conditioning
    • M. R. W. Dawson, "Connectionism and classical conditioning," Comparat. Cogn. Behav. Rev., vol. 3, Monograph, pp. 1-115, 2008.
  • 14
    • 84889822618
    • Connectionism: A Hands-on Approach
    • 1st ed. Malden, MA: Blackwell
    • M. R. W. Dawson, Connectionism: A Hands-on Approach, 1st ed. Malden, MA: Blackwell, 2005.
    • (2005) Connectionism: A Hands-on Approach
    • Dawson, M.R.W.1
  • 15
    • 0027325820
    • Choice in honeybees as a function of the probability of reward
    • Aug
    • M. E. Fischer, P. A. Couvillon, and M. E. Bitterman, "Choice in honeybees as a function of the probability of reward," Animal Learn. Behav., vol. 21, pp. 187-195, Aug. 1993.
    • (1993) Animal Learn. Behav , vol.21 , pp. 187-195
    • Fischer, M.E.1    Couvillon, P.A.2    Bitterman, M.E.3
  • 16
    • 0036862934
    • Bees in two-armed bandit situations: Foraging choices and possible decision mechanisms
    • Nov.-Dec
    • T. Keasar, E. Rashkovich, D. Cohen, and A. Shmida, "Bees in two-armed bandit situations: Foraging choices and possible decision mechanisms," Behav. Ecol., vol. 13, pp. 757-765, Nov.-Dec. 2002.
    • (2002) Behav. Ecol , vol.13 , pp. 757-765
    • Keasar, T.1    Rashkovich, E.2    Cohen, D.3    Shmida, A.4
  • 17
    • 2642665866
    • Probability-learning and habit-reversal in the cockroach
    • N. Longo, "Probability-learning and habit-reversal in the cockroach," Amer. J. Psychol., vol. 77, pp. 29-41, 1964.
    • (1964) Amer. J. Psychol , vol.77 , pp. 29-41
    • Longo, N.1
  • 18
    • 0036972336
    • Evolution of reinforcement learning in uncertain environments: A simple explanation for complex foraging behaviors
    • Y. Niv, D. Joel, I. Meilijson, and E. Ruppin, "Evolution of reinforcement learning in uncertain environments: A simple explanation for complex foraging behaviors," Adapt. Behav., vol. 10, pp. 5-24, 2002.
    • (2002) Adapt. Behav , vol.10 , pp. 5-24
    • Niv, Y.1    Joel, D.2    Meilijson, I.3    Ruppin, E.4
  • 19
    • 0003301122
    • Probability-matching in the fish
    • E. R. Behrend and M. E. Bitterman, "Probability-matching in the fish," Amer. J. Psychol., vol. 74, pp. 542-551, 1961.
    • (1961) Amer. J. Psychol , vol.74 , pp. 542-551
    • Behrend, E.R.1    Bitterman, M.E.2
  • 20
    • 0003368011
    • Probability-learning by the turtle
    • K. L. Kirk and M. E. Bitterman, "Probability-learning by the turtle," Science, vol. 148, pp. 1484-1485, 1965.
    • (1965) Science , vol.148 , pp. 1484-1485
    • Kirk, K.L.1    Bitterman, M.E.2
  • 21
    • 7344222429
    • Further experiments on probability-matching in the pigeon
    • V. Graf, D. H. Bullock, and M. E. Bitterman, "Further experiments on probability-matching in the pigeon," J. Exp. Anal. Behav., vol. 7, pp. 151-157, 1964.
    • (1964) J. Exp. Anal. Behav , vol.7 , pp. 151-157
    • Graf, V.1    Bullock, D.H.2    Bitterman, M.E.3
  • 22
    • 0001493642
    • Analysis of a verbal conditioning situation in terms of statistical learning theory
    • W. K. Estes and J. H. Straughan, "Analysis of a verbal conditioning situation in terms of statistical learning theory," J. Exp. Psychol., vol. 47, pp. 225-234, 1954.
    • (1954) J. Exp. Psychol , vol.47 , pp. 225-234
    • Estes, W.K.1    Straughan, J.H.2
  • 23
    • 0021834552
    • A new approach to the design of reinforcement schemes for learning automata
    • Feb
    • M. A. L. Thathachar and P. S. Sastry, "A new approach to the design of reinforcement schemes for learning automata," IEEE Trans. Syst. Man Cybern., vol. SMC-15, no. 1, pp. 168-175, Feb. 1985.
    • (1985) IEEE Trans. Syst. Man Cybern , vol.SMC-15 , Issue.1 , pp. 168-175
    • Thathachar, M.A.L.1    Sastry, P.S.2
  • 25
    • 0010790615
    • Autonomous processing in PDP networks
    • M. R. W. Dawson and D. P. Schopflocher, "Autonomous processing in PDP networks," Philosoph. Psychol., vol. 5, pp. 199-219, 1992.
    • (1992) Philosoph. Psychol , vol.5 , pp. 199-219
    • Dawson, M.R.W.1    Schopflocher, D.P.2
  • 26
    • 4444227991
    • Computational model of selection by consequences
    • May
    • J. J. McDowell, "Computational model of selection by consequences," J. Exp. Anal. Behav., vol. 81, pp. 297-317, May 2004.
    • (2004) J. Exp. Anal. Behav , vol.81 , pp. 297-317
    • McDowell, J.J.1
  • 27
    • 33646035835
    • The quantitative law of effect is a robust emergent property of an evolutionary algorithm for reinforcement learning
    • Cambridge, MA: MIT Press
    • J. J. McDowell and Z. Ansari, "The quantitative law of effect is a robust emergent property of an evolutionary algorithm for reinforcement learning," in Advances in Artificial Life. Cambridge, MA: MIT Press, 2005, vol. 3630, pp. 413-422.
    • (2005) Advances in Artificial Life , vol.3630 , pp. 413-422
    • McDowell, J.J.1    Ansari, Z.2
  • 28
    • 34247149281
    • Undermatching is an emergent property of selection by consequences
    • Jun
    • J. J. McDowell and M. L. Caron, "Undermatching is an emergent property of selection by consequences," Behav. Processes, vol. 75, pp. 97-106, Jun. 2007.
    • (2007) Behav. Processes , vol.75 , pp. 97-106
    • McDowell, J.J.1    Caron, M.L.2
  • 29
    • 33750228991
    • A computational theory of adaptive behavior based on an evolutionary reinforcement mechanism
    • M. Keijzer, Ed. New York
    • J. J. McDowell, P. L. Soto, J. Dallery, and S. Kulubekova, "A computational theory of adaptive behavior based on an evolutionary reinforcement mechanism," in Proc. Conf. Genetic Evol. Comput., M. Keijzer, Ed., New York, 2006, pp. 175-182.
    • (2006) Proc. Conf. Genetic Evol. Comput , pp. 175-182
  • 30
    • 0024614579
    • Evolution, selection and cognition: From "learning" to parameter setting in biology and in the study of language
    • M. Piattelli-Palmarini, "Evolution, selection and cognition: From 'learning' to parameter setting in biology and in the study of language," Cognition, vol. 31, pp. 1-44, 1989.
    • (1989) Cognition , vol.31 , pp. 1-44
    • Piattelli-Palmarini, M.1
  • 32
    • 68949203640
    • Connectionist selectionism: A case study of parity
    • R. B. T. Lowry and M. R. W. Dawson, "Connectionist selectionism: A case study of parity," Neural Inf. Process. - Lett. Rev., vol. 9, pp. 59-67, 2005.
    • (2005) Neural Inf. Process. - Lett. Rev , vol.9 , pp. 59-67
    • Lowry, R.B.T.1    Dawson, M.R.W.2
  • 33
    • 0001594484
    • Derivatives of matching
    • R. J. Herrnstein, "Derivatives of matching," Psychol. Rev., vol. 86, pp. 486-495, 1979.
    • (1979) Psychol. Rev , vol.86 , pp. 486-495
    • Herrnstein, R.J.1
  • 34
    • 0019537951
    • Toward a modern theory of adaptive networks: Expectation and prediction
    • R. S. Sutton and A. G. Barto, "Toward a modern theory of adaptive networks: Expectation and prediction," Psychol. Rev., vol. 88, pp. 135-170, 1981.
    • (1981) Psychol. Rev , vol.88 , pp. 135-170
    • Sutton, R.S.1    Barto, A.G.2
  • 36
    • 0038172310
    • Equilibria of the Rescorla-Wagner model
    • Apr
    • D. Danks, "Equilibria of the Rescorla-Wagner model," J. Math. Psychol., vol. 47, pp. 109-121, Apr. 2003.
    • (2003) J. Math. Psychol , vol.47 , pp. 109-121
    • Danks, D.1
  • 38
    • 0001304384
    • On two types of deviation from the matching law: Bias and undermatching
    • W. M. Baum, "On two types of deviation from the matching law: Bias and undermatching," J. Exp. Anal. Behav., vol. 22, pp. 231-242, 1974.
    • (1974) J. Exp. Anal. Behav , vol.22 , pp. 231-242
    • Baum, W.M.1
  • 39
    • 21244442335
    • Is there a geometric module for spatial orientation? Squaring theory and evidence
    • Feb
    • K. Cheng and N. S. Newcombe, "Is there a geometric module for spatial orientation? Squaring theory and evidence," Psychonom. Bull. Rev., vol. 12, pp. 1-23, Feb. 2005.
    • (2005) Psychonom. Bull. Rev , vol.12 , pp. 1-23
    • Cheng, K.1    Newcombe, N.S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.