



Volume 13, Issue 1, 2016, Pages 54-67

Parameter estimation in softmax decision-making models with linear objective functions

Author keywords

Automation; Decision making; Estimation

Indexed keywords

AUTOMATION; DECISION MAKING; ESTIMATION; MAXIMUM LIKELIHOOD; MAXIMUM LIKELIHOOD ESTIMATION; STOCHASTIC MODELS; STOCHASTIC SYSTEMS;

EID: 85006480147     PISSN: 15455955     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASE.2015.2499244     Document Type: Article
Times cited : (62)

References (38)
  • 1
    • 0040528764
    • Multinomial logistic regression algorithm
    • D. Böhning, "Multinomial logistic regression algorithm," Ann. Inst. Statist. Math., vol. 44, no. 1, pp. 197-200, 1992.
    • (1992) Ann. Inst. Statist. Math. , vol.44 , Issue.1 , pp. 197-200
    • Böhning, D.1
  • 2
    • 19944366594
    • The convergence of a class of double-rank minimization algorithms
    • C. G. Broyden, "The convergence of a class of double-rank minimization algorithms," IMA J. Appl. Math., vol. 6, no. 1, pp. 76-90, 1970.
    • (1970) IMA J. Appl. Math. , vol.6 , Issue.1 , pp. 76-90
    • Broyden, C.G.1
  • 3
    • 79960392344
    • Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality, data?
    • M. Buhrmester, T. Kwang, and S. D. Gosling, "Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality, data?," Perspectives on Psychological Sci., vol. 6, no. 1, pp. 3-5, 2011.
    • (2011) Perspectives On Psychological Sci. , vol.6 , Issue.1 , pp. 3-5
    • Buhrmester, M.1    Kwang, T.2    Gosling, S.D.3
  • 4
    • 34250348767
    • Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration
    • J. D. Cohen, S. M. McClure, and A. J. Yu, "Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration," Philos. Trans. Roy. Soc. B: Bio. Sci., vol. 362, no. 1481, pp. 933-942, 2007.
    • (2007) Philos. Trans. Roy. Soc. B: Bio. Sci. , vol.362 , Issue.1481 , pp. 933-942
    • Cohen, J.D.1    McClure, S.M.2    Yu, A.J.3
  • 5
    • 84904322113
    • Training attention improves decision making in individuals with elevated self-reported depressive symptoms
    • June
    • J. A. Cooper, M. A. Gorlick, T. Denny, D. A. Worthy, C. G. Beevers, and W. T. Maddox, "Training attention improves decision making in individuals with elevated self-reported depressive symptoms," Cognitive, Affective Behavioral Neurosci., vol. 14, no. 2, pp. 729-741, June, 2014.
    • (2014) Cognitive, Affective Behavioral Neurosci. , vol.14 , Issue.2 , pp. 729-741
    • Cooper, J.A.1    Gorlick, M.A.2    Denny, T.3    Worthy, D.A.4    Beevers, C.G.5    Maddox, W.T.6
  • 7
    • 33745223257
    • Cortical substrates for exploratory decisions in humans
    • N. D. Daw, J. P. O'Doherty, P. Dayan, B. Seymour, and R. J. Dolan, "Cortical substrates for exploratory decisions in humans," Nature, vol. 441, no. 7095, pp. 876-879, 2006.
    • (2006) Nature , vol.441 , Issue.7095 , pp. 876-879
    • Daw, N.D.1    O'Doherty, J.P.2    Dayan, P.3    Seymour, B.4    Dolan, R.J.5
  • 8
    • 0014825610
    • A new approach to variable metric algorithms
    • R. Fletcher, "A new approach to variable metric algorithms," The Comput. J., vol. 13, no. 3, pp. 317-322, 1970.
    • (1970) The Comput. J. , vol.13 , Issue.3 , pp. 317-322
    • Fletcher, R.1
  • 9
    • 80053262704
    • Softmax-margin training for structured log-linear models
    • Pittsburgh, PA, USA, Tech. Rep. CMU-LTI-10-008
    • K. Gimpel and N. A. Smith, "Softmax-margin training for structured log-linear models," Carnegie Mellon Univ., Pittsburgh, PA, USA, Tech. Rep. CMU-LTI-10-008, 2010.
    • (2010) Carnegie Mellon Univ.
    • Gimpel, K.1    Smith, N.A.2
  • 10
    • 77953260848
    • States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
    • J. Gläscher, N. Daw, P. Dayan, and J. P. O'Doherty, "States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning," Neuron, vol. 66, no. 4, pp. 585-595, 2010.
    • (2010) Neuron , vol.66 , Issue.4 , pp. 585-595
    • Gläscher, J.1    Daw, N.2    Dayan, P.3    O'Doherty, J.P.4
  • 12
    • 84966251980
    • A family of variable-metric methods derived by variational means
    • D. Goldfarb, "A family of variable-metric methods derived by variational means," Math. Comput., vol. 24, no. 109, pp. 23-26, 1970.
    • (1970) Math. Comput. , vol.24 , Issue.109 , pp. 23-26
    • Goldfarb, D.1
  • 13
    • 84954519509
    • On Bayesian upper confidence bounds for bandit problems
    • La Palma, Canary Islands, Spain, Apr.
    • E. Kaufmann, O. Cappé, and A. Garivier, "On Bayesian upper confidence bounds for bandit problems," in Proc. Int. Conf. Artif. Intell. Statist., La Palma, Canary Islands, Spain, Apr. 2012, pp. 592-600.
    • (2012) Proc. Int. Conf. Artif. Intell. Statist. , pp. 592-600
    • Kaufmann, E.1    Cappé, O.2    Garivier, A.3
  • 16
    • 33644782012
    • Dynamic response-by-response models of matching behavior in rhesus monkeys
    • Nov.
    • B. Lau and P. W. Glimcher, "Dynamic response-by-response models of matching behavior in rhesus monkeys," J. Experimental Anal. Behavior, vol. 84, no. 3, pp. 555-579, Nov. 2005.
    • (2005) J. Experimental Anal. Behavior , vol.84 , Issue.3 , pp. 555-579
    • Lau, B.1    Glimcher, P.W.2
  • 17
    • 0002297105
    • Conditional logit analysis of qualitative choice behavior
    • P. Zarembka, Ed. New York: Academic Press
    • D. McFadden, "Conditional logit analysis of qualitative choice behavior," in Frontiers Econometrics, P. Zarembka, Ed. New York: Academic Press, 1974, pp. 105-142.
    • (1974) Frontiers Econometrics , pp. 105-142
    • McFadden, D.1
  • 18
    • 0001345363
    • Convergence and finite-time behavior of simulated annealing
    • D. Mitra, F. Romeo, and A. Sangiovanni-Vincentelli, "Convergence and finite-time behavior of simulated annealing," Adv. Appl. Probability, vol. 18, no. 3, pp. 747-771, 1986.
    • (1986) Adv. Appl. Probability , vol.18 , Issue.3 , pp. 747-771
    • Mitra, D.1    Romeo, F.2    Sangiovanni-Vincentelli, A.3
  • 20
    • 84876943382
    • A healthy fear of the unknown: Perspectives on the interpretation of parameter fits from computational models in neuroscience
    • M. R. Nassar and J. I. Gold, "A healthy fear of the unknown: Perspectives on the interpretation of parameter fits from computational models in neuroscience," PLoS Comput. Bio., vol. 9, no. 4, 2013, e1003015.
    • (2013) PLoS Comput. Bio. , vol.9 , Issue.4 , pp. e1003015
    • Nassar, M.R.1    Gold, J.I.2
  • 21
    • 84857313572
    • A decision task in a social context: Human experiments, models, and analyses of behavioral data
    • A. Nedic, D. Tomlin, P. Holmes, D. A. Prentice, and J. D. Cohen, "A decision task in a social context: Human experiments, models, and analyses of behavioral data," Proc. IEEE, vol. 100, no. 3, pp. 713-733, 2012.
    • (2012) Proc. IEEE , vol.100 , Issue.3 , pp. 713-733
    • Nedic, A.1    Tomlin, D.2    Holmes, P.3    Prentice, D.A.4    Cohen, J.D.5
  • 22
    • 70350096085
    • Large sample estimation and hypothesis testing
    • R. F. Engle and D. L. McFadden, Eds. Philadelphia, PA, USA: Elsevier, ch. 36
    • W. K. Newey and D. McFadden, "Large sample estimation and hypothesis testing," in Handbook of Econometrics, R. F. Engle and D. L. McFadden, Eds. Philadelphia, PA, USA: Elsevier, 1994, vol. 4, ch. 36, pp. 2111-2245.
    • (1994) Handbook of Econometrics , vol.4 , pp. 2111-2245
    • Newey, W.K.1    McFadden, D.2
  • 23
    • 0042547347
    • Algorithms for inverse reinforcement learning
    • A. Y. Ng and S. J. Russell, "Algorithms for inverse reinforcement learning," in Proc. Int. Conf. Mach. Learn., 2000, pp. 663-670.
    • (2000) Proc. Int. Conf. Mach. Learn. , pp. 663-670
    • Ng, A.Y.1    Russell, S.J.2
  • 24
    • 80053157575
    • Estimation of maximum-likelihood discrete-choice models of the runway configuration selection process
    • V. Ramanujam and H. Balakrishnan, "Estimation of maximum-likelihood discrete-choice models of the runway configuration selection process," in Proc. Amer. Control Conf., 2011, pp. 2160-2167.
    • (2011) Proc. Amer. Control Conf. , pp. 2160-2167
    • Ramanujam, V.1    Balakrishnan, H.2
  • 26
    • 84897532572
    • Modeling human decision-making in generalized Gaussian multi-armed bandits
    • P. Reverdy, V. Srivastava, and N. E. Leonard, "Modeling human decision-making in generalized Gaussian multi-armed bandits," Proc. IEEE, vol. 102, no. 4, pp. 544-571, 2014.
    • (2014) Proc. IEEE , vol.102 , Issue.4 , pp. 544-571
    • Reverdy, P.1    Srivastava, V.2    Leonard, N.E.3
  • 27
    • 84966203785
    • Some aspects of the sequential design of experiments
    • H. Robbins, "Some aspects of the sequential design of experiments," Bull. Amer. Math. Soc., vol. 58, pp. 527-535, 1952.
    • (1952) Bull. Amer. Math. Soc. , vol.58 , pp. 527-535
    • Robbins, H.1
  • 29
    • 28144449057
    • Representation of action-specific reward values in the striatum
    • K. Samejima, Y. Ueda, K. Doya, and M. Kimura, "Representation of action-specific reward values in the striatum," Science, vol. 310, no. 5752, pp. 1337-1340, 2005.
    • (2005) Science , vol.310 , Issue.5752 , pp. 1337-1340
    • Samejima, K.1    Ueda, Y.2    Doya, K.3    Kimura, M.4
  • 30
    • 84968497764
    • Conditioning of quasi-Newton methods for function minimization
    • D. F. Shanno, "Conditioning of quasi-Newton methods for function minimization," Math. Comput., vol. 24, no. 111, pp. 647-656, 1970.
    • (1970) Math. Comput. , vol.24 , Issue.111 , pp. 647-656
    • Shanno, D.F.1
  • 32
    • 84857287538
    • Towards human-robot teams: Model-based analysis of human decision making in two-alternative choice tasks with social feedback
    • A. R. Stewart, M. Cao, A. Nedic, D. Tomlin, and N. E. Leonard, "Towards human-robot teams: Model-based analysis of human decision making in two-alternative choice tasks with social feedback," Proc. IEEE, vol. 100, no. 3, pp. 751-775, 2012.
    • (2012) Proc. IEEE , vol.100 , Issue.3 , pp. 751-775
    • Stewart, A.R.1    Cao, M.2    Nedic, A.3    Tomlin, D.4    Leonard, N.E.5
  • 34
    • 85040624016
    • The Mathworks, Inc., [Online].
    • The Mathworks, Inc., Fminunc 2015. [Online]. Available: http://www.mathworks.com/help/optim/ug/fminunc.html
    • (2015) Fminunc
  • 35
    • 34249833101
    • Q-learning
    • C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, no. 3-4, pp. 279-292, 1992.
    • (1992) Mach. Learn. , vol.8 , Issue.3-4 , pp. 279-292
    • Watkins, C.J.C.H.1    Dayan, P.2
  • 36
    • 84870267223
    • The generalization of "Student's" problem when several different population variances are involved
    • B. L. Welch, "The generalization of "Student's" problem when several different population variances are involved," Biometrika, vol. 34, no. 1-2, pp. 28-35, 1947.
    • (1947) Biometrika , vol.34 , Issue.1-2 , pp. 28-35
    • Welch, B.L.1
  • 37
    • 84925600345
    • Humans use directed and random exploration to solve the explore-exploit dilemma
    • R. C. Wilson, A. Geana, J. M. White, E. A. Ludvig, and J. D. Cohen, "Humans use directed and random exploration to solve the explore-exploit dilemma," J. Experimental Psychology: Gen., vol. 143, no. 6, pp. 2074-2081, 2014.
    • (2014) J. Experimental Psychology: Gen. , vol.143 , Issue.6 , pp. 2074-2081
    • Wilson, R.C.1    Geana, A.2    White, J.M.3    Ludvig, E.A.4    Cohen, J.D.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.