SCOPUS Information Retrieval Platform

IEEE Transactions on Automation Science and Engineering

Volume 13, Issue 1, 2016, Pages 54-67

Parameter estimation in softmax decision-making models with linear objective functions

(2) Reverdy, Paul a; Leonard, Naomi Ehrich b

a University of Pennsylvania (United States)

b Princeton University (United States)

Author keywords

Automation; Decision making; Estimation

Indexed keywords

AUTOMATION; DECISION MAKING; ESTIMATION; MAXIMUM LIKELIHOOD; MAXIMUM LIKELIHOOD ESTIMATION; STOCHASTIC MODELS; STOCHASTIC SYSTEMS;

ASYMPTOTIC DISTRIBUTIONS; DECISION MAKING MODELS; HUMAN DECISION MAKING; LIKELIHOOD FUNCTIONS; LINEAR OBJECTIVE FUNCTIONS; NONLINEAR OBJECTIVE FUNCTIONS; PARAMETER ESTIMATION PROBLEMS; STATISTICALLY SIGNIFICANT DIFFERENCE;

PARAMETER ESTIMATION;

EID: 85006480147 PISSN: 15455955 EISSN: None Source Type: Journal
DOI: 10.1109/TASE.2015.2499244 Document Type: Article

Times cited : (62)

References (38)

1
- 0040528764
- Multinomial logistic regression algorithm
- D. Böhning, "Multinomial logistic regression algorithm," Ann. Inst. Statist. Math., vol. 44, no. 1, pp. 197-200, 1992.
- (1992) Ann. Inst. Statist. Math. , vol.44 , Issue.1 , pp. 197-200
- Böhning, D.¹

2
- 19944366594
- The convergence of a class of double-rank minimization algorithms
- C. G. Broyden, "The convergence of a class of double-rank minimization algorithms," IMA J. Appl. Math., vol. 6, no. 1, pp. 76-90, 1970.
- (1970) IMA J. Appl. Math. , vol.6 , Issue.1 , pp. 76-90
- Broyden, C.G.¹

3
- 79960392344
- Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality, data?
- M. Buhrmester, T. Kwang, and S. D. Gosling, "Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality, data?," Perspectives on Psychological Sci., vol. 6, no. 1, pp. 3-5, 2011.
- (2011) Perspectives On Psychological Sci. , vol.6 , Issue.1 , pp. 3-5
- Buhrmester, M.¹ Kwang, T.² Gosling, S.D.³

4
- 34250348767
- Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration
- J. D. Cohen, S. M. McClure, and A. J. Yu, "Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration," Philosoph. Trans. Roy. Soc. B: Bio. Sci., vol. 362, no. 1481, pp. 933-942, 2007.
- (2007) Philosoph. Trans. Roy. Soc. B: Bio. Sci. , vol.362 , Issue.1481 , pp. 933-942
- Cohen, J.D.¹ McClure, S.M.² Yu, A.J.³

5
- 84904322113
- Training attention improves decision making in individuals with elevated self-reported depressive symptoms
- June
- J. A. Cooper, M. A. Gorlick, T. Denny, D. A. Worthy, C. G. Beevers, and W. T. Maddox, "Training attention improves decision making in individuals with elevated self-reported depressive symptoms," Cognitive, Affective Behavioral Neurosci., vol. 14, no. 2, pp. 729-741, June, 2014.
- (2014) Cognitive, Affective Behavioral Neurosci. , vol.14 , Issue.2 , pp. 729-741
- Cooper, J.A.¹ Gorlick, M.A.² Denny, T.³ Worthy, D.A.⁴ Beevers, C.G.⁵ Maddox, W.T.⁶

6
- 84921627984
- Trial-by-trial data analysis using computational models
- N. D. Daw, "Trial-by-trial data analysis using computational models," Decision Making, Affect, and Learning: Attention and Performance XXIII, vol. 23, pp. 3-38, 2011.
- (2011) Decision Making, Affect, and Learning: Attention and Performance XXIII , vol.23 , pp. 3-38
- Daw, N.D.¹

7
- 33745223257
- Cortical substrates for exploratory decisions in humans
- N. D. Daw, J. P. O'Doherty, P. Dayan, B. Seymour, and R. J. Dolan, "Cortical substrates for exploratory decisions in humans," Nature, vol. 441, no. 7095, pp. 876-879, 2006.
- (2006) Nature , vol.441 , Issue.7095 , pp. 876-879
- Daw, N.D.¹ O'Doherty, J.P.² Dayan, P.³ Seymour, B.⁴ Dolan, R.J.⁵

8
- 0014825610
- A new approach to variable metric algorithms
- R. Fletcher, "A new approach to variable metric algorithms," The Comput. J., vol. 13, no. 3, pp. 317-322, 1970.
- (1970) The Comput. J. , vol.13 , Issue.3 , pp. 317-322
- Fletcher, R.¹

9
- 80053262704
- Softmax-margin training for structured log-linear models
- Pittsburgh, PA, USA, Tech. Rep. CMU-LTI-10-008
- K. Gimpel and N. A. Smith, "Softmax-margin training for structured log-linear models," Carnegie Mellon Univ., Pittsburgh, PA, USA, Tech. Rep. CMU-LTI-10-008, 2010.
- (2010) Carnegie Mellon Univ.
- Gimpel, K.¹ Smith, N.A.²

10
- 77953260848
- States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
- J. Gläscher, N. Daw, P. Dayan, and J. P. O'Doherty, "States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning," Neuron, vol. 66, no. 4, pp. 585-595, 2010.
- (2010) Neuron , vol.66 , Issue.4 , pp. 585-595
- Gläscher, J.¹ Daw, N.² Dayan, P.³ O'Doherty, J.P.⁴

11
- 0004097527
- Cambridge, MA, USA: Harvard Univ. Press
- A. S. Goldberger, A Course in Econometrics. Cambridge, MA, USA: Harvard Univ. Press, 1991.
- (1991) A Course in Econometrics.
- Goldberger, A.S.¹

12
- 84966251980
- A family of variable-metric methods derived by variational means
- D. Goldfarb, "A family of variable-metric methods derived by variational means," Math. Comput., vol. 24, no. 109, pp. 23-26, 1970.
- (1970) Math. Comput. , vol.24 , Issue.109 , pp. 23-26
- Goldfarb, D.¹

13
- 84954519509
- On Bayesian upper confidence bounds for bandit problems
- La Palma, Canary Islands, Spain, Apr.
- E. Kaufmann, O. Cappé, and A. Garivier, "On Bayesian upper confidence bounds for bandit problems," in Proc. Int. Conf. Artif. Intell. Statist., La Palma, Canary Islands, Spain, Apr. 2012, pp. 592-600.
- (2012) Proc. Int. Conf. Artif. Intell. Statist. , pp. 592-600
- Kaufmann, E.¹ Cappé, O.² Garivier, A.³

14
- 0003837293
- Englewood Cliffs, NJ, USA: Prentice-Hall
- S. Kay, Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory. Englewood Cliffs, NJ, USA: Prentice-Hall, 1993.
- (1993) Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory.
- Kay, S.¹

15
- 21244437589
- Sparse multinomial logistic regression: Fast algorithms and generalization bounds
- Jun.
- B. Krishnapuram, L. Carin, M. A. T. Figueiredo, and A. J. Hartemink, "Sparse multinomial logistic regression: Fast algorithms and generalization bounds," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 6, pp. 957-968, Jun. 2005.
- (2005) IEEE Trans. Pattern Anal. Mach. Intell. , vol.27 , Issue.6 , pp. 957-968
- Krishnapuram, B.¹ Carin, L.² Figueiredo, M.A.T.³ Hartemink, A.J.⁴

16
- 33644782012
- Dynamic response-by-response models of matching behavior in rhesus monkeys
- Nov.
- B. Lau and P. W. Glimcher, "Dynamic response-by-response models of matching behavior in rhesus monkeys," J. Experimental Anal. Behavior, vol. 84, no. 3, pp. 555-579, Nov. 2005.
- (2005) J. Experimental Anal. Behavior , vol.84 , Issue.3 , pp. 555-579
- Lau, B.¹ Glimcher, P.W.²

17
- 0002297105
- Conditional logit analysis of qualitative choice behavior
- P. Zarembka, Ed. New York: Academic Press
- D. McFadden, "Conditional logit analysis of qualitative choice behavior," in Frontiers Econometrics, P. Zarembka, Ed. New York: Academic Press, 1974, pp. 105-142.
- (1974) Frontiers Econometrics , pp. 105-142
- McFadden, D.¹

18
- 0001345363
- Convergence and finite-time behavior of simulated annealing
- D. Mitra, F. Romeo, and A. Sangiovanni-Vincentelli, "Convergence and finite-time behavior of simulated annealing," Adv. Appl. Probability, vol. 18, no. 3, pp. 747-771, 1986.
- (1986) Adv. Appl. Probability , vol.18 , Issue.3 , pp. 747-771
- Mitra, D.¹ Romeo, F.² Sangiovanni-Vincentelli, A.³

19
- 33748337293
- Imaging valuation models in human choice
- P. R. Montague, B. King-Casas, and J. D. Cohen, "Imaging valuation models in human choice," Annu. Rev. Neurosci., vol. 29, pp. 417-448, 2006.
- (2006) Annu. Rev. Neurosci. , vol.29 , pp. 417-448
- Montague, P.R.¹ King-Casas, B.² Cohen, J.D.³

20
- 84876943382
- A healthy fear of the unknown: Perspectives on the interpretation of parameter fits from computational models in neuroscience
- M. R. Nassar and J. I. Gold, "A healthy fear of the unknown: Perspectives on the interpretation of parameter fits from computational models in neuroscience," PLoS Comput. Bio., vol. 9, no. 4, 2013, e1003015.
- (2013) PLoS Comput. Bio. , vol.9 , Issue.4 , pp. e1003015
- Nassar, M.R.¹ Gold, J.I.²

21
- 84857313572
- A decision task in a social context: Human experiments, models, and analyses of behavioral data
- A. Nedic, D. Tomlin, P. Holmes, D. A. Prentice, and J. D. Cohen, "A decision task in a social context: Human experiments, models, and analyses of behavioral data," Proc. IEEE, vol. 100, no. 3, pp. 713-733, 2012.
- (2012) Proc. IEEE , vol.100 , Issue.3 , pp. 713-733
- Nedic, A.¹ Tomlin, D.² Holmes, P.³ Prentice, D.A.⁴ Cohen, J.D.⁵

22
- 70350096085
- Large sample estimation and hypothesis testing
- R. F. Engle and D. L. McFadden, Eds. Philadelphia, PA, USA: Elsevier, ch. 36
- W. K. Newey and D. McFadden, "Large sample estimation and hypothesis testing," in Handbook of Econometrics, R. F. Engle and D. L. McFadden, Eds. Philadelphia, PA, USA: Elsevier, 1994, vol. 4, ch. 36, pp. 2111-2245.
- (1994) Handbook of Econometrics , vol.4 , pp. 2111-2245
- Newey, W.K.¹ McFadden, D.²

23
- 0042547347
- Algorithms for inverse reinforcement learning
- A. Y. Ng and S. J. Russell, "Algorithms for inverse reinforcement learning," in Proc. Int. Conf. Mach. Learn., 2000, pp. 663-670.
- (2000) Proc. Int. Conf. Mach. Learn. , pp. 663-670
- Ng, A.Y.¹ Russell, S.J.²

24
- 80053157575
- Estimation of maximum-likelihood discrete-choice models of the runway configuration selection process
- V. Ramanujam and H. Balakrishnan, "Estimation of maximum-likelihood discrete-choice models of the runway configuration selection process," in Proc. Amer. Control Conf., 2011, pp. 2160-2167.
- (2011) Proc. Amer. Control Conf. , pp. 2160-2167
- Ramanujam, V.¹ Balakrishnan, H.²

25
- 84992120348
- Ph.D. dissertation, Dept. Mech. Aerosp. Eng., Princeton Univ., Princeton, NJ, USA
- P. Reverdy, "Human-inspired algorithms for search: a framework for human-machine multi-armed bandit problems," Ph.D. dissertation, Dept. Mech. Aerosp. Eng., Princeton Univ., Princeton, NJ, USA, 2014.
- (2014) Human-inspired Algorithms for Search: A Framework for Human-machine Multi-armed Bandit Problems
- Reverdy, P.¹

26
- 84897532572
- Modeling human decision-making in generalized Gaussian multi-armed bandits
- P. Reverdy, V. Srivastava, and N. E. Leonard, "Modeling human decision-making in generalized Gaussian multi-armed bandits," Proc. IEEE, vol. 102, no. 4, pp. 544-571, 2014.
- (2014) Proc. IEEE , vol.102 , Issue.4 , pp. 544-571
- Reverdy, P.¹ Srivastava, V.² Leonard, N.E.³

27
- 84966203785
- Some aspects of the sequential design of experiments
- H. Robbins, "Some aspects of the sequential design of experiments," Bull. Amer. Math. Soc., vol. 58, pp. 527-535, 1952.
- (1952) Bull. Amer. Math. Soc. , vol.58 , pp. 527-535
- Robbins, H.¹

28
- 0031640746
- Learning agents for uncertain environments
- S. Russell, "Learning agents for uncertain environments," in Proc. 11th ACM Annu. Conf. Comput. Learn. Theory, 1998, pp. 101-103.
- (1998) Proc. 11th ACM Annu. Conf. Comput. Learn. Theory , pp. 101-103
- Russell, S.¹

29
- 28144449057
- Representation of action-specific reward values in the striatum
- K. Samejima, Y. Ueda, K. Doya, and M. Kimura, "Representation of action-specific reward values in the striatum," Science, vol. 310, no. 5752, pp. 1337-1340, 2005.
- (2005) Science , vol.310 , Issue.5752 , pp. 1337-1340
- Samejima, K.¹ Ueda, Y.² Doya, K.³ Kimura, M.⁴

30
- 84968497764
- Conditioning of quasi-Newton methods for function minimization
- D. F. Shanno, "Conditioning of quasi-Newton methods for function minimization," Math. Comput., vol. 24, no. 111, pp. 647-656, 1970.
- (1970) Math. Comput. , vol.24 , Issue.111 , pp. 647-656
- Shanno, D.F.¹

31
- 85040622473
- arXiv:1507.01160v2
- V. Srivastava, P. Reverdy, and N. E. Leonard, "Correlated multiarmed bandit problem: Bayesian algorithms and regret analysis," arXiv:1507.01160v2, 2015.
- (2015) Correlated Multiarmed Bandit Problem: Bayesian Algorithms and Regret Analysis
- Srivastava, V.¹ Reverdy, P.² Leonard, N.E.³

32
- 84857287538
- Towards human-robot teams: Model-based analysis of human decision making in two-alternative choice tasks with social feedback
- A. R. Stewart, M. Cao, A. Nedic, D. Tomlin, and N. E. Leonard, "Towards human-robot teams: Model-based analysis of human decision making in two-alternative choice tasks with social feedback," Proc. IEEE, vol. 100, no. 3, pp. 751-775, 2012.
- (2012) Proc. IEEE , vol.100 , Issue.3 , pp. 751-775
- Stewart, A.R.¹ Cao, M.² Nedic, A.³ Tomlin, D.⁴ Leonard, N.E.⁵

33
- 0003420416
- Cambridge, MA, USA: MIT Press
- R. S. Sutton and A. G. Barto, Introduction to Reinforcement Learning. Cambridge, MA, USA: MIT Press, 1998.
- (1998) Introduction to Reinforcement Learning.
- Sutton, R.S.¹ Barto, A.G.²

34
- 85040624016
- The Mathworks, Inc., [Online].
- The Mathworks, Inc., Fminunc 2015. [Online]. Available: http://www.mathworks.com/help/optim/ug/fminunc.html
- (2015) Fminunc

35
- 34249833101
- Q-learning
- C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, no. 3-4, pp. 279-292, 1992.
- (1992) Mach. Learn. , vol.8 , Issue.3-4 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

36
- 84870267223
- The generalization of "Student's" problem when several different population variances are involved
- B. L. Welch, "The generalization of "Student's" problem when several different population variances are involved," Biometrika, vol. 34, no. 1-2, pp. 28-35, 1947.
- (1947) Biometrika , vol.34 , Issue.1-2 , pp. 28-35
- Welch, B.L.¹

37
- 84925600345
- Humans use directed and random exploration to solve the explore-exploit dilemma
- R. C. Wilson, A. Geana, J. M. White, E. A. Ludvig, and J. D. Cohen, "Humans use directed and random exploration to solve the explore-exploit dilemma," J. Experimental Psychology: Gen., vol. 143, no. 6, pp. 2074-2081, 2014.
- (2014) J. Experimental Psychology: Gen. , vol.143 , Issue.6 , pp. 2074-2081
- Wilson, R.C.¹ Geana, A.² White, J.M.³ Ludvig, E.A.⁴ Cohen, J.D.⁵

38
- 85040543614
- Is model fitting necessary for model-based fMRI?
- R. C. Wilson and Y. Niv, "Is model fitting necessary for model-based fMRI?," in Proc. Multi-Disciplinary Conf. Reinforcement Learn. Decision Making, 2013, p. S41.
- (2013) Proc. Multi-Disciplinary Conf. Reinforcement Learn. Decision Making , p. S41
- Wilson, R.C.¹ Niv, Y.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.