Volume 32, Issue 1, 2003, Pages 48-77

The nonstochastic multiarmed bandit problem

Author keywords

Adversarial bandit problem; Unknown matrix games

Indexed keywords

ALGORITHMS; GAME THEORY; MATRIX ALGEBRA; THEOREM PROVING;

EID: 0037709910     PISSN: 00975397     EISSN: None     Source Type: Journal    
DOI: 10.1137/S0097539701398375     Document Type: Article
Times cited: 2459

References (18)
  • 4
    • 84889281816
    • Elements of information theory
    • John Wiley, New York
    • T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley, New York, 1991.
    • (1991)
    • Cover, T.M.1    Thomas, J.A.2
  • 5
    • 0002476325
    • Regret in the on-line decision problem
    • D. P. Foster and R. Vohra, Regret in the on-line decision problem, Games Econom. Behav., 29 (1999), pp. 7-36.
    • (1999) Games Econom. Behav. , vol.29 , pp. 7-36
    • Foster, D.P.1    Vohra, R.2
  • 6
    • 0031211090
    • A decision-theoretic generalization of on-line learning and an application to boosting
    • Y. Freund and R. E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, J. Comput. System Sci., 55 (1997) pp. 119-139.
    • (1997) J. Comput. System Sci. , vol.55 , pp. 119-139
    • Freund, Y.1    Schapire, R.E.2
  • 7
    • 0002267135
    • Adaptive game playing using multiplicative weights
    • Y. Freund and R. E. Schapire, Adaptive game playing using multiplicative weights, Games Econom. Behav., 29 (1999), pp. 79-103.
    • (1999) Games Econom. Behav. , vol.29 , pp. 79-103
    • Freund, Y.1    Schapire, R.E.2
  • 8
    • 0000668347
    • Consistency and cautious fictitious play
    • D. Fudenberg and D. K. Levine, Consistency and cautious fictitious play, J. Econom. Dynam. Control, 19 (1995), pp. 1065-1089.
    • (1995) J. Econom. Dynam. Control , vol.19 , pp. 1065-1089
    • Fudenberg, D.1    Levine, D.K.2
  • 9
    • 84891584370
    • Multi-armed bandit allocation indices
    • John Wiley, Chichester, UK
    • J. C. Gittins, Multi-armed Bandit Allocation Indices, John Wiley, Chichester, UK, 1989.
    • (1989)
    • Gittins, J.C.1
  • 10
    • 0001976283
    • Approximation to Bayes risk in repeated play
    • M. Dresher, A. W. Tucker, and P. Wolfe, eds., Princeton University Press, Princeton, NJ
    • J. Hannan, Approximation to Bayes risk in repeated play, in Contributions to the Theory of Games, Vol. III, M. Dresher, A. W. Tucker, and P. Wolfe, eds., Princeton University Press, Princeton, NJ, 1957, pp. 97-139.
    • (1957) Contributions to the Theory of Games , vol.3 , pp. 97-139
    • Hannan, J.1
  • 11
    • 0000908510
    • A simple procedure leading to correlated equilibrium
    • S. Hart and A. Mas-Colell, A simple procedure leading to correlated equilibrium, Econometrica, 68 (2000), pp. 1127-1150.
    • (2000) Econometrica , vol.68 , pp. 1127-1150
    • Hart, S.1    Mas-Colell, A.2
  • 12
    • 0013327463
    • A general class of adaptive strategies
    • S. Hart and A. Mas-Colell, A general class of adaptive strategies, J. Econom. Theory, 98 (2001), pp. 26-54.
    • (2001) J. Econom. Theory , vol.98 , pp. 26-54
    • Hart, S.1    Mas-Colell, A.2
  • 13
    • 0028531055
    • Multi-armed bandit problem revisited
    • T. Ishikida and P. Varaiya, Multi-armed bandit problem revisited, J. Optim. Theory Appl., 83 (1994), pp. 113-154.
    • (1994) J. Optim. Theory Appl. , vol.83 , pp. 113-154
    • Ishikida, T.1    Varaiya, P.2
  • 14
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T. L. Lai and H. Robbins, Asymptotically efficient adaptive allocation rules, Adv. in Appl. Math., 6 (1985), pp. 4-22.
    • (1985) Adv. in Appl. Math. , vol.6 , pp. 4-22
    • Lai, T.L.1    Robbins, H.2
  • 16
    • 0038675791
    • On repeated games with incomplete information played by non-Bayesian players
    • N. Megiddo, On repeated games with incomplete information played by non-Bayesian players, International J. Game Theory, 9 (1980), pp. 157-167.
    • (1980) International J. Game Theory , vol.9 , pp. 157-167
    • Megiddo, N.1
  • 17
    • 84966203785
    • Some aspects of the sequential design of experiments
    • H. Robbins, Some aspects of the sequential design of experiments, Bull. Amer. Math. Soc., 55 (1952), pp. 527-535.
    • (1952) Bull. Amer. Math. Soc. , vol.55 , pp. 527-535
    • Robbins, H.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.