



Volume , Issue , 2010, Pages 67-79

An asymptotically optimal bandit algorithm for bounded support models

Author keywords

[No Author keywords available]

Indexed keywords

ASYMPTOTIC BOUNDS; ASYMPTOTICALLY OPTIMAL; BANDIT PROBLEMS; BOUNDED SUPPORT; CONVEX OPTIMIZATION TECHNIQUES; EXPLORATION AND EXPLOITATION; MULTI-ARMED BANDIT PROBLEM; MULTIPLE ARMS

EID: 84898077171     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 122

References (21)
  • 1
    • 0345224411
    • The continuum-armed bandit problem
    • Agrawal, R. (1995). The continuum-armed bandit problem. SIAM J. Control Optim., 33, 1926-1951.
    • (1995) SIAM J. Control Optim., vol. 33, pp. 1926-1951
    • Agrawal, R.
  • 2
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47, 235-256.
    • (2002) Machine Learning, vol. 47, pp. 235-256
    • Auer, P.; Cesa-Bianchi, N.; Fischer, P.
  • 5
    • 0030159874
    • Optimal adaptive policies for sequential allocation problems
    • Burnetas, A. N., & Katehakis, M. N. (1996). Optimal adaptive policies for sequential allocation problems. Adv. Appl. Math., 17, 122-142.
    • (1996) Adv. Appl. Math., vol. 17, pp. 122-142
    • Burnetas, A.N.; Katehakis, M.N.
  • 7
    • 84937398609
    • PAC bounds for multi-armed bandit and Markov decision processes
    • London, UK: Springer-Verlag
    • Even-Dar, E., Mannor, S., & Mansour, Y. (2002). PAC bounds for multi-armed bandit and Markov decision processes. Proceedings of COLT 2002 (pp. 255-270). London, UK: Springer-Verlag.
    • (2002) Proceedings of COLT 2002, pp. 255-270
    • Even-Dar, E.; Mannor, S.; Mansour, Y.
  • 9
    • 84891584370
    • Wiley-Interscience Series in Systems and Optimization. Chichester: John Wiley & Sons Ltd. With a foreword by Peter Whittle
    • Gittins, J. C. (1989). Multi-armed bandit allocation indices. Wiley-Interscience Series in Systems and Optimization. Chichester: John Wiley & Sons Ltd. With a foreword by Peter Whittle.
    • (1989) Multi-armed Bandit Allocation Indices
    • Gittins, J.C.
  • 10
    • 84898076934
    • An asymptotically optimal policy for finite support models in the multi-armed bandit problem
    • Submitted to Machine Learning, arXiv:0905.2776v3
    • Honda, J., & Takemura, A. (2010). An asymptotically optimal policy for finite support models in the multi-armed bandit problem. Submitted to Machine Learning, arXiv:0905.2776v3.
    • (2010) Machine Learning
    • Honda, J.; Takemura, A.
  • 11
    • 0028531055
    • Multi-armed bandit problem revisited
    • Ishikida, T., & Varaiya, P. (1994). Multi-armed bandit problem revisited. J. Optim. Theory Appl., 83, 113-154.
    • (1994) J. Optim. Theory Appl., vol. 83, pp. 113-154
    • Ishikida, T.; Varaiya, P.
  • 12
    • 84898981061
    • Nearly tight bounds for the continuum-armed bandit problem
    • MIT Press
    • Kleinberg, R. (2005). Nearly tight bounds for the continuum-armed bandit problem. Proceedings of NIPS 2005 (pp. 697-704). MIT Press.
    • (2005) Proceedings of NIPS 2005, pp. 697-704
    • Kleinberg, R.
  • 14
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • Lai, T. L., & Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6, 4-22.
    • (1985) Advances in Applied Mathematics, vol. 6, pp. 4-22
    • Lai, T.L.; Robbins, H.
  • 15
    • 0038673523
    • Probability; a survey of the mathematical theory
    • New York: John Wiley & Sons Ltd. Second edition
    • Lamperti, J. (1996). Probability; a survey of the mathematical theory. Wiley Series in Probability and Statistics. New York: John Wiley & Sons Ltd. Second edition.
    • (1996) Wiley Series in Probability and Statistics
    • Lamperti, J.
  • 16
    • 0032679082
    • Exploration of multi-state environments: Local measures and back-propagation of uncertainty
    • Meuleau, N., & Bourgine, P. (1999). Exploration of multi-state environments: Local measures and back-propagation of uncertainty. Machine Learning, 35, 117-154.
    • (1999) Machine Learning, vol. 35, pp. 117-154
    • Meuleau, N.; Bourgine, P.
  • 19
    • 14344258433
    • A Bayesian framework for reinforcement learning
    • Morgan Kaufmann, San Francisco, CA
    • Strens, M. (2000). A Bayesian framework for reinforcement learning. Proceedings of ICML 2000 (pp. 943-950). Morgan Kaufmann, San Francisco, CA.
    • (2000) Proceedings of ICML 2000, pp. 943-950
    • Strens, M.
  • 20
    • 33646406807
    • Multi-armed bandit algorithms and empirical evaluation
    • Porto, Portugal: Springer
    • Vermorel, J., & Mohri, M. (2005). Multi-armed bandit algorithms and empirical evaluation. Proceedings of ECML 2005 (pp. 437-448). Porto, Portugal: Springer.
    • (2005) Proceedings of ECML 2005, pp. 437-448
    • Vermorel, J.; Mohri, M.
  • 21
    • 0000607073
    • Nonparametric bandit methods
    • Yakowitz, S., & Lowe, W. (1991). Nonparametric bandit methods. Ann. Oper. Res., 28, 297-312.
    • (1991) Ann. Oper. Res., vol. 28, pp. 297-312
    • Yakowitz, S.; Lowe, W.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.