SCOPUS Information Search Platform

COLT 2010 - The 23rd Conference on Learning Theory

Volume , Issue , 2010, Pages 67-79

An asymptotically optimal bandit algorithm for bounded support models

Author keywords

[No Author keywords available]

Indexed keywords

ASYMPTOTIC BOUNDS; ASYMPTOTICALLY OPTIMAL; BANDIT PROBLEMS; BOUNDED SUPPORT; CONVEX OPTIMIZATION TECHNIQUES; EXPLORATION AND EXPLOITATION; MULTI-ARMED BANDIT PROBLEM; MULTIPLE ARMS; CONVEX OPTIMIZATION; REINFORCEMENT LEARNING; PROBABILITY

EID: 84898077171 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited: 122

References (21)

1
- 0345224411
- The continuum-armed bandit problem
- Agrawal, R. (1995). The continuum-armed bandit problem. SIAM J. Control Optim., 33, 1926-1951.
- (1995) SIAM J. Control Optim. , vol.33 , pp. 1926-1951
- Agrawal, R.¹

2
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47, 235-256.
- (2002) Machine Learning , vol.47 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

3
- 0037709910
- The nonstochastic multiarmed bandit problem
- Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (2003). The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32, 48-77.
- (2003) SIAM J. Comput. , vol.32 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.E.⁴

4
- 0004055894
- Cambridge University Press
- Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge University Press.
- (2004) Convex Optimization
- Boyd, S.¹ Vandenberghe, L.²

5
- 0030159874
- Optimal adaptive policies for sequential allocation problems
- Burnetas, A. N., & Katehakis, M. N. (1996). Optimal adaptive policies for sequential allocation problems. Adv. Appl. Math., 17, 122-142.
- (1996) Adv. Appl. Math. , vol.17 , pp. 122-142
- Burnetas, A.N.¹ Katehakis, M.N.²

6
- 0003836048
- New York: Springer-Verlag. Second edition
- Dembo, A., & Zeitouni, O. (1998). Large deviations techniques and applications, Vol. 38 of Applications of Mathematics. New York: Springer-Verlag. Second edition.
- (1998) Large Deviations Techniques and Applications, Vol. 38 of Applications of Mathematics
- Dembo, A.¹ Zeitouni, O.²

7
- 84937398609
- PAC bounds for multi-armed bandit and Markov decision processes
- London, UK: Springer-Verlag
- Even-Dar, E., Mannor, S., & Mansour, Y. (2002). PAC bounds for multi-armed bandit and Markov decision processes. Proceedings of COLT 2002 (pp. 255-270). London, UK: Springer-Verlag.
- (2002) Proceedings of COLT 2002 , pp. 255-270
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

8
- 0003814609
- New York: Academic Press
- Fiacco, A. V. (1983). Introduction to sensitivity and stability analysis in nonlinear programming. New York: Academic Press.
- (1983) Introduction to Sensitivity and Stability Analysis in Nonlinear Programming
- Fiacco, A.V.¹

9
- 84891584370
- Wiley-Interscience Series in Systems and Optimization. Chichester: John Wiley & Sons Ltd. With a foreword by Peter Whittle
- Gittins, J. C. (1989). Multi-armed bandit allocation indices. Wiley-Interscience Series in Systems and Optimization. Chichester: John Wiley & Sons Ltd. With a foreword by Peter Whittle.
- (1989) Multi-armed Bandit Allocation Indices
- Gittins, J.C.¹

10
- 84898076934
- An asymptotically optimal policy for finite support models in the multi-armed bandit problem
- Submitted to Machine Learning, arXiv:0905.2776v3
- Honda, J., & Takemura, A. (2010). An asymptotically optimal policy for finite support models in the multi-armed bandit problem. Submitted to Machine Learning, arXiv:0905.2776v3.
- (2010) Machine Learning
- Honda, J.¹ Takemura, A.²

11
- 0028531055
- Multi-armed bandit problem revisited
- Ishikida, T., & Varaiya, P. (1994). Multi-armed bandit problem revisited. J. Optim. Theory Appl., 83, 113-154.
- (1994) J. Optim. Theory Appl. , vol.83 , pp. 113-154
- Ishikida, T.¹ Varaiya, P.²

12
- 84898981061
- Nearly tight bounds for the continuum-armed bandit problem
- MIT Press
- Kleinberg, R. (2005). Nearly tight bounds for the continuum-armed bandit problem. Proceedings of NIPS 2005 (pp. 697-704). MIT Press.
- (2005) Proceedings of NIPS 2005 , pp. 697-704
- Kleinberg, R.¹

13
- 84862291603
- Regret bounds for sleeping experts and bandits
- Kleinberg, R. D., Niculescu-Mizil, A., & Sharma, Y. (2008). Regret bounds for sleeping experts and bandits. Proceedings of COLT 2008 (pp. 425-436).
- (2008) Proceedings of COLT 2008 , pp. 425-436
- Kleinberg, R.D.¹ Niculescu-Mizil, A.² Sharma, Y.³

14
- 0002899547
- Asymptotically efficient adaptive allocation rules
- Lai, T. L., & Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6, 4-22.
- (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

15
- 0038673523
- Probability; a survey of the mathematical theory
- New York: John Wiley & Sons Ltd. Second edition
- Lamperti, J. (1996). Probability; a survey of the mathematical theory. Wiley Series in Probability Statistics. New York: John Wiley & Sons Ltd. Second edition.
- (1996) Wiley Series in Probability Statistics
- Lamperti, J.¹

16
- 0032679082
- Exploration of multi-state environments: Local measures and back-propagation of uncertainty
- Meuleau, N., & Bourgine, P. (1999). Exploration of multi-state environments: Local measures and back-propagation of uncertainty. Machine Learning, 35, 117-154.
- (1999) Machine Learning , vol.35 , pp. 117-154
- Meuleau, N.¹ Bourgine, P.²

17
- 0003818048
- San Francisco: Holden Day
- Pinsker, M. S. (1964). Information and information stability of random variables and processes (transl.). San Francisco: Holden Day.
- (1964) Information and Information Stability of Random Variables and Processes (transl.)
- Pinsker, M.S.¹

18
- 0004031920
- Princeton University Press
- Rockafellar, R. T. (1970). Convex analysis (Princeton Mathematical Series). Princeton University Press.
- (1970) Convex Analysis (Princeton Mathematical Series)
- Rockafellar, R.T.¹

19
- 14344258433
- A Bayesian framework for reinforcement learning
- Morgan Kaufmann, San Francisco, CA
- Strens, M. (2000). A Bayesian framework for reinforcement learning. Proceedings of ICML 2000 (pp. 943-950). Morgan Kaufmann, San Francisco, CA.
- (2000) Proceedings of ICML 2000 , pp. 943-950
- Strens, M.¹

20
- 33646406807
- Multi-armed bandit algorithms and empirical evaluation
- Porto, Portugal: Springer
- Vermorel, J., & Mohri, M. (2005). Multi-armed bandit algorithms and empirical evaluation. Proceedings of ECML 2005 (pp. 437-448). Porto, Portugal: Springer.
- (2005) Proceedings of ECML 2005 , pp. 437-448
- Vermorel, J.¹ Mohri, M.²

21
- 0000607073
- Nonparametric bandit methods
- Yakowitz, S., & Lowe, W. (1991). Nonparametric bandit methods. Ann. Oper. Res., 28, 297-312.
- (1991) Ann. Oper. Res. , vol.28 , pp. 297-312
- Yakowitz, S.¹ Lowe, W.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.