



Volume , Issue , 2008, Pages 343-354

Adapting to a changing environment: The Brownian restless bandits

Author keywords

[No Author keywords available]

Indexed keywords

CHANGING ENVIRONMENT; DISCRETE RANDOM WALK; DYNAMIC SETTINGS; INFINITE TIME HORIZON; MULTI ARMED BANDIT; PRACTICAL ONLINE; RESTLESS BANDITS; STEADY-STATE ANALYSIS;

EID: 84898070003     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (129)

References (33)
  • 1
    • 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • Preliminary version in 41st IEEE FOCS, 2000
    • P. Auer. Using confidence bounds for exploitation-exploration trade-offs. J. Machine Learning Research, 3: 397-422, 2002. Preliminary version in 41st IEEE FOCS, 2000.
    • (2002) J. Machine Learning Research , vol.3 , pp. 397-422
    • Auer, P.1
  • 2
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • Preliminary version in 15th ICML, 1998
    • P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3): 235-256, 2002. Preliminary version in 15th ICML, 1998.
    • (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 3
    • 0037709910
    • The nonstochastic multiarmed bandit problem
    • Preliminary version in 36th IEEE FOCS, 1995
    • P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1): 48-77, 2002. Preliminary version in 36th IEEE FOCS, 1995.
    • (2002) SIAM J. Comput. , vol.32 , Issue.1 , pp. 48-77
    • Auer, P.1    Cesa-Bianchi, N.2    Freund, Y.3    Schapire, R.E.4
  • 4
    • 4544345025
    • Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches
    • B. Awerbuch and R. D. Kleinberg. Adaptive routing with end-to-end feedback: distributed learning and geometric approaches. In 36th ACM Symp. on Theory of Computing (STOC), pages 45-53, 2004.
    • (2004) 36th ACM Symp. on Theory of Computing (STOC) , pp. 45-53
    • Awerbuch, B.1    Kleinberg, R.D.2
  • 6
    • 0030134077
    • Conservation laws, extended polymatroids and multi-armed bandit problems: A unified polyhedral approach
    • D. Bertsimas and J. Nino-Mora. Conservation laws, extended polymatroids and multi-armed bandit problems: A unified polyhedral approach. Math. of Oper. Res, 21(2): 257-306, 1996.
    • (1996) Math. of Oper. Res , vol.21 , Issue.2 , pp. 257-306
    • Bertsimas, D.1    Nino-Mora, J.2
  • 7
    • 0343441515
    • Restless bandits, linear programming relaxations, and a primal-dual index heuristic
    • D. Bertsimas and J. Nino-Mora. Restless bandits, linear programming relaxations, and a primal-dual index heuristic. Operations Research, 48(1): 80-90, 2000.
    • (2000) Operations Research , vol.48 , Issue.1 , pp. 80-90
    • Bertsimas, D.1    Nino-Mora, J.2
  • 9
    • 33244456637
    • Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
    • V. Dani and T. P. Hayes. Robbing the bandit: less regret in online geometric optimization against an adaptive adversary. In 17th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 937-943, 2006.
    • (2006) 17th ACM-SIAM Symp. on Discrete Algorithms (SODA) , pp. 937-943
    • Dani, V.1    Hayes, T.P.2
  • 11
    • 0000169010
    • Bandit processes and dynamic allocation indices (with discussion)
    • J. C. Gittins. Bandit processes and dynamic allocation indices (with discussion). J. Roy. Statist. Soc. Ser. B, 41: 148-177, 1979.
    • (1979) J. Roy. Statist. Soc. Ser. B , vol.41 , pp. 148-177
    • Gittins, J.C.1
  • 13
    • 0002955623
    • A dynamic allocation index for the sequential design of experiments
    • J. G. et al., editor North-Holland
    • J. C. Gittins and D. M. Jones. A dynamic allocation index for the sequential design of experiments. In J. G. et al., editor, Progress in Statistics, pages 241-266. North-Holland, 1974.
    • (1974) Progress in Statistics , pp. 241-266
    • Gittins, J.C.1    Jones, D.M.2
  • 14
    • 46749146164
    • Approximation algorithms for partial-information based stochastic control with Markovian rewards
    • S. Guha and K. Munagala. Approximation algorithms for partial-information based stochastic control with Markovian rewards. In 48th Symp. on Foundations of Computer Science (FOCS), 2007.
    • (2007) 48th Symp. on Foundations of Computer Science (FOCS)
    • Guha, S.1    Munagala, K.2
  • 16
    • 49949119498
    • Reinforcement learning-based load shared sequential routing
    • F. Heidari, S. Mannor, and L. Mason. Reinforcement learning-based load shared sequential routing. In IFIP Networking, 2007.
    • (2007) IFIP Networking
    • Heidari, F.1    Mannor, S.2    Mason, L.3
  • 17
    • 84898981061
    • Nearly tight bounds for the continuum-armed bandit problem
    • Full version appeared as Chapters 4-5 in [18]
    • R. D. Kleinberg. Nearly tight bounds for the continuum-armed bandit problem. In 18th Advances in Neural Information Processing Systems (NIPS), 2004. Full version appeared as Chapters 4-5 in [18].
    • (2004) 18th Advances in Neural Information Processing Systems (NIPS)
    • Kleinberg, R.D.1
  • 19
    • 33244473533
    • Anytime algorithms for multi-armed bandit problems
    • Full version appeared as Chapter 6 in [18]
    • R. D. Kleinberg. Anytime algorithms for multi-armed bandit problems. In 17th ACM-SIAM Symp. on Discrete Algorithms (SODA), pages 928-936, 2006. Full version appeared as Chapter 6 in [18].
    • (2006) 17th ACM-SIAM Symp. on Discrete Algorithms (SODA) , pp. 928-936
    • Kleinberg, R.D.1
  • 21
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T. Lai and H. Robbins. Asymptotically efficient Adaptive Allocation Rules. Advances in Applied Mathematics, 6: 4-22, 1985.
    • (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
    • Lai, T.1    Robbins, H.2
  • 22
    • 9444257628
    • Online geometric optimization in the bandit setting against an adaptive adversary
    • H. B. McMahan and A. Blum. Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary. In 17th Conference on Learning Theory (COLT), pages 109-123, 2004.
    • (2004) 17th Conference on Learning Theory (COLT) , pp. 109-123
    • McMahan, H.B.1    Blum, A.2
  • 23
    • 27944479719
    • On the constant in the nonuniform version of the Berry-Esseen theorem
    • 2005
    • K. Neammanee. On the constant in the nonuniform version of the Berry-Esseen theorem. Intl. J. of Mathematics and Mathematical Sciences, 2005: 12: 1951-1967, 2005.
    • (2005) Intl. J. of Mathematics and Mathematical Sciences , vol.2005 , Issue.12 , pp. 1951-1967
    • Neammanee, K.1
  • 24
    • 17744388964
    • Restless bandits, partial conservation laws and indexability
    • J. Nino-Mora. Restless bandits, partial conservation laws and indexability. Advances in Applied Probability, 33: 76-98, 2001.
    • (2001) Advances in Applied Probability , vol.33 , pp. 76-98
    • Nino-Mora, J.1
  • 27
    • 84966203785
    • Some aspects of the sequential design of experiments
    • H. Robbins. Some Aspects of the Sequential Design of Experiments. Bull. Amer. Math. Soc., 58: 527-535, 1952.
    • (1952) Bull. Amer. Math. Soc. , vol.58 , pp. 527-535
    • Robbins, H.1
  • 28
    • 0030095077
    • Markov chain convergence: From finite to infinite
    • J. S. Rosenthal. Markov chain convergence: From finite to infinite. Stochastic Processes Appl., 62(1): 55-72, 1996.
    • (1996) Stochastic Processes Appl. , vol.62 , Issue.1 , pp. 55-72
    • Rosenthal, J.S.1
  • 30
    • 0242590668
    • A short proof of the Gittins index theorem
    • J. N. Tsitsiklis. A short proof of the Gittins index theorem. Annals of Applied Probability, 4(1): 194-199, 1994.
    • (1994) Annals of Applied Probability , vol.4 , Issue.1 , pp. 194-199
    • Tsitsiklis, J.N.1
  • 31
    • 84975987963
    • Branching bandit processes
    • G. Weiss. Branching bandit processes. Probab. Engng. Inform. Sci., 2: 269-278, 1988.
    • (1988) Probab. Engng. Inform. Sci. , vol.2 , pp. 269-278
    • Weiss, G.1
  • 32
    • 0000595228
    • Arm acquiring bandits
    • P. Whittle. Arm acquiring bandits. Ann. Probab., 9: 284-292, 1981.
    • (1981) Ann. Probab. , vol.9 , pp. 284-292
    • Whittle, P.1
  • 33
    • 0001043843
    • Restless bandits: Activity allocation in a changing world
    • P. Whittle. Restless bandits: Activity allocation in a changing world. J. of Appl. Prob., 25A: 287-298, 1988.
    • (1988) J. of Appl. Prob. , vol.25 A , pp. 287-298
    • Whittle, P.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.