Volume 6098 LNAI, Issue PART 3, 2010, Pages 199-208

Solving non-stationary bandit problems by random sampling from sibling Kalman filters

Author keywords

Bandit Problems; Bayesian Learning; Kalman Filter

Indexed keywords

BANDIT PROBLEMS; BAYESIAN; BAYESIAN LEARNING; BAYESIAN METHODS; CLASSICAL OPTIMIZATION; FIXED STRATEGY; HYPER-PARAMETER; MULTI-ARMED BANDIT PROBLEM; MULTIPLE ARMS; NON-STATIONARY ENVIRONMENT; NONSTATIONARY; NOVEL SOLUTIONS; OPTIMAL DECISION MAKING; PARAMETER SETTING; RANDOM REWARD; RANDOM SAMPLING; STATIONARY ENVIRONMENTS

EID: 79551524402     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-642-13033-5_21     Document Type: Conference Paper
Times cited: 31

References (15)
  • 3
    • 0001395850
    • On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
    • Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25, 285-294 (1933)
    • (1933) Biometrika , vol.25 , pp. 285-294
    • Thompson, W.R.1
  • 4
    • 33847613520
    • Learning Automata-based Solutions to the Nonlinear Fractional Knapsack Problem with Applications to Optimal Resource Allocation
    • Granmo, O.C., Oommen, B.J., Myrer, S.A., Olsen, M.G.: Learning Automata-based Solutions to the Nonlinear Fractional Knapsack Problem with Applications to Optimal Resource Allocation. IEEE Transactions on Systems, Man, and Cybernetics, Part B 37(1), 166-175 (2007)
    • (2007) IEEE Transactions on Systems, Man, and Cybernetics, Part B , vol.37 , Issue.1 , pp. 166-175
    • Granmo, O.C.1    Oommen, B.J.2    Myrer, S.A.3    Olsen, M.G.4
  • 6
    • 33646406807
    • Multi-armed bandit algorithms and empirical evaluation
    • Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. Springer, Heidelberg
    • Vermorel, J., Mohri, M.: Multi-armed bandit algorithms and empirical evaluation. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 437-448. Springer, Heidelberg (2005)
    • (2005) LNCS (LNAI) , vol.3720 , pp. 437-448
    • Vermorel, J.1    Mohri, M.2
  • 8
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • DOI 10.1023/A:1013689704352, Computational Learning Theory
    • Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning 47, 235-256 (2002) (Pubitemid 34126111)
    • (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 11
    • 33749818313
    • Nearly optimal exploration-exploitation decision thresholds
    • Kollias, S.D., Stafylopatis, A., Duch, W., Oja, E. (eds.) ICANN 2006. Springer, Heidelberg
    • Dimitrakakis, C.: Nearly optimal exploration-exploitation decision thresholds. In: Kollias, S.D., Stafylopatis, A., Duch, W., Oja, E. (eds.) ICANN 2006. LNCS, vol. 4131, pp. 850-859. Springer, Heidelberg (2006)
    • (2006) LNCS , vol.4131 , pp. 850-859
    • Dimitrakakis, C.1
  • 12
    • 0031619316
    • Bayesian Q-learning
    • AAAI Press, Menlo Park
    • Dearden, R., Friedman, N., Russell, S.: Bayesian Q-learning. In: AAAI/IAAI, pp. 761-768. AAAI Press, Menlo Park (1998)
    • (1998) AAAI/IAAI , pp. 761-768
    • Dearden, R.1    Friedman, N.2    Russell, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.