



Volume , Issue , 1997, Pages 1019-1025

Local bandit approximation for optimal learning problems

Author keywords

[No Author keywords available]

Indexed keywords

MARKOV PROCESSES;

EID: 16244388049     PISSN: 10495258     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (24)

References (15)
  • 1
    • 0041443966
    • Caution, probing and the value of information in the control of uncertain systems
    • Bar-Shalom, Y. & Tse, E. (1976) Caution, probing and the value of information in the control of uncertain systems, Ann. Econ. Soc. Meas. 5:323-337.
    • (1976) Ann. Econ. Soc. Meas , vol.5 , pp. 323-337
    • Bar-Shalom, Y.1    Tse, E.2
  • 2
    • 84938011869
    • On adaptive control processes
    • Bellman, R. & Kalaba, R. (1959) On adaptive control processes. IRE Trans. 4:1-9.
    • (1959) IRE Trans. , vol.4 , pp. 1-9
    • Bellman, R.1    Kalaba, R.2
  • 3
    • 0018678571
    • Adaptive control of Markov chains I: Finite parameter set
    • Borkar, V. & Varaiya, P.P. (1979) Adaptive control of Markov chains I: finite parameter set. IEEE Trans. Auto. Control 24:953-958.
    • (1979) IEEE Trans. Auto. Control , vol.24 , pp. 953-958
    • Borkar, V.1    Varaiya, P.P.2
  • 5
    • 0030260201
    • Exploration bonuses and dual control
    • in press
    • Dayan, P. & Sejnowski, T. (1996) Exploration Bonuses and Dual Control. Machine Learning (in press).
    • (1996) Machine Learning
    • Dayan, P.1    Sejnowski, T.2
  • 9
    • 0000169010
    • Bandit processes and dynamic allocation indices (with discussion)
    • Gittins, J.C. & Jones, D. (1979) Bandit processes and dynamic allocation indices (with discussion). J. R. Statist. Soc. B 41:148-177.
    • (1979) J. R. Statist. Soc. B , vol.41 , pp. 148-177
    • Gittins, J.C.1    Jones, D.2
  • 10
    • 0023345261
    • The multi-armed bandit problem: Decomposition and computation
    • Katehakis, M.H. & Veinott, A.F. (1987) The multi-armed bandit problem: decomposition and computation. Math. OR 12:262-268.
    • (1987) Math. OR , vol.12 , pp. 262-268
    • Katehakis, M.H.1    Veinott, A.F.2
  • 11
    • 0006193487
    • A modified dynamic programming method for Markov decision problems
    • MacQueen, J. (1966). A modified dynamic programming method for Markov decision problems, J. Math. Anal. Appl., 14:38-43.
    • (1966) J. Math. Anal. Appl. , vol.14 , pp. 38-43
    • MacQueen, J.1
  • 14
    • 0022060331
    • Extensions of the multiarmed bandit problem: The discounted case
    • Varaiya, P.P., Walrand, J.C., & Buyukkoc, C. (1985) Extensions of the multiarmed bandit problem: the discounted case. IEEE Trans. Auto. Control 30(5):426-439.
    • (1985) IEEE Trans. Auto. Control , vol.30 , Issue.5 , pp. 426-439
    • Varaiya, P.P.1    Walrand, J.C.2    Buyukkoc, C.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.