Volume 59, Issue 2, 2011, Pages 383-399

The irrevocable multiarmed bandit problem

Author keywords

Dynamic programming/optimal control; Learning; Multiarmed bandit problem; Production/scheduling; Sequencing; Stochastic

Indexed keywords

DYNAMIC PROGRAMMING/OPTIMAL CONTROL; LEARNING; MULTI-ARMED BANDIT PROBLEM; PRODUCTION/SCHEDULING; SEQUENCING; STOCHASTIC;

EID: 79957458110     PISSN: 0030364X     EISSN: 15265463     Source Type: Journal    
DOI: 10.1287/opre.1100.0891     Document Type: Article
Times cited: 27

References (24)
  • 1
    • 0023453059
    • Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-part I: i.i.d. rewards
    • Anantharam, V., P. Varaiya, J. Walrand. 1987a. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-part I: i.i.d. rewards. IEEE Trans. Automatic Control 32(11) 968-976. (Pubitemid 18521625)
    • (1987) IEEE Transactions on Automatic Control , vol.AC-32 , Issue.11 , pp. 968-976
    • Anantharam, V.1    Varaiya, P.2    Walrand, J.3
  • 2
    • 0023450663
    • Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-part II: Markovian rewards
    • Anantharam, V., P. Varaiya, J. Walrand. 1987b. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-part II: Markovian rewards. IEEE Trans. Automatic Control 32(11) 977-982. (Pubitemid 18521626)
    • (1987) IEEE Transactions on Automatic Control , vol.AC-32 , Issue.11 , pp. 977-982
    • Anantharam, V.1    Varaiya, P.2    Walrand, J.3
  • 3
    • 84947400374
    • Sequential medical trials
    • Anscombe, F. J. 1963. Sequential medical trials. J. Amer. Statist. Assoc. 58(302) 365-383.
    • (1963) J. Amer. Statist. Assoc. , vol.58 , Issue.302 , pp. 365-383
    • Anscombe, F.J.1
  • 4
    • 79951614535
    • Optimal employee retention when inferring unknown learning curves
    • B. Johansson, S. Jain, J. Montoya-Torres, J. Hugan, E. Yücesan, eds
    • Arlotto, A., S. E. Chick, N. Gans. 2010. Optimal employee retention when inferring unknown learning curves. B. Johansson, S. Jain, J. Montoya-Torres, J. Hugan, E. Yücesan, eds. Proc. Winter Simulation Conf., 2010.
    • (2010) Proc. Winter Simulation Conf. , vol.2010
    • Arlotto, A.1    Chick, S.E.2    Gans, N.3
  • 6
    • 0004870746
    • A problem in the sequential design of experiments
    • Bellman, R. E. 1956. A problem in the sequential design of experiments. Sankhya: Indian J. Statist. 16(3/4) 221-229.
    • (1956) Sankhya: Indian J. Statist. , vol.16 , Issue.3-4 , pp. 221-229
    • Bellman, R.E.1
  • 9
    • 38549103206
    • A learning approach for interactive marketing to a customer segment
    • DOI 10.1287/opre.1070.0427
    • Bertsimas, D., A. J. Mersereau. 2007. A learning approach for interactive marketing to a customer segment. Oper. Res. 55(6) 1120-1135. (Pubitemid 351159521)
    • (2007) Operations Research , vol.55 , Issue.6 , pp. 1120-1135
    • Bertsimas, D.1    Mersereau, A.J.2
  • 10
    • 0343441515
    • Restless bandits, linear programming relaxations, and a primal-dual index heuristic
    • Bertsimas, D., J. Niño-Mora. 2000. Restless bandits, linear programming relaxations, and a primal-dual index heuristic. Oper. Res. 48(1) 80-90.
    • (2000) Oper. Res. , vol.48 , Issue.1 , pp. 80-90
    • Bertsimas, D.1    Niño-Mora, J.2
  • 12
    • 33847255926
    • Dynamic assortment with demand learning for seasonal consumer goods
    • DOI 10.1287/mnsc.1060.0613
    • Caro, F., J. Gallien. 2007. Dynamic assortment with demand learning for seasonal consumer goods. Management Sci. 53(2) 276-292. (Pubitemid 46326180)
    • (2007) Management Science , vol.53 , Issue.2 , pp. 276-292
    • Caro, F.1    Gallien, J.2
  • 13
    • 61449225483
    • Approximating the stochastic knapsack problem: The benefit of adaptivity
    • Dean, B. C., M. X. Goemans, J. Vondrak. 2008. Approximating the stochastic knapsack problem: The benefit of adaptivity. Math. Oper. Res. 33(4) 945-964.
    • (2008) Math. Oper. Res. , vol.33 , Issue.4 , pp. 945-964
    • Dean, B.C.1    Goemans, M.X.2    Vondrak, J.3
  • 15
    • 0002955623
    • A dynamic allocation index for the sequential design of experiments
    • J. Gani, ed. North-Holland, Amsterdam
    • Gittins, J. C., D. M. Jones. 1974. A dynamic allocation index for the sequential design of experiments. J. Gani, ed. Progress in Statistics. North-Holland, Amsterdam, 241-266.
    • (1974) Progress in Statistics , pp. 241-266
    • Gittins, J.C.1    Jones, D.M.2
  • 16
    • 21144463800
    • The learning component of dynamic allocation indices
    • Gittins, J., Y. G. Wang. 1992. The learning component of dynamic allocation indices. Ann. Statist. 20(3) 1625-1636.
    • (1992) Ann. Statist. , vol.20 , Issue.3 , pp. 1625-1636
    • Gittins, J.1    Wang, Y.G.2
  • 17
    • 0034346711
    • Index-based policies for discounted multi-armed bandits on parallel machines
    • Glazebrook, K. D., D. J. Wilkinson. 2000. Index-based policies for discounted multi-armed bandits on parallel machines. Ann. Appl. Probab. 10(3) 877-896.
    • (2000) Ann. Appl. Probab. , vol.10 , Issue.3 , pp. 877-896
    • Glazebrook, K.D.1    Wilkinson, D.J.2
  • 19
    • 35448979910
    • Approximation algorithms for budgeted learning problems
    • DOI 10.1145/1250790.1250807, STOC'07: Proceedings of the 39th Annual ACM Symposium on Theory of Computing
    • Guha, S., K. Munagala. 2007. Approximation algorithms for budgeted learning problems. STOC '07: Proc. Thirty-Ninth Annual ACM Sympos. Theory Comput., ACM, New York, 104-113. (Pubitemid 47630726)
    • (2007) Proceedings of the Annual ACM Symposium on Theory of Computing , pp. 104-113
    • Guha, S.1    Munagala, K.2
  • 20
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • Lai, T., H. Robbins. 1985. Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1) 4-22.
    • (1985) Adv. Appl. Math. , vol.6 , Issue.1 , pp. 4-22
    • Lai, T.1    Robbins, H.2
  • 23
    • 0002327722
    • On an index policy for restless bandits
    • Weber, R. R., G. Weiss. 1990. On an index policy for restless bandits. J. Appl. Probab. 27(3) 637-648.
    • (1990) J. Appl. Probab. , vol.27 , Issue.3 , pp. 637-648
    • Weber, R.R.1    Weiss, G.2
  • 24
    • 0001043843
    • Restless bandits: Activity allocation in a changing world
    • Whittle, P. 1988. Restless bandits: activity allocation in a changing world. J. Appl. Probab. 25 287-298.
    • (1988) J. Appl. Probab. , vol.25 , pp. 287-298
    • Whittle, P.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.