SCOPUS Information Retrieval Platform

Proceedings of the IEEE Conference on Decision and Control

Volume , Issue , 2008, Pages 4945-4950

A structured multiarmed bandit problem and the greedy policy

Authors (3): Mersereau, Adam J. (a); Rusmevichientong, Paat (b); Tsitsiklis, John N. (c)

(a) University of North Carolina (United States)

(b) Cornell University (United States)

(c) Massachusetts Institute of Technology (United States)

Author keywords

[No Author keywords available]

Indexed keywords

CONTROL ENGINEERING;

DISCOUNTED REWARD; INFINITE HORIZONS; LINEAR FUNCTIONS; MULTI-ARMED BANDIT PROBLEM; NUMERICAL RESULTS; OPTIMAL POLICIES; PRIOR DISTRIBUTION; STATISTICAL CORRELATION;

PROBABILITY;

EID: 62949175102
PISSN: 07431546
EISSN: 25762370
Source Type: Conference Proceeding
DOI: 10.1109/CDC.2008.4738680
Document Type: Conference Paper

Times cited: 8

References (15)

1
- 0000854435
- Adaptive treatment allocation and the multi-armed bandit problem
- T. L. Lai, "Adaptive treatment allocation and the multi-armed bandit problem," Ann. Stat., vol. 15, no. 3, pp. 1091-1114, 1987.
- (1987) Ann. Stat , vol.15 , Issue.3 , pp. 1091-1114
- Lai, T.L.¹

2
- 0002955623
- A dynamic allocation index for the sequential design of experiments
- J. Gittins and D. M. Jones, "A dynamic allocation index for the sequential design of experiments," in Progress in Statistics, J. Gani, Ed. Amsterdam: North-Holland, 1974, pp. 241-266.
- (1974) Progress in Statistics , pp. 241-266
- Gittins, J.¹ Jones, D.M.²

3
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T. L. Lai and H. Robbins, "Asymptotically efficient adaptive allocation rules," Adv. Appl. Math., vol. 6, pp. 4-22, 1985.
- (1985) Adv. Appl. Math , vol.6 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

4
- 0001395850
- On the likelihood that one unknown probability exceeds another in view of the evidence of two samples
- W. R. Thompson, "On the likelihood that one unknown probability exceeds another in view of the evidence of two samples," Biometrika, vol. 25, pp. 285-294, 1933.
- (1933) Biometrika , vol.25 , pp. 285-294
- Thompson, W.R.¹

5
- 84966203785
- Some aspects of the sequential design of experiments
- H. Robbins, "Some aspects of the sequential design of experiments," Bull. Amer. Math. Soc., vol. 58, pp. 527-535, 1952.
- (1952) Bull. Amer. Math. Soc , vol.58 , pp. 527-535
- Robbins, H.¹

6
- 0004181906
- Bandit Problems: Sequential Allocation of Experiments
- D. Berry and B. Fristedt, Bandit Problems: Sequential Allocation of Experiments. London: Chapman and Hall, 1985.
- (1985) Bandit Problems: Sequential Allocation of Experiments
- Berry, D.¹ Fristedt, B.²

7
- 0001492860
- Contributions to the "two-armed bandit" problem
- D. Feldman, "Contributions to the 'two-armed bandit' problem," Ann. Math. Stat., vol. 33, pp. 847-856, 1962.
- (1962) Ann. Math. Stat , vol.33 , pp. 847-856
- Feldman, D.¹

8
- 0010948196
- Further contributions to the "two-armed bandit" problem
- R. Keener, "Further contributions to the 'two-armed bandit' problem," Ann. Stat., vol. 13, no. 1, pp. 418-422, 1985.
- (1985) Ann. Stat , vol.13 , Issue.1 , pp. 418-422
- Keener, R.¹

9
- 0002308024
- Sequential Control With Incomplete Information
- E. L. Pressman and I. N. Sonin, Sequential Control With Incomplete Information. London: Academic Press, 1990.
- (1990) Sequential Control With Incomplete Information
- Pressman, E.L.¹ Sonin, I.N.²

10
- 0000532482
- Response surface bandits
- J. Ginebra and M. K. Clayton, "Response surface bandits," J. Roy. Stat. Soc. B, vol. 57, no. 4, pp. 771-784, 1995.
- (1995) J. Roy. Stat. Soc. B , vol.57 , Issue.4 , pp. 771-784
- Ginebra, J.¹ Clayton, M.K.²

11
- 34547966991
- Multi-armed bandit problems with dependent arms
- S. Pandey, D. Chakrabarti, and D. Agrawal, "Multi-armed bandit problems with dependent arms," in Proceedings of the 24th International Conference on Machine Learning, 2007.
- (2007) Proceedings of the 24th International Conference on Machine Learning
- Pandey, S.¹ Chakrabarti, D.² Agrawal, D.³

12
- 85162041468
- Optimistic linear programming gives logarithmic regret for irreducible MDPs
- A. Tewari and P. L. Bartlett, "Optimistic linear programming gives logarithmic regret for irreducible MDPs," in Advances in Neural Information Processing Systems 20, 2008.
- (2008) Advances in Neural Information Processing Systems , vol.20
- Tewari, A.¹ Bartlett, P.L.²

13
- 0004083492
- Probability: Theory and Examples
- R. Durrett, Probability: Theory and Examples. Belmont: Duxbury Press, 1996.
- (1996) Probability: Theory and Examples
- Durrett, R.¹

14
- 0009943101
- Incomplete learning from endogenous data in dynamic allocation
- M. Brezzi and T. L. Lai, "Incomplete learning from endogenous data in dynamic allocation," Econometrica, vol. 68, no. 6, pp. 1511-1516, 2000.
- (2000) Econometrica , vol.68 , Issue.6 , pp. 1511-1516
- Brezzi, M.¹ Lai, T.L.²

15
- 0000024577
- On dynamic programming with unbounded rewards
- S. A. Lippman, "On dynamic programming with unbounded rewards," Management Sci., vol. 21, no. 11, pp. 1225-1233, 1975.
- (1975) Management Sci , vol.21 , Issue.11 , pp. 1225-1233
- Lippman, S.A.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.