SCOPUS Information Search Platform

Annals of Mathematics and Artificial Intelligence

Volume 61, Issue 3, 2011, Pages 203-230

Multi-armed bandits with episode context

(1) Rosin, Christopher D a

a Parity Computing Inc (United States)

Author keywords

Computational learning theory; Computer Go; Contextual bandits; Multi armed bandits; PUCB; UCB

Indexed keywords

EID: 82355173286 PISSN: 10122443 EISSN: None Source Type: Journal
DOI: 10.1007/s10472-011-9258-6 Document Type: Article

Times cited : (259)

References (31)

1
- 84898079018
- Minimax policies for adversarial and stochastic bandits
- Audibert, J. Y., Bubeck, S.: Minimax policies for adversarial and stochastic bandits. In: 22nd Annual Conference on Learning Theory (COLT 2009) (2009).
- (2009) Annual Conference on Learning Theory (COLT 2009)
- Audibert, J.Y.¹ Bubeck, S.²

2
- 62949181077
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Audibert, J. Y., Munos, R., Szepesvári, C.: Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 410(19), 1876-1902 (2009).
- (2009) Theor. Comput. Sci. , vol.410 , Issue.19 , pp. 1876-1902
- Audibert, J.Y.¹ Munos, R.² Szepesvári, C.³

3
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2-3), 235-256 (2002).
- (2002) Mach. Learn. , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

4
- 0037709910
- The nonstochastic multiarmed bandit problem
- Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R. E.: The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1), 48-77 (2002).
- (2002) SIAM J. Comput. , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.E.⁴

5
- 0035479281
- Computer Go: An AI oriented survey
- Bouzy, B., Cazenave, T.: Computer Go: An AI oriented survey. Artif. Intell. 132(1), 39-103 (2001).
- (2001) Artif. Intell. , vol.132 , Issue.1 , pp. 39-103
- Bouzy, B.¹ Cazenave, T.²

6
- 80053655011
- Bayesian generation and integration of K-nearest-neighbor patterns for 19x19 Go
- Bouzy, B., Chaslot, G.: Bayesian generation and integration of K-nearest-neighbor patterns for 19x19 Go. In: IEEE Symposium on Computational Intelligence in Games (CIG05), pp. 176-181 (2005).
- (2005) IEEE Symposium on Computational Intelligence in Games (CIG05) , pp. 176-181
- Bouzy, B.¹ Chaslot, G.²

7
- 84902513084
- Monte-Carlo Go developments
- H. J. van den Herik, H. Iida, and E. A. Heinz (Eds.), New York: Springer
- Bouzy, B., Helmstetter, B.: Monte-Carlo Go developments. In: van den Herik, H. J., Iida, H., Heinz, E. A. (eds.) Advances in Computer Games (ACG 2003), IFIP, vol. 263, pp. 159-174. Springer, New York (2003).
- (2003) Advances in Computer Games (ACG 2003), IFIP, Vol. 263 , pp. 159-174
- Bouzy, B.¹ Helmstetter, B.²

8
- 77952070805
- Pure exploration in multi-armed bandits problems
- Lecture Notes in Computer Science, R. Gavaldà, G. Lugosi, T. Zeugmann, and S. Zilles (Eds.), New York: Springer
- Bubeck, S., Munos, R., Stoltz, G.: Pure exploration in multi-armed bandits problems. In: Gavaldà, R., Lugosi, G., Zeugmann, T., Zilles, S. (eds.) Algorithmic Learning Theory (ALT 2009), Lecture Notes in Computer Science, vol. 5809, pp. 23-37. Springer, New York (2009).
- (2009) Algorithmic Learning Theory (ALT 2009) , vol.5809 , pp. 23-37
- Bubeck, S.¹ Munos, R.² Stoltz, G.³

9
- 34250005402
- Computer Go: A grand challenge to AI
- Studies in Computational Intelligence, W. Duch and J. Mandziuk (Eds.), New York: Springer
- Cai, X., Wunsch, D. C.: Computer Go: A grand challenge to AI. In: Duch, W., Mandziuk, J. (eds.) Challenges for Computational Intelligence, Studies in Computational Intelligence, vol. 63, pp. 443-465. Springer, New York (2007).
- (2007) Challenges for Computational Intelligence , vol.63 , pp. 443-465
- Cai, X.¹ Wunsch, D.C.²

10
- 67650687540
- Progressive strategies for Monte-Carlo tree search
- Chaslot, G., Winands, M., Uiterwijk, J., van den Herik, H., Bouzy, B.: Progressive strategies for Monte-Carlo tree search. In: Proceedings of the 10th Joint Conference on Information Sciences (JCIS 2007), pp. 655-661 (2007).
- (2007) Proceedings of the 10th Joint Conference on Information Sciences (JCIS 2007) , pp. 655-661
- Chaslot, G.¹ Winands, M.² Uiterwijk, J.³ van den Herik, H.⁴ Bouzy, B.⁵

11
- 82355189414
- Chaslot, G., Chatriot, L., Fiter, C., Gelly, S., Hoock, J., Perez, J., Rimmel, A., Teytaud, O.: Combining expert, offline, transient and online knowledge in Monte-Carlo exploration. http://www.lri.fr/~teytaud/eg.pdf (2008).
- (2008) Combining expert, offline, transient and online knowledge in Monte-Carlo exploration
- Chaslot, G.¹ Chatriot, L.² Fiter, C.³ Gelly, S.⁴ Hoock, J.⁵ Perez, J.⁶ Rimmel, A.⁷ Teytaud, O.⁸

12
- 77953762833
- Adding expert knowledge and exploration in Monte-Carlo tree search
- New York: Springer
- Chaslot, G., Fiter, C., Hoock, J. B., Rimmel, A., Teytaud, O.: Adding expert knowledge and exploration in Monte-Carlo tree search. In: Advances in Computer Games (ACG12). Springer, New York (2009).
- (2009) Advances in Computer Games (ACG12)
- Chaslot, G.¹ Fiter, C.² Hoock, J.B.³ Rimmel, A.⁴ Teytaud, O.⁵

13
- 38049037928
- Efficient selectivity and backup operators in Monte-Carlo tree search
- Lecture Notes in Computer Science, H. J. van den Herik, P. Ciancarini, and H. H. L. M. Donkers (Eds.), New York: Springer
- Coulom, R.: Efficient selectivity and backup operators in Monte-Carlo tree search. In: van den Herik, H. J., Ciancarini, P., Donkers, H. H. L. M. (eds.) Computers and Games (CG 2006), Lecture Notes in Computer Science, vol. 4630, pp. 72-83. Springer, New York (2006).
- (2006) Computers and Games (CG 2006) , vol.4630 , pp. 72-83
- Coulom, R.¹

14
- 70349287633
- Computing Elo ratings of move patterns in the game of Go
- Coulom, R.: Computing Elo ratings of move patterns in the game of Go. In: Computer Games Workshop 2007 (2007).
- (2007) Computer Games Workshop 2007
- Coulom, R.¹

15
- 70349270662
- Monte-Carlo tree search in crazy stone
- Coulom, R.: Monte-Carlo tree search in crazy stone. In: 12th Game Programming Workshop (GPW-07) (2007).
- (2007) 12th Game Programming Workshop (GPW-07)
- Coulom, R.¹

16
- 71149107214
- Bandit-based optimization on graphs with application to library performance tuning
- A. P. Danyluk, L. Bottou, and M. L. Littman (Eds.), New York: ACM
- de Mesmay, F., Rimmel, A., Voronenko, Y., Püschel, M.: Bandit-based optimization on graphs with application to library performance tuning. In: Danyluk, A. P., Bottou, L., Littman, M. L. (eds.) International Conference on Machine Learning (ICML 2009), pp. 729-736. ACM, New York (2009).
- (2009) International Conference on Machine Learning (ICML 2009) , pp. 729-736
- de Mesmay, F.¹ Rimmel, A.² Voronenko, Y.³ Püschel, M.⁴

17
- 33745295134
- Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
- Even-Dar, E., Mannor, S., Mansour, Y.: Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. J. Mach. Learn. Res. 7, 1079-1105 (2006).
- (2006) J. Mach. Learn. Res. , vol.7 , pp. 1079-1105
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

18
- 57749181518
- Simulation-based approach to general game playing
- Fox, D., Gomes, C. P. (eds.) AAAI 2008, Chicago, IL, USA, 13-17 July 2008, AAAI Press, Menlo Park (2008)
- Finnsson, H., Björnsson, Y.: Simulation-based approach to general game playing. In: Fox, D., Gomes, C. P. (eds.) Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, IL, USA, 13-17 July 2008, pp. 259-264. AAAI Press, Menlo Park (2008).
- (2008) Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence , pp. 259-264
- Finnsson, H.¹ Björnsson, Y.²

19
- 34547990649
- Combining online and offline knowledge in UCT
- Ghahramani, Z. (ed.) ACM, New York
- Gelly, S., Silver, D.: Combining online and offline knowledge in UCT. In: Ghahramani, Z. (ed.) International Conference on Machine Learning (ICML 2007), pp. 273-280. ACM, New York (2007).
- (2007) International Conference on Machine Learning (ICML 2007) , pp. 273-280
- Gelly, S.¹ Silver, D.²

20
- 57749091602
- Achieving master level play in 9 x 9 computer Go
- Fox, D., Gomes, C. P. (eds.) Chicago, IL, USA, 13-17 July 2008, AAAI Press, Menlo Park
- Gelly, S., Silver, D.: Achieving master level play in 9 x 9 computer Go. In: Fox, D., Gomes, C. P. (eds.) Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, IL, USA, 13-17 July 2008, pp. 1537-1540. AAAI Press, Menlo Park (2008).
- (2008) Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI 2008 , pp. 1537-1540
- Gelly, S.¹ Silver, D.²

21
- 82355180345
- Gap-free bounds for multi-armed stochastic bandit
- Juditsky, A., Nazin, A., Tsybakov, A., Vayatis, N.: Gap-free bounds for multi-armed stochastic bandit. In: World Congress of the International Federation of Automatic Control (IFAC) 2008 (2008).
- (2008) World Congress of the International Federation of Automatic Control (IFAC) 2008
- Juditsky, A.¹ Nazin, A.² Tsybakov, A.³ Vayatis, N.⁴

22
- 56449104477
- Efficient bandit algorithms for online multiclass prediction
- Cohen, W. W., McCallum, A., Roweis, S. T. (eds.)
- Kakade, S. M., Shalev-Shwartz, S., Tewari, A.: Efficient bandit algorithms for online multiclass prediction. In: Cohen, W. W., McCallum, A., Roweis, S. T. (eds.) International Conference on Machine Learning (ICML 2008), pp. 440-447. ACM, New York (2008).
- (2008) International Conference on Machine Learning (ICML 2008) ACM, New York , pp. 440-447
- Kakade, S.M.¹ Shalev-Shwartz, S.² Tewari, A.³

23
- 33750293964
- Bandit based Monte-Carlo planning
- Kocsis, L., Szepesvari, C.: Bandit based Monte-Carlo planning. In: European Conference on Machine Learning (ECML 2006), pp. 282-293 (2006).
- (2006) European Conference on Machine Learning (ECML 2006) , pp. 282-293
- Kocsis, L.¹ Szepesvari, C.²

24
- 83055177001
- The epoch-greedy algorithm for multi-armed bandits with side information
- J. C. Platt, D. Koller, Y. Singer, and S. T. Roweis (Eds.), Cambridge: MIT Press
- Langford, J., Zhang, T.: The epoch-greedy algorithm for multi-armed bandits with side information. In: Platt, J. C., Koller, D., Singer, Y., Roweis, S. T. (eds.) Neural Information Processing Systems (NIPS). MIT Press, Cambridge (2007).
- (2007) Neural Information Processing Systems (NIPS)
- Langford, J.¹ Zhang, T.²

25
- 30044441333
- The sample complexity of exploration in the multi-armed bandit problem
- Mannor, S., Tsitsiklis, J. N.: The sample complexity of exploration in the multi-armed bandit problem. J. Mach. Learn. Res. 5, 623-648 (2004).
- (2004) J. Mach. Learn. Res. , vol.5 , pp. 623-648
- Mannor, S.¹ Tsitsiklis, J.N.²

26
- 84874141248
- Multi-armed bandits with episode context
- Rosin, C. D.: Multi-armed bandits with episode context. In: The Eleventh International Symposium on Artificial Intelligence and Mathematics (ISAIM 2010) (2010).
- (2010) The Eleventh International Symposium on Artificial Intelligence and Mathematics (ISAIM 2010)
- Rosin, C.D.¹

27
- 84898064829
- Stochastic convex optimization
- Shalev-Shwartz, S., Shamir, O., Srebro, N., Sridharan, K.: Stochastic convex optimization. In: 22nd Annual Conference on Learning Theory (COLT 2009) (2009).
- (2009) Annual Conference on Learning Theory (COLT 2009)
- Shalev-Shwartz, S.¹ Shamir, O.² Srebro, N.³ Sridharan, K.⁴

28
- 33750375100
- A simple distribution-free approach to the max k-armed bandit problem
- Lecture Notes in Computer Science, F. Benhamou (Ed.), New York: Springer
- Streeter, M. J., Smith, S. F.: A simple distribution-free approach to the max k-armed bandit problem. In: Benhamou, F. (ed.) Principles and Practice of Constraint Programming (CP 2006), Lecture Notes in Computer Science, vol. 4204, pp. 560-574. Springer, New York (2006).
- (2006) Principles and Practice of Constraint Programming (CP 2006) , vol.4204 , pp. 560-574
- Streeter, M.J.¹ Smith, S.F.²

29
- 33749242078
- Experience-efficient learning in associative bandit problems
- Cohen, W. W., Moore, A. (eds.) ACM, New York
- Strehl, A. L., Mesterharm, C., Littman, M. L., Hirsh, H.: Experience-efficient learning in associative bandit problems. In: Cohen, W. W., Moore, A. (eds.) International Conference on Machine Learning (ICML 2006), pp. 889-896. ACM, New York (2006).
- (2006) International Conference on Machine Learning (ICML 2006) , pp. 889-896
- Strehl, A.L.¹ Mesterharm, C.² Littman, M.L.³ Hirsh, H.⁴

30
- 85162031443
- Learning from Logged Implicit Exploration Data
- Lafferty, J., Williams, C. K. I., Shawe-Taylor, J., Zemel, R. S., Culotta, A. (eds.)
- Strehl, A. L., Langford, J., Li, L., Kakade, S. M.: Learning from Logged Implicit Exploration Data. In: Lafferty, J., Williams, C. K. I., Shawe-Taylor, J., Zemel, R. S., Culotta, A. (eds.) Neural Information Processing Systems (NIPS) (2010).
- (2010) Neural Information Processing Systems (NIPS)
- Strehl, A.L.¹ Langford, J.² Li, L.³ Kakade, S.M.⁴

31
- 82355180344
- Teytaud, O., Gelly, S., Sebag, M.: Anytime many-armed bandits. In: Zucker, J., Cornuéjols, A. (eds.) Conférence d'Apprentissage (CAP07), pp. 387-402 (2007).

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.