
11th International Symposium on Artificial Intelligence and Mathematics, ISAIM 2010

Volume , Issue , 2010, Pages

Multi-armed bandits with episode context

(1) Rosin, Christopher D a

a Parity Computing Inc (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER GO; CONTEXTUAL BANDITS; MULTI ARMED BANDIT; SIDE INFORMATION; WORST CASE; ARTIFICIAL INTELLIGENCE; ALGORITHMS

EID: 84874141248 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (8)

References (30)

1
- 84898079018
- Minimax policies for adversarial and stochastic bandits
- Audibert, J.-Y., and Bubeck, S. 2009. Minimax policies for adversarial and stochastic bandits. In COLT.
- (2009) COLT
- Audibert, J.-Y.¹ Bubeck, S.²

2
- 62949181077
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Audibert, J.-Y.; Munos, R.; and Szepesvári, C. 2009. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 410(19):1876-1902.
- (2009) Theor. Comput. Sci. , vol.410 , Issue.19 , pp. 1876-1902
- Audibert, J.-Y.¹ Munos, R.² Szepesvári, C.³

3
- 0037709910
- The nonstochastic multiarmed bandit problem
- Auer, P.; Cesa-Bianchi, N.; Freund, Y.; and Schapire, R. E. 2002. The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1):48-77.
- (2002) SIAM J. Comput. , vol.32 , Issue.1 , pp. 48-77
- Auer, P.¹ Cesa-Bianchi, N.² Freund, Y.³ Schapire, R.E.⁴

4
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- Auer, P.; Cesa-Bianchi, N.; and Fischer, P. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2-3):235-256.
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

5
- 0035479281
- Computer Go: An AI oriented survey
- Bouzy, B., and Cazenave, T. 2001. Computer Go: An AI oriented survey. Artif. Intell. 132(1):39-103.
- (2001) Artif. Intell. , vol.132 , Issue.1 , pp. 39-103
- Bouzy, B.¹ Cazenave, T.²

6
- 80053655011
- Bayesian generation and integration of K-nearest-neighbor patterns for 19×19 Go
- Bouzy, B., and Chaslot, G. 2005. Bayesian generation and integration of K-nearest-neighbor patterns for 19×19 Go. In IEEE Symposium on Computational Intelligence in Games, 176-181.
- (2005) IEEE Symposium on Computational Intelligence in Games , pp. 176-181
- Bouzy, B.¹ Chaslot, G.²

7
- 24944458186
- Monte-Carlo Go developments
- van den Herik, H. J.; Iida, H.; and Heinz, E. A., eds., volume 263 of IFIP, Kluwer
- Bouzy, B., and Helmstetter, B. 2003. Monte-Carlo Go developments. In van den Herik, H. J.; Iida, H.; and Heinz, E. A., eds., ACG, volume 263 of IFIP, 159-174. Kluwer.
- (2003) ACG , pp. 159-174
- Bouzy, B.¹ Helmstetter, B.²

8
- 77952070805
- Pure exploration in multi-armed bandits problems
- Gavaldà, R.; Lugosi, G.; Zeugmann, T.; and Zilles, S., eds., ALT, Springer
- Bubeck, S.; Munos, R.; and Stoltz, G. 2009. Pure exploration in multi-armed bandits problems. In Gavaldà, R.; Lugosi, G.; Zeugmann, T.; and Zilles, S., eds., ALT, volume 5809 of Lecture Notes in Computer Science, 23-37. Springer.
- (2009) Lecture Notes in Computer Science , vol.5809 , pp. 23-37
- Bubeck, S.¹ Munos, R.² Stoltz, G.³

9
- 34250005402
- Computer Go: A grand challenge to AI
- Duch, W., and Mandziuk, J., eds., volume 63 of Studies in Computational Intelligence. Springer
- Cai, X., and Wunsch, D. C. 2007. Computer Go: A grand challenge to AI. In Duch, W., and Mandziuk, J., eds., Challenges for Computational Intelligence, volume 63 of Studies in Computational Intelligence. Springer. 443-465.
- (2007) Challenges for Computational Intelligence , pp. 443-465
- Cai, X.¹ Wunsch, D.C.²

10
- 67650687540
- Progressive strategies for Monte-Carlo tree search
- Chaslot, G.; Winands, M.; Uiterwijk, J.; van den Herik, H.; and Bouzy, B. 2007. Progressive strategies for Monte-Carlo tree search. In Proceedings of the 10th Joint Conference on Information Sciences (JCIS 2007), 655-661.
- (2007) Proceedings of the 10th Joint Conference on Information Sciences (JCIS 2007) , pp. 655-661
- Chaslot, G.¹ Winands, M.² Uiterwijk, J.³ Van Den Herik, H.⁴ Bouzy, B.⁵

11
- 82355189414
- Chaslot, G.; Chatriot, L.; Fiter, C.; Gelly, S.; Hoock, J.; Perez, J.; Rimmel, A.; and Teytaud, O. 2008. Combining expert, offline, transient and online knowledge in Monte-Carlo exploration. http://www.lri.fr/~teytaud/eg.pdf.
- (2008) Combining Expert, Offline, Transient and Online Knowledge in Monte-Carlo Exploration
- Chaslot, G.¹ Chatriot, L.² Fiter, C.³ Gelly, S.⁴ Hoock, J.⁵ Perez, J.⁶ Rimmel, A.⁷ Teytaud, O.⁸

12
- 38049037928
- Efficient selectivity and backup operators in Monte-Carlo tree search
- van den Herik, H. J.; Ciancarini, P.; and Donkers, H. H. L. M., eds., volume 4630 of Lecture Notes in Computer Science, Springer
- Coulom, R. 2006. Efficient selectivity and backup operators in Monte-Carlo tree search. In van den Herik, H. J.; Ciancarini, P.; and Donkers, H. H. L. M., eds., Computers and Games, volume 4630 of Lecture Notes in Computer Science, 72-83. Springer.
- (2006) Computers and Games , pp. 72-83
- Coulom, R.¹

13
- 70349287633
- Computing Elo ratings of move patterns in the game of Go
- Coulom, R. 2007a. Computing Elo ratings of move patterns in the game of Go. In Computer Games Workshop.
- (2007) Computer Games Workshop
- Coulom, R.¹

14
- 70349270662
- Monte-Carlo tree search in Crazy Stone
- Coulom, R. 2007b. Monte-Carlo tree search in Crazy Stone. In 12th Game Programming Workshop.
- (2007) 12th Game Programming Workshop
- Coulom, R.¹

15
- 70049104257
- Bandit-based optimization on graphs with application to library performance tuning
- Danyluk, A. P.; Bottou, L.; and Littman, M. L., eds., ICML. ACM
- de Mesmay, F.; Rimmel, A.; Voronenko, Y.; and Püschel, M. 2009. Bandit-based optimization on graphs with application to library performance tuning. In Danyluk, A. P.; Bottou, L.; and Littman, M. L., eds., ICML, volume 382 of ACM International Conference Proceeding Series, 92. ACM.
- (2009) ACM International Conference Proceeding Series , vol.382 , pp. 92
- De Mesmay, F.¹ Rimmel, A.² Voronenko, Y.³ Püschel, M.⁴

16
- 33745295134
- Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
- Even-Dar, E.; Mannor, S.; and Mansour, Y. 2006. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research 7:1079-1105.
- (2006) Journal of Machine Learning Research , vol.7 , pp. 1079-1105
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

17
- 57749181518
- Simulation-based approach to general game playing
- Finnsson, H., and Björnsson, Y. 2008. Simulation-based approach to general game playing. In Fox and Gomes (2008), 259-264.
- (2008) Fox and Gomes , pp. 259-264
- Finnsson, H.¹ Björnsson, Y.²

18
- 84874125372
- AAAI 2008, Chicago, Illinois, USA, July 13-17, 2008. AAAI Press
- Fox, D., and Gomes, C. P., eds. 2008. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, Illinois, USA, July 13-17, 2008. AAAI Press.
- (2008) Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence
- Fox, D.¹ Gomes, C.P.²

19
- 34547990649
- Combining online and offline knowledge in UCT
- Ghahramani, Z., ed. ICML. ACM
- Gelly, S., and Silver, D. 2007. Combining online and offline knowledge in UCT. In Ghahramani, Z., ed., ICML, volume 227 of ACM International Conference Proceeding Series, 273-280. ACM.
- (2007) ACM International Conference Proceeding Series , vol.227 , pp. 273-280
- Gelly, S.¹ Silver, D.²

20
- 57749091602
- Achieving master level play in 9 × 9 computer Go
- Gelly, S., and Silver, D. 2008. Achieving master level play in 9 × 9 computer Go. In Fox and Gomes (2008), 1537-1540.
- (2008) Fox and Gomes , pp. 1537-1540
- Gelly, S.¹ Silver, D.²

21
- 84874181959
- Gap-free bounds for multi-armed stochastic bandit
- Juditsky, A.; Nazin, A.; Tsybakov, A.; and Vayatis, N. 2008. Gap-free bounds for multi-armed stochastic bandit. In World Congr. of IFAC.
- (2008) World Congr. of IFAC
- Juditsky, A.¹ Nazin, A.² Tsybakov, A.³ Vayatis, N.⁴

22
- 56449104477
- Efficient bandit algorithms for online multiclass prediction
- Cohen, W. W.; McCallum, A.; and Roweis, S. T., eds., ICML. ACM
- Kakade, S. M.; Shalev-Shwartz, S.; and Tewari, A. 2008. Efficient bandit algorithms for online multiclass prediction. In Cohen, W. W.; McCallum, A.; and Roweis, S. T., eds., ICML, volume 307 of ACM International Conference Proceeding Series, 440-447. ACM.
- (2008) ACM International Conference Proceeding Series , vol.307 , pp. 440-447
- Kakade, S.M.¹ Shalev-Shwartz, S.² Tewari, A.³

23
- 34547975806
- Bandit based Monte-Carlo planning
- Kocsis, L., and Szepesvari, C. 2006. Bandit based Monte-Carlo planning. In ECML.
- (2006) ECML
- Kocsis, L.¹ Szepesvari, C.²

24
- 83055177001
- The epoch-greedy algorithm for multi-armed bandits with side information
- Platt, J. C.; Koller, D.; Singer, Y.; and Roweis, S. T., eds., MIT Press
- Langford, J., and Zhang, T. 2007. The epoch-greedy algorithm for multi-armed bandits with side information. In Platt, J. C.; Koller, D.; Singer, Y.; and Roweis, S. T., eds., NIPS. MIT Press.
- (2007) NIPS
- Langford, J.¹ Zhang, T.²

25
- 30044441333
- The sample complexity of exploration in the multi-armed bandit problem
- Mannor, S., and Tsitsiklis, J. N. 2004. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research 5:623-648.
- (2004) Journal of Machine Learning Research , vol.5 , pp. 623-648
- Mannor, S.¹ Tsitsiklis, J.N.²

26
- 84898064829
- Stochastic convex optimization
- Shalev-Shwartz, S.; Shamir, O.; Srebro, N.; and Sridharan, K. 2009. Stochastic convex optimization. In COLT.
- (2009) COLT
- Shalev-Shwartz, S.¹ Shamir, O.² Srebro, N.³ Sridharan, K.⁴

27
- 33750375100
- A simple distribution-free approach to the max k-armed bandit problem
- Benhamou, F., ed., CP. Springer
- Streeter, M. J., and Smith, S. F. 2006. A simple distribution-free approach to the max k-armed bandit problem. In Benhamou, F., ed., CP, volume 4204 of Lecture Notes in Computer Science, 560-574. Springer.
- (2006) Lecture Notes in Computer Science , vol.4204 , pp. 560-574
- Streeter, M.J.¹ Smith, S.F.²

28
- 34250750797
- Experience-efficient learning in associative bandit problems
- Cohen, W. W., and Moore, A., eds., ICML. ACM
- Strehl, A. L.; Mesterharm, C.; Littman, M. L.; and Hirsh, H. 2006. Experience-efficient learning in associative bandit problems. In Cohen, W. W., and Moore, A., eds., ICML, volume 148 of ACM International Conference Proceeding Series, 889-896. ACM.
- (2006) ACM International Conference Proceeding Series , vol.148 , pp. 889-896
- Strehl, A.L.¹ Mesterharm, C.² Littman, M.L.³ Hirsh, H.⁴

29
- 84863410809
- Anytime many-armed bandits
- Teytaud, O.; Gelly, S.; and Sebag, M. 2007. Anytime many-armed bandits. In CAP07.
- (2007) CAP07
- Teytaud, O.¹ Gelly, S.² Sebag, M.³

30
- 15844389867
- Bandit problems with side observations
- Wang, C.-C.; Kulkarni, S.; and Poor, H. 2005. Bandit problems with side observations. IEEE Tr. Aut. Cont. 50:338-355.
- (2005) IEEE Tr. Aut. Cont. , vol.50 , pp. 338-355
- Wang, C.-C.¹ Kulkarni, S.² Poor, H.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.