-
1
-
-
84898079018
-
Minimax policies for adversarial and stochastic bandits
-
Audibert, J.-Y., and Bubeck, S. 2009. Minimax policies for adversarial and stochastic bandits. In COLT.
-
(2009)
COLT
-
-
Audibert, J.-Y.1
Bubeck, S.2
-
2
-
-
62949181077
-
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
-
Audibert, J.-Y.; Munos, R.; and Szepesvári, C. 2009. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 410(19):1876-1902.
-
(2009)
Theor. Comput. Sci.
, vol.410
, Issue.19
, pp. 1876-1902
-
-
Audibert, J.-Y.1
Munos, R.2
Szepesvári, C.3
-
3
-
-
0037709910
-
The nonstochastic multiarmed bandit problem
-
Auer, P.; Cesa-Bianchi, N.; Freund, Y.; and Schapire, R. E. 2002. The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1):48-77.
-
(2002)
SIAM J. Comput.
, vol.32
, Issue.1
, pp. 48-77
-
-
Auer, P.1
Cesa-Bianchi, N.2
Freund, Y.3
Schapire, R.E.4
-
4
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
Auer, P.; Cesa-Bianchi, N.; and Fischer, P. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2-3):235-256.
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
5
-
-
0035479281
-
Computer Go: An AI oriented survey
-
Bouzy, B., and Cazenave, T. 2001. Computer Go: An AI oriented survey. Artif. Intell. 132(1):39-103.
-
(2001)
Artif. Intell.
, vol.132
, Issue.1
, pp. 39-103
-
-
Bouzy, B.1
Cazenave, T.2
-
7
-
-
24944458186
-
Monte-Carlo Go developments
-
van den Herik, H. J.; Iida, H.; and Heinz, E. A., eds., volume 263 of IFIP, Kluwer
-
Bouzy, B., and Helmstetter, B. 2003. Monte-Carlo Go developments. In van den Herik, H. J.; Iida, H.; and Heinz, E. A., eds., ACG, volume 263 of IFIP, 159-174. Kluwer.
-
(2003)
ACG
, pp. 159-174
-
-
Bouzy, B.1
Helmstetter, B.2
-
8
-
-
77952070805
-
Pure exploration in multi-armed bandits problems
-
Gavaldà, R.; Lugosi, G.; Zeugmann, T.; and Zilles, S., eds., ALT, Springer
-
Bubeck, S.; Munos, R.; and Stoltz, G. 2009. Pure exploration in multi-armed bandits problems. In Gavaldà, R.; Lugosi, G.; Zeugmann, T.; and Zilles, S., eds., ALT, volume 5809 of Lecture Notes in Computer Science, 23-37. Springer.
-
(2009)
Lecture Notes in Computer Science
, vol.5809
, pp. 23-37
-
-
Bubeck, S.1
Munos, R.2
Stoltz, G.3
-
9
-
-
34250005402
-
Computer Go: A grand challenge to AI
-
Duch, W., and Mandziuk, J., eds., volume 63 of Studies in Computational Intelligence. Springer
-
Cai, X., and Wunsch, D. C. 2007. Computer Go: A grand challenge to AI. In Duch, W., and Mandziuk, J., eds., Challenges for Computational Intelligence, volume 63 of Studies in Computational Intelligence. Springer. 443-465.
-
(2007)
Challenges for Computational Intelligence
, pp. 443-465
-
-
Cai, X.1
Wunsch, D.C.2
-
10
-
-
67650687540
-
Progressive strategies for Monte-Carlo tree search
-
Chaslot, G.; Winands, M.; Uiterwijk, J.; van den Herik, H.; and Bouzy, B. 2007. Progressive strategies for Monte-Carlo tree search. In Proceedings of the 10th Joint Conference on Information Sciences (JCIS 2007), 655-661.
-
(2007)
Proceedings of the 10th Joint Conference on Information Sciences (JCIS 2007)
, pp. 655-661
-
-
Chaslot, G.1
Winands, M.2
Uiterwijk, J.3
Van Den Herik, H.4
Bouzy, B.5
-
11
-
-
82355189414
-
-
Chaslot, G.; Chatriot, L.; Fiter, C.; Gelly, S.; Hoock, J.; Perez, J.; Rimmel, A.; and Teytaud, O. 2008. Combining expert, offline, transient and online knowledge in Monte-Carlo exploration. http://www.lri.fr/~teytaud/eg.pdf.
-
(2008)
Combining Expert, Offline, Transient and Online Knowledge in Monte-Carlo Exploration
-
-
Chaslot, G.1
Chatriot, L.2
Fiter, C.3
Gelly, S.4
Hoock, J.5
Perez, J.6
Rimmel, A.7
Teytaud, O.8
-
12
-
-
38049037928
-
Efficient selectivity and backup operators in Monte-Carlo tree search
-
van den Herik, H. J.; Ciancarini, P.; and Donkers, H. H. L. M., eds., volume 4630 of Lecture Notes in Computer Science, Springer
-
Coulom, R. 2006. Efficient selectivity and backup operators in Monte-Carlo tree search. In van den Herik, H. J.; Ciancarini, P.; and Donkers, H. H. L. M., eds., Computers and Games, volume 4630 of Lecture Notes in Computer Science, 72-83. Springer.
-
(2006)
Computers and Games
, pp. 72-83
-
-
Coulom, R.1
-
13
-
-
70349287633
-
Computing Elo ratings of move patterns in the game of Go
-
Coulom, R. 2007a. Computing Elo ratings of move patterns in the game of Go. In Computer Games Workshop.
-
(2007)
Computer Games Workshop
-
-
Coulom, R.1
-
15
-
-
70049104257
-
Bandit-based optimization on graphs with application to library performance tuning
-
Danyluk, A. P.; Bottou, L.; and Littman, M. L., eds., ICML. ACM
-
de Mesmay, F.; Rimmel, A.; Voronenko, Y.; and Püschel, M. 2009. Bandit-based optimization on graphs with application to library performance tuning. In Danyluk, A. P.; Bottou, L.; and Littman, M. L., eds., ICML, volume 382 of ACM International Conference Proceeding Series, 92. ACM.
-
(2009)
ACM International Conference Proceeding Series
, vol.382
, pp. 92
-
-
De Mesmay, F.1
Rimmel, A.2
Voronenko, Y.3
Püschel, M.4
-
16
-
-
33745295134
-
Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems
-
Even-Dar, E.; Mannor, S.; and Mansour, Y. 2006. Action elimination and stopping conditions for the multi-armed bandit and reinforcement learning problems. Journal of Machine Learning Research 7:1079-1105.
-
(2006)
Journal of Machine Learning Research
, vol.7
, pp. 1079-1105
-
-
Even-Dar, E.1
Mannor, S.2
Mansour, Y.3
-
17
-
-
57749181518
-
Simulation-based approach to general game playing
-
2008
-
Finnsson, H., and Björnsson, Y. 2008. Simulation-based approach to general game playing. In Fox and Gomes (2008), 259-264.
-
(2008)
Fox and Gomes
, pp. 259-264
-
-
Finnsson, H.1
Björnsson, Y.2
-
18
-
-
84874125372
-
-
AAAI 2008, Chicago, Illinois, USA, July 13-17, 2008. AAAI Press
-
Fox, D., and Gomes, C. P., eds. 2008. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, Illinois, USA, July 13-17, 2008. AAAI Press.
-
(2008)
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence
-
-
Fox, D.1
Gomes, C.P.2
-
19
-
-
34547990649
-
Combining online and offline knowledge in UCT
-
Ghahramani, Z., ed. ICML. ACM
-
Gelly, S., and Silver, D. 2007. Combining online and offline knowledge in UCT. In Ghahramani, Z., ed., ICML, volume 227 of ACM International Conference Proceeding Series, 273-280. ACM.
-
(2007)
ACM International Conference Proceeding Series
, vol.227
, pp. 273-280
-
-
Gelly, S.1
Silver, D.2
-
20
-
-
57749091602
-
Achieving master level play in 9 × 9 computer Go
-
2008
-
Gelly, S., and Silver, D. 2008. Achieving master level play in 9 × 9 computer Go. In Fox and Gomes (2008), 1537-1540.
-
(2008)
Fox and Gomes
, pp. 1537-1540
-
-
Gelly, S.1
Silver, D.2
-
22
-
-
56449104477
-
Efficient bandit algorithms for online multiclass prediction
-
Cohen, W. W.; McCallum, A.; and Roweis, S. T., eds., ICML. ACM
-
Kakade, S. M.; Shalev-Shwartz, S.; and Tewari, A. 2008. Efficient bandit algorithms for online multiclass prediction. In Cohen, W. W.; McCallum, A.; and Roweis, S. T., eds., ICML, volume 307 of ACM International Conference Proceeding Series, 440-447. ACM.
-
(2008)
ACM International Conference Proceeding Series
, vol.307
, pp. 440-447
-
-
Kakade, S.M.1
Shalev-Shwartz, S.2
Tewari, A.3
-
23
-
-
34547975806
-
Bandit based Monte-Carlo planning
-
Kocsis, L., and Szepesvári, C. 2006. Bandit based Monte-Carlo planning. In ECML.
-
(2006)
ECML
-
-
Kocsis, L.1
Szepesvári, C.2
-
24
-
-
83055177001
-
The epoch-greedy algorithm for multi-armed bandits with side information
-
Platt, J. C.; Koller, D.; Singer, Y.; and Roweis, S. T., eds., MIT Press
-
Langford, J., and Zhang, T. 2007. The epoch-greedy algorithm for multi-armed bandits with side information. In Platt, J. C.; Koller, D.; Singer, Y.; and Roweis, S. T., eds., NIPS. MIT Press.
-
(2007)
NIPS
-
-
Langford, J.1
Zhang, T.2
-
25
-
-
30044441333
-
The sample complexity of exploration in the multi-armed bandit problem
-
Mannor, S., and Tsitsiklis, J. N. 2004. The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research 5:623-648.
-
(2004)
Journal of Machine Learning Research
, vol.5
, pp. 623-648
-
-
Mannor, S.1
Tsitsiklis, J.N.2
-
27
-
-
33750375100
-
A simple distribution-free approach to the max k-armed bandit problem
-
Benhamou, F., ed., CP. Springer
-
Streeter, M. J., and Smith, S. F. 2006. A simple distribution-free approach to the max k-armed bandit problem. In Benhamou, F., ed., CP, volume 4204 of Lecture Notes in Computer Science, 560-574. Springer.
-
(2006)
Lecture Notes in Computer Science
, vol.4204
, pp. 560-574
-
-
Streeter, M.J.1
Smith, S.F.2
-
28
-
-
34250750797
-
Experience-efficient learning in associative bandit problems
-
Cohen, W. W., and Moore, A., eds., ICML. ACM
-
Strehl, A. L.; Mesterharm, C.; Littman, M. L.; and Hirsh, H. 2006. Experience-efficient learning in associative bandit problems. In Cohen, W. W., and Moore, A., eds., ICML, volume 148 of ACM International Conference Proceeding Series, 889-896. ACM.
-
(2006)
ACM International Conference Proceeding Series
, vol.148
, pp. 889-896
-
-
Strehl, A.L.1
Mesterharm, C.2
Littman, M.L.3
Hirsh, H.4