-
1
-
-
0345224411
-
The continuum-armed bandit problem
-
Agrawal, R. (1995). The continuum-armed bandit problem. SIAM J. Control Optim., 33(6):1926-1951.
-
(1995)
SIAM J. Control Optim
, vol.33
, Issue.6
, pp. 1926-1951
-
-
Agrawal, R.1
-
3
-
-
0000768035
-
Denumerable-armed bandits
-
Available at http://ideas.repec.org/a/ecm/emetrp/v60y1992i5p1071-96.html
-
Banks, J. S. and Sundaram, R. K. (1992). Denumerable-armed bandits. Econometrica, 60(5):1071-1096. Available at http://ideas.repec.org/a/ecm/emetrp/v60y1992i5p1071-96.html.
-
(1992)
Econometrica
, vol.60
, Issue.5
, pp. 1071-1096
-
-
Banks, J.S.1
Sundaram, R.K.2
-
4
-
-
0002700781
-
Learning to act using real-time dynamic programming
-
Technical Report UM-CS-1993-002
-
Barto, A., Bradtke, S., and Singh, S. (1993). Learning to act using real-time dynamic programming. Technical Report UM-CS-1993-002.
-
(1993)
-
-
Barto, A.1
Bradtke, S.2
Singh, S.3
-
6
-
-
0031534756
-
Bandit problems with infinitely many arms
-
Berry, D. A., Chen, R. W., Zame, A., Heath, D. C., and Shepp, L. A. (1997). Bandit problems with infinitely many arms. Ann. Statist., 25(5):2103-2116.
-
(1997)
Ann. Statist
, vol.25
, Issue.5
, pp. 2103-2116
-
-
Berry, D.A.1
Chen, R.W.2
Zame, A.3
Heath, D.C.4
Shepp, L.A.5
-
8
-
-
0003640133
-
Monte Carlo Go
-
Unpublished
-
Bruegmann, B. (1993). Monte Carlo Go. Unpublished.
-
(1993)
-
-
Bruegmann, B.1
-
9
-
-
77049109578
-
Combining tactical search and Monte-Carlo in the game of Go
-
Cazenave, T. and Helmstetter, B. (2005). Combining tactical search and Monte-Carlo in the game of Go. IEEE CIG2005, pages 171-175.
-
(2005)
IEEE CIG2005
, pp. 171-175
-
-
Cazenave, T.1
Helmstetter, B.2
-
11
-
-
77949549457
-
-
Coulom, R. (2006). Efficient selectivity and backup operators in Monte-Carlo tree search. In P. Ciancarini and H. J. van den Herik, editors, Proceedings of the 5th International Conference on Computers and Games, Turin, Italy.
-
-
-
-
12
-
-
70349287633
-
Computing Elo ratings of move patterns in the game of Go
-
van den Herik, H. J., Uiterwijk, J. W. H. M., Winands, M., and Schadd, M., editors, Amsterdam
-
Coulom, R. (2007). Computing Elo ratings of move patterns in the game of Go. In van den Herik, H. J., Uiterwijk, J. W. H. M., Winands, M., and Schadd, M., editors, Computer Games Workshop, Amsterdam.
-
(2007)
Computer Games Workshop
-
-
Coulom, R.1
-
13
-
-
33244456637
-
Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
-
New York, NY, USA. ACM Press
-
Dani, V. and Hayes, T. P. (2006). Robbing the bandit: less regret in online geometric optimization against an adaptive adversary. In SODA '06: Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, pages 937-943, New York, NY, USA. ACM Press.
-
(2006)
SODA '06: Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
, pp. 937-943
-
-
Dani, V.1
Hayes, T.P.2
-
14
-
-
34547990649
-
Combining online and offline knowledge in UCT
-
New York, NY, USA. ACM Press
-
Gelly, S. and Silver, D. (2007). Combining online and offline knowledge in UCT. In ICML '07: Proceedings of the 24th international conference on Machine learning, pages 273-280, New York, NY, USA. ACM Press.
-
(2007)
ICML '07: Proceedings of the 24th international conference on Machine learning
, pp. 273-280
-
-
Gelly, S.1
Silver, D.2
-
15
-
-
55849113981
-
Exploration vs. exploitation challenge
-
Hussain, Z., Auer, P., Cesa-Bianchi, N., Newnham, L., and Shawe-Taylor, J. (2006). Exploration vs. exploitation challenge. Pascal Network of Excellence.
-
(2006)
Pascal Network of Excellence
-
-
Hussain, Z.1
Auer, P.2
Cesa-Bianchi, N.3
Newnham, L.4
Shawe-Taylor, J.5
-
17
-
-
34547975806
-
Bandit-based Monte-Carlo planning
-
Kocsis, L. and Szepesvari, C. (2006a). Bandit-based Monte-Carlo planning. ECML '06.
-
(2006)
ECML '06
-
-
Kocsis, L.1
Szepesvari, C.2
-
19
-
-
0002899547
-
Asymptotically efficient adaptive allocation rules
-
Lai, T. and Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22.
-
(1985)
Advances in Applied Mathematics
, vol.6
, pp. 4-22
-
-
Lai, T.1
Robbins, H.2
-
21
-
-
34547981323
-
Modifications of UCT and sequence-like simulations for Monte-Carlo Go
-
Honolulu, Hawaii
-
Wang, Y. and Gelly, S. (2007). Modifications of UCT and sequence-like simulations for Monte-Carlo Go. In IEEE Symposium on Computational Intelligence and Games, Honolulu, Hawaii, pages 175-182.
-
(2007)
IEEE Symposium on Computational Intelligence and Games
, pp. 175-182
-
-
Wang, Y.1
Gelly, S.2