1. Friedman, J., Hastie, T. & Tibshirani, R. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer, 2009).
2. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436-444 (2015).
3. Krizhevsky, A., Sutskever, I. & Hinton, G. ImageNet classification with deep convolutional neural networks. In Adv. Neural Inf. Process. Syst. Vol. 25 (eds Pereira, F., Burges, C. J. C., Bottou, L. & Weinberger, K. Q.) 1097-1105 (2012).
4. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. 29th IEEE Conf. Comput. Vis. Pattern Recognit. 770-778 (2016).
6. Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529-533 (2015).
7. Guo, X., Singh, S. P., Lee, H., Lewis, R. L. & Wang, X. Deep learning for real-time Atari game play using offline Monte-Carlo tree search planning. In Adv. Neural Inf. Process. Syst. Vol. 27 (eds Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. D. & Weinberger, K. Q.) 3338-3346 (2014).
8. Mnih, V. et al. Asynchronous methods for deep reinforcement learning. In Proc. 33rd Int. Conf. Mach. Learn. Vol. 48 (eds Balcan, M. F. & Weinberger, K. Q.) 1928-1937 (2016).
9. Jaderberg, M. et al. Reinforcement learning with unsupervised auxiliary tasks. In 5th Int. Conf. Learn. Representations (2017).
12. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484-489 (2016).
13. Coulom, R. Efficient selectivity and backup operators in Monte-Carlo tree search. In 5th Int. Conf. Computers and Games (eds Ciancarini, P. & van den Herik, H. J.) 72-83 (2006).
15. Browne, C. et al. A survey of Monte Carlo tree search methods. IEEE Trans. Comput. Intell. AI Games 4, 1-49 (2012).
16. Fukushima, K. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36, 193-202 (1980).
18. Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proc. 32nd Int. Conf. Mach. Learn. Vol. 37 448-456 (2015).
19. Hahnloser, R. H. R., Sarpeshkar, R., Mahowald, M. A., Douglas, R. J. & Seung, H. S. Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature 405, 947-951 (2000).
22. Bertsekas, D. P. Approximate policy iteration: a survey and some new methods. J. Control Theory Appl. 9, 310-335 (2011).
23. Scherrer, B. Approximate policy iteration schemes: a comparison. In Proc. 31st Int. Conf. Mach. Learn. Vol. 32 1314-1322 (2014).
24. Rosin, C. D. Multi-armed bandits with episode context. Ann. Math. Artif. Intell. 61, 203-230 (2011).
25. Coulom, R. Whole-history rating: a Bayesian rating system for players of time-varying strength. In Int. Conf. Comput. Games (eds van den Herik, H. J., Xu, X., Ma, Z. & Winands, M. H. M.) Vol. 5131 113-124 (Springer, 2008).
26. Laurent, G. J., Matignon, L. & Le Fort-Piat, N. The world of independent learners is not Markovian. Int. J. Knowledge-Based Intelligent Engineering Systems 15, 55-64 (2011).
27. Foerster, J. N. et al. Stabilising experience replay for deep multi-agent reinforcement learning. In Proc. 34th Int. Conf. Mach. Learn. Vol. 70 1146-1155 (2017).
29. Jouppi, N. P. et al. In-datacenter performance analysis of a Tensor Processing Unit. In Proc. 44th Annu. Int. Symp. Comp. Architecture 1-12 (2017).
30. Maddison, C. J., Huang, A., Sutskever, I. & Silver, D. Move evaluation in Go using deep convolutional neural networks. In 3rd Int. Conf. Learn. Representations (2015).
31. Clark, C. & Storkey, A. J. Training deep convolutional neural networks to play Go. In Proc. 32nd Int. Conf. Mach. Learn. Vol. 37 1766-1774 (2015).
32. Tian, Y. & Zhu, Y. Better computer Go player with neural network and long-term prediction. In 4th Int. Conf. Learn. Representations (2016).
35. Barto, A. G. & Duff, M. Monte Carlo matrix inversion and reinforcement learning. Adv. Neural Inf. Process. Syst. 6, 687-694 (1994).
36. Singh, S. P. & Sutton, R. S. Reinforcement learning with replacing eligibility traces. Mach. Learn. 22, 123-158 (1996).
37. Lagoudakis, M. G. & Parr, R. Reinforcement learning as classification: leveraging modern classifiers. In Proc. 20th Int. Conf. Mach. Learn. 424-431 (2003).
38. Scherrer, B., Ghavamzadeh, M., Gabillon, V., Lesner, B. & Geist, M. Approximate modified policy iteration and its application to the game of Tetris. J. Mach. Learn. Res. 16, 1629-1676 (2015).
39. Littman, M. L. Markov games as a framework for multi-agent reinforcement learning. In Proc. 11th Int. Conf. Mach. Learn. 157-163 (1994).
41. Enzenberger, M. in Advances in Computer Games (eds Van Den Herik, H. J., Iida, H. & Heinz, E. A.) 97-108 (2003).
42. Sutton, R. Learning to predict by the method of temporal differences. Mach. Learn. 3, 9-44 (1988).
43. Schraudolph, N. N., Dayan, P. & Sejnowski, T. J. Temporal difference learning of position evaluation in the game of Go. Adv. Neural Inf. Process. Syst. 6, 817-824 (1994).
44. Silver, D., Sutton, R. & Müller, M. Temporal-difference search in computer Go. Mach. Learn. 87, 183-219 (2012).
46. Gelly, S. & Silver, D. Monte-Carlo tree search and rapid action value estimation in computer Go. Artif. Intell. 175, 1856-1875 (2011).
47. Coulom, R. Computing Elo ratings of move patterns in the game of Go. Int. Comput. Games Assoc. J. 30, 198-208 (2007).
48. Gelly, S., Wang, Y., Munos, R. & Teytaud, O. Modification of UCT with Patterns in Monte-Carlo Go. Report No. 6062 (INRIA, 2006).
49. Baxter, J., Tridgell, A. & Weaver, L. Learning to play chess using temporal differences. Mach. Learn. 40, 243-263 (2000).
50. Veness, J., Silver, D., Blair, A. & Uther, W. Bootstrapping from game tree search. In Adv. Neural Inf. Process. Syst. 1937-1945 (2009).
52. Schaeffer, J., Hlynka, M. & Jussila, V. Temporal difference learning applied to a high-performance game-playing program. In Proc. 17th Int. Jt Conf. Artif. Intell. Vol. 1 529-534 (2001).
53. Tesauro, G. TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Comput. 6, 215-219 (1994).
54. Buro, M. From simple features to sophisticated evaluation functions. In Proc. 1st Int. Conf. Comput. Games 126-145 (1999).
55. Sheppard, B. World-championship-caliber Scrabble. Artif. Intell. 134, 241-275 (2002).
56. Moravčík, M. et al. DeepStack: expert-level artificial intelligence in heads-up no-limit poker. Science 356, 508-513 (2017).
58. Tesauro, G. Neurogammon: a neural-network backgammon program. In Proc. Int. Jt Conf. Neural Netw. Vol. 3 33-39 (1990).
59. Samuel, A. L. Some studies in machine learning using the game of checkers II - recent progress. IBM J. Res. Develop. 11, 601-617 (1967).
60. Kober, J., Bagnell, J. A. & Peters, J. Reinforcement learning in robotics: a survey. Int. J. Robot. Res. 32, 1238-1274 (2013).
64. Abe, N. et al. Empirical comparison of various reinforcement learning strategies for sequential targeted marketing. In IEEE Int. Conf. Data Mining 3-10 (2002).
65. Silver, D., Newnham, L., Barker, D., Weller, S. & McFall, J. Concurrent reinforcement learning from customer interactions. In Proc. 30th Int. Conf. Mach. Learn. Vol. 28 924-932 (2013).
67. Müller, M. Computer Go. Artif. Intell. 134, 145-179 (2002).
68. Shahriari, B., Swersky, K., Wang, Z., Adams, R. P. & de Freitas, N. Taking the human out of the loop: a review of Bayesian optimization. Proc. IEEE 104, 148-175 (2016).
69. Segal, R. B. On the scalability of parallel UCT. Comput. Games 6515, 36-47 (2011).