[1] Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, and Daan Wierstra. Weight uncertainty in neural networks. In Proceedings of the 32nd International Conference on Machine Learning (ICML), pp. 1613–1622, 2015.
[2] Tim Cooijmans, Nicolas Ballas, César Laurent, Çaglar Gülçehre, and Aaron Courville. Recurrent batch normalization. In International Conference on Learning Representations (ICLR), 2017.
[3] Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, et al. Noisy networks for exploration. arXiv preprint arXiv:1706.10295, 2017.
[8] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning (ICML), pp. 448–456, 2015.
[9] Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, John Hu, Robert Hundt, Dan Hurt, Julian Ibarz, Aaron Jaffey, Alek Jaworski, Alexander Kaplan, Harshit Khaitan, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, Diemthu Le, Chris Leary, Zhuyuan Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kieran Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, and Jonathan Ross. In-datacenter performance analysis of a tensor processing unit. 2017. URL https://arxiv.org/pdf/1704.04760.pdf.
[12] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
[13] David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Hugo Larochelle, Aaron C. Courville, and Chris Pal. Zoneout: Regularizing RNNs by randomly preserving hidden activations. CoRR, abs/1606.01305, 2016.
[15] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
[17] Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330, 1993.
[21] Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, and Marcin Andrychowicz. Parameter space noise for exploration. arXiv preprint arXiv:1706.01905, 2017.
[27] Jürgen Schmidhuber, Daan Wierstra, Matteo Gagliolo, and Faustino Gomez. Training recurrent networks by evolino. Neural Computation, 19(3):757–779, 2007.
[30] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15:1929–1958, 2014.
[31] Li Wan, Matthew Zeiler, Sixin Zhang, Yann LeCun, and Rob Fergus. Regularization of neural networks using DropConnect. In Proceedings of the 30th International Conference on Machine Learning (ICML), pp. 1058–1066, 2013.
[32] Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3-4):229–256, 1992.