-
1
-
-
84958264664
-
-
arXiv preprint, 1603.04467 arxiv.org/abs/1603.04467. Software available from tensorflow.org
-
M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. J. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Józefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. G. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. A. Tucker, V. Vanhoucke, V. Vasudevan, F. B. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint, 1603.04467, 2016. arxiv.org/abs/1603.04467. Software available from tensorflow.org.
-
(2016)
TensorFlow: Large-scale Machine Learning on Heterogeneous Distributed Systems
-
-
Abadi, M.1
Agarwal, A.2
Barham, P.3
Brevdo, E.4
Chen, Z.5
Citro, C.6
Corrado, G.S.7
Davis, A.8
Dean, J.9
Devin, M.10
Ghemawat, S.11
Goodfellow, I.J.12
Harp, A.13
Irving, G.14
Isard, M.15
Jia, Y.16
Józefowicz, R.17
Kaiser, L.18
Kudlur, M.19
Levenberg, J.20
Mane, D.21
Monga, R.22
Moore, S.23
Murray, D.G.24
Olah, C.25
Schuster, M.26
Shlens, J.27
Steiner, B.28
Sutskever, I.29
Talwar, K.30
Tucker, P.A.31
Vanhoucke, V.32
Vasudevan, V.33
Viégas, F.B.34
Vinyals, O.35
Warden, P.36
Wattenberg, M.37
Wicke, M.38
Yu, Y.39
Zheng, X.40
-
2
-
-
84979557463
-
-
arXiv preprint, 1605.02688 arxiv.org/abs/1605.02688
-
R. Al-Rfou, G. Alain, A. Almahairi, C. Angermueller, D. Bahdanau, N. Ballas, F. Bastien, J. Bayer, A. Belikov, A. Belopolsky, Y. Bengio, A. Bergeron, J. Bergstra, V. Bisson, J. Bleecher Snyder, N. Bouchard, N. Boulanger-Lewandowski, X. Bouthillier, A. de Brébisson, O. Breuleux, P.-L. Carrier, K. Cho, J. Chorowski, P. Christiano, T. Cooijmans, M.-A. Côté, M. Côté, A. Courville, Y. N. Dauphin, O. Delalleau, J. Demouth, G. Desjardins, S. Dieleman, L. Dinh, M. Ducoffe, V. Dumoulin, S. Ebrahimi Kahou, D. Erhan, Z. Fan, O. Firat, M. Germain, X. Glorot, I. Goodfellow, M. Graham, C. Gulcehre, P. Hamel, I. Harlouchet, J.-P. Heng, B. Hidasi, S. Honari, A. Jain, S. Jean, K. Jia, M. Korobov, V. Kulkarni, A. Lamb, P. Lamblin, E. Larsen, C. Laurent, S. Lee, S. Lefrancois, S. Lemieux, N. Léonard, Z. Lin, J. A. Livezey, C. Lorenz, J. Lowin, Q. Ma, P.-A. Manzagol, O. Mastropietro, R. T. McGibbon, R. Memisevic, B. van Merriënboer, V. Michalski, M. Mirza, A. Orlandi, C. Pal, R. Pascanu, M. Pezeshki, C. Raffel, D. Renshaw, M. Rocklin, A. Romero, M. Roth, P. Sadowski, J. Salvatier, F. Savard, J. Schlüter, J. Schulman, G. Schwartz, I. V. Serban, D. Serdyuk, S. Shabanian, E. Simon, S. Spieckermann, S. R. Subramanyam, J. Sygnowski, J. Tanguay, G. van Tulder, J. Turian, S. Urban, P. Vincent, F. Visin, H. de Vries, D. Warde-Farley, D. J. Webb, M. Willson, K. Xu, L. Xue, L. Yao, S. Zhang, and Y. Zhang. Theano: A Python framework for fast computation of mathematical expressions. arXiv preprint, 1605.02688, 2016. arxiv.org/abs/1605.02688.
-
(2016)
Theano: A Python Framework for Fast Computation of Mathematical Expressions
-
-
Al-Rfou, R.1
Alain, G.2
Almahairi, A.3
Angermueller, C.4
Bahdanau, D.5
Ballas, N.6
Bastien, F.7
Bayer, J.8
Belikov, A.9
Belopolsky, A.10
Bengio, Y.11
Bergeron, A.12
Bergstra, J.13
Bisson, V.14
Bleecher Snyder, J.15
Bouchard, N.16
Boulanger-Lewandowski, N.17
Bouthillier, X.18
De Brébisson, A.19
Breuleux, O.20
Carrier, P.-L.21
Cho, K.22
Chorowski, J.23
Christiano, P.24
Cooijmans, T.25
Côté, M.-A.26
Côté, M.27
Courville, A.28
Dauphin, Y.N.29
Delalleau, O.30
Demouth, J.31
Desjardins, G.32
Dieleman, S.33
Dinh, L.34
Ducoffe, M.35
Dumoulin, V.36
Ebrahimi Kahou, S.37
Erhan, D.38
Fan, Z.39
Firat, O.40
Germain, M.41
Glorot, X.42
Goodfellow, I.43
Graham, M.44
Gulcehre, C.45
Hamel, P.46
Harlouchet, I.47
Heng, J.-P.48
Hidasi, B.49
Honari, S.50
Jain, A.51
Jean, S.52
Jia, K.53
Korobov, M.54
Kulkarni, V.55
Lamb, A.56
Lamblin, P.57
Larsen, E.58
Laurent, C.59
Lee, S.60
Lefrancois, S.61
Lemieux, S.62
Léonard, N.63
Lin, Z.64
Livezey, J.A.65
Lorenz, C.66
Lowin, J.67
Ma, Q.68
Manzagol, P.-A.69
Mastropietro, O.70
McGibbon, R.T.71
Memisevic, R.72
Van Merriënboer, B.73
Michalski, V.74
Mirza, M.75
Orlandi, A.76
Pal, C.77
Pascanu, R.78
Pezeshki, M.79
Raffel, C.80
Renshaw, D.81
Rocklin, M.82
Romero, A.83
Roth, M.84
Sadowski, P.85
Salvatier, J.86
Savard, F.87
Schlüter, J.88
Schulman, J.89
Schwartz, G.90
Serban, I.V.91
Serdyuk, D.92
Shabanian, S.93
Simon, E.94
Spieckermann, S.95
Subramanyam, S.R.96
Sygnowski, J.97
Tanguay, J.98
Van Tulder, G.99
Turian, J.100
-
3
-
-
84938261376
-
Pedestrian detection with a large-field-of-view deep network
-
A. Angelova, A. Krizhevsky, and V. Vanhoucke. Pedestrian detection with a large-field-of-view deep network. In Proceedings of ICRA, pages 704-711. IEEE, 2015. www.vision.caltech.edu/anelia/publications/Angelova15LFOV.pdf.
-
(2015)
Proceedings of ICRA
, pp. 704-711
-
-
Angelova, A.1
Krizhevsky, A.2
Vanhoucke, V.3
-
4
-
-
0003376994
-
Dataflow architectures
-
Annual Reviews Inc., 1986
-
Arvind and D. E. Culler. Dataflow architectures. In Annual Review of Computer Science Vol. 1, 1986, pages 225-253. Annual Reviews Inc., 1986. www.dtic.mil/cgi-bin/GetTRDoc?Location=U2&doc=GetTRDoc.pdf&AD=ADA166235.
-
(1986)
Annual Review of Computer Science
, vol.1
, pp. 225-253
-
-
Arvind1
Culler, D.E.2
-
6
-
-
0142166851
-
A neural probabilistic language model
-
Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137-1155, 2003. jmlr.org/papers/volume3/bengio03a/bengio03a.pdf.
-
(2003)
Journal of Machine Learning Research
, vol.3
, pp. 1137-1155
-
-
Bengio, Y.1
Ducharme, R.2
Vincent, P.3
Jauvin, C.4
-
8
-
-
84865685824
-
Sample size selection in optimization methods for machine learning
-
R. H. Byrd, G. M. Chin, J. Nocedal, and Y. Wu. Sample size selection in optimization methods for machine learning. Mathematical Programming, 134(1):127-155, 2012. dx.doi.org/10.1007/s10107-012-0572-5.
-
(2012)
Mathematical Programming
, vol.134
, Issue.1
, pp. 127-155
-
-
Byrd, R.H.1
Chin, G.M.2
Nocedal, J.3
Wu, Y.4
-
9
-
-
84943795466
-
-
arXiv preprint, 1312.3005 arxiv.org/abs/1312.3005
-
C. Chelba, T. Mikolov, M. Schuster, Q. Ge, T. Brants, and P. Koehn. One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint, 1312.3005, 2013. arxiv.org/abs/1312.3005.
-
(2013)
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
-
-
Chelba, C.1
Mikolov, T.2
Schuster, M.3
Ge, Q.4
Brants, T.5
Koehn, P.6
-
11
-
-
84990032982
-
MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems
-
T. Chen, M. Li, Y. Li, M. Lin, N. Wang, M. Wang, T. Xiao, B. Xu, C. Zhang, and Z. Zhang. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems. In Proceedings of LearningSys, 2015. www.cs.cmu.edu/~muli/file/mxnet-learning-sys.pdf.
-
(2015)
Proceedings of LearningSys
-
-
Chen, T.1
Li, M.2
Li, Y.3
Lin, M.4
Wang, N.5
Wang, M.6
Xiao, T.7
Xu, B.8
Zhang, C.9
Zhang, Z.10
-
12
-
-
84991252160
-
-
arXiv preprint, 1606.07792 arxiv.org/abs/1606.07792
-
H.-T. Cheng, L. Koc, J. Harmsen, T. Shaked, T. Chandra, H. Aradhye, G. Anderson, G. Corrado, W. Chai, M. Ispir, R. Anil, Z. Haque, L. Hong, V. Jain, X. Liu, and H. Shah. Wide & deep learning for recommender systems. arXiv preprint, 1606.07792, 2016. arxiv.org/abs/1606.07792.
-
(2016)
Wide & Deep Learning for Recommender Systems
-
-
Cheng, H.-T.1
Koc, L.2
Harmsen, J.3
Shaked, T.4
Chandra, T.5
Aradhye, H.6
Anderson, G.7
Corrado, G.8
Chai, W.9
Ispir, M.10
Anil, R.11
Haque, Z.12
Hong, L.13
Jain, V.14
Liu, X.15
Shah, H.16
-
13
-
-
84944081816
-
-
arXiv preprint, 1410.0759 arxiv.org/abs/1410.0759
-
S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, and E. Shelhamer. cuDNN: Efficient primitives for deep learning. arXiv preprint, 1410.0759, 2014. arxiv.org/abs/1410.0759.
-
(2014)
cuDNN: Efficient Primitives for Deep Learning
-
-
Chetlur, S.1
Woolley, C.2
Vandermersch, P.3
Cohen, J.4
Tran, J.5
Catanzaro, B.6
Shelhamer, E.7
-
14
-
-
85069497682
-
Project Adam: Building an efficient and scalable deep learning training system
-
T. Chilimbi, Y. Suzue, J. Apacible, and K. Kalyanaraman. Project Adam: Building an efficient and scalable deep learning training system. In Proceedings of OSDI, pages 571-582, 2014. www.usenix.org/system/files/conference/osdi14/osdi14-paper-chilimbi.pdf.
-
(2014)
Proceedings of OSDI
, pp. 571-582
-
-
Chilimbi, T.1
Suzue, Y.2
Apacible, J.3
Kalyanaraman, K.4
-
16
-
-
84881142714
-
LINQits: Big data on little clients
-
E. S. Chung, J. D. Davis, and J. Lee. LINQits: Big data on little clients. In Proceedings of ISCA, pages 261-272, 2013. www.microsoft.com/en-us/research/wp-content/uploads/2013/06/ISCA13-linqits.pdf.
-
(2013)
Proceedings of ISCA
, pp. 261-272
-
-
Chung, E.S.1
Davis, J.D.2
Lee, J.3
-
17
-
-
5044234815
-
Torch: A modular machine learning software library
-
R. Collobert, S. Bengio, and J. Mariéthoz. Torch: A modular machine learning software library. Technical report, IDIAP, 2002. infoscience.epfl.ch/record/82802/files/rr02-46.pdf.
-
(2002)
Technical Report, IDIAP
-
-
Collobert, R.1
Bengio, S.2
Mariéthoz, J.3
-
18
-
-
84971575164
-
GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server
-
H. Cui, H. Zhang, G. R. Ganger, P. B. Gibbons, and E. P. Xing. GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server. In Proceedings of EuroSys, 2016. www.pdl.cmu.edu/PDL-FTP/CloudComputing/GeePS-cui-eurosys16.pdf.
-
(2016)
Proceedings of EuroSys
-
-
Cui, H.1
Zhang, H.2
Ganger, G.R.3
Gibbons, P.B.4
Xing, E.P.5
-
20
-
-
84877760312
-
Large scale distributed deep networks
-
J. Dean, G. S. Corrado, R. Monga, K. Chen, M. Devin, Q. V. Le, M. Z. Mao, M. Ranzato, A. Senior, P. Tucker, K. Yang, and A. Y. Ng. Large scale distributed deep networks. In Proceedings of NIPS, pages 1232-1240, 2012. research.google.com/archive/large_deep_networks_nips2012.pdf.
-
(2012)
Proceedings of NIPS
, pp. 1232-1240
-
-
Dean, J.1
Corrado, G.S.2
Monga, R.3
Chen, K.4
Devin, M.5
Le, Q.V.6
Mao, M.Z.7
Ranzato, M.8
Senior, A.9
Tucker, P.10
Yang, K.11
Ng, A.Y.12
-
21
-
-
85030321143
-
MapReduce: Simplified data processing on large clusters
-
J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. In Proceedings of OSDI, pages 137-149, 2004. research.google.com/archive/mapreduce-osdi04.pdf.
-
(2004)
Proceedings of OSDI
, pp. 137-149
-
-
Dean, J.1
Ghemawat, S.2
-
23
-
-
80052250414
-
Adaptive subgradient methods for online learning and stochastic optimization
-
J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121-2159, 2011. jmlr.org/papers/volume12/duchi11a/duchi11a.pdf.
-
(2011)
Journal of Machine Learning Research
, vol.12
, pp. 2121-2159
-
-
Duchi, J.1
Hazan, E.2
Singer, Y.3
-
24
-
-
84898958665
-
DeViSE: A deep visual-semantic embedding model
-
A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, T. Mikolov, et al. DeViSE: A deep visual-semantic embedding model. In Proceedings of NIPS, pages 2121-2129, 2013. research.google.com/pubs/archive/41473.pdf.
-
(2013)
Proceedings of NIPS
, pp. 2121-2129
-
-
Frome, A.1
Corrado, G.S.2
Shlens, J.3
Bengio, S.4
Dean, J.5
Mikolov, T.6
-
25
-
-
84922385124
-
Frame-by-frame language identification in short utterances using deep neural networks
-
J. Gonzalez-Dominguez, I. Lopez-Moreno, P. J. Moreno, and J. Gonzalez-Rodriguez. Frame-by-frame language identification in short utterances using deep neural networks. Neural Networks, 64:49-58, 2015. research.google.com/pubs/archive/42929.pdf.
-
(2015)
Neural Networks
, vol.64
, pp. 49-58
-
-
Gonzalez-Dominguez, J.1
Lopez-Moreno, I.2
Moreno, P.J.3
Gonzalez-Rodriguez, J.4
-
26
-
-
84937849144
-
Generative adversarial nets
-
I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. C. Courville, and Y. Bengio. Generative adversarial nets. In Proceedings of NIPS, pages 2672-2680, 2014. papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.
-
(2014)
Proceedings of NIPS
, pp. 2672-2680
-
-
Goodfellow, I.J.1
Pouget-Abadie, J.2
Mirza, M.3
Xu, B.4
Warde-Farley, D.5
Ozair, S.6
Courville, A.C.7
Bengio, Y.8
-
27
-
-
85076759367
-
-
Google Research. TensorFlow Serving, 2016. tensorflow.github.io/serving/.
-
(2016)
TensorFlow Serving
-
-
-
28
-
-
84986274465
-
Deep residual learning for image recognition
-
arxiv.org/abs/1512.03385
-
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of CVPR, pages 770-778, 2016. arxiv.org/abs/1512.03385.
-
(2016)
Proceedings of CVPR
, pp. 770-778
-
-
He, K.1
Zhang, X.2
Ren, S.3
Sun, J.4
-
29
-
-
84890539009
-
Multilingual acoustic models using distributed deep neural networks
-
G. Heigold, V. Vanhoucke, A. Senior, P. Nguyen, M. Ranzato, M. Devin, and J. Dean. Multilingual acoustic models using distributed deep neural networks. In Proceedings of ICASSP, pages 8619-8623, 2013. research.google.com/pubs/archive/40807.pdf.
-
(2013)
Proceedings of ICASSP
, pp. 8619-8623
-
-
Heigold, G.1
Vanhoucke, V.2
Senior, A.3
Nguyen, P.4
Ranzato, M.5
Devin, M.6
Dean, J.7
-
31
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G. E. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Process. Mag., 29(6):82-97, 2012. www.cs.toronto.edu/~gdahl/papers/deepSpeechReviewSPM2012.pdf.
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.E.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
Kingsbury, B.11
-
32
-
-
0031573117
-
Long short-term memory
-
S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural computation, 9(8):1735-1780, 1997. deeplearning.cs.cmu.edu/pdfs/Hochreiter97-lstm.pdf.
-
(1997)
Neural Computation
, vol.9
, Issue.8
, pp. 1735-1780
-
-
Hochreiter, S.1
Schmidhuber, J.2
-
33
-
-
84969584486
-
Batch normalization: Accelerating deep network training by reducing internal covariate shift
-
S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of ICML, pages 448-456, 2015. jmlr.org/proceedings/papers/v37/ioffe15.pdf.
-
(2015)
Proceedings of ICML
, pp. 448-456
-
-
Ioffe, S.1
Szegedy, C.2
-
34
-
-
34548041192
-
Dryad: Distributed data-parallel programs from sequential building blocks
-
M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: distributed data-parallel programs from sequential building blocks. In Proceedings of EuroSys, pages 59-72, 2007. www.microsoft.com/en-us/research/wp-content/uploads/2007/03/eurosys07.pdf.
-
(2007)
Proceedings of EuroSys
, pp. 59-72
-
-
Isard, M.1
Budiu, M.2
Yu, Y.3
Birrell, A.4
Fetterly, D.5
-
37
-
-
84943744936
-
On using very large target vocabulary for neural machine translation
-
July
-
S. Jean, K. Cho, R. Memisevic, and Y. Bengio. On using very large target vocabulary for neural machine translation. In Proceedings of ACL-IJCNLP, pages 1-10, July 2015. www.aclweb.org/anthology/P15-1001.
-
(2015)
Proceedings of ACL-IJCNLP
, pp. 1-10
-
-
Jean, S.1
Cho, K.2
Memisevic, R.3
Bengio, Y.4
-
38
-
-
84913580146
-
Caffe: Convolutional architecture for fast feature embedding
-
arxiv.org/abs/1408.5093
-
Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. In Proceedings of ACM Multimedia, pages 675-678, 2014. arxiv.org/abs/1408.5093.
-
(2014)
Proceedings of ACM Multimedia
, pp. 675-678
-
-
Jia, Y.1
Shelhamer, E.2
Donahue, J.3
Karayev, S.4
Long, J.5
Girshick, R.6
Guadarrama, S.7
Darrell, T.8
-
41
-
-
84978840213
-
-
arXiv preprint, 1602.02410 arxiv.org/abs/1602.02410
-
R. Józefowicz, O. Vinyals, M. Schuster, N. Shazeer, and Y. Wu. Exploring the limits of language modeling. arXiv preprint, 1602.02410, 2016. arxiv.org/abs/1602.02410.
-
(2016)
Exploring the Limits of Language Modeling
-
-
Józefowicz, R.1
Vinyals, O.2
Schuster, M.3
Shazeer, N.4
Wu, Y.5
-
42
-
-
84911364368
-
Large-scale video classification with convolutional neural networks
-
A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In Proceedings of CVPR, pages 1725-1732, 2014. research.google.com/pubs/archive/42455.pdf.
-
(2014)
Proceedings of CVPR
, pp. 1725-1732
-
-
Karpathy, A.1
Toderici, G.2
Shetty, S.3
Leung, T.4
Sukthankar, R.5
Fei-Fei, L.6
-
44
-
-
84876231242
-
ImageNet classification with deep convolutional neural networks
-
A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Proceedings of NIPS, pages 1106-1114, 2012. papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.
-
(2012)
Proceedings of NIPS
, pp. 1106-1114
-
-
Krizhevsky, A.1
Sutskever, I.2
Hinton, G.E.3
-
45
-
-
59449087310
-
Exploring strategies for training deep neural networks
-
H. Larochelle, Y. Bengio, J. Louradour, and P. Lamblin. Exploring strategies for training deep neural networks. Journal of Machine Learning Research, 10:1-40, 2009. jmlr.org/papers/volume10/larochelle09a/larochelle09a.pdf.
-
(2009)
Journal of Machine Learning Research
, vol.10
, pp. 1-40
-
-
Larochelle, H.1
Bengio, Y.2
Louradour, J.3
Lamblin, P.4
-
46
-
-
84986325583
-
Fast algorithms for convolutional neural networks
-
arxiv.org/abs/1509.09308
-
A. Lavin and S. Gray. Fast algorithms for convolutional neural networks. In Proceedings of CVPR, pages 4013-4021, 2016. arxiv.org/abs/1509.09308.
-
(2016)
Proceedings of CVPR
, pp. 4013-4021
-
-
Lavin, A.1
Gray, S.2
-
47
-
-
84867135575
-
Building high-level features using large scale unsupervised learning
-
Q. Le, M. Ranzato, R. Monga, M. Devin, G. Corrado, K. Chen, J. Dean, and A. Ng. Building high-level features using large scale unsupervised learning. In Proceedings of ICML, pages 81-88, 2012. research.google.com/archive/unsupervised-icml2012.pdf.
-
(2012)
Proceedings of ICML
, pp. 81-88
-
-
Le, Q.1
Ranzato, M.2
Monga, R.3
Devin, M.4
Corrado, G.5
Chen, K.6
Dean, J.7
Ng, A.8
-
49
-
-
84937912100
-
Scaling distributed machine learning with the parameter server
-
M. Li, D. G. Andersen, J. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B.-Y. Su. Scaling distributed machine learning with the Parameter Server. In Proceedings of OSDI, pages 583-598, 2014. www.usenix.org/system/files/conference/osdi14/osdi14-paper-li_mu.pdf.
-
(2014)
Proceedings of OSDI
, pp. 583-598
-
-
Li, M.1
Andersen, D.G.2
Park, J.3
Smola, A.J.4
Ahmed, A.5
Josifovski, V.6
Long, J.7
Shekita, E.J.8
Su, B.-Y.9
-
50
-
-
84943782204
-
-
arXiv preprint, 1412.6564 arxiv.org/abs/1412.6564
-
C. J. Maddison, A. Huang, I. Sutskever, and D. Silver. Move evaluation in Go using deep convolutional neural networks. arXiv preprint, 1412.6564, 2014. arxiv.org/abs/1412.6564.
-
(2014)
Move Evaluation in Go Using Deep Convolutional Neural Networks
-
-
Maddison, C.J.1
Huang, A.2
Sutskever, I.3
Silver, D.4
-
53
-
-
84937959846
-
Recurrent models of visual attention
-
V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu. Recurrent models of visual attention. In Proceedings of NIPS, pages 2204-2212, 2014. papers.nips.cc/paper/5542-recurrent-models-of-visual-attention.pdf.
-
(2014)
Proceedings of NIPS
, pp. 2204-2212
-
-
Mnih, V.1
Heess, N.2
Graves, A.3
Kavukcuoglu, K.4
-
54
-
-
84924051598
-
Human-level control through deep reinforcement learning
-
02
-
V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis. Human-level control through deep reinforcement learning. Nature, 518(7540):529-533, 02 2015. dx.doi.org/10.1038/nature14236.
-
(2015)
Nature
, vol.518
, Issue.7540
, pp. 529-533
-
-
Mnih, V.1
Kavukcuoglu, K.2
Silver, D.3
Rusu, A.A.4
Veness, J.5
Bellemare, M.G.6
Graves, A.7
Riedmiller, M.8
Fidjeland, A.K.9
Ostrovski, G.10
Petersen, S.11
Beattie, C.12
Sadik, A.13
Antonoglou, I.14
King, H.15
Kumaran, D.16
Wierstra, D.17
Legg, S.18
Hassabis, D.19
-
56
-
-
84989299198
-
Incremental, iterative data processing with timely dataflow
-
Sept.
-
D. G. Murray, F. McSherry, M. Isard, R. Isaacs, P. Barham, and M. Abadi. Incremental, iterative data processing with timely dataflow. Commun. ACM, 59(10):75-83, Sept. 2016. dl.acm.org/citation.cfm?id=2983551.
-
(2016)
Commun. ACM
, vol.59
, Issue.10
, pp. 75-83
-
-
Murray, D.G.1
McSherry, F.2
Isard, M.3
Isaacs, R.4
Barham, P.5
Abadi, M.6
-
57
-
-
84980007683
-
-
arXiv preprint, 1507.04296 arxiv.org/abs/1507.04296
-
A. Nair, P. Srinivasan, S. Blackwell, C. Alcicek, R. Fearon, A. De Maria, V. Panneershelvam, M. Suleyman, C. Beattie, S. Petersen, et al. Massively parallel methods for deep reinforcement learning. arXiv preprint, 1507.04296, 2015. arxiv.org/abs/1507.04296.
-
(2015)
Massively Parallel Methods for Deep Reinforcement Learning
-
-
Nair, A.1
Srinivasan, P.2
Blackwell, S.3
Alcicek, C.4
Fearon, R.5
De Maria, A.6
Panneershelvam, V.7
Suleyman, M.8
Beattie, C.9
Petersen, S.10
-
60
-
-
84892982833
-
On the difficulty of training recurrent neural networks
-
R. Pascanu, T. Mikolov, and Y. Bengio. On the difficulty of training recurrent neural networks. In Proceedings of ICML, pages 1310-1318, 2013. jmlr.org/proceedings/papers/v28/pascanu13.pdf.
-
(2013)
Proceedings of ICML
, pp. 1310-1318
-
-
Pascanu, R.1
Mikolov, T.2
Bengio, Y.3
-
61
-
-
85162467517
-
Hogwild: A lock-free approach to parallelizing stochastic gradient descent
-
B. Recht, C. Re, S. Wright, and F. Niu. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Proceedings of NIPS, pages 693-701, 2011. papers.nips.cc/paper/4390-hogwild-a-lock-free-approach-to-parallelizing-stochastic-gradient-descent.pdf.
-
(2011)
Proceedings of NIPS
, pp. 693-701
-
-
Recht, B.1
Re, C.2
Wright, S.3
Niu, F.4
-
62
-
-
84889679621
-
Dandelion: A compiler and runtime for heterogeneous systems
-
C. J. Rossbach, Y. Yu, J. Currey, J.-P. Martin, and D. Fetterly. Dandelion: a compiler and runtime for heterogeneous systems. In Proceedings of SOSP, pages 49-68, 2013. sigops.org/sosp/sosp13/papers/p49-rossbach.pdf.
-
(2013)
Proceedings of SOSP
, pp. 49-68
-
-
Rossbach, C.J.1
Yu, Y.2
Currey, J.3
Martin, J.-P.4
Fetterly, D.5
-
63
-
-
84921817164
-
Learning representations by back-propagating errors
-
MIT Press
-
D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning representations by back-propagating errors. In Cognitive modeling, Volume 5, pages 213-220. MIT Press, 1988. www.cs.toronto.edu/~hinton/absps/naturebp.pdf.
-
(1988)
Cognitive Modeling
, vol.5
, pp. 213-220
-
-
Rumelhart, D.E.1
Hinton, G.E.2
Williams, R.J.3
-
64
-
-
84947041871
-
ImageNet large scale visual recognition challenge
-
arxiv.org/abs/1409.0575
-
O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, 115(3):211-252, 2015. arxiv.org/abs/1409.0575.
-
(2015)
International Journal of Computer Vision
, vol.115
, Issue.3
, pp. 211-252
-
-
Russakovsky, O.1
Deng, J.2
Su, H.3
Krause, J.4
Satheesh, S.5
Ma, S.6
Huang, Z.7
Karpathy, A.8
Khosla, A.9
Bernstein, M.10
Berg, A.C.11
Fei-Fei, L.12
-
65
-
-
80052119994
-
An architecture for parallel topic models
-
Sept. vldb.org/pvldb/vldb2010/papers/R63.pdf
-
A. Smola and S. Narayanamurthy. An architecture for parallel topic models. Proc. VLDB Endow., 3(1-2):703-710, Sept. 2010. vldb.org/pvldb/vldb2010/papers/R63.pdf.
-
(2010)
Proc. VLDB Endow.
, vol.3
, Issue.1-2
, pp. 703-710
-
-
Smola, A.1
Narayanamurthy, S.2
-
66
-
-
84892623436
-
On the importance of initialization and momentum in deep learning
-
I. Sutskever, J. Martens, G. E. Dahl, and G. E. Hinton. On the importance of initialization and momentum in deep learning. In Proceedings of ICML, pages 1139-1147, 2013. jmlr.org/proceedings/papers/v28/sutskever13.pdf.
-
(2013)
Proceedings of ICML
, pp. 1139-1147
-
-
Sutskever, I.1
Martens, J.2
Dahl, G.E.3
Hinton, G.E.4
-
67
-
-
84928547704
-
Sequence to sequence learning with neural networks
-
I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In Proceedings of NIPS, pages 3104-3112, 2014. papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural.pdf.
-
(2014)
Proceedings of NIPS
, pp. 3104-3112
-
-
Sutskever, I.1
Vinyals, O.2
Le, Q.V.3
-
68
-
-
84937522268
-
Going deeper with convolutions
-
arxiv.org/abs/1409.4842
-
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Proceedings of CVPR, pages 1-9, 2015. arxiv.org/abs/1409.4842.
-
(2015)
Proceedings of CVPR
, pp. 1-9
-
-
Szegedy, C.1
Liu, W.2
Jia, Y.3
Sermanet, P.4
Reed, S.5
Anguelov, D.6
Erhan, D.7
Vanhoucke, V.8
Rabinovich, A.9
-
69
-
-
84990032289
-
-
arXiv preprint, 1512.00567 arxiv.org/abs/1512.00567
-
C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the Inception architecture for computer vision. arXiv preprint, 1512.00567, 2015. arxiv.org/abs/1512.00567.
-
(2015)
Rethinking the Inception Architecture for Computer Vision
-
-
Szegedy, C.1
Vanhoucke, V.2
Ioffe, S.3
Shlens, J.4
Wojna, Z.5
-
70
-
-
56049109090
-
Map-reduce for machine learning on multi-core
-
C.-T. Chu, S. K. Kim, Y.-A. Lin, Y. Yu, G. Bradski, K. Olukotun, and A. Y. Ng. Map-reduce for machine learning on multi-core. In Proceedings of NIPS, pages 281-288, 2007. papers.nips.cc/paper/3150-map-reduce-for-machine-learning-on-multicore.pdf.
-
(2007)
Proceedings of NIPS
, pp. 281-288
-
-
Chu, C.-T.1
Kim, S.K.2
Lin, Y.-A.3
Yu, Y.4
Bradski, G.5
Olukotun, K.6
Ng, A.Y.7
-
71
-
-
84929574917
-
Large-scale cluster management at Google with Borg
-
A. Verma, L. Pedrosa, M. Korupolu, D. Oppenheimer, E. Tune, and J. Wilkes. Large-scale cluster management at Google with Borg. In Proceedings of EuroSys, 2015. research.google.com/pubs/archive/43438.pdf.
-
(2015)
Proceedings of EuroSys
-
-
Verma, A.1
Pedrosa, L.2
Korupolu, M.3
Oppenheimer, D.4
Tune, E.5
Wilkes, J.6
-
72
-
-
84943761168
-
Grammar as a foreign language
-
arxiv.org/abs/1412.7449
-
O. Vinyals, L. Kaiser, T. Koo, S. Petrov, I. Sutskever, and G. Hinton. Grammar as a foreign language. arXiv preprint, 2014. arxiv.org/abs/1412.7449.
-
(2014)
ArXiv Preprint
-
-
Vinyals, O.1
Kaiser, L.2
Koo, T.3
Petrov, S.4
Sutskever, I.5
Hinton, G.6
-
73
-
-
85018271332
-
-
arXiv preprint, 1609.08144 arxiv.org/abs/1609.08144
-
Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, J. Klingner, A. Shah, M. Johnson, X. Liu, L. Kaiser, S. Gouws, Y. Kato, T. Kudo, H. Kazawa, K. Stevens, G. Kurian, N. Patil, W. Wang, C. Young, J. Smith, J. Riesa, A. Rudnick, O. Vinyals, G. Corrado, M. Hughes, and J. Dean. Google's Neural Machine Translation system: Bridging the gap between human and machine translation. arXiv preprint, 1609.08144, 2016. arxiv.org/abs/1609.08144.
-
(2016)
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
-
-
Wu, Y.1
Schuster, M.2
Chen, Z.3
Le, Q.V.4
Norouzi, M.5
Macherey, W.6
Krikun, M.7
Cao, Y.8
Gao, Q.9
Macherey, K.10
Klingner, J.11
Shah, A.12
Johnson, M.13
Liu, X.14
Kaiser, L.15
Gouws, S.16
Kato, Y.17
Kudo, T.18
Kazawa, H.19
Stevens, K.20
Kurian, G.21
Patil, N.22
Wang, W.23
Young, C.24
Smith, J.25
Riesa, J.26
Rudnick, A.27
Vinyals, O.28
Corrado, G.29
Hughes, M.30
Dean, J.31
-
74
-
-
85076882757
-
DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language
-
Y. Yu, M. Isard, D. Fetterly, M. Budiu, U. Erlingsson, P. K. Gunda, and J. Currey. DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language. In Proceedings of OSDI, pages 1-14, 2008. www.usenix.org/legacy/event/osdi08/tech/full_papers/yu_y/yu_y.pdf.
-
(2008)
Proceedings of OSDI
, pp. 1-14
-
-
Yu, Y.1
Isard, M.2
Fetterly, D.3
Budiu, M.4
Erlingsson, U.5
Gunda, P.K.6
Currey, J.7
-
75
-
-
85040175609
-
Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing
-
M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of NSDI, pages 15-28, 2012. https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final138.pdf.
-
(2012)
Proceedings of NSDI
, pp. 15-28
-
-
Zaharia, M.1
Chowdhury, M.2
Das, T.3
Dave, A.4
Ma, J.5
McCauley, M.6
Franklin, M.J.7
Shenker, S.8
Stoica, I.9
-
76
-
-
84890471125
-
On rectified linear units for speech processing
-
M. D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, and G. E. Hinton. On rectified linear units for speech processing. In Proceedings of ICASSP, pages 3517-3521, 2013. research.google.com/pubs/archive/40811.pdf.
-
(2013)
Proceedings of ICASSP
, pp. 3517-3521
-
-
Zeiler, M.D.1
Ranzato, M.2
Monga, R.3
Mao, M.4
Yang, K.5
Le, Q.6
Nguyen, P.7
Senior, A.8
Vanhoucke, V.9
Dean, J.10
Hinton, G.E.11