SCOPUS Information Search Platform

5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings

Volume , Issue , 2017, Pages

Capacity and trainability in recurrent neural networks

(3) Collins, Jasmineᵃ Sohl-Dickstein, Jaschaᵃ Sussillo, Davidᵃ

ᵃ GOOGLE INC (United States)

Author keywords

[No Author keywords available]

Indexed keywords

ARCHITECTURE; NETWORK ARCHITECTURE;

CAPACITY BOUND; HIDDEN UNITS; PER UNIT; REAL NUMBER; RECURRENT NEURAL NETWORK (RNNS); TASK INFORMATION; TRAINING EFFECTIVENESS;

LONG SHORT-TERM MEMORY;

EID: 85088225685 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (80)

References (42)

1
- 85010720695
- Deep speech 2: End-to-end speech recognition in English and Mandarin
- Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan C. Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Tony Han, Awni Y. Hannun, Billy Jun, Patrick LeGresley, Libby Lin, Sharan Narang, Andrew Y. Ng, Sherjil Ozair, Ryan Prenger, Jonathan Raiman, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Yi Wang, Zhiqian Wang, Chong Wang, Bo Xiao, Dani Yogatama, Jun Zhan, and Zhenyao Zhu. Deep speech 2: End-to-end speech recognition in English and Mandarin. CoRR, abs/1512.02595, 2015. URL http://arxiv.org/abs/1512.02595.
- (2015) CoRR
- Amodei, D.¹ Anubhai, R.² Battenberg, E.³ Case, C.⁴ Casper, J.⁵ Catanzaro, B.C.⁶ Chen, J.⁷ Chrzanowski, M.⁸ Coates, A.⁹ Diamos, G.¹⁰ Elsen, E.¹¹ Engel, J.¹² Fan, L.¹³ Fougner, C.¹⁴ Han, T.¹⁵ Hannun, A.Y.¹⁶ Jun, B.¹⁷ LeGresley, P.¹⁸ Lin, L.¹⁹ Narang, S.²⁰ et al.

2
- 84922389693
- Neural machine translation by jointly learning to align and translate
- Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
- (2014) arXiv preprint
- Bahdanau, D.¹ Cho, K.² Bengio, Y.³

3
- 0001163081
- Number of stable points for spin-glasses and neural networks of higher orders
- Pierre Baldi and Santosh S Venkatesh. Number of stable points for spin-glasses and neural networks of higher orders. Physical Review Letters, 58(9):913, 1987.
- (1987) Physical Review Letters , vol.58 , Issue.9 , pp. 913
- Baldi, P.¹ Venkatesh, S.S.²

4
- 84955254040
- Nanoconnectomic upper bound on the variability of synaptic plasticity
- Thomas M Bartol, Cailey Bromer, Justin Kinney, Michael A Chirillo, Jennifer N Bourne, Kristen M Harris, and Terrence J Sejnowski. Nanoconnectomic upper bound on the variability of synaptic plasticity. eLife, 4: e10778, 2016.
- (2016) eLife , vol.4
- Bartol, T.M.¹ Bromer, C.² Kinney, J.³ Chirillo, M.A.⁴ Bourne, J.N.⁵ Harris, K.M.⁶ Sejnowski, T.J.⁷

5
- 84899857287
- Short-term memory capacity in networks via the restricted isometry property
- Adam S Charles, Han Lun Yap, and Christopher J Rozell. Short-term memory capacity in networks via the restricted isometry property. Neural computation, 26(6):1198-1235, 2014.
- (2014) Neural Computation , vol.26 , Issue.6 , pp. 1198-1235
- Charles, A.S.¹ Yap, H.L.² Rozell, C.J.³

6
- 84961291190
- Learning phrase representations using RNN encoder-decoder for statistical machine translation
- Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.
- (2014) arXiv preprint
- Cho, K.¹ Van Merriënboer, B.² Gulcehre, C.³ Bahdanau, D.⁴ Bougares, F.⁵ Schwenk, H.⁶ Bengio, Y.⁷

7
- 84939821078
- Empirical evaluation of gated recurrent neural networks on sequence modeling
- Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.
- (2014) arXiv preprint
- Chung, J.¹ Gulcehre, C.² Cho, K.³ Bengio, Y.⁴

8
- 84918441630
- Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition
- Thomas M Cover. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE transactions on electronic computers, (3):326-334, 1965.
- (1965) IEEE Transactions on Electronic Computers , Issue.3 , pp. 326-334
- Cover, T.M.¹

9
- 84942597150
- Parallelizing exploration-exploitation tradeoffs in Gaussian process bandit optimization
- Thomas Desautels, Andreas Krause, and Joel W Burdick. Parallelizing exploration-exploitation tradeoffs in Gaussian process bandit optimization. The Journal of Machine Learning Research, 15(1):3873-3923, 2014.
- (2014) The Journal of Machine Learning Research , vol.15 , Issue.1 , pp. 3873-3923
- Desautels, T.¹ Krause, A.² Burdick, J.W.³

10
- 0141463508
- Universality of fully connected recurrent neural networks
- Kenji Doya. Universality of fully connected recurrent neural networks. Dept. of Biology, UCSD, Tech. Rep, 1993.
- (1993) Dept. of Biology, UCSD, Tech. Rep
- Doya, K.¹

11
- 85071025850
- Intelligible language modeling with input switched affine networks
- Jakob Foerster, Justin Gilmer, Jan Chorowski, Jascha Sohl-Dickstein, and David Sussillo. Intelligible language modeling with input switched affine networks. ICLR 2017 submission, 2016.
- (2016) ICLR 2017 Submission
- Foerster, J.¹ Gilmer, J.² Chorowski, J.³ Sohl-Dickstein, J.⁴ Sussillo, D.⁵

12
- 57749113625
- Memory traces in dynamical systems
- Surya Ganguli, Dongsung Huh, and Haim Sompolinsky. Memory traces in dynamical systems. Proceedings of the National Academy of Sciences, 105(48):18970-18975, 2008.
- (2008) Proceedings of the National Academy of Sciences , vol.105 , Issue.48 , pp. 18970-18975
- Ganguli, S.¹ Huh, D.² Sompolinsky, H.³

13
- 36149029786
- The space of interactions in neural network models
- Elizabeth Gardner. The space of interactions in neural network models. Journal of physics A: Mathematical and general, 21(1):257, 1988.
- (1988) Journal of Physics A: Mathematical and General , vol.21 , Issue.1 , pp. 257
- Gardner, E.¹

14
- 0033344091
- Learning to forget: Continual prediction with LSTM
- Felix A. Gers, Jürgen Schmidhuber, and Fred Cummins. Learning to forget: Continual prediction with LSTM. Artificial Neural Networks, ICANN 99. Ninth International Conference on (Conf. Publ. No. 470), 2:850-855, 1999.
- (1999) Artificial Neural Networks, ICANN 99. Ninth International Conference on (Conf. Publ. No. 470) , vol.2 , pp. 850-855
- Gers, F.A.¹ Schmidhuber, J.² Cummins, F.³

15
- 64849110608
- A novel connectionist system for unconstrained handwriting recognition
- Alex Graves, Marcus Liwicki, Santiago Fernández, Roman Bertolami, Horst Bunke, and Jürgen Schmidhuber. A novel connectionist system for unconstrained handwriting recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 31(5):855-868, 2009.
- (2009) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.31 , Issue.5 , pp. 855-868
- Graves, A.¹ Liwicki, M.² Fernández, S.³ Bertolami, R.⁴ Bunke, H.⁵ Schmidhuber, J.⁶

16
- 84943739264
- LSTM: A search space odyssey
- Klaus Greff, Rupesh Kumar Srivastava, Jan Koutník, Bas R Steunebrink, and Jürgen Schmidhuber. LSTM: A search space odyssey. arXiv preprint arXiv:1503.04069, 2015.
- (2015) arXiv preprint
- Greff, K.¹ Srivastava, R.K.² Koutník, J.³ Steunebrink, B.R.⁴ Schmidhuber, J.⁵

17
- 84965100881
- DRAW: A recurrent neural network for image generation
- Karol Gregor, Ivo Danihelka, Alex Graves, and Daan Wierstra. DRAW: A recurrent neural network for image generation. arXiv preprint arXiv:1502.04623, 2015.
- (2015) arXiv preprint
- Gregor, K.¹ Danihelka, I.² Graves, A.³ Wierstra, D.⁴

18
- 84965175092
- Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding
- Song Han, Huizi Mao, and William J Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149, 2015.
- (2015) arXiv preprint
- Han, S.¹ Mao, H.² Dally, W.J.³

19
- 84958589374
- Deep residual learning for image recognition
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.
- (2015) arXiv preprint
- He, K.¹ Zhang, X.² Ren, S.³ Sun, J.⁴

20
- 0031573117
- Long short-term memory
- Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735-1780, 1997.
- (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
- Hochreiter, S.¹ Schmidhuber, J.²

21
- 85026322165
- Binarized neural networks
- Itay Hubara, Daniel Soudry, and Ran El-Yaniv. Binarized neural networks. arXiv preprint arXiv:1602.02505, 2016.
- (2016) arXiv preprint
- Hubara, I.¹ Soudry, D.² El-Yaniv, R.³

22
- 1842421269
- Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication
- Herbert Jaeger and Harald Haas. Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science, 304(5667):78-80, 2004.
- (2004) Science , vol.304 , Issue.5667 , pp. 78-80
- Jaeger, H.¹ Haas, H.²

23
- 84969972527
- An empirical exploration of recurrent network architectures
- Rafal Jozefowicz, Wojciech Zaremba, and Ilya Sutskever. An empirical exploration of recurrent network architectures. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pp. 2342-2350, 2015.
- (2015) Proceedings of the 32nd International Conference on Machine Learning (ICML-15) , pp. 2342-2350
- Jozefowicz, R.¹ Zaremba, W.² Sutskever, I.³

24
- 84994193137
- Exploring the limits of language modeling
- Rafal Józefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, and Yonghui Wu. Exploring the limits of language modeling. CoRR, abs/1602.02410, 2016. URL http://arxiv.org/abs/1602.02410.
- (2016) CoRR
- Józefowicz, R.¹ Vinyals, O.² Schuster, M.³ Shazeer, N.⁴ Wu, Y.⁵

25
- 84959876313
- Visualizing and understanding recurrent networks
- Andrej Karpathy, Justin Johnson, and Fei-Fei Li. Visualizing and understanding recurrent networks. arXiv preprint arXiv:1506.02078, 2015.
- (2015) arXiv preprint
- Karpathy, A.¹ Johnson, J.² Li, F.-F.³

26
- 85083951076
- Adam: A method for stochastic optimization
- Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014. URL http://arxiv.org/abs/1412.6980.
- (2014) CoRR
- Kingma, D.P.¹ Ba, J.²

27
- 0002702215
- Vapnik-Chervonenkis dimension of recurrent neural networks
- Pascal Koiran and Eduardo D Sontag. Vapnik-Chervonenkis dimension of recurrent neural networks. Discrete Applied Mathematics, 86(1):63-79, 1998.
- (1998) Discrete Applied Mathematics , vol.86 , Issue.1 , pp. 63-79
- Koiran, P.¹ Sontag, E.D.²

28
- 84939821079
- A simple way to initialize recurrent networks of rectified linear units
- Quoc V Le, Navdeep Jaitly, and Geoffrey E Hinton. A simple way to initialize recurrent networks of rectified linear units. arXiv preprint arXiv:1504.00941, 2015.
- (2015) arXiv preprint
- Le, Q.V.¹ Jaitly, N.² Hinton, G.E.³

29
- 0036834701
- Real-time computing without stable states: A new framework for neural computation based on perturbations
- Wolfgang Maass, Thomas Natschläger, and Henry Markram. Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural computation, 14(11):2531-2560, 2002.
- (2002) Neural Computation , vol.14 , Issue.11 , pp. 2531-2560
- Maass, W.¹ Natschläger, T.² Markram, H.³

30
- 79953225681
- Large text compression benchmark: About the test data
- Matt Mahoney. Large text compression benchmark: About the test data, 2011. URL http://mattmahoney.net/dc/textdata. [Online; accessed 15-November-2016].
- (2011) Online; accessed 15-November-2016
- Mahoney, M.¹

31
- 84887390404
- Context-dependent computation by recurrent dynamics in prefrontal cortex
- Valerio Mante, David Sussillo, Krishna V Shenoy, and William T Newsome. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature, 503(7474):78-84, 2013.
- (2013) Nature , vol.503 , Issue.7474 , pp. 78-84
- Mante, V.¹ Sussillo, D.² Shenoy, K.V.³ Newsome, W.T.⁴

32
- 80053451847
- Learning recurrent neural networks with Hessian-free optimization
- James Martens and Ilya Sutskever. Learning recurrent neural networks with Hessian-free optimization. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 1033-1040, 2011.
- (2011) Proceedings of the 28th International Conference on Machine Learning (ICML-11) , pp. 1033-1040
- Martens, J.¹ Sutskever, I.²

33
- 84965163874
- Deep knowledge tracing
- Chris Piech, Jonathan Bassen, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas J Guibas, and Jascha Sohl-Dickstein. Deep knowledge tracing. In Advances in Neural Information Processing Systems, pp. 505-513, 2015.
- (2015) Advances in Neural Information Processing Systems , pp. 505-513
- Piech, C.¹ Bassen, J.² Huang, J.³ Ganguli, S.⁴ Sahami, M.⁵ Guibas, L.J.⁶ Sohl-Dickstein, J.⁷

34
- 85088226307
- Outrageously large neural networks: The sparsely-gated mixture-of-experts layer
- Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc V. Le, Geoffrey E. Hinton, and Jeff Dean. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. CoRR, abs/1701.06538, 2017. URL http://arxiv.org/abs/1701.06538.
- (2017) CoRR
- Shazeer, N.¹ Mirhoseini, A.² Maziarz, K.³ Davis, A.⁴ Le, Q.V.⁵ Hinton, G.E.⁶ Dean, J.⁷

35
- 84869201485
- Practical Bayesian optimization of machine learning algorithms
- Jasper Snoek, Hugo Larochelle, and Ryan P Adams. Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems, pp. 2951-2959, 2012.
- (2012) Advances in Neural Information Processing Systems , pp. 2951-2959
- Snoek, J.¹ Larochelle, H.² Adams, R.P.³

36
- 84965156812
- Highway networks
- Rupesh Kumar Srivastava, Klaus Greff, and Jürgen Schmidhuber. Highway networks. arXiv preprint arXiv:1505.00387, 2015.
- (2015) arXiv preprint
- Srivastava, R.K.¹ Greff, K.² Schmidhuber, J.³

37
- 84877827546
- Opening the black box: Low-dimensional dynamics in high-dimensional recurrent neural networks
- David Sussillo and Omri Barak. Opening the black box: low-dimensional dynamics in high-dimensional recurrent neural networks. Neural computation, 25(3):626-649, 2013.
- (2013) Neural Computation , vol.25 , Issue.3 , pp. 626-649
- Sussillo, D.¹ Barak, O.²

38
- 84928547704
- Sequence to sequence learning with neural networks
- Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pp. 3104-3112, 2014.
- (2014) Advances in Neural Information Processing Systems , pp. 3104-3112
- Sutskever, I.¹ Vinyals, O.² Le, Q.V.³

39
- 84893343292
- Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude
- Tijmen Tieleman and Geoffrey Hinton. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4, 2012.
- (2012) COURSERA: Neural Networks for Machine Learning , vol.4
- Tieleman, T.¹ Hinton, G.²

40
- 2342592517
- Short-term memory in orthogonal neural networks
- Olivia L White, Daniel D Lee, and Haim Sompolinsky. Short-term memory in orthogonal neural networks. Physical review letters, 92(14):148102, 2004.
- (2004) Physical Review Letters , vol.92 , Issue.14 , pp. 148102
- White, O.L.¹ Lee, D.D.² Sompolinsky, H.³

41
- 84973904224
- Deep fried convnets
- Zichao Yang, Marcin Moczulski, Misha Denil, Nando de Freitas, Alex Smola, Le Song, and Ziyu Wang. Deep fried convnets. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1476-1483, 2015.
- (2015) Proceedings of the IEEE International Conference on Computer Vision , pp. 1476-1483
- Yang, Z.¹ Moczulski, M.² Denil, M.³ De Freitas, N.⁴ Smola, A.⁵ Song, L.⁶ Wang, Z.⁷

42
- 84975705947
- Minimal gated unit for recurrent neural networks
- Guo-Bing Zhou, Jianxin Wu, Chen-Lin Zhang, and Zhi-Hua Zhou. Minimal gated unit for recurrent neural networks. International Journal of Automation and Computing, 13(3):226-234, 2016. ISSN 1751-8520. doi: 10.1007/s11633-016-1006-2. URL http://dx.doi.org/10.1007/s11633-016-1006-2.
- (2016) International Journal of Automation and Computing , vol.13 , Issue.3 , pp. 226-234
- Zhou, G.-B.¹ Wu, J.² Zhang, C.-L.³ Zhou, Z.-H.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.