SCOPUS Information Retrieval Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2015-August, 2015, Pages 4624-4628

Speech acoustic modeling from raw multichannel waveforms

(3) Hoshen, Yedid (a); Weiss, Ron J. (b); Wilson, Kevin W. (b)

a HEBREW UNIVERSITY OF JERUSALEM (Israel)

b GOOGLE INC (United States)

Author keywords

acoustic modeling; Automatic speech recognition; beamforming; convolutional neural networks

Indexed keywords

ACOUSTIC NOISE; AUDIO SIGNAL PROCESSING; BEAMFORMING; CONVOLUTION; DEEP NEURAL NETWORKS; FILTER BANKS; NEURAL NETWORKS; SPEECH; SPEECH COMMUNICATION;

ACOUSTIC MODEL; AUTOMATIC SPEECH RECOGNITION; CONVOLUTIONAL NEURAL NETWORK; FEATURE REPRESENTATION; REVERBERANT CONDITION; SIGNAL IN SPACES; SPATIAL LOCATION; SUPERVISED TRAININGS;

SPEECH RECOGNITION;

EID: 84946030537 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178847 Document Type: Conference Paper

Times cited : (224)

References (20)

1
- 84876231242
- Imagenet classification with deep convolutional neural networks
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton, "Imagenet classification with deep convolutional neural networks," in NIPS, 2012, pp. 1097-1105
- (2012) NIPS , pp. 1097-1105
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

2
- 84994264999
- Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
- Dimitri Palaz, Ronan Collobert, and Mathew Magimai Doss, "Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks," Interspeech, 2014
- (2014) Interspeech
- Palaz, D.¹ Collobert, R.² Magimai Doss, M.³

3
- 84910065702
- Acoustic modeling with deep neural networks using raw time signal for LVCSR
- Singapore, Sept
- Zoltan Tüske, Pavel Golik, Ralf Schlüter, and Hermann Ney, "Acoustic modeling with deep neural networks using raw time signal for LVCSR," in Interspeech, Singapore, Sept. 2014
- (2014) Interspeech
- Tüske, Z.¹ Golik, P.² Schlüter, R.³ Ney, H.⁴

4
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
- Geoffrey Hinton, Li Deng, Dong Yu, George E Dahl, Abdelrahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N Sainath, et al., "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82-97, 2012
- (2012) Signal Processing Magazine, IEEE , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.¹ Deng, L.² Yu, D.³ Dahl, G.E.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.N.¹⁰ et al.

5
- 80051609011
- Learning a better representation of speech soundwaves using restricted Boltzmann machines
- Navdeep Jaitly and Geoffrey Hinton, "Learning a better representation of speech soundwaves using restricted Boltzmann machines," in ICASSP. IEEE, 2011, pp. 5884-5887
- (2011) ICASSP. IEEE , pp. 5884-5887
- Jaitly, N.¹ Hinton, G.²

6
- 84893622444
- The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech
- Keisuke Kinoshita, Marc Delcroix, Takuya Yoshioka, Tomohiro Nakatani, Armin Sehr, Walter Kellermann, and Roland Maas, "The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech," in WASPAA. IEEE, 2013, pp. 1-4
- (2013) WASPAA. IEEE , pp. 1-4
- Kinoshita, K.¹ Delcroix, M.² Yoshioka, T.³ Nakatani, T.⁴ Sehr, A.⁵ Kellermann, W.⁶ Maas, R.⁷

7
- 84890541701
- The second CHiME speech separation and recognition challenge: Datasets, tasks and baselines
- Emmanuel Vincent, Jon Barker, Shinji Watanabe, Jonathan Le Roux, Francesco Nesta, and Marco Matassoni, "The second CHiME speech separation and recognition challenge: Datasets, tasks and baselines," in ICASSP. IEEE, 2013, pp. 126-130
- (2013) ICASSP. IEEE , pp. 126-130
- Vincent, E.¹ Barker, J.² Watanabe, S.³ Le Roux, J.⁴ Nesta, F.⁵ Matassoni, M.⁶

8
- 84933559263
- Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the reverb challenge
- Marc Delcroix, Takuya Yoshioka, Atsunori Ogawa, Yotaro Kubo, Masakiyo Fujimoto, Nobutaka Ito, Keisuke Kinoshita, Miquel Espi, Takaaki Hori, Tomohiro Nakatani, and Atsushi Nakamura, "Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the reverb challenge," in REVERB Workshop, 2014
- (2014) REVERB Workshop
- Delcroix, M.¹ Yoshioka, T.² Ogawa, A.³ Kubo, Y.⁴ Fujimoto, M.⁵ Ito, N.⁶ Kinoshita, K.⁷ Espi, M.⁸ Hori, T.⁹ Nakatani, T.¹⁰ Nakamura, A.¹¹

9
- 80052067786
- Reverberant speech segregation based on multipitch tracking and classification
- Zhaozhang Jin and DeLiang Wang, "Reverberant speech segregation based on multipitch tracking and classification," IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 8, pp. 2328-2337, 2011
- (2011) IEEE Transactions on Audio, Speech, and Language Processing , vol.19 , Issue.8 , pp. 2328-2337
- Jin, Z.¹ Wang, D.²

10
- 63449087062
- Microphone Array Signal Processing
- Springer
- Jacob Benesty, Jingdong Chen, and Yiteng Huang, Microphone Array Signal Processing, Springer, 2008
- (2008) Microphone Array Signal Processing
- Benesty, J.¹ Chen, J.² Huang, Y.³

11
- 84893688455
- Learning filter banks within a deep neural network framework
- Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, and Bhuvana Ramabhadran, "Learning filter banks within a deep neural network framework," in ASRU. IEEE, 2013, pp. 297-302
- (2013) ASRU. IEEE , pp. 297-302
- Sainath, T.N.¹ Kingsbury, B.² Mohamed, A.-R.³ Ramabhadran, B.⁴

12
- 0020596154
- Cepstral analysis synthesis on the mel frequency scale
- Satoshi Imai, "Cepstral analysis synthesis on the mel frequency scale," in ICASSP. IEEE, 1983, vol. 8, pp. 93-96
- (1983) ICASSP. IEEE , vol.8 , pp. 93-96
- Imai, S.¹

13
- 84893704659
- Hybrid acoustic models for distant and multichannel large vocabulary speech recognition
- Pawel Swietojanski, Arnab Ghoshal, and Steve Renals, "Hybrid acoustic models for distant and multichannel large vocabulary speech recognition," in ASRU. IEEE, 2013, pp. 285-290
- (2013) ASRU. IEEE , pp. 285-290
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

14
- 84901999583
- Convolutional neural networks for distant speech recognition
- Pawel Swietojanski, Arnab Ghoshal, and Steve Renals, "Convolutional neural networks for distant speech recognition," Signal Processing Letters, 2014
- (2014) Signal Processing Letters
- Swietojanski, P.¹ Ghoshal, A.² Renals, S.³

15
- 77956509090
- Rectified linear units improve restricted Boltzmann machines
- Vinod Nair and Geoffrey E Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010, pp. 807-814
- (2010) Proceedings of the 27th International Conference on Machine Learning (ICML-10) , pp. 807-814
- Nair, V.¹ Hinton, G.E.²

16
- 34547539413
- Gammatone features and feature combination for large vocabulary speech recognition
- Ralf Schlüter, Ilja Bezrukov, Hermann Wagner, and Hermann Ney, "Gammatone features and feature combination for large vocabulary speech recognition," in ICASSP. IEEE, 2007, vol. 4, pp. IV-649
- (2007) ICASSP. IEEE , vol.4 , pp. IV-649
- Schlüter, R.¹ Bezrukov, I.² Wagner, H.³ Ney, H.⁴

17
- 84885728886
- Your word is my command: Google search by voice: A case study
- Springer
- Johan Schalkwyk, Doug Beeferman, Françoise Beaufays, Bill Byrne, Ciprian Chelba, Mike Cohen, Maryam Kamvar, and Brian Strope, "Your Word is my Command: Google search by voice: A case study," in Advances in Speech Recognition, pp. 61-90. Springer, 2010
- (2010) Advances in Speech Recognition , pp. 61-90
- Schalkwyk, J.¹ Beeferman, D.² Beaufays, F.³ Byrne, B.⁴ Chelba, C.⁵ Cohen, M.⁶ Kamvar, M.⁷ Strope, B.⁸

18
- 84946051811
- Stephen G McGovern, "A model for room acoustics," http://www.sgm-audio.com/research/rir/rir.html, 2003
- (2003) A Model for Room Acoustics
- McGovern, S.G.¹

19
- 84877760312
- Large scale distributed deep networks
- Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Andrew Senior, Paul Tucker, Ke Yang, Quoc V Le, et al., "Large scale distributed deep networks," in NIPS, 2012, pp. 1223-1231
- (2012) NIPS , pp. 1223-1231
- Dean, J.¹ Corrado, G.² Monga, R.³ Chen, K.⁴ Devin, M.⁵ Mao, M.⁶ Senior, A.⁷ Tucker, P.⁸ Yang, K.⁹ Le, Q.V.¹⁰ et al.

20
- 80052250414
- Adaptive subgradient methods for online learning and stochastic optimization
- John Duchi, Elad Hazan, and Yoram Singer, "Adaptive subgradient methods for online learning and stochastic optimization," The Journal of Machine Learning Research, vol. 12, pp. 2121-2159, 2011
- (2011) The Journal of Machine Learning Research , vol.12 , pp. 2121-2159
- Duchi, J.¹ Hazan, E.² Singer, Y.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.