-
1
-
-
84055211743
-
Acoustic modeling using deep belief networks
-
A. Mohamed, G. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks, " IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 14-22, 2012.
-
(2012)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.20
, Issue.1
, pp. 14-22
-
-
Mohamed, A.1
Dahl, G.2
Hinton, G.3
-
2
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, " IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
-
(2012)
IEEE Signal Processing Magazine
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.10
Kingsbury, B.11
-
5
-
-
84858976070
-
Feature engineering in context-dependent deep neural networks for conversational speech transcription
-
F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription, " in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2011, pp. 24-29.
-
(2011)
Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
, pp. 24-29
-
-
Seide, F.1
Li, G.2
Chen, X.3
Yu, D.4
-
6
-
-
84867585919
-
Understanding how deep belief networks perform acoustic modelling
-
A. Mohamed, G. Hinton, and G. Penn, "Understanding how deep belief networks perform acoustic modelling, " in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 4273-4276.
-
(2012)
Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
, pp. 4273-4276
-
-
Mohamed, A.1
Hinton, G.2
Penn, G.3
-
9
-
-
0030245128
-
Robust continuous speech recognition using parallel model combination
-
M. J. F. Gales and S. Young, "Robust continuous speech recognition using parallel model combination, " IEEE Transactions on Speech and Audio Processing, vol. 4, no. 5, pp. 352-359, 1996.
-
(1996)
IEEE Transactions on Speech and Audio Processing
, vol.4
, Issue.5
, pp. 352-359
-
-
Gales, M.J.F.1
Young, S.2
-
10
-
-
0032048385
-
Speech recognition in noisy environments using first-order vector Taylor series
-
D. Y. Kim, C. K. Un, and N. S. Kim, "Speech recognition in noisy environments using first-order vector Taylor series, " Speech Communication, pp. 39-49, 1998.
-
(1998)
Speech Communication
, pp. 39-49
-
-
Kim, D.Y.1
Un, C.K.2
Kim, N.S.3
-
11
-
-
85009113852
-
HMM adaptation using vector Taylor series for noisy speech recognition
-
A. Acero, L. Deng, T. Kristjansson, and J. Zhang, "HMM adaptation using vector Taylor series for noisy speech recognition, " in Proc. International Conference on Spoken Language Processing (ICSLP), 2000, pp. 869-872.
-
(2000)
Proc. International Conference on Spoken Language Processing (ICSLP)
, pp. 869-872
-
-
Acero, A.1
Deng, L.2
Kristjansson, T.3
Zhang, J.4
-
12
-
-
0035396555
-
Noise power spectral density estimation based on optimal smoothing and minimum statistics
-
R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics, " IEEE Transactions on Speech and Audio Processing, vol. 9, no. 5, pp. 504-512, 2001.
-
(2001)
IEEE Transactions on Speech and Audio Processing
, vol.9
, Issue.5
, pp. 504-512
-
-
Martin, R.1
-
13
-
-
77949352396
-
Hierarchical variational loopy belief propagation for multi-talker speech recognition
-
S. Rennie, J. Hershey, and P. Olsen, "Hierarchical variational loopy belief propagation for multi-talker speech recognition, " in Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2009, pp. 176-181.
-
(2009)
Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
, pp. 176-181
-
-
Rennie, S.1
Hershey, J.2
Olsen, P.3
-
14
-
-
79959854950
-
Multichannel source separation based on source location cue with log-spectral shaping by hidden Markov source model
-
T. Nakatani, S. Araki, T. Yoshioka, and M. Fujimoto, "Multichannel source separation based on source location cue with log-spectral shaping by hidden Markov source model, " in Proc. Interspeech, 2010, pp. 2766-2769.
-
(2010)
Proc. Interspeech
, pp. 2766-2769
-
-
Nakatani, T.1
Araki, S.2
Yoshioka, T.3
Fujimoto, M.4
-
15
-
-
84865754161
-
Reduction of highly nonstationary ambient noise by integrating spectral and locational characteristics of speech and noise for robust ASR
-
T. Nakatani, S. Araki, M. Delcroix, T. Yoshioka, and M. Fujimoto, "Reduction of highly nonstationary ambient noise by integrating spectral and locational characteristics of speech and noise for robust ASR, " in Proc. Interspeech, 2011, pp. 1785-1788.
-
(2011)
Proc. Interspeech
, pp. 1785-1788
-
-
Nakatani, T.1
Araki, S.2
Delcroix, M.3
Yoshioka, T.4
Fujimoto, M.5
-
16
-
-
77955673019
-
Model-based feature enhancement for reverberant speech recognition
-
A. Krueger and R. Haeb-Umbach, "Model-based feature enhancement for reverberant speech recognition, " IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 7, pp. 1692-1707, 2010.
-
(2010)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.18
, Issue.7
, pp. 1692-1707
-
-
Krueger, A.1
Haeb-Umbach, R.2
-
17
-
-
85009070292
-
Large-vocabulary speech recognition under adverse acoustic environments
-
L. Deng, A. Acero, M. Plumpe, and X. Huang, "Large-vocabulary speech recognition under adverse acoustic environments, " in Proc. International Conference on Spoken Language Processing (ICSLP), 2000, pp. 806-809.
-
(2000)
Proc. International Conference on Spoken Language Processing (ICSLP)
, pp. 806-809
-
-
Deng, L.1
Acero, A.2
Plumpe, M.3
Huang, X.4
-
18
-
-
51449102822
-
Combined static and dynamic variance adaptation for efficient interconnection of speech enhancement pre-processor with speech recognizer
-
M. Delcroix, T. Nakatani, and S. Watanabe, "Combined static and dynamic variance adaptation for efficient interconnection of speech enhancement pre-processor with speech recognizer, " in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2008, pp. 4073-4076.
-
(2008)
Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
, pp. 4073-4076
-
-
Delcroix, M.1
Nakatani, T.2
Watanabe, S.3
-
19
-
-
84945900998
-
Best practices for convolutional neural networks applied to visual document analysis
-
P. Simard, D. Steinkraus, and J. C. Platt, "Best practices for convolutional neural networks applied to visual document analysis, " in Proc. International Conference on Document Analysis and Recognition, 2003, pp. 958-963.
-
(2003)
Proc. International Conference on Document Analysis and Recognition
, pp. 958-963
-
-
Simard, P.1
Steinkraus, D.2
Platt, J.C.3
-
20
-
-
45749110924
-
Representational power of restricted Boltzmann machines and deep belief networks
-
N. L. Roux and Y. Bengio, "Representational power of restricted Boltzmann machines and deep belief networks, " Neural Computation, vol. 20, no. 6, pp. 1631-1649, 2008.
-
(2008)
Neural Computation
, vol.20
, Issue.6
, pp. 1631-1649
-
-
Roux, N.L.1
Bengio, Y.2
-
21
-
-
84867591985
-
LogMax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise
-
T. Nakatani, T. Yoshioka, S. Araki, M. Delcroix, and M. Fujimoto, "LogMax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise, " in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 4029-4032.
-
(2012)
Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
, pp. 4029-4032
-
-
Nakatani, T.1
Yoshioka, T.2
Araki, S.3
Delcroix, M.4
Fujimoto, M.5
-
22
-
-
84887382524
-
Dominance based integration of spatial and spectral features for speech enhancement
-
T. Nakatani, T. Yoshioka, S. Araki, M. Delcroix, and M. Fujimoto, "Dominance based integration of spatial and spectral features for speech enhancement, " Submitted to IEEE Transactions on Audio, Speech, and Language Processing, 2013.
-
(2013)
IEEE Transactions on Audio, Speech, and Language Processing
-
-
Nakatani, T.1
Yoshioka, T.2
Araki, S.3
Delcroix, M.4
Fujimoto, M.5
-
23
-
-
84887395149
-
Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds
-
M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, A. Ogawa, T. Hori, S. Watanabe, M. Fujimoto, T. Yoshioka, T. Oba, Y. Kubo, M. Souden, S.-J. Hahm, and A. Nakamura, "Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds, " Computer Speech & Language, vol. 27, no. 3, pp. 851-873, 2013.
-
(2013)
Computer Speech & Language
, vol.27
, Issue.3
, pp. 851-873
-
-
Delcroix, M.1
Kinoshita, K.2
Nakatani, T.3
Araki, S.4
Ogawa, A.5
Hori, T.6
Watanabe, S.7
Fujimoto, M.8
Yoshioka, T.9
Oba, T.10
Kubo, Y.11
Souden, M.12
Hahm, S.-J.13
Nakamura, A.14
-
24
-
-
78650016939
-
Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment
-
H. Sawada, S. Araki, and S. Makino, "Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment, " IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 3, pp. 516-527, 2010.
-
(2010)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.19
, Issue.3
, pp. 516-527
-
-
Sawada, H.1
Araki, S.2
Makino, S.3
-
25
-
-
84878543263
-
The PASCAL CHiME speech separation and recognition challenge
-
J. Barker, E. Vincent, N. Ma, H. Christensen, and P. Green, "The PASCAL CHiME speech separation and recognition challenge, " Computer Speech & Language, vol. 27, no. 3, pp. 621-633, 2013.
-
(2013)
Computer Speech & Language
, vol.27
, Issue.3
, pp. 621-633
-
-
Barker, J.1
Vincent, E.2
Ma, N.3
Christensen, H.4
Green, P.5
-
26
-
-
85018751865
-
-
cited April 24 2012
-
J. Barker, E. Vincent, N. Ma, H. Christensen, and P. Green, "The PASCAL CHiME speech separation and recognition challenge, " http://www.dcs.shef.ac.uk/spandh/chime/challenge.html cited April 24 2012.
-
The PASCAL CHiME Speech Separation and Recognition Challenge
-
-
Barker, J.1
Vincent, E.2
Ma, N.3
Christensen, H.4
Green, P.5
-
27
-
-
45849093239
-
Efficient WFST-based one-pass decoding with on-the-fly hypothesis rescoring in extremely large vocabulary continuous speech recognition
-
T. Hori, C. Hori, Y. Minami, and A. Nakamura, "Efficient WFST-based one-pass decoding with on-the-fly hypothesis rescoring in extremely large vocabulary continuous speech recognition, " IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 4, pp. 1352-1365, 2006.
-
(2006)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.15
, Issue.4
, pp. 1352-1365
-
-
Hori, T.1
Hori, C.2
Minami, Y.3
Nakamura, A.4
-
28
-
-
70450194926
-
Margin-space integration of MPE loss via differencing of MMI functionals for generalized error-weighted discriminative training
-
E. McDermott, S. Watanabe, and A. Nakamura, "Margin-space integration of MPE loss via differencing of MMI functionals for generalized error-weighted discriminative training, " in Proc. Interspeech, 2009, pp. 224-227.
-
(2009)
Proc. Interspeech
, pp. 224-227
-
-
McDermott, E.1
Watanabe, S.2
Nakamura, A.3
-
29
-
-
0036296863
-
Minimum phone error and I-smoothing for improved discriminative training
-
IEEE
-
D. Povey and P. Woodland, "Minimum phone error and I-smoothing for improved discriminative training, " in Proc. International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 1. IEEE, 2002, pp. 105-108.
-
(2002)
Proc. International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
, vol.1
, pp. 105-108
-
-
Povey, D.1
Woodland, P.2
-
30
-
-
84874226579
-
Adaptation of context-dependent deep neural networks for automatic speech recognition
-
K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong, "Adaptation of context-dependent deep neural networks for automatic speech recognition, " in Proc. IEEE Spoken Language Technology Workshop (SLT), 2012, pp. 366-369.
-
(2012)
Proc. IEEE Spoken Language Technology Workshop (SLT)
, pp. 366-369
-
-
Yao, K.1
Yu, D.2
Seide, F.3
Su, H.4
Deng, L.5
Gong, Y.6
-
31
-
-
84866720201
-
Robust Boltzmann machines for recognition and denoising
-
Y. Tang, R. Salakhutdinov, and G. E. Hinton, "Robust Boltzmann machines for recognition and denoising, " in Proc. IEEE International Conference on Computer Vision and Pattern Recognition, 2012, pp. 2264-2271.
-
(2012)
Proc. IEEE International Conference on Computer Vision and Pattern Recognition
, pp. 2264-2271
-
-
Tang, Y.1
Salakhutdinov, R.2
Hinton, G.E.3
-
32
-
-
56449089103
-
Extracting and composing robust features with denoising autoencoders
-
P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders, " in Proc. International Conference on Machine Learning, 2008, pp. 1096-1103.
-
(2008)
Proc. International Conference on Machine Learning
, pp. 1096-1103
-
-
Vincent, P.1
Larochelle, H.2
Bengio, Y.3
Manzagol, P.-A.4
-
33
-
-
84878409063
-
Recurrent neural networks for noise reduction in robust ASR
-
A. Maas, Q. Le, T. O'Neil, O. Vinyals, P. Nguyen, and A. Ng, "Recurrent neural networks for noise reduction in robust ASR, " in Proc. Interspeech, 2012.
-
(2012)
Proc. Interspeech
-
-
Maas, A.1
Le, Q.2
O'Neil, T.3
Vinyals, O.4
Nguyen, P.5
Ng, A.6