-
1
-
-
84867605836
-
Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition
-
O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2012 4277 4280
-
(2012)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 4277-4280
-
-
Abdel-Hamid, O.1
Mohamed, A.2
Jiang, H.3
Penn, G.4
-
3
-
-
0018455310
-
Suppression of acoustic noise in speech using spectral subtraction
-
S.F. Boll Suppression of acoustic noise in speech using spectral subtraction IEEE Trans. Acoust. Speech Signal Process. 27 1979 113 120
-
(1979)
IEEE Trans. Acoust. Speech Signal Process.
, vol.27
, pp. 113-120
-
-
Boll, S.F.1
-
5
-
-
33745530242
-
The AMI meeting corpus: A pre-announcement
-
J. Carletta, S. Ashby, S. Bourban, M. Flynn, M. Guillemot, T. Hain, J. Kadlec, V. Karaiskos, W. Kraaij, M. Kronenthal, G. Lathoud, M. Lincoln, A. Lisowska, I. McCowan, D. Reidsma, and P. Wellner The AMI meeting corpus: a pre-announcement Proceedings of International Workshop on Machine Learning for Multimodal Interaction 2006 28 39
-
(2006)
Proceedings of International Workshop on Machine Learning for Multimodal Interaction
, pp. 28-39
-
-
Carletta, J.1
Ashby, S.2
Bourban, S.3
Flynn, M.4
Guillemot, M.5
Hain, T.6
Kadlec, J.7
Karaiskos, V.8
Kraaij, W.9
Kronenthal, M.10
Lathoud, G.11
Lincoln, M.12
Lisowska, A.13
McCowan, I.14
Reidsma, D.15
Wellner, P.16
-
6
-
-
84890488704
-
Spectro-temporal features for noise-robust speech recognition using power-law nonlinearity and power-bias subtraction
-
S.Y. Chang, B.T. Meyer, and N. Morgan Spectro-temporal features for noise-robust speech recognition using power-law nonlinearity and power-bias subtraction Proc. Int. Conf. Acoust., Speech, Signal Process. 2013 7063 7067
-
(2013)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 7063-7067
-
-
Chang, S.Y.1
Meyer, B.T.2
Morgan, N.3
-
7
-
-
0036543522
-
Optimal speech enhancement under signal presence uncertainty using log-spectral amplitude estimator
-
I. Cohen Optimal speech enhancement under signal presence uncertainty using log-spectral amplitude estimator IEEE Signal Process. Lett. 9 2002 113 116
-
(2002)
IEEE Signal Process. Lett.
, vol.9
, pp. 113-116
-
-
Cohen, I.1
-
8
-
-
0041360463
-
Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging
-
I. Cohen Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging IEEE Trans. Speech Audio Process. 11 2003 466 475
-
(2003)
IEEE Trans. Speech Audio Process.
, vol.11
, pp. 466-475
-
-
Cohen, I.1
-
9
-
-
84890527827
-
Improving deep neural networks for LVCSR using rectified linear units and dropout
-
G.E. Dahl, T.N. Sainath, and G.E. Hinton Improving deep neural networks for LVCSR using rectified linear units and dropout Proc. Int. Conf. Acoust., Speech, Signal Process. 2013 8609 8613
-
(2013)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 8609-8613
-
-
Dahl, G.E.1
Sainath, T.N.2
Hinton, G.E.3
-
10
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
-
G.E. Dahl, D. Yu, L. Deng, and A. Acero Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition IEEE Trans. Audio Speech Lang. Process. 20 2012 30 42
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, pp. 30-42
-
-
Dahl, G.E.1
Yu, D.2
Deng, L.3
Acero, A.4
-
11
-
-
18744371585
-
Histogram equalization of speech representation for robust speech recognition
-
A. de la Torre, A.M. Peinado, J.C. Segura, J.L. Pérez-Córdoba, M.C. Benítez, and A.J. Rubio Histogram equalization of speech representation for robust speech recognition IEEE Trans. Audio Speech Lang. Process. 13 2005 355 366
-
(2005)
IEEE Trans. Audio Speech Lang. Process.
, vol.13
, pp. 355-366
-
-
De La Torre, A.1
Peinado, A.M.2
Segura, J.C.3
Pérez-Córdoba, J.L.4
Benítez, M.C.5
Rubio, A.J.6
-
12
-
-
2142756950
-
Enhancement of log mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise
-
L. Deng, J. Droppo, and A. Acero Enhancement of log mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise IEEE Trans. Speech Audio Process. 12 2004 133 143
-
(2004)
IEEE Trans. Speech Audio Process.
, vol.12
, pp. 133-143
-
-
Deng, L.1
Droppo, J.2
Acero, A.3
-
13
-
-
84890491198
-
Recent advances in deep learning for speech research at Microsoft
-
L. Deng, J. Li, J.T. Huang, K. Yao, D. Yu, F. Seide, M. Seltzer, G. Zweig, X. He, J. Williams, Y. Gong, and A. Acero Recent advances in deep learning for speech research at Microsoft Proc. Int. Conf. Acoust., Speech, Signal Process. 2013 8604 8608
-
(2013)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 8604-8608
-
-
Deng, L.1
Li, J.2
Huang, J.T.3
Yao, K.4
Yu, D.5
Seide, F.6
Seltzer, M.7
Zweig, G.8
He, X.9
Williams, J.10
Gong, Y.11
Acero, A.12
-
14
-
-
20444414457
-
Analysis and comparison of two speech feature extraction/compensation algorithms
-
L. Deng, J. Wu, J. Droppo, and A. Acero Analysis and comparison of two speech feature extraction/compensation algorithms IEEE Signal Process. Lett. 12 2005 477 480
-
(2005)
IEEE Signal Process. Lett.
, vol.12
, pp. 477-480
-
-
Deng, L.1
Wu, J.2
Droppo, J.3
Acero, A.4
-
15
-
-
84901773892
-
Environmental robustness
-
J. Benesty, M.M. Sondhi, Y. Huang, Springer
-
J. Droppo, and A. Acero Environmental robustness J. Benesty, M.M. Sondhi, Y. Huang, Springer Handbook of Speech Processing 2008 Springer 653 679
-
(2008)
Springer Handbook of Speech Processing
, pp. 653-679
-
-
Droppo, J.1
Acero, A.2
-
16
-
-
85006734596
-
Evaluation of the SPLICE algorithm on the Aurora2 database
-
J. Droppo, A. Acero, and L. Deng Evaluation of the SPLICE algorithm on the Aurora2 database Proc. Eurospeech 2001 217 220
-
(2001)
Proc. Eurospeech
, pp. 217-220
-
-
Droppo, J.1
Acero, A.2
Deng, L.3
-
17
-
-
84905284245
-
Synthesized stereo mapping via deep neural networks for noisy speech recognition
-
J. Du, L.R. Dai, and Q. Huo Synthesized stereo mapping via deep neural networks for noisy speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2014 1764 1768
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 1764-1768
-
-
Du, J.1
Dai, L.R.2
Huo, Q.3
-
18
-
-
0021892216
-
Speech enhancement using a minimum mean-square error log-spectral amplitude estimator
-
Y. Ephraim Speech enhancement using a minimum mean-square error log-spectral amplitude estimator IEEE Trans. Acoust. Speech Signal Process. 33 1985 443 445
-
(1985)
IEEE Trans. Acoust. Speech Signal Process.
, vol.33
, pp. 443-445
-
-
Ephraim, Y.1
-
20
-
-
0032050110
-
Maximum likelihood linear transformations for HMM-based speech recognition
-
M.J.F. Gales Maximum likelihood linear transformations for HMM-based speech recognition Comput. Speech Lang. 12 1998 75 98
-
(1998)
Comput. Speech Lang.
, vol.12
, pp. 75-98
-
-
Gales, M.J.F.1
-
21
-
-
0032638856
-
Semi-tied covariance matrices for hidden Markov models
-
M.J.F. Gales Semi-tied covariance matrices for hidden Markov models IEEE Trans. Speech Audio Process. 7 1999 272 281
-
(1999)
IEEE Trans. Speech Audio Process.
, vol.7
, pp. 272-281
-
-
Gales, M.J.F.1
-
22
-
-
0034227757
-
Cluster adaptive training of hidden Markov models
-
M.J.F. Gales Cluster adaptive training of hidden Markov models IEEE Trans. Speech Audio Process. 8 2000 417 428
-
(2000)
IEEE Trans. Speech Audio Process.
, vol.8
, pp. 417-428
-
-
Gales, M.J.F.1
-
23
-
-
84878418279
-
Model-based approaches for degraded channel modelling in robust ASR
-
M.J.F. Gales, and F. Flego Model-based approaches for degraded channel modelling in robust ASR Proc. Interspeech 2012
-
(2012)
Proc. Interspeech
-
-
Gales, M.J.F.1
Flego, F.2
-
24
-
-
84910095643
-
Memory-enhanced neural networks and NMF for robust ASR
-
J.T. Geiger, F. Weninger, J.F. Gemmeke, M. Wöllmer, B. Schuller, and G. Rigoll Memory-enhanced neural networks and NMF for robust ASR IEEE/ACM Trans. Audio Speech Lang. Process. 22 2014 1037 1046
-
(2014)
IEEE/ACM Trans. Audio Speech Lang. Process.
, vol.22
, pp. 1037-1046
-
-
Geiger, J.T.1
Weninger, F.2
Gemmeke, J.F.3
Wöllmer, M.4
Schuller, B.5
Rigoll, G.6
-
25
-
-
80051634401
-
Simplification and optimization of i-vector extraction
-
O. Glembek, L. Burget, P. Matějka, M. Karafiát, and P. Kenny Simplification and optimization of i-vector extraction Proc. Int. Conf. Acoust., Speech, Signal Process. 2011 4516 4519
-
(2011)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 4516-4519
-
-
Glembek, O.1
Burget, L.2
Matějka, P.3
Karafiát, M.4
Kenny, P.5
-
26
-
-
34547548235
-
Probabilistic and bottle-neck features for LVCSR of meetings
-
IV-757-IV-760
-
F. Grézl, M. Karafiát, S. Kontár, and J. Černocký Probabilistic and bottle-neck features for LVCSR of meetings Proc. Int. Conf. Acoust., Speech, Signal Process. 2007 IV-757-IV-760
-
(2007)
Proc. Int. Conf. Acoust., Speech, Signal Process.
-
-
Grézl, F.1
Karafiát, M.2
Kontár, S.3
Černocký, J.4
-
27
-
-
85008520364
-
Transcribing meetings with the AMIDA systems
-
T. Hain, L. Burget, J. Dines, P.N. Garner, F. Grézl, A. El Hannani, M. Huijbregts, M. Karafiát, M. Lincoln, and V. Wan Transcribing meetings with the AMIDA systems IEEE Trans. Audio Speech Lang. Process. 20 2012 486 498
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, pp. 486-498
-
-
Hain, T.1
Burget, L.2
Dines, J.3
Garner, P.N.4
Grézl, F.5
El Hannani, A.6
Huijbregts, M.7
Karafiát, M.8
Lincoln, M.9
Wan, V.10
-
28
-
-
0033709098
-
Tandem connectionist feature extraction for conventional HMM systems
-
H. Hermansky, D. Ellis, and S. Sharma Tandem connectionist feature extraction for conventional HMM systems Proc. Int. Conf. Acoust., Speech, Signal Process. 2000 1635 1638
-
(2000)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 1635-1638
-
-
Hermansky, H.1
Ellis, D.2
Sharma, S.3
-
29
-
-
34047249084
-
Quantile based histogram equalization for noise robust large vocabulary speech recognition
-
F. Hilger, and H. Ney Quantile based histogram equalization for noise robust large vocabulary speech recognition IEEE Trans. Audio Speech Lang. Process. 14 2006 845 854
-
(2006)
IEEE Trans. Audio Speech Lang. Process.
, vol.14
, pp. 845-854
-
-
Hilger, F.1
Ney, H.2
-
30
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition
-
G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury Deep neural networks for acoustic modeling in speech recognition IEEE Signal Process. Mag. 29 2012 82 97
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.10
Kingsbury, B.11
-
32
-
-
70349452200
-
Robust speech dereverberation based on non-negativity and sparse nature of speech spectrograms
-
H. Kameoka, T. Nakatani, and T. Yoshioka Robust speech dereverberation based on non-negativity and sparse nature of speech spectrograms Proc. Int. Conf. Acoust., Speech, Signal Process. 2009 45 48
-
(2009)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 45-48
-
-
Kameoka, H.1
Nakatani, T.2
Yoshioka, T.3
-
33
-
-
84867608537
-
Power-normalized cepstral coefficients (PNCC) for robust speech recognition
-
C. Kim, and R.M. Stern Power-normalized cepstral coefficients (PNCC) for robust speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2012 4101 4104
-
(2012)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 4101-4104
-
-
Kim, C.1
Stern, R.M.2
-
34
-
-
84893668957
-
Investigation of multilingual deep neural networks for spoken term detection
-
K.M. Knill, M.J.F. Gales, S.P. Rath, P.C. Woodland, C. Zhang, and S.X. Zhang Investigation of multilingual deep neural networks for spoken term detection Proc. Workshop on Automatic Speech Recognition and Understanding 2013 138 143
-
(2013)
Proc. Workshop on Automatic Speech Recognition and Understanding
, pp. 138-143
-
-
Knill, K.M.1
Gales, M.J.F.2
Rath, S.P.3
Woodland, P.C.4
Zhang, C.5
Zhang, S.X.6
-
35
-
-
77955673019
-
Model-based feature enhancement for reverberant speech recognition
-
A. Krueger, and R. Haeb-Umbach Model-based feature enhancement for reverberant speech recognition IEEE Trans. Audio Speech Lang. Process. 18 2010 1692 1707
-
(2010)
IEEE Trans. Audio Speech Lang. Process.
, vol.18
, pp. 1692-1707
-
-
Krueger, A.1
Haeb-Umbach, R.2
-
36
-
-
14344274593
-
A new method based on spectral subtraction for speech dereverberation
-
K. Lebart, J.M. Boucher, and P.N. Denbigh A new method based on spectral subtraction for speech dereverberation Acta Acust. Unit. Acust. 87 2001 359 366
-
(2001)
Acta Acust. Unit. Acust.
, vol.87
, pp. 359-366
-
-
Lebart, K.1
Boucher, J.M.2
Denbigh, P.N.3
-
37
-
-
79959849500
-
Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems
-
B. Li, and K.C. Sim Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems Proc. Interspeech 2010 526 529
-
(2010)
Proc. Interspeech
, pp. 526-529
-
-
Li, B.1
Sim, K.C.2
-
38
-
-
84890532503
-
Noise adaptive front-end normalization based on vector Taylor series for deep neural networks in robust speech recognition
-
B. Li, and K.C. Sim Noise adaptive front-end normalization based on vector Taylor series for deep neural networks in robust speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2013 7408 7412
-
(2013)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 7408-7412
-
-
Li, B.1
Sim, K.C.2
-
39
-
-
84905216746
-
An ideal hidden-activation mask for deep neural networks based noise-robust speech recognition
-
B. Li, and K.C. Sim An ideal hidden-activation mask for deep neural networks based noise-robust speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2014 200 204
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 200-204
-
-
Li, B.1
Sim, K.C.2
-
42
-
-
85009242725
-
Evaluation of a noise-robust DSR front-end on AURORA databases
-
D. Macho, L. Mauuary, B. Noé, Y.M. Cheng, D. Ealey, D. Jouvet, H. Kelleher, D. Pearce, and F. Saadoun Evaluation of a noise-robust DSR front-end on AURORA databases Proc. Int. Conf. Spoken Language Process. 2002 17 20
-
(2002)
Proc. Int. Conf. Spoken Language Process.
, pp. 17-20
-
-
Macho, D.1
Mauuary, L.2
Noé, B.3
Cheng, Y.M.4
Ealey, D.5
Jouvet, D.6
Kelleher, H.7
Pearce, D.8
Saadoun, F.9
-
44
-
-
84867585919
-
Understanding how deep belief networks perform acoustic modelling
-
A. Mohamed, G. Hinton, and G. Penn Understanding how deep belief networks perform acoustic modelling Proc. Int. Conf. Acoust., Speech, Signal Process. 2012 4273 4276
-
(2012)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 4273-4276
-
-
Mohamed, A.1
Hinton, G.2
Penn, G.3
-
45
-
-
0029725301
-
A vector Taylor series approach for environment-independent speech recognition
-
P.J. Moreno, B. Raj, and R.M. Stern A vector Taylor series approach for environment-independent speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 1996 733 736
-
(1996)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 733-736
-
-
Moreno, P.J.1
Raj, B.2
Stern, R.M.3
-
46
-
-
0029306621
-
Continuous speech recognition: An introduction to the hybrid HMM/connectionist approach
-
N. Morgan, and H. Bourlard Continuous speech recognition: an introduction to the hybrid HMM/connectionist approach IEEE Signal Process. Mag. 12 1995 24 42
-
(1995)
IEEE Signal Process. Mag.
, vol.12
, pp. 24-42
-
-
Morgan, N.1
Bourlard, H.2
-
48
-
-
0029750993
-
Speaker-adaptation in a hybrid HMM-MLP recognizer
-
J.P. Neto, C. Martins, and L.B. Almeida Speaker-adaptation in a hybrid HMM-MLP recognizer Proc. Int. Conf. Acoust., Speech, Signal Process. 1996 3382 3385
-
(1996)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 3382-3385
-
-
Neto, J.P.1
Martins, C.2
Almeida, L.B.3
-
49
-
-
79251574977
-
The efficient incorporation of MLP features into automatic speech recognition systems
-
J. Park, F. Diehl, M.J.F. Gales, M. Tomalin, and P.C. Woodland The efficient incorporation of MLP features into automatic speech recognition systems Comput. Speech Lang. 25 2011 519 534
-
(2011)
Comput. Speech Lang.
, vol.25
, pp. 519-534
-
-
Park, J.1
Diehl, F.2
Gales, M.J.F.3
Tomalin, M.4
Woodland, P.C.5
-
50
-
-
0032665650
-
On the limits of speech recognition in noise
-
S.D. Peters, P. Stubley, and J.M. Valin On the limits of speech recognition in noise Proc. Int. Conf. Acoust., Speech, Signal Process. 1999 365 368
-
(1999)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 365-368
-
-
Peters, S.D.1
Stubley, P.2
Valin, J.M.3
-
51
-
-
84858985237
-
Improved acoustic feature combination for LVCSR by neural networks
-
C. Plahl, R. Schlüter, and H. Ney Improved acoustic feature combination for LVCSR by neural networks Proc. Interspeech 2011 1237 1240
-
(2011)
Proc. Interspeech
, pp. 1237-1240
-
-
Plahl, C.1
Schlüter, R.2
Ney, H.3
-
52
-
-
33646788786
-
FMPE: Discriminatively trained features for speech recognition
-
D. Povey, B. Kingsbury, L. Mangu, G. Saon, H. Soltau, and G. Zweig FMPE: Discriminatively trained features for speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2005 961 964
-
(2005)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 961-964
-
-
Povey, D.1
Kingsbury, B.2
Mangu, L.3
Saon, G.4
Soltau, H.5
Zweig, G.6
-
53
-
-
0028194709
-
Connectionist probability estimators in HMM speech recognition
-
S. Renals, N. Morgan, H. Bourlard, M. Cohen, and H. Franco Connectionist probability estimators in HMM speech recognition IEEE Trans. Speech Audio Process. 2 1994 161 174
-
(1994)
IEEE Trans. Speech Audio Process.
, vol.2
, pp. 161-174
-
-
Renals, S.1
Morgan, N.2
Bourlard, H.3
Cohen, M.4
Franco, H.5
-
55
-
-
80051608940
-
Robust speech recognition using dynamic noise adaptation
-
S. Rennie, P. Dognin, and P. Fousek Robust speech recognition using dynamic noise adaptation Proc. Int. Conf. Acoust., Speech, Signal Process. 2011 4592 4595
-
(2011)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 4592-4595
-
-
Rennie, S.1
Dognin, P.2
Fousek, P.3
-
58
-
-
34547539413
-
Gammatone features and feature combination for large vocabulary speech recognition
-
IV-649-IV-652
-
R. Schlüter, I. Bezrukov, H. Wagner, and H. Ney Gammatone features and feature combination for large vocabulary speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2007 IV-649-IV-652
-
(2007)
Proc. Int. Conf. Acoust., Speech, Signal Process.
-
-
Schlüter, R.1
Bezrukov, I.2
Wagner, H.3
Ney, H.4
-
60
-
-
84890492030
-
An investigation of deep neural networks for noise robust speech recognition
-
M.L. Seltzer, D. Yu, and Y. Wang An investigation of deep neural networks for noise robust speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2013 7398 7402
-
(2013)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 7398-7402
-
-
Seltzer, M.L.1
Yu, D.2
Wang, Y.3
-
61
-
-
70349206345
-
Bayesian feature enhancement using a mixture of unscented transformations for uncertainty decoding of noisy speech
-
Y. Shinohara, and M. Akamine Bayesian feature enhancement using a mixture of unscented transformations for uncertainty decoding of noisy speech Proc. Int. Conf. Acoust., Speech, Signal Process. 2009 4569 4572
-
(2009)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 4569-4572
-
-
Shinohara, Y.1
Akamine, M.2
-
66
-
-
67650107416
-
Recognition of reverberant speech using frequency domain linear prediction
-
S.S. Thomas, S. Ganapathy, and H. Hermansky Recognition of reverberant speech using frequency domain linear prediction IEEE Signal Process. Lett. 2008 681 684
-
(2008)
IEEE Signal Process. Lett.
, pp. 681-684
-
-
Thomas, S.S.1
Ganapathy, S.2
Hermansky, H.3
-
67
-
-
84862293102
-
Speaker and noise factorization for robust speech recognition
-
Y. Wang, and M.J.F. Gales Speaker and noise factorization for robust speech recognition IEEE Trans. Audio Speech Lang. Process. 20 2012 2149 2158
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, pp. 2149-2158
-
-
Wang, Y.1
Gales, M.J.F.2
-
68
-
-
84905216003
-
Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition
-
F. Weninger, S. Watanabe, Y. Tachioka, and B. Schuller Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2014 4656 4660
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 4656-4660
-
-
Weninger, F.1
Watanabe, S.2
Tachioka, Y.3
Schuller, B.4
-
69
-
-
84878418827
-
A feature space transformation method for personalization using generalized i-vector clustering
-
K. Yao, Y. Gong, and C. Liu A feature space transformation method for personalization using generalized i-vector clustering Proc. Interspeech 2011
-
(2011)
Proc. Interspeech
-
-
Yao, K.1
Gong, Y.2
Liu, C.3
-
70
-
-
84874226579
-
Adaptation of context-dependent deep neural networks for automatic speech recognition
-
K. Yao, D. Yu, F. Seide, H. Su, L. Deng, and Y. Gong Adaptation of context-dependent deep neural networks for automatic speech recognition Proc. IEEE Workshop on Spoken Language Technology 2012 366 369
-
(2012)
Proc. IEEE Workshop on Spoken Language Technology
, pp. 366-369
-
-
Yao, K.1
Yu, D.2
Seide, F.3
Su, H.4
Deng, L.5
Gong, Y.6
-
71
-
-
84905247922
-
Impact of single-microphone dereverberation on DNN-based meeting transcription systems
-
T. Yoshioka, X. Chen, and M.J.F. Gales Impact of single-microphone dereverberation on DNN-based meeting transcription systems Proc. Int. Conf. Acoust., Speech, Signal Process. 2014 5527 5531
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 5527-5531
-
-
Yoshioka, T.1
Chen, X.2
Gales, M.J.F.3
-
72
-
-
84867693894
-
Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening
-
T. Yoshioka, and T. Nakatani Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening IEEE Trans. Audio Speech Lang. Process. 20 2012 2707 2720
-
(2012)
IEEE Trans. Audio Speech Lang. Process.
, vol.20
, pp. 2707-2720
-
-
Yoshioka, T.1
Nakatani, T.2
-
73
-
-
84881043147
-
Noise model transfer: Novel approach to robustness against nonstationary noise
-
T. Yoshioka, and T. Nakatani Noise model transfer: novel approach to robustness against nonstationary noise IEEE Trans. Audio Speech Lang. Process. 21 2013 2182 2192
-
(2013)
IEEE Trans. Audio Speech Lang. Process.
, vol.21
, pp. 2182-2192
-
-
Yoshioka, T.1
Nakatani, T.2
-
74
-
-
85032751613
-
Making machines understand us in reverberant rooms: Robustness against reverberation for automatic speech recognition
-
T. Yoshioka, A. Sehr, M. Delcroix, K. Kinoshita, R. Maas, T. Nakatani, and W. Kellermann Making machines understand us in reverberant rooms: robustness against reverberation for automatic speech recognition IEEE Signal Process. Mag. 29 2012 114 126
-
(2012)
IEEE Signal Process. Mag.
, vol.29
, pp. 114-126
-
-
Yoshioka, T.1
Sehr, A.2
Delcroix, M.3
Kinoshita, K.4
Maas, R.5
Nakatani, T.6
Kellermann, W.7
-
75
-
-
70349210281
-
Adaptive dereverberation of speech signals with speaker-position change detection
-
T. Yoshioka, H. Tachibana, T. Nakatani, and M. Miyoshi Adaptive dereverberation of speech signals with speaker-position change detection Proc. Int. Conf. Acoust., Speech, Signal Process. 2009 3733 3736
-
(2009)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 3733-3736
-
-
Yoshioka, T.1
Tachibana, H.2
Nakatani, T.3
Miyoshi, M.4
-
76
-
-
60749097551
-
-
Cambridge University Engineering Department Cambridge, UK
-
S.J. Young, G. Evermann, M.J.F. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P.C. Woodland The HTK Book version 3.4.1 2009 Cambridge University Engineering Department Cambridge, UK
-
(2009)
The HTK Book Version 3.4.1
-
-
Young, S.J.1
Evermann, G.2
Gales, M.J.F.3
Hain, T.4
Kershaw, D.5
Liu, X.6
Moore, G.7
Odell, J.8
Ollason, D.9
Povey, D.10
Valtchev, V.11
Woodland, P.C.12
-
77
-
-
66149101303
-
Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor
-
D. Yu, L. Deng, J. Droppo, J. Wu, Y. Gong, and A. Acero Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor IEEE Trans. Audio Speech Lang. Process. 16 2008 1061 1070
-
(2008)
IEEE Trans. Audio Speech Lang. Process.
, vol.16
, pp. 1061-1070
-
-
Yu, D.1
Deng, L.2
Droppo, J.3
Wu, J.4
Gong, Y.5
Acero, A.6
-
78
-
-
84890542079
-
KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition
-
D. Yu, K. Yao, H. Su, G. Li, and F. Seide KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition Proc. Int. Conf. Acoust., Speech, Signal Process. 2013 7893 7897
-
(2013)
Proc. Int. Conf. Acoust., Speech, Signal Process.
, pp. 7893-7897
-
-
Yu, D.1
Yao, K.2
Su, H.3
Li, G.4
Seide, F.5