-
1
-
-
0029288202
-
Speech recognition in noisy environments: A survey
-
Y. Gong, "Speech recognition in noisy environments: A survey," Speech Commun., vol. 16, pp. 261-291, 1995.
-
(1995)
Speech Commun.
, vol.16
, pp. 261-291
-
-
Gong, Y.1
-
2
-
-
0030245128
-
Robust continuous speech recognition using parallel model combination
-
Sep.
-
M. J. F. Gales and S. J. Young, "Robust continuous speech recognition using parallel model combination," IEEE Trans. Speech Audio Process., vol. 4, no. 5, pp. 352-359, Sep. 1996.
-
(1996)
IEEE Trans. Speech Audio Process.
, vol.4
, Issue.5
, pp. 352-359
-
-
Gales, M.J.F.1
Young, S.J.2
-
3
-
-
0027166410
-
Recognition of speech in additive and convolutional noise based on RASTA spectral processing
-
H. Hermansky, N. Morgan, and H.-G. Hirsch, "Recognition of speech in additive and convolutional noise based on RASTA spectral processing," in Proc. ICASSP, 1993, vol. 10, pp. 509-512.
-
(1993)
Proc. ICASSP
, vol.10
, pp. 509-512
-
-
Hermansky, H.1
Morgan, N.2
Hirsch, H.-G.3
-
4
-
-
0018455310
-
Suppression of acoustic noise in speech using spectral subtraction
-
Apr.
-
S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979.
-
(1979)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.ASSP-27
, Issue.2
, pp. 113-120
-
-
Boll, S.F.1
-
6
-
-
84892233308
-
On ideal binary mask as the computational goal of auditory scene analysis
-
P. Divenyi, Ed. Norwell, MA, USA: Kluwer
-
D. L. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA, USA: Kluwer, 2005, pp. 181-197.
-
(2005)
Speech Separation by Humans and Machines
, pp. 181-197
-
-
Wang, D.L.1
-
7
-
-
85032752225
-
Missing-feature approaches in speech recognition
-
Sep.
-
B. Raj and R. M. Stern, "Missing-feature approaches in speech recognition," IEEE Signal Process. Mag., vol. 22, no. 2, pp. 101-116, Sep. 2005.
-
(2005)
IEEE Signal Process. Mag.
, vol.22
, Issue.2
, pp. 101-116
-
-
Raj, B.1
Stern, R.M.2
-
8
-
-
84877621926
-
The role of binary mask patterns in automatic speech recognition in background noise
-
A. Narayanan and D. L. Wang, "The role of binary mask patterns in automatic speech recognition in background noise," J. Acoust. Soc. Amer., vol. 133, no. 5, pp. 3083-3093, 2013.
-
(2013)
J. Acoust. Soc. Amer.
, vol.133
, Issue.5
, pp. 3083-3093
-
-
Narayanan, A.1
Wang, D.L.2
-
9
-
-
0021176902
-
The GRASP sound separation system
-
M. Weintraub, "The GRASP sound separation system," in Proc. IEEE ICASSP, 1984, pp. 18A.6.1-18A.6.4.
-
(1984)
Proc. IEEE ICASSP
-
-
Weintraub, M.1
-
10
-
-
0028531926
-
Computational auditory scene analysis
-
G. J. Brown and M. Cooke, "Computational auditory scene analysis," Comput. Speech Lang., vol. 8, pp. 297-336, 1994.
-
(1994)
Comput. Speech Lang.
, vol.8
, pp. 297-336
-
-
Brown, G.J.1
Cooke, M.2
-
11
-
-
0032682770
-
Separation of speech from interfering sounds based on oscillatory correlation
-
May
-
D. L. Wang and G. J. Brown, "Separation of speech from interfering sounds based on oscillatory correlation," IEEE Trans. Neural Netw., vol. 10, no. 3, pp. 684-697, May 1999.
-
(1999)
IEEE Trans. Neural Netw.
, vol.10
, Issue.3
, pp. 684-697
-
-
Wang, D.L.1
Brown, G.J.2
-
12
-
-
64649103540
-
Speech intelligibility in background noise with ideal binary time-frequency masking
-
D. L. Wang, U. Kjems, M. S. Pedersen, J. B. Boldt, and T. Lunner, "Speech intelligibility in background noise with ideal binary time-frequency masking," J. Acoust. Soc. Amer., vol. 125, pp. 2336-2347, 2009.
-
(2009)
J. Acoust. Soc. Amer.
, vol.125
, pp. 2336-2347
-
-
Wang, D.L.1
Kjems, U.2
Pedersen, M.S.3
Boldt, J.B.4
Lunner, T.5
-
13
-
-
0035342414
-
Robust automatic speech recognition with missing and unreliable acoustic data
-
M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, pp. 267-285, 2001.
-
(2001)
Speech Commun.
, vol.34
, pp. 267-285
-
-
Cooke, M.1
Green, P.2
Josifovski, L.3
Vizinho, A.4
-
14
-
-
4644336054
-
Reconstruction of missing features for robust speech recognition
-
B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of missing features for robust speech recognition," Speech Commun., vol. 43, pp. 275-296, 2004.
-
(2004)
Speech Commun.
, vol.43
, pp. 275-296
-
-
Raj, B.1
Seltzer, M.L.2
Stern, R.M.3
-
15
-
-
77957739976
-
Advances in missing feature techniques for robust large-vocabulary continuous speech recognition
-
Jan.
-
M. V. Segbroeck and H. V. Hamme, "Advances in missing feature techniques for robust large-vocabulary continuous speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 1, pp. 123-137, Jan. 2011.
-
(2011)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.19
, Issue.1
, pp. 123-137
-
-
Segbroeck, M.V.1
Hamme, H.V.2
-
16
-
-
80051633766
-
Investigations into the incorporation of the ideal binary mask in ASR
-
Prague, Czech Republic, May
-
W. Hartmann and E. Fosler-Lussier, "Investigations into the incorporation of the ideal binary mask in ASR," in Proc. IEEE ICASSP, Prague, Czech Republic, May 2011, pp. 4804-4807.
-
(2011)
Proc. IEEE ICASSP
, pp. 4804-4807
-
-
Hartmann, W.1
Fosler-Lussier, E.2
-
17
-
-
0001556285
-
Recognising occluded speech
-
M. Cooke, A. Morris, and P. Green, "Recognising occluded speech," in Proc. ESCA Workshop Auditory Basis of Speech Percept., 1996, pp. 297-300.
-
(1996)
Proc. ESCA Workshop Auditory Basis of Speech Percept.
, pp. 297-300
-
-
Cooke, M.1
Morris, A.2
Green, P.3
-
19
-
-
0000652102
-
Some solutions to the missing feature problem in vision
-
S. J. Hanson, J. D. Cowan, and C. L. Giles, Eds. San Mateo, CA, USA: Morgan Kaufmann
-
S. Ahmad and V. Tresp, "Some solutions to the missing feature problem in vision," in Advances in Neural Information Processing Systems 5 (NIPS'92), S. J. Hanson, J. D. Cowan, and C. L. Giles, Eds. San Mateo, CA, USA: Morgan Kaufmann, 1993.
-
(1993)
Advances in Neural Information Processing Systems 5 (NIPS'92)
-
-
Ahmad, S.1
Tresp, V.2
-
20
-
-
16344396527
-
Using missing feature theory to actively select features for robust speech recognition with interruptions, filtering, and noise
-
R. Lippmann and B. A. Carlson, "Using missing feature theory to actively select features for robust speech recognition with interruptions, filtering, and noise," in Proc. Eurospeech'97, 1997, pp. 37-40.
-
(1997)
Proc. Eurospeech'97
, pp. 37-40
-
-
Lippmann, R.1
Carlson, B.A.2
-
21
-
-
0019053271
-
Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
-
Aug.
-
S. B. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-28, no. 4, pp. 357-366, Aug. 1980.
-
(1980)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.ASSP-28
, Issue.4
, pp. 357-366
-
-
Davis, S.B.1
Mermelstein, P.2
-
22
-
-
33750311718
-
Binary and ratio time-frequency masks for robust speech recognition
-
S. Srinivasan, N. Roman, and D. L. Wang, "Binary and ratio time-frequency masks for robust speech recognition," Speech Commun., vol. 48, pp. 1486-1501, 2006.
-
(2006)
Speech Commun.
, vol.48
, pp. 1486-1501
-
-
Srinivasan, S.1
Roman, N.2
Wang, D.L.3
-
23
-
-
56249136428
-
Transforming binary uncertainties for robust speech recognition
-
Sep.
-
S. Srinivasan and D. L. Wang, "Transforming binary uncertainties for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 7, pp. 2130-2140, Sep. 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.15
, Issue.7
, pp. 2130-2140
-
-
Srinivasan, S.1
Wang, D.L.2
-
24
-
-
69249203845
-
Monaural speech separation based on MAXVQ and CASA for robust speech recognition
-
Jan.
-
P. Li, Y. Guan, S. Wang, B. Xu, and W. Liu, "Monaural speech separation based on MAXVQ and CASA for robust speech recognition," Comput. Speech Lang., vol. 24, no. 1, pp. 30-44, Jan. 2010.
-
(2010)
Comput. Speech Lang.
, vol.24
, Issue.1
, pp. 30-44
-
-
Li, P.1
Guan, Y.2
Wang, S.3
Xu, B.4
Liu, W.5
-
25
-
-
85009063707
-
Soft decisions in missing data techniques for robust automatic speech recognition
-
Beijing, China
-
J. Barker, L. Josifovski, M. Cooke, and P. Green, "Soft decisions in missing data techniques for robust automatic speech recognition," in Proc. Int. Conf. Spoken Lang., Beijing, China, 2000, pp. 373-376.
-
(2000)
Proc. Int. Conf. Spoken Lang.
, pp. 373-376
-
-
Barker, J.1
Josifovski, L.2
Cooke, M.3
Green, P.4
-
26
-
-
84867596016
-
A novel approach to soft-mask estimation and log-spectral enhancement for robust speech recognition
-
J. V. Hout and A. Alwan, "A novel approach to soft-mask estimation and log-spectral enhancement for robust speech recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 2012, pp. 4105-4108.
-
(2012)
Proc. IEEE Int. Conf. Acoust., Speech, Signal Process.
, pp. 4105-4108
-
-
Hout, J.V.1
Alwan, A.2
-
27
-
-
77956506956
-
Missing-feature reconstruction by leveraging temporal spectral correlation for robust speech recognition in background noise conditions
-
Nov.
-
W. Kim and J. H. L. Hansen, "Missing-feature reconstruction by leveraging temporal spectral correlation for robust speech recognition in background noise conditions," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 2111-2120, Nov. 2010.
-
(2010)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.18
, Issue.8
, pp. 2111-2120
-
-
Kim, W.1
Hansen, J.H.L.2
-
28
-
-
0021226391
-
A database for speaker-independent digit recognition
-
R. G. Leonard, "A database for speaker-independent digit recognition," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., 1984, pp. 111-114.
-
(1984)
Proc. IEEE Int. Conf. Acoust., Speech, Signal Process.
, pp. 111-114
-
-
Leonard, R.G.1
-
29
-
-
84867227925
-
Noise reduction through compressed sensing
-
J. Gemmeke and B. Cranen, "Noise reduction through compressed sensing," in Proc. Interspeech, 2008.
-
(2008)
Proc. Interspeech
-
-
Gemmeke, J.1
Cranen, B.2
-
30
-
-
84873833546
-
Multi-candidate missing data imputation for robust speech recognition
-
doi:10.1186/1687-4722-2012-17
-
Y. Wang and H. V. Hamme, "Multi-candidate missing data imputation for robust speech recognition," EURASIP J. Audio, Speech, Music Process., vol. 17, 2012, doi:10.1186/1687-4722-2012-17.
-
(2012)
EURASIP J. Audio, Speech, Music Process.
, vol.17
-
-
Wang, Y.1
Hamme, H.V.2
-
31
-
-
85009227702
-
Analysis of the Aurora large vocabulary extensions
-
Geneva, Switzerland, Sep.
-
N. Parihar and J. Picone, "Analysis of the Aurora large vocabulary extensions," in Proc. Eurospeech, Geneva, Switzerland, Sep. 2003, vol. 4, pp. 337-340.
-
(2003)
Proc. Eurospeech
, vol.4
, pp. 337-340
-
-
Parihar, N.1
Picone, J.2
-
32
-
-
11144316019
-
Decoding speech in the presence of other sources
-
J. Barker, M. Cooke, and D. P. W. Ellis, "Decoding speech in the presence of other sources," Speech Commun., vol. 45, pp. 5-25, 2005.
-
(2005)
Speech Commun.
, vol.45
, pp. 5-25
-
-
Barker, J.1
Cooke, M.2
Ellis, D.P.W.3
-
33
-
-
70350038037
-
Robust speech recognition by integrating speech separation and hypothesis testing
-
S. Srinivasan and D. L. Wang, "Robust speech recognition by integrating speech separation and hypothesis testing," Speech Commun., vol. 52, pp. 72-81, 2010.
-
(2010)
Speech Commun.
, vol.52
, pp. 72-81
-
-
Srinivasan, S.1
Wang, D.L.2
-
34
-
-
82255178542
-
-
New York, NY, USA: Wiley-IEEE Press
-
D. L. Wang and G. Brown, Eds., Computational Auditory Scene Analysis: Principles, Algorithms, and Applications. New York, NY, USA: Wiley-IEEE Press, 2006.
-
(2006)
Computational Auditory Scene Analysis: Principles, Algorithms, and Applications
-
-
Wang, D.L.1
Brown, G.2
-
36
-
-
56749102248
-
-
Ph.D. dissertation, The Ohio State Univ., Columbus, OH, USA
-
S. Srinivasan, "Integrating computational auditory scene analysis and automatic speech recognition," Ph.D. dissertation, The Ohio State Univ., Columbus, OH, USA, 2006.
-
(2006)
Integrating Computational Auditory Scene Analysis and Automatic Speech Recognition
-
-
Srinivasan, S.1
-
37
-
-
0003982501
-
-
Ph.D. dissertation, Stanford Univ., Stanford, CA, USA
-
M. Weintraub, "A theory and computational model of computational auditory scene analysis," Ph.D. dissertation, Stanford Univ., Stanford, CA, USA, 1985.
-
(1985)
A Theory and Computational Model of Computational Auditory Scene Analysis
-
-
Weintraub, M.1
-
38
-
-
0003822743
-
-
Cambridge, U.K.: Cambridge Univ. Publishing Dept.
-
S. Young, G. Evermann, T. Hain, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book. Cambridge, U.K.: Cambridge Univ. Publishing Dept., 2002 [Online]. Available: http://htk.eng.cam.ac.uk
-
(2002)
The HTK Book
-
-
Young, S.1
Evermann, G.2
Hain, T.3
Kershaw, D.4
Moore, G.5
Odell, J.6
Ollason, D.7
Povey, D.8
Valtchev, V.9
Woodland, P.10
-
39
-
-
0004319968
-
-
Speech Research Unit, Defense Research Agency, Malvern, UK, Tech. Rep.
-
A. P. Varga, H. J. M. Steeneken, M. Tomlinson, and D. Jones, "The NOISEX-92 study on the effect of additive noise on automatic speech recognition," Speech Research Unit, Defense Research Agency, Malvern, UK, Tech. Rep., 1992.
-
(1992)
The NOISEX-92 Study on the Effect of Additive Noise on Automatic Speech Recognition
-
-
Varga, A.P.1
Steeneken, H.J.M.2
Tomlinson, M.3
Jones, D.4
-
40
-
-
85079095310
-
The design of Wall Street Journal-based CSR corpus
-
Banff, AB, Canada, Oct.
-
D. Paul and J. Baker, "The design of Wall Street Journal-based CSR corpus," in Proc. Int. Conf. Spoken Lang., Banff, AB, Canada, Oct. 1992, pp. 899-902.
-
(1992)
Proc. Int. Conf. Spoken Lang.
, pp. 899-902
-
-
Paul, D.1
Baker, J.2
-
41
-
-
78049364397
-
MMSE based noise PSD tracking with low complexity
-
R. C. Hendriks, R. Heusdens, and J. Jensen, "MMSE based noise PSD tracking with low complexity," in Proc. IEEE ICASSP, 2010, pp. 4266-4269.
-
(2010)
Proc. IEEE ICASSP
, pp. 4266-4269
-
-
Hendriks, R.C.1
Heusdens, R.2
Jensen, J.3
-
42
-
-
51449104842
-
Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors
-
Aug.
-
J. S. Erkelens, R. C. Hendriks, R. Heusdens, and J. Jensen, "Minimum mean-square error estimation of discrete Fourier coefficients with generalized gamma priors," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 6, pp. 1741-1752, Aug. 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.15
, Issue.6
, pp. 1741-1752
-
-
Erkelens, J.S.1
Hendriks, R.C.2
Heusdens, R.3
Jensen, J.4
-
43
-
-
84877592730
-
-
D. P. Ellis, J. A. Bilmes, E. Fosler-Lussier, H. Hermansky, D. Johnson, B. Kingsbury, and N. Morgan, "The SPRACHcore Software Package," 2010 [Online]. Available: http://www.icsi.berkeley.edu/~dpwe/projects/sprach/sprachcore.html
-
(2010)
The SPRACHcore Software Package
-
-
Ellis, D.P.1
Bilmes, J.A.2
Fosler-Lussier, E.3
Hermansky, H.4
Johnson, D.5
Kingsbury, B.6
Morgan, N.7
-
44
-
-
42549139762
-
MVA processing of speech features
-
Jan.
-
C.-P. Chen and J. A. Bilmes, "MVA processing of speech features," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 1, pp. 257-270, Jan. 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.15
, Issue.1
, pp. 257-270
-
-
Chen, C.-P.1
Bilmes, J.A.2