-
1
-
-
35048881485
-
Underdetermined blind separation of convolutive mixtures of speech with directivity pattern based mask and ICA
-
S. Araki, S. Makino, H. Sawada, and R. Mukai, "Underdetermined blind separation of convolutive mixtures of speech with directivity pattern based mask and ICA," in Proc. 5th Int. Conf. Independent Compon. Anal., 2004, pp. 898-905.
-
(2004)
Proc. 5th Int. Conf. Independent Compon. Anal
, pp. 898-905
-
-
Araki, S.1
Makino, S.2
Sawada, H.3
Mukai, R.4
-
3
-
-
11144316019
-
Decoding speech in the presence of other sources
-
J. P. Barker, M. P. Cooke, and D. P. W. Ellis, "Decoding speech in the presence of other sources," Speech Commun., vol. 45, pp. 5-25, 2005.
-
(2005)
Speech Commun
, vol.45
, pp. 5-25
-
-
Barker, J.P.1
Cooke, M.P.2
Ellis, D.P.W.3
-
4
-
-
85009154399
-
Including uncertainty of speech observations in robust speech recognition
-
M. C. Benitez, J. C. Segura, A. D. Torre, J. Ramirez, and A. Rubio, "Including uncertainty of speech observations in robust speech recognition," in Proc. Int. Conf. Spoken Lang. Process., 2004, pp. 137-140.
-
(2004)
Proc. Int. Conf. Spoken Lang. Process
, pp. 137-140
-
-
Benitez, M.C.1
Segura, J.C.2
Torre, A.D.3
Ramirez, J.4
Rubio, A.5
-
5
-
-
64249165037
-
-
P. Boersma and D. Weenink, Praat: Doing Phonetics by Computer, Version 4.0.26, 2002, Online, Available
-
P. Boersma and D. Weenink, "Praat: Doing Phonetics by Computer, Version 4.0.26," 2002. [Online]. Available: http://www.fon.hum.uva.nl/praat
-
-
-
-
6
-
-
0018455310
-
Suppression of acoustic noise in speech using spectral subtraction
-
Apr
-
S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-27, no. 2, pp. 113-120, Apr. 1979.
-
(1979)
IEEE Trans. Acoust., Speech, Signal Process
, vol.ASSP-27
, Issue.2
, pp. 113-120
-
-
Boll, S.F.1
-
8
-
-
44949173881
-
Statistical analysis and performance of DFT domain noise reduction filters for robust speech recognition
-
C. Breithaupt and R. Martin, "Statistical analysis and performance of DFT domain noise reduction filters for robust speech recognition," in Proc. Interspeech'06, 2006, pp. 365-368.
-
(2006)
Proc. Interspeech'06
, pp. 365-368
-
-
Breithaupt, C.1
Martin, R.2
-
9
-
-
33845354768
-
Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation
-
D. S. Brungart, P. S. Chang, B. D. Simpson, and D. L. Wang, "Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation," J. Acoust. Soc. Amer., vol. 120, pp. 4007-4018, 2006.
-
(2006)
J. Acoust. Soc. Amer
, vol.120
, pp. 4007-4018
-
-
Brungart, D.S.1
Chang, P.S.2
Simpson, B.D.3
Wang, D.L.4
-
10
-
-
64249167258
-
-
The CMU Pronouncing Dictionary, Carnegie Mellon University, Pittsburgh, PA [Online]. Available: http://www.speech.cs.cmu.edu/cgi-bin/cmudict
-
"The CMU Pronouncing Dictionary," Carnegie Mellon University, Pittsburgh, PA [Online]. Available: http://www.speech.cs.cmu.edu/cgi-bin/cmudict
-
-
-
-
11
-
-
33745217651
-
Exploration of behavioral, physiological, and computational approaches to auditory scene analysis
-
Master's thesis, Dept. Comput. Sci. Eng, The Ohio State Univ, Columbus
-
P. S. Chang, "Exploration of behavioral, physiological, and computational approaches to auditory scene analysis," Master's thesis, Dept. Comput. Sci. Eng., The Ohio State Univ., Columbus, 2004.
-
(2004)
-
-
Chang, P.S.1
-
12
-
-
0035342414
-
Robust automatic speech recognition with missing and unreliable acoustic data
-
M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, pp. 267-285, 2001.
-
(2001)
Speech Commun
, vol.34
, pp. 267-285
-
-
Cooke, M.1
Green, P.2
Josifovski, L.3
Vizinho, A.4
-
13
-
-
0019053271
-
Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
-
Aug
-
S. B. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-28, no. 4, pp. 357-366, Aug. 1980.
-
(1980)
IEEE Trans. Acoust., Speech, Signal Process
, vol.ASSP-28
, Issue.4
, pp. 357-366
-
-
Davis, S.B.1
Mermelstein, P.2
-
14
-
-
85009070292
-
Large-vocabulary speech recognition under adverse acoustic environments
-
L. Deng, A. Acero, M. Plumpe, and X. Huang, "Large-vocabulary speech recognition under adverse acoustic environments," in Proc. Int. Conf. Spoken Lang. Process., 2000, pp. 806-809.
-
(2000)
Proc. Int. Conf. Spoken Lang. Process
, pp. 806-809
-
-
Deng, L.1
Acero, A.2
Plumpe, M.3
Huang, X.4
-
15
-
-
18744401086
-
Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion
-
May
-
L. Deng, J. Droppo, and A. Acero, "Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion," IEEE Trans. Speech Audio Process., vol. 13, no. 3, pp. 412-421, May 2005.
-
(2005)
IEEE Trans. Speech Audio Process
, vol.13
, Issue.3
, pp. 412-421
-
-
Deng, L.1
Droppo, J.2
Acero, A.3
-
16
-
-
0033099548
-
On second-order statistics and linear estimation of cepstral coefficients
-
Mar
-
Y. Ephraim and M. Rahim, "On second-order statistics and linear estimation of cepstral coefficients," IEEE Trans. Speech Audio Process., vol. 7, no. 2, pp. 162-176, Mar. 1999.
-
(1999)
IEEE Trans. Speech Audio Process
, vol.7
, Issue.2
, pp. 162-176
-
-
Ephraim, Y.1
Rahim, M.2
-
17
-
-
0030245128
-
Robust continuous speech recognition using parallel model combination
-
Sep
-
M. J. F. Gales and S. J. Young, "Robust continuous speech recognition using parallel model combination," IEEE Trans. Speech Audio Process., vol. 4, no. 5, pp. 352-359, Sep. 1996.
-
(1996)
IEEE Trans. Speech Audio Process
, vol.4
, Issue.5
, pp. 352-359
-
-
Gales, M.J.F.1
Young, S.J.2
-
18
-
-
0001551844
-
Supervised learning from incomplete data via an EM approach
-
J. D. Cowan, G. Tesauro, and J. Alspector, Eds. San Francisco, CA: Morgan Kaufmann
-
Z. Ghahramani and M. I. Jordan, "Supervised learning from incomplete data via an EM approach," in Advances in Neural Information Processing Systems 6, J. D. Cowan, G. Tesauro, and J. Alspector, Eds. San Francisco, CA: Morgan Kaufmann, 1993, pp. 120-127.
-
(1993)
Advances in Neural Information Processing Systems 6
, pp. 120-127
-
-
Ghahramani, Z.1
Jordan, M.I.2
-
19
-
-
4644265990
-
Monaural speech segregation based on pitch tracking and amplitude modulation
-
Sep
-
G. Hu and D. L. Wang, "Monaural speech segregation based on pitch tracking and amplitude modulation," IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1135-1150, Sep. 2004.
-
(2004)
IEEE Trans. Neural Netw
, vol.15
, Issue.5
, pp. 1135-1150
-
-
Hu, G.1
Wang, D.L.2
-
21
-
-
44949190747
-
Improved source modeling and predictive classification for channel robust speech recognition
-
V. Ion and R. Haeb-Umbach, "Improved source modeling and predictive classification for channel robust speech recognition," in Proc. Interspeech, 2006, pp. 633-636.
-
(2006)
Proc. Interspeech
, pp. 633-636
-
-
Ion, V.1
Haeb-Umbach, R.2
-
22
-
-
64249084844
-
-
Int. Telecomm. Union (ITU-T), Transmission characteristics for wideband (150-7000 Hz) digital hands-free telephony terminals, Recommendation P.341, 2005.
-
Int. Telecomm. Union (ITU-T), "Transmission characteristics for wideband (150-7000 Hz) digital hands-free telephony terminals," Recommendation P.341, 2005.
-
-
-
-
23
-
-
33749058582
-
Separation and robust recognition of noisy, convolutive speech mixtures using time-frequency masking and missing-data techniques
-
D. Kolossa, A. Klimas, and R. Orglmeister, "Separation and robust recognition of noisy, convolutive speech mixtures using time-frequency masking and missing-data techniques," in Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust., 2005, pp. 82-85.
-
(2005)
Proc. IEEE Workshop Applicat. Signal Process. Audio Acoust
, pp. 82-85
-
-
Kolossa, D.1
Klimas, A.2
Orglmeister, R.3
-
24
-
-
0002560960
-
A database for speaker-independent digit recognition
-
R. G. Leonard, "A database for speaker-independent digit recognition," in Proc. ICASSP'84, 1984, pp. 111-114.
-
(1984)
Proc. ICASSP'84
, pp. 111-114
-
-
Leonard, R.G.1
-
25
-
-
33745202806
-
Joint uncertainty decoding for noise robust speech recognition
-
H. Liao and M. J. F. Gales, "Joint uncertainty decoding for noise robust speech recognition," in Proc. Interspeech'05, 2005, pp. 3129-3132.
-
(2005)
Proc. Interspeech'05
, pp. 3129-3132
-
-
Liao, H.1
Gales, M.J.F.2
-
26
-
-
44949140801
-
Issues with uncertainty decoding for noise robust speech recognition
-
H. Liao and M. J. F. Gales, "Issues with uncertainty decoding for noise robust speech recognition," in Proc. Interspeech'06, 2006, pp. 1121-1124.
-
(2006)
Proc. Interspeech'06
, pp. 1121-1124
-
-
Liao, H.1
Gales, M.J.F.2
-
27
-
-
33947619691
-
Statistical methods for the enhancement of noisy speech
-
J. Benesty, S. Makino, and J. Chen, Eds. NY: Springer, ch. 3, pp
-
R. Martin, "Statistical methods for the enhancement of noisy speech," in Speech Enhancement, J. Benesty, S. Makino, and J. Chen, Eds. NY: Springer, 2005, ch. 3, pp. 43-65.
-
(2005)
Speech Enhancement
, pp. 43-65
-
-
Martin, R.1
-
29
-
-
0002671953
-
A minimax classification approach with application to robust speech recognition
-
Jan
-
N. Merhav and C. H. Lee, "A minimax classification approach with application to robust speech recognition," IEEE Trans. Speech Audio Process., vol. 1, no. 1, pp. 90-100, Jan. 1993.
-
(1993)
IEEE Trans. Speech Audio Process
, vol.1
, Issue.1
, pp. 90-100
-
-
Merhav, N.1
Lee, C.H.2
-
30
-
-
0029725301
-
A vector Taylor series approach for environment-independent speech recognition
-
P. J. Moreno, B. Raj, and R. M. Stern, "A vector Taylor series approach for environment-independent speech recognition," in Proc. ICASSP'96, 1996, vol. 2, pp. 733-736.
-
(1996)
Proc. ICASSP'96
, vol.2
, pp. 733-736
-
-
Moreno, P.J.1
Raj, B.2
Stern, R.M.3
-
31
-
-
4644304197
-
A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation
-
K. J. Palomaki, G. J. Brown, and D. L. Wang, "A binaural processor for missing data speech recognition in the presence of noise and small-room reverberation," Speech Commun., vol. 43, pp. 361-378, 2004.
-
(2004)
Speech Commun
, vol.43
, pp. 361-378
-
-
Palomaki, K.J.1
Brown, G.J.2
Wang, D.L.3
-
32
-
-
33646773271
-
-
Aurora Working Group, Eur. Telecomm. Standards Inst, Sophia-Antipolis Cedex, France
-
N. Parihar and J. Picone, "DSR front end LVCSR evaluation," Aurora Working Group, Eur. Telecomm. Standards Inst., Sophia-Antipolis Cedex, France, 2002.
-
(2002)
DSR front end LVCSR evaluation
-
-
Parihar, N.1
Picone, J.2
-
33
-
-
85009227702
-
Analysis of the Aurora large vocabulary evaluations
-
N. Parihar and J. Picone, "Analysis of the Aurora large vocabulary evaluations," in Proc. Eurospeech'03, 2003, pp. 337-340.
-
(2003)
Proc. Eurospeech'03
, pp. 337-340
-
-
Parihar, N.1
Picone, J.2
-
34
-
-
85079095310
-
The design of Wall Street Journal-based CSR corpus
-
D. Paul and J. Baker, "The design of Wall Street Journal-based CSR corpus," in Proc. Int. Conf. Spoken Lang. Process., 1992, pp. 899-902.
-
(1992)
Proc. Int. Conf. Spoken Lang. Process
, pp. 899-902
-
-
Paul, D.1
Baker, J.2
-
35
-
-
33745697459
-
Separating underdetermined convolutive speech mixtures
-
M. S. Pedersen, D. L. Wang, J. Larsen, and U. Kjems, "Separating underdetermined convolutive speech mixtures," in Proc. 6th Int. Conf. Independent Compon. Anal. Blind Source Separation, 2006, pp. 674-681.
-
(2006)
Proc. 6th Int. Conf. Independent Compon. Anal. Blind Source Separation
, pp. 674-681
-
-
Pedersen, M.S.1
Wang, D.L.2
Larsen, J.3
Kjems, U.4
-
37
-
-
4644336054
-
Reconstruction of missing features for robust speech recognition
-
B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of missing features for robust speech recognition," Speech Commun., vol. 43, pp. 275-296, 2004.
-
(2004)
Speech Commun
, vol.43
, pp. 275-296
-
-
Raj, B.1
Seltzer, M.L.2
Stern, R.M.3
-
38
-
-
11144343436
-
Detection of reliable features for speech recognition in noisy conditions using a statistical criterion
-
P. Renevey and A. Drygajlo, "Detection of reliable features for speech recognition in noisy conditions using a statistical criterion," in Proc. Consist. Rel. Acoust. Cues Sound Anal. Workshop, 2001, pp. 71-74.
-
(2001)
Proc. Consist. Rel. Acoust. Cues Sound Anal. Workshop
, pp. 71-74
-
-
Renevey, P.1
Drygajlo, A.2
-
39
-
-
0142026377
-
Speech segregation based on sound localization
-
N. Roman, D. L. Wang, and G. J. Brown, "Speech segregation based on sound localization," J. Acoust. Soc. Amer., vol. 114, pp. 2236-2252, 2003.
-
(2003)
J. Acoust. Soc. Amer
, vol.114
, pp. 2236-2252
-
-
Roman, N.1
Wang, D.L.2
Brown, G.J.3
-
40
-
-
85009230793
-
Factorial models and refiltering for speech separation and denoising
-
S. T. Roweis, "Factorial models and refiltering for speech separation and denoising," in Proc. Eurospeech'03, 2003, pp. 1009-1012.
-
(2003)
Proc. Eurospeech'03
, pp. 1009-1012
-
-
Roweis, S.T.1
-
42
-
-
85009180557
-
A harmonic-model-based front end for robust speech recognition
-
M. L. Seltzer, J. Droppo, and A. Acero, "A harmonic-model-based front end for robust speech recognition," in Proc. Eurospeech'03, 2003, pp. 1277-1280.
-
(2003)
Proc. Eurospeech'03
, pp. 1277-1280
-
-
Seltzer, M.L.1
Droppo, J.2
Acero, A.3
-
43
-
-
9644309702
-
Discriminant training of front-end and acoustic modeling stages to heterogeneous acoustic environments for multi-stream automatic speech recognition
-
Ph.D. dissertation, Univ. California, Berkeley
-
M. L. Shire, "Discriminant training of front-end and acoustic modeling stages to heterogeneous acoustic environments for multi-stream automatic speech recognition," Ph.D. dissertation, Univ. California, Berkeley, 2000.
-
(2000)
-
-
Shire, M.L.1
-
44
-
-
33750311718
-
Binary and ratio time-frequency masks for robust speech recognition
-
S. Srinivasan, N. Roman, and D. L. Wang, "Binary and ratio time-frequency masks for robust speech recognition," Speech Commun., vol. 48, no. 11, pp. 1486-1501, 2006.
-
(2006)
Speech Commun
, vol.48
, Issue.11
, pp. 1486-1501
-
-
Srinivasan, S.1
Roman, N.2
Wang, D.L.3
-
45
-
-
33947644911
-
A supervised learning approach to uncertainty decoding for robust speech recognition
-
S. Srinivasan and D. L. Wang, "A supervised learning approach to uncertainty decoding for robust speech recognition," in Proc. ICASSP'06, 2006, vol. I, pp. 297-300.
-
(2006)
Proc. ICASSP'06
, vol.1
, pp. 297-300
-
-
Srinivasan, S.1
Wang, D.L.2
-
46
-
-
33750376174
-
Model-based feature enhancement with uncertainty decoding for noise robust ASR
-
V. Stouten, H. V. Hamme, and P. Wambacq, "Model-based feature enhancement with uncertainty decoding for noise robust ASR," Speech Commun., vol. 48, no. 11, pp. 1502-1514, 2006.
-
(2006)
Speech Commun
, vol.48
, Issue.11
, pp. 1502-1514
-
-
Stouten, V.1
Hamme, H.V.2
Wambacq, P.3
-
47
-
-
0025681008
-
Hidden Markov model decomposition of speech and noise
-
A. P. Varga and R. K. Moore, "Hidden Markov model decomposition of speech and noise," in Proc. ICASSP'90, 1990, pp. 845-848.
-
(1990)
Proc. ICASSP'90
, pp. 845-848
-
-
Varga, A.P.1
Moore, R.K.2
-
48
-
-
64249140840
-
-
A. P. Varga, H. J. M. Steeneken, M. Tomlinson, and D. Jones, The NOISEX-92 study on the effect of additive noise on automatic speech recognition, Speech Res. Unit, Def. Res. Agency, Malvern, U.K., 1992.
-
A. P. Varga, H. J. M. Steeneken, M. Tomlinson, and D. Jones, "The NOISEX-92 study on the effect of additive noise on automatic speech recognition," Speech Res. Unit, Def. Res. Agency, Malvern, U.K., 1992.
-
-
-
-
49
-
-
84892233308
-
On ideal binary mask as the computational goal of auditory scene analysis
-
P. Divenyi, Ed. Norwell, MA: Kluwer
-
D. L. Wang, "On ideal binary mask as the computational goal of auditory scene analysis," in Speech Separation by Humans and Machines, P. Divenyi, Ed. Norwell, MA: Kluwer, 2005, pp. 181-197.
-
(2005)
Speech Separation by Humans and Machines
, pp. 181-197
-
-
Wang, D.L.1
-
50
-
-
14544300108
-
How to pretend that correlated variables are independent by using difference observations
-
C. K. I. Williams, "How to pretend that correlated variables are independent by using difference observations," Neural Comput., vol. 17, pp. 1-6, 2005.
-
(2005)
Neural Comput
, vol.17
, pp. 1-6
-
-
Williams, C.K.I.1
-
51
-
-
31844435714
-
Incomplete-data classification using logistic regression
-
L. D. Raedt and S. Wrobel, Eds
-
D. Williams, X. Liao, Y. Xue, and L. Carin, "Incomplete-data classification using logistic regression," in Proc. 22nd Int. Mach. Learning Conf., L. D. Raedt and S. Wrobel, Eds., 2005, pp. 972-979.
-
(2005)
Proc. 22nd Int. Mach. Learning Conf
, pp. 972-979
-
-
-
52
-
-
44949171870
-
Vector Taylor series based joint uncertainty decoding
-
H. Xu, L. Rigazio, and D. Kryze, "Vector Taylor series based joint uncertainty decoding," in Proc. Interspeech'06, 2006, pp. 1125-1128.
-
(2006)
Proc. Interspeech'06
, pp. 1125-1128
-
-
Xu, H.1
Rigazio, L.2
Kryze, D.3
-
53
-
-
33646768933
-
Static and dynamic spectral features: Their noise robustness and optimal weights
-
C. Yang, F. K. Soong, and T. Lee, "Static and dynamic spectral features: Their noise robustness and optimal weights," in Proc. ICASSP'05, 2005, vol. I, pp. 241-244.
-
(2005)
Proc. ICASSP'05
, vol.1
, pp. 241-244
-
-
Yang, C.1
Soong, F.K.2
Lee, T.3
-
54
-
-
3142694930
-
Blind separation of speech mixtures via time-frequency masking
-
O. Yilmaz and S. Rickard, "Blind separation of speech mixtures via time-frequency masking," IEEE Trans. Signal Process., vol. 52, pp. 1830-1847, 2004.
-
(2004)
IEEE Trans. Signal Process
, vol.52
, pp. 1830-1847
-
-
Yilmaz, O.1
Rickard, S.2
-
55
-
-
64249091246
-
-
S. Young, D. Kershaw, J. Odell, V. Valtchev, and P. Woodland, The HTK Book for HTK Version 3.0, Redmond, WA: Microsoft Corp, 2000
-
S. Young, D. Kershaw, J. Odell, V. Valtchev, and P. Woodland, The HTK Book (for HTK Version 3.0). Redmond, WA: Microsoft Corp., 2000.
-
-
-