SCOPUS Information Retrieval Platform

Computer Speech and Language

Volume 27, Issue 3, 2013, Pages 874-894

Uncertainty-based learning of acoustic models from noisy data

(3) Ozerov, Alexey a; Lagrange, Mathieu b; Vincent, Emmanuel c

a Technicolor Research and Innovation ^* (France)

b CNRS (France)

c INRIA (France)

Author keywords

Acoustic model; Classification; Expectation maximization; Gaussian mixture model; Hidden Markov model; Noisy data; Training; Uncertainty

Indexed keywords

AUDIO ACOUSTICS; CLASSIFICATION (OF INFORMATION); COMMUNICATION CHANNELS (INFORMATION THEORY); DECODING; GAUSSIAN DISTRIBUTION; HIDDEN MARKOV MODELS; IMAGE SEGMENTATION; MARKOV PROCESSES; MAXIMUM LIKELIHOOD; MAXIMUM LIKELIHOOD ESTIMATION; MAXIMUM PRINCIPLE; OBJECT RECOGNITION; PERSONNEL TRAINING; TRELLIS CODES; UNCERTAINTY ANALYSIS;

ACOUSTIC MODEL; EXPECTATION - MAXIMIZATIONS; GAUSSIAN MIXTURE MODEL; NOISY DATA; UNCERTAINTY;

SPEECH RECOGNITION;

EID: 84905275072; PISSN: 08852308; EISSN: 10958363; Source Type: Journal
DOI: 10.1016/j.csl.2012.07.002; Document Type: Article

Times cited : (21)

References (40)

1
- 84863754643
- An uncertainty estimation approach for the extraction of individual source features in multisource recordings
- K. Adiloğlu, and E. Vincent An uncertainty estimation approach for the extraction of individual source features in multisource recordings EUSIPCO, 19th European Signal Processing Conference 2011 1663 1667
- (2011) EUSIPCO, 19th European Signal Processing Conference , pp. 1663-1667
- Adiloğlu, K.¹ Vincent, E.²

2
- 84858069176
- A tractable framework for estimating and combining spectral source models for audio source separation
- S. Arberet, A. Ozerov, F. Bimbot, and R. Gribonval A tractable framework for estimating and combining spectral source models for audio source separation Signal Processing 92 8 2012 1886 1901
- (2012) Signal Processing , vol.92 , Issue.8 , pp. 1886-1901
- Arberet, S.¹ Ozerov, A.² Bimbot, F.³ Gribonval, R.⁴

3
- 79959819066
- Ph.D. thesis, Technical University Berlin
- Astudillo, R.F., 2010. Integration of short-time Fourier domain speech enhancement and observation uncertainty techniques for robust automatic speech recognition. Ph.D. thesis, Technical University Berlin.
- (2010) Integration of Short-time Fourier Domain Speech Enhancement and Observation Uncertainty Techniques for Robust Automatic Speech Recognition
- Astudillo, R.F.¹

4
- 84878543263
- The PASCAL CHiME Speech Separation and Recognition Challenge
- J. Barker, E. Vincent, N. Ma, H. Christensen, and P. Green "The PASCAL CHiME Speech Separation and Recognition Challenge" Computer Speech and Language 27 3 2013 621 633
- (2013) Computer Speech and Language , vol.27 , Issue.3 , pp. 621-633
- Barker, J.¹ Vincent, E.² Ma, N.³ Christensen, H.⁴ Green, P.⁵

5
- 11144316019
- Decoding speech in the presence of other sources
- J.P. Barker, M.P. Cooke, and D.P.W. Ellis Decoding speech in the presence of other sources Speech Communication 45 1 2005 5 25
- (2005) Speech Communication , vol.45 , Issue.1 , pp. 5-25
- Barker, J.P.¹ Cooke, M.P.² Ellis, D.P.W.³

6
- 33846516584
- Springer
- C.M. Bishop Pattern Recognition and Machine Learning 2006 Springer
- (2006) Pattern Recognition and Machine Learning
- Bishop, C.M.¹

7
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- M. Cooke Robust automatic speech recognition with missing and unreliable acoustic data Speech Communication 34 3 2001, June 267 285
- (2001) Speech Communication , vol.34 , Issue.3 , pp. 267-285
- Cooke, M.¹

8
- 84873898784
- Speech recognition in the presence of highly non-stationary noise based on spatial, spectral and temporal speech/noise modeling combined with dynamic variance adaptation
- M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, A. Ogawa, T. Hori, S. Watanabe, M. Fujimoto, T. Yoshioka, T. Oba, Y. Kubo, M. Souden, S.-J. Hahm, and A. Nakamura Speech recognition in the presence of highly non-stationary noise based on spatial, spectral and temporal speech/noise modeling combined with dynamic variance adaptation Proc. 1st Int. Workshop on Machine Listening in Multisource Environments (CHiME) 2011 12 17
- (2011) Proc. 1st Int. Workshop on Machine Listening in Multisource Environments (CHiME) , pp. 12-17
- Delcroix, M.¹ Kinoshita, K.² Nakatani, T.³ Araki, S.⁴ Ogawa, A.⁵ Hori, T.⁶ Watanabe, S.⁷ Fujimoto, M.⁸ Yoshioka, T.⁹ Oba, T.¹⁰ Kubo, Y.¹¹ Souden, M.¹² Hahm, S.-J.¹³ Nakamura, A.¹⁴

9
- 70350450398
- Static and dynamic variance compensation for recognition of reverberant speech with dereverberation preprocessing
- M. Delcroix, T. Nakatani, and S. Watanabe Static and dynamic variance compensation for recognition of reverberant speech with dereverberation preprocessing IEEE Transactions on Audio, Speech and Language Processing 17 2 2009 324 334
- (2009) IEEE Transactions on Audio, Speech and Language Processing , vol.17 , Issue.2 , pp. 324-334
- Delcroix, M.¹ Nakatani, T.² Watanabe, S.³

10
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A.P. Dempster, N.M. Laird, and D.B. Rubin Maximum likelihood from incomplete data via the EM algorithm Journal of the Royal Statistical Society. Series B (Methodological) 39 1977 1 38
- (1977) Journal of the Royal Statistical Society. Series B (Methodological) , vol.39 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

11
- 85009070292
- Large vocabulary speech recognition under adverse acoustic environments
- L. Deng, A. Acero, M. Plumpe, and X. Huang Large vocabulary speech recognition under adverse acoustic environments Proc. 6th Int. Conf. on Spoken Language Processing (ICSLP) 2000 806 809
- (2000) Proc. 6th Int. Conf. on Spoken Language Processing (ICSLP) , pp. 806-809
- Deng, L.¹ Acero, A.² Plumpe, M.³ Huang, X.⁴

12
- 18744401086
- Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion
- L. Deng, J. Droppo, and A. Acero Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion IEEE Transactions on Speech and Audio Processing 13 3 2005 412 421
- (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.3 , pp. 412-421
- Deng, L.¹ Droppo, J.² Acero, A.³

13
- 84901773892
- Environmental robustness
- J. Benesty, M.M. Sondhi, Y. Huang, Springer
- J. Droppo, and A. Acero Environmental robustness J. Benesty, M.M. Sondhi, Y. Huang, Handbook of Speech Processing 2008 Springer pp. 653-680
- (2008) Handbook of Speech Processing , pp. 653-680
- Droppo, J.¹ Acero, A.²

14
- 84948598244
- Statistical-model-based speech enhancement systems
- Y. Ephraim Statistical-model-based speech enhancement systems Proceedings of the IEEE 80 10 1992 1526 1555
- (1992) Proceedings of the IEEE , vol.80 , Issue.10 , pp. 1526-1555
- Ephraim, Y.¹

15
- 0030677463
- Broadband beamforming with adaptive postfiltering for speech acquisition in noisy environments
- S. Fischer, and K.-D. Kammeyer Broadband beamforming with adaptive postfiltering for speech acquisition in noisy environments IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP'97) 1997 359 362 Vol. 1
- (1997) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP'97) , vol.1 , pp. 359-362
- Fischer, S.¹ Kammeyer, K.-D.²

16
- 0003671941
- September Ph.D. thesis, University of Cambridge, UK
- Gales, M.J.F., September 1995. Model-based techniques for noise robust speech recognition. Ph.D. thesis, University of Cambridge, UK.
- (1995) Model-based Techniques for Noise Robust Speech Recognition
- Gales, M.J.F.¹

17
- 84893675167
- Model-based approaches to handling uncertainty
- D. Kolossa, R. Haeb-Umbach, Springer Berlin, Germany
- M.J.F. Gales Model-based approaches to handling uncertainty D. Kolossa, R. Haeb-Umbach, Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications 2011 Springer Berlin, Germany pp. 101-125
- (2011) Robust Speech Recognition of Uncertain or Missing Data - Theory and Applications , pp. 101-125
- Gales, M.J.F.¹

18
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- J.-L. Gauvain, and C.-H. Lee Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains IEEE Transactions on Speech and Audio Processing 2 2 1994 291 298
- (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.-L.¹ Lee, C.-H.²

19
- 0001551844
- Supervised learning from incomplete data via an EM approach
- Z. Ghahramani, and M. Jordan Supervised learning from incomplete data via an EM approach Advances in Neural Information Processing Systems 1994 120 127
- (1994) Advances in Neural Information Processing Systems , pp. 120-127
- Ghahramani, Z.¹ Jordan, M.²

20
- 0010974144
- 2nd ed. American Mathematical Society Providence, RI
- C.M. Grinstead, and J.L. Snell Introduction to Probability 2nd ed. 1997 American Mathematical Society Providence, RI
- (1997) Introduction to Probability
- Grinstead, C.M.¹ Snell, J.L.²

21
- 84890527807
- CHiME challenge: Approaches to robustness using beamforming and uncertainty-of-observation techniques
- D. Kolossa, R.F. Astudillo, A. Abad, S. Zeiler, R. Saeidi, P. Mowlaee, J. da Silva Neto, and R. Martin CHiME challenge: approaches to robustness using beamforming and uncertainty-of-observation techniques Proc. 1st Int. Workshop on Machine Listening in Multisource Environments (CHiME) 2011 6 11
- (2011) Proc. 1st Int. Workshop on Machine Listening in Multisource Environments (CHiME) , pp. 6-11
- Kolossa, D.¹ Astudillo, R.F.² Abad, A.³ Zeiler, S.⁴ Saeidi, R.⁵ Mowlaee, P.⁶ Da Silva Neto, J.⁷ Martin, R.⁸

22
- 77954583785
- Independent component analysis and time-frequency masking for speech recognition in multitalker conditions
- D. Kolossa, R.F. Astudillo, E. Hoffmann, and R. Orglmeister Independent component analysis and time-frequency masking for speech recognition in multitalker conditions EURASIP Journal on Audio, Speech, and Music Processing 2010 2010 1 14
- (2010) EURASIP Journal on Audio, Speech, and Music Processing , vol.2010 , pp. 1-14
- Kolossa, D.¹ Astudillo, R.F.² Hoffmann, E.³ Orglmeister, R.⁴

23
- 0009623939
- Flexible speaker adaptation using maximum likelihood linear regression
- C. Leggetter, and P. Woodland Flexible speaker adaptation using maximum likelihood linear regression ARPA Spoken Lang. Technol. Workshop 1995 104 109
- (1995) ARPA Spoken Lang. Technol. Workshop , pp. 104-109
- Leggetter, C.¹ Woodland, P.²

24
- 34547528168
- Adaptive training with joint uncertainty decoding for robust recognition of noisy data
- H. Liao, and M.J.F. Gales Adaptive training with joint uncertainty decoding for robust recognition of noisy data Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP'07) 2007 389 392 Vol. 4
- (2007) Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP'07) , vol.4 , pp. 389-392
- Liao, H.¹ Gales, M.J.F.²

25
- 0029725301
- A vector Taylor series approach for environment-independent speech recognition
- P.J. Moreno, B. Raj, and R.M. Stern A vector Taylor series approach for environment-independent speech recognition IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP'96) 1996 733 736 Vol. 2
- (1996) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP'96) , vol.2 , pp. 733-736
- Moreno, P.J.¹ Raj, B.² Stern, R.M.³

26
- 80052714543
- A unifying view on dataset shift in classification
- J.G. Moreno-Torres, T. Raeder, R. Alaiz-Rodríguez, N.V. Chawla, and F. Herrera A unifying view on dataset shift in classification Pattern Recognition 45 1 January 2012 521 530
- (2012) Pattern Recognition , vol.45 , Issue.1 , pp. 521-530
- Moreno-Torres, J.G.¹ Raeder, T.² Alaiz-Rodríguez, R.³ Chawla, N.V.⁴ Herrera, F.⁵

27
- 0031221099
- Filtering time sequences of spectral parameters for speech recognition
- C. Nadeu, P. Pachès-Leal, and B.-H. Juang Filtering time sequences of spectral parameters for speech recognition Speech Communication 22 1997 315 332
- (1997) Speech Communication , vol.22 , pp. 315-332
- Nadeu, C.¹ Pachès-Leal, P.² Juang, B.-H.³

28
- 84873420347
- GMM-based classification from noisy features
- Florence, Italy
- A. Ozerov, M. Lagrange, and E. Vincent GMM-based classification from noisy features Proc. 1st Int. Workshop on Machine Listening in Multisource Environments (CHiME) Florence, Italy September 2011 pp. 30-35
- (2011) Proc. 1st Int. Workshop on Machine Listening in Multisource Environments (CHiME) , pp. 30-35
- Ozerov, A.¹ Lagrange, M.² Vincent, E.³

29
- 84940453987
- Ozerov, A., Lagrange, M., Vincent, E., 2012. Acoustic Model Uncertainty Learning Experimental Toolbox (AMULET), http://bass-db.gforge.inria.fr/amulet/.
- (2012) Acoustic Model Uncertainty Learning Experimental Toolbox (AMULET)
- Ozerov, A.¹ Lagrange, M.² Vincent, E.³

30
- 51449094735
- Adaptation of Bayesian models for single-channel source separation and its application to voice/music separation in popular songs
- A. Ozerov, P. Philippe, F. Bimbot, and R. Gribonval Adaptation of Bayesian models for single-channel source separation and its application to voice/music separation in popular songs IEEE Transactions on Audio, Speech and Language Processing 15 5 2007 1564 1578
- (2007) IEEE Transactions on Audio, Speech and Language Processing , vol.15 , Issue.5 , pp. 1564-1578
- Ozerov, A.¹ Philippe, P.² Bimbot, F.³ Gribonval, R.⁴

31
- 84866037355
- Using the FASST source separation toolbox for noise robust speech recognition
- Florence, Italy
- A. Ozerov, and E. Vincent Using the FASST source separation toolbox for noise robust speech recognition Proc. 1st Int. Workshop on Machine Listening in Multisource Environments (CHiME) Florence, Italy September 2011 pp. 86-87
- (2011) Proc. 1st Int. Workshop on Machine Listening in Multisource Environments (CHiME) , pp. 86-87
- Ozerov, A.¹ Vincent, E.²

32
- 84897584695
- A general flexible framework for the handling of prior information in audio source separation
- A. Ozerov, E. Vincent, and F. Bimbot A general flexible framework for the handling of prior information in audio source separation IEEE Transactions on Audio, Speech and Language Processing 20 4 2012 1118 1133
- (2012) IEEE Transactions on Audio, Speech and Language Processing , vol.20 , Issue.4 , pp. 1118-1133
- Ozerov, A.¹ Vincent, E.² Bimbot, F.³

33
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- L. Rabiner A tutorial on hidden Markov models and selected applications in speech recognition Proceedings of the IEEE 77 2 1989 257 286
- (1989) Proceedings of the IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.¹

34
- 0029277506
- Large population speaker identification using clean and telephone speech
- D. Reynolds Large population speaker identification using clean and telephone speech IEEE Signal Processing Letters 2 3 1995 46 48
- (1995) IEEE Signal Processing Letters , vol.2 , Issue.3 , pp. 46-48
- Reynolds, D.¹

35
- 27544443176
- Accounting for probe-level noise in principal component analysis of microarray data
- G. Sanguinetti, M. Milo, M. Rattray, and N.D. Lawrence Accounting for probe-level noise in principal component analysis of microarray data Bioinformatics 21 19 2005 3748 3754
- (2005) Bioinformatics , vol.21 , Issue.19 , pp. 3748-3754
- Sanguinetti, G.¹ Milo, M.² Rattray, M.³ Lawrence, N.D.⁴

36
- 69249159165
- A computational auditory scene analysis system for speech segregation and robust speech recognition
- Y. Shao, S. Srinivasan, Z. Jin, and D. Wang A computational auditory scene analysis system for speech segregation and robust speech recognition Computer Speech & Language 24 1 2010 77 93
- (2010) Computer Speech & Language , vol.24 , Issue.1 , pp. 77-93
- Shao, Y.¹ Srinivasan, S.² Jin, Z.³ Wang, D.⁴

37
- 0004082513
- Tech. rep., Interval Research Corporation
- Slaney, M., 1998. Auditory toolbox version 2. Tech. rep., Interval Research Corporation.
- (1998) Auditory Toolbox Version 2
- Slaney, M.¹

38
- 56249136428
- Transforming binary uncertainties for robust speech recognition
- S. Srinivasan, and D. Wang Transforming binary uncertainties for robust speech recognition IEEE Transactions on Audio, Speech and Language Processing 15 7 2007 2130 2140
- (2007) IEEE Transactions on Audio, Speech and Language Processing , vol.15 , Issue.7 , pp. 2130-2140
- Srinivasan, S.¹ Wang, D.²

39
- 84858069855
- The signal separation evaluation campaign (2007-2010): Achievements and remaining challenges
- E. Vincent, S. Araki, F.J. Theis, G. Nolte, P. Bofill, H. Sawada, A. Ozerov, B.V. Gowreesunker, D. Lutter, and N.Q.K. Duong The signal separation evaluation campaign (2007-2010): achievements and remaining challenges Signal Processing 92 8 2012 1928 1936
- (2012) Signal Processing , vol.92 , Issue.8 , pp. 1928-1936
- Vincent, E.¹ Araki, S.² Theis, F.J.³ Nolte, G.⁴ Bofill, P.⁵ Sawada, H.⁶ Ozerov, A.⁷ Gowreesunker, B.V.⁸ Lutter, D.⁹ Duong, N.Q.K.¹⁰

40
- 0003528051
- 2nd ed. Wiley-Interscience New York
- R.F. Woolson, and W.R. Clarke Statistical Methods for the Analysis of Biomedical Data 2nd ed. 2002 Wiley-Interscience New York
- (2002) Statistical Methods for the Analysis of Biomedical Data
- Woolson, R.F.¹ Clarke, W.R.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.