SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 15, Issue 1, 2007, Pages 235-245

Training wideband acoustic models using mixed-bandwidth training data for speech recognition

(2) Seltzer, Michael L. (a); Acero, Alex (a)

(a) MICROSOFT RESEARCH (United States)

Author keywords

Acoustic modeling; Bandwidth extension; Hidden Markov models (HMMs); Speech recognition; Telephone speech

Indexed keywords

ACOUSTIC MODELING; ACOUSTIC MODELS; BANDWIDTH EXTENSION; EXPECTATION-MAXIMIZATION ALGORITHMS; HIDDEN MARKOV MODELS (HMMS); NARROW BANDS; RECOGNITION ACCURACIES; RECOGNITION SYSTEMS; SPEECH RECOGNIZERS; SUB-OPTIMAL PERFORMANCE; TELEPHONE BANDWIDTHS; TELEPHONE SPEECH; TRAINING ALGORITHMS; TRAINING DATUM; TRAINING SCHEMES; TRAINING STRATEGIES; WIDE BANDS; WIDEBAND SPEECH;

ACOUSTICS; BANDWIDTH; COMPUTATIONAL GRAMMARS; HIDDEN MARKOV MODELS; OBJECT RECOGNITION; SPEECH ANALYSIS; TELECOMMUNICATION SYSTEMS; TELEPHONE; TELEPHONE SETS; WAVELET TRANSFORMS;

SPEECH RECOGNITION;

EID: 64149084747 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2006.876774 Document Type: Article

Times cited : (18)

References (21)

1
- 0141856326
- Meetings about meetings: Research at ICSI on speech in multiparty conversations
- Hong Kong, China, Apr
- N. Morgan, D. Baron, S. Bhagat, H. Carvey, R. Dhillon, J. Edwards, D. Gelbart, A. Janin, A. Krupski, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, and C. Wooters, "Meetings about meetings: research at ICSI on speech in multiparty conversations," in Proc. ICASSP, Hong Kong, China, Apr. 2003, vol. 4, pp. 740-743.
- (2003) Proc. ICASSP , vol.4 , pp. 740-743
- Morgan, N.¹ Baron, D.² Bhagat, S.³ Carvey, H.⁴ Dhillon, R.⁵ Edwards, J.⁶ Gelbart, D.⁷ Janin, A.⁸ Krupski, A.⁹ Peskin, B.¹⁰ Pfau, T.¹¹ Shriberg, E.¹² Stolcke, A.¹³ Wooters, C.¹⁴

2
- 33745525361
- The rich transcription 2004 Spring meeting recognition evaluation
- Montreal, QC, Canada, May
- J. S. Garofolo, C. D. Laprun, and J. G. Fiscus, "The rich transcription 2004 Spring meeting recognition evaluation," in Proc. NIST RT04 Meeting Recognition Workshop, Montreal, QC, Canada, May 2004.
- (2004) Proc. NIST RT04 Meeting Recognition Workshop
- Garofolo, J.S.¹ Laprun, C.D.² Fiscus, J.G.³

3
- 85079086476
- Sources of degradation of speech recognition in the telephone network
- Adelaide, Australia, Apr
- P. Moreno and R. M. Stern, "Sources of degradation of speech recognition in the telephone network," in Proc. ICASSP, Adelaide, Australia, Apr. 1994, vol. I, pp. 109-112.
- (1994) Proc. ICASSP , vol.1 , pp. 109-112
- Moreno, P.¹ Stern, R.M.²

4
- 0001551844
- Supervised learning from incomplete data via an EM approach
- Z. Ghahramani and M. I. Jordan, "Supervised learning from incomplete data via an EM approach," Adv. Neural Inf. Proc. Sys., pp. 120-127, 1994.
- (1994) Adv. Neural Inf. Proc. Sys , pp. 120-127
- Ghahramani, Z.¹ Jordan, M.I.²

5
- 4644336054
- Reconstruction of damaged spectrographic features for robust speech recognition
- Sep
- B. Raj, M. L. Seltzer, and R. M. Stern, "Reconstruction of damaged spectrographic features for robust speech recognition," Speech Commun., vol. 43, no. 4, pp. 275-296, Sep. 2004.
- (2004) Speech Commun , vol.43 , Issue.4 , pp. 275-296
- Raj, B.¹ Seltzer, M.L.² Stern, R.M.³

6
- 0035342414
- Robust automatic speech recognition with missing and unreliable acoustic data
- Jun
- M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data," Speech Commun., vol. 34, no. 3, pp. 267-285, Jun. 2001.
- (2001) Speech Commun , vol.34 , Issue.3 , pp. 267-285
- Cooke, M.¹ Green, P.² Josifovski, L.³ Vizinho, A.⁴

7
- 4644317224
- Classifier-based mask estimation for missing feature methods of robust speech recognition
- Sep
- M. L. Seltzer, B. Raj, and R. M. Stern, "Classifier-based mask estimation for missing feature methods of robust speech recognition," Speech Commun., vol. 43, no. 4, pp. 379-393, Sep. 2004.
- (2004) Speech Commun , vol.43 , Issue.4 , pp. 379-393
- Seltzer, M.L.¹ Raj, B.² Stern, R.M.³

8
- 0028516117
- Training issues and channel equalization techniques for the construction of telephone acoustic models using a high-quality speech corpus
- Oct
- L. G. Neumeyer, V. V. Digalakis, and M. Weintraub, "Training issues and channel equalization techniques for the construction of telephone acoustic models using a high-quality speech corpus," IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 590-597, Oct. 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.4 , pp. 590-597
- Neumeyer, L.G.¹ Digalakis, V.V.² Weintraub, M.³

9
- 0028517647
- Statistical recovery of wideband speech from narrowband speech
- Oct
- Y. M. Cheng, D. O'Shaughnessy, and P. Mermelstein, "Statistical recovery of wideband speech from narrowband speech," IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 544-548, Oct. 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.4 , pp. 544-548
- Cheng, Y.M.¹ O'Shaughnessy, D.² Mermelstein, P.³

10
- 0033692729
- Narrowband to wideband conversion of speech using GMM based transformation
- Istanbul, Turkey, Jun
- K.-Y. Park and H. S. Kim, "Narrowband to wideband conversion of speech using GMM based transformation," in Proc. ICASSP, Istanbul, Turkey, Jun. 2000, vol. 3, pp. 1843-1846.
- (2000) Proc. ICASSP , vol.3 , pp. 1843-1846
- Park, K.-Y.¹ Kim, H.S.²

11
- 84951992170
- Wideband extension of telephone speech using a hidden Markov model
- Delavan, WI, Sep
- P. Jax and P. Vary, "Wideband extension of telephone speech using a hidden Markov model," in IEEE Workshop on Speech Coding, Delavan, WI, Sep. 2000, pp. 133-135.
- (2000) IEEE Workshop on Speech Coding , pp. 133-135
- Jax, P.¹ Vary, P.²

12
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statistical Soc., vol. 39, no. 1, pp. 1-38, 1977.
- (1977) J. R. Statistical Soc , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

13
- 0003823976
- Englewood Cliffs, New Jersey: Prentice-Hall
- J. M. Mendel, Lessons in Estimation Theory for Signal Processing, Communications, and Control. Englewood Cliffs, New Jersey: Prentice-Hall, 1995.
- (1995) Lessons in Estimation Theory for Signal Processing, Communications, and Control
- Mendel, J.M.¹

14
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Feb
- L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
- (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

15
- 0029726509
- Improving environmental robustness in large-vocabulary speech recognition
- Atlanta, GA, May
- P. C. Woodland, M. J. F. Gales, and D. Pye, "Improving environmental robustness in large-vocabulary speech recognition," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, Atlanta, GA, May 1996, vol. 1, pp. 65-69.
- (1996) Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing , vol.1 , pp. 65-69
- Woodland, P.C.¹ Gales, M.J.F.² Pye, D.³

16
- 0012330750
- The design of the Wall Street Journal-based CSR corpus
- Harriman, NY, Feb
- D. B. Paul and J. M. Baker, "The design of the Wall Street Journal-based CSR corpus," in Proc. ARPA Speech Nat. Lang. Workshop, Harriman, NY, Feb. 1992, pp. 357-362.
- (1992) Proc. ARPA Speech Nat. Lang. Workshop , pp. 357-362
- Paul, D.B.¹ Baker, J.M.²

17
- 64149129100
- S. Young, The HTK Hidden Markov Model Toolkit: Design and Philosophy, Cambridge Univ. Tech. Rep., Cambridge, U.K., 1994.
- S. Young, "The HTK Hidden Markov Model Toolkit: Design and Philosophy," Cambridge Univ. Tech. Rep., Cambridge, U.K., 1994.

18
- 33646785081
- Training wideband acoustic models using mixed-bandwidth training data via feature bandwidth extension
- Philadelphia, PA, Mar
- M. L. Seltzer and A. Acero, "Training wideband acoustic models using mixed-bandwidth training data via feature bandwidth extension," in Proc. ICASSP, Philadelphia, PA, Mar. 2005, vol. 1, pp. 921-924.
- (2005) Proc. ICASSP , vol.1 , pp. 921-924
- Seltzer, M.L.¹ Acero, A.²

19
- 64149086333
- C. J. Leggetter and P. C. Woodland, Speaker Adaptation of HMMs Using Linear Regression, Cambridge Univ., Cambridge, U.K., Tech. Rep. CUED/F-INFENG/TR. 181, Jun. 1994.
- C. J. Leggetter and P. C. Woodland, "Speaker Adaptation of HMMs Using Linear Regression," Cambridge Univ., Cambridge, U.K., Tech. Rep. CUED/F-INFENG/TR. 181, Jun. 1994.

20
- 85016587886
- SWITCHBOARD: Telephone speech corpus for research and development
- San Francisco, CA, Mar
- J. Godfrey, E. C. Holliman, and J. McDaniel, "SWITCHBOARD: telephone speech corpus for research and development," in Proc. ICASSP, San Francisco, CA, Mar. 1992, vol. 1, pp. 517-520.
- (1992) Proc. ICASSP , vol.1 , pp. 517-520
- Godfrey, J.¹ Holliman, E.C.² McDaniel, J.³

21
- 0003857778
- Univ. California, Berkeley, Berkeley, Tech. Rep. TR-97-021, Apr
- J. A. Bilmes, "A Gentle Tutorial of the EM Algorithm and its Applications to Parameter Estimation for Gaussian Mixture and Hidden Markov Models," Univ. California, Berkeley, Berkeley, Tech. Rep. TR-97-021, Apr. 1998.
- (1998) A Gentle Tutorial of the EM Algorithm and its Applications to Parameter Estimation for Gaussian Mixture and Hidden Markov Models
- Bilmes, J.A.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.