SCOPUS information retrieval platform

Volume 91, Issue 9, 2003, Pages 1272-1305

Interacting with computers by voice: Automatic speech recognition and synthesis

Author keywords

Continuous speech recognition; Distance measures; Hidden Markov models (HMMs); Human computer dialogues; Language models (LMs); Linear predictive coding (LPC); Spectral analysis; Speech synthesis; Text to speech (TTS)

Indexed keywords

ANIMATION; COMPUTER PROGRAMMING LANGUAGES; COMPUTER SIMULATION; CONTINUOUS SPEECH RECOGNITION; DATABASE SYSTEMS; DISTANCE MEASUREMENT; FOURIER TRANSFORMS; MARKOV PROCESSES; MATHEMATICAL MODELS; SPECTRUM ANALYSIS; SPEECH SYNTHESIS; TEXT PROCESSING;

AUTOMATIC SPEECH RECOGNITION; HIDDEN MARKOV MODELS (HMM); HUMAN-COMPUTER DIALOGUES; LANGUAGE MODELS (LM); LINEAR PREDICTIVE CODING (LPC); TEXT-TO-SPEECH (TTS);

HUMAN COMPUTER INTERACTION;

EID: 4944252269 PISSN: 00189219 EISSN: None Source Type: Journal
DOI: 10.1109/JPROC.2003.817117 Document Type: Conference Paper

Times cited : (88)

References (292)

1
- 0004319970
- Boston, MA: Kluwer
- A. Acero, Acoustical and Environmental Robustness in Automatic Speech Recognition. Boston, MA: Kluwer, 1993.
- (1993) Acoustical and Environmental Robustness in Automatic Speech Recognition
- Acero, A.¹

2
- 85009113852
- HMM adaptation using vector Taylor series for noisy speech recognition
- A. Acero, L. Deng, T. Kristjansson, and J. Zhang, "HMM adaptation using vector Taylor series for noisy speech recognition," in Proc. ICSLP, vol. 3, 2000, pp. 869-872.
- (2000) Proc. ICSLP , vol.3 , pp. 869-872
- Acero, A.¹ Deng, L.² Kristjansson, T.³ Zhang, J.⁴

3
- 85135194998
- Text normalization and speech recognition in French
- G. Adda, M. Adda-Decker, J.-L. Gauvin, and L. Lamel, "Text normalization and speech recognition in French," in Proc. Eurospeech, 1997, pp. 2711-2714.
- (1997) Proc. Eurospeech , pp. 2711-2714
- Adda, G.¹ Adda-Decker, M.² Gauvin, J.-L.³ Lamel, L.⁴

4
- 0006137783
- Toward synthesis of Hindi consonants using Klsyn88
- S. Agrawal and K. Stevens, "Toward synthesis of Hindi consonants using Klsyn88," in Proc. ICSLP, 1992, pp. 177-180.
- (1992) Proc. ICSLP , pp. 177-180
- Agrawal, S.¹ Stevens, K.²

5
- 0031177213
- Combined Bayesian and predictive techniques for rapid speaker adaptation of continuous density hidden Markov models
- S. Ahadi and P. Woodland, "Combined Bayesian and predictive techniques for rapid speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 11, pp. 187-206, 1997.
- (1997) Comput. Speech Lang. , vol.11 , pp. 187-206
- Ahadi, S.¹ Woodland, P.²

6
- 0030037151
- Cepstral representation of speech motivated by time-frequency masking: An application to speech recognition
- K. Aikawa, H. Singer, H. Kawahara, and Y. Tokhura, "Cepstral representation of speech motivated by time-frequency masking: An application to speech recognition," J. Acoust. Soc. Amer., vol. 100, pp. 603-614, 1996.
- (1996) J. Acoust. Soc. Amer. , vol.100 , pp. 603-614
- Aikawa, K.¹ Singer, H.² Kawahara, H.³ Tokhura, Y.⁴

7
- 0030369319
- Archisegment-based letter-to-phone conversion for concatenative speech synthesis in Portuguese
- E. Albano and A. Moreira, "Archisegment-based letter-to-phone conversion for concatenative speech synthesis in Portuguese," in Proc. ICSLP, 1996, pp. 1708-1711.
- (1996) Proc. ICSLP , pp. 1708-1711
- Albano, E.¹ Moreira, A.²

8
- 0011138907
- Overview of text-to-speech systems
- S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
- J. Allen, "Overview of text-to-speech systems," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 741-790.
- (1992) Advances in Speech Signal Processing , pp. 741-790
- Allen, J.¹

9
- 0028516073
- How do humans process and recognize speech?
- _, "How do humans process and recognize speech?," IEEE Trans. Speech Audio Processing, vol. 2, pp. 567-577, 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 567-577

10
- 0032762247
- Selective training for hidden Markov models with applications to speech coding
- Oct.
- L. Arslan and J. Hansen, "Selective training for hidden Markov models with applications to speech coding," IEEE Trans. Speech Audio Processing, vol. 7, pp. 46-54, Oct. 1999.
- (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 46-54
- Arslan, L.¹ Hansen, J.²

11
- 0032045825
- Phonemic transcription by analogy in text-to-speech synthesis: Novel word pronunciation and lexicon compression
- P. Bagshaw, "Phonemic transcription by analogy in text-to-speech synthesis: Novel word pronunciation and lexicon compression," Comput. Speech Lang., vol. 12, pp. 119-142, 1998.
- (1998) Comput. Speech Lang. , vol.12 , pp. 119-142
- Bagshaw, P.¹

12
- 0027683814
- A method for the construction of acoustic Markov models for words
- Oct.
- L. Bahl, P. Brown, P. de Souza, R. Mercer, and M. Picheny, "A method for the construction of acoustic Markov models for words," IEEE Trans. Speech Audio Processing, vol. 1, pp. 443-452, Oct. 1993.
- (1993) IEEE Trans. Speech Audio Processing , vol.1 , pp. 443-452
- Bahl, L.¹ Brown, P.² De Souza, P.³ Mercer, R.⁴ Picheny, M.⁵

13
- 0022890536
- Maximum mutual information estimation of hidden Markov model parameters for speech recognition
- L. Bahl, P. Brown, P. de Souza, and R. Mercer, "Maximum mutual information estimation of hidden Markov model parameters for speech recognition," Proc. IEEE ICASSP, pp. 49-52, 1986.
- (1986) Proc. IEEE ICASSP , pp. 49-52
- Bahl, L.¹ Brown, P.² De Souza, P.³ Mercer, R.⁴

14
- 0027623511
- Multonic Markov word models for large vocabulary continuous speech recognition
- July
- L. Bahl, J. Bellegarda, P. de Souza, P. Gopalakrishnan, D. Nahamoo, and M. Picheny, "Multonic Markov word models for large vocabulary continuous speech recognition," IEEE Trans. Speech Audio Processing, vol. 1, pp. 334-344, July 1993.
- (1993) IEEE Trans. Speech Audio Processing , vol.1 , pp. 334-344
- Bahl, L.¹ Bellegarda, J.² De Souza, P.³ Gopalakrishnan, P.⁴ Nahamoo, D.⁵ Picheny, M.⁶

15
- 0016615529
- Decoding for channels with insertions, deletions, and substitutions with applications to speech recognition
- July
- L. Bahl and F. Jelinek, "Decoding for channels with insertions, deletions, and substitutions with applications to speech recognition," IEEE Trans. Inform. Theory, vol. IT-21, pp. 404-411, July 1975.
- (1975) IEEE Trans. Inform. Theory , vol.IT-21 , pp. 404-411
- Bahl, L.¹ Jelinek, F.²

16
- 0020719320
- A maximum likelihood approach to continuous speech recognition
- Mar.
- L. Bahl, F. Jelinek, and R. Mercer, "A maximum likelihood approach to continuous speech recognition," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-5, pp. 179-190, Mar. 1983.
- (1983) IEEE Trans. Pattern Anal. Machine Intell. , vol.PAMI-5 , pp. 179-190
- Bahl, L.¹ Jelinek, F.² Mercer, R.³

17
- 0016663359
- The DRAGON system - An overview
- Feb.
- J. Baker, "The DRAGON system - an overview," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-23, pp. 24-29, Feb. 1975.
- (1975) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-23 , pp. 24-29
- Baker, J.¹

18
- 0001862769
- An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes
- L. E. Baum, "An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes," Inequalities, vol. 3, pp. 1-8, 1972.
- (1972) Inequalities , vol.3 , pp. 1-8
- Baum, L.E.¹

19
- 0040852401
- Rule-based grapheme-to-phoneme conversion of names
- K. Belhoula, "Rule-based grapheme-to-phoneme conversion of names," in Proc. Eurospeech, 1993, pp. 881-884.
- (1993) Proc. Eurospeech , pp. 881-884
- Belhoula, K.¹

20
- 84892176357
- Exploiting both local and global constraints for multi-span statistical language modeling
- J. Bellegarda, "Exploiting both local and global constraints for multi-span statistical language modeling," in Proc. IEEE ICASSP, 1998, pp. 677-680.
- (1998) Proc. IEEE ICASSP , pp. 677-680
- Bellegarda, J.¹

21
- 3643135324
- Amsterdam, The Netherlands: North-Holland
- C. Benoit, G. Bailly, and T. Sawallis, Eds., Talking Machines: Theories, Models and Applications. Amsterdam, The Netherlands: North-Holland, 1992.
- (1992) Talking Machines: Theories, Models and Applications
- Benoit, C.¹ Bailly, G.² Sawallis, T.³

22
- 84966366503
- Rapid unit selection from a large speech corpus for concatenative speech synthesis
- M. Beutnagel, M. Mohri, and M. Riley, "Rapid unit selection from a large speech corpus for concatenative speech synthesis," in Proc. Eurospeech, 1999, pp. 607-610.
- (1999) Proc. Eurospeech , pp. 607-610
- Beutnagel, M.¹ Mohri, M.² Riley, M.³

23
- 0027228898
- Multilingual PSOLA text-to-speech system
- D. Bigorgne et al., "Multilingual PSOLA text-to-speech system," in Proc. IEEE ICASSP, vol. 2, 1993, pp. 187-190.
- (1993) Proc. IEEE ICASSP , vol.2 , pp. 187-190
- Bigorgne, D.¹

24
- 0003132144
- Multilingual speech recognition: The 1996 Byblos Callhome system
- J. Billa, K. Ma, J. McDonough, G. Zavaliagkos, and D. Miller, "Multilingual speech recognition: The 1996 Byblos Callhome system," in Proc. Eurospeech, 1997, pp. 363-366.
- (1997) Proc. Eurospeech , pp. 363-366
- Billa, J.¹ Ma, K.² McDonough, J.³ Zavaliagkos, G.⁴ Miller, D.⁵

25
- 85133526552
- Automatically clustering similar units for unit selection in speech synthesis
- A. Black and P. Taylor, "Automatically clustering similar units for unit selection in speech synthesis," in Proc. Eurospeech, 1997, pp. 601-604.
- (1997) Proc. Eurospeech , pp. 601-604
- Black, A.¹ Taylor, P.²

26
- 0030142722
- Toward increasing speech recognition error rates
- H. Bourlard, H. Hermansky, and N. Morgan, "Toward increasing speech recognition error rates," Speech Commun., vol. 18, pp. 205-231, 1996.
- (1996) Speech Commun. , vol.18 , pp. 205-231
- Bourlard, H.¹ Hermansky, H.² Morgan, N.³

27
- 0022246330
- Speaker dependent connected speech recognition via phonemic Markov models
- H. Bourlard, Y. Kamp, and C. Wellekens, "Speaker dependent connected speech recognition via phonemic Markov models," in Proc. IEEE ICASSP, 1985, pp. 1213-1216.
- (1985) Proc. IEEE ICASSP , pp. 1213-1216
- Bourlard, H.¹ Kamp, Y.² Wellekens, C.³

28
- 0003802343
- Pacific Grove, CA: Wadsworth & Brooks
- L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees. Pacific Grove, CA: Wadsworth & Brooks, 1984.
- (1984) Classification and Regression Trees
- Breiman, L.¹ Friedman, J.² Olshen, R.³ Stone, C.⁴

29
- 85022919385
- Class-based n-gram models of natural language
- P. Brown, V. D. Pietra, P. deSouza, J. Lai, and R. Mercer, "Class-based n-gram models of natural language," Comput. Linguist., vol. 18, pp. 467-479, 1992.
- (1992) Comput. Linguist. , vol.18 , pp. 467-479
- Brown, P.¹ Pietra, V.D.² Desouza, P.³ Lai, J.⁴ Mercer, R.⁵

30
- 0031675455
- An algorithm for maximum likelihood estimation of hidden Markov models with unknown state-tying
- Jan.
- O. Cappé, C. Mokbel, D. Jouvet, and E. Moulines, "An algorithm for maximum likelihood estimation of hidden Markov models with unknown state-tying," IEEE Trans. Speech Audio Processing, vol. 6, pp. 61-70, Jan. 1998.
- (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 61-70
- Cappé, O.¹ Mokbel, C.² Jouvet, D.³ Moulines, E.⁴

31
- 84969173798
- Segmentation and modeling in segment-based recognition
- J. Chang and J. Glass, "Segmentation and modeling in segment-based recognition," in Proc. Eurospeech, 1997, pp. 1199-1202.
- (1997) Proc. Eurospeech , pp. 1199-1202
- Chang, J.¹ Glass, J.²

32
- 0033329799
- An empirical study of smoothing techniques for language modeling
- J. Chen and J. Goodman, "An empirical study of smoothing techniques for language modeling," Comput. Speech Lang., vol. 13, pp. 359-394, 1999.
- (1999) Comput. Speech Lang. , vol.13 , pp. 359-394
- Chen, J.¹ Goodman, J.²

33
- 0031146514
- HMM-based speech recognition using state-dependent, discriminatively derived transforms on mel-warped DFT features
- May
- R. Chengalvarayan and L. Deng, "HMM-based speech recognition using state-dependent, discriminatively derived transforms on mel-warped DFT features," IEEE Trans. Speech Audio Processing, vol. 5, pp. 243-256, May 1997.
- (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 243-256
- Chengalvarayan, R.¹ Deng, L.²

34
- 0022888128
- Stress assignment in letter-to-sound rules for speech synthesis
- K. Church, "Stress assignment in letter-to-sound rules for speech synthesis," in Proc. IEEE ICASSP, 1986, pp. 2423-2426.
- (1986) Proc. IEEE ICASSP , pp. 2423-2426
- Church, K.¹

35
- 21244486894
- Dordrecht, The Netherlands: Kluwer
- _, Parsing in Speech Recognition. Dordrecht, The Netherlands: Kluwer, 1987.
- (1987) Parsing in Speech Recognition

36
- 0032204117
- A novel feature transformation for vocal tract length normalization in automatic speech recognition
- Nov.
- T. Claes, J. Dologlou, L. ten Bosch, and D. van Compernolle, "A novel feature transformation for vocal tract length normalization in automatic speech recognition," IEEE Trans. Speech Audio Processing, vol. 6, pp. 549-557, Nov. 1998.
- (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 549-557
- Claes, T.¹ Dologlou, J.² Ten Bosch, L.³ Van Compernolle, D.⁴

37
- 0029230678
- The challenge of spoken language systems: Research directions for the nineties
- Jan.
- R. Cole et al., "The challenge of spoken language systems: Research directions for the nineties," IEEE Trans. Speech Audio Processing, vol. 3, pp. 1-21, Jan. 1995.
- (1995) IEEE Trans. Speech Audio Processing , vol.3 , pp. 1-21
- Cole, R.¹

38
- 0031161885
- A hybrid algorithm for speaker adaptation using MAP transformation and adaptation
- June
- J.-T. Chien, C.-H. Lee, and H.-C. Wang, "A hybrid algorithm for speaker adaptation using MAP transformation and adaptation," IEEE Signal Processing Lett., vol. 4, pp. 167-169, June 1997.
- (1997) IEEE Signal Processing Lett. , vol.4 , pp. 167-169
- Chien, J.-T.¹ Lee, C.-H.² Wang, H.-C.³

39
- 0025786649
- Voice quality factors: Analysis, synthesis & perception
- D. Childers and C. Lee, "Voice quality factors: Analysis, synthesis & perception," J. Acoust. Soc. Amer., vol. 90, pp. 2394-2410, 1991.
- (1991) J. Acoust. Soc. Amer. , vol.90 , pp. 2394-2410
- Childers, D.¹ Lee, C.²

40
- 0006132736
- A minimum error rate pattern recognition approach to speech recognition
- W. Chou, C.-H. Lee, B.-H. Juang, and F. Soong, "A minimum error rate pattern recognition approach to speech recognition," Int. J. Pattern Recognit. Artif. Intell., vol. 8, pp. 5-31, 1994.
- (1994) Int. J. Pattern Recognit. Artif. Intell. , vol.8 , pp. 5-31
- Chou, W.¹ Lee, C.-H.² Juang, B.-H.³ Soong, F.⁴

41
- 0000767590
- Discriminant-function-based minimum recognition error rate pattern-recognition approach to speech recognition
- Aug.
- W. Chou, B. Juang, and C.-H. Lee, "Discriminant-function-based minimum recognition error rate pattern-recognition approach to speech recognition," Proc. IEEE, vol. 88, pp. 1201-1223, Aug. 2000.
- (2000) Proc. IEEE , vol.88 , pp. 1201-1223
- Chou, W.¹ Juang, B.² Lee, C.-H.³

42
- 0017620899
- Detecting and locating key words in continuous speech using linear predictive coding
- Oct.
- R. Christiansen and C. Rushforth, "Detecting and locating key words in continuous speech using linear predictive coding," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-25, pp. 361-367, Oct. 1977.
- (1977) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-25 , pp. 361-367
- Christiansen, R.¹ Rushforth, C.²

43
- 0016940126
- A model of articulatory dynamics and control
- Apr.
- C. Coker, "A model of articulatory dynamics and control," Proc. IEEE, vol. 64, pp. 452-460, Apr. 1976.
- (1976) Proc. IEEE , vol.64 , pp. 452-460
- Coker, C.¹

44
- 0024392496
- Application of an auditory model to speech recognition
- J. R. Cohen, "Application of an auditory model to speech recognition," J. Acoust. Soc. Amer., vol. 85, no. 6, pp. 2623-2629, 1989.
- (1989) J. Acoust. Soc. Amer. , vol.85 , Issue.6 , pp. 2623-2629
- Cohen, J.R.¹

45
- 0003733873
- Englewood Cliffs, NJ: Prentice-Hall
- L. Cohen, Time-Frequency Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1995.
- (1995) Time-frequency Analysis
- Cohen, L.¹

46
- 0030671924
- Missing data techniques for robust speech recognition
- M. Cooke, A. Morris, and P. Green, "Missing data techniques for robust speech recognition," in Proc. IEEE ICASSP, 1997, pp. 863-866.
- (1997) Proc. IEEE ICASSP , pp. 863-866
- Cooke, M.¹ Morris, A.² Green, P.³

47
- 0031222490
- MMIE training of large vocabulary recognition systems
- H. Cung and Y. Normandin, "MMIE training of large vocabulary recognition systems," Speech Commun., vol. 22, pp. 303-314, 1997.
- (1997) Speech Commun. , vol.22 , pp. 303-314
- Cung, H.¹ Normandin, Y.²

48
- 0020191331
- Some experiments in discrete utterance recognition
- Oct.
- S. Das, "Some experiments in discrete utterance recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 766-770, Oct. 1982.
- (1982) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-30 , pp. 766-770
- Das, S.¹

49
- 0031644298
- Improvements in children's speech recognition performance
- S. Das, D. Nix, and M. Picheny, "Improvements in children's speech recognition performance," in Proc. IEEE ICASSP, 1998, pp. 433-436.
- (1998) Proc. IEEE ICASSP , pp. 433-436
- Das, S.¹ Nix, D.² Picheny, M.³

50
- 0020795461
- On the effects of varying filter bank parameters on isolated word recognition
- Aug.
- B. Dautrich, L. Rabiner, and T. Martin, "On the effects of varying filter bank parameters on isolated word recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 793-807, Aug. 1983.
- (1983) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-31 , pp. 793-807
- Dautrich, B.¹ Rabiner, L.² Martin, T.³

51
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Aug.
- S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, pp. 357-366, Aug. 1980.
- (1980) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-28 , pp. 357-366
- Davis, S.¹ Mermelstein, P.²

52
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Statist. Soc., vol. 39, pp. 1-38, 1977.
- (1977) J. Royal Statist. Soc. , vol.39 , pp. 1-38
- Dempster, A.¹ Laird, N.² Rubin, D.³

53
- 0031185482
- Speaker-independent phonetic classification using hidden Markov models with mixtures of trend functions
- July
- V. Deng and M. Aksmanovik, "Speaker-independent phonetic classification using hidden Markov models with mixtures of trend functions," IEEE Trans. Speech Audio Processing, vol. 5, pp. 319-324, July 1997.
- (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 319-324
- Deng, V.¹ Aksmanovik, M.²

54
- 0029219614
- A Markov model containing state-conditioned second-order nonstationarity: Application to speech recognition
- L. Deng and R. Chengalvarayan, "A Markov model containing state-conditioned second-order nonstationarity: Application to speech recognition," Comput. Speech Lang., vol. 9, pp. 63-86, 1995.
- (1995) Comput. Speech Lang. , vol.9 , pp. 63-86
- Deng, L.¹ Chengalvarayan, R.²

55
- 0036879732
- A new multistage algorithm for spotting new words in speech
- Nov.
- S. Dharanipragada and S. Roukos, "A new multistage algorithm for spotting new words in speech," IEEE Trans. Speech Audio Processing, vol. 10, pp. 542-550, Nov. 2002.
- (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 542-550
- Dharanipragada, S.¹ Roukos, S.²

56
- 0030189744
- Speaker adaptation using combined transformation and Bayesian methods
- July
- V. Digalakis and G. Neumeyer, "Speaker adaptation using combined transformation and Bayesian methods," IEEE Trans. Speech Audio Processing, vol. 4, pp. 294-300, July 1996.
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 294-300
- Digalakis, V.¹ Neumeyer, G.²

57
- 85041486134
- Optimizing unit selection with voice source and formants in the CHATR speech synthesis system
- W. Ding and N. Campbell, "Optimizing unit selection with voice source and formants in the CHATR speech synthesis system," in Proc. Eurospeech, 1997, pp. 537-540.
- (1997) Proc. Eurospeech , pp. 537-540
- Ding, W.¹ Campbell, N.²

58
- 85009090897
- A component by component listening test analysis of the IBM trainable speech synthesis system
- R. Donovan, "A component by component listening test analysis of the IBM trainable speech synthesis system," in Proc. Eurospeech, 2001, pp. 329-332.
- (2001) Proc. Eurospeech , pp. 329-332
- Donovan, R.¹

59
- 0032651722
- A hidden Markov-model-based trainable speech synthesizer
- R. Donovan and P. Woodland, "A hidden Markov-model-based trainable speech synthesizer," Comput. Speech Lang., vol. 13, pp. 223-241, 1999.
- (1999) Comput. Speech Lang. , vol.13 , pp. 223-241
- Donovan, R.¹ Woodland, P.²

60
- 85006734596
- Evaluation of the SPLICE algorithm on the Aurora2 database
- J. Droppo, L. Deng, and A. Acero, "Evaluation of the SPLICE algorithm on the Aurora2 database," in Proc. Eurospeech, 2001, pp. 217-220.
- (2001) Proc. Eurospeech , pp. 217-220
- Droppo, J.¹ Deng, L.² Acero, A.³

61
- 21244495453
- Boston, MA: Kluwer
- T. Dutoit, From Text to Speech: A Concatenative Approach. Boston, MA: Kluwer, 1997.
- (1997) From Text to Speech: A Concatenative Approach
- Dutoit, T.¹

62
- 0017269304
- Letter-to-sound rules for automatic translation of English text to phonetics
- Dec.
- H. S. Elovitz, R. Johnson, A. McHugh, and J. E. Shore, "Letter-to-sound rules for automatic translation of English text to phonetics," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-24, pp. 446-459, Dec. 1976.
- (1976) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-24 , pp. 446-459
- Elovitz, H.S.¹ Johnson, R.² McHugh, A.³ Shore, J.E.⁴

63
- 0024933962
- An unrestricted vocabulary Arabic speech synthesis system
- Dec.
- Y. El-Imam, "An unrestricted vocabulary Arabic speech synthesis system," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1829-1845, Dec. 1989.
- (1989) IEEE Trans. Acoust., Speech, Signal Processing , vol.37 , pp. 1829-1845
- El-Imam, Y.¹

64
- 0003459126
- New York: Springer-Verlag
- R. Elliott, L. Aggoun, and J. Moore, Hidden Markov Models - Estimation and Control. New York: Springer-Verlag, 1995.
- (1995) Hidden Markov Models - Estimation and Control
- Elliott, R.¹ Aggoun, L.² Moore, J.³

65
- 0001873457
- Filterbank-energy estimation using mixture and Markov models for recognition of noisy speech
- Jan.
- A. Erell and M. Weintraub, "Filterbank-energy estimation using mixture and Markov models for recognition of noisy speech," IEEE Trans. Speech Audio Processing, vol. 1, pp. 68-76, Jan. 1993.
- (1993) IEEE Trans. Speech Audio Processing , vol.1 , pp. 68-76
- Erell, A.¹ Weintraub, M.²

66
- 21244495849
- Echo and noise reduction for hands-free terminals - State of the art
- G. Faucon and R. Le Bouquin-Jeannes, "Echo and noise reduction for hands-free terminals - state of the art," in Proc. Eurospeech, 1997, pp. 2423-2426.
- (1997) Proc. Eurospeech , pp. 2423-2426
- Faucon, G.¹ Le Bouquin-Jeannes, R.²

67
- 0003757962
- New York: Springer-Verlag
- J. Flanagan, Speech Analysis, Synthesis and Perception, 2nd ed. New York: Springer-Verlag, 1972.
- (1972) Speech Analysis, Synthesis and Perception, 2nd Ed.
- Flanagan, J.¹

68
- 21244499990
- Knowledge-based techniques in acoustic-phonetic decoding of speech: Interest and limitations
- D. Fohr, J.-P. Haton, and Y. Laprie, "Knowledge-based techniques in acoustic-phonetic decoding of speech: Interest and limitations," Int. J. Pattern Recognit. Artif. Intell., vol. 8, pp. 133-153, 1994.
- (1994) Int. J. Pattern Recognit. Artif. Intell. , vol.8 , pp. 133-153
- Fohr, D.¹ Haton, J.-P.² Laprie, Y.³

69
- 0031175880
- Unconstrained keyword spotting using phone lattices with application to spoken document retrieval
- J. Foote, S. Young, G. Jones, and K. Jones, "Unconstrained keyword spotting using phone lattices with application to spoken document retrieval," Comput. Speech Lang., vol. 11, pp. 207-224, 1997.
- (1997) Comput. Speech Lang. , vol.11 , pp. 207-224
- Foote, J.¹ Young, S.² Jones, G.³ Jones, K.⁴

70
- 0015600423
- The Viterbi algorithm
- Mar.
- G. D. Forney, "The Viterbi algorithm," Proc. IEEE, vol. 61, pp. 268-278, Mar. 1973.
- (1973) Proc. IEEE , vol.61 , pp. 268-278
- Forney, G.D.¹

71
- 84940820794
- Duration and intensity as physical correlates of linguistic stress
- D. Fry, "Duration and intensity as physical correlates of linguistic stress," J. Acoust. Soc. Amer., vol. 27, pp. 765-768, 1955.
- (1955) J. Acoust. Soc. Amer. , vol.27 , pp. 765-768
- Fry, D.¹

72
- 84964153357
- Experiments in the perception of stress
- _, "Experiments in the perception of stress," Lang. Speech, vol. 1, pp. 126-152, 1958.
- (1958) Lang. Speech , vol.1 , pp. 126-152

73
- 0000813409
- Syllables as concatenative phonetic units
- A. Bell and J. Hooper, Eds. Amsterdam, The Netherlands: North-Holland
- O. Fujimura and J. Lovins, "Syllables as concatenative phonetic units," in Syllables and Segments, A. Bell and J. Hooper, Eds. Amsterdam, The Netherlands: North-Holland, 1978, pp. 107-120.
- (1978) Syllables and Segments , pp. 107-120
- Fujimura, O.¹ Lovins, J.²

74
- 4243460174
- Semi-tied covariance matrices
- M. Gales, "Semi-tied covariance matrices," in Proc. IEEE ICASSP, 1998, pp. 657-660.
- (1998) Proc. IEEE ICASSP , pp. 657-660
- Gales, M.¹

75
- 0033097333
- State-based Gaussian selection in large vocabulary continuous speech recognition using HMMs
- Mar.
- M. Gales, K. Knill, and S. Young, "State-based Gaussian selection in large vocabulary continuous speech recognition using HMMs," IEEE Trans. Speech Audio Processing, vol. 7, pp. 152-161, Mar. 1999.
- (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 152-161
- Gales, M.¹ Knill, K.² Young, S.³

76
- 0034227757
- Cluster adaptive training of hidden Markov models
- July
- M. Gales, "Cluster adaptive training of hidden Markov models," IEEE Trans. Speech Audio Processing, vol. 8, no. 4, pp. 417-428, July 2000.
- (2000) IEEE Trans. Speech Audio Processing , vol.8 , Issue.4 , pp. 417-428
- Gales, M.¹

77
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- _, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, pp. 75-98, 1998.
- (1998) Comput. Speech Lang. , vol.12 , pp. 75-98

78
- 0032139556
- Predictive model-based compensation schemes for robust speech recognition
- _, "Predictive model-based compensation schemes for robust speech recognition," Speech Commun., vol. 25, pp. 49-74, 1998.
- (1998) Speech Commun. , vol.25 , pp. 49-74

79
- 0030638030
- Syllable - A promising recognition unit for LVCSR
- A. Ganapathiraju et al., "Syllable - a promising recognition unit for LVCSR," in IEEE Workshop Speech Recognition, 1997, pp. 207-213.
- (1997) IEEE Workshop Speech Recognition , pp. 207-213
- Ganapathiraju, A.¹

80
- 21244446048
- Noise reduction and speech recognition in noise conditions tested on LPNN-based continuous speech recognition system
- Y. Gao and J.-P. Haton, "Noise reduction and speech recognition in noise conditions tested on LPNN-based continuous speech recognition system," in Proc. Eurospeech, 1993, pp. 1035-1038.
- (1993) Proc. Eurospeech , pp. 1035-1038
- Gao, Y.¹ Haton, J.-P.²

81
- 0031632620
- On the robust incorporation of formant features into hidden Markov models for automatic speech recognition
- P. Garner and W. Holmes, "On the robust incorporation of formant features into hidden Markov models for automatic speech recognition," in Proc. IEEE ICASSP, 1998, pp. 1-4.
- (1998) Proc. IEEE ICASSP , pp. 1-4
- Garner, P.¹ Holmes, W.²

82
- 0003959189
- Boston, MA: Kluwer
- A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Boston, MA: Kluwer, 1992.
- (1992) Vector Quantization and Signal Compression
- Gersho, A.¹ Gray, R.M.²

83
- 0000030810
- Auditory nerve representation as a basis for speech processing
- S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
- O. Ghitza, "Auditory nerve representation as a basis for speech processing," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 453-485.
- (1992) Advances in Speech Signal Processing , pp. 453-485
- Ghitza, O.¹

84
- 85016587886
- Switchboard: Telephone speech corpus for research and development
- J. Godfrey, E. Holliman, and J. McDaniel, "Switchboard: Telephone speech corpus for research and development," in Proc. IEEE ICASSP, vol. 1, 1992, pp. 517-520.
- (1992) Proc. IEEE ICASSP , vol.1 , pp. 517-520
- Godfrey, J.¹ Holliman, E.² McDaniel, J.³

85
- 21244436199
- The interaction of phonetics, phonology and morphology in an Icelandic text-to-speech system
- B. Granstrom, P. Helgason, and H. Thráinsson, "The interaction of phonetics, phonology and morphology in an Icelandic text-to-speech system," in Proc. ICSLP, 1992, pp. 185-188.
- (1992) Proc. ICSLP , pp. 185-188
- Granstrom, B.¹ Helgason, P.² Thráinsson, H.³

86
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- Apr.
- J.-L. Gauvin and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Processing, vol. 2, pp. 291-298, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 291-298
- Gauvin, J.-L.¹ Lee, C.-H.²

87
- 0031232722
- Speech analysis/synthesis and modification using an analysis-by-synthesis/overlap-add sinusoidal model
- Sept.
- E. George and M. Smith, "Speech analysis/synthesis and modification using an analysis-by-synthesis/overlap-add sinusoidal model," IEEE Trans. Speech Audio Processing, vol. 5, pp. 389-406, Sept. 1997.
- (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 389-406
- George, E.¹ Smith, M.²

88
- 0029288202
- Speech recognition in noisy environments
- Y. Gong, "Speech recognition in noisy environments," Speech Commun., vol. 16, pp. 261-291, 1995.
- (1995) Speech Commun. , vol.16 , pp. 261-291
- Gong, Y.¹

89
- 21244496605
- Speaker normalization through formant-based warping of the frequency scale
- E. Gouvêa and R. Stern, "Speaker normalization through formant-based warping of the frequency scale," in Proc. Eurospeech, 1997, pp. 1139-1142.
- (1997) Proc. Eurospeech , pp. 1139-1142
- Gouvêa, E.¹ Stern, R.²

90
- 0019050955
- Distortion measures for speech processing
- Aug.
- R. M. Gray, A. Buzo, A. H. Gray, and Y. Matsuyama, "Distortion measures for speech processing," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, pp. 367-376, Aug. 1980.
- (1980) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-28 , pp. 367-376
- Gray, R.M.¹ Buzo, A.² Gray, A.H.³ Matsuyama, Y.⁴

91
- 21244504086
- Acoustic pattern matching and beam searching
- K. Greer, B. Lowerre, and L. Wilcox, "Acoustic pattern matching and beam searching," in Proc. IEEE ICASSP, 1982, pp. 1251-1254.
- (1982) Proc. IEEE ICASSP , pp. 1251-1254
- Greer, K.¹ Lowerre, B.² Wilcox, L.³

92
- 85135180981
- Speech timing in Slovenian TTS
- J. Gros, N. Pavesik, and F. Mihelic, "Speech timing in Slovenian TTS," in Proc. Eurospeech, 1997, pp. 323-326.
- (1997) Proc. Eurospeech , pp. 323-326
- Gros, J.¹ Pavesik, N.² Mihelic, F.³

93
- 0028420015
- Improvements in beam search for 10000-word continuous speech recognition
- Apr.
- R. Haeb-Umbach and H. Ney, "Improvements in beam search for 10000-word continuous speech recognition," IEEE Trans. Speech Audio Processing, vol. 2, pp. 353-356, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 353-356
- Haeb-Umbach, R.¹ Ney, H.²

94
- 0030196359
- Feature analysis and neural network-based classification of speech under stress
- July
- J. Hansen and B. Womack, "Feature analysis and neural network-based classification of speech under stress," IEEE Trans. Speech Audio Processing, vol. 4, pp. 307-313, July 1996.
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 307-313
- Hansen, J.¹ Womack, B.²

95
- 1842658044
- Robust feature-estimation and objective quality assessment for noisy speech recognition using the Credit Card Corpus
- May
- J. Hansen and L. Arslan, "Robust feature-estimation and objective quality assessment for noisy speech recognition using the Credit Card Corpus," IEEE Trans. Speech Audio Processing, vol. 3, pp. 169-184, May 1995.
- (1995) IEEE Trans. Speech Audio Processing , vol.3 , pp. 169-184
- Hansen, J.¹ Arslan, L.²

96
- 0032163635
- An auditory-based distortion measure with application to concatenative speech synthesis
- Sept.
- J. Hansen and D. Chappell, "An auditory-based distortion measure with application to concatenative speech synthesis," IEEE Trans. Speech Audio Processing, vol. 6, no. 5, pp. 489-495, Sept. 1998.
- (1998) IEEE Trans. Speech Audio Processing , vol.6 , Issue.5 , pp. 489-495
- Hansen, J.¹ Chappell, D.²

97
- 0031023993
- Glottal characteristics of female speakers: Acoustic correlates
- H. Hanson, "Glottal characteristics of female speakers: Acoustic correlates," J. Acoust. Soc. Amer., vol. 101, pp. 466-481, 1997.
- (1997) J. Acoust. Soc. Amer. , vol.101 , pp. 466-481
- Hanson, H.¹

98
- 0030635386
- Deriving phrase-based language models
- P. Heeman and G. Damnati, "Deriving phrase-based language models," in IEEE Workshop Speech Recognition, 1997, pp. 41-48.
- (1997) IEEE Workshop Speech Recognition , pp. 41-48
- Heeman, P.¹ Damnati, G.²

99
- 0028517164
- RASTA processing of speech
- Oct.
- H. Hermansky and N. Morgan, "RASTA processing of speech," IEEE Trans. Speech Audio Processing, vol. 2, pp. 578-589, Oct. 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

100
- 0032139768
- Should recognizers have ears?
- H. Hermansky, "Should recognizers have ears?," Speech Commun., vol. 25, pp. 3-27, 1998.
- (1998) Speech Commun. , vol.25 , pp. 3-27
- Hermansky, H.¹

101
- 0022245547
- Keyword recognition using template concatenation
- A. Higgins and R. Wohlford, "Keyword recognition using template concatenation," in Proc. IEEE ICASSP, 1985, pp. 1233-1236.
- (1985) Proc. IEEE ICASSP , pp. 1233-1236
- Higgins, A.¹ Wohlford, R.²

102
- 0020905802
- Formant synthesizers - Cascade or parallel?
- J. Holmes, "Formant synthesizers - cascade or parallel?," Speech Commun., vol. 2, pp. 251-273, 1983.
- (1983) Speech Commun. , vol.2 , pp. 251-273
- Holmes, J.¹

103
- 85032644657
- Using formant frequencies in speech recognition
- J. Holmes, W. Holmes, and P. Garner, "Using formant frequencies in speech recognition," in Proc. Eurospeech, vol. 3, 1997, pp. 2083-2086.
- (1997) Proc. Eurospeech , vol.3 , pp. 2083-2086
- Holmes, J.¹ Holmes, W.² Garner, P.³

104
- 0032673963
- Probabilistic-trajectory segmental HMM's
- W. Holmes and M. Russell, "Probabilistic-trajectory segmental HMM's," Comput. Speech Lang., vol. 13, pp. 3-27, 1999.
- (1999) Comput. Speech Lang. , vol.13 , pp. 3-27
- Holmes, W.¹ Russell, M.²

105
- 0033677062
- Unified frame and segment based models for automatic speech recognition
- H. Hon and K. Wang, "Unified frame and segment based models for automatic speech recognition," in Proc. IEEE ICASSP, vol. 2, 2000, pp. 1017-1020.
- (2000) Proc. IEEE ICASSP , vol.2 , pp. 1017-1020
- Hon, H.¹ Wang, K.²

106
- 0031642265
- Automatic generation of synthesis units for trainable text-to-speech systems
- H. Hon, A. Acero, X. Huang, J. Liu, and M. Plumpe, "Automatic generation of synthesis units for trainable text-to-speech systems," in Proc. IEEE ICASSP, 1998, pp. 273-276.
- (1998) Proc. IEEE ICASSP , pp. 273-276
- Hon, H.¹ Acero, A.² Huang, X.³ Liu, J.⁴ Plumpe, M.⁵

107
- 0028460279
- A fast algorithm for large vocabulary keyword spotting application
- July
- E.-F. Huang, H.-C. Wang, and F. Soong, "A fast algorithm for large vocabulary keyword spotting application," IEEE Trans. Speech Audio Processing, vol. 2, pp. 449-452, July 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 449-452
- Huang, E.-F.¹ Wang, H.-C.² Soong, F.³

108
- 0004056285
- Upper Saddle River, NJ: Prentice-Hall
- X. Huang, A. Acero, and S. Hon, Spoken Language Processing, Upper Saddle River, NJ: Prentice-Hall, 2001.
- (2001) Spoken Language Processing
- Huang, X.¹ Acero, A.² Hon, S.³

109
- 0027578837
- On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition
- Apr.
- X. Huang and K.-F. Lee, "On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition," IEEE Trans. Speech Audio Processing, vol. 1, pp. 150-157, Apr. 1993.
- (1993) IEEE Trans. Speech Audio Processing , vol.1 , pp. 150-157
- Huang, X.¹ Lee, K.-F.²

110
- 0003462715
- Edinburgh, U.K.: Edinburgh Univ. Press
- X. Huang, Y. Ariki, and M. Jack, Hidden Markov Models for Speech Recognition. Edinburgh, U.K.: Edinburgh Univ. Press, 1990.
- (1990) Hidden Markov Models for Speech Recognition
- Huang, X.¹ Ariki, Y.² Jack, M.³

111
- 0027678306
- A comparative study of discrete, semicontinuous, and continuous hidden Markov models
- X. Huang, H. Hon, M. Hwang, and K. Lee, "A comparative study of discrete, semicontinuous, and continuous hidden Markov models," Comput. Speech Lang., vol. 7, pp. 359-368, 1993.
- (1993) Comput. Speech Lang. , vol.7 , pp. 359-368
- Huang, X.¹ Hon, H.² Hwang, M.³ Lee, K.⁴

112
- 0029765811
- Unit selection in a concatenative speech synthesis system using a large speech database
- A. Hunt and W. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. IEEE ICASSP, 1996, pp. 373-376.
- (1996) Proc. IEEE ICASSP , pp. 373-376
- Hunt, A.¹ Black, W.²

113
- 0024905238
- A comparison of several acoustic representations for speech recognition with degraded and undegraded speech
- M. Hunt and C. Lefèbvre, "A comparison of several acoustic representations for speech recognition with degraded and undegraded speech," in Proc. IEEE ICASSP, 1989, pp. 262-265.
- (1989) Proc. IEEE ICASSP , pp. 262-265
- Hunt, M.¹ Lefèbvre, C.²

114
- 21244472139
- Use of dynamic programming in a syllable-based continuous speech recognition system
- D. Sankoff and J. Kruskal, Eds. Reading, MA: Addison-Wesley
- M. Hunt, M. Lennig, and P. Mermelstein, "Use of dynamic programming in a syllable-based continuous speech recognition system," in Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, D. Sankoff and J. Kruskal, Eds. Reading, MA: Addison-Wesley, 1983, pp. 163-187.
- (1983) Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison , pp. 163-187
- Hunt, M.¹ Lennig, M.² Mermelstein, P.³

115
- 0033900150
- A Bayesian predictive classification approach to robust speech recognition
- Mar.
- Q. Huo and C.-H. Lee, "A Bayesian predictive classification approach to robust speech recognition," IEEE Trans. Speech Audio Processing, vol. 8, pp. 200-204, Mar. 2000.
- (2000) IEEE Trans. Speech Audio Processing , vol.8 , pp. 200-204
- Huo, Q.¹ Lee, C.-H.²

116
- 0030289863
- Predicting unseen triphones with senones
- Nov.
- M.-Y. Hwang, X. Huang, and F. Alleva, "Predicting unseen triphones with senones," IEEE Trans. Speech Audio Processing, vol. 4, pp. 412-419, Nov. 1996.
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 412-419
- Hwang, M.-Y.¹ Huang, X.² Alleva, F.³

117
- 0003822743
- Cambridge, U.K.: Cambridge Univ. Press
- S. Young, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, The HTK Book. Cambridge, U.K.: Cambridge Univ. Press, 1998.
- (1998) The HTK Book
- Young, S.¹ Odell, J.² Ollason, D.³ Valtchev, V.⁴ Woodland, P.⁵

118
- 0032785782
- Modeling long distance dependence in language: Topic mixtures versus dynamic cache models
- Jan.
- R. Iyer and M. Ostendorf, "Modeling long distance dependence in language: Topic mixtures versus dynamic cache models," IEEE Trans. Speech Audio Processing, vol. 7, pp. 30-39, Jan. 1999.
- (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 30-39
- Iyer, R.¹ Ostendorf, M.²

119
- 0031209168
- Using out-of-domain data to improve in-domain language models
- R. Iyer, M. Ostendorf, and H. Gish, "Using out-of-domain data to improve in-domain language models," IEEE Signal Processing Lett., vol. 4, pp. 221-223, 1997.
- (1997) IEEE Signal Processing Lett. , vol.4 , pp. 221-223
- Iyer, R.¹ Ostendorf, M.² Gish, H.³

120
- 0029345416
- A comparison of signal processing front ends for automatic word recognition
- July
- C. Jankowski, H.-D. Vo, and R. Lippmann, "A comparison of signal processing front ends for automatic word recognition," IEEE Trans. Speech Audio Processing, vol. 3, pp. 286-293, July 1995.
- (1995) IEEE Trans. Speech Audio Processing , vol.3 , pp. 286-293
- Jankowski, C.¹ Vo, H.-D.² Lippmann, R.³

121
- 0016939124
- Continuous speech recognition by statistical methods
- Apr.
- F. Jelinek, "Continuous speech recognition by statistical methods," Proc. IEEE, vol. 64, pp. 532-556, Apr. 1976.
- (1976) Proc. IEEE , vol.64 , pp. 532-556
- Jelinek, F.¹

122
- 0022150487
- The development of an experimental discrete dictation recognizer
- Nov.
- F. Jelinek, "The development of an experimental discrete dictation recognizer," Proc. IEEE, vol. 73, pp. 1616-1620, Nov. 1985.
- (1985) Proc. IEEE , vol.73 , pp. 1616-1620

123
- 0001993550
- Principles of lexical language-modeling for speech recognition
- S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
- F. Jelinek, R. Mercer, and S. Roucos, "Principles of lexical language-modeling for speech recognition," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 651-699.
- (1992) Advances in Speech Signal Processing , pp. 651-699
- Jelinek, F.¹ Mercer, R.² Roucos, S.³

124
- 0032685060
- Robust speech recognition based on Bayesian prediction approach
- July
- H. Jiang, K. Hirose, and Q. Huo, "Robust speech recognition based on Bayesian prediction approach," IEEE Trans. Speech Audio Processing, vol. 7, pp. 426-440, July 1999.
- (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 426-440
- Jiang, H.¹ Hirose, K.² Huo, Q.³

125
- 0036124301
- A robust compensation strategy against extraneous acoustic variations in spontaneous speech recognition
- Jan.
- H. Jiang and L. Deng, "A robust compensation strategy against extraneous acoustic variations in spontaneous speech recognition," IEEE Trans. Speech Audio Processing, vol. 10, pp. 9-17, Jan. 2002.
- (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 9-17
- Jiang, H.¹ Deng, L.²

126
- 21244458643
- Fast robust inverse transform speaker adapted training using diagonal transformations
- H. Jin, S. Matsoukas, R. Schwartz, and F. Kubala, "Fast robust inverse transform speaker adapted training using diagonal transformations," in Proc. IEEE ICASSP, 1998, pp. 785-788.
- (1998) Proc. IEEE ICASSP , pp. 785-788
- Jin, H.¹ Matsoukas, S.² Schwartz, R.³ Kubala, F.⁴

127
- 0022270364
- Mixture autoregressive hidden Markov models for speech signals
- Dec.
- B.-H. Juang and L. Rabiner, "Mixture autoregressive hidden Markov models for speech signals," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 1404-1413, Dec. 1985.
- (1985) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-33 , pp. 1404-1413
- Juang, B.-H.¹ Rabiner, L.²

128
- 0031139839
- Minimum classification error rate methods for speech recognition
- May
- B.-H. Juang, W. Chou, and C.-H. Lee, "Minimum classification error rate methods for speech recognition," IEEE Trans. Speech Audio Processing, vol. 5, pp. 257-265, May 1997.
- (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 257-265
- Juang, B.-H.¹ Chou, W.² Lee, C.-H.³

129
- 0027465491
- The Lombard reflex and its role on human listeners and automatic speech recognizers
- J.-C. Junqua, "The Lombard reflex and its role on human listeners and automatic speech recognizers," J. Acoust. Soc. Amer., vol. 93, pp. 510-524, 1993.
- (1993) J. Acoust. Soc. Amer. , vol.93 , pp. 510-524
- Junqua, J.-C.¹

130
- 0003071809
- Evaluation and optimization of perceptually-based ASR front end
- Jan.
- J.-C. Junqua, H. Wakita, and H. Hermansky, "Evaluation and optimization of perceptually-based ASR front end," IEEE Trans. Speech Audio Processing, vol. 1, pp. 39-48, Jan. 1993.
- (1993) IEEE Trans. Speech Audio Processing , vol.1 , pp. 39-48
- Junqua, J.-C.¹ Wakita, H.² Hermansky, H.³

131
- 0003770709
- Boston, MA: Kluwer
- J.-C. Junqua and J.-P. Haton, Robustness in Automatic Speech Recognition. Boston, MA: Kluwer, 1996.
- (1996) Robustness in Automatic Speech Recognition
- Junqua, J.-C.¹ Haton, J.-P.²

132
- 0003847769
- Englewood Cliffs, NJ: Prentice-Hall
- D. Jurafsky and F. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Englewood Cliffs, NJ: Prentice-Hall, 2000.
- (2000) Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition
- Jurafsky, D.¹ Martin, F.²

133
- 0020832068
- A hierarchical decision approach to large-vocabulary discrete utterance recognition
- Oct.
- T. Kaneko and N. R. Dixon, "A hierarchical decision approach to large-vocabulary discrete utterance recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1061-1072, Oct. 1983.
- (1983) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-31 , pp. 1061-1072
- Kaneko, T.¹ Dixon, N.R.²

134
- 0022045556
- Realism in synthetic speech
- Apr.
- G. Kaplan and E. Lerner, "Realism in synthetic speech," IEEE Spectrum, vol. 22, pp. 32-37, Apr. 1985.
- (1985) IEEE Spectrum , vol.22 , pp. 32-37
- Kaplan, G.¹ Lerner, E.²

135
- 0032203256
- Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method
- Nov.
- S. Katagiri, B. Juang, and C. Lee, "Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method," Proc. IEEE, vol. 86, pp. 2345-2373, Nov. 1998.
- (1998) Proc. IEEE , vol.86 , pp. 2345-2373
- Katagiri, S.¹ Juang, B.² Lee, C.³

136
- 0023312404
- Estimation of probabilities from sparse data for the language model component of a speech recognizer
- Mar.
- S. Katz, "Estimation of probabilities from sparse data for the language model component of a speech recognizer," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 400-401, Mar. 1987.
- (1987) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-35 , pp. 400-401
- Katz, S.¹

137
- 0032205629
- Flexible speech understanding based on combined key-phrase detection and verification
- Nov.
- T. Kawahara, C.-H. Lee, and B.-H. Juang, "Flexible speech understanding based on combined key-phrase detection and verification," IEEE Trans. Speech Audio Processing, vol. 6, pp. 558-568, Nov. 1998.
- (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 558-568
- Kawahara, T.¹ Lee, C.-H.² Juang, B.-H.³

138
- 21244453364
- Designing control rules for a serial pole-zero vocal tract model
- J. Kerkhoff and L. Boves, "Designing control rules for a serial pole-zero vocal tract model," in Proc. Eurospeech, 1993, pp. 893-896.
- (1993) Proc. Eurospeech , pp. 893-896
- Kerkhoff, J.¹ Boves, L.²

139
- 77956275334
- Efficient method of establishing words tone dictionary for Korean TTS system
- S.-H. Kim and J.-Y. Kim, "Efficient method of establishing words tone dictionary for Korean TTS system," in Proc. Eurospeech, 1997, pp. 247-250.
- (1997) Proc. Eurospeech , pp. 247-250
- Kim, S.-H.¹ Kim, J.-Y.²

140
- 79952968027
- Speech recognition via phonetically featured syllables
- S. King, T. Stephenson, S. Isard, P. Taylor, and A. Strachan, "Speech recognition via phonetically featured syllables," in Proc. ICSLP, vol. 1, 1998, pp. 1031-1034.
- (1998) Proc. ICSLP , vol.1 , pp. 1031-1034
- King, S.¹ Stephenson, T.² Isard, S.³ Taylor, P.⁴ Strachan, A.⁵

141
- 0032136330
- Robust speech recognition using the modulation spectrogram
- B. Kingsbury, N. Morgan, and S. Greenberg, "Robust speech recognition using the modulation spectrogram," Speech Commun., vol. 25, pp. 117-132, 1998.
- (1998) Speech Commun. , vol.25 , pp. 117-132
- Kingsbury, B.¹ Morgan, N.² Greenberg, S.³

142
- 0025321354
- Analysis, synthesis, and perception of voice quality variations among female and male talkers
- D. Klatt and L. Klatt, "Analysis, synthesis, and perception of voice quality variations among female and male talkers," J. Acoust. Soc. Amer., vol. 87, pp. 820-857, 1990.
- (1990) J. Acoust. Soc. Amer. , vol.87 , pp. 820-857
- Klatt, D.¹ Klatt, L.²

143
- 0017012286
- Structure of a phonological rule component for a synthesis-by-rule program
- Oct.
- D. Klatt, "Structure of a phonological rule component for a synthesis-by-rule program," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-24, pp. 391-398, Oct. 1976.
- (1976) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-24 , pp. 391-398
- Klatt, D.¹

144
- 0016952322
- Linguistic uses of segmental duration in English: Acoustic and perceptual evidence
- D. Klatt, "Linguistic uses of segmental duration in English: Acoustic and perceptual evidence," J. Acoust. Soc. Amer., vol. 59, pp. 1208-1221, 1976.
- (1976) J. Acoust. Soc. Amer. , vol.59 , pp. 1208-1221

145
- 0017565919
- Review of the ARPA speech understanding project
- D. Klatt, "Review of the ARPA speech understanding project," J. Acoust. Soc. Amer., vol. 62, pp. 1345-1366, 1977.
- (1977) J. Acoust. Soc. Amer. , vol.62 , pp. 1345-1366

146
- 0018986665
- Software for a cascade/parallel formant synthesizer
- D. Klatt, "Software for a cascade/parallel formant synthesizer," J. Acoust. Soc. Amer., vol. 67, pp. 971-995, 1980.
- (1980) J. Acoust. Soc. Amer. , vol.67 , pp. 971-995

147
- 0023407575
- Review of text-to-speech conversion for English
- D. Klatt, "Review of text-to-speech conversion for English," J. Acoust. Soc. Amer., vol. 82, pp. 737-793, 1987.
- (1987) J. Acoust. Soc. Amer. , vol.82 , pp. 737-793

148
- 0022106367
- Network-based isolated digit recognition using vector quantization
- Aug.
- G. Kopec and M. Bush, "Network-based isolated digit recognition using vector quantization," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 850-867, Aug. 1985.
- (1985) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-33 , pp. 850-867
- Kopec, G.¹ Bush, M.²

149
- 0029735634
- Speaker-independent speech recognition based on tree-structured speaker clustering
- T. Kosaka, S. Matsunaga, and S. Sagayama, "Speaker-independent speech recognition based on tree-structured speaker clustering," Comput. Speech Lang., vol. 10, pp. 55-74, 1996.
- (1996) Comput. Speech Lang. , vol.10 , pp. 55-74
- Kosaka, T.¹ Matsunaga, S.² Sagayama, S.³

150
- 21244505243
- Speaker modeling for speaker adaptation in automatic speech recognition
- K. Johnson and J. Mullennix, Eds. San Diego, CA: Academic
- J. Kreiman, "Speaker modeling for speaker adaptation in automatic speech recognition," in Talker Variability in Speech Processing, K. Johnson and J. Mullennix, Eds. San Diego, CA: Academic, 1997, pp. 167-189.
- (1997) Talker Variability in Speech Processing , pp. 167-189
- Kreiman, J.¹

151
- 0025446887
- A cache-based natural language model for speech recognition
- June
- R. Kuhn and R. de Mori, "A cache-based natural language model for speech recognition," IEEE Trans. Pattern Anal. Machine Intell., vol. 12, pp. 570-583, June 1990.
- (1990) IEEE Trans. Pattern Anal. Machine Intell. , vol.12 , pp. 570-583
- Kuhn, R.¹ De Mori, R.²

152
- 0000392884
- Eigenvoices for speaker adaptation
- R. Kuhn et al., "Eigenvoices for speaker adaptation," in Proc. ICSLP, 1998, pp. 1771-1774.
- (1998) Proc. ICSLP , pp. 1771-1774
- Kuhn, R.¹

153
- 0030366881
- Improving decision trees for acoustic modeling
- A. Lazaridès, Y. Normandin, and R. Kuhn, "Improving decision trees for acoustic modeling," in Proc. ICSLP, 1996, pp. 1053-1056.
- (1996) Proc. ICSLP , pp. 1053-1056
- Lazaridès, A.¹ Normandin, Y.² Kuhn, R.³

154
- 85095220056
- Real-time analysis-synthesis and intelligibility of talking faces
- B. Le Goff, T. Guiard-Marigny, M. Cohen, and C. Benoit, "Real-time analysis-synthesis and intelligibility of talking faces," in ESCA Workshop, 1994, pp. 53-56.
- (1994) ESCA Workshop , pp. 53-56
- Le Goff, B.¹ Guiard-Marigny, T.² Cohen, M.³ Benoit, C.⁴

155
- 0003339670
- Speech recognition: Past, present, and future
- Englewood Cliffs, NJ: Prentice-Hall
- W. Lea, "Speech recognition: Past, present, and future," in Trends in Speech Recognition. Englewood Cliffs, NJ: Prentice-Hall, 1980, pp. 39-98.
- (1980) Trends in Speech Recognition , pp. 39-98
- Lea, W.¹

156
- 0027625140
- Improved tone concatenation rules in a formant-based Chinese text-to-speech system
- July
- L. Lee, C. Tseng, and C.-J. Hsieh, "Improved tone concatenation rules in a formant-based Chinese text-to-speech system," IEEE Trans. Speech Audio Processing, vol. 1, pp. 287-294, July 1993.
- (1993) IEEE Trans. Speech Audio Processing , vol.1 , pp. 287-294
- Lee, L.¹ Tseng, C.² Hsieh, C.-J.³

157
- 0032140546
- On stochastic feature and model compensation approaches for robust speech recognition
- C.-H. Lee, "On stochastic feature and model compensation approaches for robust speech recognition," Speech Commun., vol. 25, pp. 29-47, 1998.
- (1998) Speech Commun. , vol.25 , pp. 29-47
- Lee, C.-H.¹

158
- 0003770711
- Boston, MA: Kluwer
- C.-H. Lee, F. Soong, and K. Paliwal, Eds., Automatic Speech and Speaker Recognition-Advanced Topics. Boston, MA: Kluwer, 1996.
- (1996) Automatic Speech and Speaker Recognition-Advanced Topics
- Lee, C.-H.¹ Soong, F.² Paliwal, K.³

159
- 0003770715
- Boston, MA: Kluwer
- K.-F. Lee, Automatic Speech Recognition: The Development of the SPHINX System. Boston, MA: Kluwer, 1989.
- (1989) Automatic Speech Recognition: The Development of the SPHINX System
- Lee, K.-F.¹

160
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. Leggetter and P. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
- (1995) Comput. Speech Lang. , vol.9 , pp. 171-185
- Leggetter, C.¹ Woodland, P.²

161
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- C. Leggetter and P. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
- (1995) Comput. Speech Lang. , vol.9 , pp. 171-185

162
- 0004266447
- Cambridge, MA: MIT Press
- I. Lehiste, Suprasegmentals. Cambridge, MA: MIT Press, 1970.
- (1970) Suprasegmentals
- Lehiste, I.¹

163
- 0022149626
- Structural methods in automatic speech recognition
- Nov.
- S. Levinson, "Structural methods in automatic speech recognition," Proc. IEEE, vol. 73, pp. 1625-1650, Nov. 1985.
- (1985) Proc. IEEE , vol.73 , pp. 1625-1650
- Levinson, S.¹

164
- 0022864384
- Continuously variable duration hidden Markov models for speech analysis
- S. Levinson, "Continuously variable duration hidden Markov models for speech analysis," in Proc. IEEE ICASSP, 1986, pp. 1241-1244.
- (1986) Proc. IEEE ICASSP , pp. 1241-1244

165
- 0034322144
- GA-based noisy speech recognition using two-dimensional cepstrum
- Nov.
- C.-T. Lin, H.-W. Nein, and J.-Y. Hwu, "GA-based noisy speech recognition using two-dimensional cepstrum," IEEE Trans. Speech Audio Processing, vol. 8, pp. 664-675, Nov. 2000.
- (2000) IEEE Trans. Speech Audio Processing , vol.8 , pp. 664-675
- Lin, C.-T.¹ Nein, H.-W.² Hwu, J.-Y.³

166
- 0029408722
- Normalizing the vocal tract length for speaker independent speech recognition
- Q. Lin and C. Che, "Normalizing the vocal tract length for speaker independent speech recognition," IEEE Signal Processing Lett., vol. 2, pp. 201-203, 1995.
- (1995) IEEE Signal Processing Lett. , vol.2 , pp. 201-203
- Lin, Q.¹ Che, C.²

167
- 0020180460
- Maximum likelihood estimation for multivariate observations of Markov sources
- Sept.
- L. Liporace, "Maximum likelihood estimation for multivariate observations of Markov sources," IEEE Trans. Inform. Theory, vol. IT-28, pp. 729-734, Sept. 1982.
- (1982) IEEE Trans. Inform. Theory , vol.IT-28 , pp. 729-734
- Liporace, L.¹

168
- 0031187171
- Speech recognition by humans and machines
- R. Lippmann, "Speech recognition by humans and machines," Speech Commun., vol. 22, pp. 1-15, 1997.
- (1997) Speech Commun. , vol.22 , pp. 1-15
- Lippmann, R.¹

169
- 0030640788
- Robust speech recognition with time-varying filtering, interruptions, and noise
- R. Lippmann and B. Carlson, "Robust speech recognition with time-varying filtering, interruptions, and noise," in IEEE Workshop Speech Recognition, 1997, pp. 365-372.
- (1997) IEEE Workshop Speech Recognition , pp. 365-372
- Lippmann, R.¹ Carlson, B.²

170
- 0028404665
- High accuracy phone recognition using context-clustering and quasitriphonic models
- A. Ljolje, "High accuracy phone recognition using context-clustering and quasitriphonic models," Comput. Speech Lang., vol. 8, pp. 129-151, 1994.
- (1994) Comput. Speech Lang. , vol.8 , pp. 129-151
- Ljolje, A.¹

171
- 21244499562
- A new system for text-to-speech conversion, and its application to Swedish
- M. Ljungqvist, A. Lindström, and K. Gustafson, "A new system for text-to-speech conversion, and its application to Swedish," in Proc. ICSLP, 1994, pp. 1779-1782.
- (1994) Proc. ICSLP , pp. 1779-1782
- Ljungqvist, M.¹ Lindström, A.² Gustafson, K.³

172
- 0024344665
- Segmental intelligibility of synthetic speech produced by rule
- J. Logan, B. Greene, and D. Pisoni, "Segmental intelligibility of synthetic speech produced by rule," J. Acoust. Soc. Amer., vol. 86, pp. 566-581, 1989.
- (1989) J. Acoust. Soc. Amer. , vol.86 , pp. 566-581
- Logan, J.¹ Greene, B.² Pisoni, D.³

173
- 0033872141
- Utterance verification in continuous speech recognition: Decoding and training procedures
- Mar.
- E. Lleida and P. Green, "Utterance verification in continuous speech recognition: Decoding and training procedures," IEEE Trans. Speech Audio Processing, vol. 8, pp. 126-139, Mar. 2000.
- (2000) IEEE Trans. Speech Audio Processing , vol.8 , pp. 126-139
- Lleida, E.¹ Green, P.²

174
- 0030286185
- High-performance alphabet recognition
- Nov.
- P. Loizou and A. Spanias, "High-performance alphabet recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 430-445, Nov. 1996.
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 430-445
- Loizou, P.¹ Spanias, A.²

175
- 0029375555
- Implementing the Viterbi algorithm
- Sept.
- H.-L. Lou, "Implementing the Viterbi algorithm," IEEE Signal Processing Mag., vol. 12, pp. 42-52, Sept. 1995.
- (1995) IEEE Signal Processing Mag. , vol.12 , pp. 42-52
- Lou, H.-L.¹

176
- 0029748338
- Speech concatenation and synthesis using an overlap-add sinusoidal model
- M. Macon and M. Clements, "Speech concatenation and synthesis using an overlap-add sinusoidal model," in Proc. IEEE ICASSP, 1996, pp. 361-364.
- (1996) Proc. IEEE ICASSP , pp. 361-364
- Macon, M.¹ Clements, M.²

177
- 0016495091
- Linear prediction: A tutorial review
- Apr.
- J. Makhoul, "Linear prediction: A tutorial review," Proc. IEEE, vol. 63, pp. 561-580, Apr. 1975.
- (1975) Proc. IEEE , vol.63 , pp. 561-580
- Makhoul, J.¹

178
- 0030715925
- A segment-based word-spotter using phonetic filler models
- A. Manos and V. Zue, "A segment-based word-spotter using phonetic filler models," in Proc. IEEE ICASSP, 1997, pp. 899-902.
- (1997) Proc. IEEE ICASSP , pp. 899-902
- Manos, A.¹ Zue, V.²

179
- 0024766457
- A family of distortion measures based upon projection operation for robust speech recognition
- Nov.
- D. Mansour and B.-H. Juang, "A family of distortion measures based upon projection operation for robust speech recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1659-1671, Nov. 1989.
- (1989) IEEE Trans. Acoust., Speech, Signal Processing , vol.37 , pp. 1659-1671
- Mansour, D.¹ Juang, B.-H.²

180
- 0030779362
- Automatic word recognition based on second-order hidden Markov models
- Jan.
- J.-F. Mari, J.-P. Haton, and A. Kriouile, "Automatic word recognition based on second-order hidden Markov models," IEEE Trans. Speech Audio Processing, vol. 5, pp. 22-25, Jan. 1997.
- (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 22-25
- Mari, J.-F.¹ Haton, J.-P.² Kriouile, A.³

181
- 21244440490
- Spoken language processing in multimodal communication
- J. Mariani, "Spoken language processing in multimodal communication," in Proc. Int. Conf. Speech Processing, 1997, pp. 3-12.
- (1997) Proc. Int. Conf. Speech Processing , pp. 3-12
- Mariani, J.¹

182
- 0024911019
- Recent advances in speech processing
- J. Mariani, "Recent advances in speech processing," in Proc. IEEE ICASSP, 1989, pp. 429-440.
- (1989) Proc. IEEE ICASSP , pp. 429-440

183
- 0003874959
- Berlin, Germany: Springer-Verlag
- J. D. Markel and A. H. Gray, Linear Prediction of Speech. Berlin, Germany: Springer-Verlag, 1976.
- (1976) Linear Prediction of Speech
- Markel, J.D.¹ Gray, A.H.²

184
- 0032049073
- Algorithms for bigram and trigram word clustering
- S. Martin, J. Liermann, and H. Ney, "Algorithms for bigram and trigram word clustering," Speech Commun., vol. 24, pp. 19-37, 1998.
- (1998) Speech Commun. , vol.24 , pp. 19-37
- Martin, S.¹ Liermann, J.² Ney, H.³

185
- 0004084456
- Cambridge, MA: MIT Press
- D. Massaro, Perceiving Talking Faces. Cambridge, MA: MIT Press, 1997.
- (1997) Perceiving Talking Faces
- Massaro, D.¹

186
- 0016049328
- An algorithm for automatic formant extraction using linear prediction spectra
- Apr.
- S. McCandless, "An algorithm for automatic formant extraction using linear prediction spectra," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-22, pp. 135-141, Apr. 1974.
- (1974) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-22 , pp. 135-141
- McCandless, S.¹

187
- 0036754943
- Robust speech recognition using probabilistic union models
- Sept.
- J. Ming, P. Jancovic, and F. J. Smith, "Robust speech recognition using probabilistic union models," IEEE Trans. Speech Audio Processing, vol. 10, pp. 403-414, Sept. 2002.
- (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 403-414
- Ming, J.¹ Jancovic, P.² Smith, F.J.³

188
- 84892167119
- Transmissions and transitions: A study of two common assumptions in multi-band ASR
- N. Mirghafori and N. Morgan, "Transmissions and transitions: A study of two common assumptions in multi-band ASR," in Proc. IEEE ICASSP, 1998, pp. 713-716.
- (1998) Proc. IEEE ICASSP , pp. 713-716
- Mirghafori, N.¹ Morgan, N.²

189
- 0029196406
- A parallel implementation of a hidden Markov model with duration modeling for speech recognition
- C. Mitchell, M. Harper, L. Jamieson, and R. Helzerman, "A parallel implementation of a hidden Markov model with duration modeling for speech recognition," in Dig. Signal Process., vol. 5, 1995, pp. 43-57.
- (1995) Dig. Signal Process. , vol.5 , pp. 43-57
- Mitchell, C.¹ Harper, M.² Jamieson, L.³ Helzerman, R.⁴

190
- 0348198473
- Finite-state transducers in language and speech processing
- M. Mohri, "Finite-state transducers in language and speech processing," Comput. Linguist., vol. 23, pp. 269-312, 1997.
- (1997) Comput. Linguist. , vol.23 , pp. 269-312
- Mohri, M.¹

191
- 0029375754
- Automatic word recognition in cars
- Sept.
- C. Mokbel and G. Chollet, "Automatic word recognition in cars," IEEE Trans. Speech Audio Processing, vol. 3, pp. 346-356, Sept. 1995.
- (1995) IEEE Trans. Speech Audio Processing , vol.3 , pp. 346-356
- Mokbel, C.¹ Chollet, G.²

192
- 0030287048
- The expectation-maximization algorithm
- Nov.
- T. Moon, "The expectation-maximization algorithm," IEEE Signal Processing Mag., vol. 13, pp. 47-60, Nov. 1996.
- (1996) IEEE Signal Processing Mag. , vol.13 , pp. 47-60
- Moon, T.¹

193
- 0029306621
- Continuous speech recognition
- May
- N. Morgan and H. Bourlard, "Continuous speech recognition," IEEE Signal Processing Mag., vol. 12, pp. 25-42, May 1995.
- (1995) IEEE Signal Processing Mag. , vol.12 , pp. 25-42
- Morgan, N.¹ Bourlard, H.²

194
- 0029308753
- Neural networks for statistical recognition of continuous speech
- May
- _, "Neural networks for statistical recognition of continuous speech," Proc. IEEE, vol. 83, pp. 742-770, May 1995.
- (1995) Proc. IEEE , vol.83 , pp. 742-770

195
- 0040320400
- Acoustic correlates of stress
- J. Morton and W. Jassem, "Acoustic correlates of stress," Lang. Speech, vol. 8, pp. 159-181, 1965.
- (1965) Lang. Speech , vol.8 , pp. 159-181
- Morton, J.¹ Jassem, W.²

196
- 0025543906
- Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones
- E. Moulines and F. Charpentier, "Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones," Speech Commun., vol. 9, pp. 453-467, 1990.
- (1990) Speech Commun. , vol.9 , pp. 453-467
- Moulines, E.¹ Charpentier, F.²

197
- 0027447292
- Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion
- I. Murray and J. Arnott, "Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion," J. Acoust. Soc. Amer., vol. 93, pp. 1097-1108, 1993.
- (1993) J. Acoust. Soc. Amer. , vol.93 , pp. 1097-1108
- Murray, I.¹ Arnott, J.²

198
- 21244498390
- A prototype text-to-speech system for Scottish Gaelic
- I. Murray and M. Black, "A prototype text-to-speech system for Scottish Gaelic," in Proc. Eurospeech, 1993, pp. 885-887.
- (1993) Proc. Eurospeech , pp. 885-887
- Murray, I.¹ Black, M.²

199
- 85009102728
- Room acoustics and reverberation: Impact on hands-free recognition
- S. Nakamura and K. Shikano, "Room acoustics and reverberation: Impact on hands-free recognition," in Proc. Eurospeech, 1997, pp. 2423-2426.
- (1997) Proc. Eurospeech , pp. 2423-2426
- Nakamura, S.¹ Shikano, K.²

200
- 0000635720
- Progress in dynamic programming search for LVCSR
- Aug.
- H. Ney and S. Ortmanns, "Progress in dynamic programming search for LVCSR," Proc. IEEE, vol. 88, pp. 1224-1240, Aug. 2000.
- (2000) Proc. IEEE , vol.88 , pp. 1224-1240
- Ney, H.¹ Ortmanns, S.²

201
- 0006815824
- An overview of the Philips research system for large-vocabulary continuous-speech recognition
- H. Ney, V. Steinbiss, R. Haeb-Umbach, B.-H. Tran, and U. Essen, "An overview of the Philips research system for large-vocabulary continuous-speech recognition," Int. J. Pattern Recognit. Artif. Intell., vol. 8, pp. 33-70, 1994.
- (1994) Int. J. Pattern Recognit. Artif. Intell. , vol.8 , pp. 33-70
- Ney, H.¹ Steinbiss, V.² Haeb-Umbach, R.³ Tran, B.-H.⁴ Essen, U.⁵

202
- 0004219017
- Palo Alto, CA: Tioga
- N. Nilsson, Principles of Artificial Intelligence. Palo Alto, CA: Tioga, 1980.
- (1980) Principles of Artificial Intelligence
- Nilsson, N.¹

203
- 0022227187
- Comparative study of several distortion measures for speech recognition
- N. Nocerino, F. Soong, L. Rabiner, and D. Klatt, "Comparative study of several distortion measures for speech recognition," in IEEE Int. Conf. ASSP, 1985, pp. 25-28.
- (1985) IEEE Int. Conf. ASSP , pp. 25-28
- Nocerino, N.¹ Soong, F.² Rabiner, L.³ Klatt, D.⁴

204
- 0028412908
- High-performance connected digit recognition using maximum mutual information estimation
- Apr.
- Y. Normandin, R. Cardin, and R. de Mori, "High-performance connected digit recognition using maximum mutual information estimation," IEEE Trans. Speech Audio Processing, vol. 2, pp. 299-311, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 299-311
- Normandin, Y.¹ Cardin, R.² De Mori, R.³

205
- 0035121145
- Concatenative synthesis based on a harmonic model
- Jan.
- D. O'Brien and A. I. C. Monaghan, "Concatenative synthesis based on a harmonic model," IEEE Trans. Speech Audio Processing, vol. 9, pp. 11-20, Jan. 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 11-20
- O'Brien, D.¹ Monaghan, A.I.C.²

206
- 0036663550
- Stochastic natural language generation for spoken dialog systems
- A. Oh and A. Rudnicky, "Stochastic natural language generation for spoken dialog systems," Comput. Speech Lang., vol. 16, pp. 387-407, 2002.
- (2002) Comput. Speech Lang. , vol.16 , pp. 387-407
- Oh, A.¹ Rudnicky, A.²

207
- 84892189317
- Multi-band speech recognition in noisy environments
- S. Okawa, E. Bocchieri, and A. Potamianos, "Multi-band speech recognition in noisy environments," in Proc. IEEE ICASSP, 1998, pp. 641-644.
- (1998) Proc. IEEE ICASSP , pp. 641-644
- Okawa, S.¹ Bocchieri, E.² Potamianos, A.³

208
- 84966368599
- A rule-based text-to-speech system for Portuguese
- L. Oliveira, C. Viana, and I. Trancoso, "A rule-based text-to-speech system for Portuguese," in Proc. IEEE ICASSP, vol. 2, 1992, pp. 73-76.
- (1992) Proc. IEEE ICASSP , vol.2 , pp. 73-76
- Oliveira, L.¹ Viana, C.² Trancoso, I.³

209
- 0032142014
- Environmental conditions and acoustic transduction in hands-free robust speech recognition
- M. Omologo, P. Svaizer, and M. Matassoni, "Environmental conditions and acoustic transduction in hands-free robust speech recognition," Speech Commun., vol. 25, pp. 75-95, 1998.
- (1998) Speech Commun. , vol.25 , pp. 75-95
- Omologo, M.¹ Svaizer, P.² Matassoni, M.³

210
- 0030353329
- Spoken style explanation generator for Japanese Kanji using a text-to-speech system
- Y. Ooyama, H. Asano, and K. Matsuoka, "Spoken style explanation generator for Japanese Kanji using a text-to-speech system," in Proc. ICSLP, 1996, pp. 1369-1372.
- (1996) Proc. ICSLP , pp. 1369-1372
- Ooyama, Y.¹ Asano, H.² Matsuoka, K.³

211
- 0003513556
- Englewood Cliffs, NJ: Prentice-Hall
- A. Oppenheim and R. Schafer, Discrete-Time Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1989.
- (1989) Discrete-time Signal Processing
- Oppenheim, A.¹ Schafer, R.²

212
- 0034321517
- The time-conditioned approach in dynamic programming search for LVCSR
- Nov.
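- S. Ortmanns and H. Ney, "The time-conditioned approach in dynamic programming search for LVCSR," IEEE Trans. Speech Audio Processing, vol. 8, pp. 676-687, Nov. 2000.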
- (2000) IEEE Trans. Speech Audio Processing , vol.8 , pp. 676-687
- Ortmanns, S.¹ Ney, H.²

213
- 0030719155
- A word graph algorithm for large vocabulary continuous speech recognition
- _, "A word graph algorithm for large vocabulary continuous speech recognition," Comput. Speech Lang., vol. 11, pp. 43-72, 1997.
- (1997) Comput. Speech Lang. , vol.11 , pp. 43-72

214
- 0016090735
- Consonant durations in clusters
- Aug.
- D. O'Shaughnessy, "Consonant durations in clusters," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-22, pp. 282-295, Aug. 1974.
- (1974) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-22 , pp. 282-295
- O'Shaughnessy, D.¹

215
- 0001098818
- Linguistic features in fundamental frequency patterns
- _, "Linguistic features in fundamental frequency patterns," J. Phonetics, vol. 7, pp. 119-145, 1979.
- (1979) J. Phonetics , vol.7 , pp. 119-145

216
- 0003522447
- Piscataway, NJ: IEEE Press
- _, Speech Communications: Human and Machine. Piscataway, NJ: IEEE Press, 2000.
- (2000) Speech Communications: Human and Machine

217
- 0030245363
- From HMM's to segment models: A unified view of stochastic modeling for speech recognition
- Sept.
- M. Ostendorf, V. Digalakis, and O. Kimball, "From HMM's to segment models: A unified view of stochastic modeling for speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 360-378, Sept. 1996.
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 360-378
- Ostendorf, M.¹ Digalakis, V.² Kimball, O.³

218
- 0031704151
- Speaker clustering and transformation for speaker adaptation in speech recognition systems
- Jan.
- M. Padmanabhan, L. Bahl, D. Nahamoo, and M. Picheny, "Speaker clustering and transformation for speaker adaptation in speech recognition systems," IEEE Trans. Speech Audio Processing, vol. 6, pp. 71-77, Jan. 1998.
- (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 71-77
- Padmanabhan, M.¹ Bahl, L.² Nahamoo, D.³ Picheny, M.⁴

219
- 0030638045
- Spectral subband centroids as features for speech recognition
- K. Paliwal, "Spectral subband centroids as features for speech recognition," in IEEE Workshop Speech Recognition, 1997, pp. 124-131.
- (1997) IEEE Workshop Speech Recognition , pp. 124-131
- Paliwal, K.¹

220
- 0026746948
- On automatic estimation of articulatory parameters in a text-to-speech system
- S. Parthasarathy and C. Coker, "On automatic estimation of articulatory parameters in a text-to-speech system," Comput. Speech Lang., vol. 6, pp. 37-76, 1992.
- (1992) Comput. Speech Lang. , vol.6 , pp. 37-76
- Parthasarathy, S.¹ Coker, C.²

221
- 0022227186
- Training of HMM recognizers by simulated annealing
- D. Paul, "Training of HMM recognizers by simulated annealing," in Proc. IEEE ICASSP, 1985, pp. 13-16.
- (1985) Proc. IEEE ICASSP , pp. 13-16
- Paul, D.¹

222
- 85036510306
- Bell Laboratories Russian text-to-speech system
- E. Pavlova, Y. Pavlov, R. Sproat, C. Shih, and P. van Santen, "Bell Laboratories Russian text-to-speech system," Comput. Speech Lang., pp. 6, 37-76, 1997.
- (1997) Comput. Speech Lang. , pp. 6
- Pavlova, E.¹ Pavlov, Y.² Sproat, R.³ Shih, C.⁴ Van Santen, P.⁵

223
- 0027659197
- Signal modeling techniques in speech recognition
- Sept.
- J. Picone, "Signal modeling techniques in speech recognition," Proc. IEEE, vol. 81, pp. 1215-1247, Sept. 1993.
- (1993) Proc. IEEE , vol.81 , pp. 1215-1247
- Picone, J.¹

224
- 0024884765
- Formant speech synthesis: Improving production quality
- Dec.
- N. Pinto, D. Childers, and A. Lalwani, "Formant speech synthesis: Improving production quality," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1870-1887, Dec. 1989.
- (1989) IEEE Trans. Acoust., Speech, Signal Processing , vol.37 , pp. 1870-1887
- Pinto, N.¹ Childers, D.² Lalwani, A.³

225
- 0022148789
- Perception of synthetic speech generated by rule
- Nov.
- D. Pisoni, H. Nusbaum, and B. Greene, "Perception of synthetic speech generated by rule," Proc. IEEE, vol. 73, pp. 1665-1676, Nov. 1985.
- (1985) Proc. IEEE , vol.73 , pp. 1665-1676
- Pisoni, D.¹ Nusbaum, H.² Greene, B.³

226
- 0041673687
- Quality assessment of text-to-speech synthesis by rule
- S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
- L. Pols, "Quality assessment of text-to-speech synthesis by rule," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 387-416.
- (1992) Advances in Speech Signal Processing , pp. 387-416
- Pols, L.¹

227
- 0023834849
- Hidden Markov models: A guided tour
- A. Poritz, "Hidden Markov models: A guided tour," in Proc. IEEE ICASSP, 1988, pp. 7-13.
- (1988) Proc. IEEE ICASSP , pp. 7-13
- Poritz, A.¹

228
- 85135358811
- Structure and representation of an inventory for German speech synthesis
- T. Portele, F. Höfer, and W. Hess, "Structure and representation of an inventory for German speech synthesis," in Proc. ICSLP, 1994, pp. 1759-1762.
- (1994) Proc. ICSLP , pp. 1759-1762
- Portele, T.¹ Höfer, F.² Hess, W.³

229
- 0003560513
- Englewood Cliffs, NJ: Prentice-Hall
- S. Quackenbush, T. Barnwell, and M. Clements, Objective Measures for Speech Quality. Englewood Cliffs, NJ: Prentice-Hall, 1988.
- (1988) Objective Measures for Speech Quality
- Quackenbush, S.¹ Barnwell, T.² Clements, M.³

230
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Feb.
- L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, pp. 257-286, Feb. 1989.
- (1989) Proc. IEEE , vol.77 , pp. 257-286
- Rabiner, L.¹

231
- 0020735346
- On the application of vector quantization and hidden Markov models to speaker-independent, isolated word recognition
- L. Rabiner, S. Levinson, and M. Sondhi, "On the application of vector quantization and hidden Markov models to speaker-independent, isolated word recognition," Bell Syst. Tech. J., vol. 62, pp. 1075-1105, 1983.
- (1983) Bell Syst. Tech. J. , vol.62 , pp. 1075-1105
- Rabiner, L.¹ Levinson, S.² Sondhi, M.³

232
- 0021407797
- On the use of hidden Markov models for speaker-independent recognition of isolated words from a medium-size vocabulary
- _, "On the use of hidden Markov models for speaker-independent recognition of isolated words from a medium-size vocabulary," AT&T Bell Labs Tech. J., vol. 63, pp. 627-641, 1984.
- (1984) AT&T Bell Labs Tech. J. , vol.63 , pp. 627-641

233
- 0004244302
- Englewood Cliffs, NJ: Prentice-Hall
- L. Rabiner and B. Juang, Fundamentals of Speech Recognition. Englewood Cliffs, NJ: Prentice-Hall, 1993.
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.¹ Juang, B.²

234
- 0030127017
- Signal conditioning techniques for robust speech recognition
- Apr.
- M. Rahim, B.-H. Juang, W. Chou, and E. Buhrke, "Signal conditioning techniques for robust speech recognition," IEEE Signal Processing Lett., vol. 3, pp. 107-109, Apr. 1996.
- (1996) IEEE Signal Processing Lett. , vol.3 , pp. 107-109
- Rahim, M.¹ Juang, B.-H.² Chou, W.³ Buhrke, E.⁴

235
- 0029769867
- Signal bias removal by maximum likelihood estimation for robust telephone speech recognition
- Jan.
- M. Rahim and B.-H. Juang, "Signal bias removal by maximum likelihood estimation for robust telephone speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 19-30, Jan. 1996.
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 19-30
- Rahim, M.¹ Juang, B.-H.²

236
- 0030962747
- A study on robust utterance verification for connected digits recognition
- M. Rahim, C.-H. Lee, and B.-H. Juang, "A study on robust utterance verification for connected digits recognition," J. Acoust. Soc. Amer., vol. 101, pp. 2892-2902, 1997.
- (1997) J. Acoust. Soc. Amer. , vol.101 , pp. 2892-2902
- Rahim, M.¹ Lee, C.-H.² Juang, B.-H.³

237
- 0035248922
- Deterministically annealed design of hidden Markov model speech recognizers
- Feb.
- A. Rao and K. Rose, "Deterministically annealed design of hidden Markov model speech recognizers," IEEE Trans. Speech Audio Processing, vol. 9, pp. 111-126, Feb. 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 111-126
- Rao, A.¹ Rose, K.²

238
- 0035396159
- A maximum a posteriori approach to speaker adaptation using the trended hidden Markov model
- July
- C. Rathinavelu and L. Deng, "A maximum a posteriori approach to speaker adaptation using the trended hidden Markov model," IEEE Trans. Speech Audio Processing, vol. 9, pp. 549-557, July 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 549-557
- Rathinavelu, C.¹ Deng, L.²

239
- 0016939166
- Speech recognition by machine: A review
- Apr.
- R. Reddy, "Speech recognition by machine: A review," Proc. IEEE, vol. 64, pp. 501-531, Apr. 1976.
- (1976) Proc. IEEE , vol.64 , pp. 501-531
- Reddy, R.¹

240
- 0034273299
- Robust decision tree state tying for continuous speech recognition
- Sept.
- W. Reichl and W. Chou, "Robust decision tree state tying for continuous speech recognition," IEEE Trans. Speech Audio Processing, vol. 8, pp. 555-566, Sept. 2000.
- (2000) IEEE Trans. Speech Audio Processing , vol.8 , pp. 555-566
- Reichl, W.¹ Chou, W.²

241
- 0029733137
- Phone deactivation pruning in large vocabulary continuous speech recognition
- Jan.
- S. Renals, "Phone deactivation pruning in large vocabulary continuous speech recognition," IEEE Signal Processing Lett., vol. 3, pp. 4-6, Jan. 1996.
- (1996) IEEE Signal Processing Lett. , vol.3 , pp. 4-6
- Renals, S.¹

242
- 0032675736
- The HDM: A segmental hidden dynamic model of coarticulation
- H. B. Richards and J. S. Bridle, "The HDM: A segmental hidden dynamic model of coarticulation," in Proc. IEEE ICASSP, vol. 1, 1999, pp. 357-360.
- (1999) Proc. IEEE ICASSP , vol.1 , pp. 357-360
- Richards, H.B.¹ Bridle, J.S.²

243
- 0012075535
- Evaluation of speech synthesis systems for Dutch in telecommunication applications in GSM and PSTN networks
- T. Rietveld et al., "Evaluation of speech synthesis systems for Dutch in telecommunication applications in GSM and PSTN networks," in Proc. Eurospeech, 1997, pp. 577-580.
- (1997) Proc. Eurospeech , pp. 577-580
- Rietveld, T.¹

244
- 0029386354
- Keyword detection in conversational speech utterances using hidden Markov model based continuous speech recognition
- R. Rose, "Keyword detection in conversational speech utterances using hidden Markov model based continuous speech recognition," Comput. Speech Lang., vol. 9, pp. 309-333, 1995.
- (1995) Comput. Speech Lang. , vol.9 , pp. 309-333
- Rose, R.¹

245
- 0030008004
- The potential role of speech production models in automatic speech recognition
- R. Rose, J. Schroeter, and M. Sondhi, "The potential role of speech production models in automatic speech recognition," J. Acoust. Soc. Amer., vol. 99, no. 3, pp. 1699-1709, 1996.
- (1996) J. Acoust. Soc. Amer. , vol.99 , Issue.3 , pp. 1699-1709
- Rose, R.¹ Schroeter, J.² Sondhi, M.³

246
- 0020766631
- Demisyllable-based isolated word recognition systems
- June
- A. Rosenberg, L. Rabiner, J. Wilpon, and D. Kahn, "Demisyllable-based isolated word recognition systems," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 713-726, June 1983.
- (1983) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-31 , pp. 713-726
- Rosenberg, A.¹ Rabiner, L.² Wilpon, J.³ Kahn, D.⁴

247
- 0030181951
- A maximum entropy approach to adaptive statistical language modeling
- R. Rosenfeld, "A maximum entropy approach to adaptive statistical language modeling," Comput. Speech Lang., vol. 10, pp. 187-228, 1996.
- (1996) Comput. Speech Lang. , vol.10 , pp. 187-228
- Rosenfeld, R.¹

248
- 0032665603
- A dynamical system model for generating fundamental frequency for speech synthesis
- May
- K. Ross and M. Ostendorf, "A dynamical system model for generating fundamental frequency for speech synthesis," IEEE Trans. Speech Audio Processing, vol. 7, pp. 295-309, May 1999.
- (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 295-309
- Ross, K.¹ Ostendorf, M.²

249
- 0019606728
- An articulatory synthesizer for perceptual research
- P. Rubin, T. Baer, and P. Mermelstein, "An articulatory synthesizer for perceptual research," J. Acoust. Soc. Amer., vol. 70, pp. 321-328, 1981.
- (1981) J. Acoust. Soc. Amer. , vol.70 , pp. 321-328
- Rubin, P.¹ Baer, T.² Mermelstein, P.³

250
- 0031096901
- Linear trajectory segmental HMMs
- Mar.
- M. Russell and W. Holmes, "Linear trajectory segmental HMMs," IEEE Signal Processing Lett., vol. 4, pp. 72-74, Mar. 1997.
- (1997) IEEE Signal Processing Lett. , vol.4 , pp. 72-74
- Russell, M.¹ Holmes, W.²

251
- 0025229803
- Speech synthesis from text
- Jan.
- Y. Sagisaka, "Speech synthesis from text," IEEE Commun. Mag., vol. 28, pp. 35-41, Jan. 1990.
- (1990) IEEE Commun. Mag. , vol.28 , pp. 35-41
- Sagisaka, Y.¹

252
- 0032051044
- Pre-recognition measures of speaking rate
- K. Samudravijaya, S. Singh, and P. Rao, "Pre-recognition measures of speaking rate," Speech Commun., vol. 24, pp. 73-84, 1998.
- (1998) Speech Commun. , vol.24 , pp. 73-84
- Samudravijaya, K.¹ Singh, S.² Rao, P.³

253
- 0029239090
- A comparative study of mel cepstra and EIH for phone classification under adverse conditions
- S. Sandhu and O. Ghitza, "A comparative study of mel cepstra and EIH for phone classification under adverse conditions," in Proc. IEEE ICASSP, 1995, pp. 409-412.
- (1995) Proc. IEEE ICASSP , pp. 409-412
- Sandhu, S.¹ Ghitza, O.²

254
- 0033677121
- Maximum likelihood discriminant feature spaces
- G. Saon, M. Padmanabhan, R. Gopinath, and S. Chen, "Maximum likelihood discriminant feature spaces," in Proc. IEEE ICASSP, 2000, pp. 1129-1132.
- (2000) Proc. IEEE ICASSP , pp. 1129-1132
- Saon, G.¹ Padmanabhan, M.² Gopinath, R.³ Chen, S.⁴

255
- 0030648077
- Construction and evaluation of a robust multifeature speech/music discriminator
- E. Scheirer and M. Slaney, "Construction and evaluation of a robust multifeature speech/music discriminator," in Proc. IEEE ICASSP, 1997, pp. 1331-1334.
- (1997) Proc. IEEE ICASSP , pp. 1331-1334
- Scheirer, E.¹ Slaney, M.²

256
- 0028259480
- Techniques for estimating vocal tract shapes from the speech signal
- Jan.
- J. Schroeter and M. Sondhi, "Techniques for estimating vocal tract shapes from the speech signal," IEEE Trans. Speech Audio Processing, vol. 2, pp. 133-150, Jan. 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 133-150
- Schroeter, J.¹ Sondhi, M.²

257
- 0002788850
- Multipass search strategies
- C.-H. Lee et al., Eds. Boston, MA: Kluwer, ch. 18
- R. Schwartz et al., "Multipass search strategies," in Automatic Speech and Speaker Recognition, C.-H. Lee et al., Eds. Boston, MA: Kluwer, 1995, ch. 18.
- (1995) Automatic Speech and Speaker Recognition
- Schwartz, R.¹

258
- 85014377643
- TINA: A natural language system for spoken language applications
- S. Seneff, "TINA: A natural language system for spoken language applications," Comput. Linguist., vol. 18, pp. 61-86, 1992.
- (1992) Comput. Linguist. , vol.18 , pp. 61-86
- Seneff, S.¹

259
- 0036289978
- Real-time speech synthesis on an ultra-low resource, programmable DSP system
- H. Sheikhzadeh, E. Cornu, R. Brennan, and T. Schneider, "Real-time speech synthesis on an ultra-low resource, programmable DSP system," in Proc. IEEE ICASSP, vol. 1, 2002, pp. 433-436.
- (2002) Proc. IEEE ICASSP , vol.1 , pp. 433-436
- Sheikhzadeh, H.¹ Cornu, E.² Brennan, R.³ Schneider, T.⁴

260
- 2642702399
- Spectrum distance measures for speech recognition
- S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
- K. Shikano and F. Itakura, "Spectrum distance measures for speech recognition," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 419-452.
- (1992) Advances in Speech Signal Processing , pp. 419-452
- Shikano, K.¹ Itakura, F.²

261
- 0030247984
- Computer lipreading for improved accuracy in automatic speech recognition
- Sept.
- P. Silsbee and A. Bovik, "Computer lipreading for improved accuracy in automatic speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 337-351, Sept. 1996.
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 337-351
- Silsbee, P.¹ Bovik, A.²

262
- 0030165492
- Comparative experiments of several adaptation approaches to noisy speech recognition using stochastic trajectory models
- O. Siohan, Y. Gong, and J.-P. Haton, "Comparative experiments of several adaptation approaches to noisy speech recognition using stochastic trajectory models," Speech Commun., vol. 18, pp. 335-352, 1996.
- (1996) Speech Commun. , vol.18 , pp. 335-352
- Siohan, O.¹ Gong, Y.² Haton, J.-P.³

263
- 0030816220
- Incorporating phonetic properties in hidden Markov models for speech recognition
- R. Sitaram and T. Sreenivas, "Incorporating phonetic properties in hidden Markov models for speech recognition," J. Acoust. Soc. Amer., vol. 102, pp. 1149-1158, 1997.
- (1997) J. Acoust. Soc. Amer. , vol.102 , pp. 1149-1158
- Sitaram, R.¹ Sreenivas, T.²

264
- 0347387932
- On the importance of the microphone position for speech recognition in the car
- J. Smolders, T. Claes, G. Sablon, and D. Van Compernolle, "On the importance of the microphone position for speech recognition in the car," in Proc. IEEE ICASSP, vol. 1, 1994, pp. 429-432.
- (1994) Proc. IEEE ICASSP , vol.1 , pp. 429-432
- Smolders, J.¹ Claes, T.² Sablon, G.³ Van Compernolle, D.⁴

265
- 0026370988
- A tree-trellis based search for finding the N best sentence hypotheses in continuous speech recognition
- F. Soong and E.-F. Huang, "A tree-trellis based search for finding the N best sentence hypotheses in continuous speech recognition," in Proc. IEEE ICASSP, 1991, pp. 705-708.
- (1991) Proc. IEEE ICASSP , pp. 705-708
- Soong, F.¹ Huang, E.-F.²

266
- 0004161686
- Boston, MA: Kluwer
- R. Sproat, Multi-Lingual Text-to-Speech Synthesis: The Bell Labs Approach. Boston, MA: Kluwer, 1998.
- (1998) Multi-lingual Text-to-speech Synthesis: The Bell Labs Approach
- Sproat, R.¹

267
- 0029352735
- Continuous speech dictation - From theory to practice
- V. Steinbiss et al., "Continuous speech dictation - from theory to practice," Speech Commun., vol. 17, pp. 19-38, 1995.
- (1995) Speech Commun. , vol.17 , pp. 19-38
- Steinbiss, V.¹

268
- 0031238095
- A model of dynamic auditory perception and its application to robust word recognition
- Sept.
- B. Strope and A. Alwan, "A model of dynamic auditory perception and its application to robust word recognition," IEEE Trans. Speech Audio Processing, vol. 5, pp. 451-464, Sept. 1997.
- (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 451-464
- Strope, B.¹ Alwan, A.²

269
- 84892158557
- Robust word recognition using threaded spectral peaks
- _, "Robust word recognition using threaded spectral peaks," in Proc. IEEE ICASSP, 1998, pp. 625-628.
- (1998) Proc. IEEE ICASSP , pp. 625-628

270
- 0035279124
- Removing linear phase mismatches in concatenative speech synthesis
- Mar.
- Y. Stylianou, "Removing linear phase mismatches in concatenative speech synthesis," IEEE Trans. Speech Audio Processing, vol. 9, pp. 232-239, Mar. 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 232-239
- Stylianou, Y.¹

271
- 0028195650
- Speech recognition using weighted HMM and subspace projection approaches
- Jan.
- K.-Y. Su and C.-H. Lee, "Speech recognition using weighted HMM and subspace projection approaches," IEEE Trans. Speech Audio Processing, vol. 2, pp. 69-79, Jan. 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 69-79
- Su, K.-Y.¹ Lee, C.-H.²

272
- 0030287341
- Vocabulary-independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition
- Nov.
- R. Sukkar and C.-H. Lee, "Vocabulary-independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 420-429, Nov. 1996.
- (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 420-429
- Sukkar, R.¹ Lee, C.-H.²

273
- 0027316621
- Multi-microphone correlation-based processing for robust speech recognition
- T. Sullivan and R. Stern, "Multi-microphone correlation-based processing for robust speech recognition," in Proc. IEEE ICASSP, vol. 2, 1993, pp. 91-94.
- (1993) Proc. IEEE ICASSP , vol.2 , pp. 91-94
- Sullivan, T.¹ Stern, R.²

274
- 0031624617
- TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis
- A. Syrdal, Y. Stylianou, A. Conkie, and J. Schroeter, "TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis," in Proc. IEEE ICASSP, vol. 1, 1998, pp. 273-276.
- (1998) Proc. IEEE ICASSP , vol.1 , pp. 273-276
- Syrdal, A.¹ Stylianou, Y.² Conkie, A.³ Schroeter, J.⁴

275
- 0031624617
- TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis
- _, "TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis," in Proc. IEEE ICASSP, 1998, pp. 273-276.
- (1998) Proc. IEEE ICASSP , pp. 273-276

276
- 0031118076
- Vector-field-smoothed Bayesian learning for fast and incremental speaker/telephone-channel adaptation
- J. Takahashi and S. Sagayama, "Vector-field-smoothed Bayesian learning for fast and incremental speaker/telephone-channel adaptation," Comput. Speech Lang., vol. 11, pp. 127-146, 1997.
- (1997) Comput. Speech Lang. , vol.11 , pp. 127-146
- Takahashi, J.¹ Sagayama, S.²

277
- 21244459396
- An overview of different trends on CELP coding
- A. J. Ayuso and J. M. Soler, Eds. New York: Springer-Verlag
- I. Trancoso, "An overview of different trends on CELP coding," in Speech Recognition and Coding: New Advances and Trends, A. J. Ayuso and J. M. Soler, Eds. New York: Springer-Verlag, 1995, pp. 351-368.
- (1995) Speech Recognition and Coding: New Advances and Trends , pp. 351-368
- Trancoso, I.¹

278
- 0032761999
- Scale transform in speech analysis
- Jan.
- S. Umesh, L. Cohen, N. Marinovic, and D. Nelson, "Scale transform in speech analysis," IEEE Trans. Speech Audio Processing, vol. 7, pp. 40-45, Jan. 1998.
- (1998) IEEE Trans. Speech Audio Processing , vol.7 , pp. 40-45
- Umesh, S.¹ Cohen, L.² Marinovic, N.³ Nelson, D.⁴

279
- 0027147339
- Perceptual experiments for diagnostic testing of text-to-speech systems
- J. van Santen, "Perceptual experiments for diagnostic testing of text-to-speech systems," Comput. Speech Lang., vol. 7, pp. 49-100, 1993.
- (1993) Comput. Speech Lang. , vol.7 , pp. 49-100
- Van Santen, J.¹

280
- 0032296808
- A stochastic model of intonation for text-to-speech synthesis
- J. Véronis, P. di Cristo, F. Courtois, and C. Chaumette, "A stochastic model of intonation for text-to-speech synthesis," Speech Commun., vol. 26, pp. 233-244, 1998.
- (1998) Speech Commun. , vol.26 , pp. 233-244
- Véronis, J.¹ Di Cristo, P.² Courtois, F.³ Chaumette, C.⁴

281
- 0017482612
- Normalization of vowels by vocal-tract length and its application to vowel identification
- Apr.
- H. Wakita, "Normalization of vowels by vocal-tract length and its application to vowel identification," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-25, pp. 183-192, Apr. 1977.
- (1977) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-25 , pp. 183-192
- Wakita, H.¹

282
- 0036753897
- Speaker adaptive modeling by vocal tract normalization
- Sept.
- L. Welling, H. Ney, and S. Kanthak, "Speaker adaptive modeling by vocal tract normalization," IEEE Trans. Speech Audio Processing, vol. 10, pp. 415-426, Sept. 2002.
- (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 415-426
- Welling, L.¹ Ney, H.² Kanthak, S.³

283
- 0031647965
- Formant estimation for speech recognition
- Jan.
- L. Welling and H. Ney, "Formant estimation for speech recognition," IEEE Trans. Speech Audio Processing, vol. 6, pp. 36-48, Jan. 1998.
- (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 36-48
- Welling, L.¹ Ney, H.²

284
- 0025517070
- Automatic recognition of keywords in unconstrained speech using hidden Markov models
- Nov.
- J. Wilpon, L. Rabiner, C.-H. Lee, and E. Goldman, "Automatic recognition of keywords in unconstrained speech using hidden Markov models," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1870-1878, Nov. 1990.
- (1990) IEEE Trans. Acoust., Speech, Signal Processing , vol.38 , pp. 1870-1878
- Wilpon, J.¹ Rabiner, L.² Lee, C.-H.³ Goldman, E.⁴

285
- 0035124445
- Control of spectral dynamics in concatenative speech synthesis
- Jan.
- J. Wouters and M. Macon, "Control of spectral dynamics in concatenative speech synthesis," IEEE Trans. Speech Audio Processing, vol. 9, pp. 30-38, Jan. 2001.
- (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 30-38
- Wouters, J.¹ Macon, M.²

286
- 0030718943
- Multilingual large vocabulary speech recognition: The European SQUALE project
- S. Young et al., "Multilingual large vocabulary speech recognition: The European SQUALE project," Comput. Speech Lang., vol. 11, pp. 73-89, 1997.
- (1997) Comput. Speech Lang. , vol.11 , pp. 73-89
- Young, S.¹

287
- 0032181247
- Speech recognition evaluation: A review of the U.S. CSR and LVCSR programmes
- S. Young and L. Chase, "Speech recognition evaluation: A review of the U.S. CSR and LVCSR programmes," Comput. Speech Lang., vol. 12, pp. 263-279, 1998.
- (1998) Comput. Speech Lang. , vol.12 , pp. 263-279
- Young, S.¹ Chase, L.²

288
- 0028460810
- An acoustic-phonetic-based speaker-adaptation technique for improving speaker-independent continuous speech recognition
- July
- Y. Zhao, "An acoustic-phonetic-based speaker-adaptation technique for improving speaker-independent continuous speech recognition," IEEE Trans. Speech Audio Processing, vol. 2, pp. 380-394, July 1994.
- (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 380-394
- Zhao, Y.¹

289
- 0022151324
- The use of speech knowledge in automatic speech recognition
- Nov.
- V. Zue, "The use of speech knowledge in automatic speech recognition," Proc. IEEE, vol. 73, pp. 1602-1615, Nov. 1985.
- (1985) Proc. IEEE , vol.73 , pp. 1602-1615
- Zue, V.¹

290
- 85036693931
- Conversational interfaces: Advances and challenges
- _, "Conversational interfaces: Advances and challenges," in Proc. Eurospeech, 1997, pp. KN-9-18.
- (1997) Proc. Eurospeech

291
- 21244470119
- Peripheral preprocessing in hearing and psychoacoustics as guidelines for speech recognition
- E. Zwicker, "Peripheral preprocessing in hearing and psychoacoustics as guidelines for speech recognition," in Proc. Montreal Symp. Speech Recognition, 1986, pp. 1-4.
- (1986) Proc. Montreal Symp. Speech Recognition , pp. 1-4
- Zwicker, E.¹

292
- 0018437122
- Automatic speech recognition using psychoacoustic models
- E. Zwicker, E. Terhardt, and E. Paulus, "Automatic speech recognition using psychoacoustic models," J. Acoust. Soc. Amer., vol. 65, pp. 487-498, 1979.
- (1979) J. Acoust. Soc. Amer. , vol.65 , pp. 487-498
- Zwicker, E.¹ Terhardt, E.² Paulus, E.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.