Volume 91, Issue 9, 2003, Pages 1272-1305

Interacting with computers by voice: Automatic speech recognition and synthesis

Author keywords

Continuous speech recognition; Distance measures; Hidden Markov models (HMMs); Human computer dialogues; Language models (LMs); Linear predictive coding (LPC); Spectral analysis; Speech synthesis; Text to speech (TTS)

Indexed keywords

ANIMATION; COMPUTER PROGRAMMING LANGUAGES; COMPUTER SIMULATION; CONTINUOUS SPEECH RECOGNITION; DATABASE SYSTEMS; DISTANCE MEASUREMENT; FOURIER TRANSFORMS; MARKOV PROCESSES; MATHEMATICAL MODELS; SPECTRUM ANALYSIS; SPEECH SYNTHESIS; TEXT PROCESSING;

EID: 4944252269     PISSN: 00189219     EISSN: None     Source Type: Journal    
DOI: 10.1109/JPROC.2003.817117     Document Type: Conference Paper
Times cited: 88

References (292)
  • 2
    • 85009113852
    • HMM adaptation using vector Taylor series for noisy speech recognition
    • A. Acero, L. Deng, T. Kristjansson, and J. Zhang, "HMM adaptation using vector Taylor series for noisy speech recognition," in Proc. ICSLP, vol. 3, 2000, pp. 869-872.
    • (2000) Proc. ICSLP , vol.3 , pp. 869-872
    • Acero, A.1    Deng, L.2    Kristjansson, T.3    Zhang, J.4
  • 4
    • 0006137783
    • Toward synthesis of Hindi consonants using Klsyn88
    • S. Agrawal and K. Stevens, "Toward synthesis of Hindi consonants using Klsyn88," in Proc. ICSLP, 1992, pp. 177-180.
    • (1992) Proc. ICSLP , pp. 177-180
    • Agrawal, S.1    Stevens, K.2
  • 5
    • 0031177213
    • Combined Bayesian and predictive techniques for rapid speaker adaptation of continuous density hidden Markov models
    • S. Ahadi and P. Woodland, "Combined Bayesian and predictive techniques for rapid speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 11, pp. 187-206, 1997.
    • (1997) Comput. Speech Lang. , vol.11 , pp. 187-206
    • Ahadi, S.1    Woodland, P.2
  • 6
    • 0030037151
    • Cepstral representation of speech motivated by time-frequency masking: An application to speech recognition
    • K. Aikawa, H. Singer, H. Kawahara, and Y. Tohkura, "Cepstral representation of speech motivated by time-frequency masking: An application to speech recognition," J. Acoust. Soc. Amer., vol. 100, pp. 603-614, 1996.
    • (1996) J. Acoust. Soc. Amer. , vol.100 , pp. 603-614
    • Aikawa, K.1    Singer, H.2    Kawahara, H.3    Tohkura, Y.4
  • 7
    • 0030369319
    • Archisegment-based letter-to-phone conversion for concatenative speech synthesis in Portuguese
    • E. Albano and A. Moreira, "Archisegment-based letter-to-phone conversion for concatenative speech synthesis in Portuguese," in Proc. ICSLP, 1996, pp. 1708-1711.
    • (1996) Proc. ICSLP , pp. 1708-1711
    • Albano, E.1    Moreira, A.2
  • 8
    • 0011138907
    • Overview of text-to-speech systems
    • S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
    • J. Allen, "Overview of text-to-speech systems," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 741-790.
    • (1992) Advances in Speech Signal Processing , pp. 741-790
    • Allen, J.1
  • 9
    • 0028516073
    • How do humans process and recognize speech?
    • _, "How do humans process and recognize speech?," IEEE Trans. Speech Audio Processing, vol. 2, pp. 567-577, 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 567-577
  • 10
    • 0032762247
    • Selective training for hidden Markov models with applications to speech coding
    • Oct.
    • L. Arslan and J. Hansen, "Selective training for hidden Markov models with applications to speech coding," IEEE Trans. Speech Audio Processing vol. 7, pp. 46-54, Oct. 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 46-54
    • Arslan, L.1    Hansen, J.2
  • 11
    • 0032045825
    • Phonemic transcription by analogy in text-to-speech synthesis: Novel word pronunciation and lexicon compression
    • P. Bagshaw, "Phonemic transcription by analogy in text-to-speech synthesis: Novel word pronunciation and lexicon compression," Comput. Speech Lang., vol. 12, pp. 119-142, 1998.
    • (1998) Comput. Speech Lang. , vol.12 , pp. 119-142
    • Bagshaw, P.1
  • 13
    • 0022890536
    • Maximum mutual information estimation of hidden Markov model parameters for speech recognition
    • L. Bahl, P. Brown, P. de Souza, and R. Mercer, "Maximum mutual information estimation of hidden Markov model parameters for speech recognition," Proc. IEEE ICASSP, pp. 49-52, 1986.
    • (1986) Proc. IEEE ICASSP , pp. 49-52
    • Bahl, L.1    Brown, P.2    De Souza, P.3    Mercer, R.4
  • 15
    • 0016615529
    • Decoding for channels with insertions, deletions, and substitutions with applications to speech recognition
    • July
    • L. Bahl and F. Jelinek, "Decoding for channels with insertions, deletions, and substitutions with applications to speech recognition," IEEE Trans. Inform. Theory, vol. IT-21, pp. 404-411, July 1975.
    • (1975) IEEE Trans. Inform. Theory , vol.IT-21 , pp. 404-411
    • Bahl, L.1    Jelinek, F.2
  • 16
    • 0020719320
    • A maximum likelihood approach to continuous speech recognition
    • Mar.
    • L. Bahl, F. Jelinek, and R. Mercer, "A maximum likelihood approach to continuous speech recognition," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-5, pp. 179-190, Mar. 1983.
    • (1983) IEEE Trans. Pattern Anal. Machine Intell. , vol.PAMI-5 , pp. 179-190
    • Bahl, L.1    Jelinek, F.2    Mercer, R.3
  • 18
    • 0001862769
    • An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes
    • L. E. Baum, "An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes," Inequalities, vol. 3, pp. 1-8, 1972.
    • (1972) Inequalities , vol.3 , pp. 1-8
    • Baum, L.E.1
  • 19
    • 0040852401
    • Rule-based grapheme-to-phoneme conversion of names
    • K. Belhoula, "Rule-based grapheme-to-phoneme conversion of names," in Proc. Eurospeech, 1993, pp. 881-884.
    • (1993) Proc. Eurospeech , pp. 881-884
    • Belhoula, K.1
  • 20
    • 84892176357
    • Exploiting both local and global constraints for multi-span statistical language modeling
    • J. Bellegarda, "Exploiting both local and global constraints for multi-span statistical language modeling," in Proc. IEEE ICASSP, 1998, pp. 677-680.
    • (1998) Proc. IEEE ICASSP , pp. 677-680
    • Bellegarda, J.1
  • 22
    • 84966366503
    • Rapid unit selection from a large speech corpus for concatenative speech synthesis
    • M. Beutnagel, M. Mohri, and M. Riley, "Rapid unit selection from a large speech corpus for concatenative speech synthesis," in Proc. Eurospeech, 1999, pp. 607-610.
    • (1999) Proc. Eurospeech , pp. 607-610
    • Beutnagel, M.1    Mohri, M.2    Riley, M.3
  • 23
    • 0027228898
    • Multilingual PSOLA text-to-speech system
    • D. Bigorgne et al., "Multilingual PSOLA text-to-speech system," in Proc. IEEE ICASSP, vol. 2, 1993, pp. 187-190.
    • (1993) Proc. IEEE ICASSP , vol.2 , pp. 187-190
    • Bigorgne, D.1
  • 25
    • 85133526552
    • Automatically clustering similar units for unit selection in speech synthesis
    • A. Black and P. Taylor, "Automatically clustering similar units for unit selection in speech synthesis," in Proc. Eurospeech, 1997, pp. 601-604.
    • (1997) Proc. Eurospeech , pp. 601-604
    • Black, A.1    Taylor, P.2
  • 26
    • 0030142722
    • Toward increasing speech recognition error rates
    • H. Bourlard, H. Hermansky, and N. Morgan, "Toward increasing speech recognition error rates," Speech Commun., vol. 18, pp. 205-231, 1996.
    • (1996) Speech Commun. , vol.18 , pp. 205-231
    • Bourlard, H.1    Hermansky, H.2    Morgan, N.3
  • 27
    • 0022246330
    • Speaker dependent connected speech recognition via phonemic Markov models
    • H. Bourlard, Y. Kamp, and C. Wellekens, "Speaker dependent connected speech recognition via phonemic Markov models," in Proc. IEEE ICASSP, 1985, pp. 1213-1216.
    • (1985) Proc. IEEE ICASSP , pp. 1213-1216
    • Bourlard, H.1    Kamp, Y.2    Wellekens, C.3
  • 30
    • 0031675455
    • An algorithm for maximum likelihood estimation of hidden Markov models with unknown state-tying
    • Jan.
    • O. Cappé, C. Mokbel, D. Jouvet, and E. Moulines, "An algorithm for maximum likelihood estimation of hidden Markov models with unknown state-tying," IEEE Trans. Speech Audio Processing, vol. 6, pp. 61-70, Jan. 1998.
    • (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 61-70
    • Cappé, O.1    Mokbel, C.2    Jouvet, D.3    Moulines, E.4
  • 31
    • 84969173798
    • Segmentation and modeling in segment-based recognition
    • J. Chang and J. Glass, "Segmentation and modeling in segment-based recognition," in Proc. Eurospeech, 1997, pp. 1199-1202.
    • (1997) Proc. Eurospeech , pp. 1199-1202
    • Chang, J.1    Glass, J.2
  • 32
    • 0033329799
    • An empirical study of smoothing techniques for language modeling
    • J. Chen and J. Goodman, "An empirical study of smoothing techniques for language modeling," Comput. Speech Lang., vol. 13, pp. 359-394, 1999.
    • (1999) Comput. Speech Lang. , vol.13 , pp. 359-394
    • Chen, J.1    Goodman, J.2
  • 33
    • 0031146514
    • HMM-based speech recognition using state-dependent, discriminatively derived transforms on mel-warped DFT features
    • May
    • R. Chengalvarayan and L. Deng, "HMM-based speech recognition using state-dependent, discriminatively derived transforms on mel-warped DFT features," IEEE Trans. Speech Audio Processing, vol. 5, pp. 243-256, May 1997.
    • (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 243-256
    • Chengalvarayan, R.1    Deng, L.2
  • 34
    • 0022888128
    • Stress assignment in letter-to-sound rules for speech synthesis
    • K. Church, "Stress assignment in letter-to-sound rules for speech synthesis," in Proc. IEEE ICASSP, 1986, pp. 2423-2426.
    • (1986) Proc. IEEE ICASSP , pp. 2423-2426
    • Church, K.1
  • 35
    • 21244486894
    • Dordrecht, The Netherlands: Kluwer
    • _, Parsing in Speech Recognition. Dordrecht, The Netherlands: Kluwer, 1987.
    • (1987) Parsing in Speech Recognition
  • 36
    • 0032204117
    • A novel feature transformation for vocal tract length normalization in automatic speech recognition
    • Nov.
    • T. Claes, J. Dologlou, L. ten Bosch, and D. van Compernolle, "A novel feature transformation for vocal tract length normalization in automatic speech recognition," IEEE Trans. Speech Audio Processing, vol. 6, pp. 549-557, Nov. 1998.
    • (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 549-557
    • Claes, T.1    Dologlou, J.2    Ten Bosch, L.3    Van Compernolle, D.4
  • 37
    • 0029230678
    • The challenge of spoken language systems: Research directions for the nineties
    • Jan.
    • R. Cole et al., "The challenge of spoken language systems: Research directions for the nineties," IEEE Trans. Speech Audio Processing, vol. 3, pp. 1-21, Jan. 1995.
    • (1995) IEEE Trans. Speech Audio Processing , vol.3 , pp. 1-21
    • Cole, R.1
  • 38
    • 0031161885
    • A hybrid algorithm for speaker adaptation using MAP transformation and adaptation
    • June
    • J.-T. Chien, C.-H. Lee, and H.-C. Wang, "A hybrid algorithm for speaker adaptation using MAP transformation and adaptation," IEEE Signal Processing Lett., vol. 4, pp. 167-169, June 1997.
    • (1997) IEEE Signal Processing Lett. , vol.4 , pp. 167-169
    • Chien, J.-T.1    Lee, C.-H.2    Wang, H.-C.3
  • 39
    • 0025786649
    • Voice quality factors: Analysis, synthesis & perception
    • D. Childers and C. Lee, "Voice quality factors: Analysis, synthesis & perception," J. Acoust. Soc. Amer., vol. 90, pp. 2394-2410, 1991.
    • (1991) J. Acoust. Soc. Amer. , vol.90 , pp. 2394-2410
    • Childers, D.1    Lee, C.2
  • 41
    • 0000767590
    • Discriminant-function-based minimum recognition error rate pattern-recognition approach to speech recognition
    • Aug.
    • W. Chou, B. Juang, and C.-H. Lee, "Discriminant-function-based minimum recognition error rate pattern-recognition approach to speech recognition," Proc. IEEE, vol. 88, pp. 1201-1223, Aug. 2000.
    • (2000) Proc. IEEE , vol.88 , pp. 1201-1223
    • Chou, W.1    Juang, B.2    Lee, C.-H.3
  • 42
    • 0017620899
    • Detecting and locating key words in continuous speech using linear predictive coding
    • Oct.
    • R. Christiansen and C. Rushforth, "Detecting and locating key words in continuous speech using linear predictive coding," IEEE Trans. Speech Audio Processing, vol. SAP-25, pp. 361-367, Oct. 1977.
    • (1977) IEEE Trans. Speech Audio Processing , vol.SAP-25 , pp. 361-367
    • Christiansen, R.1    Rushforth, C.2
  • 43
    • 0016940126
    • A model of articulatory dynamics and control
    • Apr.
    • C. Coker, "A model of articulatory dynamics and control," Proc. IEEE, vol. 64, pp. 452-460, Apr. 1976.
    • (1976) Proc. IEEE , vol.64 , pp. 452-460
    • Coker, C.1
  • 44
    • 0024392496
    • Application of an auditory model to speech recognition
    • J. R. Cohen, "Application of an auditory model to speech recognition," J. Acoust. Soc. Amer., vol. 85, no. 6, pp. 2623-2629, 1989.
    • (1989) J. Acoust. Soc. Amer. , vol.85 , Issue.6 , pp. 2623-2629
    • Cohen, J.R.1
  • 45
  • 46
    • 0030671924
    • Missing data techniques for robust speech recognition
    • M. Cooke, A. Morris, and P. Green, "Missing data techniques for robust speech recognition," in Proc. IEEE ICASSP, 1997, pp. 863-866.
    • (1997) Proc. IEEE ICASSP , pp. 863-866
    • Cooke, M.1    Morris, A.2    Green, P.3
  • 47
    • 0031222490
    • MMIE training of large vocabulary recognition systems
    • H. Cung and Y. Normandin, "MMIE training of large vocabulary recognition systems," Speech Commun., vol. 22, pp. 303-314, 1997.
    • (1997) Speech Commun. , vol.22 , pp. 303-314
    • Cung, H.1    Normandin, Y.2
  • 48
    • 0020191331
    • Some experiments in discrete utterance recognition
    • Oct.
    • S. Das, "Some experiments in discrete utterance recognition," IEEE Trans. Speech Audio Processing, vol. SAP-30, pp. 766-770, Oct. 1982.
    • (1982) IEEE Trans. Speech Audio Processing , vol.SAP-30 , pp. 766-770
    • Das, S.1
  • 49
    • 0031644298
    • Improvements in children's speech recognition performance
    • S. Das, D. Nix, and M. Picheny, "Improvements in children's speech recognition performance," in Proc. IEEE ICASSP, 1998, pp. 433-436.
    • (1998) Proc. IEEE ICASSP , pp. 433-436
    • Das, S.1    Nix, D.2    Picheny, M.3
  • 50
    • 0020795461
    • On the effects of varying filter bank parameters on isolated word recognition
    • Aug.
    • B. Dautrich, L. Rabiner, and T. Martin, "On the effects of varying filter bank parameters on isolated word recognition," IEEE Trans. Speech Audio Processing, vol. SAP-31, pp. 793-807, Aug. 1983.
    • (1983) IEEE Trans. Speech Audio Processing , vol.SAP-31 , pp. 793-807
    • Dautrich, B.1    Rabiner, L.2    Martin, T.3
  • 51
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Aug.
    • S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, pp. 357-366, Aug. 1980.
    • (1980) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-28 , pp. 357-366
    • Davis, S.1    Mermelstein, P.2
  • 52
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm
    • A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Statist. Soc., vol. 39, pp. 1-38, 1977.
    • (1977) J. Royal Statist. Soc. , vol.39 , pp. 1-38
    • Dempster, A.1    Laird, N.2    Rubin, D.3
  • 53
    • 0031185482
    • Speaker-independent phonetic classification using hidden Markov models with mixtures of trend functions
    • July
    • V. Deng and M. Aksmanovik, "Speaker-independent phonetic classification using hidden Markov models with mixtures of trend functions," IEEE Trans. Speech Audio Processing, vol. 5, pp. 319-324, July 1997.
    • (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 319-324
    • Deng, V.1    Aksmanovik, M.2
  • 54
    • 0029219614
    • A Markov model containing state-conditioned second-order nonstationarity: Application to speech recognition
    • L. Deng and R. Chengalvarayan, "A Markov model containing state-conditioned second-order nonstationarity: Application to speech recognition," Comput. Speech Lang., vol. 9, pp. 63-86, 1995.
    • (1995) Comput. Speech Lang. , vol.9 , pp. 63-86
    • Deng, L.1    Chengalvarayan, R.2
  • 55
    • 0036879732
    • A new multistage algorithm for spotting new words in speech
    • Nov.
    • S. Dharanipragada and S. Roukos, "A new multistage algorithm for spotting new words in speech," IEEE Trans. Speech Audio Processing, vol. 10, pp. 542-550, Nov. 2002.
    • (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 542-550
    • Dharanipragada, S.1    Roukos, S.2
  • 56
    • 0030189744
    • Speaker adaptation using combined transformation and Bayesian methods
    • July
    • V. Digalakis and G. Neumeyer, "Speaker adaptation using combined transformation and Bayesian methods," IEEE Trans. Speech Audio Processing, vol. 4, pp. 294-300, July 1996.
    • (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 294-300
    • Digalakis, V.1    Neumeyer, G.2
  • 57
    • 85041486134
    • Optimizing unit selection with voice source and formants in the CHATR speech synthesis system
    • W. Ding and N. Campbell, "Optimizing unit selection with voice source and formants in the CHATR speech synthesis system," in Proc. Eurospeech, 1997, pp. 537-540.
    • (1997) Proc. Eurospeech , pp. 537-540
    • Ding, W.1    Campbell, N.2
  • 58
    • 85009090897
    • A component by component listening test analysis of the IBM trainable speech synthesis system
    • R. Donovan, "A component by component listening test analysis of the IBM trainable speech synthesis system," in Proc. Eurospeech, 2001, pp. 329-332.
    • (2001) Proc. Eurospeech , pp. 329-332
    • Donovan, R.1
  • 59
    • 0032651722
    • A hidden Markov-model-based trainable speech synthesizer
    • R. Donovan and P. Woodland, "A hidden Markov-model-based trainable speech synthesizer," Comput. Speech Lang., vol. 13, pp. 223-241, 1999.
    • (1999) Comput. Speech Lang. , vol.13 , pp. 223-241
    • Donovan, R.1    Woodland, P.2
  • 60
    • 85006734596
    • Evaluation of the SPLICE algorithm on the Aurora2 database
    • J. Droppo, L. Deng, and A. Acero, "Evaluation of the SPLICE algorithm on the Aurora2 database," in Proc. Eurospeech, 2001, pp. 217-220.
    • (2001) Proc. Eurospeech , pp. 217-220
    • Droppo, J.1    Deng, L.2    Acero, A.3
  • 63
    • 0024933962
    • An unrestricted vocabulary Arabic speech synthesis system
    • Dec.
    • Y. El-Imam, "An unrestricted vocabulary Arabic speech synthesis system," IEEE Trans. Speech Audio Processing, vol. 37, pp. 1829-1845, Dec. 1989.
    • (1989) IEEE Trans. Speech Audio Processing , vol.37 , pp. 1829-1845
    • El-Imam, Y.1
  • 65
    • 0001873457
    • Filterbank-energy estimation using mixture and Markov models for recognition of noisy speech
    • Jan.
    • A. Erell and M. Weintraub, "Filterbank-energy estimation using mixture and Markov models for recognition of noisy speech," IEEE Trans. Speech Audio Processing, vol. 1, pp. 68-76, Jan. 1993.
    • (1993) IEEE Trans. Speech Audio Processing , vol.1 , pp. 68-76
    • Erell, A.1    Weintraub, M.2
  • 66
    • 21244495849
    • Echo and noise reduction for hands-free terminals - State of the art
    • G. Faucon and R. Le Bouquin-Jeannes, "Echo and noise reduction for hands-free terminals - state of the art," in Proc. Eurospeech, 1997, pp. 2423-2426.
    • (1997) Proc. Eurospeech , pp. 2423-2426
    • Faucon, G.1    Le Bouquin-Jeannes, R.2
  • 68
    • 21244499990
    • Knowledge-based techniques in acoustic-phonetic decoding of speech: Interest and limitations
    • D. Fohr, J.-P. Haton, and Y. Laprie, "Knowledge-based techniques in acoustic-phonetic decoding of speech: Interest and limitations," Int. J. Pattern Recognit. Artif. Intell., vol. 8, pp. 133-153, 1994.
    • (1994) Int. J. Pattern Recognit. Artif. Intell. , vol.8 , pp. 133-153
    • Fohr, D.1    Haton, J.-P.2    Laprie, Y.3
  • 69
    • 0031175880
    • Unconstrained keyword spotting using phone lattices with application to spoken document retrieval
    • J. Foote, S. Young, G. Jones, and K. Jones, "Unconstrained keyword spotting using phone lattices with application to spoken document retrieval," Comput. Speech Lang., vol. 11, pp. 207-224, 1997.
    • (1997) Comput. Speech Lang. , vol.11 , pp. 207-224
    • Foote, J.1    Young, S.2    Jones, G.3    Jones, K.4
  • 70
    • 0015600423
    • The Viterbi algorithm
    • Mar.
    • G. D. Forney, "The Viterbi algorithm," Proc. IEEE, vol. 61, pp. 268-278, Mar. 1973.
    • (1973) Proc. IEEE , vol.61 , pp. 268-278
    • Forney, G.D.1
  • 71
    • 84940820794
    • Duration and intensity as physical correlates of linguistic stress
    • D. Fry, "Duration and intensity as physical correlates of linguistic stress," J. Acoust. Soc. Amer., vol. 27, pp. 765-768, 1955.
    • (1955) J. Acoust. Soc. Amer. , vol.27 , pp. 765-768
    • Fry, D.1
  • 72
    • 84964153357
    • Experiments in the perception of stress
    • _, "Experiments in the perception of stress," Lang. Speech, vol. 1, pp. 126-152, 1958.
    • (1958) Lang. Speech , vol.1 , pp. 126-152
  • 73
    • 0000813409
    • Syllables as concatenative phonetic units
    • A. Bell and J. Hooper, Eds. Amsterdam, The Netherlands: North-Holland
    • O. Fujimura and J. Lovins, "Syllables as concatenative phonetic units," in Syllables and Segments, A. Bell and J. Hooper, Eds. Amsterdam, The Netherlands: North-Holland, 1978, pp. 107-120.
    • (1978) Syllables and Segments , pp. 107-120
    • Fujimura, O.1    Lovins, J.2
  • 74
    • 4243460174
    • Semi-tied covariance matrices
    • M. Gales, "Semi-tied covariance matrices," in Proc. IEEE ICASSP, 1998, pp. 657-660.
    • (1998) Proc. IEEE ICASSP , pp. 657-660
    • Gales, M.1
  • 75
    • 0033097333
    • State-based Gaussian selection in large vocabulary continuous speech recognition using HMMs
    • Mar.
    • M. Gales, K. Knill, and S. Young, "State-based Gaussian selection in large vocabulary continuous speech recognition using HMMs," IEEE Trans. Speech Audio Processing, vol. 7, pp. 152-161, Mar. 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 152-161
    • Gales, M.1    Knill, K.2    Young, S.3
  • 76
    • 0034227757
    • Cluster adaptive training of hidden Markov models
    • July
    • M. Gales, "Cluster adaptive training of hidden Markov models," IEEE Trans. Speech Audio Processing, vol. 8, no. 4, pp. 417-428, July 2000.
    • (2000) IEEE Trans. Speech Audio Processing , vol.8 , Issue.4 , pp. 417-428
    • Gales, M.1
  • 77
    • 0032050110
    • Maximum likelihood linear transformations for HMM-based speech recognition
    • _, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, pp. 75-98, 1998.
    • (1998) Comput. Speech Lang. , vol.12 , pp. 75-98
  • 78
    • 0032139556
    • Predictive model-based compensation schemes for robust speech recognition
    • _, "Predictive model-based compensation schemes for robust speech recognition," Speech Commun., vol. 25, pp. 49-74, 1998.
    • (1998) Speech Commun. , vol.25 , pp. 49-74
  • 79
    • 0030638030
    • Syllable - A promising recognition unit for LVCSR
    • A. Ganapathiraju et al., "Syllable - a promising recognition unit for LVCSR," in IEEE Workshop Speech Recognition, 1997, pp. 207-213.
    • (1997) IEEE Workshop Speech Recognition , pp. 207-213
    • Ganapathiraju, A.1
  • 80
    • 21244446048
    • Noise reduction and speech recognition in noise conditions tested on LPNN-based continuous speech recognition system
    • Y. Gao and J.-P. Haton, "Noise reduction and speech recognition in noise conditions tested on LPNN-based continuous speech recognition system," in Proc. Eurospeech, 1993, pp. 1035-1038.
    • (1993) Proc. Eurospeech , pp. 1035-1038
    • Gao, Y.1    Haton, J.-P.2
  • 81
    • 0031632620
    • On the robust incorporation of formant features into hidden Markov models for automatic speech recognition
    • P. Garner and W. Holmes, "On the robust incorporation of formant features into hidden Markov models for automatic speech recognition," in Proc. IEEE ICASSP, 1998, pp. 1-4.
    • (1998) Proc. IEEE ICASSP , pp. 1-4
    • Garner, P.1    Holmes, W.2
  • 83
    • 0000030810
    • Auditory nerve representation as a basis for speech processing
    • S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
    • O. Ghitza, "Auditory nerve representation as a basis for speech processing," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 453-485.
    • (1992) Advances in Speech Signal Processing , pp. 453-485
    • Ghitza, O.1
  • 84
    • 85016587886
    • Switchboard: Telephone speech corpus for research and development
    • J. Godfrey, E. Holliman, and J. McDaniel, "Switchboard: Telephone speech corpus for research and development," in Proc. IEEE ICASSP, vol. 1, 1992, pp. 517-520.
    • (1992) Proc. IEEE ICASSP , vol.1 , pp. 517-520
    • Godfrey, J.1    Holliman, E.2    McDaniel, J.3
  • 85
    • 21244436199
    • The interaction of phonetics, phonology and morphology in an Icelandic text-to-speech system
    • B. Granstrom, P. Helgason, and H. Thráinsson, "The interaction of phonetics, phonology and morphology in an Icelandic text-to-speech system," in Proc. ICSLP, 1992, pp. 185-188.
    • (1992) Proc. ICSLP , pp. 185-188
    • Granstrom, B.1    Helgason, P.2    Thráinsson, H.3
  • 86
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • Apr.
    • J.-L. Gauvain and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Processing, vol. 2, pp. 291-298, Apr. 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 291-298
    • Gauvain, J.-L.1    Lee, C.-H.2
  • 87
    • 0031232722
    • Speech analysis/synthesis and modification using an analysis-by-synthesis/overlap-add sinusoidal model
    • Sept.
    • E. George and M. Smith, "Speech analysis/synthesis and modification using an analysis-by-synthesis/overlap-add sinusoidal model," IEEE Trans. Speech Audio Processing, vol. 5, pp. 389-406, Sept. 1997.
    • (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 389-406
    • George, E.1    Smith, M.2
  • 88
    • 0029288202
    • Speech recognition in noisy environments
    • Y. Gong, "Speech recognition in noisy environments," Speech Commun., vol. 16, pp. 261-291, 1995.
    • (1995) Speech Commun. , vol.16 , pp. 261-291
    • Gong, Y.1
  • 89
    • 21244496605
    • Speaker normalization through formant-based warping of the frequency scale
    • E. Gouvêa and R. Stern, "Speaker normalization through formant-based warping of the frequency scale," in Proc. Eurospeech, 1997, pp. 1139-1142.
    • (1997) Proc. Eurospeech , pp. 1139-1142
    • Gouvêa, E.1    Stern, R.2
  • 91
    • 21244504086
    • Acoustic pattern matching and beam searching
    • K. Greer, B. Lowerre, and L. Wilcox, "Acoustic pattern matching and beam searching," in Proc. IEEE ICASSP, 1982, pp. 1251-1254.
    • (1982) Proc. IEEE ICASSP , pp. 1251-1254
    • Greer, K.1    Lowerre, B.2    Wilcox, L.3
  • 93
    • 0028420015
    • Improvements in beam search for 10000-word continuous speech recognition
    • Apr.
    • R. Haeb-Umbach and H. Ney, "Improvements in beam search for 10000-word continuous speech recognition," IEEE Trans. Speech Audio Processing, vol. 2, pp. 353-356, Apr. 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 353-356
    • Haeb-Umbach, R.1    Ney, H.2
  • 94
    • 0030196359
    • Feature analysis and neural network-based classification of speech under stress
    • July
    • J. Hansen and B. Womack, "Feature analysis and neural network-based classification of speech under stress," IEEE Trans. Speech Audio Processing, vol. 4, pp. 307-313, July 1996.
    • (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 307-313
    • Hansen, J.1    Womack, B.2
  • 95
    • 1842658044
    • Robust feature-estimation and objective quality assessment for noisy speech recognition using the Credit Card Corpus
    • May
    • J. Hansen and L. Arslan, "Robust feature-estimation and objective quality assessment for noisy speech recognition using the Credit Card Corpus," IEEE Trans. Speech Audio Processing, vol. 3, pp. 169-184, May 1995.
    • (1995) IEEE Trans. Speech Audio Processing , vol.3 , pp. 169-184
    • Hansen, J.1    Arslan, L.2
  • 96
    • 0032163635
    • An auditory-based distortion measure with application to concatenative speech synthesis
    • Sept.
    • J. Hansen and D. Chappell, "An auditory-based distortion measure with application to concatenative speech synthesis," IEEE Trans. Speech Audio Processing, vol. 6, no. 5, pp. 489-495, Sept. 1998.
    • (1998) IEEE Trans. Speech Audio Processing , vol.6 , Issue.5 , pp. 489-495
    • Hansen, J.1    Chappell, D.2
  • 97
    • 0031023993
    • Glottal characteristics of female speakers: Acoustic correlates
    • H. Hanson, "Glottal characteristics of female speakers: Acoustic correlates," J. Acoust. Soc. Amer., vol. 101, pp. 466-481, 1997.
    • (1997) J. Acoust. Soc. Amer. , vol.101 , pp. 466-481
    • Hanson, H.1
  • 100
    • 0032139768
    • Should recognizers have ears?
    • H. Hermansky, "Should recognizers have ears?," Speech Commun., vol. 25, pp. 3-27, 1998.
    • (1998) Speech Commun. , vol.25 , pp. 3-27
    • Hermansky, H.1
  • 101
    • 0022245547
    • Keyword recognition using template concatenation
    • A. Higgins and R. Wohlford, "Keyword recognition using template concatenation," in Proc. IEEE ICASSP, 1985, pp. 1233-1236.
    • (1985) Proc. IEEE ICASSP , pp. 1233-1236
    • Higgins, A.1    Wohlford, R.2
  • 102
    • 0020905802
    • Formant synthesizers - Cascade or parallel?
    • J. Holmes, "Formant synthesizers - cascade or parallel?," Speech Commun., vol. 2, pp. 251-273, 1983.
    • (1983) Speech Commun. , vol.2 , pp. 251-273
    • Holmes, J.1
  • 103
    • 85032644657
    • Using formant frequencies in speech recognition
    • J. Holmes, W. Holmes, and P. Garner, "Using formant frequencies in speech recognition," in Proc. Eurospeech, vol. 3, 1997, pp. 2083-2086.
    • (1997) Proc. Eurospeech , vol.3 , pp. 2083-2086
    • Holmes, J.1    Holmes, W.2    Garner, P.3
  • 104
    • 0032673963
    • Probabilistic-trajectory segmental HMM's
    • W. Holmes and M. Russell, "Probabilistic-trajectory segmental HMM's," Comput. Speech Lang., vol. 13, pp. 3-27, 1999.
    • (1999) Comput. Speech Lang. , vol.13 , pp. 3-27
    • Holmes, W.1    Russell, M.2
  • 105
    • 0033677062
    • Unified frame and segment based models for automatic speech recognition
    • H. Hon and K. Wang, "Unified frame and segment based models for automatic speech recognition," in Proc. IEEE ICASSP, vol. 2, 2000, pp. 1017-1020.
    • (2000) Proc. IEEE ICASSP , vol.2 , pp. 1017-1020
    • Hon, H.1    Wang, K.2
  • 106
    • 0031642265
    • Automatic generation of synthesis units for trainable text-to-speech systems
    • H. Hon, A. Acero, X. Huang, J. Liu, and M. Plumpe, "Automatic generation of synthesis units for trainable text-to-speech systems," in Proc. IEEE ICASSP, 1998, pp. 273-276.
    • (1998) Proc. IEEE ICASSP , pp. 273-276
    • Hon, H.1    Acero, A.2    Huang, X.3    Liu, J.4    Plumpe, M.5
  • 107
    • 0028460279
    • A fast algorithm for large vocabulary keyword spotting application
    • July
    • E.-F. Huang, H.-C. Wang, and F. Soong, "A fast algorithm for large vocabulary keyword spotting application," IEEE Trans. Speech Audio Processing, vol. 2, pp. 449-452, July 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 449-452
    • Huang, E.-F.1    Wang, H.-C.2    Soong, F.3
  • 109
    • 0027578837
    • On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition
    • Apr.
    • X. Huang and K.-F. Lee, "On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition," IEEE Trans. Speech Audio Processing, vol. 1, pp. 150-157, Apr. 1993.
    • (1993) IEEE Trans. Speech Audio Processing , vol.1 , pp. 150-157
    • Huang, X.1    Lee, K.-F.2
  • 111
    • 0027678306
    • A comparative study of discrete, semicontinuous, and continuous hidden Markov models
    • X. Huang, H. Hon, M. Hwang, and K. Lee, "A comparative study of discrete, semicontinuous, and continuous hidden Markov models," Comput. Speech Lang., vol. 7, pp. 359-368, 1993.
    • (1993) Comput. Speech Lang. , vol.7 , pp. 359-368
    • Huang, X.1    Hon, H.2    Hwang, M.3    Lee, K.4
  • 112
    • 0029765811
    • Unit selection in a concatenative speech synthesis system using a large speech database
    • A. Hunt and W. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. IEEE ICASSP, 1996, pp. 373-376.
    • (1996) Proc. IEEE ICASSP , pp. 373-376
    • Hunt, A.1    Black, W.2
  • 113
    • 0024905238
    • A comparison of several acoustic representations for speech recognition with degraded and undegraded speech
    • M. Hunt and C. Lefèbvre, "A comparison of several acoustic representations for speech recognition with degraded and undegraded speech," in Proc. IEEE ICASSP, 1989, pp. 262-265.
    • (1989) Proc. IEEE ICASSP , pp. 262-265
    • Hunt, M.1    Lefèbvre, C.2
  • 115
    • 0033900150
    • A Bayesian predictive classification approach to robust speech recognition
    • Mar.
    • Q. Huo and C.-H. Lee, "A Bayesian predictive classification approach to robust speech recognition," IEEE Trans. Speech Audio Processing, vol. 8, pp. 200-204, Mar. 2000.
    • (2000) IEEE Trans. Speech Audio Processing , vol.8 , pp. 200-204
    • Huo, Q.1    Lee, C.-H.2
  • 118
    • 0032785782
    • Modeling long distance dependence in language: Topic mixtures versus dynamic cache models
    • Jan.
    • R. Iyer and M. Ostendorf, "Modeling long distance dependence in language: Topic mixtures versus dynamic cache models," IEEE Trans. Speech Audio Processing, vol. 7, pp. 30-39, Jan. 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 30-39
    • Iyer, R.1    Ostendorf, M.2
  • 119
    • 0031209168
    • Using out-of-domain data to improve in-domain language models
    • R. Iyer, M. Ostendorf, and H. Gish, "Using out-of-domain data to improve in-domain language models," IEEE Signal Processing Lett., vol. 4, pp. 221-223, 1997.
    • (1997) IEEE Signal Processing Lett. , vol.4 , pp. 221-223
    • Iyer, R.1    Ostendorf, M.2    Gish, H.3
  • 120
    • 0029345416
    • A comparison of signal processing front ends for automatic word recognition
    • July
    • C. Jankowski, H.-D. Vo, and R. Lippmann, "A comparison of signal processing front ends for automatic word recognition," IEEE Trans. Speech Audio Processing, vol. 3, pp. 286-293, July 1995.
    • (1995) IEEE Trans. Speech Audio Processing , vol.3 , pp. 286-293
    • Jankowski, C.1    Vo, H.-D.2    Lippmann, R.3
  • 121
    • 0016939124
    • Continuous speech recognition by statistical methods
    • Apr.
    • F. Jelinek, "Continuous speech recognition by statistical methods," Proc. IEEE, vol. 64, pp. 532-556, Apr. 1976.
    • (1976) Proc. IEEE , vol.64 , pp. 532-556
    • Jelinek, F.1
  • 122
    • 0022150487
    • The development of an experimental discrete dictation recognizer
    • Nov.
    • _, "The development of an experimental discrete dictation recognizer," Proc. IEEE, vol. 73, pp. 1616-1620, Nov. 1985.
    • (1985) Proc. IEEE , vol.73 , pp. 1616-1620
  • 123
    • 0001993550
    • Principles of lexical language-modeling for speech recognition
    • S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
    • F. Jelinek, R. Mercer, and S. Roucos, "Principles of lexical language-modeling for speech recognition," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 651-699.
    • (1992) Advances in Speech Signal Processing , pp. 651-699
    • Jelinek, F.1    Mercer, R.2    Roucos, S.3
  • 124
    • 0032685060
    • Robust speech recognition based on Bayesian prediction approach
    • July
    • H. Jiang, K. Hirose, and Q. Huo, "Robust speech recognition based on Bayesian prediction approach," IEEE Trans. Speech Audio Processing, vol. 7, pp. 426-440, July 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 426-440
    • Jiang, H.1    Hirose, K.2    Huo, Q.3
  • 125
    • 0036124301
    • A robust compensation strategy against extraneous acoustic variations in spontaneous speech recognition
    • Jan.
    • H. Jiang and L. Deng, "A robust compensation strategy against extraneous acoustic variations in spontaneous speech recognition," IEEE Trans. Speech Audio Processing, vol. 10, pp. 9-17, Jan. 2002.
    • (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 9-17
    • Jiang, H.1    Deng, L.2
  • 126
    • 21244458643
    • Fast robust inverse transform speaker adapted training using diagonal transformations
    • H. Jin, S. Matsoukas, R. Schwartz, and F. Kubala, "Fast robust inverse transform speaker adapted training using diagonal transformations," in Proc. IEEE ICASSP, 1998, pp. 785-788.
    • (1998) Proc. IEEE ICASSP , pp. 785-788
    • Jin, H.1    Matsoukas, S.2    Schwartz, R.3    Kubala, F.4
  • 127
    • 0022270364
    • Mixture autoregressive hidden Markov models for speech signals
    • Dec.
    • B.-H. Juang and L. Rabiner, "Mixture autoregressive hidden Markov models for speech signals," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 1404-1413, Dec. 1985.
    • (1985) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-33 , pp. 1404-1413
    • Juang, B.-H.1    Rabiner, L.2
  • 128
    • 0031139839
    • Minimum classification error rate methods for speech recognition
    • May
    • B.-H. Juang, W. Chou, and C.-H. Lee, "Minimum classification error rate methods for speech recognition," IEEE Trans. Speech Audio Processing, vol. 5, pp. 257-265, May 1997.
    • (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 257-265
    • Juang, B.-H.1    Chou, W.2    Lee, C.-H.3
  • 129
    • 0027465491
    • The Lombard reflex and its role on human listeners and automatic speech recognizers
    • J.-C. Junqua, "The Lombard reflex and its role on human listeners and automatic speech recognizers," J. Acoust. Soc. Amer., vol. 93, pp. 510-524, 1993.
    • (1993) J. Acoust. Soc. Amer. , vol.93 , pp. 510-524
    • Junqua, J.-C.1
  • 130
  • 133
    • 0020832068
    • A hierarchical decision approach to large-vocabulary discrete utterance recognition
    • Oct.
    • T. Kaneko and N. R. Dixon, "A hierarchical decision approach to large-vocabulary discrete utterance recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 1061-1072, Oct. 1983.
    • (1983) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-31 , pp. 1061-1072
    • Kaneko, T.1    Dixon, N.R.2
  • 134
    • 0022045556
    • Realism in synthetic speech
    • Apr.
    • G. Kaplan and E. Lerner, "Realism in synthetic speech," IEEE Spectrum, vol. 22, pp. 32-37, Apr. 1985.
    • (1985) IEEE Spectrum , vol.22 , pp. 32-37
    • Kaplan, G.1    Lerner, E.2
  • 135
    • 0032203256
    • Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method
    • Nov.
    • S. Katagiri, B. Juang, and C. Lee, "Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method," Proc. IEEE, vol. 86, pp. 2345-2373, Nov. 1998.
    • (1998) Proc. IEEE , vol.86 , pp. 2345-2373
    • Katagiri, S.1    Juang, B.2    Lee, C.3
  • 136
    • 0023312404
    • Estimation of probabilities from sparse data for the language model component of a speech recognizer
    • Mar.
    • S. Katz, "Estimation of probabilities from sparse data for the language model component of a speech recognizer," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-35, pp. 400-401, Mar. 1987.
    • (1987) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-35 , pp. 400-401
    • Katz, S.1
  • 137
    • 0032205629
    • Flexible speech understanding based on combined key-phrase detection and verification
    • Nov.
    • T. Kawahara, C.-H. Lee, and B.-H. Juang, "Flexible speech understanding based on combined key-phrase detection and verification," IEEE Trans. Speech Audio Processing, vol. 6, pp. 558-568, Nov. 1998.
    • (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 558-568
    • Kawahara, T.1    Lee, C.-H.2    Juang, B.-H.3
  • 138
    • 21244453364
    • Designing control rules for a serial pole-zero vocal tract model
    • J. Kerkhoff and L. Boves, "Designing control rules for a serial pole-zero vocal tract model," in Proc. Eurospeech, 1993, pp. 893-896.
    • (1993) Proc. Eurospeech , pp. 893-896
    • Kerkhoff, J.1    Boves, L.2
  • 139
    • 77956275334
    • Efficient method of establishing words tone dictionary for Korean TTS system
    • S.-H. Kim and J.-Y. Kim, "Efficient method of establishing words tone dictionary for Korean TTS system," in Proc. Eurospeech, 1997, pp. 247-250.
    • (1997) Proc. Eurospeech , pp. 247-250
    • Kim, S.-H.1    Kim, J.-Y.2
  • 140
    • 79952968027
    • Speech recognition via phonetically featured syllables
    • S. King, T. Stephenson, S. Isard, P. Taylor, and A. Strachan, "Speech recognition via phonetically featured syllables," in Proc. ICSLP, vol. 1, 1998, pp. 1031-1034.
    • (1998) Proc. ICSLP , vol.1 , pp. 1031-1034
    • King, S.1    Stephenson, T.2    Isard, S.3    Taylor, P.4    Strachan, A.5
  • 141
    • 0032136330
    • Robust speech recognition using the modulation spectrogram
    • B. Kingsbury, N. Morgan, and S. Greenberg, "Robust speech recognition using the modulation spectrogram," Speech Commun., vol. 25, pp. 117-132, 1998.
    • (1998) Speech Commun. , vol.25 , pp. 117-132
    • Kingsbury, B.1    Morgan, N.2    Greenberg, S.3
  • 142
    • 0025321354
    • Analysis, synthesis, and perception of voice quality variations among female and male talkers
    • D. Klatt and L. Klatt, "Analysis, synthesis, and perception of voice quality variations among female and male talkers," J. Acoust. Soc. Amer., vol. 87, pp. 820-857, 1990.
    • (1990) J. Acoust. Soc. Amer. , vol.87 , pp. 820-857
    • Klatt, D.1    Klatt, L.2
  • 143
    • 0017012286
    • Structure of a phonological rule component for a synthesis-by-rule program
    • Oct.
    • D. Klatt, "Structure of a phonological rule component for a synthesis-by-rule program," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-24, pp. 391-398, Oct. 1976.
    • (1976) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-24 , pp. 391-398
    • Klatt, D.1
  • 144
    • 0016952322
    • Linguistic uses of segmental duration in English: Acoustic and perceptual evidence
    • _, "Linguistic uses of segmental duration in English: Acoustic and perceptual evidence," J. Acoust. Soc. Amer., vol. 59, pp. 1208-1221, 1976.
    • (1976) J. Acoust. Soc. Amer. , vol.59 , pp. 1208-1221
  • 145
    • 0017565919
    • Review of the ARPA speech understanding project
    • _, "Review of the ARPA speech understanding project," J. Acoust. Soc. Amer., vol. 62, pp. 1345-1366, 1977.
    • (1977) J. Acoust. Soc. Amer. , vol.62 , pp. 1345-1366
  • 146
    • 0018986665
    • Software for a cascade/parallel formant synthesizer
    • _, "Software for a cascade/parallel formant synthesizer," J. Acoust. Soc. Amer., vol. 67, pp. 971-995, 1980.
    • (1980) J. Acoust. Soc. Amer. , vol.67 , pp. 971-995
  • 147
    • 0023407575
    • Review of text-to-speech conversion for English
    • _, "Review of text-to-speech conversion for English," J. Acoust. Soc. Amer., vol. 82, pp. 737-793, 1987.
    • (1987) J. Acoust. Soc. Amer. , vol.82 , pp. 737-793
  • 148
    • 0022106367
    • Network-based isolated digit recognition using vector quantization
    • Aug.
    • G. Kopec and M. Bush, "Network-based isolated digit recognition using vector quantization," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 850-867, Aug. 1985.
    • (1985) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-33 , pp. 850-867
    • Kopec, G.1    Bush, M.2
  • 149
    • 0029735634
    • Speaker-independent speech recognition based on tree-structured speaker clustering
    • T. Kosaka, S. Matsunaga, and S. Sagayama, "Speaker-independent speech recognition based on tree-structured speaker clustering," Comput. Speech Lang., vol. 10, pp. 55-74, 1996.
    • (1996) Comput. Speech Lang. , vol.10 , pp. 55-74
    • Kosaka, T.1    Matsunaga, S.2    Sagayama, S.3
  • 150
    • 21244505243
    • Speaker modeling for speaker adaptation in automatic speech recognition
    • K. Johnson and J. Mullennix, Eds. San Diego, CA: Academic
    • J. Kreiman, "Speaker modeling for speaker adaptation in automatic speech recognition," in Talker Variability in Speech Processing, K. Johnson and J. Mullennix, Eds. San Diego, CA: Academic, 1997, pp. 167-189.
    • (1997) Talker Variability in Speech Processing , pp. 167-189
    • Kreiman, J.1
  • 151
    • 0025446887
    • A cache-based natural language model for speech recognition
    • June
    • R. Kuhn and R. de Mori, "A cache-based natural language model for speech recognition," IEEE Trans. Pattern Anal. Machine Intell., vol. 12, pp. 570-583, June 1990.
    • (1990) IEEE Trans. Pattern Anal. Machine Intell. , vol.12 , pp. 570-583
    • Kuhn, R.1    De Mori, R.2
  • 152
    • 0000392884
    • Eigenvoices for speaker adaptation
    • R. Kuhn et al., "Eigenvoices for speaker adaptation," in Proc. ICSLP, 1998, pp. 1771-1774.
    • (1998) Proc. ICSLP , pp. 1771-1774
    • Kuhn, R.1
  • 153
    • 0030366881
    • Improving decision trees for acoustic modeling
    • A. Lazaridès, Y. Normandin, and R. Kuhn, "Improving decision trees for acoustic modeling," in Proc. ICSLP, 1996, pp. 1053-1056.
    • (1996) Proc. ICSLP , pp. 1053-1056
    • Lazaridès, A.1    Normandin, Y.2    Kuhn, R.3
  • 154
    • 85095220056
    • Real-time analysis-synthesis and intelligibility of talking faces
    • B. Le Goff, T. Guiard-Marigny, M. Cohen, and C. Benoit, "Real-time analysis-synthesis and intelligibility of talking faces," in ESCA Workshop, 1994, pp. 53-56.
    • (1994) ESCA Workshop , pp. 53-56
    • Le Goff, B.1    Guiard-Marigny, T.2    Cohen, M.3    Benoit, C.4
  • 155
    • 0003339670
    • Speech recognition: Past, present, and future
    • Englewood Cliffs, NJ: Prentice-Hall
    • W. Lea, "Speech recognition: Past, present, and future," in Trends in Speech Recognition. Englewood Cliffs, NJ: Prentice-Hall, 1980, pp. 39-98.
    • (1980) Trends in Speech Recognition , pp. 39-98
    • Lea, W.1
  • 156
    • 0027625140
    • Improved tone concatenation rules in a formant-based Chinese text-to-speech system
    • July
    • L. Lee, C. Tseng, and C.-J. Hsieh, "Improved tone concatenation rules in a formant-based Chinese text-to-speech system," IEEE Trans. Speech Audio Processing, vol. 1, pp. 287-294, July 1993.
    • (1993) IEEE Trans. Speech Audio Processing , vol.1 , pp. 287-294
    • Lee, L.1    Tseng, C.2    Hsieh, C.-J.3
  • 157
    • 0032140546
    • On stochastic feature and model compensation approaches for robust speech recognition
    • C.-H. Lee, "On stochastic feature and model compensation approaches for robust speech recognition," Speech Commun., vol. 25, pp. 29-47, 1998.
    • (1998) Speech Commun. , vol.25 , pp. 29-47
    • Lee, C.-H.1
  • 160
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • C. Leggetter and P. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
    • (1995) Comput. Speech Lang. , vol.9 , pp. 171-185
    • Leggetter, C.1    Woodland, P.2
  • 161
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
    • _, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Comput. Speech Lang., vol. 9, pp. 171-185, 1995.
    • (1995) Comput. Speech Lang. , vol.9 , pp. 171-185
  • 163
    • 0022149626
    • Structural methods in automatic speech recognition
    • Nov.
    • S. Levinson, "Structural methods in automatic speech recognition," Proc. IEEE, vol. 73, pp. 1625-1650, Nov. 1985.
    • (1985) Proc. IEEE , vol.73 , pp. 1625-1650
    • Levinson, S.1
  • 164
    • 0022864384
    • Continuously variable duration hidden Markov models for speech analysis
    • _, "Continuously variable duration hidden Markov models for speech analysis," in Proc. IEEE ICASSP, 1986, pp. 1241-1244.
    • (1986) Proc. IEEE ICASSP , pp. 1241-1244
  • 165
    • 0034322144
    • GA-based noisy speech recognition using two-dimensional cepstrum
    • Nov.
    • C.-T. Lin, H.-W. Nein, and J.-Y. Hwu, "GA-based noisy speech recognition using two-dimensional cepstrum," IEEE Trans. Speech Audio Processing, vol. 8, pp. 664-675, Nov. 2000.
    • (2000) IEEE Trans. Speech Audio Processing , vol.8 , pp. 664-675
    • Lin, C.-T.1    Nein, H.-W.2    Hwu, J.-Y.3
  • 166
    • 0029408722
    • Normalizing the vocal tract length for speaker independent speech recognition
    • Q. Lin and C. Che, "Normalizing the vocal tract length for speaker independent speech recognition," IEEE Signal Processing Lett., vol. 2, pp. 201-203, 1995.
    • (1995) IEEE Signal Processing Lett. , vol.2 , pp. 201-203
    • Lin, Q.1    Che, C.2
  • 167
    • 0020180460
    • Maximum likelihood estimation for multivariate observations of Markov sources
    • Sept.
    • L. Liporace, "Maximum likelihood estimation for multivariate observations of Markov sources," IEEE Trans. Inform. Theory, vol. IT-28, pp. 729-734, Sept. 1982.
    • (1982) IEEE Trans. Inform. Theory , vol.IT-28 , pp. 729-734
    • Liporace, L.1
  • 168
    • 0031187171
    • Speech recognition by humans and machines
    • R. Lippmann, "Speech recognition by humans and machines," Speech Commun., vol. 22, pp. 1-15, 1997.
    • (1997) Speech Commun. , vol.22 , pp. 1-15
    • Lippmann, R.1
  • 169
    • 0030640788
    • Robust speech recognition with time-varying filtering, interruptions, and noise
    • R. Lippmann and B. Carlson, "Robust speech recognition with time-varying filtering, interruptions, and noise," in IEEE Workshop Speech Recognition, 1997, pp. 365-372.
    • (1997) IEEE Workshop Speech Recognition , pp. 365-372
    • Lippmann, R.1    Carlson, B.2
  • 170
    • 0028404665
    • High accuracy phone recognition using context-clustering and quasitriphonic models
    • A. Ljolje, "High accuracy phone recognition using context-clustering and quasitriphonic models," Comput. Speech Lang., vol. 8, pp. 129-151, 1994.
    • (1994) Comput. Speech Lang. , vol.8 , pp. 129-151
    • Ljolje, A.1
  • 171
    • 21244499562
    • A new system for text-to-speech conversion, and its application to Swedish
    • M. Ljungqvist, A. Lindström, and K. Gustafson, "A new system for text-to-speech conversion, and its application to Swedish," in Proc. ICSLP, 1994, pp. 1779-1782.
    • (1994) Proc. ICSLP , pp. 1779-1782
    • Ljungqvist, M.1    Lindström, A.2    Gustafson, K.3
  • 172
    • 0024344665
    • Segmental intelligibility of synthetic speech produced by rule
    • J. Logan, B. Greene, and D. Pisoni, "Segmental intelligibility of synthetic speech produced by rule," J. Acoust. Soc. Amer., vol. 86, pp. 566-581, 1989.
    • (1989) J. Acoust. Soc. Amer. , vol.86 , pp. 566-581
    • Logan, J.1    Greene, B.2    Pisoni, D.3
  • 173
    • 0033872141
    • Utterance verification in continuous speech recognition: Decoding and training procedures
    • Mar.
    • E. Lleida and P. Green, "Utterance verification in continuous speech recognition: Decoding and training procedures," IEEE Trans. Speech Audio Processing, vol. 8, pp. 126-139, Mar. 2000.
    • (2000) IEEE Trans. Speech Audio Processing , vol.8 , pp. 126-139
    • Lleida, E.1    Green, P.2
  • 175
    • 0029375555
    • Implementing the Viterbi algorithm
    • Sept.
    • H.-L. Lou, "Implementing the Viterbi algorithm," IEEE Signal Processing Mag., vol. 12, pp. 42-52, Sept. 1995.
    • (1995) IEEE Signal Processing Mag. , vol.12 , pp. 42-52
    • Lou, H.-L.1
  • 176
    • 0029748338
    • Speech concatenation and synthesis using an overlap-add sinusoidal model
    • M. Macon and M. Clements, "Speech concatenation and synthesis using an overlap-add sinusoidal model," in Proc. IEEE ICASSP, 1996, pp. 361-364.
    • (1996) Proc. IEEE ICASSP , pp. 361-364
    • Macon, M.1    Clements, M.2
  • 177
    • 0016495091
    • Linear prediction: A tutorial review
    • Apr.
    • J. Makhoul, "Linear prediction: A tutorial review," Proc. IEEE, vol. 63, pp. 561-580, Apr. 1975.
    • (1975) Proc. IEEE , vol.63 , pp. 561-580
    • Makhoul, J.1
  • 178
    • 0030715925
    • A segment-based word-spotter using phonetic filler models
    • A. Manos and V. Zue, "A segment-based word-spotter using phonetic filler models," in Proc. IEEE ICASSP, 1997, pp. 899-902.
    • (1997) Proc. IEEE ICASSP , pp. 899-902
    • Manos, A.1    Zue, V.2
  • 179
    • 0024766457
    • A family of distortion measures based upon projection operation for robust speech recognition
    • Nov.
    • D. Mansour and B.-H. Juang, "A family of distortion measures based upon projection operation for robust speech recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 1659-1671, Nov. 1989.
    • (1989) IEEE Trans. Acoust., Speech, Signal Processing , vol.37 , pp. 1659-1671
    • Mansour, D.1    Juang, B.-H.2
  • 180
    • 0030779362
    • Automatic word recognition based on second-order hidden Markov models
    • Jan.
    • J.-F. Mari, J.-P. Haton, and A. Kriouile, "Automatic word recognition based on second-order hidden Markov models," IEEE Trans. Speech Audio Processing, vol. 5, pp. 22-25, Jan. 1997.
    • (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 22-25
    • Mari, J.-F.1    Haton, J.-P.2    Kriouile, A.3
  • 181
    • 21244440490
    • Spoken language processing in multimodal communication
    • J. Mariani, "Spoken language processing in multimodal communication," in Proc. Int. Conf. Speech Processing, 1997, pp. 3-12.
    • (1997) Proc. Int. Conf. Speech Processing , pp. 3-12
    • Mariani, J.1
  • 182
    • 0024911019
    • Recent advances in speech processing
    • _, "Recent advances in speech processing," in Proc. IEEE ICASSP, 1989, pp. 429-440.
    • (1989) Proc. IEEE ICASSP , pp. 429-440
  • 184
    • 0032049073
    • Algorithms for bigram and trigram word clustering
    • S. Martin, J. Liermann, and H. Ney, "Algorithms for bigram and trigram word clustering," Speech Commun., vol. 24, pp. 19-37, 1998.
    • (1998) Speech Commun. , vol.24 , pp. 19-37
    • Martin, S.1    Liermann, J.2    Ney, H.3
  • 186
    • 0016049328
    • An algorithm for automatic formant extraction using linear prediction spectra
    • Apr.
    • S. McCandless, "An algorithm for automatic formant extraction using linear prediction spectra," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-22, pp. 135-141, Apr. 1974.
    • (1974) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-22 , pp. 135-141
    • McCandless, S.1
  • 187
    • 0036754943
    • Robust speech recognition using probabilistic union models
    • Sept.
    • J. Ming, P. Jancovic, and F. J. Smith, "Robust speech recognition using probabilistic union models," IEEE Trans. Speech Audio Processing, vol. 10, pp. 403-414, Sept. 2002.
    • (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 403-414
    • Ming, J.1    Jancovic, P.2    Smith, F.J.3
  • 188
    • 84892167119
    • Transmissions and transitions: A study of two common assumptions in multi-band ASR
    • N. Mirghafori and N. Morgan, "Transmissions and transitions: A study of two common assumptions in multi-band ASR," in Proc. IEEE ICASSP, 1998, pp. 713-716.
    • (1998) Proc. IEEE ICASSP , pp. 713-716
    • Mirghafori, N.1    Morgan, N.2
  • 189
    • 0029196406
    • A parallel implementation of a hidden Markov model with duration modeling for speech recognition
    • C. Mitchell, M. Harper, L. Jamieson, and R. Helzerman, "A parallel implementation of a hidden Markov model with duration modeling for speech recognition," Dig. Signal Process., vol. 5, pp. 43-57, 1995.
    • (1995) Dig. Signal Process. , vol.5 , pp. 43-57
    • Mitchell, C.1    Harper, M.2    Jamieson, L.3    Helzerman, R.4
  • 190
    • 0348198473
    • Finite-state transducers in language and speech processing
    • M. Mohri, "Finite-state transducers in language and speech processing," Comput. Linguist., vol. 23, pp. 269-312, 1997.
    • (1997) Comput. Linguist. , vol.23 , pp. 269-312
    • Mohri, M.1
  • 192
    • 0030287048
    • The expectation-maximization algorithm
    • Nov.
    • T. Moon, "The expectation-maximization algorithm," IEEE Signal Processing Mag., vol. 13, pp. 47-60, Nov. 1996.
    • (1996) IEEE Signal Processing Mag. , vol.13 , pp. 47-60
    • Moon, T.1
  • 194
    • 0029308753
    • Neural networks for statistical recognition of continuous speech
    • May
    • _, "Neural networks for statistical recognition of continuous speech," Proc. IEEE, vol. 83, pp. 742-770, May 1995.
    • (1995) Proc. IEEE , vol.83 , pp. 742-770
  • 195
    • 0040320400
    • Acoustic correlates of stress
    • J. Morton and W. Jassem, "Acoustic correlates of stress," Lang. Speech, vol. 8, pp. 159-181, 1965.
    • (1965) Lang. Speech , vol.8 , pp. 159-181
    • Morton, J.1    Jassem, W.2
  • 196
    • 0025543906
    • Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones
    • E. Moulines and F. Charpentier, "Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones," Speech Commun., vol. 9, pp. 453-467, 1990.
    • (1990) Speech Commun. , vol.9 , pp. 453-467
    • Moulines, E.1    Charpentier, F.2
  • 197
    • 0027447292
    • Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion
    • I. Murray and J. Arnott, "Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion," J. Acoust. Soc. Amer., vol. 93, pp. 1097-1108, 1993.
    • (1993) J. Acoust. Soc. Amer. , vol.93 , pp. 1097-1108
    • Murray, I.1    Arnott, J.2
  • 198
    • 21244498390
    • A prototype text-to-speech system for Scottish Gaelic
    • I. Murray and M. Black, "A prototype text-to-speech system for Scottish Gaelic," in Proc. Eurospeech, 1993, pp. 885-887.
    • (1993) Proc. Eurospeech , pp. 885-887
    • Murray, I.1    Black, M.2
  • 199
    • 85009102728
    • Room acoustics and reverberation: Impact on hands-free recognition
    • S. Nakamura and K. Shikano, "Room acoustics and reverberation: Impact on hands-free recognition," in Proc. Eurospeech, 1997, pp. 2423-2426.
    • (1997) Proc. Eurospeech , pp. 2423-2426
    • Nakamura, S.1    Shikano, K.2
  • 200
    • 0000635720
    • Progress in dynamic programming search for LVCSR
    • Aug.
    • H. Ney and S. Ortmanns, "Progress in dynamic programming search for LVCSR," Proc. IEEE, vol. 88, pp. 1224-1240, Aug. 2000.
    • (2000) Proc. IEEE , vol.88 , pp. 1224-1240
    • Ney, H.1    Ortmanns, S.2
  • 203
    • 0022227187
    • Comparative study of several distortion measures for speech recognition
    • N. Nocerino, F. Soong, L. Rabiner, and D. Klatt, "Comparative study of several distortion measures for speech recognition," in IEEE Int. Conf. ASSP, 1985, pp. 25-28.
    • (1985) IEEE Int. Conf. ASSP , pp. 25-28
    • Nocerino, N.1    Soong, F.2    Rabiner, L.3    Klatt, D.4
  • 204
    • 0028412908
    • High-performance connected digit recognition using maximum mutual information estimation
    • Apr.
    • Y. Normandin, R. Cardin, and R. de Mori, "High-performance connected digit recognition using maximum mutual information estimation," IEEE Trans. Speech Audio Processing, vol. 2, pp. 299-311, Apr. 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 299-311
    • Normandin, Y.1    Cardin, R.2    De Mori, R.3
  • 205
  • 206
    • 0036663550
    • Stochastic natural language generation for spoken dialog systems
    • A. Oh and A. Rudnicky, "Stochastic natural language generation for spoken dialog systems," Comput. Speech Lang., vol. 16, pp. 387-407, 2002.
    • (2002) Comput. Speech Lang. , vol.16 , pp. 387-407
    • Oh, A.1    Rudnicky, A.2
  • 207
    • 84892189317
    • Multi-band speech recognition in noisy environments
    • S. Okawa, E. Bocchieri, and A. Potamianos, "Multi-band speech recognition in noisy environments," in Proc. IEEE ICASSP, 1998, pp. 641-644.
    • (1998) Proc. IEEE ICASSP , pp. 641-644
    • Okawa, S.1    Bocchieri, E.2    Potamianos, A.3
  • 208
    • 84966368599
    • A rule-based text-to-speech system for Portuguese
    • L. Oliveira, C. Viana, and I. Trancoso, "A rule-based text-to-speech system for Portuguese," in Proc. IEEE ICASSP, vol. 2, 1992, pp. 73-76.
    • (1992) Proc. IEEE ICASSP , vol.2 , pp. 73-76
    • Oliveira, L.1    Viana, C.2    Trancoso, I.3
  • 209
    • 0032142014
    • Environmental conditions and acoustic transduction in hands-free robust speech recognition
    • M. Omologo, P. Svaizer, and M. Matassoni, "Environmental conditions and acoustic transduction in hands-free robust speech recognition," Speech Commun., vol. 25, pp. 75-95, 1998.
    • (1998) Speech Commun. , vol.25 , pp. 75-95
    • Omologo, M.1    Svaizer, P.2    Matassoni, M.3
  • 210
    • 0030353329
    • Spoken style explanation generator for Japanese Kanji using a text-to-speech system
    • Y. Ooyama, H. Asano, and K. Matsuoka, "Spoken style explanation generator for Japanese Kanji using a text-to-speech system," in Proc. ICSLP, 1996, pp. 1369-1372.
    • (1996) Proc. ICSLP , pp. 1369-1372
    • Ooyama, Y.1    Asano, H.2    Matsuoka, K.3
  • 213
    • 0030719155
    • A word graph algorithm for large vocabulary continuous speech recognition
    • _, "A word graph algorithm for large vocabulary continuous speech recognition," Comput. Speech Lang., vol. 11, pp. 43-72, 1997.
    • (1997) Comput. Speech Lang. , vol.11 , pp. 43-72
  • 215
    • 0001098818
    • Linguistic features in fundamental frequency patterns
    • _, "Linguistic features in fundamental frequency patterns," J. Phonetics, vol. 7, pp. 119-145, 1979.
    • (1979) J. Phonetics , vol.7 , pp. 119-145
  • 217
    • 0030245363
    • From HMM's to segment models: A unified view of stochastic modeling for speech recognition
    • Sept.
    • M. Ostendorf, V. Digalakis, and O. Kimball, "From HMM's to segment models: A unified view of stochastic modeling for speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 360-378, Sept. 1996.
    • (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 360-378
    • Ostendorf, M.1    Digalakis, V.2    Kimball, O.3
  • 218
    • 0031704151
    • Speaker clustering and transformation for speaker adaptation in speech recognition systems
    • Jan.
    • M. Padmanabhan, L. Bahl, D. Nahamoo, and M. Picheny, "Speaker clustering and transformation for speaker adaptation in speech recognition systems," IEEE Trans. Speech Audio Processing, vol. 6, pp. 71-77, Jan. 1998.
    • (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 71-77
    • Padmanabhan, M.1    Bahl, L.2    Nahamoo, D.3    Picheny, M.4
  • 219
    • 0030638045
    • Spectral subband centroids as features for speech recognition
    • K. Paliwal, "Spectral subband centroids as features for speech recognition," in IEEE Workshop Speech Recognition, 1997, pp. 124-131.
    • (1997) IEEE Workshop Speech Recognition , pp. 124-131
    • Paliwal, K.1
  • 220
    • 0026746948
    • On automatic estimation of articulatory parameters in a text-to-speech system
    • S. Parthasarathy and C. Coker, "On automatic estimation of articulatory parameters in a text-to-speech system," Comput. Speech Lang., vol. 6, pp. 37-76, 1992.
    • (1992) Comput. Speech Lang. , vol.6 , pp. 37-76
    • Parthasarathy, S.1    Coker, C.2
  • 221
    • 0022227186
    • Training of HMM recognizers by simulated annealing
    • D. Paul, "Training of HMM recognizers by simulated annealing," in Proc. IEEE ICASSP, 1985, pp. 13-16.
    • (1985) Proc. IEEE ICASSP , pp. 13-16
    • Paul, D.1
  • 223
    • 0027659197
    • Signal modeling techniques in speech recognition
    • Sept.
    • J. Picone, "Signal modeling techniques in speech recognition," Proc. IEEE, vol. 81, pp. 1215-1247, Sept. 1993.
    • (1993) Proc. IEEE , vol.81 , pp. 1215-1247
    • Picone, J.1
  • 225
    • 0022148789
    • Perception of synthetic speech generated by rule
    • Nov.
    • D. Pisoni, H. Nusbaum, and B. Greene, "Perception of synthetic speech generated by rule," Proc. IEEE, vol. 73, pp. 1665-1676, Nov. 1985.
    • (1985) Proc. IEEE , vol.73 , pp. 1665-1676
    • Pisoni, D.1    Nusbaum, H.2    Greene, B.3
  • 226
    • 0041673687
    • Quality assessment of text-to-speech synthesis by rule
    • S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
    • L. Pols, "Quality assessment of text-to-speech synthesis by rule," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 387-416.
    • (1992) Advances in Speech Signal Processing , pp. 387-416
    • Pols, L.1
  • 227
    • 0023834849
    • Hidden Markov models: A guided tour
    • A. Poritz, "Hidden Markov models: A guided tour," in Proc. IEEE ICASSP, 1988, pp. 7-13.
    • (1988) Proc. IEEE ICASSP , pp. 7-13
    • Poritz, A.1
  • 228
    • 85135358811
    • Structure and representation of an inventory for German speech synthesis
    • T. Portele, F. Höfer, and W. Hess, "Structure and representation of an inventory for German speech synthesis," in Proc. ICSLP, 1994, pp. 1759-1762.
    • (1994) Proc. ICSLP , pp. 1759-1762
    • Portele, T.1    Höfer, F.2    Hess, W.3
  • 230
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • Feb.
    • L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, pp. 257-286, Feb. 1989.
    • (1989) Proc. IEEE , vol.77 , pp. 257-286
    • Rabiner, L.1
  • 231
    • 0020735346
    • On the application of vector quantization and hidden Markov models to speaker-independent, isolated word recognition
    • L. Rabiner, S. Levinson, and M. Sondhi, "On the application of vector quantization and hidden Markov models to speaker-independent, isolated word recognition," Bell Syst. Tech. J., vol. 62, pp. 1075-1105, 1983.
    • (1983) Bell Syst. Tech. J. , vol.62 , pp. 1075-1105
    • Rabiner, L.1    Levinson, S.2    Sondhi, M.3
  • 232
    • 0021407797
    • On the use of hidden Markov models for speaker-independent recognition of isolated words from a medium-size vocabulary
    • _, "On the use of hidden Markov models for speaker-independent recognition of isolated words from a medium-size vocabulary," AT&T Bell Labs Tech. J., vol. 63, pp. 627-641, 1984.
    • (1984) AT&T Bell Labs Tech. J. , vol.63 , pp. 627-641
  • 234
    • 0030127017
    • Signal conditioning techniques for robust speech recognition
    • Apr.
    • M. Rahim, B.-H. Juang, W. Chou, and E. Buhrke, "Signal conditioning techniques for robust speech recognition," IEEE Signal Processing Lett., vol. 3, pp. 107-109, Apr. 1996.
    • (1996) IEEE Signal Processing Lett. , vol.3 , pp. 107-109
    • Rahim, M.1    Juang, B.-H.2    Chou, W.3    Buhrke, E.4
  • 235
    • 0029769867
    • Signal bias removal by maximum likelihood estimation for robust telephone speech recognition
    • Jan.
    • M. Rahim and B.-H. Juang, "Signal bias removal by maximum likelihood estimation for robust telephone speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 19-30, Jan. 1996.
    • (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 19-30
    • Rahim, M.1    Juang, B.-H.2
  • 236
    • 0030962747
    • A study on robust utterance verification for connected digits recognition
    • M. Rahim, C.-H. Lee, and B.-H. Juang, "A study on robust utterance verification for connected digits recognition," J. Acoust. Soc. Amer., vol. 101, pp. 2892-2902, 1997.
    • (1997) J. Acoust. Soc. Amer. , vol.101 , pp. 2892-2902
    • Rahim, M.1    Lee, C.-H.2    Juang, B.-H.3
  • 237
    • 0035248922
    • Deterministically annealed design of hidden Markov model speech recognizers
    • Feb.
    • A. Rao and K. Rose, "Deterministically annealed design of hidden Markov model speech recognizers," IEEE Trans. Speech Audio Processing, vol. 9, pp. 111-126, Feb. 2001.
    • (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 111-126
    • Rao, A.1    Rose, K.2
  • 238
    • 0035396159
    • A maximum a posteriori approach to speaker adaptation using the trended hidden Markov model
    • July
    • C. Rathinavelu and L. Deng, "A maximum a posteriori approach to speaker adaptation using the trended hidden Markov model," IEEE Trans. Speech Audio Processing, vol. 9, pp. 549-557, July 2001.
    • (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 549-557
    • Rathinavelu, C.1    Deng, L.2
  • 239
    • 0016939166
    • Speech recognition by machine: A review
    • Apr.
    • R. Reddy, "Speech recognition by machine: A review," Proc. IEEE, vol. 64, pp. 501-531, Apr. 1976.
    • (1976) Proc. IEEE , vol.64 , pp. 501-531
    • Reddy, R.1
  • 240
    • 0034273299
    • Robust decision tree state tying for continuous speech recognition
    • Sept.
    • W. Reichl and W. Chou, "Robust decision tree state tying for continuous speech recognition," IEEE Trans. Speech Audio Processing, vol. 8, pp. 555-566, Sept. 2000.
    • (2000) IEEE Trans. Speech Audio Processing , vol.8 , pp. 555-566
    • Reichl, W.1    Chou, W.2
  • 241
    • 0029733137
    • Phone deactivation pruning in large vocabulary continuous speech recognition
    • Jan.
    • S. Renals, "Phone deactivation pruning in large vocabulary continuous speech recognition," IEEE Signal Processing Lett., vol. 3, pp. 4-6, Jan. 1996.
    • (1996) IEEE Signal Processing Lett. , vol.3 , pp. 4-6
    • Renals, S.1
  • 242
    • 0032675736
    • The HDM: A segmental hidden dynamic model of coarticulation
    • H. B. Richards and J. S. Bridle, "The HDM: A segmental hidden dynamic model of coarticulation," in Proc. IEEE ICASSP, vol. 1, 1999, pp. 357-360.
    • (1999) Proc. IEEE ICASSP , vol.1 , pp. 357-360
    • Richards, H.B.1    Bridle, J.S.2
  • 243
    • 0012075535
    • Evaluation of speech synthesis systems for Dutch in telecommunication applications in GSM and PSTN networks
    • T. Rietveld et al., "Evaluation of speech synthesis systems for Dutch in telecommunication applications in GSM and PSTN networks," in Proc. Eurospeech, 1997, pp. 577-580.
    • (1997) Proc. Eurospeech , pp. 577-580
    • Rietveld, T.1
  • 244
    • 0029386354
    • Keyword detection in conversational speech utterances using hidden Markov model based continuous speech recognition
    • R. Rose, "Keyword detection in conversational speech utterances using hidden Markov model based continuous speech recognition," Comput. Speech Lang., vol. 9, pp. 309-333, 1995.
    • (1995) Comput. Speech Lang. , vol.9 , pp. 309-333
    • Rose, R.1
  • 245
    • 0030008004
    • The potential role of speech production models in automatic speech recognition
    • R. Rose, J. Schroeter, and M. Sondhi, "The potential role of speech production models in automatic speech recognition," J. Acoust. Soc. Amer., vol. 99, no. 3, pp. 1699-1709, 1996.
    • (1996) J. Acoust. Soc. Amer. , vol.99 , Issue.3 , pp. 1699-1709
    • Rose, R.1    Schroeter, J.2    Sondhi, M.3
  • 247
    • 0030181951
    • A maximum entropy approach to adaptive statistical language modeling
    • R. Rosenfeld, "A maximum entropy approach to adaptive statistical language modeling," Comput. Speech Lang., vol. 10, pp. 187-228, 1996.
    • (1996) Comput. Speech Lang. , vol.10 , pp. 187-228
    • Rosenfeld, R.1
  • 248
    • 0032665603
    • A dynamical system model for generating fundamental frequency for speech synthesis
    • May
    • K. Ross and M. Ostendorf, "A dynamical system model for generating fundamental frequency for speech synthesis," IEEE Trans. Speech Audio Processing, vol. 7, pp. 295-309, May 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , pp. 295-309
    • Ross, K.1    Ostendorf, M.2
  • 249
    • 0019606728
    • An articulatory synthesizer for perceptual research
    • P. Rubin, T. Baer, and P. Mermelstein, "An articulatory synthesizer for perceptual research," J. Acoust. Soc. Amer., vol. 70, pp. 321-328, 1981.
    • (1981) J. Acoust. Soc. Amer. , vol.70 , pp. 321-328
    • Rubin, P.1    Baer, T.2    Mermelstein, P.3
  • 250
  • 251
    • 0025229803
    • Speech synthesis from text
    • Jan.
    • Y. Sagisaka, "Speech synthesis from text," IEEE Commun. Mag., vol. 28, pp. 35-41, Jan. 1990.
    • (1990) IEEE Commun. Mag. , vol.28 , pp. 35-41
    • Sagisaka, Y.1
  • 252
    • 0032051044
    • Pre-recognition measures of speaking rate
    • K. Samudravijaya, S. Singh, and P. Rao, "Pre-recognition measures of speaking rate," Speech Commun., vol. 24, pp. 73-84, 1998.
    • (1998) Speech Commun. , vol.24 , pp. 73-84
    • Samudravijaya, K.1    Singh, S.2    Rao, P.3
  • 253
    • 0029239090
    • A comparative study of mel cepstra and EIH for phone classification under adverse conditions
    • S. Sandhu and O. Ghitza, "A comparative study of mel cepstra and EIH for phone classification under adverse conditions," in Proc. IEEE ICASSP, 1995, pp. 409-412.
    • (1995) Proc. IEEE ICASSP , pp. 409-412
    • Sandhu, S.1    Ghitza, O.2
  • 255
    • 0030648077
    • Construction and evaluation of a robust multifeature speech/music discriminator
    • E. Scheirer and M. Slaney, "Construction and evaluation of a robust multifeature speech/music discriminator," in Proc. IEEE ICASSP, 1997, pp. 1331-1334.
    • (1997) Proc. IEEE ICASSP , pp. 1331-1334
    • Scheirer, E.1    Slaney, M.2
  • 256
    • 0028259480
    • Techniques for estimating vocal tract shapes from the speech signal
    • Jan.
    • J. Schroeter and M. Sondhi, "Techniques for estimating vocal tract shapes from the speech signal," IEEE Trans. Speech Audio Processing, vol. 2, pp. 133-150, Jan. 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 133-150
    • Schroeter, J.1    Sondhi, M.2
  • 257
    • 0002788850
    • Multipass search strategies
    • C.-H. Lee et al., Eds. Boston, MA: Kluwer, ch. 18
    • R. Schwartz et al., "Multipass search strategies," in Automatic Speech and Speaker Recognition, C.-H. Lee et al., Eds. Boston, MA: Kluwer, 1995, ch. 18.
    • (1995) Automatic Speech and Speaker Recognition
    • Schwartz, R.1
  • 258
    • 85014377643
    • TINA: A natural language system for spoken language applications
    • S. Seneff, "TINA: A natural language system for spoken language applications," Comput. Linguist., vol. 18, pp. 61-86, 1992.
    • (1992) Comput. Linguist. , vol.18 , pp. 61-86
    • Seneff, S.1
  • 259
    • 0036289978
    • Real-time speech synthesis on an ultra-low resource, programmable DSP system
    • H. Sheikhzadeh, E. Cornu, R. Brennan, and T. Schneider, "Real-time speech synthesis on an ultra-low resource, programmable DSP system," in Proc. IEEE ICASSP, vol. 1, 2002, pp. 433-436.
    • (2002) Proc. IEEE ICASSP , vol.1 , pp. 433-436
    • Sheikhzadeh, H.1    Cornu, E.2    Brennan, R.3    Schneider, T.4
  • 260
    • 2642702399
    • Spectrum distance measures for speech recognition
    • S. Furui and M. Sondhi, Eds. New York: Marcel Dekker
    • K. Shikano and F. Itakura, "Spectrum distance measures for speech recognition," in Advances in Speech Signal Processing, S. Furui and M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 419-452.
    • (1992) Advances in Speech Signal Processing , pp. 419-452
    • Shikano, K.1    Itakura, F.2
  • 261
    • 0030247984
    • Computer lipreading for improved accuracy in automatic speech recognition
    • Sept.
    • P. Silsbee and A. Bovik, "Computer lipreading for improved accuracy in automatic speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 337-351, Sept. 1996.
    • (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 337-351
    • Silsbee, P.1    Bovik, A.2
  • 262
    • 0030165492
    • Comparative experiments of several adaptation approaches to noisy speech recognition using stochastic trajectory models
    • O. Siohan, Y. Gong, and J.-P. Haton, "Comparative experiments of several adaptation approaches to noisy speech recognition using stochastic trajectory models," Speech Commun., vol. 18, pp. 335-352, 1996.
    • (1996) Speech Commun. , vol.18 , pp. 335-352
    • Siohan, O.1    Gong, Y.2    Haton, J.-P.3
  • 263
    • 0030816220
    • Incorporating phonetic properties in hidden Markov models for speech recognition
    • R. Sitaram and T. Sreenivas, "Incorporating phonetic properties in hidden Markov models for speech recognition," J. Acoust. Soc. Amer., vol. 102, pp. 1149-1158, 1997.
    • (1997) J. Acoust. Soc. Amer. , vol.102 , pp. 1149-1158
    • Sitaram, R.1    Sreenivas, T.2
  • 264
    • 0347387932
    • On the importance of the microphone position for speech recognition in the car
    • J. Smolders, T. Claes, G. Sablon, and D. Van Compernolle, "On the importance of the microphone position for speech recognition in the car," in Proc. IEEE ICASSP, vol. 1, 1994, pp. 429-432.
    • (1994) Proc. IEEE ICASSP , vol.1 , pp. 429-432
    • Smolders, J.1    Claes, T.2    Sablon, G.3    Van Compernolle, D.4
  • 265
    • 0026370988
    • A tree-trellis based search for finding the N best sentence hypotheses in continuous speech recognition
    • F. Soong and E.-F. Huang, "A tree-trellis based search for finding the N best sentence hypotheses in continuous speech recognition," in Proc. IEEE ICASSP, 1991, pp. 705-708.
    • (1991) Proc. IEEE ICASSP , pp. 705-708
    • Soong, F.1    Huang, E.-F.2
  • 267
    • 0029352735
    • Continuous speech dictation - From theory to practice
    • V. Steinbiss et al., "Continuous speech dictation - from theory to practice," Speech Commun., vol. 17, pp. 19-38, 1995.
    • (1995) Speech Commun. , vol.17 , pp. 19-38
    • Steinbiss, V.1
  • 268
    • 0031238095
    • A model of dynamic auditory perception and its application to robust word recognition
    • Sept.
    • B. Strope and A. Alwan, "A model of dynamic auditory perception and its application to robust word recognition," IEEE Trans. Speech Audio Processing, vol. 5, pp. 451-464, Sept. 1997.
    • (1997) IEEE Trans. Speech Audio Processing , vol.5 , pp. 451-464
    • Strope, B.1    Alwan, A.2
  • 269
    • 84892158557
    • Robust word recognition using threaded spectral peaks
    • _, "Robust word recognition using threaded spectral peaks," in Proc. IEEE ICASSP, 1998, pp. 625-628.
    • (1998) Proc. IEEE ICASSP , pp. 625-628
  • 270
    • 0035279124
    • Removing linear phase mismatches in concatenative speech synthesis
    • Mar.
    • Y. Stylianou, "Removing linear phase mismatches in concatenative speech synthesis," IEEE Trans. Speech Audio Processing, vol. 9, pp. 232-239, Mar. 2001.
    • (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 232-239
    • Stylianou, Y.1
  • 271
    • 0028195650
    • Speech recognition using weighted HMM and subspace projection approaches
    • Jan.
    • K.-Y. Su and C.-H. Lee, "Speech recognition using weighted HMM and subspace projection approaches," IEEE Trans. Speech Audio Processing, vol. 2, pp. 69-79, Jan. 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 69-79
    • Su, K.-Y.1    Lee, C.-H.2
  • 272
    • 0030287341
    • Vocabulary-independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition
    • Nov.
    • R. Sukkar and C.-H. Lee, "Vocabulary-independent discriminative utterance verification for nonkeyword rejection in subword based speech recognition," IEEE Trans. Speech Audio Processing, vol. 4, pp. 420-429, Nov. 1996.
    • (1996) IEEE Trans. Speech Audio Processing , vol.4 , pp. 420-429
    • Sukkar, R.1    Lee, C.-H.2
  • 273
    • 0027316621
    • Multi-microphone correlation-based processing for robust speech recognition
    • T. Sullivan and R. Stern, "Multi-microphone correlation-based processing for robust speech recognition," in Proc. IEEE ICASSP, vol. 2, 1993, pp. 91-94.
    • (1993) Proc. IEEE ICASSP , vol.2 , pp. 91-94
    • Sullivan, T.1    Stern, R.2
  • 274
    • 0031624617
    • TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis
    • A. Syrdal, Y. Stylianou, A. Conkie, and J. Schroeter, "TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis," in Proc. IEEE ICASSP, vol. 1, 1998, pp. 273-276.
    • (1998) Proc. IEEE ICASSP , vol.1 , pp. 273-276
    • Syrdal, A.1    Stylianou, Y.2    Conkie, A.3    Schroeter, J.4
  • 275
    • 0031624617
    • TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis
    • _, "TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis," in Proc. IEEE ICASSP, 1998, pp. 273-276.
    • (1998) Proc. IEEE ICASSP , pp. 273-276
  • 276
    • 0031118076
    • Vector-field-smoothed Bayesian learning for fast and incremental speaker/telephone-channel adaptation
    • J. Takahashi and S. Sagayama, "Vector-field-smoothed Bayesian learning for fast and incremental speaker/telephone-channel adaptation," Comput. Speech Lang., vol. 11, pp. 127-146, 1997.
    • (1997) Comput. Speech Lang. , vol.11 , pp. 127-146
    • Takahashi, J.1    Sagayama, S.2
  • 277
    • 21244459396
    • An overview of different trends on CELP coding
    • A. J. Ayuso and J. M. Soler, Eds. New York: Springer-Verlag
    • I. Trancoso, "An overview of different trends on CELP coding," in Speech Recognition and Coding: New Advances and Trends, A. J. Ayuso and J. M. Soler, Eds. New York: Springer-Verlag, 1995, pp. 351-368.
    • (1995) Speech Recognition and Coding: New Advances and Trends , pp. 351-368
    • Trancoso, I.1
  • 279
    • 0027147339
    • Perceptual experiments for diagnostic testing of text-to-speech systems
    • J. van Santen, "Perceptual experiments for diagnostic testing of text-to-speech systems," Comput. Speech Lang., vol. 7, pp. 49-100, 1993.
    • (1993) Comput. Speech Lang. , vol.7 , pp. 49-100
    • Van Santen, J.1
  • 280
    • 0032296808
    • A stochastic model of intonation for text-to-speech synthesis
    • J. Véronis, P. di Cristo, F. Courtois, and C. Chaumette, "A stochastic model of intonation for text-to-speech synthesis," Speech Commun., vol. 26, pp. 233-244, 1998.
    • (1998) Speech Commun. , vol.26 , pp. 233-244
    • Véronis, J.1    Di Cristo, P.2    Courtois, F.3    Chaumette, C.4
  • 281
    • 0017482612
    • Normalization of vowels by vocal-tract length and its application to vowel identification
    • Apr.
    • H. Wakita, "Normalization of vowels by vocal-tract length and its application to vowel identification," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-25, pp. 183-192, Apr. 1977.
    • (1977) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-25 , pp. 183-192
    • Wakita, H.1
  • 282
    • 0036753897
    • Speaker adaptive modeling by vocal tract normalization
    • Sept.
    • L. Welling, H. Ney, and S. Kanthak, "Speaker adaptive modeling by vocal tract normalization," IEEE Trans. Speech Audio Processing, vol. 10, pp. 415-426, Sept. 2002.
    • (2002) IEEE Trans. Speech Audio Processing , vol.10 , pp. 415-426
    • Welling, L.1    Ney, H.2    Kanthak, S.3
  • 283
    • 0031647965
    • Formant estimation for speech recognition
    • Jan.
    • L. Welling and H. Ney, "Formant estimation for speech recognition," IEEE Trans. Speech Audio Processing, vol. 6, pp. 36-48, Jan. 1998.
    • (1998) IEEE Trans. Speech Audio Processing , vol.6 , pp. 36-48
    • Welling, L.1    Ney, H.2
  • 284
    • 0025517070
    • Automatic recognition of keywords in unconstrained speech using hidden Markov models
    • Nov.
    • J. Wilpon, L. Rabiner, C.-H. Lee, and E. Goldman, "Automatic recognition of keywords in unconstrained speech using hidden Markov models," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 1870-1878, Nov. 1990.
    • (1990) IEEE Trans. Acoust., Speech, Signal Processing , vol.38 , pp. 1870-1878
    • Wilpon, J.1    Rabiner, L.2    Lee, C.-H.3    Goldman, E.4
  • 285
    • 0035124445
    • Control of spectral dynamics in concatenative speech synthesis
    • Jan.
    • J. Wouters and M. Macon, "Control of spectral dynamics in concatenative speech synthesis," IEEE Trans. Speech Audio Processing, vol. 9, pp. 30-38, Jan. 2001.
    • (2001) IEEE Trans. Speech Audio Processing , vol.9 , pp. 30-38
    • Wouters, J.1    Macon, M.2
  • 286
    • 0030718943
    • Multilingual large vocabulary speech recognition: The European SQUALE project
    • S. Young et al., "Multilingual large vocabulary speech recognition: The European SQUALE project," Comput. Speech Lang., vol. 11, pp. 73-89, 1997.
    • (1997) Comput. Speech Lang. , vol.11 , pp. 73-89
    • Young, S.1
  • 287
    • 0032181247
    • Speech recognition evaluation: A review of the U.S. CSR and LVCSR programmes
    • S. Young and L. Chase, "Speech recognition evaluation: A review of the U.S. CSR and LVCSR programmes," Comput. Speech Lang., vol. 12, pp. 263-279, 1998.
    • (1998) Comput. Speech Lang. , vol.12 , pp. 263-279
    • Young, S.1    Chase, L.2
  • 288
    • 0028460810
    • An acoustic-phonetic-based speaker-adaptation technique for improving speaker-independent continuous speech recognition
    • July
    • Y. Zhao, "An acoustic-phonetic-based speaker-adaptation technique for improving speaker-independent continuous speech recognition," IEEE Trans. Speech Audio Processing, vol. 2, pp. 380-394, July 1994.
    • (1994) IEEE Trans. Speech Audio Processing , vol.2 , pp. 380-394
    • Zhao, Y.1
  • 289
    • 0022151324
    • The use of speech knowledge in automatic speech recognition
    • Nov.
    • V. Zue, "The use of speech knowledge in automatic speech recognition," Proc. IEEE, vol. 73, pp. 1602-1615, Nov. 1985.
    • (1985) Proc. IEEE , vol.73 , pp. 1602-1615
    • Zue, V.1
  • 290
    • 85036693931
    • Conversational interfaces: Advances and challenges
    • _, "Conversational interfaces: Advances and challenges," in Proc. Eurospeech, 1997, pp. KN-9-18.
    • (1997) Proc. Eurospeech
  • 291
    • 21244470119
    • Peripheral preprocessing in hearing and psychoacoustics as guidelines for speech recognition
    • E. Zwicker, "Peripheral preprocessing in hearing and psychoacoustics as guidelines for speech recognition," in Proc. Montreal Symp. Speech Recognition, 1986, pp. 1-4.
    • (1986) Proc. Montreal Symp. Speech Recognition , pp. 1-4
    • Zwicker, E.1
  • 292
    • 0018437122
    • Automatic speech recognition using psychoacoustic models
    • E. Zwicker, E. Terhardt, and E. Paulus, "Automatic speech recognition using psychoacoustic models," J. Acoust. Soc. Amer., vol. 65, pp. 487-498, 1979.
    • (1979) J. Acoust. Soc. Amer. , vol.65 , pp. 487-498
    • Zwicker, E.1    Terhardt, E.2    Paulus, E.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.