Volume 49, Issue 10-11, 2007, Pages 763-786

Automatic speech recognition and speech variability: A review

Author keywords

Speech analysis; Speech intrinsic variations; Speech modeling; Speech recognition

Indexed keywords

ACOUSTIC NOISE; LEARNING SYSTEMS; POPULATION DYNAMICS; SEMANTICS; SPEECH ANALYSIS;

EID: 34547941599     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2007.02.006     Document Type: Article
Times cited: 440

References (328)
  • 1
    • 85009110440
    • Aalburg, S., Hoege, H., 2004. Foreign-accented speaker-independent speech recognition. In: Proceedings of ICSLP, Jeju Island, Korea, pp. 1465-1468.
  • 2
    • 4544386378
    • Abdel-Haleem, Y.H., Renals, S., Lawrence, N.D., 2004. Acoustic space dimensionality selection and combination using the maximum entropy principle. In: Proceedings of ICASSP, Montreal, Canada, pp. 637-640.
  • 3
    • 0029762799
    • Abrash, V., Sankar, A., Franco, H., Cohen, M., 1996. Acoustic adaptation using nonlinear transformations of HMM parameters. In: Proceedings of ICASSP, Atlanta, GA, pp. 729-732.
  • 4
    • 34547939955
    • Achan, K., Roweis, S., Hertzmann, A., Frey, B., 2004. A segmental HMM for speech waveforms. UTML Technical Report 2004-001, University of Toronto, Toronto, Canada.
  • 5
    • 19944379822
    • Investigating syllabic structures and their variation in spontaneous french
    • Adda-Decker M., Boula de Mareuil P., Adda G., and Lamel L. Investigating syllabic structures and their variation in spontaneous french. Speech Communication 46 2 (2005) 119-139
    • (2005) Speech Communication , vol.46 , Issue.2 , pp. 119-139
    • Adda-Decker, M.1    Boula de Mareuil, P.2    Adda, G.3    Lamel, L.4
  • 6
    • 0033354260
    • Pronunciation variants across system configuration, language and speaking style
    • Adda-Decker M., and Lamel L. Pronunciation variants across system configuration, language and speaking style. Speech Communication 29 2 (1999) 83-98
    • (1999) Speech Communication , vol.29 , Issue.2 , pp. 83-98
    • Adda-Decker, M.1    Lamel, L.2
  • 7
    • 0023831656
    • A new statistical approach for the automatic segmentation of continuous speech signals
    • Andre-Obrecht R. A new statistical approach for the automatic segmentation of continuous speech signals. IEEE Transactions on Acoustics, Speech and Signal Processing 36 1 (1988) 29-40
    • (1988) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.36 , Issue.1 , pp. 29-40
    • Andre-Obrecht, R.1
  • 8
    • 85009145332
    • Prosody-based automatic detection of annoyance and frustration in human-computer
    • Denver, Colorado
    • Ang J., Dhillon R., Krupski A., Shriberg E., and Stolcke A. Prosody-based automatic detection of annoyance and frustration in human-computer. Proceedings of ICSLP (2002), Denver, Colorado 2037-2040
    • (2002) Proceedings of ICSLP , pp. 2037-2040
    • Ang, J.1    Dhillon, R.2    Krupski, A.3    Shriberg, E.4    Stolcke, A.5
  • 9
    • 0030165438
    • Language accent classification in american english
    • Arslan L.M., and Hansen J.H.L. Language accent classification in american english. Speech Communication 18 4 (1996) 353-367
    • (1996) Speech Communication , vol.18 , Issue.4 , pp. 353-367
    • Arslan, L.M.1    Hansen, J.H.L.2
  • 10
    • 0020602364
    • Atal, B., 1983. Efficient coding of LPC parameters by temporal decomposition. In: Proceedings of ICASSP, Boston, USA, pp. 81-84.
  • 11
    • 0016962193
    • A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition
    • Atal B., and Rabiner L. A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing 24 3 (1976) 201-212
    • (1976) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.24 , Issue.3 , pp. 201-212
    • Atal, B.1    Rabiner, L.2
  • 12
    • 84863773378
    • Frequency domain linear prediction for temporal features
    • St. Thomas, US Virgin Islands, USA
    • Athineos M., and Ellis D. Frequency domain linear prediction for temporal features. Proceedings of ASRU (2003), St. Thomas, US Virgin Islands, USA 261-266
    • (2003) Proceedings of ASRU , pp. 261-266
    • Athineos, M.1    Ellis, D.2
  • 13
    • 0035202534
    • Taking the hit: leaving some lexical competition to be resolved post-lexically
    • Bard E.G., Sotillo C., Kelly M.L., and Aylett M.P. Taking the hit: leaving some lexical competition to be resolved post-lexically. Language and Cognitive Processes 15 5-6 (2001) 731-737
    • (2001) Language and Cognitive Processes , vol.15 , Issue.5-6 , pp. 731-737
    • Bard, E.G.1    Sotillo, C.2    Kelly, M.L.3    Aylett, M.P.4
  • 14
    • 33745201418
    • Barrault, L., de Mori, R., Gemello, R., Mana, F., Matrouf, D., 2005. Variability of automatic speech recognition systems using different features. In: Proceedings of Interspeech, Lisboa, Portugal, pp. 221-224.
  • 15
    • 34547942942
    • Bartkova, K., 2003. Generating proper name pronunciation variants for automatic speech recognition. In: Proceedings of ICPhS, Barcelona, Spain.
  • 16
    • 34547945689
    • Bartkova, K., Jouvet, D., 1999. Language based phone model combination for ASR adaptation to foreign accent. In: Proceedings of ICPhS, San Francisco, USA, pp. 1725-1728.
  • 17
    • 34547933367
    • Bartkova, K., Jouvet, D., 2004. Multiple models for improved speech recognition for non-native speakers. In: Proceedings of SPECOM, Saint Petersburg, Russia.
  • 18
    • 34547931925
    • Beattie, V., Edmondson, S., Miller, D., Patel, Y., Talvola, G., 1995. An integrated multidialect speech recognition system with optional speaker adaptation. In: Proceedings of Eurospeech, Madrid, Spain, pp. 1123-1126.
  • 19
    • 34547960250
    • Beauford, J.Q., 1999. Compensating for variation in speaking rate. PhD thesis, Electrical Engineering, University of Pittsburgh.
  • 21
    • 70249086510
    • Benitez, C., Burget, L., Chen, B., Dupont, S., Garudadri, H., Hermansky, H., Jain, P., Kajarekar, S., Sivadas, S., 2001. Robust ASR front-end using spectral based and discriminant features: experiments on the aurora task. In: Proceedings of Eurospeech, Aalborg, Denmark, pp. 429-432.
  • 22
    • 0026402806
    • Adaptation to a speaker's voice in a speech recognition system based on synthetic phoneme references
    • Blomberg M. Adaptation to a speaker's voice in a speech recognition system based on synthetic phoneme references. Speech Communication 10 5-6 (1991) 453-461
    • (1991) Speech Communication , vol.10 , Issue.5-6 , pp. 453-461
    • Blomberg, M.1
  • 23
    • 34547939007
    • Bonaventura, P., Gallochio, F., Mari, J., Micca, G., 1998. Speech recognition methods for non-native pronunciation variants. In: Proceedings ISCA Workshop on modelling pronunciation variations for automatic speech recognition, Rolduc, Netherlands, pp. 17-23.
  • 24
    • 34547949602
    • Bonaventura, P., Gallochio, F., Micca, G., 1997. Multilingual speech recognition for flexible vocabularies. In: Proceedings of Eurospeech, Rhodes, Greece, pp. 355-358.
  • 25
    • 85032661370
    • Bou-Ghazale, S.E., Hansen, J.H.L., 1994. Duration and spectral based stress token generation for HMM speech recognition under stress. In: Proceedings of ICASSP, Adelaide, Australia, pp. 413-416.
  • 26
    • 34547953239
    • Bou-Ghazale, S.E., Hansen, J.H.L., 1995. Improving recognition and synthesis of stressed speech via feature perturbation in a source generator framework. In: ECSA-NATO Proceedings Speech Under Stress Workshop, Lisbon, Portugal, pp. 45-48.
  • 27
    • 0030643240
    • Bourlard, H., Dupont, S., 1997. Sub-band based speech recognition. In: Proceedings of ICASSP, Munich, Germany, pp. 1251-1254.
  • 28
    • 84863687026
    • Bozkurt, B., Couvreur, L., 2005. On the use of phase information for speech recognition. In: Proceedings of Eusipco, Antalya, Turkey.
  • 29
    • 17644365443
    • Zeros of z-transform representation with application to source-filter separation in speech
    • Bozkurt B., Doval B., d'Alessandro C., and Dutoit T. Zeros of z-transform representation with application to source-filter separation in speech. IEEE Signal Processing Letters 12 4 (2005) 344-347
    • (2005) IEEE Signal Processing Letters , vol.12 , Issue.4 , pp. 344-347
    • Bozkurt, B.1    Doval, B.2    d'Alessandro, C.3    Dutoit, T.4
  • 30
    • 33747701692
    • Brugnara, F., De Mori, R., Giuliani, D., Omologo, M., 1992. A family of parallel Hidden Markov Models. In: Proceedings of ICASSP, vol. 1. pp. 377-380.
  • 32
    • 0030376663
    • Carey, M., Parris, E., Lloyd-Thomas, H., Bennett, S., 1996. Robust prosodic features for speaker identification. In: Proceedings of ICSLP, Philadelphia, Pennsylvania, USA, pp. 1800-1803.
  • 33
    • 33947656987
    • Carlson, B., Clements, M., 1992. Speech recognition in noise using a projection-based likelihood measure for mixture density HMMs. In: Proceedings of ICASSP, San Francisco, CA, pp. 237-240.
  • 34
    • 34547936985
    • Chase, L., 1997. Error-responsive feedback mechanisms for speech recognizers. PhD thesis, Carnegie Mellon University.
  • 35
    • 34547949195
    • Chen, C.J., Gopinath, R.A., Monkowski, M.D., Picheny, M.A., Shen, K., 1997. New methods in continuous mandarin speech recognition. In: Proceedings of Eurospeech, pp. 1543-1546.
  • 36
    • 0034842303
    • Chen, C.J., Li, H., Shen, L., Fu, G., 2001. Recognize tone languages using pitch information on the main vowel of each syllable. In: Proceedings of ICASSP, vol. 1. pp. 61-64.
  • 37
    • 0023168987
    • Chen, Y., 1987. Cepstral domain stress compensation for robust speech recognition. In: Proceedings of ICASSP, Dallas, TX, pp. 717-720.
  • 38
    • 0032627220
    • Chesta, C., Laface, P., Ravera, F., 1999. Connected digit recognition using short and long duration models. In: Proceedings of ICASSP, vol. 2. pp. 557-560.
  • 39
    • 34547928199
    • Chesta, C., Siohan, O., Lee, C.-H., 1999. Maximum a posteriori linear regression for Hidden Markov Model adaptation. In: Proceedings of Eurospeech, Budapest, Hungary, pp. 211-214.
  • 40
    • 0019680114
    • Chollet, G.F., Astier, A.B.P., Rossi, M., 1981. Evaluating the performance of speech recognizers at the acoustic-phonetic level. In: Proceedings of ICASSP, Atlanta, USA, pp. 758-761.
  • 41
    • 79551471715
    • Cincarek, T., Gruhn, R., Nakamura, S., 2004. Speech recognition for multiple non-native accent groups with speaker-group-dependent acoustic models. In: Proceedings of ICSLP, Jeju Island, Korea, pp. 1509-1512.
  • 43
    • 33745190977
    • Colibro, D., Fissore, L., Popovici, C., Vair, C., Laface, P., 2005. Learning pronunciation and formulation variants in continuous speech applications. In: Proceedings of ICASSP, Philadelphia, PA, pp. 1001-1004.
  • 45
    • 0034140299
    • Different aspects of expert pronunciation quality ratings and their relation to scores produced by speech recognition algorithms
    • Cucchiarini C., Strik H., and Boves L. Different aspects of expert pronunciation quality ratings and their relation to scores produced by speech recognition algorithms. Speech Communication 30 2-3 (2000) 109-119
    • (2000) Speech Communication , vol.30 , Issue.2-3 , pp. 109-119
    • Cucchiarini, C.1    Strik, H.2    Boves, L.3
  • 46
    • 34547948254
    • Dalsgaard, P., Andersen, O., Barry, W., 1998. Cross-language merged speech units and their descriptive phonetic correlates. In: Proceedings of ICSLP, Sydney, Australia, pp. 482-485.
  • 47
    • 85009136762
    • D'Arcy, S.M., Wong, L.P., Russell, M.J., 2004. Recognition of read and spontaneous children's speech using two new corpora. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 48
    • 34547935449
    • Das, S., Lubensky, D., Wu, C., 1999. Towards robust speech recognition in the telephony network environment - cellular and landline conditions. In: Proceedings of Eurospeech, Budapest, Hungary, pp. 1959-1962.
  • 49
    • 0031644298
    • Das, S., Nix, D., Picheny, M., 1998. Improvements in children speech recognition performance. In: Proceedings of ICASSP, vol. 1. Seattle, USA, pp. 433-436.
  • 50
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Davis S.B., and Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech and Signal Processing 28 (1980) 357-366
    • (1980) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.28 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 52
    • 0343867184
    • Recognition of syllables in a tone language
    • Demeechai T., and Mäkeläinen K. Recognition of syllables in a tone language. Speech Communication 33 3 (2001) 241-254
    • (2001) Speech Communication , vol.33 , Issue.3 , pp. 241-254
    • Demeechai, T.1    Mäkeläinen, K.2
  • 53
    • 84988831195
    • Demuynck, K., Garcia, O., Van Compernolle, D., 2004. Synthesizing speech from speech recognition parameters. In: Proceedings of ICSLP'04, Jeju Island, Korea.
  • 54
    • 85009171168
    • Deng, Y., Mahajan, M., Acero, A., 2003. Estimating speech recognition error rate without acoustic test data. In: Proceedings of Eurospeech, Geneva, Switzerland, pp. 929-932.
  • 55
    • 34547946960
    • Di Benedetto, M.-G., Liénard, J.-S., 1992. Extrinsic normalization of vowel formant values based on cardinal vowels mapping. In: Proceedings of ICSLP, Alberta, Canada, pp. 579-582.
  • 56
    • 34547962914
    • Disfluency in spontaneous speech (diss'05). 2005. Aix-en-Provence, France.
  • 57
    • 62249192231
    • Doddington, G., 2003. Word alignment issues in ASR scoring. In: Proceedings of ASRU, US Virgin Islands, pp. 630-633.
  • 58
    • 34547956910
    • Draxler, C., Burger, S., 1997. Identification of regional variants of high german from digit sequences in german telephone speech. In: Proceedings of Eurospeech, pp. 747-750.
  • 60
    • 34547959626
    • Dupont, S., Ris, C., Couvreur, L., Boite, J.-M., 2005. A study of implicit and explicit modeling of coarticulation and pronunciation variation. In: Proceedings of Interspeech, Lisboa, Portugal, pp. 1353-1356.
  • 61
    • 33846194657
    • Dupont, S., Ris, C., Deroo, O., Poitoux, S., 2005. Feature extraction and acoustic modeling: an approach for improved generalization across languages and accents. In: Proceedings of ASRU, San Juan, Puerto-Rico, pp. 29-34.
  • 62
    • 84871614151
    • Eide, E., 2001. Distinctive features for use in automatic speech recognition. In: Proceedings of Eurospeech, Aalborg, Denmark, pp. 1613-1616.
  • 63
    • 0029725604
    • Eide, E., Gish, H., 1996. A parametric approach to vocal tract length normalization. In: Proceedings of ICASSP, Atlanta, GA, pp. 346-348.
  • 64
    • 0029725604
    • Eide, E., Gish, H., 1996. A parametric approach to vocal tract length normalization. In: Proceedings of ICASSP, Atlanta, GA, pp. 346-349.
  • 65
    • 0028996886
    • Eide, E., Gish, H., Jeanrenaud, P., Mielke, A., 1995. Understanding and improving speech recognition performance through the use of diagnostic tools. In: Proceedings of ICASSP, Detroit, Michigan, pp. 221-224.
  • 66
    • 0035426930
    • Xenophones: an investigation of phone set expansion in swedish and implications for speech recognition and speech synthesis
    • Eklund R., and Lindström A. Xenophones: an investigation of phone set expansion in swedish and implications for speech recognition and speech synthesis. Speech Communication 35 1-2 (2001) 81-102
    • (2001) Speech Communication , vol.35 , Issue.1-2 , pp. 81-102
    • Eklund, R.1    Lindström, A.2
  • 67
    • 34547945866
    • Elenius, D., Blomberg, M., 2004. Comparing speech recognition for adults and children. In: Proceedings of FONETIK, Stockholm, Sweden, pp. 156-159.
  • 68
    • 0034848926
    • Ellis, D., Singh, R., Sivadas, S., 2001. Tandem acoustic modeling in large-vocabulary recognition. In: Proceedings of ICASSP, Salt Lake City, USA, pp. 517-520.
  • 69
    • 34547939006
    • ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition, 1998.
  • 70
    • 0030373675
    • Eskenazi, M., 1996. Detection of foreign speakers' pronunciation errors for second language training-preliminary results. In: Proceedings of ICSLP, Philadelphia, PA, pp. 1465-1468.
  • 72
    • 34547942941
    • Eskenazi, M., Pelton, G., 2002. Pinpointing pronunciation errors in children speech: examining the role of the speech recognizer. In: Proceedings of the PMLA Workshop, Colorado, USA.
  • 73
    • 0033692966
    • Falthauser, R., Pfau, T., Ruske, G., 2000. On-line speaking rate estimation using gaussian mixture models. In: Proceedings of ICASSP, Istanbul, Turkey, pp. 1355-1358.
  • 75
    • 0030638031
    • Fiscus, J.G., 1997. A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER). In: Proceedings of ASRU, pp. 347-354.
  • 76
    • 34547943338
    • Fitt, S., 1995. The pronunciation of unfamiliar native and non-native town names. In: Proceedings of Eurospeech, Madrid, Spain, pp. 2227-2230.
  • 78
    • 0038188722
    • Interaction between the native and second language phonetic subsystems
    • Flege J.E., Schirru C., and MacKay I.R.A. Interaction between the native and second language phonetic subsystems. Speech Communication 40 (2003) 467-491
    • (2003) Speech Communication , vol.40 , pp. 467-491
    • Flege, J.E.1    Schirru, C.2    MacKay, I.R.A.3
  • 79
    • 19944369427
    • A framework for predicting speech recognition errors
    • Fosler-Lussier E., Amdal I., and Kuo H.-K.J. A framework for predicting speech recognition errors. Speech Communication 46 2 (2005) 153-170
    • (2005) Speech Communication , vol.46 , Issue.2 , pp. 153-170
    • Fosler-Lussier, E.1    Amdal, I.2    Kuo, H.-K.J.3
  • 80
    • 0033321442
    • Effects of speaking rate and word predictability on conversational pronunciations
    • Fosler-Lussier E., and Morgan N. Effects of speaking rate and word predictability on conversational pronunciations. Speech Communication 29 2-4 (1999) 137-158
    • (1999) Speech Communication , vol.29 , Issue.2-4 , pp. 137-158
    • Fosler-Lussier, E.1    Morgan, N.2
  • 81
    • 0034140838
    • Combination of machine scores for automatic grading of pronunciation quality
    • Franco H., Neumeyer L., Digalakis V., and Ronen O. Combination of machine scores for automatic grading of pronunciation quality. Speech Communication 30 2-3 (2000) 121-130
    • (2000) Speech Communication , vol.30 , Issue.2-3 , pp. 121-130
    • Franco, H.1    Neumeyer, L.2    Digalakis, V.3    Ronen, O.4
  • 82
    • 0034855363
    • Fujinaga, K., Nakai, M., Shimodaira, H., Sagayama, S., 2001. Multiple-regression Hidden Markov Model. In: Proceedings of ICASSP, vol. 1. Salt Lake City, USA, pp. 513-516.
  • 84
    • 0032677425
    • Fung, P., Liu, W.K., 1999. Fast accent identification and accented speech recognition. In: Proceedings of ICASSP, Phoenix, Arizona, USA, pp. 221-224.
  • 86
    • 34547956384
    • Gales, M.J.F., 1998. Cluster adaptive training for speech recognition. In: Proceedings of ICSLP, Sydney, Australia, pp. 1783-1786.
  • 88
    • 34547928595
    • Gales, M.J.F., 2001. Acoustic factorization. In: Proceedings of ASRU, Madonna di Campiglio, Italy.
  • 89
    • 0034853390
    • Gales, M.J.F., 2001. Multiple-cluster adaptive training schemes. In: Proceedings of ICASSP, Salt Lake City, Utah, USA, pp. 361-364.
  • 90
    • 34547955597
    • Gao, Y., Ramabhadran, B., Chen, J., Erdogan, H., Picheny, M., 2001. Innovative approaches for large vocabulary name recognition. In: Proceedings of ICASSP, Salt Lake City, Utah, pp. 333-336.
  • 91
    • 0031632620
    • Garner, P., Holmes, W., 1998. On the robust incorporation of formant features into hidden markov models for automatic speech recognition. In: Proceedings of ICASSP, pp. 1-4.
  • 92
    • 0000673740
    • Speaker identification and message identification in speech recognition
    • Garvin P.L., and Ladefoged P. Speaker identification and message identification in speech recognition. Phonetica 9 (1963) 193-199
    • (1963) Phonetica , vol.9 , pp. 193-199
    • Garvin, P.L.1    Ladefoged, P.2
  • 94
    • 34547931739
    • Girardi, A., Shikano, K., Nakamura, S., 1998. Creating speaker independent HMM models for restricted database using straight-tempo morphing. In: Proceedings of ICSLP, Sydney, Australia, pp. 687-690.
  • 95
    • 85143190854
    • Giuliani, D., Gerosa, M., 2003. Investigating recognition of children speech. In: Proceedings of ICASSP, Hong Kong, pp. 137-140.
  • 96
  • 97
    • 84892187452
    • Gopinath, R.A., 1998. Maximum likelihood modeling with gaussian distributions for classification. In: Proceedings of ICASSP, Seattle, WA, pp. 661-664.
  • 98
    • 0346008163
    • Generating non-native pronunciation variants for lexicon adaptation
    • Goronzy S., Rapp S., and Kompe R. Generating non-native pronunciation variants for lexicon adaptation. Speech Communication 42 1 (2004) 109-123
    • (2004) Speech Communication , vol.42 , Issue.1 , pp. 109-123
    • Goronzy, S.1    Rapp, S.2    Kompe, R.3
  • 99
    • 4544351495
    • Graciarena, M., Franco, H., Zheng, J., Vergyri, D., Stolcke, A., 2004. Voicing feature integration in SRI's DECIPHER LVCSR system. In: Proceedings of ICASSP, Montreal, Canada, pp. 921-924.
  • 100
    • 34547952311
    • Greenberg, S., Chang, S., 2000. Linguistic dissection of switchboard-corpus automatic speech recognition systems. In: Proceedings of ISCA Workshop on Automatic Speech Recognition: Challenges for the New Millenium, Paris, France.
  • 101
    • 34547950990
    • Greenberg, S., Fosler-Lussier, E., 2000. The uninvited guest: information's role in guiding the production of spontaneous speech. In: Proceedings of the Crest Workshop on Models of Speech Production: Motor Planning and Articulatory Modelling. Kloster Seeon, Germany.
  • 102
    • 0029726002
    • Gupta, S.K., Soong, F., Haimi-Cohen, R., 1996. High-accuracy connected digit recognition for mobile applications. In: Proceedings of ICASSP, vol. 1, pp. 57-60.
  • 103
    • 85017287487
    • Haeb-Umbach, R., Ney, H., 1992. Linear discriminant analysis for improved large vocabulary continuous speech recognition. In: Proceedings of ICASSP, San Francisco, CA, pp. 13-16.
  • 104
    • 84946707630
    • Children speech recognition with application to interactive books and tutors
    • St. Thomas, US Virgin Islands
    • Hagen A., Pellom B., and Cole R. Children speech recognition with application to interactive books and tutors. Proceedings of ASRU (2003), St. Thomas, US Virgin Islands 186-191
    • (2003) Proceedings of ASRU , pp. 186-191
    • Hagen, A.1    Pellom, B.2    Cole, R.3
  • 105
    • 19944415893
    • Implicit modelling of pronunciation variation in automatic speech recognition
    • Hain T. Implicit modelling of pronunciation variation in automatic speech recognition. Speech Communication 46 2 (2005) 171-188
    • (2005) Speech Communication , vol.46 , Issue.2 , pp. 171-188
    • Hain, T.1
  • 106
    • 34547934712
    • Hain, T., Woodland, P.C., 1999. Dynamic HMM selection for continuous speech recognition. In: Proceedings of Eurospeech, Budapest, Hungary, pp. 1327-1330.
  • 107
    • 0024914051
    • Hansen, J.H.L., 1989. Evaluation of acoustic correlates of speech under stress for robust speech recognition. In: IEEE Proceedings 15th Northeast Bioengineering Conference, Boston, MA, pp. 31-32.
  • 108
    • 0027307739
    • Hansen, J.H.L., 1993. Adaptive source generator compensation and enhancement for speech recognition in noisy stressful environments. In: Proceedings of ICASSP, Minneapolis, Minnesota, pp. 95-98.
  • 109
    • 34547948253
    • A source generator framework for analysis of acoustic correlates of speech under stress. part i: pitch, duration, and intensity effects
    • Hansen J.H.L. A source generator framework for analysis of acoustic correlates of speech under stress. part i: pitch, duration, and intensity effects. The Journal of the Acoustical Society of America (1995)
    • (1995) The Journal of the Acoustical Society of America
    • Hansen, J.H.L.1
  • 110
    • 0030283741
    • Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition
    • Hansen J.H.L. Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition. Speech Communication, Special Issue on Speech Under Stress 20 2 (1996) 151-170
    • (1996) Speech Communication, Special Issue on Speech Under Stress , vol.20 , Issue.2 , pp. 151-170
    • Hansen, J.H.L.1
  • 111
    • 0025681006
    • Hanson, B.A., Applebaum, T., 1990. Robust speaker-independent word recognition using instantaneous, dynamic and acceleration features: experiments with Lombard and noisy speech. In: Proceedings of ICASSP, Albuquerque, New Mexico, pp. 857-860.
  • 115
    • 19944423811
    • Pronunciation modeling using a finite-state transducer representation
    • Hazen T.J., Hetherington I.L., Shu H., and Livescu K. Pronunciation modeling using a finite-state transducer representation. Speech Communication 46 2 (2005) 189-203
    • (2005) Speech Communication , vol.46 , Issue.2 , pp. 189-203
    • Hazen, T.J.1    Hetherington, I.L.2    Shu, H.3    Livescu, K.4
  • 116
    • 0042594404
    • Fast model selection based speaker adaptation for nonnative speech
    • He X., and Zhao Y. Fast model selection based speaker adaptation for nonnative speech. IEEE Transactions on Speech and Audio Processing 11 4 (2003) 298-307
    • (2003) IEEE Transactions on Speech and Audio Processing , vol.11 , Issue.4 , pp. 298-307
    • He, X.1    Zhao, Y.2
  • 117
    • 85009067731
    • Hegde, R.M., Murthy, H.A., Gadde, V.R.R., 2004. Continuous speech recognition using joint features derived from the modified group delay function and MFCC. In: Proceedings of ICSLP, Jeju, Korea, pp. 905-908.
  • 118
    • 33646756506
    • Hegde, R.M., Murthy, H.A., Rao, G.V.R., 2005. Speech processing using joint features derived from the modified group delay function. In: Proceedings of ICASSP, vol. I. Philadelphia, PA, pp. 541-544.
  • 119
  • 121
    • 34547961189
    • Hermansky, H., Sharma, S., 1998. TRAPS: classifiers of temporal patterns. In: Proceedings of ICSLP, Sydney, Australia, pp. 1003-1006.
  • 122
    • 34547945010
    • Hetherington, L., 1995. New words: Effect on recognition performance and incorporation issues. In: Proceedings of Eurospeech, Madrid, Spain, pp. 1645-1648.
  • 123
    • 2942568545
    • Prosodic and other cues to speech recognition failures
    • Hirschberg J., Litman D., and Swerts M. Prosodic and other cues to speech recognition failures. Speech Communication 43 1-2 (2004) 155-175
    • (2004) Speech Communication , vol.43 , Issue.1-2 , pp. 155-175
    • Hirschberg, J.1    Litman, D.2    Swerts, M.3
  • 124
    • 34547949601
    • Holmes, J.N., Holmes, W.J., Garner, P.N., 1997. Using formant frequencies in speech recognition. In: Proceedings of Eurospeech, Rhodes, Greece, pp. 2083-2086.
  • 125
    • 34547960808
    • Combining frame and segment based models for large vocabulary continuous speech recognition
    • Keystone, Colorado
    • Hon H.W., and Wang K. Combining frame and segment based models for large vocabulary continuous speech recognition. Proceedings of ASRU (1999), Keystone, Colorado
    • (1999) Proceedings of ASRU
    • Hon, H.W.1    Wang, K.2
  • 126
    • 85009113198
    • Huang, C., Chen, T., Li, S., Chang, E., Zhou, J., 2001. Analysis of speaker variability. In: Proceedings of Eurospeech, Aalborg, Denmark, pp. 1377-1380.
  • 127
    • 0026388713
    • Huang, X., Lee, K., 1991. On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition. In: Proceedings of ICASSP, Toronto, Canada, pp. 877-880.
  • 128
    • 0033677211
    • Huang, H.C.-H., Seide, F., 2000. Pitch tracking and tone features for Mandarin speech recognition. In: Proceedings of ICASSP, vol. 3. pp. 1523-1526.
  • 129
    • 34547955984
    • Humphries, J.J., Woodland, P.C., Pearce, D., 1996. Using accent-specific pronunciation modelling for robust speech recognition. In: Proceedings of ICSLP, Rhodes, Greece, pp. 2367-2370.
  • 130
    • 0141517832
    • Spectral signal processing for ASR
    • Keystone, Colorado
    • Hunt M.J. Spectral signal processing for ASR. Proceedings of ASRU (1999), Keystone, Colorado
    • (1999) Proceedings of ASRU
    • Hunt, M.J.1
  • 131
    • 85009080383
    • Hunt, M.J., 2004. Speech recognition, syllabification and statistical phonetics. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 132
    • 0024905238
    • Hunt, M.J., Lefebvre, C., 1989. A comparison of several acoustic representations for speech recognition with degraded and undegraded speech. In: Proceedings of ICASSP, Glasgow, UK, pp. 262-265.
  • 133
    • 34547962708
    • Iivonen, A., Harinen, K., Keinanen, L., Kirjavainen, J., Meister, E., Tuuri, L., 2003. Development of a multiparametric speaker profile for speaker recognition. In: Proceedings of ICPhS, Barcelona, Spain, pp. 695-698.
  • 134
    • 34547956725
    • ISCA Tutorial and Research Workshop, 2002. Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology (PMLA-2002).
  • 135
    • 1142288445
    • Word perception in fast speech: artificially time-compressed vs. naturally produced fast speech
    • Janse E. Word perception in fast speech: artificially time-compressed vs. naturally produced fast speech. Speech Communication 42 2 (2004) 155-173
    • (2004) Speech Communication , vol.42 , Issue.2 , pp. 155-173
    • Janse, E.1
  • 136
    • 34547954621
    • Acoustic feature selection using speech recognizers
    • Keystone, Colorado
    • Jiang K., and Huang X. Acoustic feature selection using speech recognizers. Proceedings of ASRU (1999), Keystone, Colorado
    • (1999) Proceedings of ASRU
    • Jiang, K.1    Huang, X.2
  • 138
    • 0034853397
    • Jurafsky, D., Ward, W., Jianping, Z., Herold, K., Xiuyang, Y., Sen, Z., 2001. What kind of pronunciation variation is hard for triphones to model? In: Proceedings of ICASSP, Salt Lake City, Utah, pp. 577-580.
  • 139
    • 34547958273
    • Kajarekar, S., Malayath, N., Hermansky, H., 1999. Analysis of sources of variability in speech. In: Proceedings of Eurospeech, Budapest, Hungary, pp. 343-346.
  • 140
    • 33745191628
    • Analysis of speaker and channel variability in speech
    • Keystone, Colorado
    • Kajarekar S., Malayath N., and Hermansky H. Analysis of speaker and channel variability in speech. Proceedings of ASRU (1999), Keystone, Colorado
    • (1999) Proceedings of ASRU
    • Kajarekar, S.1    Malayath, N.2    Hermansky, H.3
  • 142
    • 0030371812
    • Köhler, J., 1996. Multilingual phonemes recognition exploiting acoustic-phonetic similarities of sounds. In: Proceedings of ICSLP, Philadelphia, PA, pp. 2195-2198.
  • 143
    • 34547960807
    • Konig, Y., Morgan, N., 1992. GDNN: a gender-dependent neural network for continuous speech recognition. In: Proceedings of Int. Joint Conf. on Neural Networks, vol. 2. Baltimore, Maryland, pp. 332-337.
  • 144
    • 1842580255
    • Rapid online adaptation using speaker space model evolution
    • Kim D.K., and Kim N.S. Rapid online adaptation using speaker space model evolution. Speech Communication 42 3-4 (2004) 467-478
    • (2004) Speech Communication , vol.42 , Issue.3-4 , pp. 467-478
    • Kim, D.K.1    Kim, N.S.2
  • 145
    • 0032136330
    • Robust speech recognition using the modulation spectrogram
    • Kingsbury B., Morgan N., and Greenberg S. Robust speech recognition using the modulation spectrogram. Speech Communication 25 1-3 (1998) 117-132
    • (1998) Speech Communication , vol.25 , Issue.1-3 , pp. 117-132
    • Kingsbury, B.1    Morgan, N.2    Greenberg, S.3
  • 146
    • 17344389852
    • Kingsbury, B., Saon, G., Mangu, L., Padmanabhan, M., Sarikaya, R., 2002. Robust speech recognition in noisy environments: the 2001 IBM SPINE evaluation system. In: Proceedings of ICASSP, vol. I. Orlando, FL, pp. 53-56.
  • 147
    • 34547928978
    • Kirchhoff, K., 1998. Combining articulatory and acoustic information for speech recognition in noise and reverberant environments. In: Proceedings of ICSLP, Sydney, Australia, pp. 891-894.
  • 148
    • 44849123618
    • Kitaoka, N., Yamada, D., Nakagawa, S., 2002. Speaker independent speech recognition using features based on glottal sound source. In: Proceedings of ICSLP, Denver, USA, pp. 2125-2128.
  • 149
    • 85009233038
    • Kleinschmidt, M., Gelbart, D., 2002. Improving word accuracy with Gabor feature extraction. In: Proceedings of ICSLP, Denver, Colorado, pp. 25-28.
  • 150
    • 0030672089
    • Korkmazskiy, F., Juang, B.-H., Soong, F., 1997. Generalized mixture of HMMs for continuous speech recognition. In: Proceedings of ICASSP, vol. 2. pp. 1443-1446.
  • 151
    • 85009062725
    • Korkmazsky, F., Deviren, M., Fohr, D., Illina, I., 2004. Hidden factor dynamic Bayesian networks for speech recognition. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 152
    • 0039670397
    • Kubala, F., Anastasakos, A., Makhoul, J., Nguyen, L., Schwartz, R., Zavaliagkos, E., 1994. Comparative experiments on large vocabulary speech recognition. In: Proceedings of ICASSP, Adelaide, Australia, pp. 561-564.
  • 153
    • 0032289099
    • Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition
    • Kumar N., and Andreou A.G. Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition. Speech Communication 26 4 (1998) 283-297
    • (1998) Speech Communication , vol.26 , Issue.4 , pp. 283-297
    • Kumar, N.1    Andreou, A.G.2
  • 154
    • 0032187138
    • An inverse signal approach to computing the envelope of a real valued signal
    • Kumaresan R. An inverse signal approach to computing the envelope of a real valued signal. IEEE Signal Processing Letters 5 10 (1998) 256-259
    • (1998) IEEE Signal Processing Letters , vol.5 , Issue.10 , pp. 256-259
    • Kumaresan, R.1
  • 155
    • 0033004349
    • Model-based approach to envelope and positive instantaneous frequency estimation of signals with speech applications
    • Kumaresan R., and Rao A. Model-based approach to envelope and positive instantaneous frequency estimation of signals with speech applications. The Journal of the Acoustical Society of America 105 3 (1999) 1912-1924
    • (1999) The Journal of the Acoustical Society of America , vol.105 , Issue.3 , pp. 1912-1924
    • Kumaresan, R.1    Rao, A.2
  • 156
    • 0030363012
    • Kumpf, K., King, R.W., 1996. Automatic accent classification of foreign accented Australian English speech. In: Proceedings of ICSLP, Philadelphia, PA, pp. 1740-1743.
  • 157
    • 34547939546
    • Kuwabara, H., 1997. Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate. In: Proceedings of Eurospeech, Rhodes, Greece, pp. 1003-1006.
  • 159
    • 33646805430
    • Lamel, L., Gauvain, J.-L., 2005. Alternate phone models for conversational speech. In: Proceedings of ICASSP, Philadelphia, Pennsylvania, pp. 1005-1008.
  • 160
    • 80054370614
    • Cambridge University Press, Cambridge
    • Laver J. Principles of Phonetics (1994), Cambridge University Press, Cambridge
    • (1994) Principles of Phonetics
    • Laver, J.1
  • 161
    • 34548727590
    • Lawson, A.D., Harris, D.M., Grieco, J.J., 2003. Effect of foreign accent on speech recognition in the NATO N-4 corpus. In: Proceedings of Eurospeech, Geneva, Switzerland, pp. 1505-1508.
  • 162
    • 0026142334
    • A study on speaker adaptation of the parameters of continuous density Hidden Markov Models
    • Lee C., Lin C., and Juang B. A study on speaker adaptation of the parameters of continuous density Hidden Markov Models. IEEE Transactions Signal Processing 39 4 (1991) 806-813
    • (1991) IEEE Transactions Signal Processing , vol.39 , Issue.4 , pp. 806-813
    • Lee, C.1    Lin, C.2    Juang, B.3
  • 163
    • 0027192618
    • Lee, C.-H., Gauvain, J.-L., 1993. Speaker adaptation based on MAP estimation of HMM parameters. In: Proceedings of ICASSP, vol. 2. pp. 558-561.
  • 164
    • 0029747183
    • Lee, L., Rose, R.C., 1996. Speaker normalization using efficient frequency warping procedures. In: Proceedings of ICASSP, vol. 1. Atlanta, Georgia, pp. 353-356.
  • 165
    • 0032969462
    • Acoustics of children's speech: developmental changes of temporal and spectral parameters
    • Lee S., Potamianos A., and Narayanan S. Acoustics of children's speech: developmental changes of temporal and spectral parameters. The Journal of the Acoustical Society of America 105 (1999) 1455-1468
    • (1999) The Journal of the Acoustical Society of America , vol.105 , pp. 1455-1468
    • Lee, S.1    Potamianos, A.2    Narayanan, S.3
  • 166
    • 0029288633
    • Maximum likelihood linear regression for speaker adaptation of continuous density Hidden Markov Models
    • Leggetter C., and Woodland P. Maximum likelihood linear regression for speaker adaptation of continuous density Hidden Markov Models. Computer Speech and Language 9 2 (1995) 171-185
    • (1995) Computer Speech and Language , vol.9 , Issue.2 , pp. 171-185
    • Leggetter, C.1    Woodland, P.2
  • 167
    • 34547960603
    • Leonard, R.G., 1984. A database for speaker independent digit recognition. In: Proceedings of ICASSP, San Diego, US, pp. 328-331.
  • 168
    • 21644484352
    • Lin, X., Simske, S., 2004. Phoneme-less hierarchical accent classification. In: Proceedings of Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, vol. 2. Pacific Grove, CA, pp. 1801-1804.
  • 169
    • 34547953237
    • Lincoln, M., Cox, S.J., Ringland, S., 1997. A fast method of speaker normalisation using formant estimation. In: Proceedings of Eurospeech, Rhodes, Greece, pp. 2095-2098.
  • 170
    • 0000665734
    • Explaining phonetic variation: a sketch of the H&H theory
    • Hardcastle W.J., and Marchal A. (Eds), Kluwer Academic Publishers
    • Lindblom B. Explaining phonetic variation: a sketch of the H&H theory. In: Hardcastle W.J., and Marchal A. (Eds). Speech Production and Speech Modelling (1990), Kluwer Academic Publishers
    • (1990) Speech Production and Speech Modelling
    • Lindblom, B.1
  • 171
    • 0031187171
    • Speech recognition by machines and humans
    • Lippmann R.P. Speech recognition by machines and humans. Speech Communication 22 1 (1997) 1-15
    • (1997) Speech Communication , vol.22 , Issue.1 , pp. 1-15
    • Lippmann, R.P.1
  • 172
    • 0023263708
    • Lippmann, R.P., Martin, E.A., Paul, D.B., 1987. Multi-style training for robust isolated-word speech recognition. In: Proceedings of ICASSP, Dallas, TX, pp. 705-708.
  • 173
    • 0031220487
    • Effects of phase on the perception of intervocalic stop consonants
    • Liu L., He J., and Palm G. Effects of phase on the perception of intervocalic stop consonants. Speech Communication 22 4 (1997) 403-417
    • (1997) Speech Communication , vol.22 , Issue.4 , pp. 403-417
    • Liu, L.1    He, J.2    Palm, G.3
  • 174
    • 34547960249
    • Liu, S., Doyle, S., Morris, A., Ehsani, F., 1998. The effect of fundamental frequency on Mandarin speech recognition. In: Proceedings of ICSLP, vol. 6. Sydney, Australia, pp. 2647-2650.
  • 175
    • 85126499605
    • Liu, W.K., Fung, P., 2000. MLLR-based accent model adaptation without accented data. In: Proceedings of ICSLP, vol. 3. Beijing, China, pp. 738-741.
  • 176
    • 0033705981
    • Livescu, K., Glass, J., 2000. Lexical modeling of non-native speech for automatic speech recognition. In: Proceedings of ICASSP, vol. 3. Istanbul, Turkey, pp. 1683-1686.
  • 177
    • 85009101259
    • Ljolje, A., 2002. Speech recognition using fundamental frequency and voicing in acoustic modeling. In: Proceedings of ICSLP, Denver, USA, pp. 2137-2140.
  • 178
    • 84995620255
    • Llitjos, A.F., Black, A.W., 2001. Knowledge of language origin improves pronunciation accuracy of proper names. In: Proceedings of Eurospeech, Aalborg, Denmark.
  • 180
    • 3042673775
    • Linear dimensionality reduction via a heteroscedastic extension of LDA: The Chernoff criterion
    • Loog M., and Duin R.P.W. Linear dimensionality reduction via a heteroscedastic extension of LDA: The Chernoff criterion. IEEE Transactions Pattern Analysis and Machine Intelligence 26 6 (2004) 732-739
    • (2004) IEEE Transactions Pattern Analysis and Machine Intelligence , vol.26 , Issue.6 , pp. 732-739
    • Loog, M.1    Duin, R.P.W.2
  • 181
    • 85009070826
    • Magimai-Doss, M., Stephenson, T.A., Ikbal, S., Bourlard, H., 2004. Modelling auxiliary features in tandem systems. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 182
    • 84946742164
    • Maison, B., 2003. Pronunciation modeling for names of foreign origin. In: Proceedings of ASRU, US Virgin Islands, pp. 429-434.
  • 183
    • 33646780066
    • Mak, B., Hsiao, R., 2004. Improving eigenspace-based MLLR adaptation by kernel PCA. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 184
    • 0016495091
    • Makhoul, J., 1975. Linear prediction: a tutorial review. In: Proceedings of IEEE, vol. 63(4) pp. 561-580.
  • 185
    • 0034296009
    • Finding consensus in speech recognition: Word-error minimization and other applications of confusion networks
    • Mangu L., Brill E., and Stolcke A. Finding consensus in speech recognition: Word-error minimization and other applications of confusion networks. Computer Speech and Language 14 4 (2000) 373-400
    • (2000) Computer Speech and Language , vol.14 , Issue.4 , pp. 373-400
    • Mangu, L.1    Brill, E.2    Stolcke, A.3
  • 186
    • 0024766457
    • A family of distortion measures based upon projection operation for robust speech recognition
    • Mansour D., and Juang B.H. A family of distortion measures based upon projection operation for robust speech recognition. IEEE Transactions on Acoustics, Speech and Signal Processing 37 (1989) 1659-1671
    • (1989) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.37 , pp. 1659-1671
    • Mansour, D.1    Juang, B.H.2
  • 188
    • 0141590455
    • Markov, K., Nakamura, S., 2003. Hybrid HMM/BN LVCSR system integrating multiple acoustic features. In: Proceedings of ICASSP, vol. 1. pp. 840-843.
  • 189
    • 85009228917
    • Martin, A., Mauuary, L., 2003. Voicing parameter and energy-based speech/non-speech detection for speech recognition in adverse conditions. In: Proceedings of Eurospeech, Geneva, Switzerland, pp. 3069-3072.
  • 190
    • 0347211260
    • Martinez, F., Tapias, D., Alvarez, J., 1998. Towards speech rate independence in large vocabulary continuous speech recognition. In: Proceedings of ICASSP, Seattle, Washington, pp. 725-728.
  • 191
    • 34547931329
    • Martinez, F., Tapias, D., Alvarez, J., Leon, P., 1997. Characteristics of slow, average and fast speech and their effects in large vocabulary continuous speech recognition. In: Proceedings of Eurospeech, Rhodes, Greece, pp. 469-472.
  • 192
    • 85009152776
    • Matsuda, S., Jitsuhiro, T., Markov, K., Nakamura, S., 2004. Speech recognition system robust to noise and speaking styles. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 193
    • 34547928785
    • Mertins, A., Rademacher, J., 2005. Vocal tract length invariant features for automatic speech recognition. In: Proceedings of ASRU, Cancun, Mexico, pp. 308-312.
  • 194
    • 85009152057
    • Messina, R., Jouvet, D., 2004. Context dependent long units for speech recognition. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 195
    • 0030369274
    • Milner, B.P., 1996. Inclusion of temporal information into features for speech recognition. In: Proceedings of ICSLP, Philadelphia, PA, pp. 256-259.
  • 196
    • 34547933351
    • Mirghafori, N., Fosler, E., Morgan, N., 1995. Fast speakers in large vocabulary continuous speech recognition: analysis & antidotes. In: Proceedings of Eurospeech, Madrid, Spain, pp. 491-494.
  • 197
    • 0029748337
    • Mirghafori, N., Fosler, E., Morgan, N., 1996. Towards robustness to fast speech in ASR. In: Proceedings of ICASSP, Atlanta, Georgia, pp. 335-338.
  • 199
    • 34547949789
    • Mokhtari, P., 1998. An acoustic-phonetic and articulatory study of speech-speaker dichotomy. PhD thesis, The University of New South Wales, Canberra, Australia.
  • 200
    • 4544224866
    • Morgan, N., Chen, B., Zhu, Q., Stolcke, A., 2004. TRAPping conversational speech: extending TRAP/tandem approaches to conversational telephone speech recognition. In: Proceedings of ICASSP, vol. 1. Montreal, Canada, pp. 536-539.
  • 201
    • 34547944260
    • Morgan, N., Fosler, E., Mirghafori, N., 1997. Speech recognition using on-line estimation of speaking rate. In: Proceedings of Eurospeech, vol. 4. Rhodes, Greece, pp. 2079-2082.
  • 202
    • 84892163293
    • Morgan, N., Fosler-Lussier, E., 1998. Combining multiple estimators of speaking rate. In: Proceedings of ICASSP, Seattle, pp. 729-732.
  • 203
    • 0027447292
    • Toward the simulation of emotion in synthetic speech: a review of the literature on human vocal emotion
    • Murray I.R., and Arnott J.L. Toward the simulation of emotion in synthetic speech: a review of the literature on human vocal emotion. The Journal of the Acoustical Society of America 93 2 (1993) 1097-1108
    • (1993) The Journal of the Acoustical Society of America , vol.93 , Issue.2 , pp. 1097-1108
    • Murray, I.R.1    Arnott, J.L.2
  • 204
    • 34547961576
    • Masaki Naito, Y.S., Li Deng, 1998. Speaker clustering for speech recognition using the parameters characterizing vocal-tract dimensions. In: Proceedings of ICASSP, Seattle, WA, pp. 1889-1893.
  • 205
    • 0036289926
    • Nanjo, H., Kawahara, T., 2002. Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition. In: Proceedings of ICASSP, vol. 1. Orlando, FL, pp. 725-728.
  • 206
    • 3042704466
    • Language model and speaking rate adaptation for spontaneous presentation speech recognition
    • Nanjo H., and Kawahara T. Language model and speaking rate adaptation for spontaneous presentation speech recognition. IEEE Transactions on Speech and Audio Processing 12 4 (2004) 391-400
    • (2004) IEEE Transactions on Speech and Audio Processing , vol.12 , Issue.4 , pp. 391-400
    • Nanjo, H.1    Kawahara, T.2
  • 207
    • 34547954217
    • National Institute of Standards and Technology, 2001. SCLITE scoring software. ftp://jaguar.ncls.nist.gov/pub/sctk-1.2.tar.Z.
  • 208
    • 34547952674
    • Nearey, T.M., 1978. Phonetic feature systems for vowels. Indiana University Linguistics Club, Bloomington, Indiana, USA.
  • 209
    • 0030635350
    • Neti, C., Roukos, S., 1997. Phone-context specific gender-dependent acoustic-models for continuous speech recognition. In: Proceedings of ASRU, Santa Barbara, CA, pp. 192-198.
  • 211
    • 0030374920
    • Neumeyer, L., Franco, H., Weintraub, M., Price, P., 1996. Automatic text-independent pronunciation scoring of foreign language student speech. In: Proceedings of ICSLP, Philadelphia, PA, pp. 1457-1460.
  • 213
    • 3042569988
    • Nguyen, P., Rigazio, L., Junqua, J.-C., 2003. Large corpus experiments for broadcast news recognition. In: Proceedings of Eurospeech, Geneva, Switzerland, pp. 1837-1840.
  • 215
    • 33947688163
    • Kamal Omar, M., Chen, K., Hasegawa-Johnson, M., Bradman, Y., 2002. An evaluation of using mutual information for selection of acoustic features representation of phonemes for speech recognition. In: Proceedings of ICSLP, Denver, CO, pp. 2129-2132.
  • 216
    • 0036298107
    • Kamal Omar, M., Hasegawa-Johnson, M., 2002. Maximum mutual information based acoustic features representation of phonological features for speech recognition. In: Proceedings of ICASSP, vol. 1. Montreal, Canada, pp. 81-84.
  • 217
    • 84926060821
    • Odell, J.J., Woodland, P.C., Valtchev, V., Young, S.J., 1994. Large vocabulary continuous speech recognition using HTK. In: Proceedings of ICASSP, vol. 2. Adelaide, Australia, pp. 125-128.
  • 218
    • 34547938564
    • Ono, Y., Wakita, H., Zhao, Y., 1993. Speaker normalization using constrained spectra shifts in auditory filter domain. In: Proceedings of Eurospeech, Berlin, Germany, pp. 355-358.
  • 220
    • 0032627056
    • O'Shaughnessy, D., Tolba, H., 1999. Towards a robust/fast continuous speech recognition system using a voiced-unvoiced decision. In: Proceedings of ICASSP, vol. 1. Phoenix, Arizona, pp. 413-416.
  • 221
    • 0029747051
    • Padmanabhan, M., Bahl, L., Nahamoo, D., Picheny, M., 1996. Speaker clustering and transformation for speaker adaptation in large-vocabulary speech recognition systems. In: Proceedings of ICASSP, Atlanta, GA, pp. 701-704.
  • 224
    • 85009100883
    • Paliwal, K.K., Alsteris, L., 2003. Usefulness of phase spectrum in human speech perception. In: Proceedings of Eurospeech, Geneva, Switzerland, pp. 2117-2120.
  • 225
    • 85009192384
    • Paliwal, K.K., Atal, B.S., 2003. Frequency-related representation of speech. In: Proceedings of Eurospeech, Geneva, Switzerland, pp. 65-68.
  • 226
    • 0023246158
    • Paul, D.B., 1987. A speaker-stress resistant HMM isolated word recognizer. In: Proceedings of ICASSP, Dallas, Texas, pp. 713-716.
  • 227
    • 0030682299
    • Paul, D.B., 1997. Extensions to phone-state decision-tree clustering: single tree and tagged clustering. In: Proceedings of ICASSP, vol. 2. Munich, Germany, pp. 1487-1490.
  • 228
    • 0032665650
    • Peters, S.D., Stubley, P., Valin, J.-M., 1999. On the limits of speech recognition in noise. In: Proceedings of ICASSP'99. Phoenix, Arizona, pp. 365-368.
  • 230
    • 34547936772
    • Pfau, T., Ruske, G., 1998. Creating Hidden Markov Models for fast speech. In: Proceedings of ICSLP, Sydney, Australia.
  • 231
    • 34547948996
    • Pitz, M., 2005. Investigations on Linear Transformations for Speaker Adaptation and Normalization. PhD thesis, RWTH Aachen University.
  • 232
    • 0032595183
    • Modeling of the glottal flow derivative waveform with application to speaker identification
    • Plumpe M., Quatieri T., and Reynolds D. Modeling of the glottal flow derivative waveform with application to speaker identification. IEEE Transactions on Speech and Audio Processing 7 5 (1999) 569-586
    • (1999) IEEE Transactions on Speech and Audio Processing , vol.7 , Issue.5 , pp. 569-586
    • Plumpe, M.1    Quatieri, T.2    Reynolds, D.3
  • 235
    • 34547952870
    • Potamianos, G., Narayanan, S., Lee, S., 1997. Analysis of children's speech: duration, pitch and formants. In: Proceedings of Eurospeech, Rhodes, Greece, pp. 473-476.
  • 236
    • 34547962145
    • Potamianos, G., Narayanan, S., Lee, S., 1997. Automatic speech recognition for children. In: Proceedings of Eurospeech, Rhodes, Greece, pp. 2371-2374.
  • 238
    • 0036460984
    • Theory and practice of acoustic confusability
    • Printz H., and Olsen P.A. Theory and practice of acoustic confusability. Computer Speech and Language 16 1 (2002) 131-164
    • (2002) Computer Speech and Language , vol.16 , Issue.1 , pp. 131-164
    • Printz, H.1    Olsen, P.A.2
  • 239
    • 11144222882
    • Comparison and combination of features in a hybrid HMM/MLP and a HMM/GMM speech recognition system
    • Pujol P., Pol S., Nadeu C., Hagen A., and Bourlard H. Comparison and combination of features in a hybrid HMM/MLP and a HMM/GMM speech recognition system. IEEE Transactions on Speech and Audio Processing SAP-13 1 (2005) 14-22
    • (2005) IEEE Transactions on Speech and Audio Processing , vol.SAP-13 , Issue.1 , pp. 14-22
    • Pujol, P.1    Pol, S.2    Nadeu, C.3    Hagen, A.4    Bourlard, H.5
  • 241
    • 0024922972
    • Rabiner, L.R., Lee, C.H., Juang, B.H., Wilpon, J.G., 1989. HMM clustering for connected word recognition. In: Proceedings of ICASSP, vol. 1. Glasgow, Scotland, pp. 405-408.
  • 242
    • 85009097014
    • Raux, A., 2004. Automated lexical adaptation and speaker clustering based on pronunciation habits for non-native speech recognition. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 243
    • 0020779884
    • Frequency spectrum deviation between speakers
    • Saito S., and Itakura F. Frequency spectrum deviation between speakers. Speech Communication 2 (1983) 149-152
    • (1983) Speech Communication , vol.2 , pp. 149-152
    • Saito, S.1    Itakura, F.2
  • 244
    • 85009085052
    • Sakauchi, S., Yamaguchi, Y., Takahashi, S., Kobashikawa, S., 2004. Robust speech recognition based on HMM composition and modified Wiener filter. In: Proceedings of Interspeech. Jeju Island, Korea, pp. 2053-2056.
  • 245
    • 0033677121
    • Saon, G., Padmanabhan, M., Gopinath, R., Chen, S., 2000. Maximum likelihood discriminant feature spaces. In: Proceedings of ICASSP, pp. 1129-1132.
  • 246
    • 0030715426
    • Schaaf, T., Kemp, T., 1997. Confidence measures for spontaneous speech recognition. In: Proceedings of ICASSP, Munich, Germany, pp. 875-878.
  • 248
    • 30644477788
    • Schimmel, S., Atlas, L., 2005. Coherent envelope detection for modulation filtering of speech. In: Proceedings of ICASSP, vol. 1. Philadelphia, USA, pp. 221-224.
  • 250
    • 34547941307
    • Schötz, S., 2001. A perceptual study of speaker age. In: Working paper 49, Lund University, Dept. of Linguistics, pp. 136-139.
  • 251
    • 34547926704
    • Schultz, T., Waibel, A., 1998. Language independent and language adaptive large vocabulary speech recognition. In: Proceedings of ICSLP, vol. 5. Sydney, Australia, pp. 1819-1822.
  • 252
    • 34547950397
    • Schwartz, R., Barry, C., Chow, Y.-L., Derr, A., Feng, M.-W., Kimball, O., Kubala, F., Makhoul, J., Vandegrift, J., 1989. The BBN BYBLOS continuous speech recognition system. In: Proceedings of Speech and Natural Language Workshop. Philadelphia, Pennsylvania, pp. 21-23.
  • 253
    • 34547958446
    • Selouani, S.-A., Tolba, H., O'Shaughnessy, D., 2002. Distinctive features, formants and cepstral coefficients to improve automatic speech recognition. In: Conference on Signal Processing, Pattern Recognition and Applications, IASTED. Crete, Greece, pp. 530-535.
  • 254
    • 19944366274
    • Statistical modeling of phonological rules through linguistic hierarchies
    • Seneff S., and Wang C. Statistical modeling of phonological rules through linguistic hierarchies. Speech Communication 46 2 (2005) 204-216
    • (2005) Speech Communication , vol.46 , Issue.2 , pp. 204-216
    • Seneff, S.1    Wang, C.2
  • 255
    • 79960308044
    • Shi, Y.Y., Liu, J., Liu, R.S., 2002. Discriminative HMM stream model for Mandarin digit string speech recognition. In: Proceedings of Int. Conf. on Signal Processing, vol. 1. Beijing, China, pp. 528-531.
  • 256
    • 33645586600
    • Shinozaki, T., Furui, S., 2003. Hidden mode HMM using Bayesian network for modeling speaking rate fluctuation. In: Proceedings of ASRU. US Virgin Islands, pp. 417-422.
  • 257
    • 85009080958
    • Shinozaki, T., Furui, S., 2004. Spontaneous speech recognition using a massively parallel decoder. In: Proceedings of ICSLP, Jeju Island, Korea, pp. 1705-1708.
  • 258
    • 85009064115
    • Shobaki, K., Hosom, J.-P., Cole, R., 2000. The OGI kids speech corpus and recognizers. In: Proceedings of ICSLP, Beijing, China, pp. 564-567.
  • 259
    • 34547962510
    • Siegler, M.A., 1995. Measuring and compensating for the effects of speech rate in large vocabulary continuous speech recognition. PhD thesis, Carnegie Mellon University.
  • 260
    • 0028996973
    • Siegler, M.A., Stern, R.M., 1995. On the effect of speech rate in large vocabulary speech recognition system. In: Proceedings of ICASSP, Detroit, Michigan, pp. 612-615.
  • 261
    • 85009242921
    • Singer, H., Sagayama, S., 1992. Pitch dependent phone modelling for HMM based speech recognition. In: Proceedings of ICASSP, vol. 1. San Francisco, CA, pp. 273-276.
  • 262
    • 0028996988
    • Slifka, J., Anderson, T.R., 1995. Speaker modification with LPC pole analysis. In: Proceedings of ICASSP, Detroit, MI, pp. 644-647.
  • 263
    • 34547959054
    • Song, M.G., Jung, H.I., Shim, K.-J., Kim, H.S., 1998. Speech recognition in car noise environments using multiple models according to noise masking levels. In: Proceedings of ICSLP.
  • 264
    • 34547946786
    • Sotillo, C., Bard, E.G., 1998. Is hypo-articulation lexically constrained? In: Proceedings of SPoSS. Aix-en-Provence, pp. 109-112.
  • 265
    • 15844428932
    • Human and machine consonant recognition
    • Sroka J.J., and Braida L.D. Human and machine consonant recognition. Speech Communication 45 4 (2005) 401-423
    • (2005) Speech Communication , vol.45 , Issue.4 , pp. 401-423
    • Sroka, J.J.1    Braida, L.D.2
  • 266
    • 0024945022
    • Steeneken, H.J.M., van Velden, J.G., 1989. Objective and diagnostic assessment of (isolated) word recognizers. In: Proceedings of ICASSP, vol. 1. Glasgow, UK, pp. 540-543.
  • 267
    • 85009135204
    • Stephenson, T.A., Bourlard, H., Bengio, S., Morris, A.C., 2000. Automatic speech recognition using dynamic Bayesian networks with both acoustic and articulatory variables. In: Proceedings of ICSLP, vol. 2. Beijing, China, pp. 951-954.
  • 269
    • 33947619591
    • Stolcke, A., Grezl, F., Hwang, M.-Y., Morgan, N., Vergyri, D., 2006. Cross-domain and cross-language portability of acoustic features estimated by multilayer perceptrons. In: Proceedings of ICASSP, vol. 1. Toulouse, France, pp. 321-324.
  • 270
    • 0033335618
    • Modeling pronunciation variation for ASR: a survey of the literature
    • Strik H., and Cucchiarini C. Modeling pronunciation variation for ASR: a survey of the literature. Speech Communication 29 2-4 (1999) 225-246
    • (1999) Speech Communication , vol.29 , Issue.2-4 , pp. 225-246
    • Strik, H.1    Cucchiarini, C.2
  • 271
    • 0028996878
    • Sun, D.X., Deng, L., 1995. Analysis of acoustic-phonetic variations in fluent speech using TIMIT. In: Proceedings of ICASSP, Detroit, Michigan, pp. 201-204.
  • 272
    • 0141813642
    • Suzuki, H., Zen, H., Nankaku, Y., Miyajima, C., Tokuda, K., Kitamura, T., 2003. Speech recognition using voice-characteristic-dependent acoustic models. In: Proceedings of ICASSP, vol. 1. Hong-Kong (canceled), pp. 740-743.
  • 273
    • 0024934062
    • Svendsen, T., Paliwal, K.K., Harborg, E., Husoy, P.O., 1989. An improved sub-word based speech recognizer. In: Proceedings of ICASSP, Glasgow, UK, pp. 108-111.
  • 274
    • 0023211850
    • Svendsen, T., Soong, F., 1987. On the automatic segmentation of speech signals. In: Proceedings of ICASSP, Dallas, Texas, pp. 77-80.
  • 275
    • 0030359630
    • Teixeira, C., Trancoso, I., Serralheiro, A., 1996. Accent identification. In: Proceedings of ICSLP, vol. 3. Philadelphia, PA, pp. 1784-1787.
  • 276
    • 0031643033
    • Thomson, D.L., Chengalvarayan, R., 1998. Use of periodicity and jitter as speech recognition feature. In: Proceedings of ICASSP, vol. 1. Seattle, WA, pp. 21-24.
  • 277
    • 0036642777
    • Use of voicing features in HMM-based speech recognition
    • Thomson D.L., and Chengalvarayan R. Use of voicing features in HMM-based speech recognition. Speech Communication 37 3-4 (2002) 197-211
    • (2002) Speech Communication , vol.37 , Issue.3-4 , pp. 197-211
    • Thomson, D.L.1    Chengalvarayan, R.2
  • 278
    • 0030682291
    • Tibrewala, S., Hermansky, H., 1997. Sub-band based recognition of noisy speech. In: Proceedings of ICASSP, Munich, Germany, pp. 1255-1258.
  • 279
    • 17544404143
    • Tolba, H., Selouani, S.A., O'Shaughnessy, D., 2002. Auditory-based acoustic distinctive features and spectral cues for automatic speech recognition using a multi-stream paradigm. In: Proceedings of ICASSP, Orlando, FL, pp. 837-840.
  • 280
    • 85009162945
    • Tolba, H., Selouani, S.A., O'Shaughnessy, D., 2003. Comparative experiments to evaluate the use of auditory-based acoustic distinctive features and formant cues for robust automatic speech recognition in low-SNR car environments. In: Proceedings of Eurospeech, Geneva, Switzerland, pp. 3085-3088.
  • 281
    • 0030643684
    • Tomlinson, M.J., Russell, M.J., Moore, R.K., Buckland, A.P., Fawley, M.A., 1997. Modelling asynchrony in speech using elementary single-signal decomposition. In: Proceedings of ICASSP, Munich, Germany, pp. 1247-1250.
  • 282
    • 34547928782
    • Townshend, B., Bernstein, J., Todic, O., Warren, E., 1998. Automatic text-independent pronunciation scoring of foreign language student speech. In: Proceedings of STiLL-1998, Stockholm, pp. 179-182.
  • 283
    • 34547953041
    • Traunmüller, H., 1997. Perception of speaker sex, age and vocal effort. Technical Report, Institutionen för lingvistik, Stockholm Universitet.
  • 284
    • 18744406714
    • Discriminative linear transforms for feature normalization and speaker adaptation in HMM estimation
    • Tsakalidis S., Doumpiotis V., and Byrne W. Discriminative linear transforms for feature normalization and speaker adaptation in HMM estimation. IEEE Transactions on Speech and Audio Processing 13 3 (2005) 367-376
    • (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.3 , pp. 367-376
    • Tsakalidis, S.1    Doumpiotis, V.2    Byrne, W.3
  • 285
    • 18744411268
    • Segmental eigenvoice with delicate eigenspace for improved speaker adaptation
    • Tsao Y., Lee S.-M., and Lee L.-S. Segmental eigenvoice with delicate eigenspace for improved speaker adaptation. IEEE Transactions on Speech and Audio Processing 13 3 (2005) 399-411
    • (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.3 , pp. 399-411
    • Tsao, Y.1    Lee, S.-M.2    Lee, L.-S.3
  • 286
    • 34547949384
    • Tuerk, A., Young, S., 1999. Modeling speaking rate using a between frame distance metric. In: Proceedings of Eurospeech, vol. 1. Budapest, Hungary, pp. 419-422.
  • 287
    • 34547947654
    • Tuerk, C., Robinson, T., 1993. A new frequency shift function for reducing inter-speaker variance. In: Proceedings of Eurospeech, Berlin, Germany, pp. 351-354.
  • 288
    • 13544261390
    • Combining active and semi-supervised learning for spoken language understanding
    • Tur G., Hakkani-Tür D., and Schapire R.E. Combining active and semi-supervised learning for spoken language understanding. Speech Communication 45 2 (2005) 171-186
    • (2005) Speech Communication , vol.45 , Issue.2 , pp. 171-186
    • Tur, G.1    Hakkani-Tür, D.2    Schapire, R.E.3
  • 289
    • 34547955587
    • Mel-cepstrum modulation spectrum (MCMS) features for robust ASR
    • St. Thomas, US Virgin Islands
    • Tyagi V., McCowan I., Bourlard H., and Misra H. Mel-cepstrum modulation spectrum (MCMS) features for robust ASR. Proceedings of ASRU (2003), St. Thomas, US Virgin Islands 381-386
    • (2003) Proceedings of ASRU , pp. 381-386
    • Tyagi, V.1    McCowan, I.2    Bourlard, H.3    Misra, H.4
  • 290
    • 33846197546
    • Tyagi, V., Wellekens, C., 2005. Cepstrum representation of speech. In: Proceedings of ASRU, Cancun, Mexico.
  • 291
    • 33745214472
    • Tyagi, V., Wellekens, C., Bourlard, H., 2005. On variable-scale piecewise stationary spectral analysis of speech signals for ASR. In: Proceedings of Interspeech, Lisbon, Portugal, pp. 209-212.
  • 292
    • 0035427182
    • Multilingual speech recognition in seven languages
    • Uebler U. Multilingual speech recognition in seven languages. Speech Communication 35 1-2 (2001) 53-69
    • (2001) Speech Communication , vol.35 , Issue.1-2 , pp. 53-69
    • Uebler, U.1
  • 293
    • 34547941680
    • Uebler, U., Boros, M., 1999. Recognition of non-native German speech with multilingual recognizers. In: Proceedings of Eurospeech, vol. 2. Budapest, Hungary, pp. 911-914.
  • 295
    • 0141701264
    • Utsuro, T., Harada, T., Nishizaki, H., Nakagawa, S., 2002. A confidence measure based on agreement among multiple LVCSR models - correlation between pair of acoustic models and confidence. In: Proceedings of ICSLP, Denver, Colorado, pp. 701-704.
  • 296
    • 0035427204
    • Recognizing speech of goats, wolves, sheep and ... non-natives
    • VanCompernolle D. Recognizing speech of goats, wolves, sheep and ... non-natives. Speech Communication 35 1-2 (2001) 71-79
    • (2001) Speech Communication , vol.35 , Issue.1-2 , pp. 71-79
    • VanCompernolle, D.1
  • 297
    • 34547931326
    • VanCompernolle, D., Smolders, J., Jaspers, P., Hellemans, T., 1991. Speaker clustering for dialectic robustness in speaker independent speech recognition. In: Proceedings of Eurospeech, Genova, Italy, pp. 723-726.
  • 298
    • 0030643685
    • Vaseghi, S.V., Harte, N., Miller, B., 1997. Multi resolution phonetic/segmental features and models for HMM-based speech recognition. In: Proceedings of ICASSP, Munich, Germany, pp. 1263-1266.
  • 299
    • 84907336951
    • Venkataraman, A., Stolcke, A., Wang, W., Vergyri, D., Ramana Rao Gadde, V., Zheng, J., 2004. An efficient repair procedure for quick transcriptions. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 300
    • 0017482612
    • Normalization of vowels by vocal-tract length and its application to vowel identification
    • Wakita H. Normalization of vowels by vocal-tract length and its application to vowel identification. IEEE Transactions on Acoustics, Speech and Signal Processing 25 (1977) 183-192
    • (1977) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.25 , pp. 183-192
    • Wakita, H.1
  • 301
    • 0027206092
    • Speaker normalization and adaptation using second-order connectionist networks
    • Watrous R. Speaker normalization and adaptation using second-order connectionist networks. IEEE Transactions on Neural Networks 4 1 (1993) 21-30
    • (1993) IEEE Transactions on Neural Networks , vol.4 , Issue.1 , pp. 21-30
    • Watrous, R.1
  • 302
    • 34547956537
    • Weintraub, M., Taussig, K., Hunicke-Smith, K., Snodgrass, A., 1996. Effect of speaking style on LVCSR performance. In: Proceedings Addendum of ICSLP, Philadelphia, PA, USA.
  • 304
    • 34547958444
    • Weng, F., Bratt, H., Neumeyer, L., Stolcke, A., 1997. A study of multilingual speech recognition. In: Proceedings of Eurospeech, vol. 1. Rhodes, Greece, pp. 359-362.
  • 305
    • 33745183789
    • Wesker, T., Meyer, B., Wagener, K., Anemüller, J., Mertins, A., Kollmeier, B., 2005. Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machines. In: Proceedings of Interspeech, Lisbon, Portugal, pp. 1273-1276.
  • 306
    • 34547946949
    • Westphal, M., 1997. The use of cepstral means in conversational speech recognition. In: Proceedings of Eurospeech, vol. 3. Rhodes, Greece, pp. 1143-1146.
  • 307
    • 34547929803
    • Williams, D.A.G., 1999. Knowing what you don't know: Roles for confidence measures in automatic speech recognition. PhD thesis, University of Sheffield.
  • 308
    • 0029747582
    • Wilpon, J.G., Jacobsen, C.N., 1996. A study of speech recognition for children and the elderly. In: Proceedings of ICASSP, vol. 1. Atlanta, Georgia, pp. 349-352.
  • 309
    • 34547962144
    • Witt, S.M., Young, S.J., 1999. Off-line acoustic modelling of non-native accents. In: Proceedings of Eurospeech, vol. 3. Budapest, Hungary, pp. 1367-1370.
  • 310
    • 0034140966
    • Phone-level pronunciation scoring and assessment for interactive language learning
    • Witt S.M., and Young S.J. Phone-level pronunciation scoring and assessment for interactive language learning. Speech Communication 30 2-3 (2000) 95-108
    • (2000) Speech Communication , vol.30 , Issue.2-3 , pp. 95-108
    • Witt, S.M.1    Young, S.J.2
  • 311
    • 4544358972
    • Wong, P.-F., Siu, M.-H., 2004. Decision tree based tone modeling for Chinese speech recognition. In: Proceedings of ICASSP, vol. 1. Montreal, Canada, pp. 905-908.
  • 312
    • 84892320894
    • Wrede, B., Fink, G.A., Sagerer, G., 2001. An investigation of modeling aspects for rate-dependent speech recognition. In: Proceedings of Eurospeech, Aalborg, Denmark.
  • 313
  • 314
    • 0024053854
    • Yang, W.-J., Lee, J.-C., Chang, Y.-C., Wang, H.-C., 1988. Hidden Markov Model for Mandarin lexical tone recognition. In: IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36. pp. 988-992.
  • 315
    • 0029745232
    • Zavaliagkos, G., Schwartz, R., McDonough, J., 1996. Maximum a posteriori adaptation for large scale HMM recognizers. In: Proceedings of ICASSP, Atlanta, Georgia, pp. 725-728.
  • 316
    • 33646793688
    • Zhang, B., Matsoukas, S., 2005. Minimum phoneme error based heteroscedastic linear discriminant analysis for speech recognition. In: Proceedings of ICASSP, vol. 1. Philadelphia, PA, pp. 925-928.
  • 317
    • 34547947108
    • Zhan, P., Waibel, A., 1997. Vocal tract length normalization for large vocabulary continuous speech recognition. Technical Report CMU-CS-97-148, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania.
  • 318
    • 0030705337
    • Zhan, P., Westphal, M., 1997. Speaker normalization based on frequency warping. In: Proceedings of ICASSP, vol. 2. Munich, Germany, pp. 1039-1042.
  • 319
    • 0028444748
    • Zhang, Y., Desilva, C.J.S., Togneri, A., Alder, M., Attikiouzel, Y., 1994. Speaker-independent isolated word recognition using multiple Hidden Markov Models. In: Proceedings IEE Vision, Image and Signal Processing, vol. 141(3). pp. 197-202.
  • 320
    • 34547937771
    • Zheng, J., Franco, H., Stolcke, A., 2000. Rate of speech modeling for large vocabulary conversational speech recognition. In: Proceedings of ISCA tutorial and research workshop on automatic speech recognition: challenges for the new Millenium. Paris, France, pp. 145-149.
  • 321
    • 85009146015
    • Zheng, J., Franco, H., Stolcke, A., 2004. Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognition. In: Proceedings of ICSLP, Jeju Island, Korea, pp. 401-404.
  • 322
    • 22544443963
    • Rapid discriminative acoustic model based on eigenspace mapping for fast speaker adaptation
    • Zhou B., and Hansen J.H.L. Rapid discriminative acoustic model based on eigenspace mapping for fast speaker adaptation. IEEE Transactions on Speech and Audio Processing 13 4 (2005) 554-564
    • (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.4 , pp. 554-564
    • Zhou, B.1    Hansen, J.H.L.2
  • 323
    • 19244369058
    • Zhou, G., Deisher, M.E., Sharma, S., 2002. Causal analysis of speech recognition failure in adverse environments. In: Proceedings of ICASSP, vol. 4. Orlando, Florida, pp. 3816-3819.
  • 324
    • 85009110489
    • Zhu, Q., Alwan, A., 2000. AM-demodulation of speech spectra and its application to noise robust speech recognition. In: Proceedings of ICSLP, vol. 1. Beijing, China, pp. 341-344.
  • 325
    • 4544257926
    • Zhu, D., Paliwal, K.K., 2004. Product of power spectrum and group delay function for speech recognition. In: Proceedings of ICASSP, pp. 125-128.
  • 326
    • 85009097225
    • Zhu, Q., Chen, B., Morgan, N., Stolcke, A., 2004. On using MLP features in LVCSR. In: Proceedings of ICSLP, Jeju Island, Korea.
  • 327
    • 85009247781
    • Zolnay, A., Schlüter, R., Ney, H., 2002. Robust speech recognition using a voiced-unvoiced feature. In: Proceedings of ICSLP, vol. 2. Denver, CO, pp. 1065-1068.
  • 328
    • 33646767079
    • Zolnay, A., Schlüter, R., Ney, H., 2005. Acoustic feature combination for robust speech recognition. In: Proceedings of ICASSP, vol. 1. Philadelphia, PA, pp. 457-460.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.