Volume 49, Issue 10-11, 2007, Pages 847-860

Acoustic variability and automatic recognition of children's speech

Author keywords

Automatic speech recognition for children; Children's speech analysis; Speaker adaptive acoustic modeling; Speaker normalization

Indexed keywords

ACOUSTICS; SPEECH ANALYSIS; SPEECH INTELLIGIBILITY; VOCABULARY CONTROL;

EID: 34547939271     PISSN: 01676393     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.specom.2007.01.002     Document Type: Article
Times cited: 120

References (48)
  • 1
    • 34547934326
    • Ackermann, U., Angelini, B., Brugnara, F., Federico, M., Giuliani, D., Gretter, R., Niemann, H., 1997. Speedata: a prototype for multilingual spoken data-entry. In: Proc. of EUROSPEECH. Rhodes, Greece, pp. 1807-1810.
  • 2
    • 0030362995
    • Anastasakos, T., McDonough, J., Schwartz, R., Makhoul, J., 1996. A Compact Model for Speaker-Adaptive Training. In: Proc. of ICSLP. Philadelphia, PA, pp. 1137-1140.
  • 3
    • 34547927849
    • Angelini, B., Brugnara, F., Falavigna, D., Giuliani, D., Gretter, R., Omologo, M., 1994. Speaker Independent Continuous Speech Recognition Using an Acoustic-Phonetic Italian Corpus. In: Proc. of ICSLP. Yokohama, Japan, pp. 1391-1394.
  • 4
    • 85009069271
    • Arunachalam, S., Gould, D., Andersen, E., Byrd, D., Narayanan, S., 2001. Politeness and Frustration Language in Child-Machine Interactions. In: Proc. of EUROSPEECH. Aalborg, Denmark, pp. 2675-2679.
  • 5
    • 51849142134
    • Banerjee, S., Beck, J.E., Mostow, J., 2003. Evaluating the Effect of Predicting Oral Reading Miscues. In: Proc. of EUROSPEECH. Geneva, Switzerland.
  • 6
    • 0034854955
    • Bertoldi, N., Brugnara, F., Cettolo, M., Federico, M., Giuliani, D., 2001. From Broadcast News to Spontaneous Dialogue Transcription: Portability Issues. In: Proc. of ICASSP. Vol. 1. Salt Lake City, UT, pp. 37-40.
  • 7
    • 4444257069
    • Boersma, P., Weenink, D., 2001. Praat, a system for doing phonetics by computer. Glot Int. 5 (9/10), 341-345.
  • 8
    • 63749112976
    • Brugnara, F., Cettolo, M., Federico, M., Giuliani, D., 2002. Issues in automatic transcription of historical audio data. In: Proc. of ICSLP. Denver, CO, pp. 1441-1444.
  • 9
    • 0030375265
    • Burnett, D.C., Fanty, M., 1996. Rapid unsupervised adaptation to children's speech on a connected-digit task. In: Proc. of ICSLP. Vol. 2. Philadelphia, PA, pp. 1145-1148.
  • 10
    • 0032204117
    • Claes, T., Dologlou, I., ten Bosch, L., Compernolle, D.V., 1998. A novel feature transformation for vocal tract length normalisation in automatic speech recognition. IEEE Trans. Speech Audio Process. 6 (6), 549-557.
  • 12
    • 0031644298
    • Das, S., Nix, D., Picheny, M., 1998. Improvements in Children's Speech Recognition Performance. In: Proc. of ICASSP. Seattle, WA.
  • 13
    • 0019053271
    • Davis, S., Mermelstein, P., 1980. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoustics Speech Signal Process. 28, 357-366.
  • 14
    • 0029725604
    • Eide, E., Gish, H., 1996. A parametric approach to vocal tract length normalization. In: Proc. of ICASSP. Atlanta, GA, pp. 346-349.
  • 15
    • 34547934131
    • Eskenazi, M., Pelton, G., 2002. Pinpointing pronunciation errors in children's speech: examining the role of the speech recognizer. In: PMLA. Aspen Lodge, CO, pp. 48-52.
  • 16
    • 0032878792
    • Fitch, W.T., Giedd, J., 1999. Morphology and development of the human vocal tract: a study using magnetic resonance imaging. J. Acoust. Soc. Am. 106 (3), 1511-1522.
  • 18
    • 0032050110
    • Gales, M.J.F., 1998. Maximum likelihood linear transformations for HMM-based speech recognition. Comput. Speech Lang. 12, 75-98.
  • 19
    • 33745219452
    • Gerosa, M., Giuliani, D., Brugnara, F., 2005. Speaker adaptive acoustic modeling with mixture of adult and children's speech. In: Proc. of INTERSPEECH/EUROSPEECH. Lisboa, Portugal, pp. 2193-2196.
  • 20
    • 0024909979
    • Gillick, L., Cox, S.J., 1989. Some statistical issues in the comparison of speech recognition algorithms. In: Proc. of ICASSP. Glasgow, pp. I-532-535.
  • 21
    • 85143191136
    • Giuliani, D., Gerosa, M., 2003. Investigating recognition of children speech. In: Proc. of ICASSP. Vol. 2. Hong Kong, pp. 137-140.
  • 22
    • 27744595137
    • Giuliani, D., Gerosa, M., Brugnara, F., 2006. Improved automatic speech recognition through speaker normalization. Comput. Speech Lang. 20 (1), 107-123.
  • 23
    • 56149113752
    • Gustafson, J., Sjölander, K., 2000. Voice transformations for improving children's speech recognition in a publicly available dialogue system. In: Proc. of ICSLP. Beijing, China, pp. 297-300.
  • 24
    • 84946707630
    • Hagen, A., Pellom, B., Cole, R., 2003. Children's speech recognition with application to interactive books and tutors. In: Proc. of the ASRU Workshop. St. Thomas Irsee, US Virgin Islands.
  • 25
    • 34547952312
    • Hagen, A., Pellom, B., Vuuren, S.V., Cole, R., 2004. Advances in children's speech recognition within an interactive literacy tutor. In: Proc. of HLT/NAACL. Boston, MA.
  • 26
  • 27
    • 33745197960
    • Kumar, S.C., Mohandas, V.P., Li, H., 2005. Multilingual speech recognition: a unified approach. In: Proc. of INTERSPEECH/EUROSPEECH. Lisboa, Portugal, pp. 3357-3360.
  • 28
    • 0029747183
    • Lee, L., Rose, R.C., 1996. Speaker normalization using efficient frequency warping procedure. In: Proc. of ICASSP. Atlanta, GA, pp. 353-356.
  • 29
    • 0032969462
    • Lee, S., Potamianos, A., Narayanan, S., 1999. Acoustics of children's speech: developmental changes of temporal and spectral parameters. J. Acoust. Soc. Am. 105 (3), 1455-1468.
  • 30
    • 0029288633
    • Leggetter, C.J., Woodland, P.C., 1995. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Comput. Speech Lang. 9, 171-185.
  • 31
    • 0030363025
    • Mak, B., Barnard, E., 1996. Phone clustering using the Bhattacharyya distance. In: Proc. of ICSLP. Philadelphia, PA, pp. 2005-2008.
  • 32
    • 34547936573
    • Mich, O., Giuliani, D., Gerosa, M., 2004. Parling: a CALL System for Children. In: Proc. of InSTIL/ICALL. Venice, Italy, pp. 169-172.
  • 33
    • 0029725864
    • Miller, J.D., Lee, S., Uchanski, R.M., Heidbreder, A.H., Richman, B.B., Tadlock, J., 1996. Creation of two children's speech databases. In: Proc. of ICASSP. Atlanta, GA, pp. 849-852.
  • 34
    • 0029748337
    • Mirghafori, N., Fosler, E., Morgan, N., 1996. Towards robustness to fast speech in ASR. In: Proc. of ICASSP. Atlanta, GA, USA, pp. 335-338.
  • 35
    • 33745194337
    • Nakamura, M., Iwano, K., Furui, S., 2005. Analysis of spectral space reduction in spontaneous speech and its effects on speech recognition performances. In: Proc. of EUROSPEECH. Lisbon, Portugal, pp. 3381-3384.
  • 37
    • 4544229743
    • Nisimura, R., Lee, A., Saruwatari, H., Shikano, K., 2004. Public speech-oriented guidance system with adult and child discrimination capability. In: Proc. of ICASSP. Vol. 1. Montreal, Canada, pp. 433-436.
  • 39
    • 34547957511
    • Potamianos, A., Narayanan, S., Lee, S., 1997. Automatic speech recognition for children. In: Proc. of EUROSPEECH. Rhodes, Greece, pp. 2371-2374.
  • 41
    • 34547954623
    • Salvi, G., 2003. Accent clustering in Swedish using the Bhattacharyya distance. In: Proc. of the 15th ICPhS International Congress of Phonetic Sciences. Barcelona, Spain.
  • 42
    • 0017482612
    • Wakita, H., 1977. Normalization of vowels by vocal tract length and its application to vowel identification. IEEE Trans. Acoust. Speech Signal Process. 25, 183-192.
  • 43
    • 0029764708
    • Wegmann, S., McAllaster, D., Orloff, J., Peskin, B., 1996. Speaker Normalisation on Conversational Telephone Speech. In: Proc. of ICASSP. Atlanta, pp. I-339-341.
  • 44
    • 0032629626
    • Welling, L., Kanthak, S., Ney, H., 1999. Improved methods for vocal tract normalization. In: Proc. of ICASSP. Vol. 2. Phoenix, AZ, pp. 761-764.
  • 45
    • 0034318848
    • Whiteside, S.P., Hodgson, C., 2000. Speech patterns of children and adults elicited via a picture-naming task: an acoustic study. Speech Commun. 32, 267-285.
  • 46
    • 0029747582
    • Wilpon, J.G., Jacobsen, C.N., 1996. A Study of Speech Recognition for Children and Elderly. In: Proc. of ICASSP. Atlanta, GA, pp. 349-352.
  • 47
    • 34547929817
    • Young, S.J., Odell, J.J., Woodland, P.C., 1994. Tree-based state tying for high accuracy acoustic modelling. In: HLT '94: Proceedings of the workshop on Human Language Technology, pp. 307-312.
  • 48
    • 0033677215
    • Zheng, J., Franco, H., Weng, F., Sankar, A., Bratt, H., 2000. Word-level rate-of-speech modeling using rate-specific phones and pronunciations. In: Proc. of ICASSP. Vol. 3. pp. 1775-1778.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.