SCOPUS Information Search Platform

Volume 49, Issue 10-11, 2007, Pages 847-860

Acoustic variability and automatic recognition of children's speech

Authors (3): Gerosa, Matteo; Giuliani, Diego; Brugnara, Fabio

Author keywords

Automatic speech recognition for children; Children's speech analysis; Speaker adaptive acoustic modeling; Speaker normalization

Indexed keywords

ACOUSTICS; SPEECH ANALYSIS; SPEECH INTELLIGIBILITY; VOCABULARY CONTROL; SPEAKER ADAPTIVE ACOUSTIC MODELING; SPEAKER NORMALIZATION; SPECTRAL VARIABILITY; SPEECH RECOGNITION

EID: 34547939271 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2007.01.002 Document Type: Article

Times cited: 120

References (48)

1
- 34547934326
- Ackermann, U., Angelini, B., Brugnara, F., Federico, M., Giuliani, D., Gretter, R., Niemann, H., 1997. Speedata: a prototype for multilingual spoken data-entry. In: Proc. of EUROSPEECH. Rhodes, Greece, pp. 1807-1810.

2
- 0030362995
- Anastasakos, T., McDonough, J., Schwartz, R., Makhoul, J., 1996. A Compact Model for Speaker-Adaptive Training. In: Proc. of ICSLP. Philadelphia, PA, pp. 1137-1140.

3
- 34547927849
- Angelini, B., Brugnara, F., Falavigna, D., Giuliani, D., Gretter, R., Omologo, M., 1994. Speaker Independent Continuous Speech Recognition Using an Acoustic-Phonetic Italian Corpus. In: Proc. of ICSLP. Yokohama, Japan, pp. 1391-1394.

4
- 85009069271
- Arunachalam, S., Gould, D., Andersen, E., Byrd, D., Narayanan, S., 2001. Politeness and Frustration Language in Child-Machine Interactions. In: Proc. of EUROSPEECH. Aalborg, Denmark, pp. 2675-2679.

5
- 51849142134
- Banerjee, S., Beck, J.E., Mostow, J., 2003. Evaluating the Effect of Predicting Oral Reading Miscues. In: Proc. of EUROSPEECH. Geneva, Switzerland.

6
- 0034854955
- Bertoldi, N., Brugnara, F., Cettolo, M., Federico, M., Giuliani, D., 2001. From Broadcast News to Spontaneous Dialogue Transcription: Portability Issues. In: Proc. of ICASSP. Vol. 1. Salt Lake City, UT, pp. 37-40.

7
- 4444257069
- Praat, a system for doing phonetics by computer
- Boersma P., and Weenink D. Praat, a system for doing phonetics by computer. Glot Int. 5 9/10 (2001) 341-345
- (2001) Glot Int. , vol.5 , Issue.9-10 , pp. 341-345
- Boersma, P.¹ Weenink, D.²

8
- 63749112976
- Brugnara, F., Cettolo, M., Federico, M., Giuliani, D., 2002. Issues in automatic transcription of historical audio data. In: Proc. of ICSLP. Denver, CO, pp. 1441-1444.

9
- 0030375265
- Burnett, D.C., Fanty, M., 1996. Rapid unsupervised adaptation to children's speech on a connected-digit task. In: Proc. of ICSLP. Vol. 2. Philadelphia, PA, pp. 1145-1148.

10
- 0032204117
- A novel feature transformation for vocal tract length normalisation in automatic speech recognition
- Claes T., Dologlou I., ten Bosch L., and Compernolle D.V. A novel feature transformation for vocal tract length normalisation in automatic speech recognition. IEEE Trans. Speech Audio Process. 6 6 (1998) 549-557
- (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.6 , pp. 549-557
- Claes, T.¹ Dologlou, I.² ten Bosch, L.³ Compernolle, D.V.⁴

11
- 0004255998
- A Basic Course in Statistics
- Clarke G.M., and Cooke D. A Basic Course in Statistics (1998), Arnold chapter 22, pp. 520-546
- (1998) A Basic Course in Statistics
- Clarke, G.M.¹ Cooke, D.²

12
- 0031644298
- Das, S., Nix, D., Picheny, M., 1998. Improvements in Children's Speech Recognition Performance. In: Proc. of ICASSP. Seattle, WA.

13
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- Davis S., and Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoustics Speech Signal Process. 28 (1980) 357-366
- (1980) IEEE Trans. Acoustics Speech Signal Process. , vol.28 , pp. 357-366
- Davis, S.¹ Mermelstein, P.²

14
- 0029725604
- Eide, E., Gish, H., 1996. A parametric approach to vocal tract length normalization. In: Proc. of ICASSP. Atlanta, GA, pp. 346-349.

15
- 34547934131
- Eskenazi, M., Pelton, G., 2002. Pinpointing pronunciation errors in children's speech: examining the role of the speech recognizer. In: PMLA. Aspen Lodge, CO, pp. 48-52.

16
- 0032878792
- Morphology and development of the human vocal tract: a study using magnetic resonance imaging
- Fitch W.T., and Giedd J. Morphology and development of the human vocal tract: a study using magnetic resonance imaging. J. Acoust. Soc. Am. 106 3 (1999) 1511-1522
- (1999) J. Acoust. Soc. Am. , vol.106 , Issue.3 , pp. 1511-1522
- Fitch, W.T.¹ Giedd, J.²

17
- 0032097263
- Introduction to Statistical Pattern Recognition. 2nd ed.
- Fukunaga K. Introduction to Statistical Pattern Recognition. 2nd ed. (1990), Academic Press, New York
- (1990) Introduction to Statistical Pattern Recognition. 2nd ed.
- Fukunaga, K.¹

18
- 0032050110
- Maximum likelihood linear transformations for HMM-based speech recognition
- Gales M.J.F. Maximum likelihood linear transformations for HMM-based speech recognition. Comput. Speech Lang. 12 (1998) 75-98
- (1998) Comput. Speech Lang. , vol.12 , pp. 75-98
- Gales, M.J.F.¹

19
- 33745219452
- Gerosa, M., Giuliani, D., Brugnara, F., 2005. Speaker adaptive acoustic modeling with mixture of adult and children's speech. In: Proc. of INTERSPEECH/EUROSPEECH. Lisboa, Portugal, pp. 2193-2196.

20
- 0024909979
- Gillick, L., Cox, S.J., 1989. Some statistical issues in the comparison of speech recognition algorithms. In: Proc. of ICASSP. Glasgow, pp. I-532-535.

21
- 85143191136
- Giuliani, D., Gerosa, M., 2003. Investigating recognition of children's speech. In: Proc. of ICASSP. Vol. 2. Hong Kong, pp. 137-140.

22
- 27744595137
- Improved automatic speech recognition through speaker normalization
- Giuliani D., Gerosa M., and Brugnara F. Improved automatic speech recognition through speaker normalization. Comput. Speech Lang. 20 1 (2006) 107-123
- (2006) Comput. Speech Lang. , vol.20 , Issue.1 , pp. 107-123
- Giuliani, D.¹ Gerosa, M.² Brugnara, F.³

23
- 56149113752
- Gustafson, J., Sjölander, K., 2000. Voice transformations for improving children's speech recognition in a publicly available dialogue system. In: Proc. of ICSLP. Beijing, China, pp. 297-300.

24
- 84946707630
- Hagen, A., Pellom, B., Cole, R., 2003. Children's speech recognition with application to interactive books and tutors. In: Proc. of the ASRU Workshop. St. Thomas Irsee, US Virgin Islands.

25
- 34547952312
- Hagen, A., Pellom, B., Vuuren, S.V., Cole, R., 2004. Advances in children's speech recognition within an interactive literacy tutor. In: Proc. of HLT/NAACL. Boston, MA.

26
- 0032840439
- Formants of children, women, and men: The effect of vocal intensity variation
- Huber J.E., Stathopoulos E.T., Curione G.M., Ash T.A., and Johnson K. Formants of children, women, and men: The effect of vocal intensity variation. J. Acoust. Soc. Am. 106 3 (1999) 1532-1542
- (1999) J. Acoust. Soc. Am. , vol.106 , Issue.3 , pp. 1532-1542
- Huber, J.E.¹ Stathopoulos, E.T.² Curione, G.M.³ Ash, T.A.⁴ Johnson, K.⁵

27
- 33745197960
- Kumar, S.C., Mohandas, V.P., Li, H., 2005. Multilingual speech recognition: a unified approach. In: Proc. of INTERSPEECH/EUROSPEECH. Lisboa, Portugal, pp. 3357-3360.

28
- 0029747183
- Lee, L., Rose, R.C., 1996. Speaker normalization using efficient frequency warping procedure. In: Proc. of ICASSP. Atlanta, GA, pp. 353-356.

29
- 0032969462
- Acoustics of children's speech: developmental changes of temporal and spectral parameters
- Lee S., Potamianos A., and Narayanan S. Acoustics of children's speech: developmental changes of temporal and spectral parameters. J. Acoust. Soc. Am. 105 3 (1999) 1455-1468
- (1999) J. Acoust. Soc. Am. , vol.105 , Issue.3 , pp. 1455-1468
- Lee, S.¹ Potamianos, A.² Narayanan, S.³

30
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- Leggetter C.J., and Woodland P.C. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Comput. Speech Lang. 9 (1995) 171-185
- (1995) Comput. Speech Lang. , vol.9 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

31
- 0030363025
- Mak, B., Barnard, E., 1996. Phone clustering using the Bhattacharyya distance. In: Proc. of ICSLP. Philadelphia, PA, pp. 2005-2008.

32
- 34547936573
- Mich, O., Giuliani, D., Gerosa, M., 2004. Parling: a CALL System for Children. In: Proc. of InSTIL/ICALL. Venice, Italy, pp. 169-172.

33
- 0029725864
- Miller, J.D., Lee, S., Uchanski, R.M., Heidbreder, A.H., Richman, B.B., Tadlock, J., 1996. Creation of two children's speech databases. In: Proc. of ICASSP. Atlanta, GA, pp. 849-852.

34
- 0029748337
- Mirghafori, N., Fosler, E., Morgan, N., 1996. Towards robustness to fast speech in ASR. In: Proc. of ICASSP. Atlanta, GA, USA, pp. 335-338.

35
- 33745194337
- Nakamura, M., Iwano, K., Furui, S., 2005. Analysis of spectral space reduction in spontaneous speech and its effects on speech recognition performances. In: Proc. of EUROSPEECH. Lisbon, Portugal, pp. 3381-3384.

36
- 0036475971
- Creating conversational interfaces for children
- Narayanan S., and Potamianos A. Creating conversational interfaces for children. IEEE Trans. Speech Audio Process. 10 2 (2002) 65-78
- (2002) IEEE Trans. Speech Audio Process. , vol.10 , Issue.2 , pp. 65-78
- Narayanan, S.¹ Potamianos, A.²

37
- 4544229743
- Nisimura, R., Lee, A., Saruwatari, H., Shikano, K., 2004. Public speech-oriented guidance system with adult and child discrimination capability. In: Proc. of ICASSP. Vol. 1. Montreal, Canada, pp. 433-436.

38
- 0347338002
- Robust recognition of children's speech
- Potamianos A., and Narayanan S. Robust recognition of children's speech. IEEE Trans. Speech Audio Process. 11 6 (2003) 603-615
- (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.6 , pp. 603-615
- Potamianos, A.¹ Narayanan, S.²

39
- 34547957511
- Potamianos, A., Narayanan, S., Lee, S., 1997. Automatic speech recognition for children. In: Proc. of EUROSPEECH. Rhodes, Greece, pp. 2371-2374.

40
- 0345852475
- The STAR system: an interactive pronunciation tutor for young children
- Russell M.J., Series R.W., Wallace J.L., Brown C., and Skilling A. The STAR system: an interactive pronunciation tutor for young children. Comput. Speech Lang. 14 2 (2000) 161-175
- (2000) Comput. Speech Lang. , vol.14 , Issue.2 , pp. 161-175
- Russell, M.J.¹ Series, R.W.² Wallace, J.L.³ Brown, C.⁴ Skilling, A.⁵

41
- 34547954623
- Salvi, G., 2003. Accent clustering in Swedish using the Bhattacharyya distance. In: Proc. of the 15th ICPhS International Congress of Phonetic Sciences. Barcelona, Spain.

42
- 0017482612
- Normalization of vowels by vocal tract length and its application to vowel identification
- Wakita H. Normalization of vowels by vocal tract length and its application to vowel identification. IEEE Trans. Acoust. Speech Signal Process. 25 (1977) 183-192
- (1977) IEEE Trans. Acoust. Speech Signal Process. , vol.25 , pp. 183-192
- Wakita, H.¹

43
- 0029764708
- Wegmann, S., McAllaster, D., Orloff, J., Peskin, B., 1996. Speaker Normalisation on Conversational Telephone Speech. In: Proc. of ICASSP. Atlanta, pp. I-339-341.

44
- 0032629626
- Welling, L., Kanthak, S., Ney, H., 1999. Improved methods for vocal tract normalization. In: Proc. of ICASSP. Vol. 2. Phoenix, AZ, pp. 761-764.

45
- 0034318848
- Speech patterns of children and adults elicited via a picture-naming task: an acoustic study
- Whiteside S.P., and Hodgson C. Speech patterns of children and adults elicited via a picture-naming task: an acoustic study. Speech Commun. 32 (2000) 267-285
- (2000) Speech Commun. , vol.32 , pp. 267-285
- Whiteside, S.P.¹ Hodgson, C.²

46
- 0029747582
- Wilpon, J.G., Jacobsen, C.N., 1996. A Study of Speech Recognition for Children and Elderly. In: Proc. of ICASSP. Atlanta, GA, pp. 349-352.

47
- 34547929817
- Young, S.J., Odell, J.J., Woodland, P.C., 1994. Tree-based state tying for high accuracy acoustic modelling. In: HLT '94: Proceedings of the workshop on Human Language Technology, pp. 307-312.

48
- 0033677215
- Zheng, J., Franco, H., Weng, F., Sankar, A., Bratt, H., 2000. Word-level rate-of-speech modeling using rate-specific phones and pronunciations. In: Proc. of ICASSP. Vol. 3. pp. 1775-1778.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.