



2014, Pages 135-140

Vocal tract length normalisation approaches to DNN-based children's and adults' speech recognition

Author keywords

Automatic speech recognition; Children's speech recognition; Deep neural networks; Vocal tract length normalisation

Indexed keywords

DECODING; HIDDEN MARKOV MODELS; MARKOV PROCESSES; SPEECH

EID: 84946692024     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/SLT.2014.7078563     Document Type: Conference Paper
Times cited: 93

References (30)
  • 1
    • 0032878792
    • Morphology and development of the human vocal tract: A study using magnetic resonance imaging
    • Sept
    • W. T. Fitch and J. Giedd, "Morphology and development of the human vocal tract: A study using magnetic resonance imaging," Journal of Acoust. Soc. Amer., vol. 106, no. 3, pp. 1511-1522, Sept. 1999.
    • (1999) Journal of Acoust. Soc. Amer. , vol.106 , Issue.3 , pp. 1511-1522
    • Fitch, W.T.1    Giedd, J.2
  • 2
    • 0032840439
    • Formants of children, women, and men: The effect of vocal intensity variation
    • Sept
    • J. E. Huber, E. T. Stathopoulos, G. M. Curione, T. A. Ash, and K. Johnson, "Formants of children, women, and men: The effect of vocal intensity variation," Journal of Acoust. Soc. Amer., vol. 106, no. 3, pp. 1532-1542, Sept. 1999.
    • (1999) Journal of Acoust. Soc. Amer. , vol.106 , Issue.3 , pp. 1532-1542
    • Huber, J.E.1    Stathopoulos, E.T.2    Curione, G.M.3    Ash, T.A.4    Johnson, K.5
  • 3
    • 0032969462
    • Acoustics of children's speech: Developmental changes of temporal and spectral parameters
    • March
    • S. Lee, A. Potamianos, and S. Narayanan, "Acoustics of children's speech: Developmental changes of temporal and spectral parameters," Journal of Acoust. Soc. Amer., vol. 105, no. 3, pp. 1455-1468, March 1999.
    • (1999) Journal of Acoust. Soc. Amer. , vol.105 , Issue.3 , pp. 1455-1468
    • Lee, S.1    Potamianos, A.2    Narayanan, S.3
  • 4
    • 0029747183
    • Speaker normalization using efficient frequency warping procedure
    • Atlanta, GA, May
    • L. Lee and R. C. Rose, "Speaker Normalization Using Efficient Frequency Warping Procedure," in Proc. of IEEE ICASSP, Atlanta, GA, May 1996, pp. 353-356.
    • (1996) Proc. of IEEE ICASSP , pp. 353-356
    • Lee, L.1    Rose, R.C.2
  • 5
    • 0029764708
    • Speaker normalisation on conversational telephone speech
    • Atlanta, GA, May
    • S. Wegmann, D. McAllaster, J. Orloff, and B. Peskin, "Speaker Normalisation on Conversational Telephone Speech," in Proc. of IEEE ICASSP, Atlanta, GA, May 1996, pp. I-339-341.
    • (1996) Proc. of IEEE ICASSP , pp. 339-341
    • Wegmann, S.1    McAllaster, D.2    Orloff, J.3    Peskin, B.4
  • 6
    • 0029725604
    • A parametric approach to vocal tract length normalization
    • Atlanta, GA, May
    • E. Eide and H. Gish, "A Parametric Approach to Vocal Tract Length Normalization," in Proc. of IEEE ICASSP, Atlanta, GA, May 1996, pp. 346-349.
    • (1996) Proc. of IEEE ICASSP , pp. 346-349
    • Eide, E.1    Gish, H.2
  • 7
    • 84946707630
    • Children's speech recognition with application to interactive books and tutors
    • St. Thomas Irsee, US Virgin Islands, Dec.
    • A. Hagen, B. Pellom, and R. Cole, "Children's Speech Recognition with Application to Interactive Books and Tutors," in Proc. of IEEE ASRU Workshop, St. Thomas Irsee, US Virgin Islands, Dec. 2003.
    • (2003) Proc. of IEEE ASRU Workshop
    • Hagen, A.1    Pellom, B.2    Cole, R.3
  • 8
    • 0141702066
    • Investigating recognition of children speech
    • Hong Kong, Apr.
    • D. Giuliani and M. Gerosa, "Investigating Recognition of Children Speech," in Proc. of IEEE ICASSP, vol. 2, Hong Kong, Apr. 2003, pp. 137-140.
    • (2003) Proc. of IEEE ICASSP , vol.2 , pp. 137-140
    • Giuliani, D.1    Gerosa, M.2
  • 12
    • 84055222005
    • Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
    • Jan 2012
    • G. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30-42, Jan. 2012.
    • (2012) IEEE Transactions on Audio, Speech, and Language Processing , vol.20 , Issue.1 , pp. 30-42
    • Dahl, G.1    Yu, D.2    Deng, L.3    Acero, A.4
  • 13
    • 84910049413
    • Using deep neural networks to improve proficiency assessment for children English language learners
    • A. Metallinou and J. Cheng, "Using Deep Neural Networks to Improve Proficiency Assessment for Children English Language Learners," in Proc. of INTERSPEECH, 2014, pp. 1468-1472.
    • (2014) Proc. of INTERSPEECH , pp. 1468-1472
    • Metallinou, A.1    Cheng, J.2
  • 14
    • 84858976070
    • Feature engineering in context-dependent deep neural networks for conversational speech transcription
    • December
    • F. Seide, G. Li, X. Chen, and D. Yu, "Feature engineering in context-dependent deep neural networks for conversational speech transcription," in Proc. of IEEE ASRU Workshop, December 2011.
    • (2011) Proc. of IEEE ASRU Workshop
    • Seide, F.1    Li, G.2    Chen, X.3    Yu, D.4
  • 16
    • 84874278045
    • Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR
    • Dec
    • P. Swietojanski, A. Ghoshal, and S. Renals, "Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR," in Proc. of IEEE SLT Workshop, Dec. 2012, pp. 246-251.
    • (2012) Proc. of IEEE SLT Workshop , pp. 246-251
    • Swietojanski, P.1    Ghoshal, A.2    Renals, S.3
  • 17
    • 78049384951
    • Multi-style MLP features for BN transcription
    • March
    • V.-B. Le, L. Lamel, and J. Gauvain, "Multi-style MLP features for BN transcription," in Proc. of IEEE ICASSP, March 2010, pp. 4866-4869.
    • (2010) Proc. of IEEE ICASSP , pp. 4866-4869
    • Le, V.-B.1    Lamel, L.2    Gauvain, J.3
  • 18
    • 84890474716
    • Deep neural network features and semi-supervised training for low resource speech recognition
    • May
    • S. Thomas, M. Seltzer, K. Church, and H. Hermansky, "Deep neural network features and semi-supervised training for low resource speech recognition," in Proc. of IEEE ICASSP, May 2013, pp. 6704-6708.
    • (2013) Proc. of IEEE ICASSP , pp. 6704-6708
    • Thomas, S.1    Seltzer, M.2    Church, K.3    Hermansky, H.4
  • 19
    • 84890521103
    • Speaker adaptation of context dependent deep neural networks
    • H. Liao, "Speaker adaptation of context dependent deep neural networks," in Proc. of IEEE ICASSP, 2013, pp. 7947-7951.
    • (2013) Proc. of IEEE ICASSP , pp. 7947-7951
    • Liao, H.1
  • 20
    • 84906225505
    • Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition
    • O. Abdel-Hamid and H. Jiang, "Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition," in Proc. of INTERSPEECH, 2013, pp. 1248-1252.
    • (2013) Proc. of INTERSPEECH , pp. 1248-1252
    • Abdel-Hamid, O.1    Jiang, H.2
  • 21
    • 84890492030
    • An investigation of deep neural networks for noise robust speech recognition
    • M. Seltzer, D. Yu, and Y. Wang, "An investigation of deep neural networks for noise robust speech recognition," in Proc. of IEEE ICASSP, 2013.
    • (2013) Proc. of IEEE ICASSP
    • Seltzer, M.1    Yu, D.2    Wang, Y.3
  • 22
    • 84905259138
    • Improving DNN speaker independence with I-vector inputs
    • A. Senior and I. Lopez-Moreno, "Improving DNN speaker independence with I-vector inputs," in Proc. of IEEE ICASSP, 2014.
    • (2014) Proc. of IEEE ICASSP
    • Senior, A.1    Lopez-Moreno, I.2
  • 23
    • 0032629626
    • Improved methods for vocal tract normalization
    • Phoenix, AZ, April
    • L. Welling, S. Kanthak, and H. Ney, "Improved Methods for Vocal Tract Normalization," in Proc. of IEEE ICASSP, vol. 2, Phoenix, AZ, April 1999, pp. 761-764.
    • (1999) Proc. of IEEE ICASSP , vol.2 , pp. 761-764
    • Welling, L.1    Kanthak, S.2    Ney, H.3
  • 24
    • 34547939271
    • Acoustic variability and automatic recognition of children's speech
    • M. Gerosa, D. Giuliani, and F. Brugnara, "Acoustic variability and automatic recognition of children's speech," Speech Communication, vol. 49, no. 10-11, pp. 847-860, 2007.
    • (2007) Speech Communication , vol.49 , Issue.10-11 , pp. 847-860
    • Gerosa, M.1    Giuliani, D.2    Brugnara, F.3
  • 25
    • 0001370735
    • Speaker independent continuous speech recognition using an acoustic-phonetic Italian corpus
    • Yokohama, Japan, Sept
    • B. Angelini, F. Brugnara, D. Falavigna, D. Giuliani, R. Gretter, and M. Omologo, "Speaker Independent Continuous Speech Recognition Using an Acoustic-Phonetic Italian Corpus," in Proc. of ICSLP, Yokohama, Japan, Sept. 1994, pp. 1391-1394.
    • (1994) Proc. of ICSLP , pp. 1391-1394
    • Angelini, B.1    Brugnara, F.2    Falavigna, D.3    Giuliani, D.4    Gretter, R.5    Omologo, M.6
  • 26
    • 78049271850
    • Parallel training of neural networks for speech recognition
    • Springer
    • K. Vesely, L. Burget, and F. Grezl, "Parallel training of neural networks for speech recognition," in Text, Speech and Dialogue. Springer, 2010, pp. 439-446.
    • (2010) Text, Speech and Dialogue , pp. 439-446
    • Vesely, K.1    Burget, L.2    Grezl, F.3
  • 27
    • 33745805403
    • A fast learning algorithm for deep belief nets
    • G. Hinton, S. Osindero, and Y.-W. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
    • (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
    • Hinton, G.1    Osindero, S.2    Teh, Y.-W.3
  • 29
    • 0024909979
    • Some statistical issues in the comparison of speech recognition algorithms
    • Glasgow, Scotland, May
    • L. Gillick and S. Cox, "Some Statistical Issues in the Comparison of Speech Recognition Algorithms," in Proc. of IEEE ICASSP, Glasgow, Scotland, May 1989, pp. I-532-535.
    • (1989) Proc. of IEEE ICASSP , pp. 532-535
    • Gillick, L.1    Cox, S.2
  • 30
    • 84905252792
    • Joint noise adaptive training for robust automatic speech recognition
    • A. Narayanan and D. Wang, "Joint noise adaptive training for robust automatic speech recognition," in Proc. of IEEE ICASSP, 2014, pp. 2523-2527.
    • (2014) Proc. of IEEE ICASSP , pp. 2523-2527
    • Narayanan, A.1    Wang, D.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.