Volume 64, 2015, Pages 49-58

Frame-by-frame language identification in short utterances using deep neural networks

Author keywords

DNNs; I vectors; Real time LID

Indexed keywords

DEEP NEURAL NETWORKS; MODELING LANGUAGES; NATURAL LANGUAGE PROCESSING SYSTEMS; SPEECH RECOGNITION

EID: 84922385124     PISSN: 08936080     EISSN: 18792782     Source Type: Journal    
DOI: 10.1016/j.neunet.2014.08.006     Document Type: Article
Times cited: 66

References (39)
  • 2
    • 66749166374
    • Dialect identification: The effects of region of origin and amount of experience
    • Baker W., Eddington D., Nay L. Dialect identification: The effects of region of origin and amount of experience. American Speech 2009, 84(1):48-71.
    • (2009) American Speech , vol.84 , Issue.1 , pp. 48-71
    • Baker, W.1    Eddington, D.2    Nay, L.3
  • 4
  • 13
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • Hermansky H. Perceptual linear predictive (PLP) analysis of speech. Journal of the Acoustical Society of America 1990, 87(4):1738-1752.
    • (1990) Journal of the Acoustical Society of America , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 14
    • 85032751458
    • Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
    • Hinton G., Deng L., Yu D., Dahl G., Mohamed A., Jaitly N., et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine 2012, 29(6):82-97. 10.1109/MSP.2012.2205597.
    • (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
    • Hinton, G.1    Deng, L.2    Yu, D.3    Dahl, G.4    Mohamed, A.5    Jaitly, N.6
  • 17
    • 84876676725
    • Spoken language recognition: From fundamentals to practice
    • Li H., Ma B., Lee K.A. Spoken language recognition: From fundamentals to practice. Proceedings of the IEEE 2013, 101(5):1136-1159. 10.1109/JPROC.2012.2237151.
    • (2013) Proceedings of the IEEE , vol.101 , Issue.5 , pp. 1136-1159
    • Li, H.1    Ma, B.2    Lee, K.A.3
  • 19
    • 84863809213
    • Dialect identification: Impact of difference between read versus spontaneous speech
    • Liu, G., Lei, Y., Hansen, J. H. (2010). Dialect identification: Impact of difference between read versus spontaneous speech. In EUSIPCO-2010 (pp. 2003-2006).
    • (2010) EUSIPCO-2010 , pp. 2003-2006
    • Liu, G.1    Lei, Y.2    Hansen, J.H.3
  • 20
    • 84886586505
    • A linguistic data acquisition front-end for language recognition evaluation
    • Liu, G., Zhang, C., Hansen, J. H. L. (2012). A linguistic data acquisition front-end for language recognition evaluation. In Proc. Odyssey, Singapore.
    • (2012) Proc. Odyssey, Singapore
    • Liu, G.1    Zhang, C.2    Hansen, J.H.L.3
  • 26
    • 84867585919
    • Understanding how deep belief networks perform acoustic modelling
    • Mohamed A., Hinton G.E., Penn G. Understanding how deep belief networks perform acoustic modelling. ICASSP 2012, 4273-4276. IEEE.
    • (2012) ICASSP , pp. 4273-4276
    • Mohamed, A.1    Hinton, G.E.2    Penn, G.3
  • 29
    • 84905252473
    • NIST, 2009. The 2009 NIST SLR Evaluation Plan. http://www.itl.nist.gov/iad/mig/tests/lre/2009/LRE09_EvalPlan_v6.pdf.
    • (2009) The 2009 NIST SLR Evaluation Plan
  • 30
    • 0029355999
    • Speaker identification and verification using Gaussian mixture speaker models
    • Reynolds D. Speaker identification and verification using Gaussian mixture speaker models. Speech Communication 1995, 17(1-2):91-108.
    • (1995) Speech Communication , vol.17 , Issue.1-2 , pp. 91-108
    • Reynolds, D.1
  • 35
    • 85009275225
    • Approaches to language identification using Gaussian mixture models and shifted delta cepstral features
    • Torres-Carrasquillo, P. A., Singer, E., Kohler, M. A., Deller, J. R. (2002). Approaches to language identification using Gaussian mixture models and shifted delta cepstral features. In ICSLP, Vol. 1 (pp. 89-92).
    • (2002) ICSLP , vol.1 , pp. 89-92
    • Torres-Carrasquillo, P.A.1    Singer, E.2    Kohler, M.A.3    Deller, J.R.4
  • 36
    • 84867204678
    • Eigen-channel compensation and discriminatively trained Gaussian mixture models for dialect and accent recognition
    • Torres-Carrasquillo, P. A., Sturim, D. E., Reynolds, D. A., McCree, A. (2008). Eigen-channel compensation and discriminatively trained Gaussian mixture models for dialect and accent recognition. In INTERSPEECH (pp. 723-726).
    • (2008) INTERSPEECH , pp. 723-726
    • Torres-Carrasquillo, P.A.1    Sturim, D.E.2    Reynolds, D.A.3    McCree, A.4
  • 37
    • 85032782045
    • Deep learning and its applications to signal and information processing [exploratory DSP]
    • Yu D., Deng L. Deep learning and its applications to signal and information processing [exploratory DSP]. IEEE Signal Processing Magazine 2011, 28(1):145-154. 10.1109/MSP.2010.939038.
    • (2011) IEEE Signal Processing Magazine , vol.28 , Issue.1 , pp. 145-154
    • Yu, D.1    Deng, L.2
  • 38
    • 0029733178
    • Comparison of four approaches to automatic language identification of telephone speech
    • Zissman M. Comparison of four approaches to automatic language identification of telephone speech. IEEE Transactions on Speech and Audio Processing 1996, 4(1):31-44.
    • (1996) IEEE Transactions on Speech and Audio Processing , vol.4 , Issue.1 , pp. 31-44
    • Zissman, M.1
  • 39


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.