Volume 2013, Issue 1, 2013, Pages

Evaluation of influence of spectral and prosodic features on GMM classification of Czech and Slovak emotional speech

Author keywords

emotional speech recognition; GMM classifier; spectral and prosodic features of speech

Indexed keywords

CLASSIFICATION PROCESS; COMPARATIVE EXPERIMENTS; COMPUTATION COMPLEXITY; EMOTION CLASSIFICATION; EMOTIONAL SPEECH RECOGNITION; GAUSSIAN MIXTURE MODEL; PROSODIC FEATURES; SETTING OF PARAMETERS

EID: 84887051130     PISSN: 16874714     EISSN: 16874722     Source Type: Journal    
DOI: 10.1186/1687-4722-2013-8     Document Type: Article
Times cited: 30

References (41)
  • 1
    • 77949874976
    • The roles of tonal and segmental information in Mandarin spoken word recognition: An eyetracking study
    • 10.1016/j.jml.2010.02.004
    • Malins JG, Joanisse MF: The roles of tonal and segmental information in Mandarin spoken word recognition: an eyetracking study. J. Mem. Lang. 2010, 62:407-420.
    • (2010) J. Mem. Lang , vol.62 , pp. 407-420
    • Malins, J.G.1    Joanisse, M.F.2
  • 4
    • 84859024513
    • CASA-based robust speaker identification
    • 10.1109/TASL.2012.2186803
    • Zhao X, Shao Y, Wang DL: CASA-based robust speaker identification. IEEE Trans. Audio Speech 2012,20(5) 1608-1616.
    • (2012) IEEE Trans. Audio Speech , vol.20 , Issue.5 , pp. 1608-1616
    • Zhao, X.1    Shao, Y.2    Wang, D.L.3
  • 7
    • 70449380158
    • Note on measures for spectral flatness
    • 10.1049/el.2009.1977
    • Madhu N: Note on measures for spectral flatness. Electron. Lett. 2009, 45:1195-1196.
    • (2009) Electron. Lett , vol.45 , pp. 1195-1196
    • Madhu, N.1
  • 9
    • 67649867403
    • Novel acoustic features for speech emotion recognition
    • 10.1007/s11431-009-0204-3
    • Roh YW, Kim DJ, Lee WS, Hong KS: Novel acoustic features for speech emotion recognition. Sci. China Ser. E: Technol. Sci. 2009,52(7) 1838-1848.
    • (2009) Sci. China Ser. E: Technol. Sci , vol.52 , Issue.7 , pp. 1838-1848
    • Roh, Y.W.1    Kim, D.J.2    Lee, W.S.3    Hong, K.S.4
  • 10
    • 37649005506
    • On the use of complementary spectral features for speaker recognition
    • Hosseinzadeh D, Krishnan S: On the use of complementary spectral features for speaker recognition. EURASIP J. Adv. Signal Process. 2008,2008(Article ID 258184) 10.
    • (2008) EURASIP J. Adv. Signal Process , vol.2008 , Issue.ARTICLE ID 258184 , pp. 10
    • Hosseinzadeh, D.1    Krishnan, S.2
  • 11
    • 84855883762
    • Acoustic feature selection and classification of emotions in speech using a 3D continuous emotion model
    • 10.1016/j.bspc.2011.02.008
    • Pérez-Espinosa H, Reyes-García CA, Villaseñor-Pineda L: Acoustic feature selection and classification of emotions in speech using a 3D continuous emotion model. Biomed. Signal Process. 2012, 7:79-87.
    • (2012) Biomed. Signal Process , vol.7 , pp. 79-87
    • Pérez-Espinosa, H.1    Reyes-García, C.A.2    Villaseñor-Pineda, L.3
  • 12
    • 67649271323
    • Automatic refinement of an expressive speech corpus assembling subjective perception and automatic classification
    • 10.1016/j.specom.2008.12.001
    • Iriondo I, Planet S, Socoró JC, Martínez E, Alías F, Monzo X: Automatic refinement of an expressive speech corpus assembling subjective perception and automatic classification. Speech Commun. 2009, 51:744-758.
    • (2009) Speech Commun , vol.51 , pp. 744-758
    • Iriondo, I.1    Planet, S.2    Socoró, J.C.3    Martínez, E.4    Alías, F.5    Monzo, X.6
  • 13
    • 79960846934
    • Recognizing affect from speech prosody using hierarchical graphical models
    • 10.1016/j.specom.2011.05.003
    • Fernandez R, Picard R: Recognizing affect from speech prosody using hierarchical graphical models. Speech Commun. 2011, 53:1088-1103.
    • (2011) Speech Commun , vol.53 , pp. 1088-1103
    • Fernandez, R.1    Picard, R.2
  • 14
    • 77951729327
    • Spectral moment features augmented by low order cepstral coefficients for robust ASR
    • 10.1109/LSP.2010.2046349
    • Tsiakoulis P, Potamianos A, Dimitriadis D: Spectral moment features augmented by low order cepstral coefficients for robust ASR. IEEE Signal Process. Lett. 2010,17(6) 551-554.
    • (2010) IEEE Signal Process. Lett , vol.17 , Issue.6 , pp. 551-554
    • Tsiakoulis, P.1    Potamianos, A.2    Dimitriadis, D.3
  • 15
    • 0034346176
    • Emotion recognition in speech using neural networks
    • 1157.68511 10.1007/s005210070006
    • Nicholson J, Takahashi K, Nakatsu R: Emotion recognition in speech using neural networks. Neural Comput. Appl. 2000,9(4) 290-296.
    • (2000) Neural Comput. Appl , vol.9 , Issue.4 , pp. 290-296
    • Nicholson, J.1    Takahashi, K.2    Nakatsu, R.3
  • 16
    • 33646033467
    • Formal prosodic structures and their application in NLP
    • V. Matousek, P. Mautner, T. Pavelka (eds), Springer, Berlin, 10.1007/11551874-48
    • Romportl J, Matousek J: Formal prosodic structures and their application in NLP. In Text, Speech and Dialogue 2005, LNCS 3658. Edited by: Matousek V, Mautner P, Pavelka T. Berlin: Springer; 2005:371-378.
    • (2005) Text, Speech and Dialogue 2005, LNCS 3658 , pp. 371-378
    • Romportl, J.1    Matousek, J.2
  • 17
    • 0242721417
    • Speech emotion recognition using hidden Markov models
    • 10.1016/S0167-6393(03)00099-2
    • Nwe TL, Foo SW, De Silva LC: Speech emotion recognition using hidden Markov models. Speech Commun. 2003, 41:603-623.
    • (2003) Speech Commun , vol.41 , pp. 603-623
    • Nwe, T.L.1    Foo, S.W.2    De Silva, L.C.3
  • 18
    • 0029355999
    • Speaker identification and verification using Gaussian mixture speaker models
    • 10.1016/0167-6393(95)00009-D
    • Reynolds DA: Speaker identification and verification using Gaussian mixture speaker models. Speech Commun. 1995, 17:91-108.
    • (1995) Speech Commun , vol.17 , pp. 91-108
    • Reynolds, D.A.1
  • 19
    • 83655164697
    • Loss-scaled large-margin Gaussian mixture models for speech emotion classification
    • 10.1109/TASL.2011.2162405
    • Yun S, Yoo CD: Loss-scaled large-margin Gaussian mixture models for speech emotion classification. IEEE Trans. Audio Speech 2012,20(2) 585-598.
    • (2012) IEEE Trans. Audio Speech , vol.20 , Issue.2 , pp. 585-598
    • Yun, S.1    Yoo, C.D.2
  • 20
    • 79952707334
    • Study of empirical mode decomposition and spectral analysis for stress and emotion classification in natural speech
    • 10.1016/j.bspc.2010.11.001
    • He L, Lech M, Maddage NC, Allen NB: Study of empirical mode decomposition and spectral analysis for stress and emotion classification in natural speech. Biomed. Signal Process. 2011, 6:139-146.
    • (2011) Biomed. Signal Process , vol.6 , pp. 139-146
    • He, L.1    Lech, M.2    Maddage, N.C.3    Allen, N.B.4
  • 21
    • 79960848203
    • Formant position based weighted spectral features for emotion recognition
    • 10.1016/j.specom.2011.04.003
    • Bozkurt E, Erzin E, Erdem ÇE, Erdem AT: Formant position based weighted spectral features for emotion recognition. Speech Commun. 2011, 53:1186-1197.
    • (2011) Speech Commun , vol.53 , pp. 1186-1197
    • Bozkurt, E.1    Erzin, E.2    Erdem, Ç.E.3    Erdem, A.T.4
  • 22
    • 58349112115
    • Application of expressive speech in TTS system with cepstral description
    • A. Esposito, N. Bourbakis, N. Avouris, I. Hatzilygeroudis (eds), Springer, Berlin
    • Přibil J, Přibilová A: Application of expressive speech in TTS system with cepstral description. In Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction 2007, LNAI 5042. Edited by: Esposito A, Bourbakis N, Avouris N, Hatzilygeroudis I. Berlin: Springer; 2008:201-213.
    • (2008) Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction 2007, LNAI 5042 , pp. 201-213
    • Přibil, J.1    Přibilová, A.2
  • 23
    • 84865526267
    • Czech expressive speech synthesis in limited domain comparison of unit selection and HMM-based approaches
    • P. Sojka, A. Horak, I. Kopecek, K. Pala (eds), Springer, Berlin
    • Grůber M, Hanzlíček Z: Czech expressive speech synthesis in limited domain comparison of unit selection and HMM-based approaches. In TSD 2012, LNCS 7499. Edited by: Sojka P, Horak A, Kopecek I, Pala K. Berlin: Springer; 2012:656-664.
    • (2012) TSD 2012, LNCS 7499 , pp. 656-664
    • Grůber, M.1    Hanzlíček, Z.2
  • 25
    • 84868227049
    • Czech and Slovak speaking voice communicator based on PDA/smartphone device for handicapped people
    • Plzen, Czech Republic
    • Přibil J, Přibilová A: Czech and Slovak speaking voice communicator based on PDA/smartphone device for handicapped people. In Proceedings of the International Conference on Applied Electronics. Plzen, Czech Republic; 2012:219-222.
    • (2012) Proceedings of the International Conference on Applied Electronics , pp. 219-222
    • Přibil, J.1    Přibilová, A.2
  • 26
    • 84880866124
    • Spectral properties and prosodic parameters of emotional speech in Czech and Slovak
    • I. Ipšić (ed.), InTech, Rijeka, Croatia
    • Přibil J, Přibilová A: Spectral properties and prosodic parameters of emotional speech in Czech and Slovak. In Speech and Language Technologies. Edited by: Ipšić I. Rijeka, Croatia: InTech; 2011:175-200.
    • (2011) Speech and Language Technologies , pp. 175-200
    • Přibil, J.1    Přibilová, A.2
  • 28
    • 0036988253
    • Effects of tonsillectomy on speech spectrum
    • 10.1016/S0892-1997(02)00133-9
    • Ilk HG, Eroǧul O, Satar B, Özkaptan Y: Effects of tonsillectomy on speech spectrum. J. Voice 2002, 16:580-586.
    • (2002) J. Voice , vol.16 , pp. 580-586
    • Ilk, H.G.1    Eroǧul, O.2    Satar, B.3    Özkaptan, Y.4
  • 29
    • 33645743720
    • Kluwer Academic Publishers, Dordrecht, The Netherlands
    • Fant G: Speech Acoustics and Phonetics. Dordrecht, The Netherlands: Kluwer Academic Publishers; 2004.
    • (2004) Speech Acoustics and Phonetics
    • Fant, G.1
  • 30
    • 33751430923
    • Periodicity estimation in synthesized phonation signals using cepstral rahmonic peaks
    • 10.1016/j.specom.2006.09.001
    • Murphy PJ: Periodicity estimation in synthesized phonation signals using cepstral rahmonic peaks. Speech Commun. 2006, 48:1704-1713.
    • (2006) Speech Commun , vol.48 , pp. 1704-1713
    • Murphy, P.J.1
  • 31
    • 70349913647
    • Jitter estimation algorithms for detection of pathological voices
    • Silva DG, Oliveira LC, Andrea M: Jitter estimation algorithms for detection of pathological voices. EURASIP J. Adv. Signal Process. 2009,2009(Article ID 567875) 9.
    • (2009) EURASIP J. Adv. Signal Process , vol.2009 , pp. 9
    • Silva, D.G.1    Oliveira, L.C.2    Andrea, M.3
  • 32
    • 84887027080
    • Berlin Database of Emotional Speech: Department of Communication Science, Institute for Speech and Communication. Berlin: Technical University; Accessed 13 March 2006
    • Berlin Database of Emotional Speech: Department of Communication Science, Institute for Speech and Communication. Berlin: Technical University; http://pascal.kgw.tu-berlin.de/emodb/, Accessed 13 March 2006
  • 34
    • 84870318367
    • Comparison of complementary spectral features of emotional speech for German, Czech, and Slovak
    • A. Esposito, R. Hoffmann, S. Hubler, B. Wrann (eds), Springer, Heidelberg
    • Přibil J, Přibilová A: Comparison of complementary spectral features of emotional speech for German, Czech, and Slovak. In Cognitive Behavioural Systems, LNCS 7403. Edited by: Esposito A, Hoffmann R, Hubler S, Wrann B. Heidelberg: Springer; 2012:236-250.
    • (2012) Cognitive Behavioural Systems, LNCS 7403 , pp. 236-250
    • Přibil, J.1    Přibilová, A.2
  • 35
    • 84887099624
    • Accessed 16 February 2012
    • Nabney T: Netlab Pattern Analysis Toolbox. http://www.mathworks.com/matlabcentral/fileexchange/2654-netlab, Accessed 16 February 2012
    • Netlab Pattern Analysis Toolbox
    • Nabney, T.1
  • 38
    • 77956401353
    • Class-level spectral features for emotion recognition
    • 10.1016/j.specom.2010.02.010
    • Bitouk D, Verma R, Nenkova A: Class-level spectral features for emotion recognition. Speech Commun. 2010, 52:613-625.
    • (2010) Speech Commun , vol.52 , pp. 613-625
    • Bitouk, D.1    Verma, R.2    Nenkova, A.3
  • 39
    • 67650474760
    • Recognition of emotions in German speech using Gaussian mixture models
    • A. Esposito, A. Hussain, M. Marinaro, R. Martone (eds), Springer, Berlin, 10.1007/978-3-642-00525-1-26
    • Vondra M, Vích R: Recognition of emotions in German speech using Gaussian mixture models. In Multimodal Signals: Cognitive and Algorithmic Issues, LNAI 5398. Edited by: Esposito A, Hussain A, Marinaro M, Martone R. Berlin: Springer; 2009:256-263.
    • (2009) Multimodal Signals: Cognitive and Algorithmic Issues, LNAI 5398 , pp. 256-263
    • Vondra, M.1    Vích, R.2
  • 40
    • 33947164164
    • An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech
    • 10.1016/j.specom.2007.01.006
    • Shami M, Verhelst W: An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech. Speech Commun. 2007, 49:201-212.
    • (2007) Speech Commun , vol.49 , pp. 201-212
    • Shami, M.1    Verhelst, W.2
  • 41
    • 84864723353
    • Speaker-independent emotion recognition exploiting a psychologically inspired binary cascade classification schema
    • 10.1007/s10772-012-9127-7
    • Kotti M, Paternò F: Speaker-independent emotion recognition exploiting a psychologically inspired binary cascade classification schema. Int. J. Speech Technol. 2012, 15:131-150. http://link.springer.com/article/10.1007/s10772-012-9127-7#page-1
    • (2012) Int. J. Speech Technol , vol.15 , pp. 131-150
    • Kotti, M.1    Paternò, F.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.