SCOPUS Information Retrieval Platform

EURASIP Journal on Audio, Speech, and Music Processing

Volume 2013, Issue 1, 2013, Pages

Evaluation of influence of spectral and prosodic features on GMM classification of Czech and Slovak emotional speech

(2) Přibil, Jiří a Přibilová, Anna b

a INSTITUTE OF MEASUREMENT SCIENCE (Slovakia)

b SLOVAK UNIVERSITY OF TECHNOLOGY (Slovakia)

Author keywords

emotional speech recognition; GMM classifier; spectral and prosodic features of speech

Indexed keywords

CLASSIFICATION PROCESS; COMPARATIVE EXPERIMENTS; COMPUTATION COMPLEXITY; EMOTION CLASSIFICATION; EMOTIONAL SPEECH RECOGNITION; GAUSSIAN MIXTURE MODEL; PROSODIC FEATURES; SETTING OF PARAMETERS;

CLASSIFICATION (OF INFORMATION); SOCIAL SCIENCES; SPEECH ANALYSIS; SPEECH PROCESSING; SPEECH RECOGNITION; SPEECH SYNTHESIS;

QUALITY CONTROL;

EID: 84887051130 PISSN: 16874714 EISSN: 16874722 Source Type: Journal
DOI: 10.1186/1687-4722-2013-8 Document Type: Article

Times cited : (30)

References (41)

1
- 77949874976
- The roles of tonal and segmental information in Mandarin spoken word recognition: An eyetracking study
- 10.1016/j.jml.2010.02.004
- Malins JG, Joanisse MF: The roles of tonal and segmental information in Mandarin spoken word recognition: an eyetracking study. J. Mem. Lang. 2010, 62:407-420.
- (2010) J. Mem. Lang , vol.62 , pp. 407-420
- Malins, J.G.¹ Joanisse, M.F.²

2
- 84860850285
- Low-variance multitaper MFCC features: A case study in robust speaker verification
- 10.1109/TASL.2012.2191960
- Kinnunen T, Saeidi R, Sedlák F, Lee KA, Sandberg J, Hansson-Sandsten M, Li H: Low-variance multitaper MFCC features: a case study in robust speaker verification. IEEE Trans. Audio Speech 2012,20(7) 1990-2001.
- (2012) IEEE Trans. Audio Speech , vol.20 , Issue.7 , pp. 1990-2001
- Kinnunen, T.¹ Saeidi, R.² Sedlák, F.³ Lee, K.A.⁴ Sandberg, J.⁵ Hansson-Sandsten, M.⁶ Li, H.⁷

3
- 66249114358
- Patiala India
- Koolagudi SG, Nandy S, Rao KS: Spectral features for emotion classification. In Proceedings of IEEE International Advance Computing Conference (IACC '09). Patiala, India; 2009:1292-1296.
- (2009) Spectral Features for Emotion Classification, in Proceedings of IEEE International Advance Computing Conference (IACC '09)
- Koolagudi, S.G.¹ Nandy, S.² Rao, K.S.³

4
- 84859024513
- CASA-based robust speaker identification
- 10.1109/TASL.2012.2186803
- Zhao X, Shao Y, Wang DL: CASA-based robust speaker identification. IEEE Trans. Audio Speech 2012,20(5) 1608-1616.
- (2012) IEEE Trans. Audio Speech , vol.20 , Issue.5 , pp. 1608-1616
- Zhao, X.¹ Shao, Y.² Wang, D.L.³

5
- 84857488250
- Real-time robust automatic speech recognition using compact support vector machines
- 10.1109/TASL.2011.2178597
- Solera-Ureña R, García-Moral AI, Peláez-Moreno C, Martínez-Ramón M, Díaz-de-María F: Real-time robust automatic speech recognition using compact support vector machines. IEEE Trans. Audio Speech 2012,20(4) 1347-1361.
- (2012) IEEE Trans. Audio Speech , vol.20 , Issue.4 , pp. 1347-1361
- Solera-Ureña, R.¹ García-Moral, A.I.² Peláez-Moreno, C.³ Martínez-Ramón, M.⁴ Díaz-De-María, F.⁵

6
- 0035688755
- Robust matching of audio signals using spectral flatness features
- New York USA
- Herre J, Allamanche E, Hellmuth O: Robust matching of audio signals using spectral flatness features. In Proceedings of the 2001 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. New York, USA; 2001:127-130.
- (2001) Proceedings of the 2001 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics , pp. 127-130
- Herre, J.¹ Allamanche, E.² Hellmuth, O.³

7
- 70449380158
- Note on measures for spectral flatness
- 10.1049/el.2009.1977
- Madhu N: Note on measures for spectral flatness. Electron. Lett. 2009, 45:1195-1196.
- (2009) Electron. Lett , vol.45 , pp. 1195-1196
- Madhu, N.¹

8
- 33646801180
- Philadelphia, PA USA
- Misra H, Ikbal S, Sivadas S, Bourlard H: Multi-resolution spectral entropy feature for robust ASR. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), vol.1. Philadelphia, PA, USA; 2005:253-256.
- (2005) Multi-resolution Spectral Entropy Feature for Robust ASR, in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), vol.1 , pp. 253-256
- Misra, H.¹ Ikbal, S.² Sivadas, S.³ Bourlard, H.⁴

9
- 67649867403
- Novel acoustic features for speech emotion recognition
- 10.1007/s11431-009-0204-3
- Roh YW, Kim DJ, Lee WS, Hong KS: Novel acoustic features for speech emotion recognition. Sci. China Ser. E: Technol. Sci. 2009,52(7) 1838-1848.
- (2009) Sci. China Ser. E: Technol. Sci , vol.52 , Issue.7 , pp. 1838-1848
- Roh, Y.W.¹ Kim, D.J.² Lee, W.S.³ Hong, K.S.⁴

10
- 37649005506
- On the use of complementary spectral features for speaker recognition
- Hosseinzadeh D, Krishnan S: On the use of complementary spectral features for speaker recognition. EURASIP J. Adv. Signal Process. 2008,2008(Article ID 258184) 10.
- (2008) EURASIP J. Adv. Signal Process , vol.2008 , Issue.ARTICLE ID 258184 , pp. 10
- Hosseinzadeh, D.¹ Krishnan, S.²

11
- 84855883762
- Acoustic feature selection and classification of emotions in speech using a 3D continuous emotion model
- 10.1016/j.bspc.2011.02.008
- Pérez-Espinoza H, Reyes-García CA, Villaseñor-Pineda L: Acoustic feature selection and classification of emotions in speech using a 3D continuous emotion model. Biomed. Signal Process. 2012, 7:79-87.
- (2012) Biomed. Signal Process , vol.7 , pp. 79-87
- Pérez-Espinoza, H.¹ Reyes-García, C.A.² Villaseñor-Pineda, L.³

12
- 67649271323
- Automatic refinement of an expressive speech corpus assembling subjective perception and automatic classification
- 10.1016/j.specom.2008.12.001
- Iriondo I, Planet S, Socoró JC, Martínez E, Alías F, Monzo X: Automatic refinement of an expressive speech corpus assembling subjective perception and automatic classification. Speech Commun. 2009, 51:744-758.
- (2009) Speech Commun , vol.51 , pp. 744-758
- Iriondo, I.¹ Planet, S.² Socoró, J.C.³ Martínez, E.⁴ Alías, F.⁵ Monzo, X.⁶

13
- 79960846934
- Recognizing affect from speech prosody using hierarchical graphical models
- 10.1016/j.specom.2011.05.003
- Fernandez R, Picard R: Recognizing affect from speech prosody using hierarchical graphical models. Speech Commun. 2011, 53:1088-1103.
- (2011) Speech Commun , vol.53 , pp. 1088-1103
- Fernandez, R.¹ Picard, R.²

14
- 77951729327
- Spectral moment features augmented by low order cepstral coefficients for robust ASR
- 10.1109/LSP.2010.2046349
- Tsiakoulis P, Potamianos A, Dimitriadis D: Spectral moment features augmented by low order cepstral coefficients for robust ASR. IEEE Signal Process. Lett. 2010,17(6) 551-554.
- (2010) IEEE Signal Process. Lett , vol.17 , Issue.6 , pp. 551-554
- Tsiakoulis, P.¹ Potamianos, A.² Dimitriadis, D.³

15
- 0034346176
- Emotion recognition in speech using neural networks
- 1157.68511 10.1007/s005210070006
- Nicholson J, Takahashi K, Nakatsu R: Emotion recognition in speech using neural networks. Neural Comput. Appl. 2000,9(4) 290-296.
- (2000) Neural Comput. Appl , vol.9 , Issue.4 , pp. 290-296
- Nicholson, J.¹ Takahashi, K.² Nakatsu, R.³

16
- 33646033467
- Formal prosodic structures and their application in NLP
- V. Matousek P. Mautner T. Pavelka (eds) Springer Berlin 10.1007/11551874_48
- Romportl J, Matousek J: Formal prosodic structures and their application in NLP. In Text, Speech and Dialogue 2005, LNCS 3658. Edited by: Matousek V, Mautner P, Pavelka T. Berlin: Springer; 2005:371-378.
- (2005) Text, Speech and Dialogue 2005, LNCS 3658 , pp. 371-378
- Romportl, J.¹ Matousek, J.²

17
- 0242721417
- Speech emotion recognition using hidden Markov models
- 10.1016/S0167-6393(03)00099-2
- Nwe TL, Foo SW, De Silva LC: Speech emotion recognition using hidden Markov models. Speech Commun. 2003, 41:603-623.
- (2003) Speech Commun , vol.41 , pp. 603-623
- Nwe, T.L.¹ Foo, S.W.² De Silva, L.C.³

18
- 0029355999
- Speaker identification and verification using Gaussian mixture speaker models
- 10.1016/0167-6393(95)00009-D
- Reynolds DA: Speaker identification and verification using Gaussian mixture speaker models. Speech Commun. 1995, 17:91-108.
- (1995) Speech Commun , vol.17 , pp. 91-108
- Reynolds, D.A.¹

19
- 83655164697
- Loss-scaled large-margin Gaussian mixture models for speech emotion classification
- 10.1109/TASL.2011.2162405
- Yun S, Yoo CD: Loss-scaled large-margin Gaussian mixture models for speech emotion classification. IEEE Trans. Audio Speech 2012,20(2) 585-598.
- (2012) IEEE Trans. Audio Speech , vol.20 , Issue.2 , pp. 585-598
- Yun, S.¹ Yoo, C.D.²

20
- 79952707334
- Study of empirical mode decomposition and spectral analysis for stress and emotion classification in natural speech
- 10.1016/j.bspc.2010.11.001
- He L, Lech M, Maddage NC, Allen NB: Study of empirical mode decomposition and spectral analysis for stress and emotion classification in natural speech. Biomed. Signal Process. 2011, 6:139-146.
- (2011) Biomed. Signal Process , vol.6 , pp. 139-146
- He, L.¹ Lech, M.² Maddage, N.C.³ Allen, N.B.⁴

21
- 79960848203
- Formant position based weighted spectral features for emotion recognition
- 10.1016/j.specom.2011.04.003
- Bozkurt E, Erzin E, Erdem ÇE, Erdem AT: Formant position based weighted spectral features for emotion recognition. Speech Commun. 2011, 53:1186-1197.
- (2011) Speech Commun , vol.53 , pp. 1186-1197
- Bozkurt, E.¹ Erzin, E.² Erdem, Ç.E.³ Erdem, A.T.⁴

22
- 58349112115
- Application of expressive speech in TTS system with cepstral description
- A. Esposito N. Bourbakis N. Avouris I. Hatzilygeroudis (eds) Springer Berlin
- Přibil J, Přibilová A: Application of expressive speech in TTS system with cepstral description. In Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction 2007, LNAI 5042. Edited by: Esposito A, Bourbakis N, Avouris N, Hatzilygeroudis I. Berlin: Springer; 2008:201-213.
- (2008) Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction 2007, LNAI 5042 , pp. 201-213
- Přibil, J.¹ Přibilová, A.²

23
- 84865526267
- Czech expressive speech synthesis in limited domain comparison of unit selection and HMM-based approaches
- P. Sojka A. Horak I. Kopecek K. Pala (eds) Springer Berlin
- Grůber M, Hanzlíček Z: Czech expressive speech synthesis in limited domain comparison of unit selection and HMM-based approaches. In TSD 2012, LNCS 7499. Edited by: Sojka P, Horak A, Kopecek I, Pala K. Berlin: Springer; 2012:656-664.
- (2012) TSD 2012, LNCS 7499 , pp. 656-664
- Grůber, M.¹ Hanzlíček, Z.²

24
- 38149033442
- Czech TTS Engine for BraillePen Device Based on Pocket PC Platform
- Prague Czech Republic
- Přibil J, Přibilová A: Czech TTS Engine for BraillePen Device Based on Pocket PC Platform. In Proceedings of the 16th Conference Electronic Speech Signal Processing ESSP 05 joined with the 15th Czech-German Workshop Speech Processing. Prague, Czech Republic; 2005:402-408.
- (2005) Proceedings of the 16th Conference Electronic Speech Signal Processing ESSP 05 Joined with the 15th Czech-German Workshop Speech Processing , pp. 402-408
- Přibil, J.¹ Přibilová, A.²

25
- 84868227049
- Czech and Slovak speaking voice communicator based on PDA/smartphone device for handicapped people
- Plzen Czech Republic
- Přibil J, Přibilová A: Czech and Slovak speaking voice communicator based on PDA/smartphone device for handicapped people. In Proceedings of the International Conference on Applied Electronics. Plzen, Czech Republic; 2012:219-222.
- (2012) Proceedings of the International Conference on Applied Electronics , pp. 219-222
- Přibil, J.¹ Přibilová, A.²

26
- 84880866124
- Spectral properties and prosodic parameters of emotional speech in Czech and Slovak
- I. Ipšić (eds) InTech Rijeka, Croatia
- Přibil J, Přibilová A: Spectral properties and prosodic parameters of emotional speech in Czech and Slovak. In Speech and Language Technologies. Edited by: Ipšić I. Rijeka, Croatia: InTech; 2011:175-200.
- (2011) Speech and Language Technologies , pp. 175-200
- Přibil, J.¹ Přibilová, A.²

27
- 84866939235
- New cepstral zero-pole vocal tract models for TTS synthesis
- Bratislava Slovakia
- Vích R, Přibil J, Smékal Z: New cepstral zero-pole vocal tract models for TTS synthesis. In Proceedings of IEEE Region 8 EUROCON'2001, vol. 2. Bratislava, Slovakia; 2001:458-462.
- (2001) Proceedings of IEEE Region 8 EUROCON'2001, Vol. 2 , pp. 458-462
- Vích, R.¹ Přibil, J.² Smékal, Z.³

28
- 0036988253
- Effects of tonsillectomy on speech spectrum
- 10.1016/S0892-1997(02)00133-9
- Ilk HG, Eroǧul O, Satar B, Özkaptan Y: Effects of tonsillectomy on speech spectrum. J. Voice 2002, 16:580-586.
- (2002) J. Voice , vol.16 , pp. 580-586
- Ilk, H.G.¹ Eroǧul, O.² Satar, B.³ Özkaptan, Y.⁴

29
- 33645743720
- Kluwer Academic Publishers Dordrecht, The Netherlands
- Fant G: Speech Acoustics and Phonetics. Dordrecht, The Netherlands: Kluwer Academic Publishers; 2004.
- (2004) Speech Acoustics and Phonetics
- Fant, G.¹

30
- 33751430923
- Periodicity estimation in synthesized phonation signals using cepstral rahmonic peaks
- 10.1016/j.specom.2006.09.001
- Murphy PJ: Periodicity estimation in synthesized phonation signals using cepstral rahmonic peaks. Speech Commun. 2006, 48:1704-1713.
- (2006) Speech Commun , vol.48 , pp. 1704-1713
- Murphy, P.J.¹

31
- 70349913647
- Jitter estimation algorithms for detection of pathological voices
- Silva DG, Oliveira LC, Andrea M: Jitter estimation algorithms for detection of pathological voices. EURASIP J. Adv. Signal Process. 2009,2009(Article ID 567875) 9.
- (2009) EURASIP J. Adv. Signal Process , vol.2009 , pp. 9
- Silva, D.G.¹ Oliveira, L.C.² Andrea, M.³

32
- 84887027080
- Berlin Database of Emotional Speech: Department of Communication Science, Institute for Speech and Communication. Berlin: Technical University Accessed 13 March 2006
- Berlin Database of Emotional Speech: Department of Communication Science, Institute for Speech and Communication. Berlin: Technical University; http://pascal.kgw.tu-berlin.de/emodb/, Accessed 13 March 2006

33
- 33745202280
- A database of German emotional speech
- Lisbon Portugal
- Burkhardt F, Paeschke A, Rolfes M, Sendlmeier W, Weiss B: A database of German emotional speech. In Proceedings of Interspeech 2005. Lisbon, Portugal; 2005:1517-1520.
- (2005) Proceedings of Interspeech 2005 , pp. 1517-1520
- Burkhardt, F.¹ Paeschke, A.² Rolfes, M.³ Sendlmeier, W.⁴ Weiss, B.⁵

34
- 84870318367
- Comparison of complementary spectral features of emotional speech for German, Czech, and Slovak
- A. Esposito R. Hoffmann S. Hubler B. Wrann (eds) Springer Heidelberg
- Přibil J, Přibilová A: Comparison of complementary spectral features of emotional speech for German, Czech, and Slovak. In Cognitive Behavioural Systems, LNCS 7403. Edited by: Esposito A, Hoffmann R, Hubler S, Wrann B. Heidelberg: Springer; 2012:236-250.
- (2012) Cognitive Behavioural Systems, LNCS 7403 , pp. 236-250
- Přibil, J.¹ Přibilová, A.²

35
- 84887099624
- Accessed 16 February 2012
- Nabney T: Netlab Pattern Analysis Toolbox. http://www.mathworks.com/matlabcentral/fileexchange/2654-netlab, Accessed 16 February 2012
- Netlab Pattern Analysis Toolbox
- Nabney, T.¹

36
- 84884932130
- Accessed 16 February 2012
- Bishop CM, Nabney IT: NETLAB Online Reference Documentation. http://www.fizyka.umk.pl/netlab/, Accessed 16 February 2012
- NETLAB Online Reference Documentation
- Bishop, C.M.¹ Nabney, I.T.²

37
- 67650405956
- An 'open-set' detection evaluation methodology for automatic emotion recognition in speech
- Saarbrücken Germany
- Truong KP, van Leeuwen DA: An 'open-set' detection evaluation methodology for automatic emotion recognition in speech. In ParaLing 2007: Workshop on Paralinguistic Speech - Between Models and Data. Saarbrücken, Germany; 2007:5-10.
- (2007) ParaLing 2007: Workshop on Paralinguistic Speech - Between Models and Data , pp. 5-10
- Truong, K.P.¹ van Leeuwen, D.A.²

38
- 77956401353
- Class-level spectral features for emotion recognition
- 10.1016/j.specom.2010.02.010
- Bitouk D, Verma R, Nenkova A: Class-level spectral features for emotion recognition. Speech Commun. 2010, 52:613-625.
- (2010) Speech Commun , vol.52 , pp. 613-625
- Bitouk, D.¹ Verma, R.² Nenkova, A.³

39
- 67650474760
- Recognition of emotions in German speech using Gaussian mixture models
- A. Esposito A. Hussain M. Marinaro R. Martone (eds) Springer Berlin 10.1007/978-3-642-00525-1_26
- Vondra M, Vích R: Recognition of emotions in German speech using Gaussian mixture models. In Multimodal Signals: Cognitive and Algorithmic Issues, LNAI 5398. Edited by: Esposito A, Hussain A, Marinaro M, Martone R. Berlin: Springer; 2009:256-263.
- (2009) Multimodal Signals: Cognitive and Algorithmic Issues, LNAI 5398 , pp. 256-263
- Vondra, M.¹ Vích, R.²

40
- 33947164164
- An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech
- 10.1016/j.specom.2007.01.006
- Shami M, Verhelst W: An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech. Speech Commun. 2007, 49:201-212.
- (2007) Speech Commun , vol.49 , pp. 201-212
- Shami, M.¹ Verhelst, W.²

41
- 84864723353
- Speaker-independent emotion recognition exploiting a psychologically inspired binary cascade classification schema
- 10.1007/s10772-012-9127-7
- Kotti M, Paternò F: Speaker-independent emotion recognition exploiting a psychologically inspired binary cascade classification schema. Int. J. Speech Technol. 2012, 15:131-150. http://link.springer.com/article/10.1007/s10772-012-9127-7#page-1
- (2012) Int. J. Speech Technol , vol.15 , pp. 131-150
- Kotti, M.¹ Paternò, F.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.