Speech Communication, Volume 53, Issue 5, 2011, Pages 768-785

Automatic speech emotion recognition using modulation spectral features

Author keywords

Affective computing; Emotion recognition; Spectro-temporal representation; Speech analysis; Speech modulation

Indexed keywords

ACOUSTIC FREQUENCY; AFFECTIVE COMPUTING; AUTOMATIC RECOGNITION; EMOTION RECOGNITION; ESTIMATION PERFORMANCE; HUMAN EVALUATION; HUMAN SPEECH PERCEPTION; MEL-FREQUENCY CEPSTRAL COEFFICIENTS; MODULATION FILTERBANK; PERCEPTUAL LINEAR PREDICTIONS; PROSODIC FEATURES; RECOGNITION PERFORMANCE; RECOGNITION RATES; SPECTRAL FEATURE; SPECTRAL REPRESENTATIONS; SPECTRO-TEMPORAL REPRESENTATION; SPEECH EMOTION RECOGNITION; TEMPORAL MODULATIONS; TEMPORAL REPRESENTATIONS;

EID: 79953659944     PISSN: 0167-6393     EISSN: None     Source Type: Journal
DOI: 10.1016/j.specom.2010.08.013     Document Type: Article
Times cited: 354

References (60)
  • 2. Aertsen, A.M.H.J., Johannesma, P.I.M., 1980. Spectro-temporal receptive fields of auditory neurons in the grassfrog. I. Characterization of tonal and natural stimuli. Biological Cybernetics 38(4), 223-234. DOI: 10.1007/BF00337015
  • 8. Busso, C., Lee, S., Narayanan, S., 2009. Analysis of emotionally salient aspects of fundamental frequency for emotion detection. IEEE Transactions on Audio, Speech, and Language Processing 17, 582-596.
  • 9. Chang, C.-C., Lin, C.-J., 2009. LIBSVM: A library for support vector machines. Tech. rep., Department of Computer Science, National Taiwan University. Software available at: .
  • 10. Chi, T., Ru, P., Shamma, S.A., 2005. Multiresolution spectrotemporal analysis of complex sounds. Journal of the Acoustical Society of America 118(2), 887-906. DOI: 10.1121/1.1945807
  • 11. Clavel, C., Vasilescu, I., Devillers, L., Richard, G., Ehrette, T., 2008. Fear-type emotion recognition for future audio-based surveillance systems. Speech Communication 50, 487-503.
  • 12. Cowie, R., Douglas-Cowie, E., 1996. Automatic statistical analysis of the signal and prosodic signs of emotion in speech. In: Proc. Internat. Conf. on Spoken Language Processing, Vol. 3, pp. 1989-1992.
  • 14. Davis, S., Mermelstein, P., 1980. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-28(4), 357-366.
  • 15. Depireux, D.A., Simon, J.Z., Klein, D.J., Shamma, S.A., 2001. Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex. Journal of Neurophysiology 85(3), 1220-1234.
  • 17. Ekman, P., 1999. Basic Emotions. John Wiley, New York, pp. 45-60.
  • 18. Ewert, S., Dau, T., 2000. Characterizing frequency selectivity for envelope fluctuations. Journal of the Acoustical Society of America 108, 1181-1196.
  • 20. Falk, T.H., Chan, W.-Y., 2010. Modulation spectral features for robust far-field speaker identification. IEEE Transactions on Audio, Speech, and Language Processing 18, 90-100.
  • 21. Falk, T.H., Chan, W.-Y., 2010. Temporal dynamics for blind measurement of room acoustical parameters. IEEE Transactions on Instrumentation and Measurement 59, 978-989.
  • 22
  • 24. Glasberg, B.R., Moore, B.C.J., 1990. Derivation of auditory filter shapes from notched-noise data. Hearing Research 47(1-2), 103-138. DOI: 10.1016/0378-5955(90)90170-T
  • 25. Grimm, M., Kroschel, K., Mower, E., Narayanan, S., 2007. Primitives-based evaluation and estimation of emotions in speech. Speech Communication 49(10-11), 787-800. DOI: 10.1016/j.specom.2007.01.010
  • 28. Gunes, H., Piccardi, M., 2007. Bi-modal emotion recognition from expressive face and body gestures. Journal of Network and Computer Applications 30(4), 1334-1345. DOI: 10.1016/j.jnca.2006.09.007
  • 29. Hermansky, H., 1990. Perceptual linear predictive (PLP) analysis of speech. Journal of the Acoustical Society of America 87(4), 1738-1752. DOI: 10.1121/1.399423
  • 30. Hsu, C.-W., Chang, C.-C., Lin, C.-J., 2007. A practical guide to support vector classification. Tech. rep., Department of Computer Science, National Taiwan University.
  • 33. Ishi, C., Ishiguro, H., Hagita, N., 2010. Analysis of the roles and the dynamics of breathy and whispery voice qualities in dialogue speech. EURASIP Journal on Audio, Speech, and Music Processing, article ID 528193, 12 pages.
  • 35. Kanedera, N., Arai, T., Hermansky, H., Pavel, M., 1999. On the relative importance of various components of the modulation spectrum for automatic speech recognition. Speech Communication 28, 43-55.
  • 38. Lugger, M., Yang, B., 2008. Cascaded emotion classification via psychological emotion dimensions using a large set of voice quality parameters. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing, Vol. 4, pp. 4945-4948.
  • 41. Nwe, T., Foo, S., De Silva, L., 2003. Speech emotion recognition using hidden Markov models. Speech Communication 41, 603-623.
  • 44. Scherer, K., 2003. Vocal communication of emotion: A review of research paradigms. Speech Communication 40, 227-256.
  • 49. Shami, M., Verhelst, W., 2007. An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech. Speech Communication 49(3), 201-212. DOI: 10.1016/j.specom.2007.01.006
  • 50. Shamma, S., 2001. On the role of space and time in auditory processing. Trends in Cognitive Sciences 5(8), 340-348. DOI: 10.1016/S1364-6613(00)01704-6
  • 51. Shamma, S., 2003. Encoding sound timbre in the auditory system. IETE Journal of Research 49, 193-205.
  • 52. Slaney, M., 1993. An efficient implementation of the Patterson-Holdsworth auditory filterbank. Tech. rep., Apple Computer, Perception Group.
  • 53. Sun, R., Moore, E., Torres, J., 2009. Investigating glottal parameters for differentiating emotional categories with similar prosodics. In: Proc. Internat. Conf. on Acoustics, Speech and Signal Processing, pp. 4509-4512.
  • 56. Ververidis, D., Kotropoulos, C., 2006. Emotional speech recognition: Resources, features, and methods. Speech Communication 48(9), 1162-1181. DOI: 10.1016/j.specom.2006.04.003
  • 57. Vlasenko, B., Schuller, B., Wendemuth, A., Rigoll, G., 2007. Combining frame and turn-level information for robust recognition of emotions within speech. In: Proc. Interspeech, pp. 2225-2228.
  • 58. Wollmer, M., Eyben, F., Reiter, S., Schuller, B., Cox, C., Douglas-Cowie, E., Cowie, R., 2008. Abandoning emotion classes - Towards continuous emotion recognition with modelling of long-range dependencies. In: Proc. Interspeech, pp. 597-600.
  • 60. Zhou, G., Hansen, J.H.L., Kaiser, J.F., 2001. Nonlinear feature based classification of speech under stress. IEEE Transactions on Speech and Audio Processing 9(3), 201-216. DOI: 10.1109/89.905995


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.