SCOPUS Information Search Platform

2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings

Volume , Issue , 2013, Pages 216-221

Emotion recognition from spontaneous speech using Hidden Markov models with deep belief networks

(2) Le, Duc a; Provost, Emily Mower a

a UNIVERSITY OF MICHIGAN (United States)

Author keywords

deep belief networks; dynamic modeling; emotion classification; FAU Aibo; spontaneous speech

Indexed keywords

AUTOMATIC EMOTION RECOGNITION; BENCHMARK DATASETS; DEEP BELIEF NETWORKS; EMOTION CLASSIFICATION; EMOTION RECOGNITION; FAU AIBO; HIGH DIMENSIONAL DATA; SPONTANEOUS SPEECH; DYNAMIC MODELS; HIDDEN MARKOV MODELS

EID: 84893690785 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ASRU.2013.6707732 Document Type: Conference Paper

Times cited : (102)

References (33)

1
- 70449388050
- Ph.D. thesis
- S. Steidl, Automatic classification of emotion related user states in spontaneous children's speech, Ph.D. thesis, 2009.
- (2009) Automatic Classification of Emotion Related User States in Spontaneous Children's Speech
- Steidl, S.¹

2
- 84055211743
- Acoustic modeling using deep belief networks
- A. Mohamed, G. E. Dahl, and G. E. Hinton, Acoustic modeling using deep belief networks, IEEE Transactions on Audio, Speech & Language Processing, vol. 20, no. 1, pp. 14-22, 2012.
- (2012) IEEE Transactions on Audio, Speech & Language Processing , vol.20 , Issue.1 , pp. 14-22
- Mohamed, A.¹ Dahl, G.E.² Hinton, G.E.³

3
- 84890447803
- Multiple windowed spectral features for emotion recognition
- Vancouver, BC, Canada
- Y. Attabi, J. Alam, P. Dumouchel, P. Kenny, and D. O'Shaughnessy, Multiple windowed spectral features for emotion recognition, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, BC, Canada, 2013.
- (2013) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Attabi, Y.¹ Alam, J.² Dumouchel, P.³ Kenny, P.⁴ O'Shaughnessy, D.⁵

4
- 84878390959
- Combining ranking and classification to improve emotion recognition in spontaneous speech
- Portland, OR, USA
- H. Cao, R. Verma, and A. Nenkova, Combining ranking and classification to improve emotion recognition in spontaneous speech, in Proc. of the 13th Annual Conference of the International Speech Communication Association (INTERSPEECH), Portland, OR, USA, 2012.
- (2012) Proc. of the 13th Annual Conference of the International Speech Communication Association (INTERSPEECH)
- Cao, H.¹ Verma, R.² Nenkova, A.³

5
- 70450206416
- The interspeech 2009 emotion challenge
- Brighton, United Kingdom
- B. Schuller, S. Steidl, and A. Batliner, The interspeech 2009 emotion challenge, in Proc. of the 10th Annual Conference of the International Speech Communication Association (INTERSPEECH), Brighton, United Kingdom, 2009, pp. 312-315.
- (2009) Proc. of the 10th Annual Conference of the International Speech Communication Association (INTERSPEECH) , pp. 312-315
- Schuller, B.¹ Steidl, S.² Batliner, A.³

6
- 70450177653
- Brno university of technology system for interspeech 2009 emotion challenge
- Brighton, United Kingdom
- M. Kockmann, L. Burget, and J. Cernocký, Brno university of technology system for interspeech 2009 emotion challenge, in Proc. of the 10th Annual Conference of the International Speech Communication Association (INTERSPEECH), Brighton, United Kingdom, 2009, number 9, pp. 348-351.
- (2009) Proc. of the 10th Annual Conference of the International Speech Communication Association (INTERSPEECH) , Issue.9 , pp. 348-351
- Kockmann, M.¹ Burget, L.² Cernocký, J.³

7
- 79960846940
- Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge
- Sensing Emotion and Affect - Facing Realism in Speech Processing
- B. Schuller, A. Batliner, S. Steidl, and D. Seppi, Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge, Speech Communication, vol. 53, no. 9-10, pp. 1062-1087, 2011, Sensing Emotion and Affect - Facing Realism in Speech Processing.
- (2011) Speech Communication , vol.53 , Issue.9-10 , pp. 1062-1087
- Schuller, B.¹ Batliner, A.² Steidl, S.³ Seppi, D.⁴

8
- 84876262647
- On acoustic emotion recognition: Compensating for covariate shift
- A. Hassan, R. Damper, and M. Niranjan, On acoustic emotion recognition: Compensating for covariate shift, IEEE Transactions on Audio, Speech & Language Processing, vol. 21, no. 7, pp. 1458-1468, 2013.
- (2013) IEEE Transactions on Audio, Speech & Language Processing , vol.21 , Issue.7 , pp. 1458-1468
- Hassan, A.¹ Damper, R.² Niranjan, M.³

9
- 0013344078
- Training products of experts by minimizing contrastive divergence
- Aug
- G. E. Hinton, Training products of experts by minimizing contrastive divergence, Neural Computation, vol. 14, no. 8, pp. 1771-1800, Aug. 2002.
- (2002) Neural Computation , vol.14 , Issue.8 , pp. 1771-1800
- Hinton, G.E.¹

10
- 33745805403
- A fast learning algorithm for deep belief nets
- DOI 10.1162/neco.2006.18.7.1527
- G. E. Hinton, S. Osindero, and Y. Teh, A fast learning algorithm for deep belief nets, Neural Computation, vol. 18, no. 7, pp. 1527-1554, July 2006. (Pubitemid 44024729)
- (2006) Neural Computation , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.-W.³

11
- 78650474133
- Tech. Rep
- G. E. Hinton, A practical guide to training restricted boltzmann machines, Tech. Rep., 2010.
- (2010) A Practical Guide to Training Restricted Boltzmann Machines
- Hinton, G.E.¹

12
- 85161980001
- Sparse deep belief net model for visual area v2
- MIT Press
- H. Lee, C. Ekanadham, and A. Y. Ng, Sparse deep belief net model for visual area v2, in Proc. of the 22nd Annual Conference on Neural Information Processing Systems (NIPS), 2008, MIT Press.
- (2008) Proc. of the 22nd Annual Conference on Neural Information Processing Systems (NIPS)
- Lee, H.¹ Ekanadham, C.² Ng, A.Y.³

13
- 71149119164
- Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
- Montreal, QC, Canada, ACM
- H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, in Proc. of the 26th Annual International Conference on Machine Learning (ICML), Montreal, QC, Canada, 2009, pp. 609-616, ACM.
- (2009) Proc. of the 26th Annual International Conference on Machine Learning (ICML) , pp. 609-616
- Lee, H.¹ Grosse, R.² Ranganath, R.³ Ng, A.Y.⁴

14
- 77956542104
- Deep networks for robust visual recognition
- Haifa, Israel, Omnipress
- Y. Tang and C. Eliasmith, Deep networks for robust visual recognition, in Proc. of the 27th Annual International Conference on Machine Learning (ICML), Haifa, Israel, 2010, pp. 1055-1062, Omnipress.
- (2010) Proc. of the 27th Annual International Conference on Machine Learning (ICML) , pp. 1055-1062
- Tang, Y.¹ Eliasmith, C.²

15
- 84876231242
- Imagenet classification with deep convolutional neural networks
- Lake Tahoe, NV, USA
- A. Krizhevsky, I. Sutskever, and G. E. Hinton, Imagenet classification with deep convolutional neural networks, in Proc. of the 26th Annual Conference on Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA, 2012, pp. 1106-1114.
- (2012) Proc. of the 26th Annual Conference on Neural Information Processing Systems (NIPS) , pp. 1106-1114
- Krizhevsky, A.¹ Sutskever, I.² Hinton, G.E.³

16
- 85032751458
- Deep neural networks for acoustic modeling in speech recognition
- G. E. Hinton, D. Li, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, Deep neural networks for acoustic modeling in speech recognition, IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
- (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
- Hinton, G.E.¹ Li, D.² Yu, D.³ Dahl, G.⁴ Mohamed, A.⁵ Jaitly, N.⁶ Senior, A.⁷ Vanhoucke, V.⁸ Nguyen, P.⁹ Sainath, T.¹⁰ Kingsbury, B.¹¹

17
- 84255177123
- Deep and wide: Multiple layers in automatic speech recognition
- N. Morgan, Deep and wide: Multiple layers in automatic speech recognition, IEEE Transactions on Audio, Speech & Language Processing, vol. 20, no. 1, pp. 7-13, 2012.
- (2012) IEEE Transactions on Audio, Speech & Language Processing , vol.20 , Issue.1 , pp. 7-13
- Morgan, N.¹

18
- 84055212007
- Sparse multilayer perceptron for phoneme recognition
- G.S.V.S. Sivaram and H. Hermansky, Sparse multilayer perceptron for phoneme recognition, IEEE Transactions on Audio, Speech & Language Processing, vol. 20, no. 1, pp. 23-29, 2012.
- (2012) IEEE Transactions on Audio, Speech & Language Processing , vol.20 , Issue.1 , pp. 23-29
- Sivaram, G.S.V.S.¹ Hermansky, H.²

19
- 80051631315
- Deep neural networks for acoustic emotion recognition: Raising the benchmarks
- Prague, Czech Republic
- A. Stuhlsatz, C. Meyer, F. Eyben, T. Zielke, G. Meier, and B. Schuller, Deep neural networks for acoustic emotion recognition: Raising the benchmarks, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 2011, pp. 5688-5691.
- (2011) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 5688-5691
- Stuhlsatz, A.¹ Meyer, C.² Eyben, F.³ Zielke, T.⁴ Meier, G.⁵ Schuller, B.⁶

20
- 83455238990
- Learning emotion-based acoustic features with deep belief networks
- New Paltz, NY, USA
- E. M. Schmidt and Y. E. Kim, Learning emotion-based acoustic features with deep belief networks, in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 2011, pp. 65-68.
- (2011) IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) , pp. 65-68
- Schmidt, E.M.¹ Kim, Y.E.²

21
- 84878405775
- Likability classification - A not so deep neural network approach
- Portland, OR, USA
- R. Brueckner and B. Schuller, Likability classification - A not so deep neural network approach, in Proc. of the 13th Annual Conference of the International Speech Communication Association (INTERSPEECH), Portland, OR, USA, 2012.
- (2012) Proc. of the 13th Annual Conference of the International Speech Communication Association (INTERSPEECH)
- Brueckner, R.¹ Schuller, B.²

22
- 84893647216
- Deep learning for robust feature generation in audio-visual emotion recognition
- Vancouver, BC, Canada
- Y. Kim, H. Lee, and E. Mower Provost, Deep learning for robust feature generation in audio-visual emotion recognition, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver, BC, Canada.
- IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- Kim, Y.¹ Lee, H.² Provost, E.M.³

23
- 84878392511
- The interspeech 2012 speaker trait challenge
- Portland, OR, USA
- B. Schuller, S. Steidl, A. Batliner, E. Noth, A. Vinciarelli, F. Burkhardt, R. van Son, F. Weninger, F. Eyben, T. Bocklet, G. Mohammadi, and B. Weiss, The interspeech 2012 speaker trait challenge, in 13th Annual Conference of the International Speech Communication Association (INTERSPEECH), Portland, OR, USA, 2012.
- (2012) 13th Annual Conference of the International Speech Communication Association (INTERSPEECH)
- Schuller, B.¹ Steidl, S.² Batliner, A.³ Noth, E.⁴ Vinciarelli, A.⁵ Burkhardt, F.⁶ van Son, R.⁷ Weninger, F.⁸ Eyben, F.⁹ Bocklet, T.¹⁰ Mohammadi, G.¹¹ Weiss, B.¹²

24
- 84857819132
- Theano: A CPU and GPU math expression compiler
- Austin, TX, USA, June
- J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio, Theano: A CPU and GPU math expression compiler, in Proc. of the Python for Scientific Computing Conference (SciPy), Austin, TX, USA, June 2010.
- (2010) Proc. of the Python for Scientific Computing Conference (SciPy)
- Bergstra, J.¹ Breuleux, O.² Bastien, F.³ Lamblin, P.⁴ Pascanu, R.⁵ Desjardins, G.⁶ Turian, J.⁷ Warde-Farley, D.⁸ Bengio, Y.⁹

25
- 79959856651
- Incremental acoustic valence recognition: An inter-corpus perspective on features, matching, and performance in a gating paradigm
- Makuhari, Japan
- B. Schuller and L. Devillers, Incremental acoustic valence recognition: An inter-corpus perspective on features, matching, and performance in a gating paradigm, in Proc. of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH), Makuhari, Japan, 2010, pp. 801-804.
- (2010) Proc. of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH) , pp. 801-804
- Schuller, B.¹ Devillers, L.²

26
- 80051647584
- Sentence level emotion recognition based on decisions from subsentence segments
- Prague, Czech Republic
- J. H. Jeon, R. Xia, and Y. Liu, Sentence level emotion recognition based on decisions from subsentence segments, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 2011, pp. 4940-4943.
- (2011) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 4940-4943
- Jeon, J.H.¹ Xia, R.² Liu, Y.³

27
- 80051607532
- A hierarchical static-dynamic framework for emotion classification
- Prague, Czech Republic
- E. Mower and S. Narayanan, A hierarchical static-dynamic framework for emotion classification, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 2011, pp. 2372-2375.
- (2011) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pp. 2372-2375
- Mower, E.¹ Narayanan, S.²

28
- 0346586663
- SMOTE: Synthetic minority over-sampling technique
- N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, SMOTE: Synthetic minority over-sampling technique, Journal of Artificial Intelligence Research (JAIR), vol. 16, pp. 321-357, 2002. (Pubitemid 43057176)
- (2002) Journal of Artificial Intelligence Research , vol.16 , pp. 321-357
- Chawla, N.V.¹ Bowyer, K.W.² Hall, L.O.³ Kegelmeyer, W.P.⁴

29
- 0000551189
- Popular ensemble methods: An empirical study
- D. Opitz and R. Maclin, Popular ensemble methods: An empirical study, Journal of Artificial Intelligence Research (JAIR), vol. 11, pp. 169-198, 1999. (Pubitemid 129628763)
- (1999) Journal of Artificial Intelligence Research , vol.11 , pp. 169-198
- Opitz, D.¹ Maclin, R.²

30
- 0036293830
- An overview of automatic speaker recognition technology
- Orlando, FL, USA
- D. A. Reynolds, An overview of automatic speaker recognition technology, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Orlando, FL, USA, 2002, vol. 4, pp. 4072-4075.
- (2002) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , vol.4 , pp. 4072-4075
- Reynolds, D.A.¹

31
- 70350125882
- An overview of text-independent speaker recognition: From features to supervectors
- Jan
- T. Kinnunen and H. Li, An overview of text-independent speaker recognition: From features to supervectors, Speech Communication, vol. 52, no. 1, pp. 12-40, Jan. 2010.
- (2010) Speech Communication , vol.52 , Issue.1 , pp. 12-40
- Kinnunen, T.¹ Li, H.²

32
- 79958818321
- An overview of speaker identification: Accuracy and robustness issues
- R. Togneri and D. Pullella, An overview of speaker identification: Accuracy and robustness issues, Circuits and Systems Magazine, vol. 11, no. 2, pp. 23-61, 2011.
- (2011) Circuits and Systems Magazine , vol.11 , Issue.2 , pp. 23-61
- Togneri, R.¹ Pullella, D.²

33
- 56149101219
- Using neutral speech models for emotional speech analysis
- Antwerp, Belgium
- C. Busso, S. Lee, and S. Narayanan, Using neutral speech models for emotional speech analysis, in Proc. of the 8th Annual Conference of the International Speech Communication Association (INTERSPEECH), Antwerp, Belgium, 2007, pp. 2225-2228.
- (2007) Proc. of the 8th Annual Conference of the International Speech Communication Association (INTERSPEECH) , pp. 2225-2228
- Busso, C.¹ Lee, S.² Narayanan, S.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.