-
1
-
-
84910651844
-
Deep learning in neural networks: An overview
-
January
-
J. Schmidhuber, "Deep learning in neural networks: An overview," Neural Networks, vol. 61, pp. 85-117, January 2015.
-
(2015)
Neural Networks
, vol.61
, pp. 85-117
-
-
Schmidhuber, J.1
-
2
-
-
85032751458
-
Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
-
November
-
G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, November 2012.
-
(2012)
Signal Processing Magazine
, vol.29
, Issue.6
, pp. 82-97
-
-
Hinton, G.1
Deng, L.2
Yu, D.3
Dahl, G.E.4
Mohamed, A.5
Jaitly, N.6
Senior, A.7
Vanhoucke, V.8
Nguyen, P.9
Sainath, T.N.10
Kingsbury, B.11
-
3
-
-
84936143793
-
Towards end-to-end speech recognition with recurrent neural networks
-
Beijing, China
-
A. Graves and N. Jaitly, "Towards end-to-end speech recognition with recurrent neural networks," in Proc. ICML, Beijing, China, 2014, pp. 1764-1772.
-
(2014)
Proc. ICML
, pp. 1764-1772
-
-
Graves, A.1
Jaitly, N.2
-
4
-
-
84906269266
-
The INTERSPEECH 2013 Computational Paralinguistics Challenge: Social signals, conflict, emotion, autism
-
Lyon, France, August ISCA
-
B. Schuller, S. Steidl, A. Batliner, A. Vinciarelli, K. Scherer, F. Ringeval, M. Chetouani, F. Weninger, F. Eyben, E. Marchi, M. Mortillaro, H. Salamin, A. Polychroniou, F. Valente, and S. Kim, "The INTERSPEECH 2013 Computational Paralinguistics Challenge: Social signals, conflict, emotion, autism," in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 148-152, ISCA.
-
(2013)
Proc. INTERSPEECH
, pp. 148-152
-
-
Schuller, B.1
Steidl, S.2
Batliner, A.3
Vinciarelli, A.4
Scherer, K.5
Ringeval, F.6
Chetouani, M.7
Weninger, F.8
Eyben, F.9
Marchi, E.10
Mortillaro, M.11
Salamin, H.12
Polychroniou, A.13
Valente, F.14
Kim, S.15
-
5
-
-
84960854232
-
AV+EC 2015-the first affect recognition challenge bridging across audio, video, and physiological data
-
Brisbane, Australia, October, ACM
-
F. Ringeval et al., "AV+EC 2015-The First Affect Recognition Challenge Bridging Across Audio, Video, and Physiological Data," in Proc. AVEC, Fabien Ringeval, Björn Schuller, Michel Valstar, Roddy Cowie, and Maja Pantic, Eds., Brisbane, Australia, October 2015, pp. 3-8, ACM.
-
(2015)
Proc. AVEC
, pp. 3-8
-
-
Ringeval, F.1
-
6
-
-
80051609011
-
Learning a better representation of speech sound waves using restricted Boltzmann machines
-
Prague, Czech Republic, May IEEE
-
N. Jaitly and G. Hinton, "Learning a better representation of speech sound waves using restricted Boltzmann machines," in Proc. ICASSP, Prague, Czech Republic, May 2011, pp. 5884-5887, IEEE.
-
(2011)
Proc. ICASSP
, pp. 5884-5887
-
-
Jaitly, N.1
Hinton, G.2
-
7
-
-
84959098603
-
Architectures for deep neural network based acoustic models defined over windowed speech waveforms
-
Dresden, Germany, September ISCA
-
M. Bhargava and R. Rose, "Architectures for deep neural network based acoustic models defined over windowed speech waveforms," in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 6-10, ISCA.
-
(2015)
Proc. INTERSPEECH
, pp. 6-10
-
-
Bhargava, M.1
Rose, R.2
-
8
-
-
84946037134
-
Convolutional, long short-term memory, fully connected deep neural networks
-
Brisbane, Australia, April IEEE
-
T. Sainath, O. Vinyals, A. Senior, and H. Sak, "Convolutional, long short-term memory, fully connected deep neural networks," in Proc. ICASSP, Brisbane, Australia, April 2015, pp. 4580-4584, IEEE.
-
(2015)
Proc. ICASSP
, pp. 4580-4584
-
-
Sainath, T.1
Vinyals, O.2
Senior, A.3
Sak, H.4
-
9
-
-
84959168440
-
Learning the speech front-end with raw waveform CLDNNs
-
Dresden, Germany, September ISCA
-
T. Sainath, R. Weiss, A. Senior, K. Wilson, and O. Vinyals, "Learning the speech front-end with raw waveform CLDNNs," in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 1-5, ISCA.
-
(2015)
Proc. INTERSPEECH
, pp. 1-5
-
-
Sainath, T.1
Weiss, R.2
Senior, A.3
Wilson, K.4
Vinyals, O.5
-
10
-
-
84906273908
-
Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
-
Lyon, France, August ISCA
-
D. Palaz, R. Collobert, and M. Magimai-Doss, "Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks," in Proc. INTERSPEECH, Lyon, France, August 2013, pp. 1766-1770, ISCA.
-
(2013)
Proc. INTERSPEECH
, pp. 1766-1770
-
-
Palaz, D.1
Collobert, R.2
Magimai-Doss, M.3
-
11
-
-
84955059475
-
Analysis of CNN-based speech recognition system using raw speech as input
-
Dresden, Germany, September, ISCA
-
D. Palaz, M. Magimai-Doss, and R. Collobert, "Analysis of CNN-based speech recognition system using raw speech as input," in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 11-15, ISCA.
-
(2015)
Proc. INTERSPEECH
, pp. 11-15
-
-
Palaz, D.1
Magimai-Doss, M.2
Collobert, R.3
-
12
-
-
84905248193
-
End-to-end learning for music audio
-
Florence, Italy, April
-
S. Dieleman and B. Schrauwen, "End-to-end learning for music audio," in Proc. ICASSP, Florence, Italy, April 2014, pp. 7014-7018.
-
(2014)
Proc. ICASSP
, pp. 7014-7018
-
-
Dieleman, S.1
Schrauwen, B.2
-
14
-
-
84959157337
-
Using representation learning and out-of-domain data for a paralinguistic speech task
-
Dresden, Germany, September, ISCA
-
B. Milde and C. Biemann, "Using representation learning and out-of-domain data for a paralinguistic speech task," in Proc. INTERSPEECH, Dresden, Germany, September 2015, pp. 904-908, ISCA.
-
(2015)
Proc. INTERSPEECH
, pp. 904-908
-
-
Milde, B.1
Biemann, C.2
-
15
-
-
84913548678
-
Learning salient features for speech emotion recognition using convolutional neural networks
-
Dec
-
Q. Mao, M. Dong, Z. Huang, and Y. Zhan, "Learning salient features for speech emotion recognition using convolutional neural networks," IEEE Transactions on Multimedia, vol. 16, no. 8, pp. 2203-2213, Dec 2014.
-
(2014)
IEEE Transactions on Multimedia
, vol.16
, Issue.8
, pp. 2203-2213
-
-
Mao, Q.1
Dong, M.2
Huang, Z.3
Zhan, Y.4
-
16
-
-
0024521543
-
A concordance correlation coefficient to evaluate reproducibility
-
March
-
L. I-Kuei Lin, "A concordance correlation coefficient to evaluate reproducibility," Biometrics, vol. 45, no. 1, pp. 255-268, March 1989.
-
(1989)
Biometrics
, vol.45
, Issue.1
, pp. 255-268
-
-
I-Kuei Lin, L.1
-
17
-
-
0011823639
-
Improved speech recognition using high-pass filtering of subband envelopes
-
Genoa, Italy, September, ISCA
-
H. G. Hirsch, P. Meyer, and H. W. Ruehl, "Improved speech recognition using high-pass filtering of subband envelopes," in Proc. EUROSPEECH, Genoa, Italy, September 1991, pp. 413-416, ISCA.
-
(1991)
Proc. EUROSPEECH
, pp. 413-416
-
-
Hirsch, H.G.1
Meyer, P.2
Ruehl, H.W.3
-
18
-
-
34547539413
-
Gammatone features and feature combination for large vocabulary speech recognition
-
IEEE
-
R. Schlüter, L. Bezrukov, H. Wagner, and H. Ney, "Gammatone features and feature combination for large vocabulary speech recognition," in Proc. ICASSP, April 2007, vol. 4, pp. 649-652, IEEE.
-
(2007)
Proc. ICASSP
, vol.4
, pp. 649-652
-
-
Schlüter, R.1
Bezrukov, L.2
Wagner, H.3
Ney, H.4
-
19
-
-
27744588611
-
Framewise phoneme classification with bidirectional LSTM and other neural network architectures
-
July-August
-
A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, IJCNN Special Issue, vol. 18, no. 5-6, pp. 602-610, July-August 2005.
-
(2005)
Neural Networks, IJCNN Special Issue
, vol.18
, Issue.5-6
, pp. 602-610
-
-
Graves, A.1
Schmidhuber, J.2
-
20
-
-
0031573117
-
Long short-term memory
-
S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
-
(1997)
Neural Computation
, vol.9
, Issue.8
, pp. 1735-1780
-
-
Hochreiter, S.1
Schmidhuber, J.2
-
21
-
-
84943197961
-
Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data
-
November
-
F. Ringeval et al., "Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data," Pattern Recognition Letters, vol. 66, pp. 22-30, November 2015.
-
(2015)
Pattern Recognition Letters
, vol.66
, pp. 22-30
-
-
Ringeval, F.1
-
22
-
-
84915817064
-
Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions
-
FG, Shanghai, China
-
F. Ringeval, A. Sonderegger, J. Sauer, and D. Lalanne, "Introducing the RECOLA Multimodal Corpus of Remote Collaborative and Affective Interactions," in Proc. of EmoSPACE, FG, Shanghai, China, 2013.
-
(2013)
Proc. of EmoSPACE
-
-
Ringeval, F.1
Sonderegger, A.2
Sauer, J.3
Lalanne, D.4
-
23
-
-
84947915210
-
The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing
-
in press
-
F. Eyben et al., "The Geneva Minimalistic Acoustic Parameter Set (GeMAPS) for Voice Research and Affective Computing," IEEE Transactions on Affective Computing, 2015, in press.
-
(2015)
IEEE Transactions on Affective Computing
-
-
Eyben, F.1
-
24
-
-
85083951076
-
Adam: A method for stochastic optimization
-
San Diego, USA
-
D. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proc. ICLR, San Diego, USA, 2015.
-
(2015)
Proc. ICLR
-
-
Kingma, D.1
Ba, J.2
-
25
-
-
84904163933
-
Dropout: A simple way to prevent neural networks from overfitting
-
January
-
N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, January 2014.
-
(2014)
Journal of Machine Learning Research
, vol.15
, Issue.1
, pp. 1929-1958
-
-
Srivastava, N.1
Hinton, G.2
Krizhevsky, A.3
Sutskever, I.4
Salakhutdinov, R.5
-
26
-
-
84960847562
-
Ensemble methods for continuous affect recognition: Multimodality, temporality, and challenges
-
Brisbane, Australia, October
-
M. Kächele, P. Thiam, G. Palm, F. Schwenker, and M. Schels, "Ensemble methods for continuous affect recognition: Multimodality, temporality, and challenges," in Proc. AVEC, Fabien Ringeval, Björn Schuller, Michel Valstar, Roddy Cowie, and Maja Pantic, Eds., Brisbane, Australia, October 2015, pp. 9-16.
-
(2015)
Proc. AVEC
, pp. 9-16
-
-
Kächele, M.1
Thiam, P.2
Palm, G.3
Schwenker, F.4
Schels, M.5
-
27
-
-
84930944930
-
Correcting time-continuous emotional labels by modeling the reaction lag of evaluators
-
April-June
-
S. Mariooryad and C. Busso, "Correcting time-continuous emotional labels by modeling the reaction lag of evaluators," IEEE Transactions on Affective Computing, vol. 6, no. 2, pp. 97-108, April-June 2015.
-
(2015)
IEEE Transactions on Affective Computing
, vol.6
, Issue.2
, pp. 97-108
-
-
Mariooryad, S.1
Busso, C.2
-
28
-
-
0037384712
-
Vocal communication of emotion: A review of research paradigms
-
April
-
K. Scherer, "Vocal communication of emotion: A review of research paradigms," Speech Communication, vol. 40, no. 1-2, pp. 227-256, April 2003.
-
(2003)
Speech Communication
, vol.40
, Issue.1-2
, pp. 227-256
-
-
Scherer, K.1