-
1
-
-
84874909204
-
The PASCAL CHiME speech separation and recognition challenge
-
submitted for publication.
-
Barker, J.P., Vincent, E., Ma, N., Christensen, H., Green, P.D. The PASCAL CHiME speech separation and recognition challenge. Computer Speech and Language, submitted for publication.
-
Computer Speech and Language
-
-
Barker, J.P.1
Vincent, E.2
Ma, N.3
Christensen, H.4
Green, P.D.5
-
3
-
-
33750368310
-
An audio-visual corpus for speech perception and automatic speech recognition
-
M. Cooke, J. Barker, S. Cunningham, and X. Shao An audio-visual corpus for speech perception and automatic speech recognition The Journal of the Acoustical Society of America 120 5 2006 2421 2424
-
(2006)
The Journal of the Acoustical Society of America
, vol.120
, Issue.5
, pp. 2421-2424
-
-
Cooke, M.1
Barker, J.2
Cunningham, S.3
Shao, X.4
-
5
-
-
84873898784
-
Speech recognition in the presence of highly non-stationary noise based on spatial, spectral and temporal speech/noise modeling combined with dynamic variance adaptation.
-
Florence, Italy
-
M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, A. Ogawa, T. Hori, S. Watanabe, M. Fujimoto, T. Yoshioka, T. Oba, Y. Kubo, M. Souden, S.J. Hahm, and A. Nakamura Speech recognition in the presence of highly non-stationary noise based on spatial, spectral and temporal speech/noise modeling combined with dynamic variance adaptation. Proc. of Machine Listening in Multisource Environments (CHiME 2011), Satellite Workshop of Interspeech 2011 Florence, Italy 2011 12 17
-
(2011)
Proc. of Machine Listening in Multisource Environments (CHiME 2011), Satellite Workshop of Interspeech 2011
, pp. 12-17
-
-
Delcroix, M.1
Kinoshita, K.2
Nakatani, T.3
Araki, S.4
Ogawa, A.5
Hori, T.6
Watanabe, S.7
Fujimoto, M.8
Yoshioka, T.9
Oba, T.10
Kubo, Y.11
Souden, M.12
Hahm, S.J.13
Nakamura, A.14
-
6
-
-
0000259511
-
Approximate statistical tests for comparing supervised classification learning algorithms
-
T.G. Dietterich Approximate statistical tests for comparing supervised classification learning algorithms Neural Computation 10 1998 1895 1923
-
(1998)
Neural Computation
, vol.10
, pp. 1895-1923
-
-
Dietterich, T.G.1
-
8
-
-
38149014113
-
An application of recurrent neural networks to discriminative keyword spotting
-
Porto, Portugal
-
S. Fernandez, A. Graves, and J. Schmidhuber An application of recurrent neural networks to discriminative keyword spotting Proc. of ICANN Porto, Portugal 2007 220 229
-
(2007)
Proc. of ICANN
, pp. 220-229
-
-
Fernandez, S.1
Graves, A.2
Schmidhuber, J.3
-
9
-
-
79960657803
-
Exemplar-based sparse representations for noise robust automatic speech recognition
-
J. Gemmeke, T. Virtanen, and A. Hurmalainen Exemplar-based sparse representations for noise robust automatic speech recognition IEEE Transactions on Audio, Speech, and Language Processing 19 7 2011 2067 2080
-
(2011)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.19
, Issue.7
, pp. 2067-2080
-
-
Gemmeke, J.1
Virtanen, T.2
Hurmalainen, A.3
-
10
-
-
84890521030
-
Exemplar-based speech enhancement and its application to noise-robust automatic speech recognition
-
Florence, Italy
-
J.F. Gemmeke, T. Virtanen, and A. Hurmalainen Exemplar-based speech enhancement and its application to noise-robust automatic speech recognition Proc. of CHiME Workshop Florence, Italy 2011 53 57
-
(2011)
Proc. of CHiME Workshop
, pp. 53-57
-
-
Gemmeke, J.F.1
Virtanen, T.2
Hurmalainen, A.3
-
11
-
-
0034293152
-
Learning to forget: continual prediction with LSTM
-
F. Gers, J. Schmidhuber, and F. Cummins Learning to forget: continual prediction with LSTM Neural Computation 12 10 2000 2451 2471
-
(2000)
Neural Computation
, vol.12
, Issue.10
, pp. 2451-2471
-
-
Gers, F.1
Schmidhuber, J.2
Cummins, F.3
-
12
-
-
33749259827
-
Connectionist temporal classification: labelling unsegmented data with recurrent neural networks
-
Pittsburgh, USA
-
A. Graves, S. Fernandez, F. Gomez, and J. Schmidhuber Connectionist temporal classification: labelling unsegmented data with recurrent neural networks Proc. of ICML Pittsburgh, USA 2006 369 376
-
(2006)
Proc. of ICML
, pp. 369-376
-
-
Graves, A.1
Fernandez, S.2
Gomez, F.3
Schmidhuber, J.4
-
13
-
-
85161980569
-
Unconstrained online handwriting recognition with recurrent neural networks
-
A. Graves, S. Fernandez, M. Liwicki, H. Bunke, and J. Schmidhuber Unconstrained online handwriting recognition with recurrent neural networks Advances in Neural Information Processing Systems 20 2008 1 8
-
(2008)
Advances in Neural Information Processing Systems
, vol.20
, pp. 1-8
-
-
Graves, A.1
Fernandez, S.2
Liwicki, M.3
Bunke, H.4
Schmidhuber, J.5
-
14
-
-
27744588611
-
Framewise phoneme classification with bidirectional LSTM and other neural network architectures
-
A. Graves, and J. Schmidhuber Framewise phoneme classification with bidirectional LSTM and other neural network architectures Neural Networks 18 5-6 2005 602 610
-
(2005)
Neural Networks
, vol.18
, Issue.5-6
, pp. 602-610
-
-
Graves, A.1
Schmidhuber, J.2
-
16
-
-
84863690059
-
Separation of drums from polyphonic music using non-negative matrix factorization and support vector machine
-
Antalya, Turkey
-
M. Helen, and T. Virtanen Separation of drums from polyphonic music using non-negative matrix factorization and support vector machine Proc. of EUSIPCO Antalya, Turkey 2005
-
(2005)
Proc. of EUSIPCO
-
-
Helen, M.1
Virtanen, T.2
-
17
-
-
0033709098
-
Tandem connectionist feature extraction for conventional HMM systems
-
Istanbul, Turkey
-
H. Hermansky, D.P.W. Ellis, and S. Sharma Tandem connectionist feature extraction for conventional HMM systems Proc. of ICASSP Istanbul, Turkey 2000 1635 1638
-
(2000)
Proc. of ICASSP
, pp. 1635-1638
-
-
Hermansky, H.1
Ellis, D.P.W.2
Sharma, S.3
-
18
-
-
0041914606
-
Gradient flow in recurrent nets: the difficulty of learning long-term dependencies
-
S.C. Kremer, J.F. Kolen, IEEE Press
-
S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber Gradient flow in recurrent nets: the difficulty of learning long-term dependencies S.C. Kremer, J.F. Kolen, A Field Guide to Dynamical Recurrent Neural Networks 2001 IEEE Press 1 15
-
(2001)
A Field Guide to Dynamical Recurrent Neural Networks
, pp. 1-15
-
-
Hochreiter, S.1
Bengio, Y.2
Frasconi, P.3
Schmidhuber, J.4
-
20
-
-
84869754173
-
Exemplar-based recognition of speech in highly variable noise
-
Florence, Italy
-
A. Hurmalainen, K. Mahkonen, J.F. Gemmeke, and T. Virtanen Exemplar-based recognition of speech in highly variable noise Proc. of Machine Listening in Multisource Environments (CHiME 2011), Satellite Workshop of Interspeech 2011 Florence, Italy 2011 1 5
-
(2011)
Proc. of Machine Listening in Multisource Environments (CHiME 2011), Satellite Workshop of Interspeech 2011
, pp. 1-5
-
-
Hurmalainen, A.1
Mahkonen, K.2
Gemmeke, J.F.3
Virtanen, T.4
-
21
-
-
1842436050
-
The echo state approach to analyzing and training recurrent neural networks
-
Tech. Rep. German National Research Center for Information Technology, Bremen
-
Jaeger, H., 2001. The echo state approach to analyzing and training recurrent neural networks. Tech. Rep. German National Research Center for Information Technology, Bremen (Tech. Rep. No. 148).
-
(2001)
Tech. Rep. No. 148
-
-
Jaeger, H.1
-
22
-
-
0025254722
-
A time-delay neural network architecture for isolated word recognition
-
K.J. Lang, A.H. Waibel, and G.E. Hinton A time-delay neural network architecture for isolated word recognition Neural Networks 3 1 1990 23 43
-
(1990)
Neural Networks
, vol.3
, Issue.1
, pp. 23-43
-
-
Lang, K.J.1
Waibel, A.H.2
Hinton, G.E.3
-
23
-
-
33646241633
-
Learning long-term dependencies in NARX recurrent neural networks
-
T. Lin, B.G. Horne, P. Tino, and C.L. Giles Learning long-term dependencies in NARX recurrent neural networks IEEE Transactions on Neural Networks 7 6 1996 1329 1338
-
(1996)
IEEE Transactions on Neural Networks
, vol.7
, Issue.6
, pp. 1329-1338
-
-
Lin, T.1
Horne, B.G.2
Tino, P.3
Giles, C.L.4
-
25
-
-
84865736185
-
Phoneme-dependent NMF for speech enhancement in monaural mixtures
-
ISCA, Florence, Italy
-
B. Raj, R. Singh, and T. Virtanen Phoneme-dependent NMF for speech enhancement in monaural mixtures Proc. of Interspeech ISCA, Florence, Italy 2011 1217 1220
-
(2011)
Proc. of Interspeech
, pp. 1217-1220
-
-
Raj, B.1
Singh, R.2
Virtanen, T.3
-
26
-
-
79959818117
-
Non-negative matrix factorization based compensation of music for automatic speech recognition
-
Makuhari, Japan
-
B. Raj, T. Virtanen, S. Chaudhuri, and R. Singh Non-negative matrix factorization based compensation of music for automatic speech recognition Proc. of Interspeech Makuhari, Japan 2010 717 720
-
(2010)
Proc. of Interspeech
, pp. 717-720
-
-
Raj, B.1
Virtanen, T.2
Chaudhuri, S.3
Singh, R.4
-
27
-
-
51449100115
-
Efficient model-based speech separation and denoising using non-negative subspace analysis
-
Las Vegas, NV, USA
-
S.J. Rennie, J.R. Hershey, and P.A. Olsen Efficient model-based speech separation and denoising using non-negative subspace analysis Proc. of ICASSP Las Vegas, NV, USA 2008 1833 1836
-
(2008)
Proc. of ICASSP
, pp. 1833-1836
-
-
Rennie, S.J.1
Hershey, J.R.2
Olsen, P.A.3
-
28
-
-
56449109755
-
Learning long-term dependencies with recurrent neural networks
-
A.M. Schaefer, S. Udluft, and H.G. Zimmermann Learning long-term dependencies with recurrent neural networks Neurocomputing 71 13-15 2008 2481 2488
-
(2008)
Neurocomputing
, vol.71
, Issue.13-15
, pp. 2481-2488
-
-
Schaefer, A.M.1
Udluft, S.2
Zimmermann, H.G.3
-
29
-
-
0001033889
-
Learning complex extended sequences using the principle of history compression
-
J. Schmidhuber Learning complex extended sequences using the principle of history compression Neural Computation 4 2 1992 234 242
-
(1992)
Neural Computation
, vol.4
, Issue.2
, pp. 234-242
-
-
Schmidhuber, J.1
-
30
-
-
44949110218
-
Single-channel speech separation using sparse non-negative matrix factorization
-
Pittsburgh, PA, USA
-
M.N. Schmidt, and R.K. Olsson Single-channel speech separation using sparse non-negative matrix factorization Proc. of Interspeech Pittsburgh, PA, USA 2006
-
(2006)
Proc. of Interspeech
-
-
Schmidt, M.N.1
Olsson, R.K.2
-
31
-
-
67650135931
-
Recognition of noisy speech: a comparative survey of robust model architecture and feature enhancement
-
B. Schuller, M. Wöllmer, T. Moosmayr, and G. Rigoll Recognition of noisy speech: a comparative survey of robust model architecture and feature enhancement EURASIP Journal on Audio, Speech, and Music Processing 2009 (ID 942617)
-
(2009)
EURASIP Journal on Audio, Speech, and Music Processing
-
-
Schuller, B.1
Wöllmer, M.2
Moosmayr, T.3
Rigoll, G.4
-
33
-
-
78049383291
-
Discovering auditory objects through non-negativity constraints
-
Jeju, Korea
-
P. Smaragdis Discovering auditory objects through non-negativity constraints Proc. of SAPA Jeju, Korea 2004
-
(2004)
Proc. of SAPA
-
-
Smaragdis, P.1
-
34
-
-
38049021850
-
Convolutive speech bases and their application to supervised speech separation
-
P. Smaragdis Convolutive speech bases and their application to supervised speech separation IEEE Transactions on Audio, Speech, and Language Processing 15 1 2007 1 14
-
(2007)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.15
, Issue.1
, pp. 1-14
-
-
Smaragdis, P.1
-
35
-
-
67650142420
-
A multiplicative algorithm for convolutive non-negative matrix factorization based on squared Euclidean distance
-
W. Wang, A. Cichocki, and J.A. Chambers A multiplicative algorithm for convolutive non-negative matrix factorization based on squared Euclidean distance IEEE Transactions on Signal Processing 57 7 2009 July 2858 2864
-
(2009)
IEEE Transactions on Signal Processing
, vol.57
, Issue.7
, pp. 2858-2864
-
-
Wang, W.1
Cichocki, A.2
Chambers, J.A.3
-
36
-
-
84857258863
-
The Munich 2011 CHiME Challenge Contribution: NMF-BLSTM Speech Enhancement and Recognition for Reverberated Multisource Environments
-
Florence, Italy
-
F. Weninger, J. Geiger, M. Wöllmer, B. Schuller, and G. Rigoll The Munich 2011 CHiME Challenge Contribution: NMF-BLSTM Speech Enhancement and Recognition for Reverberated Multisource Environments Proc. of Machine Listening in Multisource Environments (CHiME 2011), Satellite Workshop of Interspeech 2011 Florence, Italy 2011 24 29
-
(2011)
Proc. of Machine Listening in Multisource Environments (CHiME 2011), Satellite Workshop of Interspeech 2011
, pp. 24-29
-
-
Weninger, F.1
Geiger, J.2
Wöllmer, M.3
Schuller, B.4
Rigoll, G.5
-
37
-
-
80051618211
-
openBliSSART: design and evaluation of a research toolkit for blind source separation in audio recognition tasks
-
Prague, Czech Republic
-
F. Weninger, A. Lehmann, and B. Schuller openBliSSART: design and evaluation of a research toolkit for blind source separation in audio recognition tasks Proc. of ICASSP Prague, Czech Republic 2011 1625 1628
-
(2011)
Proc. of ICASSP
, pp. 1625-1628
-
-
Weninger, F.1
Lehmann, A.2
Schuller, B.3
-
38
-
-
84867600087
-
Non-Negative Matrix Factorization for Highly Noise-Robust ASR: to Enhance or to Recognize?
-
Kyoto, Japan
-
F. Weninger, M. Wöllmer, J. Geiger, B. Schuller, J. Gemmeke, A. Hurmalainen, T. Virtanen, and G. Rigoll Non-Negative Matrix Factorization for Highly Noise-Robust ASR: to Enhance or to Recognize? Proc. of ICASSP Kyoto, Japan 2012 4681 4684
-
(2012)
Proc. of ICASSP
, pp. 4681-4684
-
-
Weninger, F.1
Wöllmer, M.2
Geiger, J.3
Schuller, B.4
Gemmeke, J.5
Hurmalainen, A.6
Virtanen, T.7
Rigoll, G.8
-
39
-
-
51449092704
-
Speech denoising using nonnegative matrix factorization with priors
-
Las Vegas, NV, USA
-
K.W. Wilson, B. Raj, P. Smaragdis, and A. Divakaran Speech denoising using nonnegative matrix factorization with priors Proc. of ICASSP Las Vegas, NV, USA 2008 4029 4032
-
(2008)
Proc. of ICASSP
, pp. 4029-4032
-
-
Wilson, K.W.1
Raj, B.2
Smaragdis, P.3
Divakaran, A.4
-
40
-
-
79958176949
-
On-line driver distraction detection using long short-term memory
-
M. Wöllmer, C. Blaschke, T. Schindl, B. Schuller, B. Färber, S. Mayer, and B. Trefflich On-line driver distraction detection using long short-term memory IEEE Transactions on Intelligent Transportation Systems 12 2 2011 574 582
-
(2011)
IEEE Transactions on Intelligent Transportation Systems
, vol.12
, Issue.2
, pp. 574-582
-
-
Wöllmer, M.1
Blaschke, C.2
Schindl, T.3
Schuller, B.4
Färber, B.5
Mayer, S.6
Trefflich, B.7
-
41
-
-
78651563436
-
Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework
-
M. Wöllmer, F. Eyben, A. Graves, B. Schuller, and G. Rigoll Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework Cognitive Computation 2 3 2010 180 190
-
(2010)
Cognitive Computation
, vol.2
, Issue.3
, pp. 180-190
-
-
Wöllmer, M.1
Eyben, F.2
Graves, A.3
Schuller, B.4
Rigoll, G.5
-
42
-
-
80051637579
-
A multi-stream ASR framework for BLSTM modeling of conversational speech
-
Prague, Czech Republic
-
M. Wöllmer, F. Eyben, B. Schuller, and G. Rigoll A multi-stream ASR framework for BLSTM modeling of conversational speech Proc. of ICASSP Prague, Czech Republic 2011 4860 4863
-
(2011)
Proc. of ICASSP
, pp. 4860-4863
-
-
Wöllmer, M.1
Eyben, F.2
Schuller, B.3
Rigoll, G.4
-
43
-
-
81155123235
-
Enhancing spontaneous speech recognition with BLSTM features
-
Las Palmas de Gran Canaria, Spain
-
M. Wöllmer, and B. Schuller Enhancing spontaneous speech recognition with BLSTM features Proc. of NOLISP Las Palmas de Gran Canaria, Spain 2011 17 24
-
(2011)
Proc. of NOLISP
, pp. 17-24
-
-
Wöllmer, M.1
Schuller, B.2
-
44
-
-
77956721304
-
Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening
-
M. Wöllmer, B. Schuller, F. Eyben, and G. Rigoll Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening IEEE Journal of Selected Topics in Signal Processing 4 5 2010 867 881
-
(2010)
IEEE Journal of Selected Topics in Signal Processing
, vol.4
, Issue.5
, pp. 867-881
-
-
Wöllmer, M.1
Schuller, B.2
Eyben, F.3
Rigoll, G.4
-
45
-
-
84865748400
-
Feature frame stacking in RNN-based Tandem ASR systems - learned vs. predefined context
-
Florence, Italy
-
M. Wöllmer, B. Schuller, and G. Rigoll Feature frame stacking in RNN-based Tandem ASR systems - learned vs. predefined context Proc. of Interspeech Florence, Italy 2011 1233 1236
-
(2011)
Proc. of Interspeech
, pp. 1233-1236
-
-
Wöllmer, M.1
Schuller, B.2
Rigoll, G.3