-
1
-
-
0033903480
-
Robust voice activity detection algorithm for estimating noise spectrum
-
K. Woo, T. Yang, K. Park, and C. Lee, "Robust voice activity detection algorithm for estimating noise spectrum," IET Electronics Letters, 2000.
-
(2000)
IET Electronics Letters
-
-
Woo, K.1
Yang, T.2
Park, K.3
Lee, C.4
-
2
-
-
79953283970
-
AR-GARCH in presence of noise: Parameter estimation and its application to voice activity detection
-
S. Mousazadeh and I. Cohen, "AR-GARCH in Presence of Noise: Parameter Estimation and Its Application to Voice Activity Detection," IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 4, pp. 916-926, 2011.
-
(2011)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.19
, Issue.4
, pp. 916-926
-
-
Mousazadeh, S.1
Cohen, I.2
-
3
-
-
84878610785
-
Speech/nonspeech segmentation in web videos
-
Portland, USA. September, ISCA
-
A. Misra, "Speech/nonspeech segmentation in web videos," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
-
(2012)
Proc. of INTERSPEECH 2012
-
-
Misra, A.1
-
4
-
-
84878535284
-
Developing a speech activity detection system for the DARPA RATS program
-
Portland, USA. September, ISCA
-
T. Ng, B. Zhang, L. Nguyen, S. Matsoukas, X. Zhou, N. Mesgarani, K. Veselý, and P. Matějka, "Developing a speech activity detection system for the DARPA RATS program," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
-
(2012)
Proc. of INTERSPEECH 2012
-
-
Ng, T.1
Zhang, B.2
Nguyen, L.3
Matsoukas, S.4
Zhou, X.5
Mesgarani, N.6
Veselý, K.7
Matějka, P.8
-
5
-
-
0032762471
-
A statistical model-based voice activity detection
-
J. Sohn and N. Kim, "A statistical model-based voice activity detection," IEEE Signal Processing Letters, vol. 6, no. 1, pp. 1-3, 1999.
-
(1999)
IEEE Signal Processing Letters
, vol.6
, Issue.1
, pp. 1-3
-
-
Sohn, J.1
Kim, N.2
-
6
-
-
23344452899
-
Statistical voice activity detection using a multiple observation likelihood ratio test
-
J. Ramirez, J. Segura, C. Benitez, L. Garcia, and A. Rubio, "Statistical voice activity detection using a multiple observation likelihood ratio test," IEEE Signal Processing Letters, vol. 12, no. 10, pp. 689-692, 2005.
-
(2005)
IEEE Signal Processing Letters
, vol.12
, Issue.10
, pp. 689-692
-
-
Ramirez, J.1
Segura, J.2
Benitez, C.3
Garcia, L.4
Rubio, A.5
-
7
-
-
4544379392
-
On the decision-directed estimation approach of Ephraim and Malah
-
I. Cohen, "On the decision-directed estimation approach of Ephraim and Malah," in Proc. of ICASSP. IEEE, 2004, vol. I, pp. I-293.
-
(2004)
Proc. of ICASSP. IEEE
, vol.1
, pp. I-293
-
-
Cohen, I.1
-
8
-
-
1842476689
-
Efficient voice activity detection algorithms using long-term speech information
-
J. Ramirez, J. Segura, M. Benitez, A. De La Torre, and A. Rubio, "Efficient voice activity detection algorithms using long-term speech information," Speech Communication, vol. 42, no. 3, pp. 271-287, 2004.
-
(2004)
Speech Communication
, vol.42
, Issue.3
, pp. 271-287
-
-
Ramirez, J.1
Segura, J.2
Benitez, M.3
De La Torre, A.4
Rubio, A.5
-
9
-
-
0041360463
-
Noise spectrum estimation in adverse environment: Improved minima controlled recursive averaging
-
I. Cohen, "Noise spectrum estimation in adverse environment: Improved minima controlled recursive averaging," IEEE Trans. Speech and Audio Processing, vol. 11, no. 5, pp. 466-475, 2003.
-
(2003)
IEEE Trans. Speech and Audio Processing
, vol.11
, Issue.5
, pp. 466-475
-
-
Cohen, I.1
-
10
-
-
0031573117
-
Long short-term memory
-
S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
-
(1997)
Neural Computation
, vol.9
, Issue.8
, pp. 1735-1780
-
-
Hochreiter, S.1
Schmidhuber, J.2
-
11
-
-
33745194565
-
Non-linear estimation of voice activity to improve automatic recognition of noisy speech
-
Lisbon, Portugal. September, ISCA
-
R. Gemello, F. Mana, and R. De Mori, "Non-linear estimation of voice activity to improve automatic recognition of noisy speech," in Proc. of INTERSPEECH 2005, Lisbon, Portugal. September 2005, pp. 2617-2620, ISCA.
-
(2005)
Proc. of INTERSPEECH 2005
, pp. 2617-2620
-
-
Gemello, R.1
Mana, F.2
De Mori, R.3
-
12
-
-
0041914606
-
Gradient flow in recurrent nets: The difficulty of learning long-term dependencies
-
S.C. Kremer and J.F. Kolen, Eds., IEEE Press
-
S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, "Gradient flow in recurrent nets: The difficulty of learning long-term dependencies," in A Field Guide to Dynamical Recurrent Neural Networks, S.C. Kremer and J.F. Kolen, Eds. IEEE Press, 2001.
-
(2001)
A Field Guide to Dynamical Recurrent Neural Networks
-
-
Hochreiter, S.1
Bengio, Y.2
Frasconi, P.3
Schmidhuber, J.4
-
13
-
-
0025041264
-
Perceptual linear predictive (PLP) analysis of speech
-
Apr.
-
H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738-1752, Apr. 1990.
-
(1990)
Journal of the Acoustical Society of America
, vol.87
, Issue.4
, pp. 1738-1752
-
-
Hermansky, H.1
-
14
-
-
78650977476
-
openSMILE - the Munich versatile and fast open-source audio feature extractor
-
Florence, Italy, ACM
-
F. Eyben, M. Wöllmer, and B. Schuller, "openSMILE - the Munich versatile and fast open-source audio feature extractor," in Proc. ACM Multimedia (MM), Florence, Italy. 2010, pp. 1459-1462, ACM.
-
(2010)
Proc. ACM Multimedia (MM)
, pp. 1459-1462
-
-
Eyben, F.1
Wöllmer, M.2
Schuller, B.3
-
15
-
-
70349287581
-
Multidimensional recurrent neural networks
-
Porto, Portugal, September
-
A. Graves, S. Fernández, and J. Schmidhuber, "Multidimensional recurrent neural networks," in Proc. of the 2007 International Conference on Artificial Neural Networks, Porto, Portugal, September 2007.
-
(2007)
Proc. of the 2007 International Conference on Artificial Neural Networks
-
-
Graves, A.1
Fernández, S.2
Schmidhuber, J.3
-
16
-
-
51449106187
-
-
Department of Psychology, Ohio State University (Distributor), Columbus, OH, USA
-
M.A. Pitt, L. Dilley, K. Johnson, S. Kiesling, W. Raymond, E. Hume, and E. Fosler-Lussier, Buckeye Corpus of Conversational Speech (2nd release), Department of Psychology, Ohio State University (Distributor), Columbus, OH, USA, 2007, [www.buckeyecorpus.osu.edu].
-
(2007)
Buckeye Corpus of Conversational Speech (2nd Release)
-
-
Pitt, M.A.1
Dilley, L.2
Johnson, K.3
Kiesling, S.4
Raymond, W.5
Hume, E.6
Fosler-Lussier, E.7
-
17
-
-
0003548585
-
-
J.S. Garofolo, L.F. Lamel, W.M. Fisher, J.G. Fiscus, D.S. Pallett, N.L. Dahlgren, and V. Zue, "TIMIT acoustic-phonetic continuous speech corpus," 1993.
-
(1993)
TIMIT Acoustic-phonetic Continuous Speech Corpus
-
-
Garofolo, J.S.1
Lamel, L.F.2
Fisher, W.M.3
Fiscus, J.G.4
Pallett, D.S.5
Dahlgren, N.L.6
Zue, V.7
-
18
-
-
80051621128
-
Localization of non-linguistic events in spontaneous speech by non-negative matrix factorization and long short-term memory
-
Prague, Czech Republic
-
F. Weninger, B. Schuller, M. Wöllmer, and G. Rigoll, "Localization of non-linguistic events in spontaneous speech by non-negative matrix factorization and Long Short-Term Memory," in Proc. of ICASSP, Prague, Czech Republic, 2011, pp. 5840-5843.
-
(2011)
Proc. of ICASSP
, pp. 5840-5843
-
-
Weninger, F.1
Schuller, B.2
Wöllmer, M.3
Rigoll, G.4
-
19
-
-
84877658023
-
The MediaEval 2012 Affect Task: Violent scenes detection in Hollywood movies
-
Pisa, Italy
-
C.H. Demarty, C. Penet, G. Gravier, and M. Soleymani, "The MediaEval 2012 Affect Task: Violent scenes detection in Hollywood Movies," in Proc. of MediaEval 2012 Workshop, Pisa, Italy, 2012.
-
(2012)
Proc. of MediaEval 2012 Workshop
-
-
Demarty, C.H.1
Penet, C.2
Gravier, G.3
Soleymani, M.4
-
20
-
-
84878543378
-
Speaker-dependent voice activity detection robust to background speech noise
-
Portland, USA. September, ISCA
-
S. Matsuda, N. Ito, K. Tsujino, H. Kashioka, and S. Sagayama, "Speaker-dependent voice activity detection robust to background speech noise," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
-
(2012)
Proc. of INTERSPEECH 2012
-
-
Matsuda, S.1
Ito, N.2
Tsujino, K.3
Kashioka, H.4
Sagayama, S.5
-
21
-
-
80051622763
-
A modified MAP criterion based on hidden Markov model for voice activity detection
-
May, IEEE
-
S. Deng, J. Han, T. Zheng, and G. Zheng, "A modified MAP criterion based on hidden Markov model for voice activity detection," in Proc. of ICASSP. May 2011, pp. 5220-5223, IEEE.
-
(2011)
Proc. of ICASSP
, pp. 5220-5223
-
-
Deng, S.1
Han, J.2
Zheng, T.3
Zheng, G.4
-
22
-
-
85008579584
-
Multiple acoustic model-based discriminative likelihood ratio weighting for voice activity detection
-
Aug.
-
Y. Suh and H. Kim, "Multiple acoustic model-based discriminative likelihood ratio weighting for voice activity detection," IEEE Signal Processing Letters, vol. 19, no. 8, pp. 507-510, Aug. 2012.
-
(2012)
IEEE Signal Processing Letters
, vol.19
, Issue.8
, pp. 507-510
-
-
Suh, Y.1
Kim, H.2
-
23
-
-
80055089790
-
Frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise robust voice activity detection
-
M. Fujimoto, S. Watanabe, and T. Nakatani, "Frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise robust voice activity detection," Speech Communication, vol. 54, no. 2, pp. 229-244, 2012.
-
(2012)
Speech Communication
, vol.54
, Issue.2
, pp. 229-244
-
-
Fujimoto, M.1
Watanabe, S.2
Nakatani, T.3
-
24
-
-
84878548167
-
Speech activity detection for noisy data using adaptation techniques
-
Portland, USA. September, ISCA
-
M.K. Omar, "Speech activity detection for noisy data using adaptation techniques," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
-
(2012)
Proc. of INTERSPEECH 2012
-
-
Omar, M.K.1
-
25
-
-
84878390907
-
Voice activity detection using speech recognizer feedback
-
Portland, USA. September, ISCA
-
K. Thambiratnam, W. Zhu, and F. Seide, "Voice activity detection using speech recognizer feedback," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
-
(2012)
Proc. of INTERSPEECH 2012
-
-
Thambiratnam, K.1
Zhu, W.2
Seide, F.3
-
26
-
-
84878590831
-
Acoustic and data-driven features for robust speech activity detection
-
Portland, USA. September, ISCA
-
S. Thomas, S.H. Mallidi, T. Janu, H. Hermansky, N. Mesgarani, X. Zhou, S. Shamma, T. Ng, B. Zhang, L. Nguyen, and S. Matsoukas, "Acoustic and data-driven features for robust speech activity detection," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
-
(2012)
Proc. of INTERSPEECH 2012
-
-
Thomas, S.1
Mallidi, S.H.2
Janu, T.3
Hermansky, H.4
Mesgarani, N.5
Zhou, X.6
Shamma, S.7
Ng, T.8
Zhang, B.9
Nguyen, L.10
Matsoukas, S.11