2013, Pages 483-487

Real-life voice activity detection with LSTM Recurrent Neural Networks and an application to Hollywood movies

Author keywords

Long Short Term Memory; Neural Networks; Speech Detection; Voice Activity Detection

Indexed keywords

DATA-DRIVEN APPROACH; EQUAL ERROR RATE; LONG SHORT-TERM MEMORY; LONG-TERM RECORDING; REFERENCE ALGORITHM; SPEECH DETECTION; SPONTANEOUS SPEECH; VOICE ACTIVITY DETECTION;

EID: 84890443834     PISSN: 15206149     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ICASSP.2013.6637694     Document Type: Conference Paper
Times cited: 234

References (26)
  • 1
    • 0033903480
    • Robust voice activity detection algorithm for estimating noise spectrum
    • K. Woo, T. Yang, K. Park, and C. Lee, "Robust voice activity detection algorithm for estimating noise spectrum," IET Electronics Letters, 2000.
    • (2000) IET Electronics Letters
    • Woo, K.1    Yang, T.2    Park, K.3    Lee, C.4
  • 2
    • 79953283970
    • AR-GARCH in presence of noise: Parameter estimation and its application to voice activity detection
    • S. Mousazadeh and I. Cohen, "AR-GARCH in Presence of Noise: Parameter Estimation and Its Application to Voice Activity Detection," IEEE Transactions on Audio Speech and Language Processing, vol. 19, no. 4, pp. 916-926, 2011.
    • (2011) IEEE Transactions on Audio Speech and Language Processing , vol.19 , Issue.4 , pp. 916-926
    • Mousazadeh, S.1    Cohen, I.2
  • 3
    • 84878610785
    • Speech/nonspeech segmentation in web videos
    • Portland, USA. September, ISCA
    • A. Misra, "Speech/nonspeech segmentation in web videos," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
    • (2012) Proc. of INTERSPEECH 2012
    • Misra, A.1
  • 5
    • 0032762471
    • A statistical model-based voice activity detection
    • J. Sohn and N. Kim, "A statistical model-based voice activity detection," IEEE Signal Processing Letters, vol. 6, no. 1, pp. 1-3, 1999.
    • (1999) IEEE Signal Processing Letters , vol.6 , Issue.1 , pp. 1-3
    • Sohn, J.1    Kim, N.2
  • 6
    • 23344452899
    • Statistical voice activity detection using a multiple observation likelihood ratio test
    • J. Ramirez, J. Segura, C. Benitez, L. Garcia, and A. Rubio, "Statistical voice activity detection using a multiple observation likelihood ratio test," IEEE Signal Processing Letters, vol. 12, no. 10, pp. 689-692, 2005.
    • (2005) IEEE Signal Processing Letters , vol.12 , Issue.10 , pp. 689-692
    • Ramirez, J.1    Segura, J.2    Benitez, C.3    Garcia, L.4    Rubio, A.5
  • 7
    • 4544379392
    • On the decision-directed estimation approach of Ephraim and Malah
    • I. Cohen, "On the decision-directed estimation approach of Ephraim and Malah," in Proc. of ICASSP. IEEE, 2004, vol. I, pp. 1-293.
    • (2004) Proc. of ICASSP. IEEE , vol.1 , pp. 1-293
    • Cohen, I.1
  • 8
    • 1842476689
    • Efficient voice activity detection algorithms using long-term speech information
    • J. Ramirez, J. Segura, M. Benitez, A. De La Torre, and A. Rubio, "Efficient voice activity detection algorithms using long-term speech information," Speech Communication, vol. 42, no. 3, pp. 271-287, 2004.
    • (2004) Speech Communication , vol.42 , Issue.3 , pp. 271-287
    • Ramirez, J.1    Segura, J.2    Benitez, M.3    De La Torre, A.4    Rubio, A.5
  • 9
    • 0041360463
    • Noise spectrum estimation in adverse environment: Improved minima controlled recursive averaging
    • I. Cohen, "Noise spectrum estimation in adverse environment: Improved minima controlled recursive averaging," IEEE Trans. Audio Speech Processing, vol. 11, no. 5, pp. 466-475, 2003.
    • (2003) IEEE Trans. Audio Speech Processing , vol.11 , Issue.5 , pp. 466-475
    • Cohen, I.1
  • 11
    • 33745194565
    • Non-linear estimation of voice activity to improve automatic recognition of noisy speech
    • Lisbon, Portugal. September, ISCA
    • R. Gemello, F. Mana, and R.D. Mori, "Non-linear estimation of voice activity to improve automatic recognition of noisy speech," in Proc. of INTERSPEECH 2005, Lisbon, Portugal. September 2005, pp. 2617-2620, ISCA.
    • (2005) Proc. of INTERSPEECH 2005 , pp. 2617-2620
    • Gemello, R.1    Mana, F.2    Mori, R.D.3
  • 13
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • Apr.
    • H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738-1752, Apr. 1990.
    • (1990) Journal of the Acoustical Society of America , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 14
    • 78650977476
    • openSMILE - the Munich versatile and fast open-source audio feature extractor
    • Florence, Italy, ACM
    • F. Eyben, M. Wöllmer, and B. Schuller, "openSMILE - the Munich versatile and fast open-source audio feature extractor," in Proc. ACM Multimedia (MM), Florence, Italy. 2010, pp. 1459-1462, ACM.
    • (2010) Proc. ACM Multimedia (MM) , pp. 1459-1462
    • Eyben, F.1    Wöllmer, M.2    Schuller, B.3
  • 18
    • 80051621128
    • Localization of non-linguistic events in spontaneous speech by non-negative matrix factorization and long short-term memory
    • Prague, Czech Republic
    • F. Weninger, B. Schuller, M. Wöllmer, and G. Rigoll, "Localization of non-linguistic events in spontaneous speech by non-negative matrix factorization and Long Short-Term Memory," in Proc. of ICASSP, Prague, Czech Republic, 2011, pp. 5840-5843.
    • (2011) Proc. of ICASSP , pp. 5840-5843
    • Weninger, F.1    Schuller, B.2    Wöllmer, M.3    Rigoll, G.4
  • 20
    • 84878543378
    • Speaker-dependent voice activity detection robust to background speech noise
    • Portland, USA. September, ISCA
    • S. Matsuda, N. Ito, K. Tsujino, H. Kashioka, and S. Sagayama, "Speaker-dependent voice activity detection robust to background speech noise," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
    • (2012) Proc. of INTERSPEECH 2012
    • Matsuda, S.1    Ito, N.2    Tsujino, K.3    Kashioka, H.4    Sagayama, S.5
  • 21
    • 80051622763
    • A modified MAP criterion based on hidden Markov model for voice activity detection
    • May, IEEE
    • S. Deng, J. Han, T. Zheng, and G. Zheng, "A modified MAP criterion based on hidden Markov model for voice activity detection," in Proc. of ICASSP. May 2011, pp. 5220-5223, IEEE.
    • (2011) Proc. of ICASSP , pp. 5220-5223
    • Deng, S.1    Han, J.2    Zheng, T.3    Zheng, G.4
  • 22
    • 85008579584
    • Multiple acoustic model-based discriminative likelihood ratio weighting for voice activity detection
    • Aug.
    • Y. Suh and H. Kim, "Multiple acoustic model-based discriminative likelihood ratio weighting for voice activity detection," Signal Processing Letters, vol. 19, no. 8, pp. 507-510, Aug. 2012.
    • (2012) Signal Processing Letters , vol.19 , Issue.8 , pp. 507-510
    • Suh, Y.1    Kim, H.2
  • 23
    • 80055089790
    • Frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise robust voice activity detection
    • M. Fujimoto, S. Watanabe, and T. Nakatani, "Frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise robust voice activity detection," Speech Communication, vol. 54, no. 2, pp. 229-244, 2012.
    • (2012) Speech Communication , vol.54 , Issue.2 , pp. 229-244
    • Fujimoto, M.1    Watanabe, S.2    Nakatani, T.3
  • 24
    • 84878548167
    • Speech activity detection for noisy data using adaptation techniques
    • Portland, USA. September, ISCA
    • M.K. Omar, "Speech activity detection for noisy data using adaptation techniques," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
    • (2012) Proc. of INTERSPEECH 2012
    • Omar, M.K.1
  • 25
    • 84878390907
    • Voice activity detection using speech recognizer feedback
    • Portland, USA. September, ISCA
    • K. Thambiratnam, W. Zhu, and F. Seide, "Voice activity detection using speech recognizer feedback," in Proc. of INTERSPEECH 2012, Portland, USA. September 2012, ISCA.
    • (2012) Proc. of INTERSPEECH 2012
    • Thambiratnam, K.1    Zhu, W.2    Seide, F.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.