2013, Pages 1668-1672

Detecting overlapping speech with long short-term memory recurrent neural networks

Author keywords

Long short term memory; Neural networks; Speaker diarization; Speech overlap detection

Indexed keywords

BRAIN; FORECASTING; NEURAL NETWORKS; RECURRENT NEURAL NETWORKS

EID: 84906242216     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 49

References (28)
  • 1
    • 33745224103
    • Spontaneous speech: How people really talk and why engineers should care
    • Lisbon, Portugal
    • E. Shriberg, "Spontaneous Speech: How People Really Talk and Why Engineers Should Care," in Proc. Eurospeech, Lisbon, Portugal, 2005, pp. 1781-1784.
    • (2005) Proc. Eurospeech , pp. 1781-1784
    • Shriberg, E.1
  • 2
    • 84878379997
    • On the dynamics of overlap in multi-party conversation
    • Portland, OR, USA
    • K. Laskowski, M. Heldner, and J. Edlund, "On the Dynamics of Overlap in Multi-Party Conversation," in Proc. Interspeech, Portland, OR, USA, 2012.
    • (2012) Proc. Interspeech
    • Laskowski, K.1    Heldner, M.2    Edlund, J.3
  • 3
    • 84878390551
    • Temporal entrainment in overlapped speech: Cross-linguistic study
    • Portland, OR, USA
    • M. Wlodarczak, J. Simko, and P. Wagner, "Temporal entrainment in overlapped speech: Cross-linguistic study," in Proc. Interspeech, Portland, OR, USA, 2012.
    • (2012) Proc. Interspeech
    • Wlodarczak, M.1    Simko, J.2    Wagner, P.3
  • 4
    • 79952619484
    • Turn-taking cues in task-oriented dialogue
    • A. Gravano and J. Hirschberg, "Turn-taking cues in task-oriented dialogue," Computer Speech & Language, vol. 25, no. 3, pp. 601-634, 2011.
    • (2011) Computer Speech & Language , vol.25 , Issue.3 , pp. 601-634
    • Gravano, A.1    Hirschberg, J.2
  • 5
    • 85009145345
    • Observations on overlap: Findings and implications for automatic processing of multi-party conversation
    • Aalborg, Denmark
    • E. Shriberg, A. Stolcke, and D. Baron, "Observations on Overlap: Findings and Implications for Automatic Processing of Multi-Party Conversation," in Proc. Eurospeech, Aalborg, Denmark, 2001, pp. 1359-1362.
    • (2001) Proc. Eurospeech , pp. 1359-1362
    • Shriberg, E.1    Stolcke, A.2    Baron, D.3
  • 6
  • 7
    • 84865726897
    • Study of overlapped speech detection for NIST SRE summed channel speaker recognition
    • Florence, Italy
    • H. Sun and B. Ma, "Study of Overlapped Speech Detection for NIST SRE Summed Channel Speaker Recognition," in Proc. Interspeech, Florence, Italy, 2011, pp. 2345-2348.
    • (2011) Proc. Interspeech , pp. 2345-2348
    • Sun, H.1    Ma, B.2
  • 9
    • 51449111990
    • Overlapped speech detection for improved diarization in multi-party meetings
    • Las Vegas, NV, USA
    • K. Boakye, B. Trueba-Hornero, O. Vinyals, and G. Friedland, "Overlapped Speech Detection for Improved Diarization in Multi-Party Meetings," in Proc. ICASSP, Las Vegas, NV, USA, 2008, pp. 4353-4356.
    • (2008) Proc. ICASSP , pp. 4353-4356
    • Boakye, K.1    Trueba-Hornero, B.2    Vinyals, O.3    Friedland, G.4
  • 10
    • 84865798489
    • The detection of overlapping speech with prosodic features for speaker diarization
    • Florence, Italy
    • M. Zelenak and J. Hernando, "The Detection of Overlapping Speech with Prosodic Features for Speaker Diarization," in Proc. Interspeech, Florence, Italy, 2011, pp. 1041-1044.
    • (2011) Proc. Interspeech , pp. 1041-1044
    • Zelenak, M.1    Hernando, J.2
  • 12
    • 84867599879
    • Speech overlap detection and attribution using convolutive non-negative sparse coding
    • Kyoto, Japan
    • R. Vipperla, J. Geiger, S. Bozonnet, D. Wang, N. Evans, B. Schuller, and G. Rigoll, "Speech Overlap Detection and Attribution Using Convolutive Non-Negative Sparse Coding," in Proc. ICASSP, Kyoto, Japan, 2012, pp. 4181-4184.
    • (2012) Proc. ICASSP , pp. 4181-4184
    • Vipperla, R.1    Geiger, J.2    Bozonnet, S.3    Wang, D.4    Evans, N.5    Schuller, B.6    Rigoll, G.7
  • 13
    • 84878541035
    • Convolutive non-negative sparse coding and new features for speech overlap handling in speaker diarization
    • Portland, OR, USA
    • J. Geiger, R. Vipperla, S. Bozonnet, N. Evans, B. Schuller, and G. Rigoll, "Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker Diarization," in Proc. Interspeech, Portland, OR, USA, 2012.
    • (2012) Proc. Interspeech
    • Geiger, J.1    Vipperla, R.2    Bozonnet, S.3    Evans, N.4    Schuller, B.5    Rigoll, G.6
  • 14
    • 84869791486
    • Speech overlap detection and attribution using convolutive non-negative sparse coding: New improvements and insights
    • Bucharest, Romania
    • J. Geiger, R. Vipperla, N. Evans, B. Schuller, and G. Rigoll, "Speech Overlap Detection and Attribution Using Convolutive Non-Negative Sparse Coding: New Improvements and Insights," in Proc. EUSIPCO, Bucharest, Romania, 2012, pp. 340-344.
    • (2012) Proc. EUSIPCO , pp. 340-344
    • Geiger, J.1    Vipperla, R.2    Evans, N.3    Schuller, B.4    Rigoll, G.5
  • 15
    • 84878382961
    • Speaker diarization of overlapping speech based on silence distribution in meeting recordings
    • Portland, OR, USA
    • S. H. Yella and F. Valente, "Speaker Diarization of Overlapping Speech based on Silence Distribution in Meeting Recordings," in Proc. Interspeech, Portland, OR, USA, 2012.
    • (2012) Proc. Interspeech
    • Yella, S.H.1    Valente, F.2
  • 16
    • 84890508572
    • Improved overlap speech diarization of meeting recordings using long-term conversational features
    • Vancouver, Canada
    • S. H. Yella and H. Bourlard, "Improved Overlap Speech Diarization of Meeting Recordings using Long-Term Conversational Features," in Proc. ICASSP, Vancouver, Canada, 2013.
    • (2013) Proc. ICASSP
    • Yella, S.H.1    Bourlard, H.2
  • 17
    • 27144509179
    • Learning long-term temporal features in LVCSR using neural networks
    • Jeju Island, Korea
    • B. Chen, Q. Zhu, and N. Morgan, "Learning long-term temporal features in LVCSR using neural networks," in Proc. Interspeech, Jeju Island, Korea, 2004, pp. 612-615.
    • (2004) Proc. Interspeech , pp. 612-615
    • Chen, B.1    Zhu, Q.2    Morgan, N.3
  • 18
    • 0031573117
    • Long short-term memory
    • S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
    • (1997) Neural Computation , vol.9 , Issue.8 , pp. 1735-1780
    • Hochreiter, S.1    Schmidhuber, J.2
  • 19
    • 84890443834
    • Real-life voice activity detection with LSTM recurrent neural networks and an application to Hollywood movies
    • Vancouver, Canada
    • F. Eyben, F. Weninger, S. Squartini, and B. Schuller, "Real-life Voice Activity Detection with LSTM Recurrent Neural Networks and an Application to Hollywood Movies," in Proc. ICASSP, Vancouver, Canada, 2013.
    • (2013) Proc. ICASSP
    • Eyben, F.1    Weninger, F.2    Squartini, S.3    Schuller, B.4
  • 20
    • 78650977476
    • openSMILE: The Munich versatile and fast open-source audio feature extractor
    • Florence, Italy
    • F. Eyben, M. Wöllmer, and B. Schuller, "openSMILE: The Munich Versatile and Fast Open-Source Audio Feature Extractor," in Proc. ACM Multimedia (MM), Florence, Italy, 2010, pp. 1459-1462.
    • (2010) Proc. ACM Multimedia (MM) , pp. 1459-1462
    • Eyben, F.1    Wöllmer, M.2    Schuller, B.3
  • 21
    • 84900510076
    • Non-negative matrix factorization with sparseness constraints
    • P. O. Hoyer, "Non-negative Matrix Factorization with Sparseness Constraints," Journal of Machine Learning Research, vol. 5, pp. 1457-1469, 2004.
    • (2004) Journal of Machine Learning Research , vol.5 , pp. 1457-1469
    • Hoyer, P.O.1
  • 22
    • 38049021850
    • Convolutive speech bases and their application to supervised speech separation
    • P. Smaragdis, "Convolutive Speech Bases and Their Application to Supervised Speech Separation," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 1, pp. 1-12, 2007.
    • (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.1 , pp. 1-12
    • Smaragdis, P.1
  • 23
    • 84865777049
    • Online pattern learning for non-negative convolutive sparse coding
    • Florence, Italy
    • D. Wang, R. Vipperla, and N. Evans, "Online pattern learning for non-negative convolutive sparse coding," in Proc. Interspeech, Florence, Italy, 2011, pp. 65-68.
    • (2011) Proc. Interspeech , pp. 65-68
    • Wang, D.1    Vipperla, R.2    Evans, N.3
  • 24
    • 84870974763
    • Online non-negative convolutive pattern learning for speech signals
    • D. Wang, R. Vipperla, N. Evans, and T. F. Zheng, "Online non-negative convolutive pattern learning for speech signals," IEEE Transactions on Signal Processing, vol. 61, no. 1, pp. 44-56, 2013.
    • (2013) IEEE Transactions on Signal Processing , vol.61 , Issue.1 , pp. 44-56
    • Wang, D.1    Vipperla, R.2    Evans, N.3    Zheng, T.F.4
  • 25
    • 78049378635
    • The LIAEurecom RT09 speaker diarization system: Enhancements in speaker modelling and cluster purification
    • Dallas, TX, USA
    • S. Bozonnet, N. Evans, and C. Fredouille, "The LIAEurecom RT09 Speaker Diarization System: Enhancements in Speaker Modelling and Cluster Purification," in Proc. ICASSP, Dallas, TX, USA, 2010, pp. 4958-4961.
    • (2010) Proc. ICASSP , pp. 4958-4961
    • Bozonnet, S.1    Evans, N.2    Fredouille, C.3
  • 28
    • 84865747509
    • Improved overlapped speech handling for speaker diarization
    • Florence, Italy
    • K. Boakye, O. Vinyals, and G. Friedland, "Improved Overlapped Speech Handling for Speaker Diarization," in Proc. Interspeech, Florence, Italy, 2011, pp. 941-944.
    • (2011) Proc. Interspeech , pp. 941-944
    • Boakye, K.1    Vinyals, O.2    Friedland, G.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.