



Volume 20, Issue 2, 2012, Pages 436-446

Simultaneous Speech Detection with Spatial Features for Speaker Diarization

Author keywords

[No Author keywords available]

Indexed keywords


EID: 85008556062     PISSN: 15587916     EISSN: 15587924     Source Type: Journal    
DOI: 10.1109/TASL.2011.2160167     Document Type: Article
Times cited: 31

References (35)
  • 1
    • 85009145345
    • Observations on overlap: Findings and implications for automatic processing of multi-party conversation
    • E. Shriberg, A. Stolcke, and D. Baron, “Observations on overlap: Findings and implications for automatic processing of multi-party conversation,” in Proc. Eurospeech′01, Aalborg, Denmark, 2001, vol. 2, pp. 1359–1362.
    • (2001) Proc. Eurospeech′01, Aalborg, Denmark , vol.2 , pp. 1359-1362
    • Shriberg, E.1    Stolcke, A.2    Baron, D.3
  • 2
    • 33745224103
    • Spontaneous speech: How people really talk and why engineers should care
    • E. Shriberg, “Spontaneous speech: How people really talk and why engineers should care,” in Proc. Interspeech′05, Lisbon, Portugal, 2005, pp. 1781–1784.
    • (2005) Proc. Interspeech′05, Lisbon, Portugal , pp. 1781-1784
    • Shriberg, E.1
  • 3
  • 6
    • 11144232847
    • Speech and crosstalk detection in multichannel audio
    • S. Wrigley, G. Brown, V. Wan, and S. Renals, “Speech and crosstalk detection in multichannel audio,” IEEE Trans. Speech Audio Process., vol. 13, no. 1, pp. 84–91, Jan. 2005.
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.1 , pp. 84-91
    • Wrigley, S.1    Brown, G.2    Wan, V.3    Renals, S.4
  • 7
    • 33947615205
    • Unsupervised learning of overlapped speech model parameters for multichannel speech activity detection in meetings
    • K. Laskowski and T. Schultz, “Unsupervised learning of overlapped speech model parameters for multichannel speech activity detection in meetings,” in Proc. ICASSP′06, Toulouse, France, 2006, vol. I, pp. 993–996.
    • (2006) Proc. ICASSP′06, Toulouse, France , vol.1 , pp. 993-996
    • Laskowski, K.1    Schultz, T.2
  • 10
    • 77249176190
    • The AMI speaker diarization system for NIST RT06s meeting data
    • D. van Leeuwen and M. Huijbregts, “The AMI speaker diarization system for NIST RT06s meeting data,” Mach. Learn. Multimodal Interact., vol. 4299/2006, pp. 371–384, 2006.
    • (2006) Mach. Learn. Multimodal Interact. , vol.4299-2006 , pp. 371-384
    • van Leeuwen, D.1    Huijbregts, M.2
  • 12
    • 84867228708
    • Two's a crowd: Improving speaker diarization by automatically identifying and excluding overlapped speech
    • K. Boakye, O. Vinyals, and G. Friedland, “Two's a crowd: Improving speaker diarization by automatically identifying and excluding overlapped speech,” in Proc. Interspeech′08, Brisbane, Australia, 2008, pp. 32–35.
    • (2008) Proc. Interspeech′08, Brisbane, Australia , pp. 32-35
    • Boakye, K.1    Vinyals, O.2    Friedland, G.3
  • 13
    • 50449086237
    • Acoustic beamforming for speaker diarization of meetings
    • X. Anguera, C. Wooters, and J. Hernando, “Acoustic beamforming for speaker diarization of meetings,” IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 7, pp. 2011–2022, Sep. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.7 , pp. 2011-2022
    • Anguera, X.1    Wooters, C.2    Hernando, J.3
  • 14
    • 0346707503
    • Source localization in reverberant environments: Modeling and statistical analysis
    • T. Gustafsson, B. Rao, and M. Trivedi, “Source localization in reverberant environments: Modeling and statistical analysis,” IEEE Trans. Speech Audio Process., vol. 11, no. 6, pp. 791–803, Nov. 2003.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.6 , pp. 791-803
    • Gustafsson, T.1    Rao, B.2    Trivedi, M.3
  • 15
    • 79959829540
    • Overlap detection for speaker diarization by fusing spectral and spatial features
    • M. Zelenak, C. Segura, and J. Hernando, “Overlap detection for speaker diarization by fusing spectral and spatial features,” in Proc. Interspeech′10, Makuhari, Japan, 2010, pp. 2302–2305.
    • (2010) Proc. Interspeech′10, Makuhari, Japan , pp. 2302-2305
    • Zelenak, M.1    Segura, C.2    Hernando, J.3
  • 17
    • 11144286121
    • The Spectral Autocorrelation Peak Valley Ratio (SAPVR) --A usable speech measure employed as a co-channel detection system
    • R. Yantorno, et al., “The Spectral Autocorrelation Peak Valley Ratio (SAPVR) --A usable speech measure employed as a co-channel detection system,” in Proc. IEEE Int. Workshop Intell. Signal Process. (WISP), Budapest, Hungary, 2001.
    • (2001) Proc. IEEE Int. Workshop Intell. Signal Process. (WISP), Budapest, Hungary
    • Yantorno, R.1
  • 18
    • 84857343608
    • Usable speech detection using linear predictive analysis--A model based approach
    • N. Sundaram, R. Yantorno, B. Smolenski, and A. Iyer, “Usable speech detection using linear predictive analysis--A model based approach,” in Proc. ISPACS, Awaji Island, Japan, 2003, pp. 231–235.
    • (2003) Proc. ISPACS, Awaji Island, Japan , pp. 231-235
    • Sundaram, N.1    Yantorno, R.2    Smolenski, B.3    Iyer, A.4
  • 19
    • 85008549381
    • A robust method for speech signal time-delay estimation in reverberant rooms
    • P. Svaizer, et al., “A robust method for speech signal time-delay estimation in reverberant rooms,” in Proc. ICASSP′97, Munich, Germany, 1997, pp. 231–234.
    • (1997) Proc. ICASSP′97, Munich, Germany , pp. 231-234
    • Svaizer, P.1
  • 20
    • 0030701369
    • A robust method for speech signal time-delay estimation in reverberant rooms
    • M. Brandstein and H. Silverman, “A robust method for speech signal time-delay estimation in reverberant rooms,” in Proc. ICASSP′97, Munich, Germany, 1997, pp. 375–378.
    • (1997) Proc. ICASSP′97, Munich, Germany , pp. 375-378
    • Brandstein, M.1    Silverman, H.2
  • 21
    • 47749127366
    • Speaker diarization for conference room: The UPC RT07s evaluation system
    • J. Luque, X. Anguera, A. Temko, and J. Hernando, “Speaker diarization for conference room: The UPC RT07s evaluation system,” Multimodal Technol. Percept. Humans, vol. 4625/2008, pp. 543–553, 2008.
    • (2008) Multimodal Technol. Percept. Humans , vol.4625-2008 , pp. 543-553
    • Luque, J.1    Anguera, X.2    Temko, A.3    Hernando, J.4
  • 23
    • 39749173057
    • Incremental learning for robust visual tracking
    • D. Ross, J. Lim, R. Lin, and M. Yang, “Incremental learning for robust visual tracking,” Int. J. Comput. Vis., vol. 77, no. 1, pp. 125–141, May 2008.
    • (2008) Int. J. Comput. Vis. , vol.77 , Issue.1 , pp. 125-141
    • Ross, D.1    Lim, J.2    Lin, R.3    Yang, M.4
  • 24
    • 0034247885
    • Sequential Karhunen-Loeve basis extraction and its application to images
    • A. Levy and M. Lindenbaum, “Sequential Karhunen-Loeve basis extraction and its application to images,” IEEE Trans. Image Process., vol. 9, no. 8, pp. 1371–1374, Aug. 2000.
    • (2000) IEEE Trans. Image Process. , vol.9 , Issue.8 , pp. 1371-1374
    • Levy, A.1    Lindenbaum, M.2
  • 25
    • 70349225212
    • Improved location features for meeting speaker diarization
    • S. Otterson, “Improved location features for meeting speaker diarization,” in Proc. Interspeech′07, Antwerp, Belgium, 2007, pp. 1849–1852.
    • (2007) Proc. Interspeech′07, Antwerp, Belgium , pp. 1849-1852
    • Otterson, S.1
  • 28
    • 85008563256
    • The Rich Transcription 2009 Meeting Recognition Evaluation
    • [Online]. Available: http://www.itl.nist.gov/iad/mig/tests/rt/2009/docs/rt09-meeting-eval-plan-v2.pdf
    • “The Rich Transcription 2009 Meeting Recognition Evaluation,” [Online]. Available: http://www.itl.nist.gov/iad/mig/tests/rt/2009/docs/rt09-meeting-eval-plan-v2.pdf.
  • 29
    • 84867198548
    • Clustering initialization based on spatial information for speaker diarization of meetings
    • J. Luque, C. Segura, and J. Hernando, “Clustering initialization based on spatial information for speaker diarization of meetings,” in Proc. Interspeech′08, Brisbane, Australia, 2008, pp. 383–386.
    • (2008) Proc. Interspeech′08, Brisbane, Australia , pp. 383-386
    • Luque, J.1    Segura, C.2    Hernando, J.3
  • 31
    • 0022352370
    • Computer-steered microphone arrays for sound transduction in large rooms
    • J. Flanagan, J. Johnson, R. Kahn, and G. Elko, “Computer-steered microphone arrays for sound transduction in large rooms,” J. Acoust. Soc. Amer., vol. 78, no. 5, pp. 1508–1518, 1985.
    • (1985) J. Acoust. Soc. Amer. , vol.78 , Issue.5 , pp. 1508-1518
    • Flanagan, J.1    Johnson, J.2    Kahn, R.3    Elko, G.4
  • 32
    • 34547526911
    • Enhanced SVM training for robust speech activity detection
    • A. Temko, D. Macho, and C. Nadeu, “Enhanced SVM training for robust speech activity detection,” in Proc. ICASSP′07, Honolulu, HI, 2007, pp. 1025–1028.
    • (2007) Proc. ICASSP′07, Honolulu, HI , pp. 1025-1028
    • Temko, A.1    Macho, D.2    Nadeu, C.3
  • 33
    • 0035789613
    • Proximal support vector machine classifiers
    • G. Fung and O. Mangasarian, “Proximal support vector machine classifiers,” in Proc. KDDM, 2001, pp. 77–86.
    • (2001) Proc. KDDM , pp. 77-86
    • Fung, G.1    Mangasarian, O.2
  • 35
    • 47749103773
    • Progress in the AMIDA speaker diarization system for meeting data
    • D. A. van Leeuwen and M. Konečný, “Progress in the AMIDA speaker diarization system for meeting data,” Multimodal Technol. Percept. Humans, vol. 4625/2008, pp. 475–483, 2008.
    • (2008) Multimodal Technol. Percept. Humans , vol.4625-2008 , pp. 475-483
    • van Leeuwen, D.A.1    Konečný, M.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.