



Volume 15, Issue 7, 2007, Pages 2011-2022

Acoustic beamforming for speaker diarization of meetings

Author keywords

Acoustic beamforming; Meeting processing; Speaker diarization; Speaker segmentation and clustering

Indexed keywords

ACOUSTIC INFORMATIONS; BEAM-FORMING TECHNIQUES; CHANNEL SELECTIONS; DYNAMIC OUTPUTS; MEETING PROCESSING; NOVEL ALGORITHMS; SPEAKER DIARIZATION; SPEAKER SEGMENTATION AND CLUSTERING; STEP TIME; TEST SETS; VITERBI

EID: 50449086237     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2007.902460     Document Type: Article
Times cited: 418

References (32)
  • 1
    • 64249157122
    • The Macquarie speaker diarization system for RT04s
    • Montreal, QC, Canada
    • S. Cassidy, "The Macquarie speaker diarization system for RT04s," in Proc. NIST 2004 Spring Meetings Recognition Evaluation Workshop, Montreal, QC, Canada, 2004 [Online]. Available: http://www.nist.gov/speech/test-beds/mr-proj/icassp-program.html
    • (2004) Proc. NIST 2004 Spring Meetings Recognition Evaluation Workshop
    • Cassidy, S.1
  • 2
    • 33745528924
    • D. van Leeuwen, "The TNO Speaker Diarization System for NIST RT05s for Meeting Data," in Lecture Notes in Computer Science, ser. Machine Learning for Multimodal Interaction (MLMI 2005). Berlin, Germany: Springer, 2006, vol. 3869, pp. 440-449.
  • 3
    • 77249176190
    • D. van Leeuwen and M. Huijbregts, "The AMI Speaker Diarization System for NIST RT06s Meeting Data," in Lecture Notes in Computer Science, ser. Machine Learning for Multimodal Interaction (MLMI 2006). Berlin, Germany: Springer, 2006, vol. 4299, pp. 371-384.
  • 5
    • 29044436483
    • The NIST 2004 spring rich transcription evaluation: Two-axis merging strategy in the context of multiple distant microphone based meeting speaker segmentation
    • Montreal, QC, Canada
    • C. Fredouille, D. Moraru, S. Meignier, L. Besacier, and J.-F. Bonastre, "The NIST 2004 spring rich transcription evaluation: Two-axis merging strategy in the context of multiple distant microphone based meeting speaker segmentation," in NIST 2004 Spring Meetings Recognition Evaluation Workshop, Montreal, QC, Canada, 2004 [Online]. Available: http://www.nist.gov/speech/test-beds/mr-proj/icassp-program.html
    • (2004) NIST 2004 Spring Meetings Recognition Evaluation Workshop
    • Fredouille, C.1    Moraru, D.2    Meignier, S.3    Besacier, L.4    Bonastre, J.-F.5
  • 6
    • 33745572731
    • D. Istrate, C. Fredouille, S. Meignier, L. Besacier, and J.-F. Bonastre, "NIST RT05s Evaluation: Pre-Processing Techniques and Speaker Diarization on Multiple Microphone Meetings," in Lecture Notes in Computer Science, ser. Machine Learning for Multimodal Interaction (MLMI 2005). Berlin, Germany: Springer, 2006, vol. 3869, pp. 428-439.
  • 7
    • 77249126512
    • C. Fredouille and G. Senay, "Technical Improvements of the E-HMM Based Speaker Diarization System for Meetings Records," in Lecture Notes in Computer Science, ser. Machine Learning for Multimodal Interaction (MLMI 2006). Berlin, Germany: Springer, 2006, vol. 4299, pp. 359-370.
  • 8
    • 33846242627
    • Speaker diarization for multi-party meetings using acoustic fusion
    • San Juan, Puerto Rico, Nov
    • X. Anguera, C. Wooters, and J. Hernando, "Speaker diarization for multi-party meetings using acoustic fusion," in Proc. ASRU, San Juan, Puerto Rico, Nov. 2005, pp. 426-431.
    • (2005) Proc. ASRU , pp. 426-431
    • Anguera, X.1    Wooters, C.2    Hernando, J.3
  • 9
    • 33745560829
    • X. Anguera, C. Wooters, B. Peskin, and M. Aguilo, "Robust Speaker Segmentation for Meetings: The ICSI-SRI Spring 2005 Diarization System," in Lecture Notes in Computer Science, ser. Machine Learning for Multimodal Interaction (MLMI 2005). Berlin, Germany: Springer, 2006, vol. 3869, pp. 402-414.
  • 10
    • 0023985457
    • Beamforming: A versatile approach to spatial filtering
    • Apr
    • B. van Veen and K. M. Buckley, "Beamforming: A versatile approach to spatial filtering," IEEE ASSP Mag., vol. 5, no. 2, pp. 4-24, Apr. 1988.
    • (1988) IEEE ASSP Mag , vol.5 , Issue.2 , pp. 4-24
    • van Veen, B.1    Buckley, K.M.2
  • 11
    • 0030193445
    • Two decades of array signal processing research
    • Jul
    • H. Krim and M. Viberg, "Two decades of array signal processing research," IEEE Signal Process. Mag., vol. 13, no. 4, pp. 67-94, Jul. 1996.
    • (1996) IEEE Signal Process. Mag , vol.13 , Issue.4 , pp. 67-94
    • Krim, H.1    Viberg, M.2
  • 12
    • 77249114287
    • J. G. Fiscus, J. Ajot, M. Michel, and J. S. Garofolo, "The Rich Transcription 2006 Spring Meeting Recognition Evaluation," in Lecture Notes in Computer Science, ser. Machine Learning for Multimodal Interaction (MLMI 2006). Berlin, Germany: Springer, 2006, vol. 4299, pp. 309-322.
  • 13
    • 64249170737
    • "Beamformit: Open Source Acoustic Beamforming Software," 2007. [Online]. Available: http://www.icsi.berkeley.edu/xanguera/beamformit
  • 14
    • 0022352370
    • Computer-steered microphone arrays for sound transduction in large rooms
    • Nov
    • J. Flanagan, J. Johnson, R. Kahn, and G. Elko, "Computer-steered microphone arrays for sound transduction in large rooms," J. Acoust. Soc. Amer., vol. 78, pp. 1508-1518, Nov. 1985.
    • (1985) J. Acoust. Soc. Amer , vol.78 , pp. 1508-1518
    • Flanagan, J.1    Johnson, J.2    Kahn, R.3    Elko, G.4
  • 16
    • 0016990291
    • The generalized correlation method for estimation of time delay
    • Aug
    • C. H. Knapp and G. C. Carter, "The generalized correlation method for estimation of time delay," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-24, no. 4, pp. 320-327, Aug. 1976.
    • (1976) IEEE Trans. Acoust., Speech, Signal Process , vol.ASSP-24 , Issue.4 , pp. 320-327
    • Knapp, C.H.1    Carter, G.C.2
  • 17
    • 0030701369
    • A robust method for speech signal time-delay estimation in reverberant rooms
    • Munich, Germany, May
    • M. S. Brandstein and H. F. Silverman, "A robust method for speech signal time-delay estimation in reverberant rooms," in Proc. ICASSP, Munich, Germany, May 1997, pp. 375-378.
    • (1997) Proc. ICASSP , pp. 375-378
    • Brandstein, M.S.1    Silverman, H.F.2
  • 18
    • 64249095457
    • N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series. New York: Wiley, 1949.
  • 20
    • 64249150043
    • "NIST Rich Transcription Evaluations," 2006. [Online]. Available: http://www.nist.gov/speech/tests/rt
  • 21
    • 64249158551
    • "ICSI Meeting Recorder Project: Channel Skew in ICSI-Recorded Meetings," 2006. [Online]. Available: http://www.icsi.berkeley.edu/dpwe/research/mtgrcdr/chanskew.html
  • 23
    • 44949197897
    • Robust speaker diarization for meetings: ICSI RT06s evaluation system
    • Pittsburgh, PA, Sep
    • X. Anguera, C. Wooters, and J. M. Pardo, "Robust speaker diarization for meetings: ICSI RT06s evaluation system," in Proc. ICSLP, Pittsburgh, PA, Sep. 2006, pp. 1674-1677.
    • (2006) Proc. ICSLP , pp. 1674-1677
    • Anguera, X.1    Wooters, C.2    Pardo, J.M.3
  • 24
    • 44849123928
    • Robust speaker diarization for meetings
    • Ph.D. dissertation, Universitat Politecnica de Catalunya, Barcelona, Spain
    • X. Anguera, "Robust speaker diarization for meetings," Ph.D. dissertation, Universitat Politecnica de Catalunya, Barcelona, Spain, 2006.
    • (2006)
    • Anguera, X.1
  • 25
    • 84875953283
    • Clustering via the Bayesian information criterion with applications in speech recognition
    • Seattle, WA
    • S. S. Chen and P. Gopalakrishnan, "Clustering via the Bayesian information criterion with applications in speech recognition," in Proc. ICASSP, Seattle, WA, 1998, vol. 2, pp. 645-648.
    • (1998) Proc. ICASSP , vol.2 , pp. 645-648
    • Chen, S.S.1    Gopalakrishnan, P.2
  • 26
    • 84946742526
    • A robust speaker clustering algorithm
    • U.S. Virgin Islands, Dec
    • J. Ajmera and C. Wooters, "A robust speaker clustering algorithm," in Proc. ASRU, U.S. Virgin Islands, Dec. 2003, pp. 411-416.
    • (2003) Proc. ASRU , pp. 411-416
    • Ajmera, J.1    Wooters, C.2
  • 27
    • 34547516864
    • Automatic weighting for the combination of TDOA and acoustic features in speaker diarization for meetings
    • Apr
    • X. Anguera, C. Wooters, J. M. Pardo, and J. Hernando, "Automatic weighting for the combination of TDOA and acoustic features in speaker diarization for meetings," in Proc. ICASSP, Apr. 2007, pp. 241-244.
    • (2007) Proc. ICASSP , pp. 241-244
    • Anguera, X.1    Wooters, C.2    Pardo, J.M.3    Hernando, J.4
  • 28
    • 34548351229
    • Speaker diarization for multiple distant microphone meetings: Mixing acoustic features and interchannel time differences
    • Sep
    • J. M. Pardo, X. Anguera, and C. Wooters, "Speaker diarization for multiple distant microphone meetings: Mixing acoustic features and interchannel time differences," in Proc. ICSLP, Sep. 2006, pp. 2194-2197.
    • (2006) Proc. ICSLP , pp. 2194-2197
    • Pardo, J.M.1    Anguera, X.2    Wooters, C.3
  • 29
    • 0024909979
    • Some statistical issues in the comparison of speech recognition algorithms
    • L. Gillick and S. Cox, "Some statistical issues in the comparison of speech recognition algorithms," in Proc. ICASSP, 1989, pp. 532-535.
    • (1989) Proc. ICASSP , pp. 532-535
    • Gillick, L.1    Cox, S.2
  • 30
    • 0025680226
    • Tools for the analysis of benchmark speech recognition tests
    • D. Pallett, W. Fisher, and J. Fiscus, "Tools for the analysis of benchmark speech recognition tests," in Proc. ICASSP, 1990, vol. 1, pp. 97-100.
    • (1990) Proc. ICASSP , vol.1 , pp. 97-100
    • Pallett, D.1    Fisher, W.2    Fiscus, J.3
  • 31
    • 33745515429
    • A. Stolcke, X. Anguera, K. Boakye, O. Cetin, F. Grezl, A. Janin, A. Mandal, B. Peskin, C. Wooters, and J. Zheng, "Further Progress in Meeting Recognition: The ICSI-SRI Spring 2005 Speech-to-Text Evaluation System," in Lecture Notes in Computer Science, ser. Machine Learning for Multimodal Interaction (MLMI 2005). Berlin, Germany: Springer, 2006, vol. 3869, pp. 463-475.
  • 32
    • 51449099706
    • A. Janin, A. Stolcke, X. Anguera, K. Boakye, O. Cetin, J. Frankel, and J. Zheng, "The ICSI-SRI Spring 2006 Meeting Recognition System," in Lecture Notes in Computer Science, ser. Machine Learning for Multimodal Interaction (MLMI 2006). Berlin, Germany: Springer, 2006, vol. 4299, pp. 444-456.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.