2010, Pages 424-429

Real-time meeting recognition and understanding using distant microphones and omni-directional camera

Author keywords

Distant microphones; Meeting analysis; Speaker diarization; Speech enhancement; Speech recognition; Topic tracking

Indexed keywords

AUDIO PROCESSING; DISTANT MICROPHONES; FACE POSE; LOW-LATENCY; MEETING ANALYSIS; MICROPHONE ARRAYS; OMNIDIRECTIONAL CAMERAS; SPEAKER DIARIZATION; SPEECH RECOGNIZER; SPEECH SIGNALS; TOPIC TRACKING;

EID: 79951797950     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/SLT.2010.5700890     Document Type: Conference Paper
Times cited : (12)

References (24)
  • 4
    • 44849090969
    • Recognition and understanding of meetings: The AMI and AMIDA projects
    • S. Renals, T. Hain, and H. Bourlard, "Recognition and understanding of meetings: The AMI and AMIDA projects," in Proc. ASRU, 2007, pp. 238-247.
    • Proc. ASRU, 2007 , pp. 238-247
    • Renals, S.1    Hain, T.2    Bourlard, H.3
  • 6
    • 60949097180
    • A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization
    • K. Otsuka, S. Araki, K. Ishizuka, M. Fujimoto, M. Heinrich, and J. Yamato, "A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization," in Proc. ICMI, 2008, pp. 257-264.
    • Proc. ICMI, 2008 , pp. 257-264
    • Otsuka, K.1    Araki, S.2    Ishizuka, K.3    Fujimoto, M.4    Heinrich, M.5    Yamato, J.6
  • 9
    • 70450204727
    • A study of mutual front-end processing method based on statistical model for noise robust speech recognition
    • M. Fujimoto, K. Ishizuka, and T. Nakatani, "A study of mutual front-end processing method based on statistical model for noise robust speech recognition," in Proc. Interspeech, 2009, pp. 1235-1238.
    • Proc. Interspeech, 2009 , pp. 1235-1238
    • Fujimoto, M.1    Ishizuka, K.2    Nakatani, T.3
  • 10
    • 33645758265
    • NTT Speech recognizer with Outlook on the Next generation: SOLON
    • T. Hori, "NTT Speech recognizer with Outlook On the Next generation: SOLON," in Proc. NTT Workshop on Communication Scene Analysis, 2004, pp. SP-6. [Online]. Available: www.kecl.ntt.co.jp/icl/signal/hori/publications/thori csa2004.pdf.
    • Proc. NTT Workshop on Communication Scene Analysis, 2004
    • Hori, T.1
  • 12
    • 51449113843
    • Speaker indexing and speech enhancement in real meeting / conversations
    • S. Araki, M. Fujimoto, K. Ishizuka, H. Sawada, and S. Makino, "Speaker indexing and speech enhancement in real meeting / conversations," in Proc. ICASSP, 2008, vol. I, pp. 93-96.
    • Proc. ICASSP, 2008 , vol.1 , pp. 93-96
    • Araki, S.1    Fujimoto, M.2    Ishizuka, K.3    Sawada, H.4    Makino, S.5
  • 13
    • 0016990291
    • The generalized correlation method for estimation of time delay
    • C. H. Knapp and G. C. Carter, "The generalized correlation method for estimation of time delay," IEEE Trans. Acoust. Speech and Signal Processing, vol. 24, no. 4, pp. 320-327, 1976.
    • (1976) IEEE Trans. Acoust. Speech and Signal Processing , vol.24 , Issue.4 , pp. 320-327
    • Knapp, C.H.1    Carter, G.C.2
  • 14
    • 34247223586
    • Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors
    • S. Araki, H. Sawada, R. Mukai, and S. Makino, "Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors," Signal Processing, vol. 77, no. 8, pp. 1833-1847, Aug 2007.
    • (2007) Signal Processing , vol.77 , Issue.8 , pp. 1833-1847
    • Araki, S.1    Sawada, H.2    Mukai, R.3    Makino, S.4
  • 15
    • 50449097931
    • Noise robust voice activity detection based on switching Kalman filter
    • M. Fujimoto and K. Ishizuka, "Noise robust voice activity detection based on switching Kalman filter," in Proc. Interspeech, 2007, pp. 2933-2936.
    • Proc. Interspeech, 2007 , pp. 2933-2936
    • Fujimoto, M.1    Ishizuka, K.2
  • 16
    • 0032762471
    • A statistical model-based voice activity detection
    • J. Sohn, N. S. Kim, and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Processing Letters, vol. 6, no. 1, pp. 1-3, January 1999.
    • (1999) IEEE Signal Processing Letters , vol.6 , Issue.1 , pp. 1-3
    • Sohn, J.1    Kim, N.S.2    Sung, W.3
  • 17
    • 33745207361
    • A Japanese national project on spontaneous speech corpus and processing technology
    • S. Furui, K. Maekawa, and H. Isahara, "A Japanese national project on spontaneous speech corpus and processing technology," in Proc. ASR, 2000, pp. 244-248.
    • Proc. ASR, 2000 , pp. 244-248
    • Furui, S.1    Maekawa, K.2    Isahara, H.3
  • 18
    • 78049393373
    • A comparative study on methods of weighted language model training for reranking LVCSR n-best hypotheses
    • T. Oba, T. Hori, and A. Nakamura, "A comparative study on methods of weighted language model training for reranking LVCSR n-best hypotheses," in Proc. ICASSP, 2010, pp. 5126-5129.
    • Proc. ICASSP, 2010 , pp. 5126-5129
    • Oba, T.1    Hori, T.2    Nakamura, A.3
  • 19
    • 45849093239
    • Efficient WFST-based one-pass decoding with on-the-fly hypothesis rescoring in extremely large vocabulary continuous speech recognition
    • T. Hori, C. Hori, Y. Minami, and A. Nakamura, "Efficient WFST-based one-pass decoding with on-the-fly hypothesis rescoring in extremely large vocabulary continuous speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 4, pp. 1352-1365, 2007.
    • (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.4 , pp. 1352-1365
    • Hori, T.1    Hori, C.2    Minami, Y.3    Nakamura, A.4
  • 20
    • 33646426591
    • Generalized fast on-the-fly composition algorithm for WFST-based speech recognition
    • T. Hori and A. Nakamura, "Generalized fast on-the-fly composition algorithm for WFST-based speech recognition," in Proc. Interspeech-Eurospeech, 2005, pp. 557-560.
    • Proc. Interspeech-Eurospeech, 2005 , pp. 557-560
    • Hori, T.1    Nakamura, A.2
  • 21
    • 85009271609
    • Towards automatic closed captioning: Low latency real time broadcast news transcription
    • M. Saraclar, M. Riley, E. Bocchieri, and V. Goffin, "Towards automatic closed captioning: low latency real time broadcast news transcription," in Proc. ICSLP, 2002, pp. 1741-1744.
    • Proc. ICSLP, 2002 , pp. 1741-1744
    • Saraclar, M.1    Riley, M.2    Bocchieri, E.3    Goffin, V.4
  • 23
    • 77956207114
    • Topic tracking model for analyzing consumer purchase behavior
    • T. Iwata, S. Watanabe, T. Yamada, and N. Ueda, "Topic tracking model for analyzing consumer purchase behavior," in Proc. IJCAI, 2009, pp. 1427-1432.
    • Proc. IJCAI, 2009 , pp. 1427-1432
    • Iwata, T.1    Watanabe, S.2    Yamada, T.3    Ueda, N.4
  • 24
    • 70450162101
    • Memory-based particle filter for face pose tracking robust under complex dynamics
    • D. Mikami, K. Otsuka, and J. Yamato, "Memory-based particle filter for face pose tracking robust under complex dynamics," in Proc. CVPR, 2009, pp. 999-1006.
    • Proc. CVPR, 2009 , pp. 999-1006
    • Mikami, D.1    Otsuka, K.2    Yamato, J.3


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.