SCOPUS Information Search Platform

International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007

Volume 4, 2007, Pages 2512-2515

Detection, diarization, and transcription of far-field lecture speech

Authors (5): Huang, Jing (a); Marcheret, Etienne (a); Visweswariah, Karthik (a); Libal, Vit (a); Potamianos, Gerasimos (a)

a IBM T J WATSON RESEARCH CENTER (United States)

Author keywords

Lectures; Smart rooms; Speaker diarization; Speech activity detection; Speech processing; Speech recognition

Indexed keywords

SPEECH; SPEECH PROCESSING; SPEECH RECOGNITION; TRANSCRIPTION;

EUROPEANS; EVALUATION TESTS; FIELD CONDITIONS; LECTURE DATUMS; LECTURES; MEETING RECOGNITIONS; RELATIVE REDUCTIONS; SMART ROOMS; SPEAKER DIARIZATION; SPEECH ACTIVITY DETECTION; SPEECH ACTIVITY DETECTIONS; WORD ERROR RATES;

SPEECH COMMUNICATION;

EID: 56149089096 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited: 4

References (13)

1
- 34250167213
- "CHIL: Computers in the Human Interaction Loop" [Online]. Available: http://chil.server.de
- Computers in the Human Interaction Loop

2
- 50449092763
- "AMI: Augmented Multi-Party Interaction" [Online]. Available: http://www.arniproject.org
- Augmented Multi-Party Interaction

3
- 56149084482
- "The NIST SmartSpace Laboratory" [Online]. Available: http://www.nist.gov/smartspace

4
- 77249114287
- J.G. Fiscus, J. Ajot, M. Michel, and J.S. Garofolo, "The Rich Transcription 2006 Spring meeting recognition evaluation," in Machine Learning for Multimodal Interaction, S. Renals, S. Bengio, and J.G. Fiscus (Eds.), LNCS vol. 4299, pp. 309-322, 2006.

5
- 50449101617
- The IBM RT06s evaluation system for speech activity detection in CHIL seminars
- Machine Learning for Multimodal Interaction, S. Renals, S. Bengio, and J.G. Fiscus (Eds.)
- E. Marcheret, G. Potamianos, K. Visweswariah, and J. Huang, "The IBM RT06s evaluation system for speech activity detection in CHIL seminars," in Machine Learning for Multimodal Interaction, S. Renals, S. Bengio, and J.G. Fiscus (Eds.), LNCS vol. 4299, pp. 323-335, 2006.
- (2006) LNCS , vol.4299 , pp. 323-335
- Marcheret, E.¹ Potamianos, G.² Visweswariah, K.³ Huang, J.⁴

6
- 47949095692
- The IBM Rich Transcription Spring 2006 speech-to-text system for lecture meetings
- Machine Learning for Multimodal Interaction, S. Renals, S. Bengio, and J.G. Fiscus (Eds.)
- J. Huang, M. Westphal, S. Chen, et al., "The IBM Rich Transcription Spring 2006 speech-to-text system for lecture meetings," in Machine Learning for Multimodal Interaction, S. Renals, S. Bengio, and J.G. Fiscus (Eds.), LNCS vol. 4299, pp. 432-443, 2006.
- (2006) LNCS , vol.4299 , pp. 432-443
- Huang, J.¹ Westphal, M.² Chen, S.³

7
- 0030638031
- A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)
- Santa Barbara, CA
- J.G. Fiscus, "A post-processing system to yield reduced word error rates: Recogniser output voting error reduction (ROVER)," in Proc. ASRU Workshop, Santa Barbara, CA, pp. 347-352, 1997.
- (1997) Proc. ASRU Workshop , pp. 347-352
- Fiscus, J.G.¹

8
- 33646818291
- Constructing ensembles of ASR systems using randomized decision trees
- Philadelphia, PA
- O. Siohan, B. Ramabhadran, and B. Kingsbury, "Constructing ensembles of ASR systems using randomized decision trees," in Proc. Int. Conf. Acoustics Speech Signal Process., Philadelphia, PA, vol. 1, pp. 197-200, 2005.
- (2005) Proc. Int. Conf. Acoustics Speech Signal Process , vol.1 , pp. 197-200
- Siohan, O.¹ Ramabhadran, B.² Kingsbury, B.³

9
- 50449107696
- Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA
- "The LDC Corpus Catalog," Linguistic Data Consortium, University of Pennsylvania. Philadelphia, PA. [Online]. Available: http://www.ldc.upenn.edu
- The LDC Corpus Catalog

10
- 0034841234
- Linear feature space projections for speaker adaptation
- Salt Lake City, UT
- G. Saon, G. Zweig, and M. Padmanabhan, "Linear feature space projections for speaker adaptation," in Proc. Int. Conf. Acoustics Speech Signal Process., Salt Lake City, UT, pp. 325-328, 2001.
- (2001) Proc. Int. Conf. Acoustics Speech Signal Process , pp. 325-328
- Saon, G.¹ Zweig, G.² Padmanabhan, M.³

11
- 33646788786
- fMPE: Discriminatively trained features for speech recognition
- Philadelphia, PA
- D. Povey, B. Kingsbury, L. Mangu, G. Saon, H. Soltau, and G. Zweig, "fMPE: Discriminatively trained features for speech recognition," in Proc. Int. Conf. Acoustics Speech Signal Process., Philadelphia, PA, vol. 1, pp. 961-964, 2005.
- (2005) Proc. Int. Conf. Acoustics Speech Signal Process , vol.1 , pp. 961-964
- Povey, D.¹ Kingsbury, B.² Mangu, L.³ Saon, G.⁴ Soltau, H.⁵ Zweig, G.⁶

12
- 0036296863
- Minimum phone error and I-smoothing for improved discriminative training
- Orlando, FL
- D. Povey and P.C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. Int. Conf. Acoustics Speech Signal Process., Orlando, FL, pp. 105-108, 2002.
- (2002) Proc. Int. Conf. Acoustics Speech Signal Process , pp. 105-108
- Povey, D.¹ Woodland, P.C.²

13
- 0012611072
- Entropy-based pruning of backoff language models
- Lansdowne, VA
- A. Stolcke, "Entropy-based pruning of backoff language models," in Proc. DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA, pp. 270-274, 1998.
- (1998) Proc. DARPA Broadcast News Transcription and Understanding Workshop , pp. 270-274
- Stolcke, A.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.