SCOPUS Information Search Platform

Volume 26, Issue 1, 2012, Pages 52-66

The design and collection of COSINE, a multi-microphone in situ speech corpus recorded in noisy environments

(5) Stupakov, Alex ᵃ; Hanusa, Evan ᵃ; Vijaywargi, Deepak ᵃ; Fox, Dieter ᵃ; Bilmes, Jeff ᵃ

ᵃ University of Washington (United States)

Author keywords

Microphone arrays; Multi microphone; Multi party corpora; Noise robust speech recognition; Portable recording; Speech recognition

Indexed keywords

ACOUSTIC NOISE; AUDIO RECORDINGS; HUMAN COMPUTER INTERACTION; MICROPHONES; SPEECH; TRANSCRIPTION;

DESIGN AND CONSTRUCTION; HIGH-QUALITY RECORDINGS; MICROPHONE ARRAYS; MULTI-PARTY CONVERSATIONS; MULTI-PARTY CORPORA; NOISE ROBUST SPEECH RECOGNITION; PORTABLE RECORDING; REAL WORLD ENVIRONMENTS;

SPEECH RECOGNITION;

EID: 79959404069 PISSN: 08852308 EISSN: 10958363 Source Type: Journal
DOI: 10.1016/j.csl.2010.12.003 Document Type: Article

Times cited: (30)

References (36)

1
- 79960040533
- Aurora speech recognition experimental framework. http://aurora.hsnr.de/.
- Aurora Speech Recognition Experimental Framework

2
- 4444257069
- Praat, a system for doing phonetics by computer
- P. Boersma Praat, a system for doing phonetics by computer Glot International 5 9/10 2001 341 345
- (2001) Glot International , vol.5 , Issue.9-10 , pp. 341-345
- Boersma, P.¹

3
- 79960067669
- C. M. University. cmudict0.7a
- C. M. University, 2008. cmudict0.7a, https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/cmudict/cmudict0.7a.
- (2008)

4
- 33745530242
- The AMI meeting corpus: A pre-announcement
- J. Carletta, S. Ashby, S. Bourban, and M. Flynn The AMI meeting corpus: a pre-announcement Lecture notes in computer science 3869 2006 28
- (2006) Lecture Notes in Computer Science , vol.3869 , pp. 28
- Carletta, J.¹ Ashby, S.² Bourban, S.³ Flynn, M.⁴

5
- 33646783378
- Ph.D. thesis, University of Washington
- Chen, C., 2004. Noise robustness in automatic speech recognition. Ph.D. thesis, University of Washington.
- (2004) Noise Robustness in Automatic Speech Recognition
- Chen, C.¹

6
- 0036291376
- Uncertainty decoding with SPLICE for noise robust speech recognition
- J. Droppo, A. Acero, and L. Deng Uncertainty decoding with SPLICE for noise robust speech recognition IEEE International Conference On Acoustics Speech And Signal Processing, vol. 1, IEEE 1999 2002
- (1999) IEEE International Conference on Acoustics Speech and Signal Processing, Vol. 1, IEEE , pp. 2002
- Droppo, J.¹ Acero, A.² Deng, L.³

7
- 79960039131
- FLAC - Free Lossless Audio Codec, v1.1
- FLAC - Free Lossless Audio Codec, v1.1. http://flac.sourceforge.net/.

8
- 0003548585
- J.S. Garofolo, L.F. Lamel, W.M. Fisher, J.G. Fiscus, D.S. Pallett, and N.L. Dahlgren DARPA TIMIT Acoustic Phonetic Continuous Speech Corpus CDROM 1993
- (1993) DARPA TIMIT Acoustic Phonetic Continuous Speech Corpus CDROM
- Garofolo, J.S.¹ Lamel, L.F.² Fisher, W.M.³ Fiscus, J.G.⁴ Pallett, D.S.⁵ Dahlgren, N.L.⁶

9
- 85016587886
- SWITCHBOARD: Telephone speech corpus for research and development
- J. Godfrey, E. Holliman, and J. McDaniel SWITCHBOARD: telephone speech corpus for research and development ICASSP, vol. 1 1992 517 520
- (1992) ICASSP, Vol. 1 , pp. 517-520
- Godfrey, J.¹ Holliman, E.² McDaniel, J.³

10
- 0029288202
- Speech recognition in noisy environments: A survey
- Y. Gong Speech recognition in noisy environments: a survey Speech Communication 16 3 1995 261 291
- (1995) Speech Communication , vol.16 , Issue.3 , pp. 261-291
- Gong, Y.¹

11
- 0002992867
- The 1996 broadcast news speech and language-model corpus
- D. Graff, Z. Wu, R. MacIntyre, and M. Liberman The 1996 broadcast news speech and language-model corpus Proceedings of the DARPA Workshop on Spoken Language technology 1997 11 14
- (1997) Proceedings of the DARPA Workshop on Spoken Language Technology , pp. 11-14
- Graff, D.¹ Wu, Z.² MacIntyre, R.³ Liberman, M.⁴

12
- 34547540831
- An auditory neural feature extraction method for robust speech recognition
- W. Guo, L. Zhang, and B. Xia An auditory neural feature extraction method for robust speech recognition ICASSP 2007
- (2007) ICASSP
- Guo, W.¹ Zhang, L.² Xia, B.³

13
- 0000259871
- Models and selection criteria for regression and classification
- D. Heckerman, and C. Meek Models and selection criteria for regression and classification UAI 1997
- (1997) UAI
- Heckerman, D.¹ Meek, C.²

14
- 34447092407
- Subjective comparison and evaluation of speech enhancement algorithms
- DOI 10.1016/j.specom.2006.12.006, PII S0167639306001920
- Y. Hu, and P. Loizou Subjective comparison and evaluation of speech enhancement algorithms Speech Communication 49 2007 588 601 (Pubitemid 47031352)
- (2007) Speech Communication , vol.49 , Issue.7-8 , pp. 588-601
- Hu, Y.¹ Loizou, P.C.²

15
- 44949151536
- An improved mel-Wiener filter for mel-LPC based speech recognition
- M. Islam, H. Matsumoto, and K. Yamamoto An improved mel-Wiener filter for mel-LPC based speech recognition Interspeech-ICSLP 2006
- (2006) Interspeech-ICSLP
- Islam, M.¹ Matsumoto, H.² Yamamoto, K.³

16
- 0025680225
- NTIMIT: A phonetically balanced, continuous speech, telephone bandwidth speech database
- C. Jankowski NTIMIT: a phonetically balanced, continuous speech, telephone bandwidth speech database ICASSP 1990
- (1990) ICASSP
- Jankowski, C.¹

17
- 0141814662
- The ICSI meeting corpus
- A. Janin, D. Baron, J. Edwards, D. Ellis, D. Gelbart, N. Morgan, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, and C. Wooters The ICSI meeting corpus ICASSP, vol. 1 2003 364 367
- (2003) ICASSP, Vol. 1 , pp. 364-367
- Janin, A.¹ Baron, D.² Edwards, J.³ Ellis, D.⁴ Gelbart, D.⁵ Morgan, N.⁶ Peskin, B.⁷ Pfau, T.⁸ Shriberg, E.⁹ Stolcke, A.¹⁰ Wooters, C.¹¹

18
- 85187956132
- The Lombard effect: A reflex to better communicate with others in noise
- J.-C. Junqua, S. Fincke, and K. Field The Lombard effect: a reflex to better communicate with others in noise ICASSP 1999 2083 2086
- (1999) ICASSP , pp. 2083-2086
- Junqua, J.-C.¹ Fincke, S.² Field, K.³

19
- 85009074965
- Construction of speech corpus in moving car environment
- N. Kawaguchi, S. Matsubara, H. Iwa, S. Kajita, K. Takeda, F. Itakura, and Y. Inagaki Construction of speech corpus in moving car environment ICSLP 2000
- (2000) ICSLP
- Kawaguchi, N.¹ Matsubara, S.² Iwa, H.³ Kajita, S.⁴ Takeda, K.⁵ Itakura, F.⁶ Inagaki, Y.⁷

20
- 85009135251
- AVICAR: Audio-visual speech corpus in a car environment
- B. Lee, M. Hasegawa-Johnson, C. Goudeseune, S. Kamdar, S. Borys, M. Liu, and T. Huang AVICAR: audio-visual speech corpus in a car environment ICSLP 2004
- (2004) ICSLP
- Lee, B.¹ Hasegawa-Johnson, M.² Goudeseune, C.³ Kamdar, S.⁴ Borys, S.⁵ Liu, M.⁶ Huang, T.⁷

21
- 51449098446
- Cepstral domain feature compensation based on diagonal approximation
- W. Lim, C. Han, J. Shin, and N. Kim Cepstral domain feature compensation based on diagonal approximation ICASSP 2008
- (2008) ICASSP
- Lim, W.¹ Han, C.² Shin, J.³ Kim, N.⁴

22
- 0023263708
- Multi-style training for robust isolated-word speech recognition
- R. Lippmann, E. Martin, and D. Paul Multi-style training for robust isolated-word speech recognition ICASSP, vol. 12 1987 705 708 (Pubitemid 17596279)
- (1987) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , pp. 705-708
- Lippmann Richard, P.¹ Martin Edward, A.² Paul Douglas, B.³

23
- 0000874053
- Le signe de l'elevation de la voix
- pp. 101-119
- E. Lombard Le signe de l'elevation de la voix Annales des Maladies de L'Oreille et du Larynx, vol. 37 1911 pp. 101-119
- (1911) Annales des Maladies de l'Oreille et du Larynx, Vol. 37
- Lombard, E.¹

24
- 85061832597
- Albayzin speech database: Design of the phonetic corpus
- A. Moreno, D. Poch, A. Bonafonte, E. Lleida, J. Llisterri, J. Marino, and C. Nadeu Albayzin speech database: design of the phonetic corpus EUROSPEECH 1993 175 178
- (1993) EUROSPEECH , pp. 175-178
- Moreno, A.¹ Poch, D.² Bonafonte, A.³ Lleida, E.⁴ Llisterri, J.⁵ Marino, J.⁶ Nadeu, C.⁷

25
- 85009193359
- Speech in noisy environments (SPINE) adds new dimension to speech recognition R&D
- A. Schmidt-Nielsen, T. Crystal, and E. Marsh Speech in noisy environments (SPINE) adds new dimension to speech recognition R&D HLT 2002
- (2002) HLT
- Schmidt-Nielsen, A.¹ Crystal, T.² Marsh, E.³

26
- 4344712996
- Ph.D. thesis, Carnegie Mellon University
- Seltzer, M., 2003. Microphone array processing for robust speech recognition. Ph.D. thesis, Carnegie Mellon University.
- (2003) Microphone Array Processing for Robust Speech Recognition
- Seltzer, M.¹

27
- 33745828208
- Spontaneous speech: How people really talk and why engineers should care
- E. Shriberg Spontaneous speech: how people really talk and why engineers should care EUROSPEECH 2005
- (2005) EUROSPEECH
- Shriberg, E.¹

28
- 84943262548
- SoX - Sound eXchange. http://sox.sourceforge.net/.
- SoX - Sound EXchange

29
- 70349199112
- COSINE - A corpus of multi-party conversational speech in noisy environments
- A. Stupakov, E. Hanusa, J. Bilmes, and D. Fox COSINE - A corpus of multi-party conversational speech in noisy environments ICASSP 2009
- (2009) ICASSP
- Stupakov, A.¹ Hanusa, E.² Bilmes, J.³ Fox, D.⁴

30
- 85026956548
- Virtual evidence for training speech recognizers using partially labeled data
- A. Subramanya, and J. Bilmes Virtual evidence for training speech recognizers using partially labeled data HLT 2007
- (2007) HLT
- Subramanya, A.¹ Bilmes, J.²

31
- 84867197731
- Applications of virtual-evidence based speech recognizer training
- A. Subramanya, and J. Bilmes Applications of virtual-evidence based speech recognizer training Interspeech 2008
- (2008) Interspeech
- Subramanya, A.¹ Bilmes, J.²

32
- 44849086817
- Uncertainty in training large vocabulary speech recognizers
- A. Subramanya, C. Bartels, J. Bilmes, and P. Nguyen Uncertainty in training large vocabulary speech recognizers ASRU 2007
- (2007) ASRU
- Subramanya, A.¹ Bartels, C.² Bilmes, J.³ Nguyen, P.⁴

33
- 0009578471
- Ph.D. thesis, Carnegie Mellon University
- Sullivan, T., 1996. Multi-microphone correlation-based processing for robust automatic speech recognition. Ph.D. thesis, Carnegie Mellon University.
- (1996) Multi-microphone Correlation-based Processing for Robust Automatic Speech Recognition
- Sullivan, T.¹

34
- 36248936922
- UT-SCOPE-a corpus for speech under cognitive/physical task stress and emotion
- V. Varadarajan, J. Hansen, and I. Ayako UT-SCOPE-a corpus for speech under cognitive/physical task stress and emotion The Workshop Programme Corpora for Research on Emotion and Affect 2006
- (2006) The Workshop Programme Corpora for Research on Emotion and Affect
- Varadarajan, V.¹ Hansen, J.² Ayako, I.³

35
- 51449089990
- A minimum-mean-square-error noise reduction algorithm on mel-frequency cepstra for robust speech recognition
- D. Yu, L. Deng, J. Droppo, J. Wu, Y. Gong, and A. Acero A minimum-mean-square-error noise reduction algorithm on mel-frequency cepstra for robust speech recognition ICASSP 2008
- (2008) ICASSP
- Yu, D.¹ Deng, L.² Droppo, J.³ Wu, J.⁴ Gong, Y.⁵ Acero, A.⁶

36
- 33947692806
- Joint segmentation and classification of dialog acts in multiparty meetings
- M. Zimmermann, A. Stolcke, and E. Shriberg Joint segmentation and classification of dialog acts in multiparty meetings ICASSP 2006
- (2006) ICASSP
- Zimmermann, M.¹ Stolcke, A.² Shriberg, E.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.