SCOPUS Information Retrieval Platform

Journal of the Acoustical Society of America

Volume 129, Issue 1, 2011, Pages 388-403

Effect of speech-intrinsic variations on human and automatic recognition of spoken phonemes

Authors (3): Meyer, Bernd T. (a); Brand, Thomas (a); Kollmeier, Birger (a)

(a) University of Oldenburg (Germany)

Author keywords

[No Author keywords available]

Indexed keywords

AUTOMATIC RECOGNITION; AUTOMATIC SPEECH RECOGNITION SYSTEM; ERROR RATE; HUMAN LISTENERS; HUMAN PERFORMANCE; HUMAN-MACHINE; INTRINSIC VARIABILITIES; NOISY ENVIRONMENT; PHONEME RECOGNITION; RECOGNITION PERFORMANCE; RECOGNITION RATES; SIGNAL TO NOISE; SPEAKING RATE; TEMPORAL CUES; TEMPORAL DYNAMICS;

CONTINUOUS SPEECH RECOGNITION; SIGNAL TO NOISE RATIO; SPEECH;

FEATURE EXTRACTION;

ARTICLE; ARTIFICIAL NEURAL NETWORK; ASSOCIATION; AUDITORY STIMULATION; AUDITORY THRESHOLD; AUTOMATED PATTERN RECOGNITION; AUTOMATIC SPEECH RECOGNITION; COMPARATIVE STUDY; FEMALE; HUMAN; MALE; NOISE; PATTERN RECOGNITION; PERCEPTION; PHONETICS; PURE TONE AUDIOMETRY; RECOGNITION; SPEECH; SPEECH AUDIOMETRY; SPEECH PERCEPTION; TIME; TIME PERCEPTION;

ACOUSTIC STIMULATION; AUDIOMETRY, PURE-TONE; AUDIOMETRY, SPEECH; AUDITORY THRESHOLD; CUES; FEMALE; HUMANS; MALE; NEURAL NETWORKS (COMPUTER); NOISE; PATTERN RECOGNITION, AUTOMATED; PATTERN RECOGNITION, PHYSIOLOGICAL; PERCEPTUAL MASKING; PHONETICS; RECOGNITION (PSYCHOLOGY); SPEECH ACOUSTICS; SPEECH PERCEPTION; SPEECH RECOGNITION SOFTWARE; TIME FACTORS; TIME PERCEPTION;

EID: 79551679242 PISSN: 00014966 EISSN: None Source Type: Journal
DOI: 10.1121/1.3514525 Document Type: Article

Times cited : (39)

References (34)

1
- 34247568840
- Modelling speaker intelligibility in noise
- DOI 10.1016/j.specom.2006.11.003, PII S0167639306001701, Bridging the Gap between Human and Automatic Speech Recognition
- Barker, J., and Cooke, M. (2007). Modelling speaker intelligibility in noise., Speech Commun. 49, 402-417. 10.1016/j.specom.2006.11.003 (Pubitemid 46670363)
- (2007) Speech Communication , vol.49 , Issue.5 , pp. 402-417
- Barker, J.¹ Cooke, M.²

2
- 34547941599
- Automatic speech recognition and speech variability: A review
- DOI 10.1016/j.specom.2007.02.006, PII S0167639307000404
- Benzeghiba, M., De Mori, R., Deroo, O., Dupont, S., Erbes, T., Jouvet, D., Fissore, L., Laface, P., Mertins, A., Ris, C., Rose, R., Tyagi, V., and Wellekens, C. (2007). Automatic speech recognition and speech variability: A review., Speech Commun. 49, 763-786. 10.1016/j.specom.2007.02.006 (Pubitemid 47268571)
- (2007) Speech Communication , vol.49 , Issue.10-11 , pp. 763-786
- Benzeghiba, M.¹ De Mori, R.² Deroo, O.³ Dupont, S.⁴ Erbes, T.⁵ Jouvet, D.⁶ Fissore, L.⁷ Laface, P.⁸ Mertins, A.⁹ Ris, C.¹⁰ Rose, R.¹¹ Tyagi, V.¹² Wellekens, C.¹³

3
- 0027465489
- A model for context effects in speech recognition
- 10.1121/1.406844
- Bronkhorst, A. W., Bosman, A. J., and Smoorenburg, G. F. (1993). A model for context effects in speech recognition., J. Acoust. Soc. Am. 93, 499-509. 10.1121/1.406844
- (1993) J. Acoust. Soc. Am. , vol.93 , pp. 499-509
- Bronkhorst, A.W.¹ Bosman, A.J.² Smoorenburg, G.F.³

4
- 33745213565
- A speech similarity distance weighting for robust recognition
- Carey, M. J., and Quang, T. P. (2005). A speech similarity distance weighting for robust recognition., in Proceedings of Interspeech, Lisbon, Portugal, pp. 1257-1260.
- (2005) Proceedings of Interspeech , pp. 1257-1260
- Carey, M.J.¹ Quang, T.P.²

5
- 33644661135
- A glimpsing model of speech perception in noise
- 10.1121/1.2166600
- Cooke, M. (2006). A glimpsing model of speech perception in noise., J. Acoust. Soc. Am. 119, 1562-1573. 10.1121/1.2166600
- (2006) J. Acoust. Soc. Am. , vol.119 , pp. 1562-1573
- Cooke, M.¹

6
- 70450178921
- The Interspeech 2008 consonant challenge
- Cooke, M., and Scharenborg, O. (2008). The Interspeech 2008 consonant challenge., in Proceedings of Interspeech, pp. 1781-1784.
- (2008) Proceedings of Interspeech , pp. 1781-1784
- Cooke, M.¹ Scharenborg, O.²

7
- 0019053271
- Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- 10.1109/TASSP.1980.1163420
- Davis, S., and Mermelstein, P. (1980). Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences., IEEE Trans. Acoust. Speech. Signal Process. 28, 357-366. 10.1109/TASSP.1980.1163420
- (1980) IEEE Trans. Acoust. Speech. Signal Process. , vol.28 , pp. 357-366
- Davis, S.¹ Mermelstein, P.²

8
- 84988831195
- Synthesizing speech from speech recognition parameters
- Demuynck, K., Garcia, O., and Van Compernolle, D. (2004). Synthesizing speech from speech recognition parameters., in Proceedings of Interspeech, pp. II-945-II-948.
- (2004) Proceedings of Interspeech
- Demuynck, K.¹ Garcia, O.² Van Compernolle, D.³

9
- 0034920512
- ICRA noises: Artificial noise signals with speech-like spectral and temporal properties for hearing instrument assessment
- 10.3109/00206090109073110
- Dreschler, W. A., Verschuure, H., Ludvigson, C., and Westermann, S. (2001). ICRA noises: Artificial noise signals with speech-like spectral and temporal properties for hearing instrument assessment., Int. J. Audiol. 40, 148-157. 10.3109/00206090109073110
- (2001) Int. J. Audiol. , vol.40 , pp. 148-157
- Dreschler, W.A.¹ Verschuure, H.² Ludvigson, C.³ Westermann, S.⁴

10
- 0024909979
- Some statistical issues in the comparison of speech recognition algorithms
- Gillick, L., and Cox, S. J. (1989). Some statistical issues in the comparison of speech recognition algorithms., in Proceedings of the 1989 International Conference on Acoustics, Speech, and Signal Processing, Glasgow, United Kingdom, pp. 532-535.
- (1989) Proceedings of the 1989 International Conference on Acoustics, Speech, and Signal Processing , pp. 532-535
- Gillick, L.¹ Cox, S.J.²

11
- 0021407831
- Signal estimation from modified short-time Fourier transform
- Griffin, D., and Lim, J. (1984). Signal estimation from modified short-time Fourier transform., IEEE Trans. Acoust. Speech. Signal Process. 32, 236-243. 10.1109/TASSP.1984.1164317 (Pubitemid 14608418)
- (1984) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-32 , Issue.2 , pp. 236-243
- Griffin, D.¹ Lim, J.²

12
- 0345443169
- Roles and representations of systematic fine phonetic detail in speech understanding
- DOI 10.1016/j.wocn.2003.09.006
- Hawkins, S. (2003). Roles and representations of systematic fine phonetic detail in speech understanding., J. Phonetics 31, 373-405. 10.1016/j.wocn.2003.09.006 (Pubitemid 37495914)
- (2003) Journal of Phonetics , vol.31 , Issue.3-4 , pp. 373-405
- Hawkins, S.¹

13
- 0032658253
- Temporal patterns (TRAPS) in ASR of noisy speech
- Hermansky, H., and Sharma, S. (1999). Temporal patterns (TRAPS) in ASR of noisy speech., in Proceedings of the 1999 International Conference on Acoustics, Speech, and Signal Processing, Phoenix, Arizona, pp. 289-292.
- (1999) Proceedings of the 1999 International Conference on Acoustics, Speech, and Signal Processing , pp. 289-292
- Hermansky, H.¹ Sharma, S.²

14
- 0029060033
- Acoustic characteristics of American English vowels
- 10.1121/1.411872
- Hillenbrand, J., Getty, L., Clark, M., and Wheeler, K. (1995). Acoustic characteristics of American English vowels., J. Acoust. Soc. Am. 97, 3099-3111. 10.1121/1.411872
- (1995) J. Acoust. Soc. Am. , vol.97 , pp. 3099-3111
- Hillenbrand, J.¹ Getty, L.² Clark, M.³ Wheeler, K.⁴

15
- 0027465491
- The Lombard reflex and its role on human listeners and automatic speech recognizers
- 10.1121/1.405631
- Junqua, J.-C. (1993). The Lombard reflex and its role on human listeners and automatic speech recognizers., J. Acoust. Soc. Am. 93, 510-524. 10.1121/1.405631
- (1993) J. Acoust. Soc. Am. , vol.93 , pp. 510-524
- Junqua, J.-C.¹

16
- 79551666547
- Predicting consonant recognition in quiet for listeners with normal hearing and hearing impairment using an auditory model (A)
- Jürgens, T., Brand, T., and Kollmeier, B. (2009). Predicting consonant recognition in quiet for listeners with normal hearing and hearing impairment using an auditory model (A)., J. Acoust. Soc. Am. 125, 2533.
- (2009) J. Acoust. Soc. Am. , vol.125 , pp. 2533
- Jürgens, T.¹ Brand, T.² Kollmeier, B.³

17
- 0030362970
- Automatic detection and segmentation of pronunciation variants in German speech corpora
- Kipp, A., Wesenick, M., and Schiel, F. (1996). Automatic detection and segmentation of pronunciation variants in German speech corpora., in Proceedings of the 1996 International Conference on Spoken Language Processing, Philadelphia, Pennsylvania, pp. 106-109.
- (1996) Proceedings of the 1996 International Conference on Spoken Language Processing , pp. 106-109
- Kipp, A.¹ Wesenick, M.² Schiel, F.³

18
- 0003545504
- Einführung in die Phonetik des Deutschen (Introduction to German phonetics)
- Kohler, K. (1995). Einführung in die Phonetik des Deutschen (Introduction to German phonetics) (Erich Schmidt Verlag, Berlin), pp. 1-249.
- (1995) Einführung in die Phonetik des Deutschen (Introduction to German Phonetics) , pp. 1-249
- Kohler, K.¹

19
- 0242358162
- The effects of speaking rate on the intelligibility of speech for various speaking modes (A)
- 10.1121/1.413900
- Krause, J. C., and Braida, L. D. (1995). The effects of speaking rate on the intelligibility of speech for various speaking modes (A)., J. Acoust. Soc. Am. 98, 2982. 10.1121/1.413900
- (1995) J. Acoust. Soc. Am. , vol.98 , pp. 2982
- Krause, J.C.¹ Braida, L.D.²

20
- 1642499127
- Acoustic properties of naturally produced clear speech at normal speaking rates
- 10.1121/1.1635842
- Krause, J. C., and Braida, L. D. (2004). Acoustic properties of naturally produced clear speech at normal speaking rates., J. Acoust. Soc. Am. 115, 362-378. 10.1121/1.1635842
- (2004) J. Acoust. Soc. Am. , vol.115 , pp. 362-378
- Krause, J.C.¹ Braida, L.D.²

21
- 0021226391
- A database for speaker-independent digit recognition
- Leonard, R. (1984). A database for speaker-independent digit recognition., in Proceedings of the 1984 International Conference on Acoustics, Speech, and Signal Processing, Vol. IX, pp. 328-331.
- (1984) Proceedings of the 1984 International Conference on Acoustics, Speech, and Signal Processing , vol.9 , pp. 328-331
- Leonard, R.¹

22
- 0031187171
- Speech recognition by machines and humans
- 10.1016/S0167-6393(97)00021-6
- Lippmann, R. (1997). Speech recognition by machines and humans., Speech Commun. 22, 1-15. 10.1016/S0167-6393(97)00021-6
- (1997) Speech Commun. , vol.22 , pp. 1-15
- Lippmann, R.¹

23
- 56149102452
- A human-machine comparison in speech recognition based on a logatome corpus
- Meyer, B., and Wesker, T. (2006). A human-machine comparison in speech recognition based on a logatome corpus., in Workshop on Speech-Intrinsic Variation, pp. 95-100.
- (2006) Workshop on Speech-Intrinsic Variation , pp. 95-100
- Meyer, B.¹ Wesker, T.²

24
- 84933250500
- The intelligibility of interrupted speech
- 10.1121/1.1906584
- Miller, G. A., and Licklider, J. (1950). The intelligibility of interrupted speech., J. Acoust. Soc. Am. 22, 167-173. 10.1121/1.1906584
- (1950) J. Acoust. Soc. Am. , vol.22 , pp. 167-173
- Miller, G.A.¹ Licklider, J.²

25
- 84955023511
- An analysis of perceptual confusions among some English consonants
- 10.1121/1.1907526
- Miller, G. A., and Nicely, P. E. (1955). An analysis of perceptual confusions among some English consonants., J. Acoust. Soc. Am. 27, 338-352. 10.1121/1.1907526
- (1955) J. Acoust. Soc. Am. , vol.27 , pp. 338-352
- Miller, G.A.¹ Nicely, P.E.²

26
- 0032665650
- On the limits of speech recognition in noise
- Peters, S., Stubley, P., and Valin, J. (1999). On the limits of speech recognition in noise., in Proceedings of the 1999 International Conference on Acoustics, Speech, and Signal Processing, Phoenix, Arizona, pp. 365-368.
- (1999) Proceedings of the 1999 International Conference on Acoustics, Speech, and Signal Processing , pp. 365-368
- Peters, S.¹ Stubley, P.² Valin, J.³

27
- 34047247534
- Consonant and vowel confusions in speech-weighted noise
- DOI 10.1121/1.2642397
- Phatak, S. A., and Allen, J. B. (2007). Consonant and vowel confusions in speech-weighted noise., J. Acoust. Soc. Am. 121, 2312-2326. 10.1121/1.2642397 (Pubitemid 46548430)
- (2007) Journal of the Acoustical Society of America , vol.121 , Issue.4 , pp. 2312-2326
- Phatak, S.A.¹ Allen, J.B.²

28
- 34247580087
- Reaching over the gap: A review of efforts to link human and automatic speech recognition research
- DOI 10.1016/j.specom.2007.01.009, PII S0167639307000106, Bridging the Gap between Human and Automatic Speech Recognition
- Scharenborg, O. (2007). Reaching over the gap: A review of efforts to link human and automatic speech recognition research., Speech Commun. 49, 336-347. 10.1016/j.specom.2007.01.009 (Pubitemid 46670364)
- (2007) Speech Communication , vol.49 , Issue.5 , pp. 336-347
- Scharenborg, O.¹

29
- 0034843163
- Using phase spectrum information for improved speech recognition performance
- Schlüter, R., and Ney, H. (2001). Using phase spectrum information for improved speech recognition performance., in Proceedings of the 2001 International Conference on Acoustics, Speech, and Signal Processing, Salt Lake City, Utah, pp. 133-136.
- (2001) Proceedings of the 2001 International Conference on Acoustics, Speech, and Signal Processing , pp. 133-136
- Schlüter, R.¹ Ney, H.²

30
- 84867215557
- Two protocols comparing human and machine phonetic recognition performance in conversational speech
- Shen, W., Olive, J., and Jones, D. (2008). Two protocols comparing human and machine phonetic recognition performance in conversational speech., in Proceedings of Interspeech, pp. 1630-1633.
- (2008) Proceedings of Interspeech , pp. 1630-1633
- Shen, W.¹ Olive, J.² Jones, D.³

31
- 0021143595
- A procedure for phonetic transcription by consensus
- Shriberg, L. D., Kwiatkowski, J., and Hoffmann, K. (1984). A procedure for phonetic transcription by consensus., J. Speech Lang. Hear. Res. 27, 456-465.
- (1984) J. Speech Lang. Hear. Res. , vol.27 , pp. 456-465
- Shriberg, L.D.¹ Kwiatkowski, J.² Hoffmann, K.³

32
- 15844428932
- Human and machine consonant recognition
- DOI 10.1016/j.specom.2004.11.009, PII S0167639304001499
- Sroka, J. J., and Braida, L. D. (2005). Human and machine consonant recognition., Speech Commun. 45, 401-423. 10.1016/j.specom.2004.11.009 (Pubitemid 40423287)
- (2005) Speech Communication , vol.45 , Issue.4 , pp. 401-423
- Sroka, J.J.¹ Braida, L.D.²

33
- 0002788784
- Signal processing for robust speech recognition
- Stern, R., Acero, A., Liu, F. H., and Ohshima, Y. (1996). Signal processing for robust speech recognition., in Automatic Speech and Speaker Recognition, edited by C.-H. Lee, F. K. Soong, and K. K. Paliwal (Springer, Berlin), pp. 357-384.
- (1996) Automatic Speech and Speaker Recognition , pp. 357-384
- Stern, R.¹ Acero, A.² Liu, F.H.³ Ohshima, Y.⁴

34
- 33745183789
- Oldenburg Logatome Speech Corpus (OLLO) for speech recognition experiments with humans and machines
- Wesker, T., Meyer, B., Wagener, K., Anemueller, J., Mertins, A., and Kollmeier, B. (2005). Oldenburg Logatome Speech Corpus (OLLO) for speech recognition experiments with humans and machines., in Proceedings of Interspeech, pp. 1273-1276.
- (2005) Proceedings of Interspeech , pp. 1273-1276
- Wesker, T.¹ Meyer, B.² Wagener, K.³ Anemueller, J.⁴ Mertins, A.⁵ Kollmeier, B.⁶

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.