Volume 15, Issue 8, 2007, Pages 2190-2201

Robust speech rate estimation for spontaneous speech

Author keywords

Rich speech transcription; Speech prosody; Speech rate estimation; Spontaneous speech processing

Indexed keywords

ACOUSTIC FEATURES; ALGORITHM PARAMETERS; AUTOMATED APPROACHES; CORRELATION COEFFICIENTS; DIRECT METHODS; EXPERIMENTAL EVALUATIONS; GROUND TRUTHS; MANUAL SEGMENTATIONS; MONTE CARLO SIMULATIONS; NOVEL COMPONENTS; OPTIMAL SETTINGS; PARAMETER SENSITIVITIES; PHONETIC SEGMENTATIONS; RICH SPEECH TRANSCRIPTION; ROBUST SPEECH; ROBUSTNESS ISSUES; SIGNAL CORRELATIONS; SPEECH PROSODY; SPEECH RATE ESTIMATION; SPONTANEOUS SPEECH PROCESSING; SUBBANDS; SWITCHBOARD CORPORA; TEMPORAL CORRELATIONS; TEMPORAL SIGNALS;

EID: 64149099088     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2007.905178     Document Type: Article
Times cited: 98

References (54)
  • 1
    • 33646820149
    • Improvements on speech recognition for fast talkers
    • Budapest, Hungary
    • M. Richardson, M. Hwang, A. Acero, and X. D. Huang, "Improvements on speech recognition for fast talkers," in Proc. Eurospeech, Budapest, Hungary, 1999, vol. 1, pp. 411-414.
    • (1999) Proc. Eurospeech , vol.1 , pp. 411-414
    • Richardson, M.1    Hwang, M.2    Acero, A.3    Huang, X.D.4
  • 2
    • 64149096756
    • The switchboard transcription project
    • S. Greenberg, "The switchboard transcription project," Tech. Rep., 1996 Johns Hopkins CLSP Workshop on Innovative Techniques for Large Vocabulary Continuous Speech Recognition, Baltimore, MD, 1997.
  • 3
    • 0030359629
    • Relationship between discourse structure and dynamic speech rate
    • Philadelphia, PA
    • F. J. K. Beinum and M. E. van Donzel, "Relationship between discourse structure and dynamic speech rate," in Proc. Int. Conf. Spoken Lang. Process., Philadelphia, PA, 1996, vol. 3, pp. 1724-1727.
    • (1996) Proc. Int. Conf. Spoken Lang. Process , vol.3 , pp. 1724-1727
    • Beinum, F.J.K.1    van Donzel, M.E.2
  • 4
    • 0342931849
    • Fast speakers in large vocabulary continuous speech recognition: Analysis and antidotes
    • Madrid, Spain, Sep
    • N. Mirghafori, E. Fosler, and N. Morgan, "Fast speakers in large vocabulary continuous speech recognition: Analysis and antidotes," in Proc. Eurospeech'95, Madrid, Spain, Sep. 1995, pp. 491-494.
    • (1995) Proc. Eurospeech'95 , pp. 491-494
    • Mirghafori, N.1    Fosler, E.2    Morgan, N.3
  • 5
    • 0347211260
    • Towards speech rate independence in large vocabulary continuous speech recognition
    • Seattle, WA, May
    • F. Martinez, D. Tapias, and J. Alvarez, "Towards speech rate independence in large vocabulary continuous speech recognition," in Proc. ICASSP, Seattle, WA, May 1998, pp. 725-728.
    • (1998) Proc. ICASSP , pp. 725-728
    • Martinez, F.1    Tapias, D.2    Alvarez, J.3
  • 7
    • 84866079752
    • Automatic punctuation and disfluency detection in multi-party meetings using prosodic and lexical cues
    • Denver, CO
    • D. Baron, E. Shriberg, and A. Stolcke, "Automatic punctuation and disfluency detection in multi-party meetings using prosodic and lexical cues," in Proc. Int. Conf. Spoken Lang. Process., Denver, CO, 2002, vol. 2, pp. 949-952.
    • (2002) Proc. Int. Conf. Spoken Lang. Process , vol.2 , pp. 949-952
    • Baron, D.1    Shriberg, E.2    Stolcke, A.3
  • 8
    • 85009212501
    • Automatic prosodic prominence detection in speech using acoustic features: An unsupervised system
    • Geneva, Switzerland
    • F. Tamburini, "Automatic prosodic prominence detection in speech using acoustic features: An unsupervised system," in Proc. Eurospeech'03, Geneva, Switzerland, 2003, pp. 129-132.
    • (2003) Proc. Eurospeech'03 , pp. 129-132
    • Tamburini, F.1
  • 9
    • 84892163293
    • Combining multiple estimators of speaking rate
    • N. Morgan and E. Fosler-Lussier, "Combining multiple estimators of speaking rate," in Proc. ICASSP, 1998, vol. 2, pp. 729-732.
    • (1998) Proc. ICASSP , vol.2 , pp. 729-732
    • Morgan, N.1    Fosler-Lussier, E.2
  • 10
    • 0036289926
    • Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition
    • H. Nanjo and T. Kawahara, "Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition," in Proc. ICASSP, 2002, pp. 725-728.
    • (2002) Proc. ICASSP , pp. 725-728
    • Nanjo, H.1    Kawahara, T.2
  • 12
    • 85135173867
    • Speech recognition using on-line estimation of speaking rate
    • Rhodes, Greece
    • N. Morgan, E. Fosler, and N. Mirghafori, "Speech recognition using on-line estimation of speaking rate," in Proc. Eurospeech, Rhodes, Greece, 1997, pp. 2079-2082.
    • (1997) Proc. Eurospeech , pp. 2079-2082
    • Morgan, N.1    Fosler, E.2    Mirghafori, N.3
  • 13
    • 0343802846
    • Extraction and representation rhythmic components of spontaneous speech
    • Rhodes, Greece
    • S. Kitazawa, H. Ichikawa, S. Kobayashi, and Y. Nishinuma, "Extraction and representation rhythmic components of spontaneous speech," in Proc. Eurospeech, Rhodes, Greece, 1997, pp. 641-644.
    • (1997) Proc. Eurospeech , pp. 641-644
    • Kitazawa, S.1    Ichikawa, H.2    Kobayashi, S.3    Nishinuma, Y.4
  • 14
    • 64149086532
    • Speech filing system
    • "Speech filing system." [Online]. Available: http://www.phon.ucl.ac.uk/resource/sfs/
  • 15
    • 85009064397
    • A combination of speaker normalization and speech rate normalization for automatic speech recognition
    • Beijing, China
    • T. Pfau, R. Faltlhauser, and G. Ruske, "A combination of speaker normalization and speech rate normalization for automatic speech recognition," in Proc. Int. Conf. Spoken Lang. Process., Beijing, China, 2000, vol. 4, pp. 362-365.
    • (2000) Proc. Int. Conf. Spoken Lang. Process , vol.4 , pp. 362-365
    • Pfau, T.1    Faltlhauser, R.2    Ruske, G.3
  • 16
    • 0003257037
    • The prosody of speech: Melody and rhythm
    • I. W. Hardcastle and J. Laver, Eds. Oxford, U.K, Blackwell
    • S. Nooteboom, "The prosody of speech: Melody and rhythm," in The Handbook of Phonetic Sciences, I. W. Hardcastle and J. Laver, Eds. Oxford, U.K.: Blackwell, 1997, pp. 640-673.
    • (1997) The Handbook of Phonetic Sciences , pp. 640-673
    • Nooteboom, S.1
  • 18
    • 33646765233
    • Speech rate estimation via temporal correlation and selected sub-band correlation
    • Philadelphia, PA, Mar
    • S. Narayanan and D. Wang, "Speech rate estimation via temporal correlation and selected sub-band correlation," in Proc. ICASSP, Philadelphia, PA, Mar. 2005, pp. 413-416.
    • (2005) Proc. ICASSP , pp. 413-416
    • Narayanan, S.1    Wang, D.2
  • 20
    • 0030363953
    • Syllable detection in read and spontaneous speech
    • Philadelphia, PA
    • H. R. Pfitzinger, S. Burger, and S. Heid, "Syllable detection in read and spontaneous speech," in Proc. ICSLP'96, Philadelphia, PA, 1996, vol. 2, pp. 1261-1264.
    • (1996) Proc. ICSLP'96 , vol.2 , pp. 1261-1264
    • Pfitzinger, H.R.1    Burger, S.2    Heid, S.3
  • 21
    • 33646785006
    • An unsupervised quantitative measure for word prominence in spontaneous speech
    • Philadelphia, PA, Mar
    • D. Wang and S. Narayanan, "An unsupervised quantitative measure for word prominence in spontaneous speech," in Proc. ICASSP, Philadelphia, PA, Mar. 2005, pp. 377-380.
    • (2005) Proc. ICASSP , pp. 377-380
    • Wang, D.1    Narayanan, S.2
  • 24
    • 3543047196
    • Fast and slow speech rate: A characterisation for French
    • Sydney, Australia, Dec
    • B. Zellner, "Fast and slow speech rate: A characterisation for French," in Proc. Int. Conf. Spoken Lang. Process., Sydney, Australia, Dec. 1998, vol. 7, pp. 3159-3163.
    • (1998) Proc. Int. Conf. Spoken Lang. Process , vol.7 , pp. 3159-3163
    • Zellner, B.1
  • 26
    • 0028996976
    • Timing patterns in fluent and disfluent spontaneous speech
    • D. O'Shaughnessy, "Timing patterns in fluent and disfluent spontaneous speech," in Proc. ICASSP, 1995, vol. 1, pp. 600-603.
    • (1995) Proc. ICASSP , vol.1 , pp. 600-603
    • O'Shaughnessy, D.1
  • 28
    • 0018656518
    • An approach to segmenting speech into vowel- and nonvowel-like syllables
    • Aug
    • H. Kasuya and H. Wakita, "An approach to segmenting speech into vowel- and nonvowel-like syllables," IEEE Trans. Acoust. Speech, Signal Process., vol. ASSP-27, no. 4, pp. 319-327, Aug. 1979.
    • (1979) IEEE Trans. Acoust. Speech, Signal Process , vol.ASSP-27 , Issue.4 , pp. 319-327
    • Kasuya, H.1    Wakita, H.2
  • 29
    • 0016567060
    • Automatic segmentation of speech into syllabic units
    • Oct
    • P. Mermelstein, "Automatic segmentation of speech into syllabic units," J. Acoust. Soc. Amer., vol. 58, no. 4, pp. 880-883, Oct. 1975.
    • (1975) J. Acoust. Soc. Amer , vol.58 , Issue.4 , pp. 880-883
    • Mermelstein, P.1
  • 31
    • 0021529626
    • On the application of energy contours to the recognition of connected word sequences
    • Nov
    • L. Rabiner, "On the application of energy contours to the recognition of connected word sequences," AT&T Bell Labs Tech. J., vol. 63, no. 9, pp. 1981-1995, Nov. 1984.
    • (1984) AT&T Bell Labs Tech. J , vol.63 , Issue.9 , pp. 1981-1995
    • Rabiner, L.1
  • 32
    • 0018437122
    • Automatic speech recognition using psychoacoustic models
    • Feb
    • E. Zwicker, E. E. Terhardt, and E. Paulus, "Automatic speech recognition using psychoacoustic models," J. Acoust. Soc. Amer., vol. 65, no. 2, pp. 487-498, Feb. 1979.
    • (1979) J. Acoust. Soc. Amer , vol.65 , Issue.2 , pp. 487-498
    • Zwicker, E.1    Terhardt, E.E.2    Paulus, E.3
  • 33
    • 0005453196
    • Syllable segmentation of continuous speech with artificial neural networks
    • Berlin, Germany, Sep
    • W. Reichl and G. Ruske, "Syllable segmentation of continuous speech with artificial neural networks," in Proc. Eurospeech'93, Berlin, Germany, Sep. 1993, vol. 3, pp. 1771-1774.
    • (1993) Proc. Eurospeech'93 , vol.3 , pp. 1771-1774
    • Reichl, W.1    Ruske, G.2
  • 34
    • 44849088884
    • Speech representations in the SYLK recognition project
    • M. Cook, S. Beet, and M. Crawford, Eds. New York: Wiley
    • P. Green, N. Kew, and D. Miller, "Speech representations in the SYLK recognition project," in Visual Representations of Speech Signals, M. Cook, S. Beet, and M. Crawford, Eds. New York: Wiley, 1993.
    • (1993) Visual Representations of Speech Signals
    • Green, P.1    Kew, N.2    Miller, D.3
  • 38
    • 64149108201
    • Syllabification software
    • Gaithersburg, MD: National Inst. Standards Technol
    • W. M. Fisher, "Syllabification software," in The Spoken Natural Language Processing Group. Gaithersburg, MD: National Inst. Standards Technol., 1997.
    • (1997) The Spoken Natural Language Processing Group
    • Fisher, W.M.1
  • 39
    • 85016587886
    • SWITCHBOARD: Telephone speech corpus for research and development
    • J. Godfrey, E. Holliman, and J. McDaniel, "SWITCHBOARD: Telephone speech corpus for research and development," in Proc. ICASSP'92, 1992, pp. 517-520.
    • (1992) Proc. ICASSP'92 , pp. 517-520
    • Godfrey, J.1    Holliman, E.2    McDaniel, J.3
  • 41
    • 0008758887
    • Automatic Syllable Detection for Vowel Landmarks
    • Ph.D. dissertation, Mass. Inst. Technol, Cambridge, MA
    • A. W. Howitt, "Automatic Syllable Detection for Vowel Landmarks," Ph.D. dissertation, Mass. Inst. Technol., Cambridge, MA, 2000.
    • (2000)
    • Howitt, A.W.1
  • 42
    • 84910067535
    • Interpreting symptoms of cognitive load in speech input
    • A. Berthold and A. Jameson, J. Kay, Ed
    • A. Berthold and A. Jameson, J. Kay, Ed., "Interpreting symptoms of cognitive load in speech input," in Proc. 7th Int. Conf. UM99, User Modeling, 1999, pp. 235-244.
    • (1999) Proc. 7th Int. Conf. UM99, User Modeling , pp. 235-244
  • 44
    • 0021729848
    • Articulation Rate and Its Variability in Spontaneous Speech: A Reanalysis and Some Implications
    • J. L. Miller, F. Grosjean, and L. Concetta, "Articulation Rate and Its Variability in Spontaneous Speech: A Reanalysis and Some Implications," Phonetica, vol. 41, pp. 215-225, 1984.
    • (1984) Phonetica , vol.41 , pp. 215-225
    • Miller, J.L.1    Grosjean, F.2    Concetta, L.3
  • 45
    • 44849126804
    • An acoustic measure for word prominence in spontaneous speech
    • Feb
    • D. Wang and S. Narayanan, "An acoustic measure for word prominence in spontaneous speech," IEEE Trans. Speech, Audio, Language Process., vol. 15, no. 2, pp. 690-701, Feb. 2007.
    • (2007) IEEE Trans. Speech, Audio, Language Process , vol.15 , Issue.2 , pp. 690-701
    • Wang, D.1    Narayanan, S.2
  • 46
    • 64149119032
    • A robust algorithm for pitch tracking (RAPT)
    • D. Talkin, "A robust algorithm for pitch tracking (RAPT)," in Proc. ICASSP, 1983, pp. 1352-1355.
    • (1983) Proc. ICASSP , pp. 1352-1355
    • Talkin, D.1
  • 47
    • 0003409586
    • Delft Univ. Technol, The Netherlands, Online, Available
    • I. Young, J. Gerbrands, and L. v. Vliet, "Fundamentals of image processing," Delft Univ. Technol., The Netherlands, 1998 [Online]. Available: http://www.ph.tn.tudelft.nl/Courses/FIP/noframes/fip.html
    • (1998) Fundamentals of image processing
    • Young, I.1    Gerbrands, J.2    Vliet, L.V.3
  • 48
    • 0037324538
    • Effects of disfluencies, predictability, and utterance position on word form variation in English conversation
    • A. Bell, D. Jurafsky, E. Fosler-Lussier, C. G. M. Gregory, and D. Gildea, "Effects of disfluencies, predictability, and utterance position on word form variation in English conversation," J. Acoust. Soc. Amer., vol. 113, pp. 1001-1024, 2003.
    • (2003) J. Acoust. Soc. Amer , vol.113 , pp. 1001-1024
    • Bell, A.1    Jurafsky, D.2    Fosler-Lussier, E.3    Gregory, C.G.M.4    Gildea, D.5
  • 49
    • 0031009252
    • Articulatory strengthening at the edges of prosodic domains
    • C. Fougeron and P. Keating, "Articulatory strengthening at the edges of prosodic domains," J. Acoust. Soc. Amer., vol. 101, pp. 3728-3740, 1997.
    • (1997) J. Acoust. Soc. Amer , vol.101 , pp. 3728-3740
    • Fougeron, C.1    Keating, P.2
  • 50
    • 84926271877
    • Stress-timing and syllable-timing reanalyzed
    • R. Dauer, "Stress-timing and syllable-timing reanalyzed," J. Phonetics, vol. 11, pp. 51-62, 1983.
    • (1983) J. Phonetics , vol.11 , pp. 51-62
    • Dauer, R.1
  • 51
    • 0001599591
    • Phonetic and phonological components of language rhythm
    • R. Dauer, "Phonetic and phonological components of language rhythm," in Proc. Int. Congr. Phonetic Sci., vol. 5, 1987, pp. 447-450.
    • (1987) Proc. Int. Congr. Phonetic Sci , vol.5 , pp. 447-450
    • Dauer, R.1
  • 52
    • 0003269957
    • The sonority cycle and syllable organization
    • W. Dressler, H. Luschutzky, O. Pfeiffer, and J. Rennison, Eds. Cambridge, U.K, Cambridge Univ. Press
    • G. Clements, "The sonority cycle and syllable organization," in Phonologica 1988, W. Dressler, H. Luschutzky, O. Pfeiffer, and J. Rennison, Eds. Cambridge, U.K.: Cambridge Univ. Press, 1992, pp. 63-76.
    • (1992) Phonologica 1988 , pp. 63-76
    • Clements, G.1
  • 53
    • 0001950305
    • The syllable in phonological theory
    • J. Goldsmith, Ed. Oxford, U.K, Blackwell
    • J. Blevins, "The syllable in phonological theory," in Handbook of Phonological Theory, J. Goldsmith, Ed. Oxford, U.K.: Blackwell, 1996, pp. 35-59.
    • (1996) Handbook of Phonological Theory , pp. 35-59
    • Blevins, J.1
  • 54
    • 0029748337
    • Towards robustness to fast speech in ASR
    • N. Mirghafori, E. Fosler, and N. Morgan, "Towards robustness to fast speech in ASR," in Proc. ICASSP, 1996, vol. 1, pp. 335-338.
    • (1996) Proc. ICASSP , vol.1 , pp. 335-338
    • Mirghafori, N.1    Fosler, E.2    Morgan, N.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.