Volume 11, Issue 4, 2003, Pages 321-333

Prosodic and accentual information for automatic speech recognition

Author keywords

Accentuation; Continuous speech recognition; Language models; Prosody

Indexed keywords

MARKOV PROCESSES; SPEECH PROCESSING; STATISTICAL METHODS

EID: 0042093525     PISSN: 10636676     EISSN: None     Source Type: Journal    
DOI: 10.1109/TSA.2003.814368     Document Type: Article
Times cited: (13)

References (68)
  • 2
    • 0030165438
    • Language accent classification in American English
    • L. M. Arslan and J. H. L. Hansen, "Language accent classification in American English," Speech Commun., vol. 18, pp. 353-367, 1996.
    • (1996) Speech Commun. , vol.18 , pp. 353-367
    • Arslan, L.M.1    Hansen, J.H.L.2
  • 3
    • 85009136277
    • Selective prosodic post-processing for improving recognition of French telephone numbers
    • K. Bartkova and D. Jouvet, "Selective prosodic post-processing for improving recognition of French telephone numbers," in Proc. 7th Eur. Conf. Speech Communication and Technology, vol. 1, 1999, pp. 267-270.
    • (1999) Proc. 7th Eur. Conf. Speech Communication and Technology , vol.1 , pp. 267-270
    • Bartkova, K.1    Jouvet, D.2
  • 6
    • 0042394423
    • The role of prosody in infants' native-language discrimination abilities: The case of two phonologically close languages
    • L. Bosch and N. Gallés, "The role of prosody in infants' native-language discrimination abilities: the case of two phonologically close languages," in Proc. 5th Eur. Conf. Speech Commun. Technol., vol. 1, 1997, pp. 231-234.
    • (1997) Proc. 5th Eur. Conf. Speech Commun. Technol. , vol.1 , pp. 231-234
    • Bosch, L.1    Gallés, N.2
  • 7
    • 4244089078
    • A comparative study of HMM-based approaches for the automatic recognition of perceptually relevant aspects of spontaneous German speech melody
    • C. Brindöpke, G. A. Fink, and F. Kummert, "A comparative study of HMM-based approaches for the automatic recognition of perceptually relevant aspects of spontaneous German speech melody," in Proc. 7th Eur. Conf. Speech Commun. Technol., vol. 2, 1999, pp. 699-702.
    • (1999) Proc. 7th Eur. Conf. Speech Commun. Technol. , vol.2 , pp. 699-702
    • Brindöpke, C.1    Fink, G.A.2    Kummert, F.3
  • 8
    • 4243850589
    • A HMM-based recognition system for perceptive relevant pitch movements of spontaneous German speech
    • Prosody and Emotion 6
    • C. Brindöpke, G. A. Fink, F. Kummert, and G. Sagerer, "A HMM-based recognition system for perceptive relevant pitch movements of spontaneous German speech," in 5th Int. Conf. Spoken Language Processing, 1998, Prosody and Emotion 6.
    • (1998) 5th Int. Conf. Spoken Language Processing
    • Brindöpke, C.1    Fink, G.A.2    Kummert, F.3    Sagerer, G.4
  • 9
    • 0030149810
    • Robust parametric modeling of durations in hidden Markov models
    • D. Burshtein, "Robust parametric modeling of durations in hidden Markov models," IEEE Trans. Speech Audio Processing, vol. 4, no. 3, 1996.
    • (1996) IEEE Trans. Speech Audio Processing , vol.4 , Issue.3
    • Burshtein, D.1
  • 10
    • 79951864468
    • A computational memory and processing model for prosody
    • Prosody and Emotion 2
    • J. E. Cahn, "A computational memory and processing model for prosody," in Proc. 5th Int. Conf. Spoken Language Processing, 1998, Prosody and Emotion 2.
    • (1998) Proc. 5th Int. Conf. Spoken Language Processing
    • Cahn, J.E.1
  • 13
    • 0040465429
    • Testing the meaning of four Dutch pitch accent types
    • J. Caspers, "Testing the meaning of four Dutch pitch accent types," in Proc. 5th Eur. Conf. Speech Commun. Technol., vol. 2, 1997, pp. 863-866.
    • (1997) Proc. 5th Eur. Conf. Speech Commun. Technol. , vol.2 , pp. 863-866
    • Caspers, J.1
  • 14
    • 0032073761
    • An RNN-based prosodic information synthesizer for Mandarin text-to-speech
    • S.-H. Chen, S.-H. Hwang, and Y.-R. Wang, "An RNN-based prosodic information synthesizer for Mandarin text-to-speech," IEEE Trans. Speech Audio Processing, vol. 6, no. 3, 1998.
    • (1998) IEEE Trans. Speech Audio Processing , vol.6 , Issue.3
    • Chen, S.-H.1    Hwang, S.-H.2    Wang, Y.-R.3
  • 15
    • 0030143016
    • On jointly learning the parameters in a character synchronous integrated speech and language model
    • T.-H. Chiang, Y.-C. Lin, and K.-Y. Su, "On jointly learning the parameters in a character synchronous integrated speech and language model," IEEE Trans. Speech Audio Processing, vol. 4, no. 3, 1996.
    • (1996) IEEE Trans. Speech Audio Processing , vol.4 , Issue.3
    • Chiang, T.-H.1    Lin, Y.-C.2    Su, K.-Y.3
  • 16
    • 0030120916
    • Frameworks for recognition of Mandarin syllables with tones using sub-syllabic units
    • L. Chih-Heng, W. Chien-Hsing, T. Pei-Yih, and W. Hsin-Min, "Frameworks for recognition of Mandarin syllables with tones using sub-syllabic units," Speech Commun., vol. 18, pp. 175-190, 1996.
    • (1996) Speech Commun. , vol.18 , pp. 175-190
    • Chih-Heng, L.1    Chien-Hsing, W.2    Pei-Yih, T.3    Hsin-Min, W.4
  • 17
    • 79951898003
    • Improvements in speech understanding accuracy through the integration of hierarchical linguistic, prosodic, and phonological constraints in the Jupiter domain
    • G. Chung and S. Seneff, "Improvements in speech understanding accuracy through the integration of hierarchical linguistic, prosodic, and phonological constraints in the Jupiter domain," in Proc. 5th Int. Conf. Spoken Language Processing, Spoken Language Understanding Systems, vol. 1, 1998.
    • (1998) Proc. 5th Int. Conf. Spoken Language Processing, Spoken Language Understanding Systems , vol.1
    • Chung, G.1    Seneff, S.2
  • 20
    • 0033709101
    • Detection of prosodic word boundaries by statistical modeling of mora transitions of fundamental frequency contours and its use for continuous speech recognition
    • K. Hirose and K. Iwano, "Detection of prosodic word boundaries by statistical modeling of mora transitions of fundamental frequency contours and its use for continuous speech recognition," in Proc. IEEE 25th Int. Conf. Acoustics, Speech, Signal Processing, vol. 3, 2000, pp. 1763-1766.
    • (2000) Proc. IEEE 25th Int. Conf. Acoustics, Speech, Signal Processing , vol.3 , pp. 1763-1766
    • Hirose, K.1    Iwano, K.2
  • 21
    • 0041392274
    • The prosody of broad and narrow focus in English: Two experiments
    • S. Hoskins, "The prosody of broad and narrow focus in English: Two experiments," in Proc. 5th Eur. Conf. Speech Commun. Technol., vol. 2, 1997, pp. 791-794.
    • (1997) Proc. 5th Eur. Conf. Speech Commun. Technol. , vol.2 , pp. 791-794
    • Hoskins, S.1
  • 23
    • 0032785782
    • Modeling long distance dependence in language: Topic mixtures versus dynamic cache models
    • R. M. Iyer and M. Ostendorf, "Modeling long distance dependence in language: Topic mixtures versus dynamic cache models," IEEE Trans. Speech Audio Processing, vol. 7, no. 1, 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , Issue.1
    • Iyer, R.M.1    Ostendorf, M.2
  • 27
    • 0031191419
    • The contribution of intonation, segmental durations, and spectral features to the perception of a spontaneous and a read speaking style
    • G. P. M. Laan, "The contribution of intonation, segmental durations, and spectral features to the perception of a spontaneous and a read speaking style," Speech Commun., vol. 22, pp. 43-65, 1997.
    • (1997) Speech Commun. , vol.22 , pp. 43-65
    • Laan, G.P.M.1
  • 29
    • 0032677483
    • Cantonese syllable recognition using neural networks
    • T. Lee and C. Ching, "Cantonese syllable recognition using neural networks," IEEE Trans. Speech Audio Processing, vol. 7, no. 4, 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , Issue.4
    • Lee, T.1    Ching, C.2
  • 31
    • 0020180460
    • Maximum likelihood estimation for multivariate stochastic observations of Markov chains
    • L. A. Liporace, "Maximum likelihood estimation for multivariate stochastic observations of Markov chains," IEEE Trans. Inform. Theory, vol. IT-28, no. 5, 1982.
    • (1982) IEEE Trans. Inform. Theory , vol.IT-28 , Issue.5
    • Liporace, L.A.1
  • 32
    • 0031187171
    • Speech recognition by machines and humans
    • R. P. Lippmann, "Speech recognition by machines and humans," Speech Commun., vol. 22, pp. 1-15, 1997.
    • (1997) Speech Commun. , vol.22 , pp. 1-15
    • Lippmann, R.P.1
  • 35
    • 0030204644
    • Speaker attribution of successive utterances: The role of discontinuities in voice characteristics and prosody
    • V. Lublinskaja and C. Sappok, "Speaker attribution of successive utterances: The role of discontinuities in voice characteristics and prosody," Speech Commun., vol. 19, pp. 145-159, 1996.
    • (1996) Speech Commun. , vol.19 , pp. 145-159
    • Lublinskaja, V.1    Sappok, C.2
  • 38
    • 0041392279
    • Árboles de redes neuronales autoorganizativas
    • _, "Árboles de redes neuronales autoorganizativas," Revista Mexicana de Ingeniería Biomédica, vol. 29, no. 4, 1998.
    • (1998) Revista Mexicana de Ingeniería Biomédica , vol.29 , Issue.4
  • 40
    • 85009244099
    • Suprasegmental duration modeling with elastic constraints in automatic speech recognition
    • Hidden Markov Model Techniques 3
    • L. Molloy and S. Isard, "Suprasegmental duration modeling with elastic constraints in automatic speech recognition," in Proc. 5th Int. Conf. Spoken Language Processing, 1998, Hidden Markov Model Techniques 3.
    • (1998) Proc. 5th Int. Conf. Spoken Language Processing
    • Molloy, L.1    Isard, S.2
  • 41
    • 0014055288
    • Cepstrum pitch determination
    • A. M. Noll, "Cepstrum pitch determination," J. Acoust. Soc. Amer., vol. 41, pp. 293-309, 1967.
    • (1967) J. Acoust. Soc. Amer. , vol.41 , pp. 293-309
    • Noll, A.M.1
  • 43
    • 0031074261
    • Prosody generation for German CTS/TTS systems (from theoretical intonation patterns to practical realization)
    • O. Gábor and N. Géza, "Prosody generation for German CTS/TTS systems (from theoretical intonation patterns to practical realization)," Speech Commun., vol. 21, pp. 37-60, 1997.
    • (1997) Speech Commun. , vol.21 , pp. 37-60
    • Gábor, O.1    Géza, N.2
  • 45
    • 0030205397
    • Modeling of phone duration (using the TIMIT database) and its potential benefit for ASR
    • L. C. W. Pols, X. Wang, and L. F. M. Bosch, "Modeling of phone duration (using the TIMIT database) and its potential benefit for ASR," Speech Commun., vol. 19, pp. 161-176, 1996.
    • (1996) Speech Commun. , vol.19 , pp. 161-176
    • Pols, L.C.W.1    Wang, X.2    Bosch, L.F.M.3
  • 46
    • 0031071430
    • Toward a prominence-based synthesis system
    • T. Portele and B. Heuft, "Toward a prominence-based synthesis system," Speech Commun., vol. 21, pp. 61-72, 1997.
    • (1997) Speech Commun. , vol.21 , pp. 61-72
    • Portele, T.1    Heuft, B.2
  • 47
    • 0032089995
    • A study of n-gram and decision tree letter language modeling methods
    • G. Potamianos and F. Jelinek, "A study of n-gram and decision tree letter language modeling methods," Speech Commun., vol. 24, pp. 171-192, 1998.
    • (1998) Speech Commun. , vol.24 , pp. 171-192
    • Potamianos, G.1    Jelinek, F.2
  • 48
    • 0032795155
    • Classification of Thai tone sequences in syllable-segmented speech using the analysis-by-synthesis method
    • S. Potisuk, M. P. Harper, and J. Gandour, "Classification of Thai tone sequences in syllable-segmented speech using the analysis-by-synthesis method," IEEE Trans. Speech Audio Processing, vol. 7, no. 1, 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , Issue.1
    • Potisuk, S.1    Harper, M.P.2    Gandour, J.3
  • 51
    • 0022594196
    • An introduction to hidden Markov models
    • Jan.
    • L. R. Rabiner and B. H. Juang, "An introduction to hidden Markov models," IEEE ASSP Mag., Jan. 1986.
    • (1986) IEEE ASSP Mag.
    • Rabiner, L.R.1    Juang, B.H.2
  • 54
    • 0032665603
    • A dynamical system model for generating fundamental frequency for speech synthesis
    • N. K. Ross and M. Ostendorf, "A dynamical system model for generating fundamental frequency for speech synthesis," IEEE Trans. Speech Audio Processing, vol. 7, no. 3, 1999.
    • (1999) IEEE Trans. Speech Audio Processing , vol.7 , Issue.3
    • Ross, N.K.1    Ostendorf, M.2
  • 58
    • 0031185913
    • Prosodic and lexical indications of discourse structure in human-machine interactions
    • M. Swerts and M. Ostendorf, "Prosodic and lexical indications of discourse structure in human-machine interactions," Speech Commun., vol. 22, pp. 25-41, 1997.
    • (1997) Speech Commun. , vol.22 , pp. 25-41
    • Swerts, M.1    Ostendorf, M.2
  • 59
    • 0030170432
    • Acoustic parameters for place of articulation identification and classification of Spanish unvoiced stops
    • M. I. Torres and P. Iparraguirre, "Acoustic parameters for place of articulation identification and classification of Spanish unvoiced stops," Speech Commun., vol. 18, pp. 369-379, 1996.
    • (1996) Speech Commun. , vol.18 , pp. 369-379
    • Torres, M.I.1    Iparraguirre, P.2
  • 60
    • 0033096914
    • Acoustic characteristics of lexical stress in continuous telephone speech
    • Van Kuijk and L. Boves, "Acoustic characteristics of lexical stress in continuous telephone speech," Speech Commun., vol. 27, pp. 95-111, 1999.
    • (1999) Speech Commun. , vol.27 , pp. 95-111
    • Kuijk, V.1    Boves, L.2
  • 63
    • 0032296808
    • A stochastic model of intonation for text-to-speech synthesis
    • J. Véronis, P. Di Cristo, F. Courtois, and C. Chaumette, "A stochastic model of intonation for text-to-speech synthesis," Speech Commun., vol. 26, pp. 233-244, 1998.
    • (1998) Speech Commun. , vol.26 , pp. 233-244
    • Véronis, J.1    Di Cristo, P.2    Courtois, F.3    Chaumette, C.4
  • 64
    • 85128385898
    • A study of tones and tempo in continuous Mandarin digit strings and their application in telephone quality speech recognition
    • Prosody and Emotion 2
    • C. Wang and S. Seneff, "A study of tones and tempo in continuous Mandarin digit strings and their application in telephone quality speech recognition," in Proc. 5th Int. Conf. Spoken Language Processing, 1998, Prosody and Emotion 2.
    • (1998) Proc. 5th Int. Conf. Spoken Language Processing
    • Wang, C.1    Seneff, S.2
  • 68
    • 0030181237
    • Register as a variable in prosodic analysis: The case of the English negative
    • M. Yaeger-Dror, "Register as a variable in prosodic analysis: The case of the English negative," Speech Commun., vol. 19, pp. 39-60, 1996.
    • (1996) Speech Commun. , vol.19 , pp. 39-60
    • Yaeger-Dror, M.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.