



Volume 14, Issue 1, 2006, Pages 256-265

A bidirectional target-filtering model of speech coarticulation and reduction: Two-stage implementation for phonetic recognition

Author keywords

Cepstral dynamics; Contextual assimilation; Filtering of targets; Formant dynamics; Long span context dependence; Phonetic recognition; Phonetic reduction; Resonances; TIMIT

Indexed keywords

ACOUSTIC WAVES; FIR FILTERS; IMPULSE RESPONSE; MARKOV PROCESSES; MATHEMATICAL MODELS; RESONANCE;

EID: 33744966561     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TSA.2005.854107     Document Type: Conference Paper
Times cited : (26)

References (33)
  • 1
    • 0039046406
    • Coarticulation modeling with continuous-state HMMs
    • New York
    • R. Bakis, "Coarticulation modeling with continuous-state HMMs," in Proc. IEEE Workshop Automatic Speech Recognition, New York, 1991, pp. 20-21.
    • (1991) Proc. IEEE Workshop Automatic Speech Recognition , pp. 20-21
    • Bakis, R.1
  • 2
    • 0037841402
    • Graphical models and automatic speech recognition
    • M. Johnson, M. Ostendorf, S. Khudanpur, and R. Rosenfeld, Eds. New York: Springer-Verlag
    • J. Bilmes, "Graphical models and automatic speech recognition," in Mathematical Foundations of Speech and Language Processing, M. Johnson, M. Ostendorf, S. Khudanpur, and R. Rosenfeld, Eds. New York: Springer-Verlag, 2004, pp. 135-186.
    • (2004) Mathematical Foundations of Speech and Language Processing , pp. 135-186
    • Bilmes, J.1
  • 4
    • 0034295822
    • Structured language modeling
    • Oct.
    • C. Chelba and F. Jelinek, "Structured language modeling," Comput. Speech Lang., pp. 283-332, Oct. 2000.
    • (2000) Comput. Speech Lang. , pp. 283-332
    • Chelba, C.1    Jelinek, F.2
  • 5
    • 0026854213
    • A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal
    • L. Deng, "A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal," Signal Process., vol. 27, pp. 65-78, 1992.
    • (1992) Signal Process. , vol.27 , pp. 65-78
    • Deng, L.1
  • 6
    • 0032119268
    • A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition
    • L. Deng, "A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition," Speech Commun., vol. 24, no. 4, pp. 299-323, 1998.
    • (1998) Speech Commun. , vol.24 , Issue.4 , pp. 299-323
  • 7
    • 0039503389
    • Computational models for speech production
    • K. Ponting, Ed. Berlin, Germany: Springer-Verlag
    • L. Deng, "Computational models for speech production," in Computational Models of Speech Pattern Processing, K. Ponting, Ed. Berlin, Germany: Springer-Verlag, 1999, pp. 199-213.
    • (1999) Computational Models of Speech Pattern Processing , pp. 199-213
  • 8
    • 33744966595
    • Switching dynamic system models for speech articulation and acoustics
    • M. Johnson, M. Ostendorf, S. Khudanpur, and R. Rosenfeld, Eds. New York: Springer-Verlag
    • L. Deng, "Switching dynamic system models for speech articulation and acoustics," in Mathematical Foundations of Speech and Language Processing, M. Johnson, M. Ostendorf, S. Khudanpur, and R. Rosenfeld, Eds. New York: Springer-Verlag, 2004, pp. 115-134.
    • (2004) Mathematical Foundations of Speech and Language Processing , pp. 115-134
  • 10
    • 0028088646
    • Context-dependent Markov model structured by locus equations: Applications to phonetic classification
    • Oct.
    • L. Deng and D. Braam, "Context-dependent Markov model structured by locus equations: Applications to phonetic classification," J. Acoust. Soc. Amer., vol. 96, pp. 2008-2025, Oct. 1994.
    • (1994) J. Acoust. Soc. Amer. , vol.96 , pp. 2008-2025
    • Deng, L.1    Braam, D.2
  • 11
    • 4544323815
    • A structured speech model with continuous hidden dynamics and prediction-residual training for tracking vocal tract resonances
    • May
    • L. Deng, L. Lee, H. Attias, and A. Acero, "A structured speech model with continuous hidden dynamics and prediction-residual training for tracking vocal tract resonances," in Proc. IEEE ICASSP, vol. I, May 2004, pp. 557-560.
    • (2004) Proc. IEEE ICASSP , vol.1 , pp. 557-560
    • Deng, L.1    Lee, L.2    Attias, H.3    Acero, A.4
  • 12
    • 33745005721
    • Tracking vocal tract resonances using a quantized nonlinear function embedded in a temporal constraint
    • to be published
    • L. Deng, A. Acero, and I. Bazzi, "Tracking vocal tract resonances using a quantized nonlinear function embedded in a temporal constraint," IEEE Trans. Speech Audio Process., to be published.
    • IEEE Trans. Speech Audio Process.
    • Deng, L.1    Acero, A.2    Bazzi, I.3
  • 13
    • 0030638031
    • A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)
    • J. Fiscus, "A post-processing system to yield reduced word error rates: Recognizer output voting error reduction (ROVER)," in Proc. Automatic Speech Recognition and Understanding, 1997, pp. 347-354.
    • (1997) Proc. Automatic Speech Recognition and Understanding , pp. 347-354
    • Fiscus, J.1
  • 14
    • 85009110670
    • Multistage coarticulation model combining articulatory, formant and cepstral features
    • Y. Gao, R. Bakis, J. Huang, and B. Zhang, "Multistage coarticulation model combining articulatory, formant and cepstral features," in Proc. ICSLP, vol. 1, 2000, pp. 25-28.
    • (2000) Proc. ICSLP , vol.1 , pp. 25-28
    • Gao, Y.1    Bakis, R.2    Huang, J.3    Zhang, B.4
  • 15
    • 0017813672
    • Effect of speaking rate on vowel formant movements
    • T. Gay, "Effect of speaking rate on vowel formant movements," J. Acoust. Soc. Amer., vol. 63, pp. 223-230, 1978.
    • (1978) J. Acoust. Soc. Amer. , vol.63 , pp. 223-230
    • Gay, T.1
  • 16
    • 85009287827
    • Parametric trajectory mixtures for LVCSR
    • Sydney, Australia
    • M. Siu, R. Iyer, H. Gish, and C. Quillen, "Parametric trajectory mixtures for LVCSR," in Proc. ICSLP, Sydney, Australia, 1998, pp. 3269-3272.
    • (1998) Proc. ICSLP , pp. 3269-3272
    • Siu, M.1    Iyer, R.2    Gish, H.3    Quillen, C.4
  • 17
    • 84930566519
    • Streams, phones, and transitions: Toward a new phonological and phonetic model of formant timing
    • S. Hertz, "Streams, phones, and transitions: Toward a new phonological and phonetic model of formant timing," J. Phonet., vol. 19, pp. 91-109, 1991.
    • (1991) J. Phonet. , vol.19 , pp. 91-109
    • Hertz, S.1
  • 18
    • 84942397864
    • Spectrographic study of vowel reduction
    • B. Lindblom, "Spectrographic study of vowel reduction," J. Acoust. Soc. Amer., vol. 35, pp. 1773-1781, 1963.
    • (1963) J. Acoust. Soc. Amer. , vol.35 , pp. 1773-1781
    • Lindblom, B.1
  • 19
    • 0032673963
    • Probabilistic-trajectory segmental HMMs
    • W. Holmes and M. Russell, "Probabilistic-trajectory segmental HMMs," Comput. Speech Lang., vol. 13, pp. 3-37, 1999.
    • (1999) Comput. Speech Lang. , vol.13 , pp. 3-37
    • Holmes, W.1    Russell, M.2
  • 20
    • 0018986665
    • Software for a cascade/parallel formant synthesizer
    • D. Klatt, "Software for a cascade/parallel formant synthesizer," J. Acoust. Soc. Amer., vol. 67, no. 3, pp. 971-995, 1980.
    • (1980) J. Acoust. Soc. Amer. , vol.67 , Issue.3 , pp. 971-995
    • Klatt, D.1
  • 21
    • 0000665734
    • Explaining phonetic variation: A sketch of the H & H theory
    • W. Hardcastle and A. Marchal, Eds. Norwell, MA: Kluwer
    • B. Lindblom, "Explaining phonetic variation: A sketch of the H & H theory," in Speech Production and Speech Modeling, W. Hardcastle and A. Marchal, Eds. Norwell, MA: Kluwer, 1990, pp. 403-439.
    • (1990) Speech Production and Speech Modeling , pp. 403-439
    • Lindblom, B.1
  • 22
    • 0347968275
    • Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model for vocal-tract-resonance dynamics
    • Nov.
    • J. Ma and L. Deng, "Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model for vocal-tract-resonance dynamics," IEEE Trans. Speech Audio Process., vol. 11, no. 6, pp. 590-602, Nov. 2003.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.6 , pp. 590-602
    • Ma, J.1    Deng, L.2
  • 23
    • 0028239529
    • Interaction between duration, context, and speaking style in English stressed vowels
    • S. Moon and B. Lindblom, "Interaction between duration, context, and speaking style in English stressed vowels," J. Acoust. Soc. Amer., vol. 96, pp. 40-55, 1994.
    • (1994) J. Acoust. Soc. Amer. , vol.96 , pp. 40-55
    • Moon, S.1    Lindblom, B.2
  • 24
    • 0030245363
    • From HMM's to segment models: A unified view of stochastic modeling for speech recognition
    • Sep.
    • M. Ostendorf, V. Digalakis, and J. Rohlicek, "From HMM's to segment models: A unified view of stochastic modeling for speech recognition," IEEE Trans. Speech Audio Process., vol. 4, no. 5, pp. 360-378, Sep. 1996.
    • (1996) IEEE Trans. Speech Audio Process. , vol.4 , Issue.5 , pp. 360-378
    • Ostendorf, M.1    Digalakis, V.2    Rohlicek, J.3
  • 25
    • 0034047363
    • Effect of speaking rate and contrastive stress on formant dynamics and vowel perception
    • M. Pitermann, "Effect of speaking rate and contrastive stress on formant dynamics and vowel perception," J. Acoust. Soc. Amer., vol. 107, pp. 3425-3437, 2000.
    • (2000) J. Acoust. Soc. Amer. , vol.107 , pp. 3425-3437
    • Pitermann, M.1
  • 26
    • 33744982649
    • Psycho-acoustics and speech perception
    • K. Ponting, Ed. Berlin, Germany: Springer-Verlag
    • L. Pols, "Psycho-acoustics and speech perception," in Computational Models of Speech Pattern Processing, K. Ponting, Ed. Berlin, Germany: Springer-Verlag, 1999, pp. 10-17.
    • (1999) Computational Models of Speech Pattern Processing , pp. 10-17
    • Pols, L.1
  • 27
    • 0030008004
    • The potential role of speech production models in automatic speech recognition
    • R. Rose, J. Schroeter, and M. Sondhi, "The potential role of speech production models in automatic speech recognition," J. Acoust. Soc. Amer., vol. 99, pp. 1699-1709, 1996.
    • (1996) J. Acoust. Soc. Amer. , vol.99 , pp. 1699-1709
    • Rose, R.1    Schroeter, J.2    Sondhi, M.3
  • 28
    • 84936526529
    • On the quantal nature of speech
    • K. Stevens, "On the quantal nature of speech," J. Phonet., vol. 17, pp. 3-45, 1989.
    • (1989) J. Phonet. , vol.17 , pp. 3-45
    • Stevens, K.1
  • 29
    • 0036165806
    • An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition
    • Feb.
    • J. Sun and L. Deng, "An overlapping-feature based phonological model incorporating linguistic constraints: Applications to speech recognition," J. Acoust. Soc. Amer., vol. 111, no. 2, pp. 1086-1101, Feb. 2002.
    • (2002) J. Acoust. Soc. Amer. , vol.111 , Issue.2 , pp. 1086-1101
    • Sun, J.1    Deng, L.2
  • 30
    • 0027554395
    • Acoustic vowel reduction as a function of sentence accent, word stress and word class
    • D. van Bergem, "Acoustic vowel reduction as a function of sentence accent, word stress and word class," Speech Commun., vol. 12, pp. 1-12, 1993.
    • (1993) Speech Commun. , vol.12 , pp. 1-12
    • Van Bergem, D.1
  • 31
    • 4544383109
    • The use of a linguistically motivated language model in conversational speech recognition
    • May
    • W. Wang, A. Stolcke, and M. Harper, "The use of a linguistically motivated language model in conversational speech recognition," in Proc. IEEE ICASSP, vol. I, May 2004, pp. 261-264.
    • (2004) Proc. IEEE ICASSP , vol.1 , pp. 261-264
    • Wang, W.1    Stolcke, A.2    Harper, M.3
  • 32
    • 0035124445
    • Control of spectral dynamics in concatenative speech synthesis
    • Jan.
    • J. Wouters and M. Macon, "Control of spectral dynamics in concatenative speech synthesis," IEEE Trans. Speech Audio Process., vol. 9, no. 1, pp. 30-38, Jan. 2001.
    • (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.1 , pp. 30-38
    • Wouters, J.1    Macon, M.2
  • 33
    • 0141478988
    • Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM
    • Apr.
    • J. Zhou, F. Seide, and L. Deng, "Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM," in Proc. IEEE ICASSP, vol. I, Apr. 2003, pp. 744-747.
    • (2003) Proc. IEEE ICASSP , vol.1 , pp. 744-747
    • Zhou, J.1    Seide, F.2    Deng, L.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.