Volume 135, Issue 5, 2014, Pages 2885-2901

Robust fundamental frequency estimation in sustained vowels: Detailed algorithmic comparisons and information fusion with adaptive Kalman filtering

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; KALMAN FILTERS; LINGUISTICS; NATURAL FREQUENCIES; PHYSIOLOGICAL MODELS; SIGNAL PROCESSING;

EID: 84900419715     PISSN: 00014966     EISSN: None     Source Type: Journal    
DOI: 10.1121/1.4870484     Document Type: Article
Times cited : (55)

References (45)
  • 2
    • 0042218299
    • 2nd ed. (National Center for Voice and Speech, Iowa City)
    • I. R. Titze, Principles of Voice Production, 2nd ed. (National Center for Voice and Speech, Iowa City, 2000).
    • (2000) Principles of Voice Production
    • Titze, I.R.1
  • 3
    • 0001455934
    • A robust algorithm for pitch tracking
    • edited by W. B. Kleijn and K. K. Paliwal (Elsevier Science, Philadelphia), Chap. 14
    • D. Talkin, "A robust algorithm for pitch tracking," in Speech Coding and Synthesis, edited by W. B. Kleijn and K. K. Paliwal (Elsevier Science, Philadelphia, 1995), Chap. 14, pp. 495-518.
    • (1995) Speech Coding and Synthesis , pp. 495-518
    • Talkin, D.1
  • 4
    • 33747143714
    • Frequency and voice: Perspectives in the time domain
    • 10.1016/j.jvoice.2005.12.009
    • R. M. Roark, "Frequency and voice: Perspectives in the time domain," J. Voice 20, 325-354 (2006). 10.1016/j.jvoice.2005.12.009
    • (2006) J. Voice , vol.20 , pp. 325-354
    • Roark, R.M.1
  • 5
    • 0032925301
    • A comparison of high precision F0 extraction algorithms for sustained vowels
    • 10.1044/jslhr.4201.112
    • V. Parsa and D. G. Jamieson, "A comparison of high precision F0 extraction algorithms for sustained vowels," J. Speech Lang. Hear. Res. 42, 112-126 (1999). 10.1044/jslhr.4201.112
    • (1999) J. Speech Lang. Hear. Res. , vol.42 , pp. 112-126
    • Parsa, V.1    Jamieson, D.G.2
  • 6
    • 0027330001
    • Comparison of F0 extraction methods for high-precision voice perturbation measurements
    • I. R. Titze and H. Liang, "Comparison of F0 extraction methods for high-precision voice perturbation measurements," J. Speech Hear. Res. 36, 1120-1133 (1993).
    • (1993) J. Speech Hear. Res. , vol.36 , pp. 1120-1133
    • Titze, I.R.1    Liang, H.2
  • 8
    • 84861025901
    • Perturbation measurements in highly irregular voice signals: Performance/validity of analysis software tools
    • 10.1016/j.bspc.2011.06.004
    • C. Manfredi, A. Giordano, J. Schoentgen, S. Fraj, L. Bocchi, and P. H. Dejonckere, "Perturbation measurements in highly irregular voice signals: Performance/validity of analysis software tools," Biomed. Signal Process. Control 7, 409-416 (2012). 10.1016/j.bspc.2011.06.004
    • (2012) Biomed. Signal Process. Control , vol.7 , pp. 409-416
    • Manfredi, C.1    Giordano, A.2    Schoentgen, J.3    Fraj, S.4    Bocchi, L.5    Dejonckere, P.H.6
  • 9
    • 77249091669
    • Using dynamic time warping of T0 contours in the evaluation of cycle-to-cycle pitch detection algorithms
    • 10.1016/j.patrec.2009.07.021
    • C. Ferrer, D. Torres, and M. E. Hernandez-Diaz, "Using dynamic time warping of T0 contours in the evaluation of cycle-to-cycle pitch detection algorithms," Pattern Recogn. Lett. 31, 517-522 (2010). 10.1016/j.patrec.2009.07.021
    • (2010) Pattern Recogn. Lett. , vol.31 , pp. 517-522
    • Ferrer, C.1    Torres, D.2    Hernandez-Diaz, M.E.3
  • 10
    • 79955730724
    • Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average Parkinson's disease symptom severity
    • 10.1098/rsif.2010.0456
    • A. Tsanas, M. A. Little, P. E. McSharry, and L. O. Ramig, "Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average Parkinson's disease symptom severity," J. R. Soc. Interface 8, 842-855 (2011). 10.1098/rsif.2010.0456
    • (2011) J. R. Soc. Interface , vol.8 , pp. 842-855
    • Tsanas, A.1    Little, M.A.2    McSharry, P.E.3    Ramig, L.O.4
  • 11
    • 84860386097
    • Novel speech signal processing algorithms for high-accuracy classification of Parkinson's disease
    • 10.1109/TBME.2012.2183367
    • A. Tsanas, M. A. Little, P. E. McSharry, J. Spielman, and L. O. Ramig, "Novel speech signal processing algorithms for high-accuracy classification of Parkinson's disease," IEEE Trans. Biomed. Eng. 59, 1264-1271 (2012). 10.1109/TBME.2012.2183367
    • (2012) IEEE Trans. Biomed. Eng. , vol.59 , pp. 1264-1271
    • Tsanas, A.1    Little, M.A.2    McSharry, P.E.3    Spielman, J.4    Ramig, L.O.5
  • 12
    • 84892569709
    • Objective automatic assessment of rehabilitative speech treatment in Parkinson's disease
    • 10.1109/TNSRE.2013.2293575
    • A. Tsanas, M. A. Little, C. Fox, and L. O. Ramig, "Objective automatic assessment of rehabilitative speech treatment in Parkinson's disease," IEEE Trans. Neural Syst. Rehab. Eng. 22, 181-190 (2014). 10.1109/TNSRE.2013.2293575
    • (2014) IEEE Trans. Neural Syst. Rehab. Eng. , vol.22 , pp. 181-190
    • Tsanas, A.1    Little, M.A.2    Fox, C.3    Ramig, L.O.4
  • 13
    • 84858861633
    • New nonlinear markers and insights into speech signal degradation for effective tracking of Parkinson's disease symptom severity
    • Krakow, Poland
    • A. Tsanas, M. A. Little, P. E. McSharry, and L. O. Ramig, "New nonlinear markers and insights into speech signal degradation for effective tracking of Parkinson's disease symptom severity," in International Symposium on Nonlinear Theory and its Applications (NOLTA), Krakow, Poland (2010), pp. 457-460.
    • (2010) International Symposium on Nonlinear Theory and Its Applications (NOLTA) , pp. 457-460
    • Tsanas, A.1    Little, M.A.2    McSharry, P.E.3    Ramig, L.O.4
  • 14
    • 33749525148
    • Dimensionality reduction of a pathological voice quality assessment system based on Gaussian mixture models and short-term cepstral parameters
    • 10.1109/TBME.2006.871883
    • J. I. Godino-Llorente, P. Gomez-Vilda, and M. Blanco-Velasco, "Dimensionality reduction of a pathological voice quality assessment system based on Gaussian mixture models and short-term cepstral parameters," IEEE Trans. Biomed. Eng. 53, 1943-1953 (2006). 10.1109/TBME.2006.871883
    • (2006) IEEE Trans. Biomed. Eng. , vol.53 , pp. 1943-1953
    • Godino-Llorente, J.I.1    Gomez-Vilda, P.2    Blanco-Velasco, M.3
  • 15
    • 0025304762
    • Problems and pitfalls of electroglottography
    • 10.1016/S0892-1997(05)80077-3
    • R. H. Colton and E. G. Conture, "Problems and pitfalls of electroglottography," J. Voice 4, 10-24 (1990). 10.1016/S0892-1997(05)80077-3
    • (1990) J. Voice , vol.4 , pp. 10-24
    • Colton, R.H.1    Conture, E.G.2
  • 16
    • 1542286741
    • On the use of the derivative of electroglottographic signals for characterization of nonpathological phonation
    • 10.1121/1.1646401
    • N. Henrich, C. d'Alessandro, B. Doval, and M. Castellengo, "On the use of the derivative of electroglottographic signals for characterization of nonpathological phonation," J. Acoust. Soc. Am. 115, 1321-1332 (2004). 10.1121/1.1646401
    • (2004) J. Acoust. Soc. Am. , vol.115 , pp. 1321-1332
    • Henrich, N.1    D'Alessandro, C.2    Doval, B.3    Castellengo, M.4
  • 17
    • 84858325005
    • Investigating acoustic correlates of human vocal fold phase asymmetry through mathematical modeling and laryngeal high-speed videoendoscopy
    • 10.1121/1.3658441
    • D. D. Mehta, M. Zañartu, T. F. Quatieri, D. D. Deliyski, and R. E. Hillman, "Investigating acoustic correlates of human vocal fold phase asymmetry through mathematical modeling and laryngeal high-speed videoendoscopy," J. Acoust. Soc. Am. 130, 3999-4009 (2011). 10.1121/1.3658441
    • (2011) J. Acoust. Soc. Am. , vol.130 , pp. 3999-4009
    • Mehta, D.D.1    Zañartu, M.2    Quatieri, T.F.3    Deliyski, D.D.4    Hillman, R.E.5
  • 20
    • 38049013310
    • Robust heart rate estimation from multiple asynchronous noisy sources using signal quality indices and a Kalman filter
    • 10.1088/0967-3334/29/1/002
    • Q. Li, R. G. Mark, and G. D. Clifford, "Robust heart rate estimation from multiple asynchronous noisy sources using signal quality indices and a Kalman filter," Physiol. Meas. 29, 15-32 (2008). 10.1088/0967-3334/29/1/002
    • (2008) Physiol. Meas. , vol.29 , pp. 15-32
    • Li, Q.1    Mark, R.G.2    Clifford, G.D.3
  • 21
    • 0028833120
    • Voice simulation with a body-cover model of the vocal folds
    • 10.1121/1.412234
    • B. H. Story and I. R. Titze, "Voice simulation with a body-cover model of the vocal folds," J. Acoust. Soc. Am. 97, 1249-1260 (1995). 10.1121/1.412234
    • (1995) J. Acoust. Soc. Am. , vol.97 , pp. 1249-1260
    • Story, B.H.1    Titze, I.R.2
  • 22
    • 0028961788
    • Bifurcations in an asymmetric vocal-fold model
    • 10.1121/1.412061
    • I. Steinecke and H. Herzel, "Bifurcations in an asymmetric vocal-fold model," J. Acoust. Soc. Am. 97, 1874-1884 (1995). 10.1121/1.412061
    • (1995) J. Acoust. Soc. Am. , vol.97 , pp. 1874-1884
    • Steinecke, I.1    Herzel, H.2
  • 23
    • 0036711789
    • Rules for controlling low-dimensional vocal fold models with muscle activation
    • 10.1121/1.1496080
    • I. R. Titze and B. H. Story, "Rules for controlling low-dimensional vocal fold models with muscle activation," J. Acoust. Soc. Am. 112, 1064-1076 (2002). 10.1121/1.1496080
    • (2002) J. Acoust. Soc. Am. , vol.112 , pp. 1064-1076
    • Titze, I.R.1    Story, B.H.2
  • 24
    • 79960690309
    • A theoretical model of the pressure distributions arising from asymmetric intraglottal flows applied to a two-mass model of the vocal folds
    • 10.1121/1.3586785
    • B. D. Erath, S. D. Peterson, M. Zañartu, G. R. Wodicka, and M. W. Plesniak, "A theoretical model of the pressure distributions arising from asymmetric intraglottal flows applied to a two-mass model of the vocal folds," J. Acoust. Soc. Am. 130, 389-403 (2011). 10.1121/1.3586785
    • (2011) J. Acoust. Soc. Am. , vol.130 , pp. 389-403
    • Erath, B.D.1    Peterson, S.D.2    Zañartu, M.3    Wodicka, G.R.4    Plesniak, M.W.5
  • 25
    • 0024400079
    • Objective assessment of vocal hyperfunction: An experimental framework and initial results
    • R. E. Hillman, E. B. Holmberg, J. S. Perkell, M. Walsh, and C. Vaughan, "Objective assessment of vocal hyperfunction: An experimental framework and initial results," J. Speech Hear. Res. 32, 373-392 (1989).
    • (1989) J. Speech Hear. Res. , vol.32 , pp. 373-392
    • Hillman, R.E.1    Holmberg, E.B.2    Perkell, J.S.3    Walsh, M.4    Vaughan, C.5
  • 28
    • 43549113834
    • Nonlinear source-filter coupling in phonation: Theory
    • 10.1121/1.2832337
    • I. R. Titze, "Nonlinear source-filter coupling in phonation: Theory," J. Acoust. Soc. Am. 123, 2733-2749 (2008). 10.1121/1.2832337
    • (2008) J. Acoust. Soc. Am. , vol.123 , pp. 2733-2749
    • Titze, I.R.1
  • 30
    • 69249091414
    • The SIGMA algorithm: A glottal activity detector for electroglottographic signals
    • 10.1109/TASL.2009.2022430
    • M. R. P. Thomas and P. A. Naylor, "The SIGMA algorithm: A glottal activity detector for electroglottographic signals," IEEE Trans. Audio Speech Lang. Process. 17, 1557-1566 (2009). 10.1109/TASL.2009.2022430
    • (2009) IEEE Trans. Audio Speech Lang. Process. , vol.17 , pp. 1557-1566
    • Thomas, M.R.P.1    Naylor, P.A.2
  • 32
    • 70349894136
    • Should jitter be measured by peak picking or by waveform matching?
    • 10.1159/000245159
    • P. Boersma, "Should jitter be measured by peak picking or by waveform matching?," Folia Phoniat. Logoped. 61, 305-308 (2009). 10.1159/000245159
    • (2009) Folia Phoniat. Logoped. , vol.61 , pp. 305-308
    • Boersma, P.1
  • 33
    • 41049089736
    • Estimation of glottal closure instants in voiced speech using the DYPSA algorithm
    • 10.1109/TASL.2006.876878
    • P. A. Naylor, A. Kounoudes, J. Gudnason, and M. Brookes, "Estimation of glottal closure instants in voiced speech using the DYPSA algorithm," IEEE Trans. Audio Speech Lang. Process. 15, 34-43 (2007). 10.1109/TASL.2006.876878
    • (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , pp. 34-43
    • Naylor, P.A.1    Kounoudes, A.2    Gudnason, J.3    Brookes, M.4
  • 34
    • 0001835850
    • Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound
    • P. Boersma, "Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound," IFA Proc. 17, 97-110 (1993).
    • (1993) IFA Proc. , vol.17 , pp. 97-110
    • Boersma, P.1
  • 35
    • 0036299273
    • Pitch determination and voice quality analysis using subharmonic-to-harmonic ratio
    • Orlando, FL
    • X. Sun, "Pitch determination and voice quality analysis using subharmonic-to-harmonic ratio," ICASSP2002, Orlando, FL (2002).
    • (2002) ICASSP2002
    • Sun, X.1
  • 36
    • 52449117078
    • A sawtooth waveform inspired pitch estimator for speech and music
    • 10.1121/1.2951592
    • A. Camacho and J. G. Harris, "A sawtooth waveform inspired pitch estimator for speech and music," J. Acoust. Soc. Am. 124, 1638-1652 (2008). 10.1121/1.2951592
    • (2008) J. Acoust. Soc. Am. , vol.124 , pp. 1638-1652
    • Camacho, A.1    Harris, J.G.2
  • 37
    • 0036214787
    • YIN, a fundamental frequency estimator for speech and music
    • 10.1121/1.1458024
    • A. de Cheveigne and H. Kawahara, "YIN, a fundamental frequency estimator for speech and music," J. Acoust. Soc. Am. 111, 1917-1930 (2002). 10.1121/1.1458024
    • (2002) J. Acoust. Soc. Am. , vol.111 , pp. 1917-1930
    • De Cheveigne, A.1    Kawahara, H.2
  • 38
    • 84928118106
    • Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity
    • Budapest, Hungary
    • H. Kawahara, H. Katayose, A. de Cheveigne, and R. D. Patterson, "Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity," Eurospeech, Budapest, Hungary (1999), pp. 2781-2784.
    • (1999) Eurospeech , pp. 2781-2784
    • Kawahara, H.1    Katayose, H.2    De Cheveigne, A.3    Patterson, R.D.4
  • 39
    • 33745184239
    • Nearly defect-free F0 trajectory extraction for expressive speech modifications based on STRAIGHT
    • Lisbon, Portugal
    • H. Kawahara, A. de Cheveigne, H. Banno, T. Takahashi, and T. Irino, "Nearly defect-free F0 trajectory extraction for expressive speech modifications based on STRAIGHT," Interspeech, Lisbon, Portugal (2005), pp. 537-540.
    • (2005) Interspeech , pp. 537-540
    • Kawahara, H.1    De Cheveigne, A.2    Banno, H.3    Takahashi, T.4    Irino, T.5
  • 40
    • 51449108867
    • Tandem-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation
    • Las Vegas
    • H. Kawahara, M. Morise, T. Takahashi, R. Nisimura, T. Irino, and H. Banno, "Tandem-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation," ICASSP 2008, Las Vegas (2008), pp. 3933-3936.
    • (2008) ICASSP 2008 , pp. 3933-3936
    • Kawahara, H.1    Morise, M.2    Takahashi, T.3    Nisimura, R.4    Irino, T.5    Banno, H.6
  • 42
    • 0014764781
    • On the identification of variance and adaptive Kalman filtering
    • 10.1109/TAC.1970.1099422
    • R. K. Mehra, "On the identification of variance and adaptive Kalman filtering," IEEE Trans. Automatic Control AC-15, 175-184 (1970). 10.1109/TAC.1970.1099422
    • (1970) IEEE Trans. Automatic Control , vol.15 , pp. 175-184
    • Mehra, R.K.1
  • 43
    • 77955293423
    • Data fusion for improved respiration rate estimation
    • 10.1155/2010/926305
    • S. Nemati, A. Malhotra, and G. D. Clifford, "Data fusion for improved respiration rate estimation," EURASIP J. Adv. Signal Process. 2010, 926315 (2010). 10.1155/2010/926305
    • (2010) EURASIP J. Adv. Signal Process. , vol.2010 , pp. 926315
    • Nemati, S.1    Malhotra, A.2    Clifford, G.D.3
  • 44
    • 30844433310
    • Testing the assumptions of linear prediction analysis in normal vowels
    • 10.1121/1.2141266
    • M. A. Little, P. E. McSharry, I. M. Moroz, and S. J. Roberts, "Testing the assumptions of linear prediction analysis in normal vowels," J. Acoust. Soc. Am. 119, 549-558 (2006). 10.1121/1.2141266
    • (2006) J. Acoust. Soc. Am. , vol.119 , pp. 549-558
    • Little, M.A.1    McSharry, P.E.2    Moroz, I.M.3    Roberts, S.J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.