SCOPUS Information Search Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 20, Issue 3, 2012, Pages 968-981

The Deterministic plus Stochastic model of the residual signal and its applications

(2) Drugman, Thomas (a); Dutoit, Thierry (a)

(a) University of Mons (Belgium)

Author keywords

excitation modeling; glottal flow; speaker recognition; Speech analysis; speech synthesis

Indexed keywords

COMPUTATIONAL PROPERTIES; DATA SETS; EXCITATION MODELS; GLOTTAL FLOW; HIGH-FREQUENCY NOISE; HMM-BASED SPEECH SYNTHESIS; LOW FREQUENCY; ORTHONORMAL; PARAMETERIZING; PROCESSING APPLICATIONS; PULSE EXCITATION; RECOGNITION RATES; RESIDUAL SIGNALS; SPEAKER IDENTIFICATION; SPEAKER RECOGNITION; SPECTRAL BAND; SPEECH PRODUCTION; STOCHASTIC COMPONENT;

SPEECH ANALYSIS; SPEECH PROCESSING; SPEECH RECOGNITION; SPEECH SYNTHESIS; STOCHASTIC SYSTEMS;

STOCHASTIC MODELS;

EID: 84856248602 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2011.2169787 Document Type: Article

Times cited : (98)

References (47)

1
- 0003927842
- Upper Saddle River, NJ: Prentice-Hall
- T. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice. Upper Saddle River, NJ: Prentice-Hall, 2002.
- (2002) Discrete-time Speech Signal Processing: Principles and Practice
- Quatieri, T.¹

2
- 80955173659
- Comparative study of glottal source estimation techniques
- Jan.
- T. Drugman, B. Bozkurt, and T. Dutoit, "Comparative study of glottal source estimation techniques," Comput. Speech Lang., vol. 26, no. 1, pp. 20-34, Jan. 2012.
- (2012) Comput. Speech Lang. , vol.26 , Issue.1 , pp. 20-34
- Drugman, T.¹ Bozkurt, B.² Dutoit, T.³

3
- 85131821539
- Mel-generalized cepstral analysis - a unified approach to speech spectral estimation
- K. Tokuda, T. Kobayashi, T. Masuko, and S. Imai, "Mel-generalized cepstral analysis - a unified approach to speech spectral estimation," in Proc. ICSLP, 1994.
- (1994) Proc. ICSLP
- Tokuda, K.¹ Kobayashi, T.² Masuko, T.³ Imai, S.⁴

4
- 0022219187
- Code-excited linear prediction (CELP): High-quality speech at very low bit rates
- M. Schroeder and B. Atal, "Code-excited linear prediction (CELP): High-quality speech at very low bit rates," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1985, vol. 10, pp. 937-940. (Pubitemid 16511503)
- (1985) ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings , pp. 937-940
- Schroeder Manfred, R.¹ Atal Bishnu, S.²

5
- 84856248349
- A trainable excitation model for HMM-based speech synthesis
- R. Maia, T. Toda, H. Zen, Y. Nankaku, and K. Tokuda, "A trainable excitation model for HMM-based speech synthesis," in Proc. Interspeech Conf., 2007, pp. 1909-1912.
- (2007) Proc. Interspeech Conf. , pp. 1909-1912
- Maia, R.¹ Toda, T.² Zen, H.³ Nankaku, Y.⁴ Tokuda, K.⁵

6
- 0028515601
- Enhancement of multiband excitation (MBE) by pitch-cycle waveform coding
- H. Yang, S. Koh, and P. Sivaprakasapillai, "Enhancement of multiband excitation (MBE) by pitch-cycle waveform coding," Electron. Lett., vol. 30, no. 20, pp. 1645-1646, 1994.
- (1994) Electron. Lett. , vol.30 , Issue.20 , pp. 1645-1646
- Yang, H.¹ Koh, S.² Sivaprakasapillai, P.³

7
- 0033677122
- Mixed excitation linear prediction coding of wideband speech at 8 kbps
- Speech, Signal Process. (ICASSP)
- W. Lin, S. Koh, and X. Lin, "Mixed excitation linear prediction coding of wideband speech at 8 kbps," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2000, vol. 2, pp. 1137-1140.
- (2000) Proc. IEEE Int. Conf. Acoust. , vol.2 , pp. 1137-1140
- Lin, W.¹ Koh, S.² Lin, X.³

8
- 85009097254
- Mixed-excitation for HMM-based speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, and T. Kitamura, "Mixed-excitation for HMM-based speech synthesis," in Proc. Eurospeech, 2001, pp. 2259-2262.
- (2001) Proc. Eurospeech , pp. 2259-2262
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kitamura, T.⁴

9
- 78649297510
- An excitation model for HMM-based speech synthesis based on residual modeling
- R. Maia, T. Toda, H. Zen, Y. Nankaku, and K. Tokuda, "An excitation model for HMM-based speech synthesis based on residual modeling," in Proc. ISCA SSW6, 2007.
- (2007) Proc. ISCA SSW6
- Maia, R.¹ Toda, T.² Zen, H.³ Nankaku, Y.⁴ Tokuda, K.⁵

10
- 33846935000
- HMM-based Korean speech synthesis system for hand-held devices
- DOI 10.1109/TCE.2006.273160
- S.-J. Kim, J.-J. Kim, and M. Hahn, "HMM-based Korean speech synthesis system for hand-held devices," IEEE Trans. Consumer Electron., vol. 52, no. 4, pp. 1384-1390, Apr. 2006. (Pubitemid 46231653)
- (2006) IEEE Transactions on Consumer Electronics , vol.52 , Issue.4 , pp. 1384-1390
- Kim, S.-J.¹ Kim, J.-J.² Hahn, M.³

11
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, pp. 187-207, 1999.
- (1999) Speech Commun. , vol.27 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² De Cheveigne, A.³

12
- 77957744515
- HMM-based speech synthesis utilizing glottal inverse filtering
- Jan.
- T. Raitio, A. Suni, J. Yamagishi, H. Pulakka, J. Nurminen, M. Vainio, and P. Alku, "HMM-based speech synthesis utilizing glottal inverse filtering," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 1, pp. 153-165, Jan. 2011.
- (2011) IEEE Trans. Audio, Speech, Lang. Process , vol.19 , Issue.1 , pp. 153-165
- Raitio, T.¹ Suni, A.² Yamagishi, J.³ Pulakka, H.⁴ Nurminen, J.⁵ Vainio, M.⁶ Alku, P.⁷

13
- 33947684811
- A four parameter model of glottal flow
- G. Fant, J. Liljencrants, and Q. Lin, "A four parameter model of glottal flow," in Proc. STL-QPSR4, 1985, pp. 1-13.
- (1985) Proc. STL-QPSR4 , pp. 1-13
- Fant, G.¹ Liljencrants, J.² Lin, Q.³

14
- 82155160991
- Towards an improved modeling of the glottal source in statistical parametric speech synthesis
- J. Cabral, S. Renals, K. Richmond, and J. Yamagishi, "Towards an improved modeling of the glottal source in statistical parametric speech synthesis," in Proc. 6th ISCA Workshop Speech Synth., 2007.
- (2007) Proc. 6th ISCA Workshop Speech Synth.
- Cabral, J.¹ Renals, S.² Richmond, K.³ Yamagishi, J.⁴

15
- 0032595183
- Modeling of the glottal flow derivative waveform with application to speaker identification
- Sep.
- M. D. Plumpe, T. F. Quatieri, and D. A. Reynolds, "Modeling of the glottal flow derivative waveform with application to speaker identification," IEEE Trans. Speech Audio Process., vol. 7, no. 5, pp. 569-576, Sep. 1999.
- (1999) IEEE Trans. Speech Audio Process , vol.7 , Issue.5 , pp. 569-576
- Plumpe, M.D.¹ Quatieri, T.F.² Reynolds, D.A.³

16
- 51449086496
- Voice source cepstrum coefficients for speaker identification
- Speech, Signal Process. (ICASSP)
- J. Gudnason and M. Brookes, "Voice source cepstrum coefficients for speaker identification," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2008, pp. 4821-4824.
- (2008) Proc. IEEE Int. Conf. Acoust. , pp. 4821-4824
- Gudnason, J.¹ Brookes, M.²

17
- 0029356550
- Usefulness of the LPC-residue in text-independent speaker verification
- P. Thevenaz and H. Hugli, "Usefulness of the LPC-residue in text-independent speaker verification," Speech Commun., vol. 17, pp. 145-157, 1995.
- (1995) Speech Commun. , vol.17 , pp. 145-157
- Thevenaz, P.¹ Hugli, H.²

18
- 30444446629
- Combining evidence from residual phase and MFCC features for speaker recognition
- DOI 10.1109/LSP.2005.860538
- S. Murty and B. Yegnanarayana, "Combining evidence from residual phase and MFCC features for speaker recognition," IEEE Signal Process. Lett., vol. 13, no. 1, pp. 52-55, Jan. 2006. (Pubitemid 43072461)
- (2006) IEEE Signal Processing Letters , vol.13 , Issue.1 , pp. 52-55
- Sri Rama Murty, K.¹ Yegnanarayana, B.²

19
- 70450204573
- A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis
- T. Drugman, G. Wilfart, and T. Dutoit, "A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis," in Proc. Interspeech Conf., 2009.
- (2009) Proc. Interspeech Conf.
- Drugman, T.¹ Wilfart, G.² Dutoit, T.³

20
- 79959830729
- On the potential of glottal signatures for speaker recognition
- T. Drugman and T. Dutoit, "On the potential of glottal signatures for speaker recognition," in Proc. Interspeech Conf., 2010.
- (2010) Proc. Interspeech Conf.
- Drugman, T.¹ Dutoit, T.²

21
- 0035127703
- Applying the harmonic plus noise model in concatenative speech synthesis
- DOI 10.1109/89.890068
- Y. Stylianou, "Applying the harmonic plus noise model in concatenative speech synthesis," IEEE Trans. Speech Audio Process., vol. 9, no. 1, pp. 21-29, Jan. 2001. (Pubitemid 32130684)
- (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.1 , pp. 21-29
- Stylianou, Y.¹

22
- 34547541173
- A new method for speech synthesis and transformation based on an ARX-LF source-filter decomposition and HNM modeling
- Speech, Signal Process. (ICASSP)
- D. Vincent, O. Rosec, and T. Chonavel, "A new method for speech synthesis and transformation based on an ARX-LF source-filter decomposition and HNM modeling," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2007, pp. 525-528.
- (2007) Proc. IEEE Int. Conf. Acoust. , pp. 525-528
- Vincent, D.¹ Rosec, O.² Chonavel, T.³

23
- 70349208681
- ARX-LF-based source-filter methods for voice modification and transformation
- Speech, Signal Process. (ICASSP)
- Y. Agiomyrgiannakis and O. Rosec, "ARX-LF-based source-filter methods for voice modification and transformation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2009, pp. 3589-3592.
- (2009) Proc. IEEE Int. Conf. Acoust. , pp. 3589-3592
- Agiomyrgiannakis, Y.¹ Rosec, O.²

24
- 70450198169
- Glottal closure and opening instant detection from speech signals
- T. Drugman and T. Dutoit, "Glottal closure and opening instant detection from speech signals," in Proc. Interspeech Conf., 2009.
- (2009) Proc. Interspeech Conf.
- Drugman, T.¹ Dutoit, T.²

25
- 70349203247
- The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006
- H. Zen, T. Toda, and K. Tokuda, "The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006," IEICE Trans. Inf. Syst., 2006.
- (2006) IEICE Trans. Inf. Syst.
- Zen, H.¹ Toda, T.² Tokuda, K.³

26
- 68949094517
- Optimum MVF estimation-based two-band excitation for HMM-based speech synthesis
- S. Han, S. Jeong, and M. Hahn, "Optimum MVF estimation-based two-band excitation for HMM-based speech synthesis," ETRI J., vol. 31, no. 4, pp. 457-459, 2009.
- (2009) ETRI J. , vol.31 , Issue.4 , pp. 457-459
- Han, S.¹ Jeong, S.² Hahn, M.³

27
- 9444268127
- Expressing vocal effort in concatenative synthesis
- M. Schroeder and M. Grice, "Expressing vocal effort in concatenative synthesis," in Proc. 15th Int. Conf. Phon. Sci., 2003, pp. 2589-2592.
- (2003) Proc. 15th Int. Conf. Phon. Sci. , pp. 2589-2592
- Schroeder, M.¹ Grice, M.²

28
- 0003447548
- Ph.D. dissertation, Ecole Nationale Superieure des Telecommunications, Paris, France
- Y. Stylianou, "Harmonic plus noise models for speech, combined with statistical methods, for speech and speaker modification," Ph.D. dissertation, Ecole Nationale Superieure des Telecommunications, Paris, France, 1996.
- (1996) Harmonic Plus Noise Models for Speech, Combined with Statistical Methods, for Speech and Speaker Modification
- Stylianou, Y.¹

29
- 51449095025
- Improving the modeling of the noise part in the harmonic plus noise model of speech
- Y. Pantazis and Y. Stylianou, "Improving the modeling of the noise part in the harmonic plus noise model of speech," in Proc. IEEE ICASSP, 2008, pp. 4609-4612.
- (2008) Proc. IEEE ICASSP , pp. 4609-4612
- Pantazis, Y.¹ Stylianou, Y.²

30
- 84856272951
- A comparative evaluation of pitch modification techniques
- T. Drugman and T. Dutoit, "A comparative evaluation of pitch modification techniques," in Proc. Eur. Signal Process. Conf., 2010, pp. 756-760.
- (2010) Proc. Eur. Signal Process. Conf. , pp. 756-760
- Drugman, T.¹ Dutoit, T.²

31
- 0003946510
- New York: Springer
- I. Jolliffe, Principal Component Analysis, ser. Statistics. New York: Springer, 2002.
- (2002) Principal Component Analysis, Ser. Statistics
- Jolliffe, I.¹

32
- 69249117913
- Data-driven voice source waveform modelling
- M. R. P. Thomas, J. Gudnason, and P. A. Naylor, "Data-driven voice source waveform modelling," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2009, pp. 3965-3968.
- (2009) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP) , pp. 3965-3968
- Thomas, M.R.P.¹ Gudnason, J.² Naylor, P.A.³

33
- 84870724307
- [Online]
- "CMU ARCTIC speech synthesis databases," [Online]. Available: http://festvox.org/cmu-arctic/
- CMU ARCTIC Speech Synthesis Databases

34
- 26844515690
- Mixed-phase speech modeling and formant estimation, using differential phase spectrums
- B. Bozkurt and T. Dutoit, "Mixed-phase speech modeling and formant estimation, using differential phase spectrums," in Proc. ISCA ITRW VOQUAL03, 2003, pp. 21-24.
- (2003) Proc. ISCA ITRW VOQUAL03 , pp. 21-24
- Bozkurt, B.¹ Dutoit, T.²

35
- 85016140477
- An adaptive algorithm for mel-cepstral analysis of speech
- Speech, Signal Process. (ICASSP)
- T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for Mel-cepstral analysis of speech," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1992, vol. 1, pp. 137-140.
- (1992) Proc. IEEE Int. Conf. Acoust. , vol.1 , pp. 137-140
- Fukada, T.¹ Tokuda, K.² Kobayashi, T.³ Imai, S.⁴

36
- 34547526960
- Statistical parametric speech synthesis
- Speech, Signal Process. (ICASSP)
- A. W. Black, H. Zen, and K. Tokuda, "Statistical parametric speech synthesis," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2007, pp. 1229-1232.
- (2007) Proc. IEEE Int. Conf. Acoust. , pp. 1229-1232
- Black, A.W.¹ Zen, H.² Tokuda, K.³

37
- 0036522887
- Multi-space probability distribution HMM
- K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi, "Multi-space probability distribution HMM," IEICE Trans. Inf. Syst., vol. E85-D, no. 3, pp. 455-464, 2002. (Pubitemid 35353984)
- (2002) IEICE Transactions on Information and Systems , vol.E85-D , Issue.3 , pp. 455-464
- Tokuda, K.¹ Masuko, T.² Miyazaki, N.³ Kobayashi, T.⁴

38
- 85031628788
- An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features
- K. Tokuda, T. Masuko, T. Yamada, T. Kobayashi, and S. Imai, "An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features," in Proc. Eurospeech, 1995.
- (1995) Proc. Eurospeech
- Tokuda, K.¹ Masuko, T.² Yamada, T.³ Kobayashi, T.⁴ Imai, S.⁵

39
- 78049394912
- [Online]
- "HMM-based speech synthesis system (HTS)," [Online]. Available: http://hts.sp.nitech.ac.jp/
- HMM-based Speech Synthesis System (HTS)

40
- 38549096029
- A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. 90, no. 5, pp. 816-824, 2007.
- (2007) IEICE Trans. Inf. Syst. , vol.90 , Issue.5 , pp. 816-824
- Toda, T.¹ Tokuda, K.²

41
- 84856297608
- Ph.D. dissertation, School of Informatics, Univ. of Edinburgh, Edinburgh, U.K.
- J. Cabral, "HMM-based speech synthesis using an acoustic glottal source model," Ph.D. dissertation, School of Informatics, Univ. of Edinburgh, Edinburgh, U.K., 2010.
- (2010) HMM-based Speech Synthesis Using an Acoustic Glottal Source Model
- Cabral, J.¹

42
- 77957742142
- Speech quality assessment
- New York: Springer
- V. Grancharov and W. Kleijn, "Speech quality assessment," in Handbook of Speech Processing. New York: Springer, 2007.
- (2007) Handbook of Speech Processing
- Grancharov, V.¹ Kleijn, W.²

43
- 0036293830
- An overview of automatic speaker recognition technology
- Speech, Signal Process. (ICASSP)
- D. Reynolds, "An overview of automatic speaker recognition technology," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2002, vol. 4, pp. 4072-4075.
- (2002) Proc. IEEE Int. Conf. Acoust. , vol.4 , pp. 4072-4075
- Reynolds, D.¹

44
- 33748443739
- Extraction of speaker-specific excitation information from linear prediction residual of speech
- DOI 10.1016/j.specom.2006.06.002, PII S0167639306000665
- S. Prasanna, C. Gupta, and B. Yegnanarayana, "Extraction of speaker-specific excitation information from linear prediction residual of speech," Speech Commun., vol. 48, no. 10, pp. 1243-1261, Oct. 2006. (Pubitemid 44353818)
- (2006) Speech Communication , vol.48 , Issue.10 , pp. 1243-1261
- Mahadeva Prasanna, S.R.¹ Gupta, C.S.² Yegnanarayana, B.³

45
- 0032021555
- On combining classifiers
- J. Kittler, M. Hatef, R. Duin, and J. Matas, "On combining classifiers," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 3, pp. 226-239, Mar. 1998. (Pubitemid 128741312)
- (1998) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.20 , Issue.3 , pp. 226-239
- Kittler, J.¹ Hatef, M.² Duin, R.P.W.³ Matas, J.⁴

46
- 0002538142
- The DARPA speech recognition research database: Specifications and status
- W. Fisher, G. Doddington, and K. Goudie-Marshall, "The DARPA speech recognition research database: Specifications and status," in Proc. DARPA Workshop Speech Recognit., 1986, pp. 93-99.
- (1986) Proc. DARPA Workshop Speech Recognit. , pp. 93-99
- Fisher, W.¹ Doddington, G.² Goudie-Marshall, K.³

47
- 0028996937
- Testing with the YOHO CD-ROM voice verification corpus
- Speech, Signal Process. (ICASSP)
- J. Campbell, "Testing with the YOHO CD-ROM voice verification corpus," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 1995, pp. 341-344.
- (1995) Proc. IEEE Int. Conf. Acoust. , pp. 341-344
- Campbell, J.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.