SCOPUS Information Retrieval Platform

Speech Communication

Volume 59, 2014, Pages 10-21

Phonetic feature extraction for context-sensitive glottal source processing

(4) Kane, John a; Aylett, Matthew b,c; Yanushevskaya, Irena a; Gobl, Christer a

a TRINITY COLLEGE DUBLIN (Ireland)

b UNIVERSITY OF EDINBURGH (United Kingdom)

c CEREPROC LTD (United Kingdom)

Author keywords

Expressive speech; Glottal source; Phonation type; Speech synthesis; Voice quality

Indexed keywords

CLASSIFICATION ALGORITHM; DISCRIMINATIVE CLASSIFIERS; EXPRESSIVE SPEECH; GAUSSIAN MIXTURE MODEL (GMMS); GLOTTAL SOURCE; PHONATION TYPE; SUPPORT VECTOR MACHINE (SVMS); VOICE QUALITY;

FEATURE EXTRACTION; LINGUISTICS; NEURAL NETWORKS; SPEECH SYNTHESIS; SUPPORT VECTOR MACHINES;

QUALITY CONTROL;

EID: 84892721755 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2013.12.003 Document Type: Article

Times cited : (12)

References (52)

1
- 70450163450
- Comparison of multiple voice source parameters in different phonation types
- Airas, M.; Alku, P.; 2007. Comparison of multiple voice source parameters in different phonation types. In: Proceedings of Interspeech 2007, Antwerp, Belgium, pp. 1410-1413.
- (2007) Proceedings of Interspeech 2007, Antwerp, Belgium , pp. 1410-1413
- Airas, M.¹ Alku, P.²

2
- 0032716023
- An acoustic-phonetic feature-based system for automatic phoneme recognition in continuous speech
- Ali, A.M.A.; der Spiegel, J.V.; Mueller, P.; Haentjens, G.; Berman, J.; 1999. An acoustic-phonetic feature-based system for automatic phoneme recognition in continuous speech. In: Proceedings of the IEEE International Symposium on Circuits and Systems 3, pp. 118-121.
- (1999) Proceedings of the IEEE International Symposium on Circuits and Systems 3 , pp. 118-121
- Ali, A.M.A.¹ Der Spiegel, J.V.² Mueller, P.³ Haentjens, G.⁴ Berman, J.⁵

3
- 0026881384
- Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering
- P. Alku Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering Speech Commun. 11 2-3 1992 109 118
- (1992) Speech Commun. , vol.11 , Issue.2-3 , pp. 109-118
- Alku, P.¹

4
- 84856294347
- Glottal inverse filtering analysis of human voice production - A review of estimation and parameterization methods of the glottal excitation and their applications
- P. Alku Glottal inverse filtering analysis of human voice production - a review of estimation and parameterization methods of the glottal excitation and their applications Sadhana 36 5 2011 623 650
- (2011) Sadhana , vol.36 , Issue.5 , pp. 623-650
- Alku, P.¹

5
- 85135308477
- Estimation of the glottal pulseform based on discrete all-pole modeling
- Alku, P.; Vilkman, E.; 1994. Estimation of the glottal pulseform based on discrete all-pole modeling. In: Proceedings of the Third International Conference on Spoken Language Processing, pp. 1619-1622.
- (1994) Proceedings of the Third International Conference on Spoken Language Processing , pp. 1619-1622
- Alku, P.¹ Vilkman, E.²

6
- 0031189455
- Parabolic spectral parameter - A new method for quantification of the glottal flow
- P. Alku, H. Strik, and E. Vilkman Parabolic spectral parameter - a new method for quantification of the glottal flow Speech Commun. 22 1 1997 67 79
- (1997) Speech Commun. , vol.22 , Issue.1 , pp. 67-79
- Alku, P.¹ Strik, H.² Vilkman, E.³

7
- 0036339929
- Normalized amplitude quotient for parameterization of the glottal flow
- P. Alku, T. Bäckström, and E. Vilkman Normalized amplitude quotient for parameterization of the glottal flow J. Acoust. Soc. Am. 112 2 2002 701 710
- (2002) J. Acoust. Soc. Am. , vol.112 , Issue.2 , pp. 701-710
- Alku, P.¹ Bäckström, T.² Vilkman, E.³

8
- 84882383984
- Formant frequency estimation of high-pitched vowels using weighted linear prediction
- P. Alku, J. Pohjalainen, M. Vainio, A. Laukkanen, and B. Story Formant frequency estimation of high-pitched vowels using weighted linear prediction J. Acoust. Soc. Am. 134 2 2013 1295 1313
- (2013) J. Acoust. Soc. Am. , vol.134 , Issue.2 , pp. 1295-1313
- Alku, P.¹ Pohjalainen, J.² Vainio, M.³ Laukkanen, A.⁴ Story, B.⁵

9
- 78049527800
- The CereVoice characterful speech synthesiser SDK
- Newcastle, UK
- Aylett, M.P.; Pidcock, C.J.; 2007. The CereVoice characterful speech synthesiser SDK. In: Artificial Intelligence and Simulation of Behaviour (AISB). Newcastle, UK.
- (2007) Artificial Intelligence and Simulation of Behaviour (AISB)
- Aylett, M.P.¹ Pidcock, C.J.²

10
- 33846516584
- Springer-Verlag New York
- C.M. Bishop Pattern Recognition and Machine Learning (Information Science and Statistics) 2006 Springer-Verlag New York
- (2006) Pattern Recognition and Machine Learning (Information Science and Statistics)
- Bishop, C.M.¹

11
- 9444234505
- Voice quality: The 4th prosodic dimension
- Campbell, N.; Mokhtari, P.; 2003. Voice quality: the 4th prosodic dimension. In: Proceedings of the 15th International Congress of Phonetic Sciences, pp. 2417-2420.
- (2003) Proceedings of the 15th International Congress of Phonetic Sciences , pp. 2417-2420
- Campbell, N.¹ Mokhtari, P.²

12
- 64449086223
- Discrimination power of vocal source and vocal tract related features for speaker segmentation
- W. Chan, N. Zheng, and T. Lee Discrimination power of vocal source and vocal tract related features for speaker segmentation IEEE Trans. Audio Speech Lang. Process. 15 6 2007 1884 1892
- (2007) IEEE Trans. Audio Speech Lang. Process. , vol.15 , Issue.6 , pp. 1884-1892
- Chan, W.¹ Zheng, N.² Lee, T.³

13
- 0004119259
- MIT Press Cambridge, MA
- N. Chomsky, and M. Halle The Sound Pattern of English 1968 MIT Press Cambridge, MA
- (1968) The Sound Pattern of English
- Chomsky, N.¹ Halle, M.²

14
- 84892718247
- Creaky voice and the classification of affect
- Grenoble, France
- Cullen, A.; Kane, J.; Drugman, T.; Harte, N.; 2013. Creaky voice and the classification of affect. In: Proceedings of WASSS, Grenoble, France.
- (2013) Proceedings of WASSS
- Cullen, A.¹ Kane, J.² Drugman, T.³ Harte, N.⁴

15
- 80955173659
- A comparative study of glottal source estimation techniques
- T. Drugman, B. Bozkurt, and T. Dutoit A comparative study of glottal source estimation techniques Comput. Speech Lang. 26 2011 20 34
- (2011) Comput. Speech Lang. , vol.26 , pp. 20-34
- Drugman, T.¹ Bozkurt, B.² Dutoit, T.³

16
- 0003418124
- 2nd ed. Mouton Hague 1970
- G. Fant The Acoustic Theory of Speech Production 2nd ed. 1960 Mouton Hague 1970
- (1960) The Acoustic Theory of Speech Production
- Fant, G.¹

17
- 84928462881
- Glottal source - Vocal tract acoustic interaction
- Fant, G.; Lin, Q.; 1987. Glottal source - vocal tract acoustic interaction. KTH, Speech Transmission Laboratory, Quarterly Report 28 (1), pp. 13-27.
- (1987) KTH, Speech Transmission Laboratory, Quarterly Report , vol.28 , Issue.1 , pp. 13-27
- Fant, G.¹ Lin, Q.²

18
- 0003931473
- A four parameter model of glottal flow
- Fant, G.; Liljencrants, J.; Lin, Q.; 1985. A four parameter model of glottal flow. KTH, Speech Transmission Laboratory, Quarterly Report 4, pp. 1-13.
- (1985) KTH, Speech Transmission Laboratory, Quarterly Report 4 , pp. 1-13
- Fant, G.¹ Liljencrants, J.² Lin, Q.³

19
- 84922823681
- Notes on glottal flow interaction
- Fant, G.; Lin, Q.; Gobl, C.; 1985b. Notes on glottal flow interaction. KTH, Speech Transmission Laboratory, Quarterly Report 2-3, 21-45.
- (1985) KTH, Speech Transmission Laboratory, Quarterly Report 2-3 , pp. 21-45
- Fant, G.¹ Lin, Q.² Gobl, C.³

20
- 84875243987
- Inverse filtering of nasalized vowels using synthesized speech
- C. Gobl, and J. Mahshie Inverse filtering of nasalized vowels using synthesized speech J. Voice 27 2 2013 155 169
- (2013) J. Voice , vol.27 , Issue.2 , pp. 155-169
- Gobl, C.¹ Mahshie, J.²

21
- 0024381490
- Klassifizierung von glottisdysfunktionen mit hilfe der elektroglottographie
- T. Hacki Klassifizierung von glottisdysfunktionen mit hilfe der elektroglottographie Folia Phoniatrica 1989 43 48
- (1989) Folia Phoniatrica , pp. 43-48
- Hacki, T.¹

22
- 0031023993
- Glottal characteristics of female speakers: Acoustic correlates
- H.M. Hanson Glottal characteristics of female speakers: acoustic correlates J. Acoust. Soc. Am. 101 1 1997 466 481
- (1997) J. Acoust. Soc. Am. , vol.101 , Issue.1 , pp. 466-481
- Hanson, H.M.¹

23
- 0025751820
- Approximation capabilities of multilayer feedforward networks
- K. Hornik Approximation capabilities of multilayer feedforward networks Neural Networks 4 2 1991 251 257
- (1991) Neural Networks , vol.4 , Issue.2 , pp. 251-257
- Hornik, K.¹

24
- 77950073346
- Spoken emotion recognition through optimum-path forest classification using glottal features
- I. Iliev, M. Scordilis, J. Papa, and A. Falco Spoken emotion recognition through optimum-path forest classification using glottal features Comput. Speech Lang. 24 3 2010 445 460
- (2010) Comput. Speech Lang. , vol.24 , Issue.3 , pp. 445-460
- Iliev, I.¹ Scordilis, M.² Papa, J.³ Falco, A.⁴

25
- 84875409944
- Automating manual user strategies for precise voice source analysis
- J. Kane, and C. Gobl Automating manual user strategies for precise voice source analysis Speech Commun. 55 3 2013 397 414
- (2013) Speech Commun. , vol.55 , Issue.3 , pp. 397-414
- Kane, J.¹ Gobl, C.²

26
- 84888256701
- Evaluation of automatic glottal source analysis
- Mons, Belgium
- Kane, J.; Gobl, C.; 2013. Evaluation of automatic glottal source analysis. In: Proceedings of NOLISP, Mons, Belgium, pp. 1-8.
- (2013) Proceedings of NOLISP , pp. 1-8
- Kane, J.¹ Gobl, C.²

27
- 84870254871
- Evaluation of glottal closure instant detection in a range of voice qualities
- J. Kane, and C. Gobl Evaluation of glottal closure instant detection in a range of voice qualities Speech Commun. 55 2 2013 295 314
- (2013) Speech Commun. , vol.55 , Issue.2 , pp. 295-314
- Kane, J.¹ Gobl, C.²

28
- 84875035728
- Wavelet maxima dispersion for breathy to tense voice discrimination
- J. Kane, and C. Gobl Wavelet maxima dispersion for breathy to tense voice discrimination IEEE Trans. Audio Speech Lang. Process. 21 6 2013 1170 1179
- (2013) IEEE Trans. Audio Speech Lang. Process. , vol.21 , Issue.6 , pp. 1170-1179
- Kane, J.¹ Gobl, C.²

29
- 84890470090
- Speaker and language independent voice quality classification applied to unlabelled corpora of expressive speech
- Vancouver, Canada
- Kane, J.; Scherer, S.; Aylett, M.; Morency, L.; Gobl, C.; 2013. Speaker and language independent voice quality classification applied to unlabelled corpora of expressive speech. In: Proceedings of ICASSP, Vancouver, Canada.
- (2013) Proceedings of ICASSP
- Kane, J.¹ Scherer, S.² Aylett, M.³ Morency, L.⁴ Gobl, C.⁵

30
- 84906263810
- Using phonetic feature extraction to determine optimal speech regions for maximising the effectiveness of glottal source analysis
- Lyon, France
- Kane, J.; Yanushevskaya, I.; Dalton, J.; Gobl, C.; Ní Chasaide, A.; 2013. Using phonetic feature extraction to determine optimal speech regions for maximising the effectiveness of glottal source analysis. In: Proceedings of Interspeech, Lyon, France.
- (2013) Proceedings of Interspeech
- Kane, J.¹ Yanushevskaya, I.² Dalton, J.³ Gobl, C.⁴ Ní Chasaide, A.⁵

31
- 33746254801
- Comparative study: HMM and SVM for automatic articulatory feature extraction
- Kanokphara, S.; Macek, J.; Carson-Berndsen, J.; 2006. Comparative study: HMM and SVM for automatic articulatory feature extraction. In: Proceedings of the 19th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems.
- (2006) Proceedings of the 19th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems
- Kanokphara, S.¹ Macek, J.² Carson-Berndsen, J.³

32
- 0034297586
- Detection of phonological features in continuous speech using neural networks
- S. King, and P. Taylor Detection of phonological features in continuous speech using neural networks Comput. Speech Lang. 14 2000 333 353
- (2000) Comput. Speech Lang. , vol.14 , pp. 333-353
- King, S.¹ Taylor, P.²

33
- 85090475413
- Pittsburgh, PA
- Kominek, J.; Black, A.; 2004. The CMU ARCTIC speech synthesis databases. ISCA speech synthesis workshop, Pittsburgh, PA, pp. 223-224.
- (2004) The CMU ARCTIC Speech Synthesis Databases. ISCA Speech Synthesis Workshop , pp. 223-224
- Kominek, J.¹ Black, A.²

34
- 0038676761
- Towards knowledge-based features for HMM based large vocabulary automatic speech recognition
- Launay, B.; Siohan, O.; Surendran, A.; Lee, C.; 2002. Towards knowledge-based features for HMM based large vocabulary automatic speech recognition. In: Proceedings of ICASSP, Orlando, Florida, USA, pp. 817-820.
- (2002) Proceedings of ICASSP, Orlando, Florida, USA , pp. 817-820
- Launay, B.¹ Siohan, O.² Surendran, A.³ Lee, C.⁴

35
- 84928462963
- Nonlinear interaction in voice production
- Lin, Q.; 1987. Nonlinear interaction in voice production. KTH, Speech Transmission Laboratory, Quarterly Report 28 (1), pp. 1-12.
- (1987) KTH, Speech Transmission Laboratory, Quarterly Report , vol.28 , Issue.1 , pp. 1-12
- Lin, Q.¹

36
- 51449108623
- Cascaded emotion classification via psychological emotion dimensions using a large set of voice quality parameters
- Lugger, M.; Yang, B.; 2008. Cascaded emotion classification via psychological emotion dimensions using a large set of voice quality parameters. In: Proceedings of ICASSP, Las Vegas, Nevada, USA, pp. 4945-4948.
- (2008) Proceedings of ICASSP, Las Vegas, Nevada, USA , pp. 4945-4948
- Lugger, M.¹ Yang, B.²

37
- 84966285002
- Automatic detection of acoustic centres of reliability for tagging paralinguistic information in expressive speech
- Mokhtari, P.; Campbell, N.; 2002. Automatic detection of acoustic centres of reliability for tagging paralinguistic information in expressive speech. In: Proceedings of Language Resources and Evaluation (LREC).
- (2002) Proceedings of Language Resources and Evaluation (LREC)
- Mokhtari, P.¹ Campbell, N.²

38
- 0038381003
- Automatic measurement of pressed/breathy phonation at acoustic centres of reliability in continuous speech
- Mokhtari, P.; Campbell, N.; 2003. Automatic measurement of pressed/breathy phonation at acoustic centres of reliability in continuous speech. IEICE Transactions on Information and Systems (special issue on speech information processing) E-86-D(3), pp. 574-582.
- (2003) IEICE Transactions on Information and Systems (Special Issue on Speech Information Processing) E-86-D(3) , pp. 574-582
- Mokhtari, P.¹ Campbell, N.²

39
- 30444446629
- Combining evidence from residual phase and MFCC features for speaker recognition
- K. Murty, and B. Yegnanarayana Combining evidence from residual phase and MFCC features for speaker recognition IEEE Signal Processing Lett. 13 1 2006 52 55
- (2006) IEEE Signal Processing Lett. , vol.13 , Issue.1 , pp. 52-55
- Murty, K.¹ Yegnanarayana, B.²

40
- 77957744515
- HMM-based speech synthesis utilizing glottal inverse filtering
- (1)
- T. Raitio, A. Suni, J. Yamagishi, H. Pulakka, J. Nurminen, M. Vainio, and P. Alku HMM-based speech synthesis utilizing glottal inverse filtering IEEE Trans. Audio Speech Lang. Process. 19 1 2011 153 165 (1)
- (2011) IEEE Trans. Audio Speech Lang. Process. , vol.19 , Issue.1 , pp. 153-165
- Raitio, T.¹ Suni, A.² Yamagishi, J.³ Pulakka, H.⁴ Nurminen, J.⁵ Vainio, M.⁶ Alku, P.⁷

41
- 84890547237
- Synthesis and perception of breathy, normal, and Lombard speech in the presence of noise
- T. Raitio, A. Suni, M. Vainio, and P. Alku Synthesis and perception of breathy, normal, and Lombard speech in the presence of noise Comput. Speech Lang. 28 2 2014 648 664
- (2014) Comput. Speech Lang. , vol.28 , Issue.2 , pp. 648-664
- Raitio, T.¹ Suni, A.² Vainio, M.³ Alku, P.⁴

42
- 84892724003
- Festival multisyn voices for the 2007 blizzard challenge
- Bonn, Germany
- Richmond, K.; Strom, V.; Clark, R.; Yamagishi, J.; Fitt, S.; 2007. Festival multisyn voices for the 2007 blizzard challenge. In: Proc. Blizzard Challenge Workshop (in Proc. SSW6), Bonn, Germany.
- (2007) Proc. Blizzard Challenge Workshop (In Proc. SSW6)
- Richmond, K.¹ Strom, V.² Clark, R.³ Yamagishi, J.⁴ Fitt, S.⁵

43
- 67650999674
- A study on integrating acoustic-phonetic information into lattice rescoring for automatic speech recognition
- S. Siniscalchi, and C. Lee A study on integrating acoustic-phonetic information into lattice rescoring for automatic speech recognition Speech Commun. 51 11 2009 1139 1153
- (2009) Speech Commun. , vol.51 , Issue.11 , pp. 1139-1153
- Siniscalchi, S.¹ Lee, C.²

44
- 84875405186
- Exploiting deep neural networks for detection-based speech recognition
- S. Siniscalchi, D. Yu, L. Deng, and C. Lee Exploiting deep neural networks for detection-based speech recognition Neurocomputing 106 2013 148 157
- (2013) Neurocomputing , vol.106 , pp. 148-157
- Siniscalchi, S.¹ Yu, D.² Deng, L.³ Lee, C.⁴

45
- 38549135463
- A spectral method for estimation of the voice speed quotient and evaluation using electroglottography
- Groningen, The Netherlands
- Sturmel, N.; d'Alessandro, C.; Doval, B.; 2006. A spectral method for estimation of the voice speed quotient and evaluation using electroglottography. In: 7th Conference on Advances in Quantitative Laryngology, Groningen, The Netherlands.
- (2006) 7th Conference on Advances in Quantitative Laryngology
- Sturmel, N.¹ D'Alessandro, C.² Doval, B.³

46
- 84867584684
- Detecting a targeted voice style in an audiobook using voice quality features
- Kyoto, Japan
- Székely, É.; Kane, J.; Scherer, S.; Gobl, C.; Carson-Berndsen, J.; 2012. Detecting a targeted voice style in an audiobook using voice quality features. In: Proceedings of ICASSP, Kyoto, Japan, 4593-4596.
- (2012) Proceedings of ICASSP , pp. 4593-4596
- Székely, É.¹ Kane, J.² Scherer, S.³ Gobl, C.⁴ Carson-Berndsen, J.⁵

47
- 84892702959
- HARTFEX: A multi-dimentional system of HMM based recognisers for articulatory features extraction
- Tarek, A.; Carson-Berndsen, J.; 2003. HARTFEX: a multi-dimentional system of HMM based recognisers for articulatory features extraction. In: Proceedings of Non-Linear Speech Processing Workshop (NOLISP03).
- (2003) Proceedings of Non-Linear Speech Processing Workshop (NOLISP03)
- Tarek, A.¹ Carson-Berndsen, J.²

48
- 0003236089
- Evidence for nonlinear sound production mechanisms in the vocal tract
- W.J. Hardcastle, A. Marchal, Kluwer Academic
- H.M. Teager, and S.M. Teager Evidence for nonlinear sound production mechanisms in the vocal tract W.J. Hardcastle, A. Marchal, Speech Production and Speech Modelling 1990 Kluwer Academic 241 261
- (1990) Speech Production and Speech Modelling , pp. 241-261
- Teager, H.M.¹ Teager, S.M.²

49
- 39149117062
- A review of glottal waveform analysis
- Y. Stylianou, M. Faundez-Zanuy, A. Esposito, Springer Verlag
- J. Walker, and P. Murphy A review of glottal waveform analysis Y. Stylianou, M. Faundez-Zanuy, A. Esposito, Progress in Nonlinear Speech Processing 2007 Springer Verlag 1 21
- (2007) Progress in Nonlinear Speech Processing , pp. 1-21
- Walker, J.¹ Murphy, P.²

50
- 60749097551
- Cambridge University Press
- Steve J. Young, D. Kershaw, J. Odell, D. Ollason, V. Valtchev, and P. Woodland The HTK Book Version 3.4 2007 Cambridge University Press
- (2007) The HTK Book Version 3.4
- Young, S.J.¹ Kershaw, D.² Odell, J.³ Ollason, D.⁴ Valtchev, V.⁵ Woodland, P.⁶

51
- 84867329143
- Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition
- Yu, D.; Siniscalchi, S.; Deng, L.; Lee, C.; 2012. Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition. In: Proceedings of ICASSP, pp. 4169-4172.
- (2012) Proceedings of ICASSP , pp. 4169-4172
- Yu, D.¹ Siniscalchi, S.² Deng, L.³ Lee, C.⁴

52
- 33947583290
- Integration of complementary acoustic features for speaker recognition
- N. Zheng, T. Lee, and P. Ching Integration of complementary acoustic features for speaker recognition IEEE Signal Processing Lett. 14 3 2007 181 184
- (2007) IEEE Signal Processing Lett. , vol.14 , Issue.3 , pp. 181-184
- Zheng, N.¹ Lee, T.² Ching, P.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.