-
3
-
-
78650977476
-
openSMILE: The Munich versatile and fast open-source audio feature extractor
-
ACM
-
F. Eyben, M. Wöllmer, and B. Schuller, "openSMILE: The Munich versatile and fast open-source audio feature extractor," in Proc. ACM Multimedia (MM), 2010, pp. 1459-1462, ACM, http://www.openaudio.eu.
-
(2010)
Proc. ACM Multimedia (MM)
, pp. 1459-1462
-
-
Eyben, F.1
Wöllmer, M.2
Schuller, B.3
-
4
-
-
84874281338
-
The Kaldi speech recognition toolkit
-
D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovsky, G. Stemmer, and K. Vesely, "The Kaldi Speech Recognition Toolkit," in IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, 2011.
-
(2011)
IEEE 2011 Workshop on Automatic Speech Recognition and Understanding
-
-
Povey, D.1
Ghoshal, A.2
Boulianne, G.3
Burget, L.4
Glembek, O.5
Goel, N.6
Hannemann, M.7
Motlicek, P.8
Qian, Y.9
Schwarz, P.10
Silovsky, J.11
Stemmer, G.12
Vesely, K.13
-
6
-
-
84865734075
-
Joint robust voicing detection and pitch estimation based on residual harmonics
-
T. Drugman and A. Alwan, "Joint robust voicing detection and pitch estimation based on residual harmonics," in Proc. Interspeech, 2011, pp. 1973-1976.
-
(2011)
Proc. Interspeech
, pp. 1973-1976
-
-
Drugman, T.1
Alwan, A.2
-
7
-
-
84874971457
-
Residual excitation skewness for automatic speech polarity detection
-
T. Drugman, "Residual excitation skewness for automatic speech polarity detection," IEEE Sig. Proc. Lett., vol. 20, no. 4, pp. 387-390, 2013.
-
(2013)
IEEE Sig. Proc. Lett.
, vol.20
, Issue.4
, pp. 387-390
-
-
Drugman, T.1
-
8
-
-
84863419425
-
Detection of glottal closure instants from speech signals: A quantitative review
-
T. Drugman, M. Thomas, J. Gudnason, P. Naylor, and T. Dutoit, "Detection of glottal closure instants from speech signals: A quantitative review," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 20, no. 3, pp. 994-1006, 2012.
-
(2012)
IEEE Trans. on Audio, Speech, and Lang. Proc.
, vol.20
, Issue.3
, pp. 994-1006
-
-
Drugman, T.1
Thomas, M.2
Gudnason, J.3
Naylor, P.4
Dutoit, T.5
-
9
-
-
70450198169
-
Glottal closure and opening instant detection from speech signals
-
T. Drugman and T. Dutoit, "Glottal closure and opening instant detection from speech signals," in Proc. Interspeech, 2009.
-
(2009)
Proc. Interspeech
-
-
Drugman, T.1
Dutoit, T.2
-
10
-
-
84870254871
-
Evaluation of glottal closure instant detection in a range of voice qualities
-
J. Kane and C. Gobl, "Evaluation of glottal closure instant detection in a range of voice qualities," Speech Commun., vol. 55, no. 2, pp. 295-314, 2013.
-
(2013)
Speech Commun.
, vol.55
, Issue.2
, pp. 295-314
-
-
Kane, J.1
Gobl, C.2
-
11
-
-
84863772450
-
Speech analysis/synthesis based on a sinusoidal representation
-
R. McAulay and T. Quatieri, "Speech analysis/synthesis based on a sinusoidal representation," IEEE Trans. on Acoust., Speech, and Signal Proc., vol. 34, no. 4, pp. 744-754, 1986.
-
(1986)
IEEE Trans. on Acoust., Speech, and Signal Proc.
, vol.34
, Issue.4
, pp. 744-754
-
-
McAulay, R.1
Quatieri, T.2
-
12
-
-
0003447548
-
-
Ph.D. thesis, TelecomParis, France
-
Y. Stylianou, Harmonic plus Noise Models for Speech combined with Statistical Methods, for Speech and Speaker Modification, Ph.D. thesis, TelecomParis, France, 1996.
-
(1996)
Harmonic Plus Noise Models for Speech Combined with Statistical Methods, for Speech and Speaker Modification
-
-
Stylianou, Y.1
-
13
-
-
84867221046
-
On the properties of a time-varying quasi-harmonic model of speech
-
Y. Pantazis, O. Rosec, and Y. Stylianou, "On the properties of a time-varying quasi-harmonic model of speech," in Proc. Interspeech, 2008, pp. 1044-1047.
-
(2008)
Proc. Interspeech
, pp. 1044-1047
-
-
Pantazis, Y.1
Rosec, O.2
Stylianou, Y.3
-
14
-
-
78049294305
-
Adaptive AM-FM signal decomposition with application to speech analysis
-
Y. Pantazis, O. Rosec, and Y. Stylianou, "Adaptive AM-FM signal decomposition with application to speech analysis," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 19, no. 2, pp. 290-300, 2010.
-
(2010)
IEEE Trans. on Audio, Speech, and Lang. Proc.
, vol.19
, Issue.2
, pp. 290-300
-
-
Pantazis, Y.1
Rosec, O.2
Stylianou, Y.3
-
15
-
-
84867585959
-
An extension of the adaptive quasi-harmonic model
-
G.P. Kafentzis, Y. Pantazis, O. Rosec, and Y. Stylianou, "An extension of the adaptive quasi-harmonic model," in Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP), 2012.
-
(2012)
Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP)
-
-
Kafentzis, G.P.1
Pantazis, Y.2
Rosec, O.3
Stylianou, Y.4
-
16
-
-
84881041616
-
Analysis and synthesis of speech using an adaptive full-band harmonic model
-
G. Degottex and Y. Stylianou, "Analysis and synthesis of speech using an adaptive full-band harmonic model," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 21, no. 10, pp. 2085-2095, 2013.
-
(2013)
IEEE Trans. on Audio, Speech, and Lang. Proc.
, vol.21
, Issue.10
, pp. 2085-2095
-
-
Degottex, G.1
Stylianou, Y.2
-
17
-
-
0026106454
-
Discrete all-pole modeling
-
A. El-Jaroudi and J. Makhoul, "Discrete all-pole modeling," IEEE Transactions on Sign. Proc., vol. 39, no. 2, pp. 411-423, 1991.
-
(1991)
IEEE Transactions on Sign. Proc.
, vol.39
, Issue.2
, pp. 411-423
-
-
El-Jaroudi, A.1
Makhoul, J.2
-
18
-
-
85008008295
-
Phase minimization for glottal model estimation
-
G. Degottex, A. Roebel, and X. Rodet, "Phase minimization for glottal model estimation," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 19, no. 5, pp. 1080-1090, 2011.
-
(2011)
IEEE Trans. on Audio, Speech, and Lang. Proc.
, vol.19
, Issue.5
, pp. 1080-1090
-
-
Degottex, G.1
Roebel, A.2
Rodet, X.3
-
20
-
-
4243858909
-
Spectral envelope extraction by improved cepstral method
-
in Japanese
-
S. Imai and Y. Abe, "Spectral envelope extraction by improved cepstral method," Electronics and Communication, vol. 62-A, no. 4, pp. 10-17, 1979, in Japanese.
-
(1979)
Electronics and Communication
, vol.62-A
, Issue.4
, pp. 10-17
-
-
Imai, S.1
Abe, Y.2
-
22
-
-
34249699086
-
On cepstral and all-pole based spectral envelope modeling with unknown model order
-
A. Roebel, F. Villavicencio, and X. Rodet, "On cepstral and all-pole based spectral envelope modeling with unknown model order," Pattern Recognition Letters, vol. 28, no. 11, pp. 1343-1350, 2007.
-
(2007)
Pattern Recognition Letters
, vol.28
, Issue.11
, pp. 1343-1350
-
-
Roebel, A.1
Villavicencio, F.2
Rodet, X.3
-
23
-
-
0027560122
-
Robust signal selection for linear prediction analysis of voiced speech
-
C. Ma, Y. Kamp, and L.F. Willems, "Robust signal selection for linear prediction analysis of voiced speech," Speech Commun., vol. 12, no. 2, pp. 69-81, 1993.
-
(1993)
Speech Commun.
, vol.12
, Issue.2
, pp. 69-81
-
-
Ma, C.1
Kamp, Y.2
Willems, L.F.3
-
24
-
-
61849149377
-
Stabilised weighted linear prediction
-
C. Magi, J. Pohjalainen, T. Bäckström, and P. Alku, "Stabilised weighted linear prediction," Speech Commun., vol. 51, no. 5, pp. 401-411, 2009.
-
(2009)
Speech Commun.
, vol.51
, Issue.5
, pp. 401-411
-
-
Magi, C.1
Pohjalainen, J.2
Bäckström, T.3
Alku, P.4
-
25
-
-
79959832654
-
Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditions
-
J. Pohjalainen, R. Saeidi, T. Kinnunen, and P. Alku, "Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditions," in Proc. Interspeech, 2010, pp. 1477-1480.
-
(2010)
Proc. Interspeech
, pp. 1477-1480
-
-
Pohjalainen, J.1
Saeidi, R.2
Kinnunen, T.3
Alku, P.4
-
26
-
-
33947130944
-
Improved differential phase spectrum processing for formant tracking
-
B. Bozkurt, B. Doval, C. d'Alessandro, and T. Dutoit, "Improved differential phase spectrum processing for formant tracking," Proc. ICSLP, 2004.
-
(2004)
Proc. ICSLP
-
-
Bozkurt, B.1
Doval, B.2
d'Alessandro, C.3
Dutoit, T.4
-
27
-
-
33745473319
-
Advanced methods for glottal wave extraction
-
M. Faundez-Zanuy et al., Eds. Springer Berlin/Heidelberg
-
J. Walker and P. Murphy, "Advanced methods for glottal wave extraction," in Nonlinear Analyses and Algorithms for Speech Processing, M. Faundez-Zanuy et al., Eds., pp. 139-149. Springer Berlin/Heidelberg, 2005.
-
(2005)
Nonlinear Analyses and Algorithms for Speech Processing
, pp. 139-149
-
-
Walker, J.1
Murphy, P.2
-
28
-
-
84856294347
-
Glottal inverse filtering analysis of human voice production-A review of estimation and parameterization methods of the glottal excitation and their applications
-
P. Alku, "Glottal inverse filtering analysis of human voice production-A review of estimation and parameterization methods of the glottal excitation and their applications," Sadhana, vol. 36, no. 5, pp. 623-650, 2011.
-
(2011)
Sadhana
, vol.36
, Issue.5
, pp. 623-650
-
-
Alku, P.1
-
29
-
-
80955173659
-
A comparative study of glottal source estimation techniques
-
T. Drugman, B. Bozkurt, and T. Dutoit, "A comparative study of glottal source estimation techniques," Comp. Speech & Lang., vol. 26, no. 1, pp. 20-34, 2012.
-
(2012)
Comp. Speech & Lang.
, vol.26
, Issue.1
, pp. 20-34
-
-
Drugman, T.1
Bozkurt, B.2
Dutoit, T.3
-
30
-
-
0026881384
-
Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering
-
P. Alku, "Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering," Speech Commun., vol. 11, no. 2-3, pp. 109-118, 1992.
-
(1992)
Speech Commun.
, vol.11
, Issue.2-3
, pp. 109-118
-
-
Alku, P.1
-
31
-
-
79955528226
-
Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation
-
T. Drugman, B. Bozkurt, and T. Dutoit, "Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation," Speech Commun., vol. 53, no. 6, pp. 855-866, 2011.
-
(2011)
Speech Commun.
, vol.53
, Issue.6
, pp. 855-866
-
-
Drugman, T.1
Bozkurt, B.2
Dutoit, T.3
-
32
-
-
33750333146
-
Performance of glottal inverse filtering as tested by aeroelastic modelling of phonation and FE modelling of vocal tract
-
P. Alku, J. Horacek, M. Airas, F. Griffond-Boitier, and A.-M. Laukkanen, "Performance of glottal inverse filtering as tested by aeroelastic modelling of phonation and FE modelling of vocal tract," Acta Acustica united with Acustica, vol. 92, pp. 717-724, 2006.
-
(2006)
Acta Acustica United with Acustica
, vol.92
, pp. 717-724
-
-
Alku, P.1
Horacek, J.2
Airas, M.3
Griffond-Boitier, F.4
Laukkanen, A.-M.5
-
33
-
-
32944458861
-
Estimation of the voice source from speech pressure signals: Evaluation of an inverse filtering technique using physical modelling of voice production
-
P. Alku, B. Story, and M. Airas, "Estimation of the voice source from speech pressure signals: Evaluation of an inverse filtering technique using physical modelling of voice production," Folia Phoniatrica et Logopaedica, vol. 58, no. 1, pp. 102-113, 2006.
-
(2006)
Folia Phoniatrica et Logopaedica
, vol.58
, Issue.1
, pp. 102-113
-
-
Alku, P.1
Story, B.2
Airas, M.3
-
34
-
-
70450170853
-
Complex cepstrum-based decomposition of speech for glottal source estimation
-
T. Drugman, B. Bozkurt, and T. Dutoit, "Complex cepstrum-based decomposition of speech for glottal source estimation," in Proc. Interspeech, 2009, pp. 116-119.
-
(2009)
Proc. Interspeech
, pp. 116-119
-
-
Drugman, T.1
Bozkurt, B.2
Dutoit, T.3
-
35
-
-
0036339929
-
Normalized amplitude quotient for parameterization of the glottal flow
-
P. Alku, T. Bäckström, and E. Vilkman, "Normalized amplitude quotient for parameterization of the glottal flow," J. Acoust. Soc. Am., vol. 112, no. 2, pp. 701-710, 2002.
-
(2002)
J. Acoust. Soc. Am.
, vol.112
, Issue.2
, pp. 701-710
-
-
Alku, P.1
Bäckström, T.2
Vilkman, E.3
-
36
-
-
0024381490
-
Klassifizierung von Glottisdysfunktionen mit Hilfe der Elektroglottographie
-
T. Hacki, "Klassifizierung von Glottisdysfunktionen mit Hilfe der Elektroglottographie," Folia Phoniatrica, pp. 43-48, 1989.
-
(1989)
Folia Phoniatrica
, pp. 43-48
-
-
Hacki, T.1
-
37
-
-
0026680776
-
Vocal intensity in speakers and singers
-
I. Titze and J. Sundberg, "Vocal intensity in speakers and singers," J. Acoust. Soc. Am., vol. 91, no. 5, pp. 2936-2946, 1992.
-
(1992)
J. Acoust. Soc. Am.
, vol.91
, Issue.5
, pp. 2936-2946
-
-
Titze, I.1
Sundberg, J.2
-
38
-
-
0025786649
-
Voice quality factors: Analysis, synthesis and perception
-
D. Childers and C. Lee, "Voice quality factors: Analysis, synthesis and perception," J. Acoust. Soc. Am., vol. 90, no. 5, pp. 2394-2410, 1991.
-
(1991)
J. Acoust. Soc. Am.
, vol.90
, Issue.5
, pp. 2394-2410
-
-
Childers, D.1
Lee, C.2
-
39
-
-
0031189455
-
Parabolic spectral parameter-A new method for quantification of the glottal flow
-
P. Alku, H. Strik, and E. Vilkman, "Parabolic spectral parameter-A new method for quantification of the glottal flow," Speech Commun., vol. 22, no. 1, pp. 67-79, 1997.
-
(1997)
Speech Commun.
, vol.22
, Issue.1
, pp. 67-79
-
-
Alku, P.1
Strik, H.2
Vilkman, E.3
-
40
-
-
84875035728
-
Wavelet maxima dispersion for breathy to tense voice discrimination
-
J. Kane and C. Gobl, "Wavelet maxima dispersion for breathy to tense voice discrimination," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 21, no. 6, pp. 1170-1179, 2013.
-
(2013)
IEEE Trans. on Audio, Speech, and Lang. Proc.
, vol.21
, Issue.6
, pp. 1170-1179
-
-
Kane, J.1
Gobl, C.2
-
41
-
-
84865726860
-
Identifying regions of non-modal phonation using features of the wavelet transform
-
J. Kane and C. Gobl, "Identifying regions of non-modal phonation using features of the wavelet transform," in Proc. Interspeech, 2011, pp. 177-180.
-
(2011)
Proc. Interspeech
, pp. 177-180
-
-
Kane, J.1
Gobl, C.2
-
42
-
-
38049028071
-
The LF-model revisited. Transformations and frequency domain analysis
-
G. Fant, "The LF-model revisited. Transformations and frequency domain analysis," STL-QPSR, vol. 36, no. 2-3, pp. 119-156, 1995.
-
(1995)
STL-QPSR
, vol.36
, Issue.2-3
, pp. 119-156
-
-
Fant, G.1
-
43
-
-
33947684811
-
A four-parameter model of glottal flow
-
G. Fant, J. Liljencrants, and Q. Lin, "A four-parameter model of glottal flow," STL-QPSR, vol. 4, pp. 1-13, 1985.
-
(1985)
STL-QPSR
, vol.4
, pp. 1-13
-
-
Fant, G.1
Liljencrants, J.2
Lin, Q.3
-
44
-
-
84905286326
-
Automatic analysis of creaky excitation patterns
-
submitted
-
T. Drugman, J. Kane, and C. Gobl, "Automatic analysis of creaky excitation patterns," Comp. Speech & Lang., 2013, submitted.
-
(2013)
Comp. Speech & Lang.
-
-
Drugman, T.1
Kane, J.2
Gobl, C.3
-
45
-
-
84875958321
-
Improved automatic detection of creak
-
J. Kane, T. Drugman, and C. Gobl, "Improved automatic detection of creak," Comp. Speech & Lang., vol. 27, no. 4, pp. 1028-1047, 2013.
-
(2013)
Comp. Speech & Lang.
, vol.27
, Issue.4
, pp. 1028-1047
-
-
Kane, J.1
Drugman, T.2
Gobl, C.3
-
47
-
-
0020178566
-
On the audibility of midrange phase distortion in audio systems
-
S. P. Lipshitz, M. Pocock, and J. Vanderkooy, "On the audibility of midrange phase distortion in audio systems," J. Audio Eng. Soc., vol. 30, no. 9, pp. 580-595, 1982.
-
(1982)
J. Audio Eng. Soc.
, vol.30
, Issue.9
, pp. 580-595
-
-
Lipshitz, S.P.1
Pocock, M.2
Vanderkooy, J.3
-
48
-
-
0036286580
-
The effect of group delay spectrum on timbre
-
H. Banno, K. Takeda, and F. Itakura, "The effect of group delay spectrum on timbre," Acoust. Sc. and Techn., vol. 23, no. 1, pp. 1-9, 2002.
-
(2002)
Acoust. Sc. and Techn.
, vol.23
, Issue.1
, pp. 1-9
-
-
Banno, H.1
Takeda, K.2
Itakura, F.3
-
49
-
-
63349083218
-
Simple representation of signal phase for harmonic speech models
-
I. Saratxaga, I. Hernaez, D. Erro, E. Navas, and J. Sanchez, "Simple representation of signal phase for harmonic speech models," Electronics Letters, vol. 45, no. 7, pp. 381-383, 2009.
-
(2009)
Electronics Letters
, vol.45
, Issue.7
, pp. 381-383
-
-
Saratxaga, I.1
Hernaez, I.2
Erro, D.3
Navas, E.4
Sanchez, J.5
-
50
-
-
84878377933
-
Perceptual importance of the phase related information in speech
-
I. Saratxaga, I. Hernaez, M. Pucher, and I. Sainz, "Perceptual importance of the phase related information in speech," in Proc. Interspeech. ISCA, 2012.
-
(2012)
Proc. Interspeech. ISCA
-
-
Saratxaga, I.1
Hernaez, I.2
Pucher, M.3
Sainz, I.4
-
51
-
-
84865369980
-
Evaluation of speaker verification security and detection of HMM-based synthetic speech
-
P.L. De Leon, M. Pucher, J. Yamagishi, I. Hernaez, and I. Saratxaga, "Evaluation of speaker verification security and detection of HMM-based synthetic speech," IEEE Trans. on Audio, Speech, and Lang. Proc., vol. 20, no. 8, pp. 2280-2290, 2012.
-
(2012)
IEEE Trans. on Audio, Speech, and Lang. Proc.
, vol.20
, Issue.8
, pp. 2280-2290
-
-
De Leon, P.L.1
Pucher, M.2
Yamagishi, J.3
Hernaez, I.4
Saratxaga, I.5
-
52
-
-
80051617842
-
Function of phase-distortion for glottal model estimation
-
G. Degottex, A. Roebel, and X. Rodet, "Function of phase-distortion for glottal model estimation," in Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP), 2011, pp. 4608-4611.
-
(2011)
Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP)
, pp. 4608-4611
-
-
Degottex, G.1
Roebel, A.2
Rodet, X.3
-
53
-
-
84902969948
-
Usual voice quality features and glottal features for emotional valence detection
-
M. Tahon, G. Degottex, and L. Devillers, "Usual voice quality features and glottal features for emotional valence detection," in Proc. Intl. Conf. on Speech Prosody, 2012, pp. 693-696.
-
(2012)
Proc. Intl. Conf. on Speech Prosody
, pp. 693-696
-
-
Tahon, M.1
Degottex, G.2
Devillers, L.3
-
54
-
-
33947159989
-
Chirp group delay analysis of speech signals
-
B. Bozkurt, L. Couvreur, and T. Dutoit, "Chirp group delay analysis of speech signals," Speech Commun., vol. 49, pp. 159-176, 2007.
-
(2007)
Speech Commun.
, vol.49
, pp. 159-176
-
-
Bozkurt, B.1
Couvreur, L.2
Dutoit, T.3
-
55
-
-
80051636668
-
Phase-based information for voice pathology detection
-
T. Drugman, T. Dubuisson, and T. Dutoit, "Phase-based information for voice pathology detection," Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP), pp. 4612-4615, 2011.
-
(2011)
Proc. IEEE Int. Conf. on Acoust. Speech and Signal Proc. (ICASSP)
, pp. 4612-4615
-
-
Drugman, T.1
Dubuisson, T.2
Dutoit, T.3
-
56
-
-
48149090536
-
Relevant feature selection for audiovisual speech recognition
-
T. Drugman, M. Gurban, and J.-P. Thiran, "Relevant feature selection for audiovisual speech recognition," IEEE Intl Workshop on Multimedia Signal Processing, pp. 179-182, 2007.
-
(2007)
IEEE Intl Workshop on Multimedia Signal Processing
, pp. 179-182
-
-
Drugman, T.1
Gurban, M.2
Thiran, J.-P.3