-
1
-
-
84966341178
-
The impact of speech recognition on speech synthesis
-
Santa Monica, CA, Sep.
-
M. Ostendorf and I. Bulyko, "The impact of speech recognition on speech synthesis," in Proc. IEEE Workshop Speech Synth., Santa Monica, CA, Sep. 2002, pp. 99-106.
-
(2002)
Proc. IEEE Workshop Speech Synth.
, pp. 99-106
-
-
Ostendorf, M.1
Bulyko, I.2
-
2
-
-
70349227947
-
The application of hidden Markov models in speech recognition
-
M. Gales and S. Young, "The application of hidden Markov models in speech recognition," Foundat. Trends Signal Process., vol. 1, no. 3, pp. 195-304, 2007.
-
(2007)
Foundat. Trends Signal Process.
, vol.1
, Issue.3
, pp. 195-304
-
-
Gales, M.1
Young, S.2
-
3
-
-
67651002140
-
Statistical parametric speech synthesis
-
H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Commun., 2009, doi: 10.1016/j.specom.2009.04.004.
-
(2009)
Speech Commun.
, doi: 10.1016/j.specom.2009.04.004
-
-
Zen, H.1
Tokuda, K.2
Black, A.W.3
-
4
-
-
84867203039
-
Unsupervised adaptation for HMM-based speech synthesis
-
Sep.
-
S. King, K. Tokuda, H. Zen, and J. Yamagishi, "Unsupervised adaptation for HMM-based speech synthesis," in Proc. Interspeech'08, Sep. 2008, pp. 1869-1872.
-
(2008)
Proc. Interspeech'08
, pp. 1869-1872
-
-
King, S.1
Tokuda, K.2
Zen, H.3
Yamagishi, J.4
-
5
-
-
70450185735
-
Two-pass decision tree construction for unsupervised adaptation of HMM-based synthesis models
-
Brighton, U.K. Sep.
-
M. Gibson, "Two-pass decision tree construction for unsupervised adaptation of HMM-based synthesis models," in Proc. Interspeech, Brighton, U.K., Sep. 2009, pp. 1791-1794.
-
(2009)
Proc. Interspeech
, pp. 1791-1794
-
-
Gibson, M.1
-
6
-
-
78049369783
-
A comparison of supervised and unsupervised cross-lingual speaker adaptation approaches for HMM-based speech synthesis
-
H. Liang, J. Dines, and L. Saheer, "A comparison of supervised and unsupervised cross-lingual speaker adaptation approaches for HMM-based speech synthesis," in Proc. ICASSP, Dallas, TX, 2010, pp. 4598-4601.
-
(2010)
Proc. ICASSP, Dallas, TX
, pp. 4598-4601
-
-
Liang, H.1
Dines, J.2
Saheer, L.3
-
7
-
-
0142192295
-
Conditional random fields: Probabilistic models for segmenting and labeling sequence data
-
Williamstown, MA
-
J. Lafferty, A. McCallum, and F. Pereira, "Conditional random fields: Probabilistic models for segmenting and labeling sequence data," in Proc. ICML, Williamstown, MA, 2001, pp. 282-289.
-
(2001)
Proc. ICML
, pp. 282-289
-
-
Lafferty, J.1
McCallum, A.2
Pereira, F.3
-
9
-
-
0036296863
-
Minimum phone error and I-smoothing for improved discriminative training
-
D. Povey and P. C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. ICASSP, Orlando, FL, 2002, pp. 105-108.
-
(2002)
Proc. ICASSP, Orlando, FL
, pp. 105-108
-
-
Povey, D.1
Woodland, P.C.2
-
10
-
-
33846429403
-
Minimum generation error training for HMM-based speech synthesis
-
Toulouse, France
-
Y.-J. Wu and R.-H. Wang, "Minimum generation error training for HMM-based speech synthesis," in Proc. ICASSP, Toulouse, France, 2006, pp. 89-92.
-
(2006)
Proc. ICASSP
, pp. 89-92
-
-
Wu, Y.-J.1
Wang, R.-H.2
-
11
-
-
0024610919
-
A tutorial on hidden Markov models and selected applications in speech recognition
-
Feb.
-
L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
-
(1989)
Proc. IEEE
, vol.77
, Issue.2
, pp. 257-286
-
-
Rabiner, L.R.1
-
12
-
-
38149010136
-
A hidden Markov model approach to speech synthesis
-
Paris, France
-
A. Falaschi, M. Giustiniani, and M. Verola, "A hidden Markov model approach to speech synthesis," in Proc. Eurospeech, Paris, France, 1989, pp. 187-190.
-
(1989)
Proc. Eurospeech
, pp. 187-190
-
-
Falaschi, A.1
Giustiniani, M.2
Verola, M.3
-
13
-
-
0019555090
-
Cepstral analysis technique for automatic speaker verification
-
Apr.
-
S. Furui, "Cepstral analysis technique for automatic speaker verification," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-29, no. 2, pp. 254-272, Apr. 1981.
-
(1981)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.ASSP-29
, Issue.2
, pp. 254-272
-
-
Furui, S.1
-
14
-
-
74149089478
-
Hidden semi-Markov models
-
Feb.
-
S.-Z. Yu, "Hidden semi-Markov models," Artificial Intell., vol. 174, no. 2, pp. 215-243, Feb. 2009.
-
(2009)
Artificial Intell.
, vol.174
, Issue.2
, pp. 215-243
-
-
Yu, S.-Z.1
-
15
-
-
85009231267
-
Trajectory modeling based on HMMs with explicit relationship between static and dynamic features
-
Geneva, Switzerland
-
K. Tokuda, H. Zen, and T. Kitamura, "Trajectory modeling based on HMMs with explicit relationship between static and dynamic features," in Proc. Eurospeech, Geneva, Switzerland, 2003, pp. 865-868.
-
(2003)
Proc. Eurospeech
, pp. 865-868
-
-
Tokuda, K.1
Zen, H.2
Kitamura, T.3
-
16
-
-
0026854213
-
A generalised hidden Markov model with state-conditioned trend functions of time for the speech signal
-
Apr.
-
L. Deng, "A generalised hidden Markov model with state-conditioned trend functions of time for the speech signal," Signal Process., vol. 27, pp. 65-78, Apr. 1992.
-
(1992)
Signal Process.
, vol.27
, pp. 65-78
-
-
Deng, L.1
-
17
-
-
0034854701
-
Trainable speech synthesis with trended hidden Markov models
-
J. Dines, S. Sridharan, and M. Moody, "Trainable speech synthesis with trended hidden Markov models," in Proc. ICASSP, Salt Lake City, UT, 2001, pp. 833-836.
-
(2001)
Proc. ICASSP, Salt Lake City, UT
, pp. 833-836
-
-
Dines, J.1
Sridharan, S.2
Moody, M.3
-
18
-
-
0023211846
-
Explicit time correlation in hidden Markov models for speech recognition
-
C. Wellekens, "Explicit time correlation in hidden Markov models for speech recognition," in Proc. ICASSP, Dallas, TX, 1987, vol. 12, pp. 384-386.
-
(1987)
Proc. ICASSP, Dallas, TX
, vol.12
, pp. 384-386
-
-
Wellekens, C.1
-
19
-
-
70450175584
-
Autoregressive HMMs for speech synthesis
-
Brighton, U.K.
-
M. Shannon and W. Byrne, "Autoregressive HMMs for speech synthesis," in Proc. Interspeech, Brighton, U.K., 2009.
-
(2009)
Proc. Interspeech
-
-
Shannon, M.1
Byrne, W.2
-
20
-
-
0003911245
-
A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics
-
L. Deng and J. Ma, "A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics," in Proc. Eurospeech, Budapest, Hungary, 1999, pp. 1499-1502.
-
(1999)
Proc. Eurospeech, Budapest, Hungary
, pp. 1499-1502
-
-
Deng, L.1
Ma, J.2
-
21
-
-
54349106040
-
Switching linear dynamical systems for noise robust speech recognition
-
Aug.
-
B. Mesot and D. Barber, "Switching linear dynamical systems for noise robust speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 6, pp. 1850-1858, Aug. 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.15
, Issue.6
, pp. 1850-1858
-
-
Mesot, B.1
Barber, D.2
-
22
-
-
79959849719
-
Autoregressive clustering for HMM speech synthesis
-
Makuhari, Japan
-
M. Shannon and W. Byrne, "Autoregressive clustering for HMM speech synthesis," in Proc. Interspeech, Makuhari, Japan, 2010.
-
(2010)
Proc. Interspeech
-
-
Shannon, M.1
Byrne, W.2
-
23
-
-
78649270883
-
Learning deep architectures for AI
-
Y. Bengio, "Learning deep architectures for AI," Univ. de Montréal, Montreal, QC, Canada, Tech. Rep. 1312, 2007.
-
(2007)
Tech. Rep.
, vol.1312
-
-
Bengio, Y.1
-
24
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
G. E. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Comput., vol. 18, pp. 1527-1554, 2006.
-
(2006)
Neural Comput.
, vol.18
, pp. 1527-1554
-
-
Hinton, G.E.1
Osindero, S.2
Teh, Y.3
-
25
-
-
78649297301
-
Deep belief networks for phone recognition
-
Whistler, Canada
-
A.-R. Mohamed, G. Dahl, and G. Hinton, "Deep belief networks for phone recognition," in Proc. NIPS Workshop Deep Learn. Speech Recogn. Rel. Applicat., Whistler, Canada, 2009.
-
(2009)
Proc. NIPS Workshop Deep Learn. Speech Recogn. Rel. Applicat.
-
-
Mohamed, A.-R.1
Dahl, G.2
Hinton, G.3
-
26
-
-
78649277342
-
Decision trees do not generalize to new variations
-
Y. Bengio, O. Delalleau, and C. Simard, "Decision trees do not generalize to new variations," Univ. de Montréal, Montreal, QC, Canada, Tech. Rep. 1304, 2006.
-
(2006)
Tech. Rep.
, vol.1304
-
-
Bengio, Y.1
Delalleau, O.2
Simard, C.3
-
27
-
-
51449118125
-
Acoustic modeling with contextual additive structure for HMM-based speech recognition
-
Y. Nankaku, K. Nakamura, H. Zen, T. Toda, and K. Tokuda, "Acoustic modeling with contextual additive structure for HMM-based speech recognition," in Proc. ICASSP, Las Vegas, NV, 2008, pp. 4469-4472.
-
(2008)
Proc. ICASSP, Las Vegas, NV
, pp. 4469-4472
-
-
Nankaku, Y.1
Nakamura, K.2
Zen, H.3
Toda, T.4
Tokuda, K.5
-
28
-
-
0003822743
-
-
Cambridge, U.K.: Cambridge Univ. Eng. Dept. Dec.
-
S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book, 3rd ed. Cambridge, U.K.: Cambridge Univ. Eng. Dept., Dec. 2006.
-
(2006)
The HTK Book, 3rd Ed
-
-
Young, S.1
Evermann, G.2
Gales, M.3
Hain, T.4
Kershaw, D.5
Liu, X.6
Moore, G.7
Odell, J.8
Ollason, D.9
Povey, D.10
Valtchev, V.11
Woodland, P.12
-
29
-
-
79952258981
-
-
[Online] Available
-
K. Tokuda, H. Zen, J. Yamagishi, T. Masuko, S. Sako, A. Black, and T. Nose, The HMM-Based Speech Synthesis System (HTS). [Online]. Available: http://hts.sp.nitech.ac.jp/
-
The HMM-Based Speech Synthesis System (HTS)
-
-
Tokuda, K.1
Zen, H.2
Yamagishi, J.3
Masuko, T.4
Sako, S.5
Black, A.6
Nose, T.7
-
30
-
-
85009139544
-
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. Eurospeech, 1999, pp. 2347-2350.
-
(1999)
Proc. Eurospeech
, pp. 2347-2350
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
31
-
-
78649273643
-
HMM-based approach to multilingual speech synthesis
-
S. Narayanan and A. Alwan, Eds. Upper Saddle River, NJ: Prentice-Hall
-
K. Tokuda, H. Zen, and A. W. Black, "HMM-based approach to multilingual speech synthesis," in Text to Speech Synthesis: New Paradigms and Advances, S. Narayanan and A. Alwan, Eds. Upper Saddle River, NJ: Prentice-Hall, 2004.
-
(2004)
Text to Speech Synthesis: New Paradigms and Advances
-
-
Tokuda, K.1
Zen, H.2
Black, A.W.3
-
32
-
-
85008006694
-
A robust speaker-adaptive HMM-based text-to-speech synthesis
-
Aug.
-
J. Yamagishi, T. Nose, H. Zen, Z.-H. Ling, T. Toda, K. Tokuda, S. King, and S. Renals, "A robust speaker-adaptive HMM-based text-to-speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 6, pp. 1208-1230, Aug. 2009.
-
(2009)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.17
, Issue.6
, pp. 1208-1230
-
-
Yamagishi, J.1
Nose, T.2
Zen, H.3
Ling, Z.-H.4
Toda, T.5
Tokuda, K.6
King, S.7
Renals, S.8
-
33
-
-
77249139677
-
An HMM-based Mandarin Chinese text-to-speech system
-
Dec.
-
Y. Qian, F. Soong, Y. Chen, and M. Chu, "An HMM-based Mandarin Chinese text-to-speech system," in Proc. ISCSLP'06, Dec. 2006, pp. 223-232.
-
(2006)
Proc. ISCSLP'06
, pp. 223-232
-
-
Qian, Y.1
Soong, F.2
Chen, Y.3
Chu, M.4
-
35
-
-
0030672098
-
Hybrid HMM-ANN systems for training independent tasks: Experiments on phonebook and related improvements
-
Munich, Germany Apr.
-
S. Dupont, H. Bourlard, O. Deroo, V. Fontaine, and J.-M. Boite, "Hybrid HMM-ANN systems for training independent tasks: Experiments on phonebook and related improvements," in Proc. ICASSP, Munich, Germany, Apr. 1997, pp. 1767-1770.
-
(1997)
Proc. ICASSP
, pp. 1767-1770
-
-
Dupont, S.1
Bourlard, H.2
Deroo, O.3
Fontaine, V.4
Boite, J.-M.5
-
36
-
-
0025041264
-
Perceptual linear predictive (PLP) analysis of speech
-
H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. Acoust. Soc. Amer., vol. 87, no. 4, pp. 1738-1752, 1990.
-
(1990)
J. Acoust. Soc. Amer.
, vol.87
, Issue.4
, pp. 1738-1752
-
-
Hermansky, H.1
-
37
-
-
85131821539
-
Mel-generalized cepstral analysis: A unified approach to speech spectral estimation
-
Sep.
-
K. Koishida, G. Hirabayashi, K. Tokuda, and T. Kobayashi, "Mel-generalized cepstral analysis: A unified approach to speech spectral estimation," in Proc. ICSLP, Yokohama, Japan, Sep. 1994, vol. 3, pp. 1043-1046.
-
(1994)
Proc. ICSLP, Yokohama, Japan
, vol.3
, pp. 1043-1046
-
-
Koishida, K.1
Hirabayashi, G.2
Tokuda, K.3
Kobayashi, T.4
-
38
-
-
0032673049
-
Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. Cheveigne, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol. 27, pp. 187-207, 1999.
-
(1999)
Speech Commun.
, vol.27
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
Cheveigne, A.3
-
39
-
-
59849090295
-
Combining spectral representations for large vocabulary continuous speech recognition
-
Mar.
-
G. Garau and S. Renals, "Combining spectral representations for large vocabulary continuous speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 3, pp. 508-518, Mar. 2008.
-
(2008)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.16
, Issue.3
, pp. 508-518
-
-
Garau, G.1
Renals, S.2
-
40
-
-
33847129573
-
Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
-
Feb.
-
J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans. Inf. Syst., vol. E90-D, no. 2, pp. 533-543, Feb. 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.2
, pp. 533-543
-
-
Yamagishi, J.1
Kobayashi, T.2
-
41
-
-
33645781551
-
Evaluation of a speech recognition/generation method based on HMM and STRAIGHT
-
T. Irino, Y. Minami, T. Nakatani, M. Tsuzaki, and H. Tagawa, "Evaluation of a speech recognition/generation method based on HMM and STRAIGHT," in Proc. ICSLP, Denver, CO, 2002, pp. 2545-2548.
-
(2002)
Proc. ICSLP, Denver, CO
, pp. 2545-2548
-
-
Irino, T.1
Minami, Y.2
Nakatani, T.3
Tsuzaki, M.4
Tagawa, H.5
-
42
-
-
0003805597
-
-
Ph.D. dissertation Queens College, Univ. of Cambridge, Cambridge, U.K.
-
J. J. Odell, "The use of context in large vocabulary continuous speech recognition," Ph.D. dissertation, Queens College, Univ. of Cambridge, Cambridge, U.K., 1995.
-
(1995)
The Use of Context in Large Vocabulary Continuous Speech Recognition
-
-
Odell, J.J.1
-
43
-
-
85135145174
-
Acoustic modeling based on the MDL criterion for speech recognition
-
K. Shinoda and T. Watanabe, "Acoustic modeling based on the MDL criterion for speech recognition," in Proc. Eurospeech, Rhodes, Greece, 1997, vol. 1, pp. 99-102.
-
(1997)
Proc. Eurospeech, Rhodes, Greece
, vol.1
, pp. 99-102
-
-
Shinoda, K.1
Watanabe, T.2
-
44
-
-
33947674781
-
Sub-phonetic modeling for capturing pronunciation variations for conversational speech synthesis
-
Toulouse, France
-
K. Prahallad, A. W. Black, and R. Mosur, "Sub-phonetic modeling for capturing pronunciation variations for conversational speech synthesis," in Proc. ICASSP, Toulouse, France, 2006, pp. 853-856.
-
(2006)
Proc. ICASSP
, pp. 853-856
-
-
Prahallad, K.1
Black, A.W.2
Mosur, R.3
-
45
-
-
0033906251
-
MDL-based context-dependent subword modeling for speech recognition
-
Japan (E) Mar.
-
K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition," J. Acoust. Soc. Japan (E), vol. 21, pp. 79-86, Mar. 2000.
-
(2000)
J. Acoust. Soc. Japan (E)
, vol.21
, pp. 79-86
-
-
Shinoda, K.1
Watanabe, T.2
-
46
-
-
67650854725
-
Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
-
Jan.
-
J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 1, pp. 66-83, Jan. 2009.
-
(2009)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.17
, Issue.1
, pp. 66-83
-
-
Yamagishi, J.1
Kobayashi, T.2
Nakano, Y.3
Ogata, K.4
Isogai, J.5
-
47
-
-
0032050110
-
Maximum likelihood linear transformations for HMM-based speech recognition
-
M. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, no. 2, pp. 75-98, 1998.
-
(1998)
Comput. Speech Lang.
, vol.12
, Issue.2
, pp. 75-98
-
-
Gales, M.1
-
48
-
-
0028419019
-
Maximum a posteriori estimation for multi-variate Gaussian mixture observations of Markov chains
-
Apr.
-
J. Gauvain and C. Lee, "Maximum a posteriori estimation for multi-variate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, pp. 291-298, Apr. 1994.
-
(1994)
IEEE Trans. Speech Audio Process.
, vol.2
, pp. 291-298
-
-
Gauvain, J.1
Lee, C.2
-
49
-
-
0030189744
-
Speaker adaptation using combined transformation and Bayesian methods
-
Jul.
-
V. Digalakis and L. Neumeyer, "Speaker adaptation using combined transformation and Bayesian methods," IEEE Trans. Speech Audio Process., vol. 4, no. 4, pp. 294-300, Jul. 1996.
-
(1996)
IEEE Trans. Speech Audio Process.
, vol.4
, Issue.4
, pp. 294-300
-
-
Digalakis, V.1
Neumeyer, L.2
-
50
-
-
0009623939
-
Flexible speaker adaptation using maximum likelihood linear regression
-
Morgan Kaufmann
-
C. Leggetter and P. Woodland, "Flexible speaker adaptation using maximum likelihood linear regression," in Proc. ARPA Spoken Lang. Technol. Workshop, 1995, pp. 104-109, Morgan Kaufmann.
-
(1995)
Proc. ARPA Spoken Lang. Technol. Workshop
, pp. 104-109
-
-
Leggetter, C.1
Woodland, P.2
-
51
-
-
0036461005
-
Structural maximum a posteriori linear regression for fast HMM adaptation
-
Jan.
-
O. Siohan, T. Myrvoll, and C.-H. Lee, "Structural maximum a posteriori linear regression for fast HMM adaptation," Comput. Speech Lang., vol. 16, no. 1, pp. 5-24, Jan. 2002.
-
(2002)
Comput. Speech Lang.
, vol.16
, Issue.1
, pp. 5-24
-
-
Siohan, O.1
Myrvoll, T.2
Lee, C.-H.3
-
52
-
-
34547496746
-
Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis
-
Sep.
-
Y. Nakano, M. Tachibana, J. Yamagishi, and T. Kobayashi, "Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis," in Proc. ICSLP'06, Sep. 2006, pp. 2286-2289.
-
(2006)
Proc. ICSLP'06
, pp. 2286-2289
-
-
Nakano, Y.1
Tachibana, M.2
Yamagishi, J.3
Kobayashi, T.4
-
53
-
-
0030362995
-
A compact model for speaker-adaptive training
-
Oct.
-
T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training," in Proc. ICSLP'96, Oct. 1996, pp. 1137-1140.
-
(1996)
Proc. ICSLP'96
, pp. 1137-1140
-
-
Anastasakos, T.1
McDonough, J.2
Schwartz, R.3
Makhoul, J.4
-
54
-
-
33947639066
-
Hidden semi-Markov model based speech recognition system using weighted finite-state transducer
-
Toulouse, France May
-
K. Oura, H. Zen, Y. Nankaku, A. Lee, and K. Tokuda, "Hidden semi-Markov model based speech recognition system using weighted finite-state transducer," in Proc. ICASSP'06, Toulouse, France, May 2006, pp. 33-36.
-
(2006)
Proc. ICASSP'06
, pp. 33-36
-
-
Oura, K.1
Zen, H.2
Nankaku, Y.3
Lee, A.4
Tokuda, K.5
-
55
-
-
70450169407
-
Speech recognition with speech synthesis models by marginalising over decision tree leaves
-
Brighton, U.K. Sep.
-
J. Dines, L. Saheer, and H. Liang, "Speech recognition with speech synthesis models by marginalising over decision tree leaves," in Proc. Interspeech, Brighton, U.K., Sep. 2009, pp. 1395-1398.
-
(2009)
Proc. Interspeech
, pp. 1395-1398
-
-
Dines, J.1
Saheer, L.2
Liang, H.3
-
56
-
-
67650819492
-
The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge
-
Sep.
-
J. Yamagishi, H. Zen, Y.-J. Wu, T. Toda, and K. Tokuda, "The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge," in Proc. Blizzard Challenge Workshop, Sep. 2008.
-
(2008)
Proc. Blizzard Challenge Workshop
-
-
Yamagishi, J.1
Zen, H.2
Wu, Y.-J.3
Toda, T.4
Tokuda, K.5
-
57
-
-
44449177634
-
A hidden semi-Markov model-based speech synthesis system
-
May
-
H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "A hidden semi-Markov model-based speech synthesis system," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 825-834, May 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.5
, pp. 825-834
-
-
Zen, H.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
58
-
-
0033708106
-
Speech parameter generation algorithms for HMM-based speech synthesis
-
Istanbul, Turkey
-
K. Tokuda, T. K. T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP'00, Istanbul, Turkey, 2000, pp. 1315-1318.
-
(2000)
Proc. ICASSP'00
, pp. 1315-1318
-
-
Tokuda, K.1
Masuko, T.K.T.2
Kobayashi, T.3
Kitamura, T.4
-
59
-
-
38549096029
-
A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
-
May
-
T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816-824, May 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.5
, pp. 816-824
-
-
Toda, T.1
Tokuda, K.2
-
60
-
-
70450201930
-
DARPA February 1992 pilot corpus CSR "dry run" benchmark test results
-
Harriman, NY Feb.
-
D. Pallet, "DARPA February 1992 pilot corpus CSR "dry run" benchmark test results," in Proc. Workshop Speech and Natural Language, Harriman, NY, Feb. 1992, pp. 382-386.
-
(1992)
Proc. Workshop Speech and Natural Language
, pp. 382-386
-
-
Pallet, D.1
-
61
-
-
70450161300
-
Thousands of voices for HMM-based speech synthesis
-
Brighton, U.K. Sep.
-
J. Yamagishi, B. Usabaev, S. King, O. Watts, J. Dines, J. Tian, R. Hu, K. Oura, K. Tokuda, R. Karhila, and M. Kurimo, "Thousands of voices for HMM-based speech synthesis," in Proc. Interspeech, Brighton, U.K., Sep. 2009, pp. 420-423.
-
(2009)
Proc. Interspeech
, pp. 420-423
-
-
Yamagishi, J.1
Usabaev, B.2
King, S.3
Watts, O.4
Dines, J.5
Tian, J.6
Hu, R.7
Oura, K.8
Tokuda, K.9
Karhila, R.10
Kurimo, M.11
-
62
-
-
77953708096
-
Thousands of voices for HMM-based speech synthesis-analysis and application of TTS systems built on various ASR corpora
-
Jul.
-
J. Yamagishi, B. Usabaev, S. King, O. Watts, J. Dines, J. Tian, R. Hu, K. Oura, K. Tokuda, R. Karhila, and M. Kurimo, "Thousands of voices for HMM-based speech synthesis-analysis and application of TTS systems built on various ASR corpora," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 5, pp. 984-1004, Jul. 2010.
-
(2010)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.18
, Issue.5
, pp. 984-1004
-
-
Yamagishi, J.1
Usabaev, B.2
King, S.3
Watts, O.4
Dines, J.5
Tian, J.6
Hu, R.7
Oura, K.8
Tokuda, K.9
Karhila, R.10
Kurimo, M.11
-
63
-
-
4544386225
-
Bootstrap estimates for confidence intervals in ASR performance evaluation
-
Montreal, QC, Canada May
-
M. Bisani and H. Ney, "Bootstrap estimates for confidence intervals in ASR performance evaluation," in Proc. ICASSP'04, Montreal, QC, Canada, May 2004, vol. 1, pp. 409-412.
-
(2004)
Proc. ICASSP'04
, vol.1
, pp. 409-412
-
-
Bisani, M.1
Ney, H.2
-
64
-
-
0017097474
-
Distance measures for speech processing
-
Oct.
-
A. Gray, Jr. and J. Markel, "Distance measures for speech processing," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-24, no. 5, pp. 380-391, Oct. 1976.
-
(1976)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.ASSP-24
, Issue.5
, pp. 380-391
-
-
Gray, A., Jr.1
Markel, J.2
-
65
-
-
0019146354
-
Correlation analysis of subjective and objective measures for speech quality
-
T. P. Barnwell, III, "Correlation analysis of subjective and objective measures for speech quality," in Proc. ICASSP'80, 1980, pp. 706-709.
-
(1980)
Proc. ICASSP'80
, pp. 706-709
-
-
Barnwell, T.P., III1
-
66
-
-
0019053271
-
Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
-
Aug.
-
S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-28, no. 4, pp. 357-366, Aug. 1980.
-
(1980)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.ASSP-28
, Issue.4
, pp. 357-366
-
-
Davis, S.1
Mermelstein, P.2
-
67
-
-
84984366853
-
Speech analysis-synthesis system and quality of synthesized speech using mel-cepstrum
-
Japanese Japan (Part I: Commun.)
-
T. Kitamura, S. Imai, C. Furuichi, and T. Kobayashi, "Speech analysis-synthesis system and quality of synthesized speech using mel-cepstrum," (in Japanese) Electron. Commun. Japan (Part I: Commun.), vol. 69, no. 10, pp. 47-54, 1986.
-
(1986)
Electron. Commun. Japan (Part I: Commun.)
, vol.69
, Issue.10
, pp. 47-54
-
-
Kitamura, T.1
Imai, S.2
Furuichi, C.3
Kobayashi, T.4
-
68
-
-
85016140477
-
An adaptive algorithm for mel-cepstral analysis of speech
-
San Francisco, CA
-
T. Fukada, K. Tokuda, and S. Imai, "An adaptive algorithm for mel-cepstral analysis of speech," in Proc. ICASSP'92, San Francisco, CA, 1992, pp. 137-140.
-
(1992)
Proc. ICASSP'92
, pp. 137-140
-
-
Fukada, T.1
Tokuda, K.2
Imai, S.3
-
69
-
-
0027247004
-
Mel-cepstral distance measure for objective speech quality assessment
-
Comput., Signal Process., May
-
R. Kubichek, "Mel-cepstral distance measure for objective speech quality assessment," in Proc. IEEE Pacific Rim Conf. Commun., Comput., Signal Process., May 1993, vol. 1, pp. 125-128.
-
(1993)
Proc. IEEE Pacific Rim Conf. Commun., Comput., Signal Process.
, vol.1
, pp. 125-128
-
-
Kubichek, R.1
-
70
-
-
57749193836
-
Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
-
Nov.
-
T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.15
, Issue.8
, pp. 2222-2235
-
-
Toda, T.1
Black, A.2
Tokuda, K.3
-
71
-
-
67650790758
-
The Blizzard Challenge 2008
-
Brisbane, Australia Sep.
-
V. Karaiskos, S. King, R. A. J. Clark, and C. Mayo, "The Blizzard Challenge 2008," in Proc. Blizzard Challenge Workshop, Brisbane, Australia, Sep. 2008.
-
(2008)
Proc. Blizzard Challenge Workshop
-
-
Karaiskos, V.1
King, S.2
Clark, R.A.J.3
Mayo, C.4
-
72
-
-
34250618146
-
-
[Online]. Available
-
The CMU Pronouncing Dictionary. [Online]. Available: http://www.speech.cs.cmu.edu/cgi-bin/cmudict
-
The CMU Pronouncing Dictionary
-
-
-
73
-
-
85030493378
-
Synthesis of regional English using a keyword lexicon
-
Sep.
-
S. Fitt and S. Isard, "Synthesis of regional English using a keyword lexicon," in Proc. Eurospeech, Sep. 1999, vol. 2, pp. 823-826.
-
(1999)
Proc. Eurospeech
, vol.2
, pp. 823-826
-
-
Fitt, S.1
Isard, S.2
-
74
-
-
0028515984
-
Experimental evaluation of features for robust speaker identification
-
Oct.
-
D. A. Reynolds, "Experimental evaluation of features for robust speaker identification," IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 639-643, Oct. 1994.
-
(1994)
IEEE Trans. Speech Audio Process.
, vol.2
, Issue.4
, pp. 639-643
-
-
Reynolds, D.A.1
-
75
-
-
70450183638
-
Measuring the gap between HMM-based ASR and TTS
-
Brighton, U.K. Sep.
-
J. Dines, J. Yamagishi, and S. King, "Measuring the gap between HMM-based ASR and TTS," in Proc. Interspeech, Brighton, U.K., Sep. 2009, pp. 1391-1394.
-
(2009)
Proc. Interspeech
, pp. 1391-1394
-
-
Dines, J.1
Yamagishi, J.2
King, S.3