[1] H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, Nov. 2009.
[2] J. Yamagishi, T. Nose, H. Zen, Z.-H. Ling, T. Toda, K. Tokuda, S. King, and S. Renals, "A robust speaker-adaptive HMM-based text-to-speech synthesis," IEEE Trans. Speech, Audio & Language Process., vol. 17, no. 6, pp. 1208-1230, Aug. 2009.
[3] J. Yamagishi et al., "Thousands of voices for HMM-based speech synthesis," in Proc. Interspeech 2009, Brighton, U.K., Sep. 2009, pp. 420-423.
[4] J. Yamagishi, B. Usabaev, S. King, O. Watts, J. Dines, J. Tian, R. Hu, Y. Guan, K. Oura, K. Tokuda, R. Karhila, and M. Kurimo, "Thousands of voices for HMM-based speech synthesis - Analysis and application of TTS systems built on various ASR corpora," IEEE Trans. Speech, Audio & Language Process., 2010 (in press).
[5] S. Creer, P. Green, S. Cunningham, and J. Yamagishi, "Building personalised synthesised voices for individuals with dysarthria using the HTS toolkit," in Computer Synthesized Speech Technologies: Tools for Aiding Impairment, J. W. Mullennix and S. E. Stern, Eds. IGI Global, Jan. 2010.
[6] J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans. Inf. & Syst., vol. E90-D, no. 2, pp. 533-543, Feb. 2007, doi: 10.1093/ietisy/e90-d.2.533.
[7] J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Speech, Audio & Language Process., vol. 17, no. 1, pp. 66-83, Jan. 2009.
[8] D. S. Pallett, J. G. Fiscus, W. M. Fisher, J. S. Garofolo, B. A. Lund, and M. A. Przybocki, "1993 benchmark tests for the ARPA spoken language program," in HLT '94: Proceedings of the Workshop on Human Language Technology. Morristown, NJ, USA: Association for Computational Linguistics, 1994, pp. 49-74.
[9] H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Communication, vol. 27, pp. 187-207, 1999.
[10] M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Computer Speech and Language, vol. 12, no. 2, pp. 75-98, 1998.
[11] T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. & Syst., vol. E90-D, no. 5, pp. 816-824, May 2007.
[12] D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proceedings of the Workshop on Speech and Natural Language, Harriman, New York, 1992, pp. 357-362.
[13] M. Shozakai and G. Nagino, "Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models," in Proc. ICSLP 2004, Jeju Island, Korea, Oct. 2004, pp. 717-720.
[14] A. Maier, M. Schuster, U. Eysholdt, T. Haderlein, T. Cincarek, S. Steidl, A. Batliner, S. Wenhardt, and E. Nöth, "QMOS - A robust visualization method for speaker dependencies with different microphones," Journal of Pattern Recognition Research, vol. 1, pp. 32-51, 2009.
[16] K. Tokuda, H. Zen, and T. Kitamura, "Reformulating the HMM as a trajectory model," IEICE Technical Report. Natural Language Understanding and Models of Communication, vol. 104, no. 538, pp. 43-48, Dec. 2004.
[18] J. H. Langlois and L. A. Roggman, "Attractive faces are only average," Psychological Science, vol. 1, no. 2, pp. 115-121, 1990.
[19] L. Bruckert, P. Bestelmeyer, M. Latinus, J. Rouger, I. Charest, G. A. Rousselet, H. Kawahara, and P. Belin, "Vocal attractiveness increases by averaging," Current Biology, vol. 20, no. 2, pp. 116-120, 2010.