-
1
-
-
70450161300
-
Thousands of voices for HMM-based speech synthesis
-
Brighton, U.K., Sep.
-
J. Yamagishi et al., "Thousands of voices for HMM-based speech synthesis," in Proc. Interspeech-09, Brighton, U.K., Sep. 2009, pp. 420-423.
-
(2009)
Proc. Interspeech-09
, pp. 420-423
-
-
Yamagishi, J.1
-
2
-
-
85009139544
-
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
-
Budapest, Hungary, Sep.
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. EUROSPEECH-99, Budapest, Hungary, Sep. 1999, pp. 2347-2350.
-
(1999)
Proc. EUROSPEECH-99
, pp. 2347-2350
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
3
-
-
79952258981
-
-
Version 2.1. [Online]. Available:
-
K. Tokuda, H. Zen, J. Yamagishi, T. Masuko, S. Sako, A. W. Black, and T. Nose, "The HMM-Based Speech Synthesis System (HTS) Version 2.1." [Online]. Available: http://hts.sp.nitech.ac.jp/
-
The HMM-Based Speech Synthesis System (HTS)
-
-
Tokuda, K.1
Zen, H.2
Yamagishi, J.3
Masuko, T.4
Sako, S.5
Black, A.W.6
Nose, T.7
-
4
-
-
85133720638
-
The HMM-based speech synthesis system (HTS) version 2.0
-
Bonn, Germany, Aug.
-
H. Zen, T. Nose, J. Yamagishi, S. Sako, T. Masuko, A. W. Black, and K. Tokuda, "The HMM-based speech synthesis system (HTS) version 2.0," in Proc. 6th ISCA Workshop Speech Synth. (SSW-6), Bonn, Germany, Aug. 2007.
-
(2007)
Proc. 6th ISCA Workshop Speech Synth. (SSW-6)
-
-
Zen, H.1
Nose, T.2
Yamagishi, J.3
Sako, S.4
Masuko, T.5
Black, A.W.6
Tokuda, K.7
-
5
-
-
85008006694
-
A robust speaker-adaptive HMM-based text-to-speech synthesis
-
Aug.
-
J. Yamagishi, T. Nose, H. Zen, Z.-H. Ling, T. Toda, K. Tokuda, S. King, and S. Renals, "A robust speaker-adaptive HMM-based text-to-speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., vol.17, no.6, pp. 1208-1230, Aug. 2009.
-
(2009)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.17
, Issue.6
, pp. 1208-1230
-
-
Yamagishi, J.1
Nose, T.2
Zen, H.3
Ling, Z.-H.4
Toda, T.5
Tokuda, K.6
King, S.7
Renals, S.8
-
6
-
-
84867223798
-
Robustness of HMM-based speech synthesis
-
Brisbane, Australia, Sep.
-
J. Yamagishi, Z.-H. Ling, and S. King, "Robustness of HMM-based speech synthesis," in Proc. Interspeech-08, Brisbane, Australia, Sep. 2008, pp. 581-584.
-
(2008)
Proc. Interspeech-08
, pp. 581-584
-
-
Yamagishi, J.1
Ling, Z.-H.2
King, S.3
-
7
-
-
67651002140
-
Statistical parametric speech synthesis
-
Nov.
-
H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Commun., vol.51, no.11, pp. 1039-1064, Nov. 2009.
-
(2009)
Speech Commun
, vol.51
, Issue.11
, pp. 1039-1064
-
-
Zen, H.1
Tokuda, K.2
Black, A.W.3
-
8
-
-
0012330750
-
The design for the Wall Street Journal-based CSR corpus
-
Harriman, NY
-
D. B. Paul and J. M. Baker, "The design for the Wall Street Journal-based CSR corpus," in Proc. Workshop Speech Natural Lang., Harriman, NY, 1992, pp. 357-362.
-
(1992)
Proc. Workshop Speech Natural Lang.
, pp. 357-362
-
-
Paul, D.B.1
Baker, J.M.2
-
9
-
-
0028996854
-
WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition
-
Detroit, MI, May
-
T. Robinson, J. Fransen, D. Pye, J. Foote, and S. Renals, "WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition," in Proc. ICASSP-95, Detroit, MI, May 1995, pp. 81-84.
-
(1995)
Proc. ICASSP-95
, pp. 81-84
-
-
Robinson, T.1
Fransen, J.2
Pye, D.3
Foote, J.4
Renals, S.5
-
10
-
-
84936692751
-
DARPA resource management bench
-
Hidden Valley, PA, Jun.
-
D. S. Pallett, J. G. Fiscus, and J. S. Garofolo, "DARPA resource management bench," in Proc. Workshop Speech Natural Lang., Hidden Valley, PA, Jun. 1990, pp. 298-305.
-
(1990)
Proc. Workshop Speech Natural Lang.
, pp. 298-305
-
-
Pallett, D.S.1
Fiscus, J.G.2
Garofolo, J.S.3
-
11
-
-
85009274666
-
GlobalPhone: A multilingual speech and text database developed at Karlsruhe university
-
Denver, CO, Sep.
-
T. Schultz, "GlobalPhone: A multilingual speech and text database developed at Karlsruhe university," in Proc. ICSLP'02, Denver, CO, Sep. 2002, pp. 345-348.
-
(2002)
Proc. ICSLP'02
, pp. 345-348
-
-
Schultz, T.1
-
12
-
-
84910032186
-
SPEECON-speech databases for consumer devices: Database specification and validation
-
Canary Islands, Spain, May
-
D. Iskra, B. Grosskopf, K. Marasek, H. V. D. Heuvel, F. Diehl, and A. Kiessling, "SPEECON-speech databases for consumer devices: Database specification and validation," in Proc. LREC'02, Canary Islands, Spain, May 2002, pp. 329-333.
-
(2002)
Proc. LREC'02
, pp. 329-333
-
-
Iskra, D.1
Grosskopf, B.2
Marasek, K.3
Heuvel, H.V.D.4
Diehl, F.5
Kiessling, A.6
-
13
-
-
70349227947
-
The application of hidden Markov models in speech recognition
-
M. J. F. Gales and S. J. Young, "The application of hidden Markov models in speech recognition," Foundations Trends Signal Process., vol.1, no.3, pp. 195-304, 2008.
-
(2008)
Foundations Trends Signal Process
, vol.1
, Issue.3
, pp. 195-304
-
-
Gales, M.J.F.1
Young, S.J.2
-
14
-
-
85128361526
-
The design of the newspaper-based Japanese large vocabulary continuous speech recognition corpus
-
Sydney, Australia, Dec.
-
K. Itou, M. Yamamoto, K. Takeda, T. Takezawa, T. Matsuoka, T. Kobayashi, K. Shikano, and S. Itahashi, "The design of the newspaper-based Japanese large vocabulary continuous speech recognition corpus," in Proc. ICSLP-98, Sydney, Australia, Dec. 1998, pp. 3261-3264.
-
(1998)
Proc. ICSLP-98
, pp. 3261-3264
-
-
Itou, K.1
Yamamoto, M.2
Takeda, K.3
Takezawa, T.4
Matsuoka, T.5
Kobayashi, T.6
Shikano, K.7
Itahashi, S.8
-
15
-
-
0002985991
-
Mora and syllable
-
N. Tsujimura, Ed. New York: Blackwell
-
H. Kubozono, "Mora and syllable," in The Handbook of Japanese Linguistics, N. Tsujimura, Ed. New York: Blackwell, 1995, pp. 31-61.
-
(1995)
The Handbook of Japanese Linguistics
, pp. 31-61
-
-
Kubozono, H.1
-
16
-
-
85030493378
-
Synthesis of regional English using a keyword lexicon
-
Budapest, Hungary, Sep.
-
S. Fitt and S. Isard, "Synthesis of regional English using a keyword lexicon," in Proc. Eurospeech-99, Budapest, Hungary, Sep. 1999, vol.2, pp. 823-826.
-
(1999)
Proc. Eurospeech-99
, vol.2
, pp. 823-826
-
-
Fitt, S.1
Isard, S.2
-
17
-
-
34047123652
-
Multisyn: Open-domain unit selection for the Festival speech synthesis system
-
R. A. J. Clark, K. Richmond, and S. King, "Multisyn: Open-domain unit selection for the Festival speech synthesis system," Speech Commun., vol.49, no.4, pp. 317-330, 2007.
-
(2007)
Speech Commun
, vol.49
, Issue.4
, pp. 317-330
-
-
Clark, R.A.J.1
Richmond, K.2
King, S.3
-
18
-
-
77953725740
-
-
[Online]. Available:
-
[Online]. Available: http://www.lc-star.com
-
-
-
-
19
-
-
77249139677
-
An HMM-based Mandarin Chinese text-to-speech system
-
Singapore, Dec.
-
Y. Qian, F. Soong, Y. Chen, and M. Chu, "An HMM-based Mandarin Chinese text-to-speech system," in Proc. ISCSLP'06, Singapore, Dec. 2006, pp. 223-232.
-
(2006)
Proc. ISCSLP'06
, pp. 223-232
-
-
Qian, Y.1
Soong, F.2
Chen, Y.3
Chu, M.4
-
20
-
-
77953713775
-
-
Deliverable Report D2.1 EMIME Project, 2008
-
Deliverable Report D2.1 EMIME Project, 2008.
-
-
-
-
21
-
-
77953728396
-
An efficient and unified approach of Mandarin HTS system
-
Dallas, TX, Mar.
-
Y. Guan, J. Tian, Y.-J. Wu, J. Yamagishi, and J. Nurminen, "An efficient and unified approach of Mandarin HTS system," in Proc. ICASSP'10, Dallas, TX, Mar. 2010.
-
(2010)
Proc. ICASSP'10
-
-
Guan, Y.1
Tian, J.2
Wu, Y.-J.3
Yamagishi, J.4
Nurminen, J.5
-
22
-
-
85123861026
-
XIMERA: A new TTS from ATR based on corpus-based technologies
-
Pittsburgh, PA, Jun.
-
H. Kawai, T. Toda, J. Ni, M. Tsuzaki, and K. Tokuda, "XIMERA: A new TTS from ATR based on corpus-based technologies," in Proc. ISCA 5th Speech Synth. Workshop, Pittsburgh, PA, Jun. 2004, pp. 179-184.
-
(2004)
Proc. ISCA 5th Speech Synth. Workshop
, pp. 179-184
-
-
Kawai, H.1
Toda, T.2
Ni, J.3
Tsuzaki, M.4
Tokuda, K.5
-
23
-
-
60649102582
-
XIMERA: A concatenative speech synthesis system with large scale corpora
-
Dec.
-
H. Kawai, T. Toda, J. Yamagishi, T. Hirai, J. Ni, N. Nishizawa, M. Tsuzaki, and K. Tokuda, "XIMERA: A concatenative speech synthesis system with large scale corpora," IEICE Trans. Inf. Syst., vol.J89-D-II, no.12, pp. 2688-2698, Dec. 2006.
-
(2006)
IEICE Trans. Inf. Syst.
, vol.J89-D-II
, Issue.12
, pp. 2688-2698
-
-
Kawai, H.1
Toda, T.2
Yamagishi, J.3
Hirai, T.4
Ni, J.5
Nishizawa, N.6
Tsuzaki, M.7
Tokuda, K.8
-
24
-
-
33751057590
-
The ATR multilingual speech-to-speech translation system
-
Mar.
-
S. Nakamura, K. Markov, H. Nakaiwa, G. Kikui, H. Kawai, T. Jitsuhiro, J.-S. Zhang, H. Yamamoto, E. Sumita, and S. Yamamoto, "The ATR multilingual speech-to-speech translation system," IEEE Trans. Audio, Speech, Lang. Process., vol.14, no.2, pp. 365-376, Mar. 2006.
-
(2006)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.14
, Issue.2
, pp. 365-376
-
-
Nakamura, S.1
Markov, K.2
Nakaiwa, H.3
Kikui, G.4
Kawai, H.5
Jitsuhiro, T.6
Zhang, J.-S.7
Yamamoto, H.8
Sumita, E.9
Yamamoto, S.10
-
25
-
-
77949915957
-
Generacion de una voz sintetica en Castellano basada en HSMM para la Evaluacion Albayzin 2008: Conversion texto a voz
-
Bilbao, Spain, Nov. [Online]. Available:
-
R. Barra-Chicote, J. Yamagishi, J. Montero, S. King, S. Lutfi, and J. Macias-Guarasa, "Generacion de una voz sintetica en Castellano basada en HSMM para la Evaluacion Albayzin 2008: Conversion texto a voz," in V Jornadas en Tecnologia del Habla (in Spanish), Bilbao, Spain, Nov. 2008, pp. 115-118. [Online]. Available: http://www.cstr.inf.ed.ac.uk/downloads/publications/2008/tts-jth08.pdf
-
(2008)
V Jornadas en Tecnologia Del Habla (In Spanish)
, pp. 115-118
-
-
Barra-Chicote, R.1
Yamagishi, J.2
Montero, J.3
King, S.4
Lutfi, S.5
Macias-Guarasa, J.6
-
26
-
-
33645758767
-
HMM-based approach to multilingual speech synthesis
-
S. Narayanan and A. Alwan, Eds. Upper Saddle River, NJ: Prentice-Hall
-
K. Tokuda, H. Zen, and A. W. Black, "HMM-based approach to multilingual speech synthesis," in Text to Speech Synthesis: New Paradigms and Advances, S. Narayanan and A. Alwan, Eds. Upper Saddle River, NJ: Prentice-Hall, 2004.
-
(2004)
Text to Speech Synthesis: New Paradigms and Advances
-
-
Tokuda, K.1
Zen, H.2
Black, A.W.3
-
27
-
-
0002144369
-
Tree-based state tying for high accuracy acoustic modeling
-
Plainsboro, NJ, Mar.
-
S. J. Young, J. J. Odell, and P. C. Woodland, "Tree-based state tying for high accuracy acoustic modeling," in Proc. ARPA Human Lang. Technol. Workshop, Plainsboro, NJ, Mar. 1994, pp. 307-312.
-
(1994)
Proc. ARPA Human Lang. Technol. Workshop
, pp. 307-312
-
-
Young, S.J.1
Odell, J.J.2
Woodland, P.C.3
-
28
-
-
70449126171
-
The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge
-
Brisbane, Australia, Sep.
-
J. Yamagishi, H. Zen, Y.-J. Wu, T. Toda, and K. Tokuda, "The HTS-2008 system: Yet another evaluation of the speaker-adaptive HMM-based speech synthesis system in the 2008 Blizzard Challenge," in Proc. Blizzard Challenge 2008, Brisbane, Australia, Sep. 2008.
-
(2008)
Proc. Blizzard Challenge 2008
-
-
Yamagishi, J.1
Zen, H.2
Wu, Y.-J.3
Toda, T.4
Tokuda, K.5
-
30
-
-
67650790758
-
The Blizzard Challenge 2008
-
Brisbane, Australia, Sep.
-
V. Karaiskos, S. King, R. A. J. Clark, and C. Mayo, "The Blizzard Challenge 2008," in Proc. Blizzard Challenge 2008, Brisbane, Australia, Sep. 2008.
-
(2008)
Proc. Blizzard Challenge 2008
-
-
Karaiskos, V.1
King, S.2
Clark, R.A.J.3
Mayo, C.4
-
31
-
-
0032673049
-
Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol.27, pp. 187-207, 1999.
-
(1999)
Speech Commun
, vol.27
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
Cheveigné, A.3
-
32
-
-
33846405723
-
Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005
-
Jan.
-
H. Zen, T. Toda, M. Nakamura, and K. Tokuda, "Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005," IEICE Trans. Inf. Syst., vol.E90-D, no.1, pp. 325-333, Jan. 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.1
, pp. 325-333
-
-
Zen, H.1
Toda, T.2
Nakamura, M.3
Tokuda, K.4
-
33
-
-
44449177634
-
A hidden semi-Markov model-based speech synthesis system
-
May
-
H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "A hidden semi-Markov model-based speech synthesis system," IEICE Trans. Inf. Syst., vol.E90-D, no.5, pp. 825-834, May 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.5
, pp. 825-834
-
-
Zen, H.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
34
-
-
0002629270
-
Maximum likelihood from incomplete data via the EM algorithm
-
A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc., Series B, vol.39, no.1, pp. 1-38, 1977.
-
(1977)
J. R. Statist. Soc., Series B
, vol.39
, Issue.1
, pp. 1-38
-
-
Dempster, A.1
Laird, N.2
Rubin, D.3
-
35
-
-
0033906251
-
MDL-based context-dependent subword modeling for speech recognition
-
Mar.
-
K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition," J. Acoust. Soc. Japan (E), vol.21, pp. 79-86, Mar. 2000.
-
(2000)
J. Acoust. Soc. Japan (E)
, vol.21
, pp. 79-86
-
-
Shinoda, K.1
Watanabe, T.2
-
36
-
-
77953719894
-
Evaluation of flat start labeling for phoneme based Mandarin HTS system
-
Aug.
-
Y. Guan and J. Tian, "Evaluation of flat start labeling for phoneme based Mandarin HTS system," in Proc. ORIENTAL-COCOSDA-09, Aug. 2009, pp. 187-190.
-
(2009)
Proc. ORIENTAL-COCOSDA-09
, pp. 187-190
-
-
Guan, Y.1
Tian, J.2
-
37
-
-
0030362995
-
A compact model for speaker-adaptive training
-
Philadelphia, PA, Oct.
-
T. Anastasakos, J. McDonough, R. Schwartz, and J. Makhoul, "A compact model for speaker-adaptive training," in Proc. ICSLP-96, Philadelphia, PA, Oct. 1996, pp. 1137-1140.
-
(1996)
Proc. ICSLP-96
, pp. 1137-1140
-
-
Anastasakos, T.1
McDonough, J.2
Schwartz, R.3
Makhoul, J.4
-
38
-
-
0032050110
-
Maximum likelihood linear transformations for HMM-based speech recognition
-
M. J. F. Gales, "Maximum likelihood linear transformations for HMM-based speech recognition," Comput. Speech Lang., vol. 12, no. 2, pp. 75-98, 1998.
-
(1998)
Comput. Speech Lang.
, vol.12
, Issue.2
, pp. 75-98
-
-
Gales, M.J.F.1
-
39
-
-
67650854725
-
Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
-
Jan.
-
J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 1, pp. 66-83, Jan. 2009.
-
(2009)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.17
, Issue.1
, pp. 66-83
-
-
Yamagishi, J.1
Kobayashi, T.2
Nakano, Y.3
Ogata, K.4
Isogai, J.5
-
40
-
-
38549096029
-
A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
-
May
-
T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol.E90-D, no.5, pp. 816-824, May 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.5
, pp. 816-824
-
-
Toda, T.1
Tokuda, K.2
-
41
-
-
0025543906
-
Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
-
E. Moulines and F. Charpentier, "Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones," Speech Commun., vol.9, no.5-6, pp. 453-468, 1990.
-
(1990)
Speech Commun
, vol.9
, Issue.5-6
, pp. 453-468
-
-
Moulines, E.1
Charpentier, F.2
-
42
-
-
85016140477
-
An adaptive algorithm for mel-cepstral analysis of speech
-
San Francisco, CA, Mar.
-
T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for mel-cepstral analysis of speech," in Proc. ICASSP-92, San Francisco, CA, Mar. 1992, pp. 137-140.
-
(1992)
Proc. ICASSP-92
, pp. 137-140
-
-
Fukada, T.1
Tokuda, K.2
Kobayashi, T.3
Imai, S.4
-
44
-
-
33847129573
-
Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
-
Feb.
-
J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training," IEICE Trans. Inf. Syst., vol.E90-D, no.2, pp. 533-543, Feb. 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.2
, pp. 533-543
-
-
Yamagishi, J.1
Kobayashi, T.2
-
45
-
-
70450183638
-
Measuring the gap between HMM-based ASR and TTS
-
Brighton, U.K., Sep.
-
J. Dines, J. Yamagishi, and S. King, "Measuring the gap between HMM-based ASR and TTS," in Proc. Interspeech-09, Brighton, U.K., Sep. 2009, pp. 1391-1394.
-
(2009)
Proc. Interspeech-09
, pp. 1391-1394
-
-
Dines, J.1
Yamagishi, J.2
King, S.3
-
47
-
-
0141760645
-
1993 benchmark tests for the ARPA spoken language program
-
Morristown, NJ
-
D. S. Pallett, J. G. Fiscus, W. M. Fisher, J. S. Garofolo, B. A. Lund, and M. A. Przybocki, "1993 benchmark tests for the ARPA spoken language program," in Proc. HLT '94: Workshop Human Lang. Technol., Morristown, NJ, 1994, pp. 49-74.
-
(1994)
Proc. HLT '94: Workshop Human Lang. Technol.
, pp. 49-74
-
-
Pallett, D.S.1
Fiscus, J.G.2
Fisher, W.M.3
Garofolo, J.S.4
Lund, B.A.5
Przybocki, M.A.6
-
48
-
-
60849092922
-
Cross-lingual speaker adaptation for HMM-based speech synthesis
-
Kunming, China
-
Y.-J. Wu, S. King, and K. Tokuda, "Cross-lingual speaker adaptation for HMM-based speech synthesis," in Proc. ISCSLP-08, Kunming, China, 2008, pp. 9-12.
-
(2008)
Proc. ISCSLP-08
, pp. 9-12
-
-
Wu, Y.-J.1
King, S.2
Tokuda, K.3
-
49
-
-
70450192740
-
State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
-
Brighton, U.K., Sep.
-
Y.-J. Wu and K. Tokuda, "State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis," in Proc. Interspeech-09, Brighton, U.K., Sep. 2009, pp. 528-531.
-
(2009)
Proc. Interspeech-09
, pp. 528-531
-
-
Wu, Y.-J.1
Tokuda, K.2
-
50
-
-
0017097474
-
Distance measures for speech processing
-
Oct.
-
J. A. Gray and J. Markel, "Distance measures for speech processing," IEEE Trans. Acoust., Speech, Signal Process., vol.ASSP-24, no.5, pp. 380-391, Oct. 1976.
-
(1976)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.ASSP-24
, Issue.5
, pp. 380-391
-
-
Gray, J.A.1
Markel, J.2
-
51
-
-
0019146354
-
Correlation analysis of subjective and objective measures for speech quality
-
Denver, CO
-
T. P. Barnwell, III, "Correlation analysis of subjective and objective measures for speech quality," in Proc. ICASSP-80, Denver, CO, 1980, pp. 706-709.
-
(1980)
Proc. ICASSP-80
, pp. 706-709
-
-
Barnwell III, T.P.1
-
52
-
-
0029725605
-
Speech synthesis using HMMs with dynamic features
-
Atlanta, GA, May
-
T. Masuko, K. Tokuda, T. Kobayashi, and S. Imai, "Speech synthesis using HMMs with dynamic features," in Proc. ICASSP-96, Atlanta, GA, May 1996, pp. 389-392.
-
(1996)
Proc. ICASSP-96
, pp. 389-392
-
-
Masuko, T.1
Tokuda, K.2
Kobayashi, T.3
Imai, S.4
-
53
-
-
70349208664
-
Optimizing segment label boundaries for statistical speech synthesis
-
Taipei, Taiwan, Apr.
-
A. W. Black and J. Kominek, "Optimizing segment label boundaries for statistical speech synthesis," in Proc. ICASSP-09, Taipei, Taiwan, Apr. 2009, pp. 3785-3788.
-
(2009)
Proc. ICASSP-09
, pp. 3785-3788
-
-
Black, A.W.1
Kominek, J.2
-
54
-
-
67650832556
-
Statistical analysis of the Blizzard Challenge 2007 listening test results
-
Bonn, Germany, Aug.
-
R. A. J. Clark, M. Podsiadlo, M. Fraser, C. Mayo, and S. King, "Statistical analysis of the Blizzard Challenge 2007 listening test results," in Proc. BLZ3-2007 (in Proc. SSW6), Bonn, Germany, Aug. 2007.
-
(2007)
Proc. BLZ3-2007 (In Proc. SSW6)
-
-
Clark, R.A.J.1
Podsiadlo, M.2
Fraser, M.3
Mayo, C.4
King, S.5
-
55
-
-
33646800617
-
Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models
-
Jeju Island, Korea, Oct.
-
M. Shozakai and G. Nagino, "Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models," in Proc. ICSLP-04, Jeju Island, Korea, Oct. 2004, pp. 717-720.
-
(2004)
Proc. ICSLP-04
, pp. 717-720
-
-
Shozakai, M.1
Nagino, G.2
-
56
-
-
70449388052
-
QMOS-A robust visualization method for speaker dependencies with different microphones
-
A. Maier, M. Schuster, U. Eysholdt, T. Haderlein, T. Cincarek, S. Steidl, A. Batliner, S. Wenhardt, and E. Noth, "QMOS-A robust visualization method for speaker dependencies with different microphones," J. Pattern Recognition Res., vol.1, pp. 32-51, 2009.
-
(2009)
J. Pattern Recognition Res.
, vol.1
, pp. 32-51
-
-
Maier, A.1
Schuster, M.2
Eysholdt, U.3
Haderlein, T.4
Cincarek, T.5
Steidl, S.6
Batliner, A.7
Wenhardt, S.8
Noth, E.9
-
58
-
-
33646781551
-
Acoustic training from heterogeneous data sources: Experiments in Mandarin conversational telephone speech transcription
-
S. Tsakalidis and W. Byrne, "Acoustic training from heterogeneous data sources: Experiments in Mandarin conversational telephone speech transcription," in Proc. ICASSP-05, 18-23, 2005, vol.1, pp. 461-464.
-
(2005)
Proc. ICASSP-05, 18-23
, vol.1
, pp. 461-464
-
-
Tsakalidis, S.1
Byrne, W.2
-
59
-
-
77953712724
-
Cross-corpus normalization of diverse acoustic training data for robust HMM training
-
Cambridge, U.K.
-
S. Tsakalidis and W. Byrne, "Cross-corpus normalization of diverse acoustic training data for robust HMM training," Cambridge Univ. Eng. Dept., Cambridge, U.K., 2005.
-
(2005)
Cambridge Univ. Eng. Dept.
-
-
Tsakalidis, S.1
Byrne, W.2
-
60
-
-
77953723444
-
Reformulating the HMM as a trajectory model
-
Dec.
-
K. Tokuda, H. Zen, and T. Kitamura, "Reformulating the HMM as a trajectory model," IEICE Tech. Rep. Natural Lang. Understanding Models of Commun., vol.104, no.538, pp. 43-48, Dec. 2004.
-
(2004)
IEICE Tech. Rep. Natural Lang. Understanding Models of Commun.
, vol.104
, Issue.538
, pp. 43-48
-
-
Tokuda, K.1
Zen, H.2
Kitamura, T.3
-
61
-
-
77953697940
-
-
Ph.D. dissertation, Univ. Politecnica de Catalunya, Barcelona, Spain
-
D. Erro, "Intra-lingual and cross-lingual voice conversion using harmonic plus stochastic models," Ph.D. dissertation, Univ. Politecnica de Catalunya, Barcelona, Spain, 2008.
-
(2008)
Intra-lingual and Cross-lingual Voice Conversion Using Harmonic Plus Stochastic Models
-
-
Erro, D.1
-
62
-
-
84970205467
-
Attractive faces are only average
-
J. H. Langlois and L. A. Roggman, "Attractive faces are only average," Psychol. Sci., vol.1, no.2, pp. 115-121, 1990.
-
(1990)
Psychol. Sci.
, vol.1
, Issue.2
, pp. 115-121
-
-
Langlois, J.H.1
Roggman, L.A.2
-
63
-
-
77953710433
-
Analysis of unsupervised and noise-robust speaker-adaptive HMM-based speech synthesis systems toward a unified ASR and TTS framework
-
Edinburgh, U.K., Sep.
-
J. Yamagishi, M. Lincoln, S. King, J. Dines, M. Gibson, J. Tian, and Y. Guan, "Analysis of unsupervised and noise-robust speaker-adaptive HMM-based speech synthesis systems toward a unified ASR and TTS framework," in Proc. Blizzard Challenge Workshop, Edinburgh, U.K., Sep. 2009.
-
(2009)
Proc. Blizzard Challenge Workshop
-
-
Yamagishi, J.1
Lincoln, M.2
King, S.3
Dines, J.4
Gibson, M.5
Tian, J.6
Guan, Y.7
-
64
-
-
85131821539
-
Mel-generalized cepstral analysis-A unified approach to speech spectral estimation
-
Yokohama, Japan, Sep.
-
K. Tokuda, T. Kobayashi, T. Masuko, and S. Imai, "Mel-generalized cepstral analysis-A unified approach to speech spectral estimation," in Proc. ICSLP-94, Yokohama, Japan, Sep. 1994, pp. 1043-1046.
-
(1994)
Proc. ICSLP-94
, pp. 1043-1046
-
-
Tokuda, K.1
Kobayashi, T.2
Masuko, T.3
Imai, S.4
-
66
-
-
0036567794
-
The development of the HTK broadcast news transcription system: An overview
-
P. C. Woodland, "The development of the HTK broadcast news transcription system: An overview," Speech Commun., vol.37, no.1-2, pp. 47-67, 2002.
-
(2002)
Speech Commun
, vol.37
, Issue.1-2
, pp. 47-67
-
-
Woodland, P.C.1
-
67
-
-
77953693885
-
Building personalised synthesised voices for individuals with dysarthria using the HTS toolkit
-
J. W. Mullennix and S. E. Stern, Eds. Hershey, PA: IGI Global, Jan.
-
S. Creer, P. Green, S. Cunningham, and J. Yamagishi, "Building personalised synthesised voices for individuals with dysarthria using the HTS toolkit," in Computer Synthesized Speech Technologies: Tools for Aiding Impairment, J. W. Mullennix and S. E. Stern, Eds. Hershey, PA: IGI Global, Jan. 2010.
-
(2010)
Computer Synthesized Speech Technologies: Tools for Aiding Impairment
-
-
Creer, S.1
Green, P.2
Cunningham, S.3
Yamagishi, J.4
-
68
-
-
85135274466
-
On the security of HMM-based speaker verification systems against imposture using synthetic speech
-
Budapest, Hungary, Sep.
-
T. Masuko, T. Hitotsumatsu, K. Tokuda, and T. Kobayashi, "On the security of HMM-based speaker verification systems against imposture using synthetic speech," in Proc. Eurospeech-99, Budapest, Hungary, Sep. 1999, pp. 1223-1226.
-
(1999)
Proc. Eurospeech-99
, pp. 1223-1226
-
-
Masuko, T.1
Hitotsumatsu, T.2
Tokuda, K.3
Kobayashi, T.4
-
69
-
-
85009077529
-
Imposture using synthetic speech against speaker verification based on spectrum and pitch
-
Beijing, China, Oct.
-
T. Masuko, K. Tokuda, and T. Kobayashi, "Imposture using synthetic speech against speaker verification based on spectrum and pitch," in Proc. ICSLP-00, Beijing, China, Oct. 2000, pp. 302-305.
-
(2000)
Proc. ICSLP-00
, pp. 302-305
-
-
Masuko, T.1
Tokuda, K.2
Kobayashi, T.3
-
70
-
-
78049409687
-
Revisiting the security of speaker verification systems against imposture using synthetic speech
-
Dallas, TX, Mar.
-
P. L. De Leon, V. R. Apsingekar, M. Pucher, and J. Yamagishi, "Revisiting the security of speaker verification systems against imposture using synthetic speech," in Proc. ICASSP-10, Dallas, TX, Mar. 2010.
-
(2010)
Proc. ICASSP-10
-
-
De Leon, P.L.1
Apsingekar, V.R.2
Pucher, M.3
Yamagishi, J.4