-
1
-
-
85133674021
-
Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV
-
Aug
-
J. Yamagishi, T. Kobayashi, S. Renals, S. King, H. Zen, T. Toda, and K. Tokuda, "Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV," in Proc. ISCA SSW6, Aug. 2007.
-
(2007)
Proc. ISCA SSW6
-
-
Yamagishi, J.1
Kobayashi, T.2
Renals, S.3
King, S.4
Zen, H.5
Toda, T.6
Tokuda, K.7
-
2
-
-
67650854725
-
Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm
-
Jan
-
J. Yamagishi, T. Kobayashi, Y. Nakano, K. Ogata, and J. Isogai, "Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm," IEEE Trans. Audio, Speech, Lang. Process., vol. 17, no. 1, pp. 66-83, Jan. 2009.
-
(2009)
IEEE Trans. Audio, Speech, Lang. Process
, vol.17
, Issue.1
, pp. 66-83
-
-
Yamagishi, J.1
Kobayashi, T.2
Nakano, Y.3
Ogata, K.4
Isogai, J.5
-
3
-
-
0032026483
-
Continuous probabilistic transform for voice conversion
-
PII S1063667698017386
-
Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998. (Pubitemid 128720639)
-
(1998)
IEEE Transactions on Speech and Audio Processing
, vol.6
, Issue.2
, pp. 131-142
-
-
Stylianou, Y.1
Cappe, O.2
Moulines, E.3
-
4
-
-
77953723062
-
Synthesis of child speech with HMM adaptation and voice conversion
-
Aug
-
O. Watts, J. Yamagishi, S. King, and K. Berkling, "Synthesis of child speech with HMM adaptation and voice conversion," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 6, pp. 1005-1016, Aug. 2010.
-
(2010)
IEEE Trans. Audio, Speech, Lang. Process
, vol.18
, Issue.6
, pp. 1005-1016
-
-
Watts, O.1
Yamagishi, J.2
King, S.3
Berkling, K.4
-
5
-
-
0031623661
-
Spectral voice conversion for text-to-speech synthesis
-
May 12-15 vol. 1
-
A. Kain and M. W. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP'98, May 12-15, 1998, vol. 1, pp. 285-288.
-
(1998)
Proc. ICASSP'98
, vol.1
, pp. 285-288
-
-
Kain, A.1
Macon, M.W.2
-
6
-
-
84994241109
-
Including dynamic and phonetic information in voice conversion systems
-
Jeju Island, South Korea
-
H. Duxans, A. Bonafonte, A. Kain, and J. van Santen, "Including dynamic and phonetic information in voice conversion systems," in Proc. ICSLP '04, Jeju Island, South Korea, 2004, pp. 5-8.
-
(2004)
Proc. ICSLP '04
, pp. 5-8
-
-
Duxans, H.1
Bonafonte, A.2
Kain, A.3
Van Santen, J.4
-
7
-
-
57749193836
-
Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
-
Nov
-
T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process
, vol.15
, Issue.8
, pp. 2222-2235
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
8
-
-
77952978184
-
Adaptive training for voice conversion based on eigenvoices
-
Jun
-
Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Adaptive training for voice conversion based on eigenvoices," IEICE Trans. Inf. Syst., vol. E93-D, no. 6, pp. 1589-1598, Jun. 2010.
-
(2010)
IEICE Trans. Inf. Syst
, vol.E93-D
, Issue.6
, pp. 1589-1598
-
-
Ohtani, Y.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
9
-
-
34548216761
-
Conversion function clustering and selection using linguistic and spectral information for emotional voice conversion
-
Sep
-
C.-C. Hsia, C.-H. Wu, and J.-Q. Wu, "Conversion function clustering and selection using linguistic and spectral information for emotional voice conversion," IEEE Trans. Comput., vol. 56, no. 9, pp. 1225-1233, Sep. 2007.
-
(2007)
IEEE Trans. Comput
, vol.56
, Issue.9
, pp. 1225-1233
-
-
Hsia, C.-C.1
Wu, C.-H.2
Wu, J.-Q.3
-
10
-
-
34047247202
-
Voice conversion using duration-embedded Bi-HMMs for expressive speech synthesis
-
DOI 10.1109/TASL.2006.876112
-
C.-H. Wu, C.-C. Hsia, T.-H. Liu, and J.-F. Wang, "Voice conversion using duration-embedded Bi-HMMs for expressive speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 4, pp. 1109-1116, Jul. 2006. (Pubitemid 46547608)
-
(2006)
IEEE Transactions on Audio, Speech and Language Processing
, vol.14
, Issue.4
, pp. 1109-1116
-
-
Wu, C.-H.1
Hsia, C.-C.2
Liu, T.-H.3
Wang, J.-F.4
-
11
-
-
0026394044
-
Speaker adaptation and voice conversion by codebook mapping
-
Jun. 11-14
-
K. Shikano, S. Nakamura, and M. Abe, "Speaker adaptation and voice conversion by codebook mapping," in Proc. IEEE Int. Symp. Circuits Syst., Jun. 11-14, 1991, vol. 1, pp. 594-597.
-
(1991)
Proc. IEEE Int. Symp. Circuits Syst
, vol.1
, pp. 594-597
-
-
Shikano, K.1
Nakamura, S.2
Abe, M.3
-
12
-
-
77953707533
-
Spectral mapping using artificial neural networks for voice conversion
-
Jul
-
S. Desai, A. W. Black, B. Yegnanarayana, and K. Prahallad, "Spectral mapping using artificial neural networks for voice conversion," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 5, pp. 954-964, Jul. 2010.
-
(2010)
IEEE Trans. Audio, Speech, Lang. Process
, vol.18
, Issue.5
, pp. 954-964
-
-
Desai, S.1
Black, A.W.2
Yegnanarayana, B.3
Prahallad, K.4
-
13
-
-
84946753271
-
VTLN-based cross-language voice conversion
-
30 Nov.-3 Dec
-
D. Sundermann, H. Ney, and H. Hoge, "VTLN-based cross-language voice conversion," in Proc. IEEE Workshop on ASRU'03, 30 Nov.-3 Dec. 2003, pp. 676-681.
-
(2003)
Proc. IEEE Workshop on ASRU'03
, pp. 676-681
-
-
Sundermann, D.1
Ney, H.2
Hoge, H.3
-
14
-
-
85128407266
-
Phonetic Alignment: Speech Synthesis vs. Hybrid HMM/ANN
-
Sydney, Australia Dec
-
F. Malfrere, O. Deroo, and T. Dutoit, "Phonetic Alignment: Speech Synthesis vs. Hybrid HMM/ANN," in Proc. ICSLP'98, Sydney, Australia, Dec. 1998, vol. 4, p. 1571.
-
(1998)
Proc. ICSLP'98
, vol.4
, pp. 1571
-
-
Malfrere, F.1
Deroo, O.2
Dutoit, T.3
-
15
-
-
0030366724
-
Autolabelling Japanese ToBI
-
Philadelphia, PA Oct
-
N. Campbell, "Autolabelling Japanese ToBI," in Proc. ICSLP'96, Philadelphia, PA, Oct. 1996.
-
(1996)
Proc. ICSLP'96
-
-
Campbell, N.1
-
16
-
-
77955722263
-
Hierarchical prosody conversion using regression-based clustering for emotional speech synthesis
-
Aug
-
C.-H. Wu, C.-C. Hsia, C.-H. Lee, and M.-C. Lin, "Hierarchical prosody conversion using regression-based clustering for emotional speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 6, pp. 1394-1405, Aug. 2010.
-
(2010)
IEEE Trans. Audio, Speech, Lang. Process
, vol.18
, Issue.6
, pp. 1394-1405
-
-
Wu, C.-H.1
Hsia, C.-C.2
Lee, C.-H.3
Lin, M.-C.4
-
17
-
-
77956285048
-
Exploiting prosody hierarchy and dynamic features for pitch modeling and generation in HMM-based speech synthesis
-
Nov
-
C.-C. Hsia, C.-H. Wu, and J.-Y. Wu, "Exploiting prosody hierarchy and dynamic features for pitch modeling and generation in HMM-based speech synthesis," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 1994-2003, Nov. 2010.
-
(2010)
IEEE Trans. Audio, Speech, Lang. Process
, vol.18
, Issue.8
, pp. 1994-2003
-
-
Hsia, C.-C.1
Wu, C.-H.2
Wu, J.-Y.3
-
18
-
-
21844474040
-
Fluent speech prosody: Framework and modeling
-
DOI 10.1016/j.specom.2005.03.015, PII S0167639305000919, Quantitative Prosody Modelling for Natural Speech Description and Generation
-
C.-Y. Tseng, S.-H. Pin, Y.-L. Lee, H. M. Wang, and Y. C. Chen, "Fluent Speech Prosody: Framework and Modeling," Speech Commun., Spec. Iss. Quantitative Prosody Modeling for Natural Speech Description and Generation, vol. 46, no. 3-4, pp. 284-309, 2005. (Pubitemid 40952517)
-
(2005)
Speech Communication
, vol.46
, Issue.3-4
, pp. 284-309
-
-
Tseng, C.-Y.1
Pin, S.-H.2
Lee, Y.-L.3
Wang, H.-M.4
Chen, Y.-C.5
-
19
-
-
13544257213
-
A statistics-based pitch contour model for Mandarin speech
-
DOI 10.1121/1.1841572
-
S.-H. Chen, W.-H. Lai, and Y.-R. Wang, "A statistics-based pitch contour model for Mandarin speech," J. Acoust. Soc. Amer., vol. 117, no. 2, pp. 908-925, 2005. (Pubitemid 40223449)
-
(2005)
Journal of the Acoustical Society of America
, vol.117
, Issue.2
, pp. 908-925
-
-
Chen, S.-H.1
Lai, W.-H.2
Wang, Y.-R.3
-
20
-
-
4544354696
-
Segmental tonal modeling for phone set design in Mandarin LVCSR
-
C. Huang, Y. Shi, J. L. Zhou, M. Chu, T. Wang, and E. Chang, "Segmental tonal modeling for phone set design in Mandarin LVCSR," in Proc. ICASSP'04, 2004, pp. 901-904.
-
(2004)
Proc. ICASSP'04
, pp. 901-904
-
-
Huang, C.1
Shi, Y.2
Zhou, J.L.3
Chu, M.4
Wang, T.5
Chang, E.6
-
21
-
-
0030677481
-
Speech representation and transformation using adaptive interpolation of weighted spectrum: Vocoder revisited
-
Munich, Germany
-
H. Kawahara, "Speech representation and transformation using adaptive interpolation of weighted spectrum: Vocoder revisited," in Proc. ICASSP'97, Munich, Germany, 1997, pp. 1303-1306.
-
(1997)
Proc. ICASSP'97
, pp. 1303-1306
-
-
Kawahara, H.1
-
22
-
-
21444431930
-
Locating boundaries for prosodic constituents in unrestricted Mandarin texts
-
M. Chu and Y. Qian, "Locating boundaries for prosodic constituents in unrestricted Mandarin texts," Comput. Linguist. Chinese Lang. Process., vol. 6, no. 1, pp. 61-82, 2001.
-
(2001)
Comput. Linguist. Chinese Lang. Process
, vol.6
, Issue.1
, pp. 61-82
-
-
Chu, M.1
Qian, Y.2
-
23
-
-
0024736612
-
The synthesis rules in a Chinese text-to-speech system
-
Sep
-
L.-S. Lee, C.-Y. Tseng, and M. Ouh-young, "The synthesis rules in a Chinese text-to-speech system," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 9, pp. 1309-1319, Sep. 1989.
-
(1989)
IEEE Trans. Acoust., Speech, Signal Process
, vol.37
, Issue.9
, pp. 1309-1319
-
-
Lee, L.-S.1
Tseng, C.-Y.2
Ouh-Young, M.3
-
24
-
-
84856036312
-
A corpus-based Mandarin text-to-speech synthesizer
-
A. Benijamin, S. Chilin, and S. Richard, "A corpus-based Mandarin text-to-speech synthesizer," in Proc. ICSLP, 1994, vol. S29, no. 8.1-8.4, pp. 1771-1774.
-
(1994)
Proc. ICSLP
, vol.S29
, Issue.81-84
, pp. 1771-1774
-
-
Benijamin, A.1
Chilin, S.2
Richard, S.3
-
25
-
-
70450171823
-
Analysis and recognition of accentual patterns
-
Wagner and Agnieszka, "Analysis and recognition of accentual patterns," in Proc. Interspeech'09, 2009, pp. 2427-2430.
-
(2009)
Proc. Interspeech'09
, vol.2009
, pp. 2427-2430
-
-
Wagner1
Agnieszka2
-
26
-
-
0022796218
-
Synthesis of natural sounding pitch contours in isolated utterances using hidden Markov models
-
Oct
-
L. Andrej and F. Frank, "Synthesis of natural sounding pitch contours in isolated utterances using hidden Markov models," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-34, no. 5, pp. 1074-1080, Oct. 1986.
-
(1986)
IEEE Trans. Acoust., Speech, Signal Process
, vol.ASSP-34
, Issue.5
, pp. 1074-1080
-
-
Andrej, L.1
Frank, F.2
-
27
-
-
0034509204
-
Prosody model in a Mandarin text-to-speech system based on a hierarchical approach
-
N.-H. Pan, W.-T. Jen, S.-S. Yu, S.-Y. Huang, and M.-J. Wu, "Prosody model in a Mandarin text-to-speech system based on a hierarchical approach," in Proc. IEEE Int. Conf. Multimedia and Expo, 2000, vol. 1, pp. 448-451. (Pubitemid 33058980)
-
(2000)
IEEE International Conference on Multi-Media and Expo
, Issue.IMONDAY
, pp. 448-451
-
-
Pan, N.-H.1
Jen, W.-T.2
Yu, S.-S.3
Yu, M.-S.4
Huang, S.-Y.5
Wu, M.-J.6
-
28
-
-
85009282418
-
Pitch Contour Model for Chinese text-to-speech using CART and statistical model
-
M. Dong and K.-T. Lua, "Pitch Contour Model for Chinese text-to-speech using CART and statistical model," in Proc. ICSLP, 2002, pp. 2405-2408.
-
(2002)
Proc. ICSLP
, pp. 2405-2408
-
-
Dong, M.1
Lua, K.-T.2
-
29
-
-
0142192295
-
Conditional random fields: Probabilistic models for segmenting and labeling sequence data
-
J. Lafferty, A. McCallum, and F. Pereira, "Conditional random fields: Probabilistic models for segmenting and labeling sequence data," in Proc. Int. Conf. Mach. Learn., 2001.
-
(2001)
Proc. Int. Conf. Mach. Learn
-
-
Lafferty, J.1
McCallum, A.2
Pereira, F.3
-
30
-
-
84867923069
-
Domain adaptation for conditional random fields
-
New York: Springer
-
Q. Zhang, X. Qiu, X. Huang, and L. Wu, "Domain Adaptation for Conditional Random Fields," in Information Retrieval Technology. New York: Springer, 2008.
-
(2008)
Information Retrieval Technology
-
-
Zhang, Q.1
Qiu, X.2
Huang, X.3
Wu, L.4
-
31
-
-
33646887390
-
On the limited memory BFGS method for large scale optimization
-
D. C. Liu and J. Nocedal, "On the limited memory BFGS method for large scale optimization," Math. Programming, ser. B, vol. 45, no. 3, pp. 503-528, 1989. (Pubitemid 20660315)
-
(1989)
Mathematical Programming, Series B
, vol.45
, Issue.3
, pp. 503-528
-
-
Liu, D.C.1
Nocedal, J.2
-
32
-
-
0002629270
-
Maximum likelihood from incomplete data via the EM algorithm
-
A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. R. Statist. Soc. B, vol. 39, pp. 1-38, 1977.
-
(1977)
J. R. Statist. Soc. B
, vol.39
, pp. 1-38
-
-
Dempster, A.P.1
Laird, N.M.2
Rubin, D.B.3
-
33
-
-
84867197177
-
Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge
-
Sep
-
Z. H. Ling, K. Richmond, J. Yamagishi, and R. H. Wang, "Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge," in Proc. Interspeech'08, Brisbane, Australia, Sep. 2008, pp. 573-576.
-
(2008)
Proc. Interspeech'08, Brisbane, Australia
, pp. 573-576
-
-
Ling, Z.H.1
Richmond, K.2
Yamagishi, J.3
Wang, R.H.4
-
34
-
-
44449179384
-
TH-CoSS, a Mandarin speech corpus for TTS
-
Mar
-
L. H. Cai, D. D. Cui, and R. Cai, "TH-CoSS, a Mandarin speech corpus for TTS," J. Chinese Inf. Process., vol. 21, no. 2, pp. 94-99, Mar. 2007.
-
(2007)
J. Chinese Inf. Process
, vol.21
, Issue.2
, pp. 94-99
-
-
Cai, L.H.1
Cui, D.D.2
Cai, R.3
-
35
-
-
70350498327
-
-
[Online]
-
H. Zen, T. Nose, J. Yamagishi, S. Sako, and K. Tokuda, The HMM-based Speech Synthesis System (HTS) Version 2.0, 2007 [Online]. Available: http://hts.sp.nitech.ac.jp/
-
(2007)
The HMM-based Speech Synthesis System (HTS) Version 2.0
-
-
Zen, H.1
Nose, T.2
Yamagishi, J.3
Sako, S.4
Tokuda, K.5