-
1
-
-
84855906479
-
Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction
-
J. Yamagishi, C. Veaux, S. King, and S. Renals, "Speech synthesis technologies for individuals with vocal disabilities: Voice banking and reconstruction, " Acoust. Sci. Technol., vol. 33, pp. 1-5, 2012.
-
(2012)
Acoust. Sci. Technol.
, vol.33
, pp. 1-5
-
-
Yamagishi, J.1
Veaux, C.2
King, S.3
Renals, S.4
-
2
-
-
84905252904
-
An evaluation of excitation feature prediction in a hybrid approach to electrolaryngeal speech enhancement
-
Florence, Italy, May
-
K. Tanaka, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, "An evaluation of excitation feature prediction in a hybrid approach to electrolaryngeal speech enhancement, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Florence, Italy, May 2014, pp. 4521-4525.
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 4521-4525
-
-
Tanaka, K.1
Toda, T.2
Neubig, G.3
Sakti, S.4
Nakamura, S.5
-
3
-
-
84874403435
-
Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system
-
Hollywood, CA, USA, Nov
-
H. Doi, T. Toda, T. Nakano, M. Goto, and S. Nakamura, "Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system, " in Proc. Asia Pac. Signal Inf. Process. Assoc. Annu. Summit Conf. (APSIPA ASC), Hollywood, CA, USA, Nov. 2012, pp. 1-6.
-
(2012)
Proc. Asia Pac. Signal Inf. Process. Assoc. Annu. Summit Conf. (APSIPA ASC)
, pp. 1-6
-
-
Doi, H.1
Toda, T.2
Nakano, T.3
Goto, M.4
Nakamura, S.5
-
4
-
-
84865743435
-
Speaker-adaptive speech synthesis based on eigenvoice conversion and language-dependent prosodic conversion in speech-to-speech translation
-
Florence, Italy, Aug
-
N. Hattori, T. Toda, H. Kawai, H. Saruwatari, and K. Shikano, "Speaker-adaptive speech synthesis based on eigenvoice conversion and language-dependent prosodic conversion in speech-to-speech translation, " in Proc. INTERSPEECH, Florence, Italy, Aug. 2011, pp. 2769-2772.
-
(2011)
Proc. INTERSPEECH
, pp. 2769-2772
-
-
Hattori, N.1
Toda, T.2
Kawai, H.3
Saruwatari, H.4
Shikano, K.5
-
5
-
-
84905280452
-
Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis
-
Florence, Italy, May
-
P. K. Muthukumar and A. W. Black, "Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis, " in Proc. Int. Conf. Acoust., Speech, Signal Process., Florence, Italy, May 2014, pp. 2594-2598.
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process
, pp. 2594-2598
-
-
Muthukumar, P.K.1
Black, A.W.2
-
6
-
-
85047398116
-
Text to speech in new languages without a standardized orthography
-
Barcelona, Spain, Aug
-
S. Sitaram, G. Anumanchipalli, J. Chiu, A. Parlikar, and A. W. Black, "Text to speech in new languages without a standardized orthography, " in Proc. 8th Speech Synth. Workshop, Barcelona, Spain, Aug. 2013, pp. 95-100.
-
(2013)
Proc. 8th Speech Synth. Workshop
, pp. 95-100
-
-
Sitaram, S.1
Anumanchipalli, G.2
Chiu, J.3
Parlikar, A.4
Black, A.W.5
-
7
-
-
84905223321
-
Regression approaches to perceptual age control in singing voice conversion
-
Florence, Italy, May
-
K. Kobayashi et al., "Regression approaches to perceptual age control in singing voice conversion, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Florence, Italy, May 2014, pp. 7954-7958.
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 7954-7958
-
-
Kobayashi, K.1
-
8
-
-
85032750981
-
Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends
-
May
-
Z.-H. Ling et al., "Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends, " IEEE Signal Process. Mag., vol. 32, no. 3, pp. 35-52, May 2015.
-
(2015)
IEEE Signal Process. Mag.
, vol.32
, Issue.3
, pp. 35-52
-
-
Ling, Z.-H.1
-
9
-
-
0023756465
-
Speech synthesis by rule using an optimal selection of non-uniform synthesis units
-
New York, NY, USA, Apr
-
Y. Sagisaka, "Speech synthesis by rule using an optimal selection of non-uniform synthesis units, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), New York, NY, USA, Apr. 1988, pp. 679-682.
-
(1988)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 679-682
-
-
Sagisaka, Y.1
-
10
-
-
0032026483
-
Continuous probabilistic transform for voice conversion
-
Mar
-
Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion, " IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998.
-
(1998)
IEEE Trans. Speech Audio Process
, vol.6
, Issue.2
, pp. 131-142
-
-
Stylianou, Y.1
Cappe, O.2
Moulines, E.3
-
11
-
-
67651002140
-
Statistical parametric speech synthesis
-
H. Zen, K. Tokuda, and A. Black, "Statistical parametric speech synthesis, " Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009.
-
(2009)
Speech Commun.
, vol.51
, Issue.11
, pp. 1039-1064
-
-
Zen, H.1
Tokuda, K.2
Black, A.3
-
12
-
-
0028996993
-
Speech parameter generation from HMM using dynamic features
-
Detroit, MI, USA, May
-
K. Tokuda, T. Kobayashi, and S. Imai, "Speech parameter generation from HMM using dynamic features, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Detroit, MI, USA, May 1995, pp. 660-663.
-
(1995)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 660-663
-
-
Tokuda, K.1
Kobayashi, T.2
Imai, S.3
-
13
-
-
84876687945
-
Speech synthesis based on hidden Markov models
-
May
-
K. Tokuda, Y. Nankaku, T. Toda, H. Zen, J. Yamagishi, and K. Oura, "Speech synthesis based on hidden Markov models, " Proc. IEEE, vol. 101, no. 5, pp. 1234-1252, May 2013.
-
(2013)
Proc. IEEE
, vol.101
, Issue.5
, pp. 1234-1252
-
-
Tokuda, K.1
Nankaku, Y.2
Toda, T.3
Zen, H.4
Yamagishi, J.5
Oura, K.6
-
14
-
-
57749193836
-
Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
-
Nov
-
T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory, " IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process
, vol.15
, Issue.8
, pp. 2222-2235
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
15
-
-
44949232373
-
CLUSTERGEN: A statistical parametric synthesizer using trajectory modeling
-
Pittsburgh, PA, USA, Sep
-
A. W. Black, "CLUSTERGEN: A statistical parametric synthesizer using trajectory modeling, " in Proc. INTERSPEECH, Pittsburgh, PA, USA, Sep. 2006.
-
(2006)
Proc. INTERSPEECH
-
-
Black, A.W.1
-
16
-
-
84897902941
-
Statistical parametric speech synthesis based on Gaussian process regression
-
Apr
-
T. Koriyama, T. Nose, and T. Kobayashi, "Statistical parametric speech synthesis based on Gaussian process regression, " IEEE J. Sel. Topics Signal Process., vol. 8, no. 2, pp. 173-183, Apr. 2014.
-
(2014)
IEEE J. Sel. Topics Signal Process
, vol.8
, Issue.2
, pp. 173-183
-
-
Koriyama, T.1
Nose, T.2
Kobayashi, T.3
-
17
-
-
84856141218
-
Voice conversion using dynamic kernel partial least squares regression
-
Mar
-
E. Helander, T. V. H. Silen, and M. Gabbouj, "Voice conversion using dynamic kernel partial least squares regression, " IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 3, pp. 806-817, Mar. 2012.
-
(2012)
IEEE Trans. Audio, Speech, Lang. Process
, vol.20
, Issue.3
, pp. 806-817
-
-
Helander, E.1
Silen, T.V.H.2
Gabbouj, M.3
-
18
-
-
84905262874
-
Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis
-
Florence, Italy, May
-
H. Zen and A. Senior, "Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Florence, Italy, May 2014, pp. 3872-3876.
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 3872-3876
-
-
Zen, H.1
Senior, A.2
-
19
-
-
84906280857
-
Voice conversion in high-order Eigen space using deep belief nets
-
Lyon, France, Aug
-
T. Nakashika, R. Takashima, T. Takiguchi, and Y. Ariki, "Voice conversion in high-order Eigen space using deep belief nets, " in Proc. INTERSPEECH, Lyon, France, Aug. 2013, pp. 369-372.
-
(2013)
Proc. INTERSPEECH
, pp. 369-372
-
-
Nakashika, T.1
Takashima, R.2
Takiguchi, T.3
Ariki, Y.4
-
20
-
-
0029765811
-
Unit selection in a concatenative speech synthesis system using a large speech database
-
Atlanta, GA, USA, May
-
A. J. Hunt and A. Black, "Unit selection in a concatenative speech synthesis system using a large speech database, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Atlanta, GA, USA, May 1996, pp. 373-376.
-
(1996)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 373-376
-
-
Hunt, A.J.1
Black, A.2
-
21
-
-
84988274722
-
An investigation of implementation performance analysis of DNN based speech synthesis system
-
Brighton, U.K
-
K. Oura, H. Zen, Y. Nankaku, A. Lee, and K. Tokuda, "An investigation of implementation performance analysis of DNN based speech synthesis system, " in Proc. INTERSPEECH, Brighton, U.K., 2014, pp. 577-582.
-
(2014)
Proc. INTERSPEECH
, pp. 577-582
-
-
Oura, K.1
Zen, H.2
Nankaku, Y.3
Lee, A.4
Tokuda, K.5
-
22
-
-
33847129573
-
Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training
-
J. Yamagishi and T. Kobayashi, "Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training, " IEICE Trans. Inf. Syst., vol. E90-D, no. 2, pp. 533-543, 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.2
, pp. 533-543
-
-
Yamagishi, J.1
Kobayashi, T.2
-
23
-
-
51449114529
-
A style control technique for HMM-based expressive speech synthesis
-
T. Nose, J. Yamagishi, T. Masuko, and T. Kobayashi, "A style control technique for HMM-based expressive speech synthesis, " IEICE Trans. Inf. Syst., vol. E90-D, no. 9, pp. 1406-1413, 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, vol.E90-D
, Issue.9
, pp. 1406-1413
-
-
Nose, T.1
Yamagishi, J.2
Masuko, T.3
Kobayashi, T.4
-
24
-
-
84878397811
-
Exploring rich expressive information from audiobook data using cluster adaptive training
-
Portland, OR, USA, Sep
-
L. Chen, M. J. F. Gales, L. Chen, K. Chin, K. Knull, and M. Akamine, "Exploring rich expressive information from audiobook data using cluster adaptive training, " in Proc. INTERSPEECH, Portland, OR, USA, Sep. 2012.
-
(2012)
Proc. INTERSPEECH
-
-
Chen, L.1
Gales, M.J.F.2
Chen, L.3
Chin, K.4
Knull, K.5
Akamine, M.6
-
26
-
-
70349197715
-
Voice transformation: A survey
-
Taipei, Taiwan, Apr
-
Y. Stylianou, "Voice transformation: A survey, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Taipei, Taiwan, Apr. 2009, pp. 3585-3588.
-
(2009)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 3585-3588
-
-
Stylianou, Y.1
-
27
-
-
0032673049
-
Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, " Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999.
-
(1999)
Speech Commun.
, vol.27
, Issue.3-4
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
de Cheveigné, A.3
-
28
-
-
85010837662
-
An attempt to develop a singing synthesizer by collaborative creation
-
Stockholm, Sweden, Aug
-
M. Morise, "An attempt to develop a singing synthesizer by collaborative creation, " in Proc. Stockholm Music Acoust. Conf. (SMAC), Stockholm, Sweden, Aug. 2013, pp. 287-292.
-
(2013)
Proc. Stockholm Music Acoust. Conf. (SMAC)
, pp. 287-292
-
-
Morise, M.1
-
29
-
-
84930664922
-
Vocaine the vocoder and applications in speech synthesis
-
Brisbane, QLD, Australia, Apr
-
Y. Agiomyrgiannakis, "Vocaine the vocoder and applications in speech synthesis, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Brisbane, QLD, Australia, Apr. 2015, pp. 4230-4234.
-
(2015)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 4230-4234
-
-
Agiomyrgiannakis, Y.1
-
30
-
-
84906279165
-
Optimizations and fitting procedures for the Liljencrants-Fant model for statistical parametric speech synthesis
-
Vancouver, BC, Canada, May
-
P. K. Muthukumar, A. W. Black, and H. T. Bunnell, "Optimizations and fitting procedures for the Liljencrants-Fant model for statistical parametric speech synthesis, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Vancouver, BC, Canada, May 2013, pp. 397-401.
-
(2013)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 397-401
-
-
Muthukumar, P.K.1
Black, A.W.2
Bunnell, H.T.3
-
31
-
-
38549096029
-
A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
-
T. Toda and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis, " IEICE Trans., vol. E90-D, no. 5, pp. 816-824, 2007.
-
(2007)
IEICE Trans
, vol.E90-D
, Issue.5
, pp. 816-824
-
-
Toda, T.1
Tokuda, K.2
-
32
-
-
84897862522
-
Parameter generation methods with rich context models for high-quality and flexible text-to-speech synthesis
-
Apr
-
S. Takamichi, T. Toda, Y. Shiga, S. Sakti, G. Neubig, and S. Nakamura, "Parameter generation methods with rich context models for high-quality and flexible text-to-speech synthesis, " IEEE J. Sel. Topics Signal Process., vol. 8, no. 2, pp. 239-250, Apr. 2014.
-
(2014)
IEEE J. Sel. Topics Signal Process
, vol.8
, Issue.2
, pp. 239-250
-
-
Takamichi, S.1
Toda, T.2
Shiga, Y.3
Sakti, S.4
Neubig, G.5
Nakamura, S.6
-
33
-
-
84897832343
-
A parameter generation algorithm using local variance for HMM-based speech synthesis
-
Apr
-
T. Nose, V. Chunwijitra, and T. Kobayashi, "A parameter generation algorithm using local variance for HMM-based speech synthesis, " IEEE J. Sel. Topics Signal Process., vol. 8, no. 2, pp. 221-228, Apr. 2014.
-
(2014)
IEEE J. Sel. Topics Signal Process
, vol.8
, Issue.2
, pp. 221-228
-
-
Nose, T.1
Chunwijitra, V.2
Kobayashi, T.3
-
34
-
-
84890495160
-
Fast, low-artifact speech synthesis considering global variance
-
Vancouver, BC, Canada, May
-
M. Shannon and W. Byrne, "Fast, low-artifact speech synthesis considering global variance, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Vancouver, BC, Canada, May 2013, pp. 7869-7873.
-
(2013)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 7869-7873
-
-
Shannon, M.1
Byrne, W.2
-
35
-
-
84878390910
-
Implementation of computationally efficient real-time voice conversion
-
Portland, OR, USA, Sep
-
T. Toda, T. Muramatsu, and H. Banno, "Implementation of computationally efficient real-time voice conversion, " in Proc. INTERSPEECH, Portland, OR, USA, Sep. 2012.
-
(2012)
Proc. INTERSPEECH
-
-
Toda, T.1
Muramatsu, T.2
Banno, H.3
-
36
-
-
0028287770
-
Effect of reducing slow temporal modulations on speech reception
-
R. Drullman, J. M. Festen, and R. Plomp, "Effect of reducing slow temporal modulations on speech reception, " J. Acoust. Soc. Amer., vol. 95, pp. 2670-2680, 1994.
-
(1994)
J. Acoust. Soc. Amer.
, vol.95
, pp. 2670-2680
-
-
Drullman, R.1
Festen, J.M.2
Plomp, R.3
-
37
-
-
70349212558
-
Phoneme recognition using spectral envelope and modulation frequency features
-
Taipei, Taiwan, Apr
-
S. Thomas, S. Ganapathy, and H. Hermansky, "Phoneme recognition using spectral envelope and modulation frequency features, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Taipei, Taiwan, Apr. 2009, pp. 4453-4456.
-
(2009)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 4453-4456
-
-
Thomas, S.1
Ganapathy, S.2
Hermansky, H.3
-
38
-
-
84959088222
-
Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals
-
Dresden, Germany, Sep
-
S. Gergen, A. Nagathil, and R. Martin, "Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals, " in Proc. INTERSPEECH, Dresden, Germany, Sep. 2015, pp. 1992-1995.
-
(2015)
Proc. INTERSPEECH
, pp. 1992-1995
-
-
Gergen, S.1
Nagathil, A.2
Martin, R.3
-
39
-
-
84890543945
-
Synthetic speech detection using temporal modulation feature
-
Vancouver, BC, Canada, May
-
Z. Wu, X. Xiao, E. S. Chng, and H. Li, "Synthetic speech detection using temporal modulation feature, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Vancouver, BC, Canada, May 2013, pp. 7234-7238.
-
(2013)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 7234-7238
-
-
Wu, Z.1
Xiao, X.2
Chng, E.S.3
Li, H.4
-
40
-
-
0027957839
-
Effect of temporal envelope smearing on speech perception
-
R. Drullman, J. M. Festen, and R. Plomp, "Effect of temporal envelope smearing on speech perception, " J. Acoust. Soc. Amer., vol. 95, pp. 1053-1064, 1994.
-
(1994)
J. Acoust. Soc. Amer.
, vol.95
, pp. 1053-1064
-
-
Drullman, R.1
Festen, J.M.2
Plomp, R.3
-
41
-
-
0030369532
-
Intelligibility of speech with filtered time trajectories of spectral envelopes
-
T. Arai, M. Pavel, H. Hermansky, and C. Avendano, "Intelligibility of speech with filtered time trajectories of spectral envelopes, " in Proc. 4th Int. Conf. Spoken Lang. (ICSLP), 1996, vol. 4, pp. 2490-2493.
-
(1996)
Proc. 4th Int. Conf. Spoken Lang. (ICSLP)
, vol.4
, pp. 2490-2493
-
-
Arai, T.1
Pavel, M.2
Hermansky, H.3
Avendano, C.4
-
42
-
-
84867211725
-
Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory
-
Brisbane, QLD, Australia, Sep
-
T. Muramatsu, Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory, " in Proc. INTERSPEECH, Brisbane, QLD, Australia, Sep. 2008, pp. 1076-1079.
-
(2008)
Proc. INTERSPEECH
, pp. 1076-1079
-
-
Muramatsu, T.1
Ohtani, Y.2
Toda, T.3
Saruwatari, H.4
Shikano, K.5
-
43
-
-
84946045510
-
Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis
-
Brisbane, QLD, Australia, Apr
-
H. Zen and H. Sak, "Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Brisbane, QLD, Australia, Apr. 2015, pp. 4470-4474.
-
(2015)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 4470-4474
-
-
Zen, H.1
Sak, H.2
-
44
-
-
84905234422
-
A postfilter to modify modulation spectrum in HMM-based speech synthesis
-
Florence, Italy, May
-
S. Takamichi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, "A postfilter to modify modulation spectrum in HMM-based speech synthesis, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Florence, Italy, May 2014, pp. 290-294.
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 290-294
-
-
Takamichi, S.1
Toda, T.2
Neubig, G.3
Sakti, S.4
Nakamura, S.5
-
45
-
-
84959144982
-
Modified modulation spectrum-based post-filter for HMM-based speech synthesis
-
Atlanta, GA, USA, Dec
-
S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modified modulation spectrum-based post-filter for HMM-based speech synthesis, " in Proc. GlobalSIP, Atlanta, GA, USA, Dec. 2014, pp. 710-714.
-
(2014)
Proc. GlobalSIP
, pp. 710-714
-
-
Takamichi, S.1
Toda, T.2
Black, A.W.3
Nakamura, S.4
-
46
-
-
84949926049
-
Modulation spectrum-based post-filter for GMM-based voice conversion
-
Siem Reap, Cambodia, Dec
-
S. Takamichi, T. Toda, A. W. Black, and S. Nakamura, "Modulation spectrum-based post-filter for GMM-based voice conversion, " in Proc. Annu. Summit Conf. Asia-Pac. Signal Inf. Process. Assoc. (APSIPA ASC), Siem Reap, Cambodia, Dec. 2014, pp. 1-4.
-
(2014)
Proc. Annu. Summit Conf. Asia-Pac. Signal Inf. Process. Assoc. (APSIPA ASC)
, pp. 1-4
-
-
Takamichi, S.1
Toda, T.2
Black, A.W.3
Nakamura, S.4
-
47
-
-
85009139544
-
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
-
Budapest, Hungary, Apr
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis, " in Proc. EUROSPEECH, Budapest, Hungary, Apr. 1999, pp. 2347-2350.
-
(1999)
Proc. EUROSPEECH
, pp. 2347-2350
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
48
-
-
0033708106
-
Speech parameter generation algorithms for HMM-based speech synthesis
-
Istanbul, Turkey, Jun
-
K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Istanbul, Turkey, Jun. 2000, pp. 1315-1318.
-
(2000)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 1315-1318
-
-
Tokuda, K.1
Yoshimura, T.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
49
-
-
84989843139
-
MDL-based context-dependent subword modeling for speech recognition
-
K. Shinoda and T. Watanabe, "MDL-based context-dependent subword modeling for speech recognition, " J. Acoust. Soc. Jpn. (E), vol. 28, no. 3, pp. 140-146, 2007.
-
(2007)
J. Acoust. Soc. Jpn. (E)
, vol.28
, Issue.3
, pp. 140-146
-
-
Shinoda, K.1
Watanabe, T.2
-
51
-
-
84866866142
-
A state duration generation algorithm considering global variance for HMM-based speech synthesis
-
Xi'an, China
-
S. Pan, J. Tao, and Y. Wang, "A state duration generation algorithm considering global variance for HMM-based speech synthesis, " in Proc. Annu. Summit Conf. Asia-Pac. Signal Inf. Process. Assoc. (APSIPA ASC), Xi'an, China, 2011.
-
(2011)
Proc. Annu. Summit Conf. Asia-Pac. Signal Inf. Process. Assoc. (APSIPA ASC)
-
-
Pan, S.1
Tao, J.2
Wang, Y.3
-
52
-
-
85008023596
-
Continuous F0 modeling for HMM based statistical parametric speech synthesis
-
Jul
-
K. Yu and S. Young, "Continuous F0 modeling for HMM based statistical parametric speech synthesis, " IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 5, pp. 1071-1079, Jul. 2011.
-
(2011)
IEEE Trans. Audio, Speech, Lang. Process
, vol.19
, Issue.5
, pp. 1071-1079
-
-
Yu, K.1
Young, S.2
-
53
-
-
84905244240
-
A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion
-
Lyon, France, Sep
-
K. Tanaka, T. Toda, G. Neubig, S. Sakti, and S. Nakamura, "A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion, " in Proc. Interspeech, Lyon, France, Sep. 2013, pp. 3067-3071.
-
(2013)
Proc. Interspeech
, pp. 3067-3071
-
-
Tanaka, K.1
Toda, T.2
Neubig, G.3
Sakti, S.4
Nakamura, S.5
-
54
-
-
84925160976
-
-
Cambridge, U.K.: Cambridge Univ. Press
-
P. Taylor, Text-To-Speech Synthesis. Cambridge, U.K.: Cambridge Univ. Press, 2009.
-
(2009)
Text-To-Speech Synthesis
-
-
Taylor, P.1
-
55
-
-
84905283795
-
A frequency-weighted post-filtering transform for compensation of the over-smoothing effect in HMM-based speech synthesis
-
Florence, Italy, May
-
F. Eyben and Y. Agiomyrgiannakis, "A frequency-weighted post-filtering transform for compensation of the over-smoothing effect in HMM-based speech synthesis, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Florence, Italy, May 2014, pp. 275-279.
-
(2014)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 275-279
-
-
Eyben, F.1
Agiomyrgiannakis, Y.2
-
56
-
-
84946074523
-
The effect of neural networks in statistical parametric speech synthesis
-
Brisbane, QLD, Australia, Apr
-
K. Hashimoto, K. Oura, Y. Nankaku, and K. Tokuda, "The effect of neural networks in statistical parametric speech synthesis, " in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Brisbane, QLD, Australia, Apr. 2015, pp. 4455-4459.
-
(2015)
Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP)
, pp. 4455-4459
-
-
Hashimoto, K.1
Oura, K.2
Nankaku, Y.3
Tokuda, K.4
-
57
-
-
84910100893
-
DNN-based stochastic postfilter for HMM-based speech synthesis
-
MAX Atria, Singapore, Sep
-
L.-H. Chen, T. Raitio, C.-V. Botinhao, J. Yamagishi, and Z.-H. Ling, "DNN-based stochastic postfilter for HMM-based speech synthesis, " in Proc. INTERSPEECH, MAX Atria, Singapore, Sep. 2014, pp. 1954-1958.
-
(2014)
Proc. INTERSPEECH
, pp. 1954-1958
-
-
Chen, L.-H.1
Raitio, T.2
Botinhao, C.-V.3
Yamagishi, J.4
Ling, Z.-H.5
-
58
-
-
85008525798
-
Product of experts for statistical parametric speech synthesis
-
Mar
-
H. Zen, M. J. F. Gales, Y. Nankaku, and K. Tokuda, "Product of experts for statistical parametric speech synthesis, " IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 3, pp. 794-805, Mar. 2012.
-
(2012)
IEEE Trans. Audio, Speech, Lang. Process
, vol.20
, Issue.3
, pp. 794-805
-
-
Zen, H.1
Gales, M.J.F.2
Nankaku, Y.3
Tokuda, K.4
-
59
-
-
84910088495
-
Analysis of spectral enhancement using global variance in HMM-based speech synthesis
-
MAX Atria, Singapore, Sep
-
T. Nose and A. Ito, "Analysis of spectral enhancement using global variance in HMM-based speech synthesis, " in Proc. INTERSPEECH, MAX Atria, Singapore, Sep. 2014, pp. 2917-2921.
-
(2014)
Proc. INTERSPEECH
, pp. 2917-2921
-
-
Nose, T.1
Ito, A.2
-
60
-
-
44449177634
-
Hidden semi-Markov model based speech synthesis system
-
vol. E90-D, no. 5
-
H. Zen, K. Tokuda, T. K. T. Masuko, and T. Kitamura, "Hidden semi-Markov model based speech synthesis system, " IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 825-834, 2007.
-
(2007)
IEICE Trans. Inf. Syst.
, pp. 825-834
-
-
Zen, H.1
Tokuda, K.2
Masuko, T.K.T.3
Kitamura, T.4
-
61
-
-
6644226630
-
A large-scale Japanese speech database
-
Kobe, Japan, Nov
-
Y. Sagisaka, K. Takeda, M. Abe, S. Katagiri, T. Umeda, and H. Kuwabara, "A large-scale Japanese speech database, " in Proc. Int. Conf. Spoken Lang. (ICSLP'90), Kobe, Japan, Nov. 1990, pp. 1089-1092.
-
(1990)
Proc. Int. Conf. Spoken Lang. (ICSLP'90)
, pp. 1089-1092
-
-
Sagisaka, Y.1
Takeda, K.2
Abe, M.3
Katagiri, S.4
Umeda, T.5
Kuwabara, H.6
-
62
-
-
84874199000
-
Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
-
Firenze, Italy, Sep
-
H. Kawahara, J. Estill, and O. Fujimura, "Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT, " in Proc. Int. Workshop Models Anal. Vocal Emissions Biomed. Appl. (MAVEBA), Firenze, Italy, Sep. 2001, pp. 1-6.
-
(2001)
Proc. Int. Workshop Models Anal. Vocal Emissions Biomed. Appl. (MAVEBA)
, pp. 1-6
-
-
Kawahara, H.1
Estill, J.2
Fujimura, O.3
-
63
-
-
44949143155
-
Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
-
Pittsburgh, PA, USA, Sep
-
Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation, " in Proc. INTERSPEECH, Pittsburgh, PA, USA, Sep. 2006, pp. 2266-2269.
-
(2006)
Proc. INTERSPEECH
, pp. 2266-2269
-
-
Ohtani, Y.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
64
-
-
84946750132
-
Mel-cepstrum modulation spectrum (MCMS) features for robust ASR
-
MAX Atria, Singapore, Nov
-
V. Tyagi, I. McCowan, H. Misra, and H. Bourlard, "Mel-cepstrum modulation spectrum (MCMS) features for robust ASR, " in Proc. Automat. Speech Recog. Understand. (ASRU), MAX Atria, Singapore, Nov. 2003, pp. 399-404.
-
(2003)
Proc. Automat. Speech Recog. Understand. (ASRU)
, pp. 399-404
-
-
Tyagi, V.1
McCowan, I.2
Misra, H.3
Bourlard, H.4
-
66
-
-
84870436666
-
-
Online Available
-
Amazon Mechanical Turk [Online]. Available: https://www.mturk.com/.
-
Amazon Mechanical Turk
-
-