SCOPUS Information Search Platform

2014 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA 2014

Volume , Issue , 2014, Pages

Modulation spectrum-based post-filter for GMM-based Voice Conversion

(4) Takamichi, Shinnosuke a,b; Toda, Tomoki a; Black, Alan W. b; Nakamura, Satoshi a

a Nara Institute of Science and Technology (Japan)

b Carnegie Mellon University ^* (United States)

Author keywords

[No Author keywords available]

Indexed keywords

HIDDEN MARKOV MODELS; MARKOV PROCESSES; SPEECH PROCESSING; TRELLIS CODES;

CONVERSION PROCESS; EXPERIMENTAL EVALUATION; GAUSSIAN MIXTURE MODEL; MODULATION SPECTRUM; QUALITY DEGRADATION; SPEECH-BASED SYSTEMS; STATISTICAL APPROACH; STATISTICAL MODELING;

MODULATION;

EID: 84949926049 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/APSIPA.2014.7041540 Document Type: Conference Paper

Times cited : (13)

References (18)

1
- 84905252904
- An evaluation of excitation feature prediction in a hybrid approach to electrolaryngeal speech enhancement
- Florence, Italy, May
- K. Tanaka, T. Toda, G. Neubig, S. Sakti, and S. Nakamura. An evaluation of excitation feature prediction in a hybrid approach to electrolaryngeal speech enhancement. In Proc. ICASSP, pp. 4521-4525, Florence, Italy, May 2014.
- (2014) Proc. ICASSP , pp. 4521-4525
- Tanaka, K.¹ Toda, T.² Neubig, G.³ Sakti, S.⁴ Nakamura, S.⁵

2
- 84905223321
- Regression approaches to perceptual age control in singing voice conversion
- Florence, Italy, May
- K. Kobayashi, T. Toda, T. Nakano, M. Goto, G. Neubig, S. Sakti, and S. Nakamura. Regression approaches to perceptual age control in singing voice conversion. In Proc. ICASSP, pp. 7954-7958, Florence, Italy, May 2014.
- (2014) Proc. ICASSP , pp. 7954-7958
- Kobayashi, K.¹ Toda, T.² Nakano, T.³ Goto, M.⁴ Neubig, G.⁵ Sakti, S.⁶ Nakamura, S.⁷

3
- 84865743435
- Speaker-adaptive speech synthesis based on eigenvoice conversion and language-dependent prosodic conversion in speech-to-speech translation
- Florence, Italy, Aug
- N. Hattori, T. Toda, H. Kawai, H. Saruwatari, and K. Shikano. Speaker-adaptive speech synthesis based on eigenvoice conversion and language-dependent prosodic conversion in speech-to-speech translation. In Proc. INTERSPEECH, pp. 2769-2772, Florence, Italy, Aug. 2011.
- (2011) Proc. INTERSPEECH , pp. 2769-2772
- Hattori, N.¹ Toda, T.² Kawai, H.³ Saruwatari, H.⁴ Shikano, K.⁵

4
- 84923814232
- Can voice conversion be used to reduce non-native accents?
- Florence, Italy, May
- S. Aryal and R. Gutierrez-Osuna. Can voice conversion be used to reduce non-native accents? In Proc. ICASSP, pp. 7929-7933, Florence, Italy, May 2014.
- (2014) Proc. ICASSP , pp. 7929-7933
- Aryal, S.¹ Gutierrez-Osuna, R.²

5
- 57749193836
- Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- T. Toda, A. W. Black, and K. Tokuda. Voice conversion based on maximum likelihood estimation of spectral parameter trajectory. IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, No. 8, pp. 2222-2235, 2007.
- (2007) IEEE Transactions on Audio, Speech and Language Processing , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.W.² Tokuda, K.³

6
- 78049361102
- Incorporation of mixed excitation model and postfilter into HMM-based text-to-speech synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura. Incorporation of mixed excitation model and postfilter into HMM-based text-to-speech synthesis. IEICE Trans. Inf. Syst., Vol. J87-D-II, No. 8, pp. 1563-1571, 2004.
- (2004) IEICE Trans. Inf. Syst. , vol.J87-D-II , Issue.8 , pp. 1563-1571
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

7
- 84905283795
- A frequency-weighted post-filtering transform for compensation of the over-smoothing effect in HMM-based speech synthesis
- Florence, Italy, May
- F. Eyben and Y. Agiomyrgiannakis. A frequency-weighted post-filtering transform for compensation of the over-smoothing effect in HMM-based speech synthesis. In Proc. ICASSP, pp. 275-279, Florence, Italy, May 2014.
- (2014) Proc. ICASSP , pp. 275-279
- Eyben, F.¹ Agiomyrgiannakis, Y.²

8
- 84878390910
- Implementation of computationally efficient real-time voice conversion
- Portland, Oregon, U. S., Sept
- T. Toda, T. Muramatsu, and H. Banno. Implementation of computationally efficient real-time voice conversion. In Proc. INTERSPEECH, Portland, Oregon, U. S., Sept. 2012.
- (2012) Proc. INTERSPEECH
- Toda, T.¹ Muramatsu, T.² Banno, H.³

9
- 84905281160
- Improving voice quality of HMM-based speech synthesis using voice conversion method
- Florence, Italy, May
- Y. Jiao, X. Na, and M. Tu. Improving voice quality of HMM-based speech synthesis using voice conversion method. In Proc. ICASSP, pp. 7964-7968, Florence, Italy, May 2014.
- (2014) Proc. ICASSP , pp. 7964-7968
- Jiao, Y.¹ Na, X.² Tu, M.³

10
- 84893234191
- Incorporating global variance in the training phase of GMM-based voice conversion
- Kaohsiung, Taiwan, Oct
- H. Hwang, Y. Tsao, H. Wang, Y. Wang, and S. Chen. Incorporating global variance in the training phase of GMM-based voice conversion. In Proc. APSIPA, pp. 1-6, Kaohsiung, Taiwan, Oct. 2013.
- (2013) Proc. APSIPA , pp. 1-6
- Hwang, H.¹ Tsao, Y.² Wang, H.³ Wang, Y.⁴ Chen, S.⁵

11
- 0028287770
- Effect of reducing slow temporal modulations on speech reception
- R. Drullman, J. M. Festen, and R. Plomp. Effect of reducing slow temporal modulations on speech reception. J. Acoust. Soc. of America, Vol. 95, pp. 2670-2680, 1994.
- (1994) J. Acoust. Soc. of America , vol.95 , pp. 2670-2680
- Drullman, R.¹ Festen, J.M.² Plomp, R.³

12
- 70349212558
- Phoneme recognition using spectral envelope and modulation frequency features
- Taipei, Taiwan, April
- S. Thomas, S. Ganapathy, and H. Hermansky. Phoneme recognition using spectral envelope and modulation frequency features. In Proc. ICASSP, pp. 4453-4456, Taipei, Taiwan, April 2009.
- (2009) Proc. ICASSP , pp. 4453-4456
- Thomas, S.¹ Ganapathy, S.² Hermansky, H.³

13
- 84905234422
- A postfilter to modify modulation spectrum in HMM-based speech synthesis
- Florence, Italy, May
- S. Takamichi, T. Toda, G. Neubig, S. Sakti, and S. Nakamura. A postfilter to modify modulation spectrum in HMM-based speech synthesis. In Proc. ICASSP, pp. 290-294, Florence, Italy, May 2014.
- (2014) Proc. ICASSP , pp. 290-294
- Takamichi, S.¹ Toda, T.² Neubig, G.³ Sakti, S.⁴ Nakamura, S.⁵

14
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- Istanbul, Turkey, June
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura. Speech parameter generation algorithms for HMM-based speech synthesis. In Proc. ICASSP, pp. 1315-1318, Istanbul, Turkey, June 2000.
- (2000) Proc. ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

15
- 6644226630
- A large-scale Japanese speech database
- Kobe, Japan, Nov
- Y. Sagisaka, K. Takeda, M. Abe, S. Katagiri, T. Umeda, and H. Kuwabara. A large-scale Japanese speech database. In ICSLP90, pp. 1089-1092, Kobe, Japan, Nov. 1990.
- (1990) ICSLP90 , pp. 1089-1092
- Sagisaka, Y.¹ Takeda, K.² Abe, M.³ Katagiri, S.⁴ Umeda, T.⁵ Kuwabara, H.⁶

16
- 84874199000
- Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT
- Firenze, Italy, Sept
- H. Kawahara, J. Estill, and O. Fujimura. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT. In MAVEBA 2001, pp. 1-6, Firenze, Italy, Sept. 2001.
- (2001) MAVEBA 2001 , pp. 1-6
- Kawahara, H.¹ Estill, J.² Fujimura, O.³

17
- 44949143155
- Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation
- Pittsburgh, U. S. A., Sep
- Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano. Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation. In Proc. INTERSPEECH, pp. 2266-2269, Pittsburgh, U. S. A., Sep. 2006.
- (2006) Proc. INTERSPEECH , pp. 2266-2269
- Ohtani, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

18
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds. Speech Commun., Vol. 27, No. 3-4, pp. 187-207, 1999.
- (1999) Speech Commun. , vol.27 , Issue.3-4 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² de Cheveigné, A.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.