SCOPUS Information Search Platform

7th ISCA Workshop on Speech Synthesis, SSW 2010

Volume: None, Issue: None, 2010, Pages 211-216

Recent Development of the HMM-based Singing Voice Synthesis System - Sinsy

Authors (6): Oura, Keiichiro (a); Mase, Ayami (a); Yamada, Tomohiko (a); Muto, Satoru (a); Nankaku, Yoshihiko (a); Tokuda, Keiichi (a)

(a) Nagoya Institute of Technology (Japan)

Author keywords

HMM based speech synthesis; singing voice synthesis

Indexed keywords

MUSIC; SPEECH SYNTHESIS;

CONTEXT DEPENDENT; HIDDEN MARKOV MODEL-BASED SPEECH SYNTHESIS; HIDDEN-MARKOV MODELS; MODEL-BASED OPC; PARAMETRIC APPROACH; SINGING VOICES; SINGING-VOICE SYNTHESIS; SPECTRA'S; SYNTHESISED; WAVEFORMS;

HIDDEN MARKOV MODELS;

EID: 84876667508; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited: 97

References (33)

1
- 85009139544
- Simultaneous Modeling of Spectrum, Pitch and Duration in HMM-Based Speech Synthesis
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Simultaneous Modeling of Spectrum, Pitch and Duration in HMM-Based Speech Synthesis,” Proc. of Eurospeech, pp. 2347-2350, 1999.
- (1999) Proc. of Eurospeech , pp. 2347-2350
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

2
- 33846463597
- Ph. D. thesis, Tokyo Institute of Technology
- J. Yamagishi, “Average-Voice-Based Speech Synthesis,” Ph. D. thesis, Tokyo Institute of Technology, 2006.
- (2006) Average-Voice-Based Speech Synthesis
- Yamagishi, J.¹

3
- 85135145847
- Speaker Interpolation in HMM-Based Speech Synthesis System
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Speaker Interpolation in HMM-Based Speech Synthesis System,” Proc. of Eurospeech, pp. 2523-2526, 1997.
- (1997) Proc. of Eurospeech , pp. 2523-2526
- Yoshimura, T.¹ Tokuda, K.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

4
- 85009257840
- Eigenvoices for HMM-Based Speech Synthesis
- K. Shichiri, A. Sawabe, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, “Eigenvoices for HMM-Based Speech Synthesis,” Proc. of ICSLP, pp. 1269-1272, 2002.
- (2002) Proc. of ICSLP , pp. 1269-1272
- Shichiri, K.¹ Sawabe, A.² Tokuda, K.³ Masuko, T.⁴ Kobayashi, T.⁵ Kitamura, T.⁶

5
- 50249141145
- An HMM-Based Singing Voice Synthesis System
- K. Saino, H. Zen, Y. Nankaku, A. Lee, and K. Tokuda, “An HMM-Based Singing Voice Synthesis System,” Proc. of ICSLP, pp. 1141-1144, 2006.
- (2006) Proc. of ICSLP , pp. 1141-1144
- Saino, K.¹ Zen, H.² Nankaku, Y.³ Lee, A.⁴ Tokuda, K.⁵

6
- 85133411876
- (in Japanese)
- HMM-Based Singing Voice Synthesis System (Sinsy), http://www.sinsy.jp/ (in Japanese).
- HMM-Based Singing Voice Synthesis System (Sinsy)

7
- 79952258981
- HMM-Based Speech Synthesis System (HTS), http://hts.sp.nitech.ac.jp/.
- HMM-Based Speech Synthesis System (HTS)

8
- 85063449886
- HMM-Based Speech Synthesis Engine (hts engine API), http://hts-engine.sourceforge.net/.
- HMM-Based Speech Synthesis Engine (hts engine API)

9
- 84878381939
- Speech Signal Processing Toolkit (SPTK), http://sptk.sourceforge.net/.
- Speech Signal Processing Toolkit (SPTK)

10
- 85133432081
- A Speech Analysis, Modification and Synthesis System (STRAIGHT), http://www.wakayama-u.ac.jp/kawahara/STRAIGHTadv/indexe.html.
- A Speech Analysis, Modification and Synthesis System (STRAIGHT)

11
- 85063539434
- CrestMuseXML Toolkit (CMX), http://cmx.sourceforge.jp/.
- CrestMuseXML Toolkit (CMX)

12
- 84869650341
- MusicXML Definition, http://musicxml.org/.
- MusicXML Definition

13
- 0038000318
- Spectral Estimation of Speech by Mel-Generalized Cepstral Analysis
- K. Tokuda, T. Kobayashi, T. Chiba, and S. Imai, “Spectral Estimation of Speech by Mel-Generalized Cepstral Analysis,” IEICE Trans. vol. 75-A, no. 7, pp. 1124-1134, 1992.
- (1992) IEICE Trans , vol.75-A , Issue.7 , pp. 1124-1134
- Tokuda, K.¹ Kobayashi, T.² Chiba, T.³ Imai, S.⁴

14
- 0033708106
- Speech Parameter Generation Algorithms for HMM-Based Speech Synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, “Speech Parameter Generation Algorithms for HMM-Based Speech Synthesis,” Proc. of ICASSP, pp. 1315-1318, 2000.
- (2000) Proc. of ICASSP , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

15
- 0020596154
- Cepstral Analysis Synthesis on the Mel Frequency Scale
- S. Imai, “Cepstral Analysis Synthesis on the Mel Frequency Scale,” Proc. of ICASSP, pp. 93-96, 1983.
- (1983) Proc. of ICASSP , pp. 93-96
- Imai, S.¹

16
- 44449177634
- A Hidden Semi-Markov Model-Based Speech Synthesis System
- H. Zen, T. Masuko, K. Tokuda, T. Kobayashi, and T. Kitamura, “A Hidden Semi-Markov Model-Based Speech Synthesis System,” IEICE Trans. Inf. & Sys., vol. 90D, no. 5, pp. 825-834, 2007.
- (2007) IEICE Trans. Inf. & Sys , vol.90D , Issue.5 , pp. 825-834
- Zen, H.¹ Masuko, T.² Tokuda, K.³ Kobayashi, T.⁴ Kitamura, T.⁵

17
- 68749108220
- A Fully Consistent Hidden Semi-Markov Model-Based Speech Recognition System
- 2008
- K. Oura, H. Zen, Y. Nankaku, A. Lee, and K. Tokuda, “A Fully Consistent Hidden Semi-Markov Model-Based Speech Recognition System,” IEICE Trans. Inf. and Syst., vol. E91-D, no. 11, pp. 2693-2700, 2008.
- (2008) IEICE Trans. Inf. and Syst , vol.E91-D , Issue.11 , pp. 2693-2700
- Oura, K.¹ Zen, H.² Nankaku, Y.³ Lee, A.⁴ Tokuda, K.⁵

18
- 0025475528
- ATR Japanese Speech Database as a Tool of Speech Recognition and Synthesis
- A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara, and K. Shikano, “ATR Japanese Speech Database as a Tool of Speech Recognition and Synthesis,” Speech Communication, vol. 9, pp. 357-363, 1990.
- (1990) Speech Communication , vol.9 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

19
- 79959831939
- HMM-Based Singing Voice Synthesis System Using Pitch-Shifted Pseudo Training Data
- (to be published)
- A. Mase, K. Oura, Y. Nankaku, and K. Tokuda, “HMM-Based Singing Voice Synthesis System Using Pitch-Shifted Pseudo Training Data,” Proc. of Interspeech, 2010 (to be published).
- (2010) Proc. of Interspeech
- Mase, A.¹ Oura, K.² Nankaku, Y.³ Tokuda, K.⁴

20
- 0033906251
- MDL-Based Context-Dependent Subword Modeling for Speech Recognition
- K. Shinoda and T. Watanabe, “MDL-Based Context-Dependent Subword Modeling for Speech Recognition,” J. Acoust. Soc. Jpn.(E), vol.21, no. 2, pp. 79-86, 2000.
- (2000) J. Acoust. Soc. Jpn.(E) , vol.21 , Issue.2 , pp. 79-86
- Shinoda, K.¹ Watanabe, T.²

21
- 85063493294
- Vibrato Modeling for HMM-Based Singing Voice Synthesis
- (in Japanese)
- T. Yamada, S. Muto, Y. Nankaku, S. Sako, and K. Tokuda, “Vibrato Modeling for HMM-Based Singing Voice Synthesis,” Proc. of Information Processing Society of Japan, vol. 2009-MUS-80, no. 5, pp. 1-6, 2009 (in Japanese).
- (2009) Proc. of Information Processing Society of Japan , vol.2009-MUS-80 , Issue.5 , pp. 1-6
- Yamada, T.¹ Muto, S.² Nankaku, Y.³ Sako, S.⁴ Tokuda, K.⁵

22
- 44949192112
- An Automatic Singing Skill Evaluation Method for Unknown Melodies Using Pitch Interval Accuracy and Vibrato Features
- T. Nakano, M. Goto, and Y. Hiraga, “An Automatic Singing Skill Evaluation Method for Unknown Melodies Using Pitch Interval Accuracy and Vibrato Features,” Proc. of Interspeech, pp. 1706-1709, 2006.
- (2006) Proc. of Interspeech , pp. 1706-1709
- Nakano, T.¹ Goto, M.² Hiraga, Y.³

23
- 0004213896
- Northern Illinois University Press
- J. Sundberg, “The Science of the Singing Voice,” Northern Illinois University Press, 1987.
- (1987) The Science of the Singing Voice
- Sundberg, J.¹

24
- 44949247517
- A Musical Ornament, the Vibrato
- McGraw-Hill Book Company
- C. E. Seashore, “A Musical Ornament, the Vibrato,” in Psychology of Music, McGraw-Hill Book Company, pp. 33-52, 1938.
- (1938) Psychology of Music , pp. 33-52
- Seashore, C. E.¹

25
- 85133408098
- Reducing Computational Cost of Training for HMM-Based Singing Voice Synthesis Using Note Boundaries
- 2-7-8 (in Japanese)
- S. Muto, K. Oura, Y. Nankaku, and K. Tokuda, “Reducing Computational Cost of Training for HMM-Based Singing Voice Synthesis Using Note Boundaries,” Proc. of Acoustical Society of Japan Spring Meeting, vol. I, 2-7-8, pp. 347-348, 2009 (in Japanese).
- (2009) Proc. of Acoustical Society of Japan Spring Meeting , vol.I , pp. 347-348
- Muto, S.¹ Oura, K.² Nankaku, Y.³ Tokuda, K.⁴

26
- 85133411482
- A New and Simplified BSD License, http://www.opensource.org/licenses/bsd-license.php.
- A New and Simplified BSD License

27
- 0032673049
- Restructuring Speech Representations Using a Pitch-Adaptive Time-Frequency Smoothing and an Instantaneous-Frequency-Based F0 Extraction: Possible Role of a Repetitive Structure in Sounds
- H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, “Restructuring Speech Representations Using a Pitch-Adaptive Time-Frequency Smoothing and an Instantaneous-Frequency-Based F0 Extraction: Possible Role of a Repetitive Structure in Sounds,” Speech Communication, vol. 27, pp. 187-207, 1999.
- (1999) Speech Communication , vol.27 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² de Cheveigné, A.³

28
- 77950574571
- Recent Development of the HMM-Based Speech Synthesis System (HTS)
- H. Zen, K. Oura, T. Nose, J. Yamagishi, S. Sako, T. Toda, T. Masuko, A. W. Black, and K. Tokuda, “Recent Development of the HMM-Based Speech Synthesis System (HTS),” Proc. of APSIPA, pp. 121-130, 2009.
- (2009) Proc. of APSIPA , pp. 121-130
- Zen, H.¹ Oura, K.² Nose, T.³ Yamagishi, J.⁴ Sako, S.⁵ Toda, T.⁶ Masuko, T.⁷ Black, A. W.⁸ Tokuda, K.⁹

29
- 4243101253
- The Hidden Markov Model Toolkit (HTK), http://htk.eng.cam.ac.uk/.
- The Hidden Markov Model Toolkit (HTK)

30
- 85133460481
- On CrestMuseXML (CMX) Toolkit Ver. 0.40
- (in Japanese)
- T. Kitahara and H. Katayose, “On CrestMuseXML (CMX) Toolkit Ver. 0.40,” IPSJ SIG Technical Report, vol. 2008-MUS-75, no. 17, pp. 95-100, 2008 (in Japanese).
- (2008) IPSJ SIG Technical Report , vol.2008-MUS-75 , Issue.17 , pp. 95-100
- Kitahara, T.¹ Katayose, H.²

31
- 0032678076
- Hidden Markov Models Based on Multi-Space Probability Distribution for Pitch Pattern Modeling
- K. Tokuda, T. Masuko, N. Miyazaki, and T. Kobayashi, “Hidden Markov Models Based on Multi-Space Probability Distribution for Pitch Pattern Modeling,” Proc. of ICASSP, vol. I, pp. 229-232, 1999.
- (1999) Proc. of ICASSP , vol.I , pp. 229-232
- Tokuda, K.¹ Masuko, T.² Miyazaki, N.³ Kobayashi, T.⁴

32
- 33846429403
- Minimum Generation Error Training for HMM-Based Speech Synthesis
- Y. J. Wu and R. H. Wang, “Minimum Generation Error Training for HMM-Based Speech Synthesis,” Proc. of ICASSP, vol. I, pp. 89-92, 2006.
- (2006) Proc. of ICASSP , vol.I , pp. 89-92
- Wu, Y. J.¹ Wang, R. H.²

33
- 33745200051
- Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis
- T. Toda and K. Tokuda, “Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis,” Proc. of Interspeech, pp. 2801-2804, 2005.
- (2005) Proc. of Interspeech , pp. 2801-2804
- Toda, T.¹ Tokuda, K.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.