



Volume 2015-January, 2015, Pages 2207-2211

Deep neural network context embeddings for model selection in rich-context HMM synthesis

Author keywords

Deep neural networks; Embedding; Hidden Markov model; Rich context; Speech synthesis

Indexed keywords

DECISION TREES; LINGUISTICS; MARKOV PROCESSES; SPEECH COMMUNICATION; SPEECH SYNTHESIS; TREES (MATHEMATICS); TRELLIS CODES;

EID: 84959122693     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (6)

References (27)
  • 4
    • 84910105608
    • Measuring a decade of progress in text-to-speech
    • S. King, "Measuring a decade of progress in text-to-speech," Loquens, vol. 1, no. 1, 2014.
    • (2014) Loquens , vol.1 , Issue.1
    • King, S.1
  • 5
    • 38549096029
    • A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
    • May
    • T. Toda and K. Tokuda, "A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis," IEICE Transactions on Information and Systems, vol. E90-D, no. 5, pp. 816-824, May 2007.
    • (2007) IEICE Transactions on Information and Systems , vol.E90-D , Issue.5 , pp. 816-824
    • Toda, T.1    Tokuda, K.2
  • 6
    • 84856237844
    • An introduction to statistical parametric speech synthesis
    • S. King, "An introduction to statistical parametric speech synthesis," Sadhana, vol. 36, pp. 837-852, 2011.
    • (2011) Sadhana , vol.36 , pp. 837-852
    • King, S.1
  • 8
    • 67651002140
    • Statistical parametric speech synthesis
    • Nov.
    • H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Communication, vol. 51, no. 11, pp. 1039-1064, Nov. 2009.
    • (2009) Speech Communication , vol.51 , Issue.11 , pp. 1039-1064
    • Zen, H.1    Tokuda, K.2    Black, A.W.3
  • 9
    • 84946042252
    • Attributing modelling errors in HMM synthesis by stepping gradually from natural to modelled speech
    • T. Merritt, J. Latorre, and S. King, "Attributing modelling errors in HMM synthesis by stepping gradually from natural to modelled speech," in Proc. ICASSP, 2015.
    • (2015) Proc. ICASSP
    • Merritt, T.1    Latorre, J.2    King, S.3
  • 11
    • 84910070288
    • Investigating source and filter contributions, and their interaction, to statistical parametric speech synthesis
    • T. Merritt, T. Raitio, and S. King, "Investigating source and filter contributions, and their interaction, to statistical parametric speech synthesis," in Proc. Interspeech, 2014, pp. 1509-1513.
    • (2014) Proc. Interspeech , pp. 1509-1513
    • Merritt, T.1    Raitio, T.2    King, S.3
  • 12
    • 84910028520
    • Measuring the perceptual effects of modelling assumptions in speech synthesis using stimuli constructed from repeated natural speech
    • G. E. Henter, T. Merritt, M. Shannon, C. Mayo, and S. King, "Measuring the perceptual effects of modelling assumptions in speech synthesis using stimuli constructed from repeated natural speech," in Proc. Interspeech, 2014, pp. 1504-1508.
    • (2014) Proc. Interspeech , pp. 1504-1508
    • Henter, G.E.1    Merritt, T.2    Shannon, M.3    Mayo, C.4    King, S.5
  • 13
    • 70450161678
    • Rich context modeling for high quality HMM-based TTS
    • Z.-J. Yan, Y. Qian, and F. K. Soong, "Rich context modeling for high quality HMM-based TTS," in Proc. Interspeech, 2009, pp. 1755-1758.
    • (2009) Proc. Interspeech , pp. 1755-1758
    • Yan, Z.-J.1    Qian, Y.2    Soong, F.K.3
  • 14
    • 78049399368
    • Rich-context unit selection (RUS) approach to high quality TTS
    • Z.-J. Yan, Y. Qian, and F. K. Soong, "Rich-context unit selection (RUS) approach to high quality TTS," in Proc. ICASSP, 2010, pp. 4798-4801.
    • (2010) Proc. ICASSP , pp. 4798-4801
    • Yan, Z.-J.1    Qian, Y.2    Soong, F.K.3
  • 15
    • 84878421733
    • An evaluation of parameter generation methods with rich context models in HMM-based speech synthesis
    • S. Takamichi, T. Toda, Y. Shiga, H. Kawai, S. Sakti, and S. Nakamura, "An Evaluation of Parameter Generation Methods with Rich Context Models in HMM-Based Speech Synthesis," in Proc. Interspeech, 2012, pp. 1139-1142.
    • (2012) Proc. Interspeech , pp. 1139-1142
    • Takamichi, S.1    Toda, T.2    Shiga, Y.3    Kawai, H.4    Sakti, S.5    Nakamura, S.6
  • 17
    • 51449111086
    • A cross-language state mapping approach to bilingual (Mandarin-English) TTS
    • H. Liang, Y. Qian, F. K. Soong, and G. Liu, "A cross-language state mapping approach to bilingual (Mandarin-English) TTS," in Proc. ICASSP, 2008, pp. 4641-4644.
    • (2008) Proc. ICASSP , pp. 4641-4644
    • Liang, H.1    Qian, Y.2    Soong, F.K.3    Liu, G.4
  • 18
    • 84946033275
    • Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis
    • Z. Wu, C. Valentini-Botinhao, O. Watts, and S. King, "Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis," in ICASSP, 2015.
    • (2015) ICASSP
    • Wu, Z.1    Valentini-Botinhao, C.2    Watts, O.3    King, S.4
  • 19
    • 84910030525
    • Word embeddings for speech recognition
    • S. Bengio and G. Heigold, "Word Embeddings for Speech Recognition," in Proc. Interspeech, 2014, pp. 1053-1057.
    • (2014) Proc. Interspeech , pp. 1053-1057
    • Bengio, S.1    Heigold, G.2
  • 20
    • 44949153641
    • The target cost formulation in unit selection speech synthesis
    • P. Taylor, "The target cost formulation in unit selection speech synthesis," in Proc. Interspeech, 2006, pp. 2038-2041.
    • (2006) Proc. Interspeech , pp. 2038-2041
    • Taylor, P.1
  • 21
    • 34547516258
    • Approximating the Kullback-Leibler divergence between Gaussian mixture models
    • J. R. Hershey and P. A. Olsen, "Approximating the Kullback-Leibler divergence between Gaussian mixture models," in Proc. ICASSP, 2007.
    • (2007) Proc. ICASSP
    • Hershey, J.R.1    Olsen, P.A.2
  • 23
    • 33750915991
    • STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds
    • H. Kawahara, "STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds," Acoust. Sci. Technol., vol. 27, no. 6, pp. 349-353, 2006.
    • (2006) Acoust. Sci. Technol , vol.27 , Issue.6 , pp. 349-353
    • Kawahara, H.1
  • 25
    • 84959114033
    • Method for the subjective assessment of intermediate quality level of coding systems, ITU Recommendation ITU-R BS.1534-1, Geneva, Switzerland, March
    • Method for the subjective assessment of intermediate quality level of coding systems, ITU Recommendation ITU-R BS.1534-1, International Telecommunication Union Radiocommunication Assembly, Geneva, Switzerland, March 2003.
    • (2003) International Telecommunication Union Radiocommunication Assembly
  • 27
    • 84959127221
    • Are we using enough listeners? No! An empirically-supported critique of Interspeech 2014 TTS evaluations
    • M. Wester, C. Valentini-Botinhao, and G. E. Henter, "Are we using enough listeners? No! An empirically-supported critique of Interspeech 2014 TTS evaluations," in Proc. Interspeech, 2015.
    • (2015) Proc. Interspeech
    • Wester, M.1    Valentini-Botinhao, C.2    Henter, G.E.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.