Volume 08-12-September-2016, 2016, Pages 2453-2457

Deep bidirectional LSTM modeling of timbre and prosody for emotional voice conversion

Author keywords

Long short term memory; Prosody; Recurrent neural networks; Voice conversion

Indexed keywords

BRAIN; RECURRENT NEURAL NETWORKS; SPEECH COMMUNICATION; WAVELET TRANSFORMS

EID: 84994251909     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: 10.21437/Interspeech.2016-1053     Document Type: Conference Paper
Times cited: 92

References (30)
  • 4
    • 0037384712
    • Vocal communication of emotion: A review of research paradigms
    • K. R. Scherer, (2003). Vocal communication of emotion: A review of research paradigms. Speech communication, 40(1), 227-256.
    • (2003) Speech Communication , vol.40 , Issue.1 , pp. 227-256
    • Scherer, K.R.1
  • 7
    • 84938935270
    • A system for transforming the emotion in speech: Combining data-driven conversion techniques for prosody and voice quality
    • August
    • Z. Inanoglu and S. Young, (2007, August). A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality. In INTERSPEECH (pp. 490-493).
    • (2007) INTERSPEECH , pp. 490-493
    • Inanoglu, Z.1    Young, S.2
  • 11
    • 84876502441
    • Review of F0 modelling and generation in HMM based speech synthesis
    • October
    • K. Yu, (2012, October). Review of F0 modelling and generation in HMM based speech synthesis. In IEEE 11th International Conference on Signal Processing (ICSP), (Vol. 1, pp. 599-604).
    • (2012) IEEE 11th International Conference on Signal Processing (ICSP) , vol.1 , pp. 599-604
    • Yu, K.1
  • 12
    • 77955722263
    • Hierarchical prosody conversion using regression-based clustering for emotional speech synthesis
    • C. H. Wu, C. C. Hsia, C. H. Lee and M. C. Lin, (2010). Hierarchical prosody conversion using regression-based clustering for emotional speech synthesis. IEEE Transactions on Audio, Speech, and Language Processing, 18(6), 1394-1405.
    • (2010) IEEE Transactions on Audio, Speech, and Language Processing , vol.18 , Issue.6 , pp. 1394-1405
    • Wu, C.H.1    Hsia, C.C.2    Lee, C.H.3    Lin, M.C.4
  • 14
    • 84864409462
    • Speech prosody: A methodological review
    • Y. Xu, (2011). Speech prosody: A methodological review. Journal of Speech Sciences, 1(1), 85-115.
    • (2011) Journal of Speech Sciences , vol.1 , Issue.1 , pp. 85-115
    • Xu, Y.1
  • 17
    • 84867194192
    • Multilevel parametric-base F0 model for speech synthesis
    • September
    • J. Latorre and M. Akamine, (2008, September). Multilevel parametric-base F0 model for speech synthesis. In INTERSPEECH (pp. 2274-2277).
    • (2008) INTERSPEECH , pp. 2274-2277
    • Latorre, J.1    Akamine, M.2
  • 18
  • 19
    • 84865714286
    • Stylization and trajectory modelling of short and long term speech prosody variations
    • August
    • N. Obin, A. Lacheret and X. Rodet, (2011, August). Stylization and trajectory modelling of short and long term speech prosody variations. In INTERSPEECH.
    • (2011) INTERSPEECH
    • Obin, N.1    Lacheret, A.2    Rodet, X.3
  • 23
    • 0035505385
    • LSTM recurrent networks learn simple context-free and context-sensitive languages
    • F. A. Gers and J. Schmidhuber, (2001). LSTM recurrent networks learn simple context-free and context-sensitive languages. IEEE Transactions on Neural Networks, 12(6), 1333-1340.
    • (2001) IEEE Transactions on Neural Networks , vol.12 , Issue.6 , pp. 1333-1340
    • Gers, F.A.1    Schmidhuber, J.2
  • 24
    • 84910046405
    • Long short-term memory recurrent neural network architectures for large vocabulary speech recognition
    • September
    • H. Sak, A. W. Senior, and F. Beaufays, (2014, September). Long short-term memory recurrent neural network architectures for large vocabulary speech recognition. In INTERSPEECH (pp. 338-342).
    • (2014) INTERSPEECH , pp. 338-342
    • Sak, H.1    Senior, A.W.2    Beaufays, F.3
  • 25
    • 84910047819
    • TTS synthesis with bidirectional LSTM based recurrent neural networks
    • September
    • Y. Fan, Y. Qian, F. L. Xie and F. K. Soong, (2014, September). TTS synthesis with bidirectional LSTM based recurrent neural networks. In INTERSPEECH (pp. 1964-1968).
    • (2014) INTERSPEECH , pp. 1964-1968
    • Fan, Y.1    Qian, Y.2    Xie, F.L.3    Soong, F.K.4
  • 27


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.