Volume 2018-September, 2018, Pages 2833-2837

Investigation of using disentangled and interpretable representations for one-shot cross-lingual voice conversion

Author keywords

Cross lingual; One shot learning; Variational autoencoder; Voice conversion

Indexed keywords

LEARNING SYSTEMS; SPEECH COMMUNICATION

EID: 85054986055     PISSN: 2308457X     EISSN: 19909772     Source Type: Conference Proceeding    
DOI: 10.21437/Interspeech.2018-2525     Document Type: Conference Paper
Times cited: 12

References (33)
  • 1
    • 85046996130
    • Unsupervised learning of disentangled and interpretable representations from sequential data
    • W.-N. Hsu, Y. Zhang, and J. Glass, “Unsupervised learning of disentangled and interpretable representations from sequential data,” in Advances in Neural Information Processing Systems, 2017, pp. 1876-1887.
    • (2017) Advances in Neural Information Processing Systems, pp. 1876-1887
    • Hsu, W.-N.; Zhang, Y.; Glass, J.
  • 3
    • 85010399617
    • An overview of voice conversion systems
    • S. H. Mohammadi and A. Kain, “An overview of voice conversion systems,” Speech Communication, vol. 88, pp. 65-82, 2017.
    • (2017) Speech Communication, vol. 88, pp. 65-82
    • Mohammadi, S.H.; Kain, A.
  • 4
    • 57749193836
    • Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
    • T. Toda, A. W. Black, and K. Tokuda, “Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2222-2235, 2007.
    • (2007) IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 8, pp. 2222-2235
    • Toda, T.; Black, A.W.; Tokuda, K.
  • 6
    • 84869384026
    • Mixture of factor analyzers using priors from non-parallel speech for voice conversion
    • Z. Wu, T. Kinnunen, E. S. Chng, and H. Li, “Mixture of factor analyzers using priors from non-parallel speech for voice conversion,” IEEE Signal Processing Letters, vol. 19, no. 12, pp. 914-917, 2012.
    • (2012) IEEE Signal Processing Letters, vol. 19, no. 12, pp. 914-917
    • Wu, Z.; Kinnunen, T.; Chng, E.S.; Li, H.
  • 11
    • 85039165729
    • Siamese autoencoders for speech style extraction and switching applied to voice identification and conversion
    • S. H. Mohammadi and A. Kain, “Siamese autoencoders for speech style extraction and switching applied to voice identification and conversion,” Proceedings of Interspeech, pp. 1293-1297, 2017.
    • (2017) Proceedings of Interspeech, pp. 1293-1297
    • Mohammadi, S.H.; Kain, A.
  • 12
    • 84959297010
    • A multi-level GMM-based cross-lingual voice conversion using language-specific mixture weights for polyglot synthesis
    • B. Ramani, M. A. Jeeva, P. Vijayalakshmi, and T. Nagarajan, “A multi-level GMM-based cross-lingual voice conversion using language-specific mixture weights for polyglot synthesis,” Circuits, Systems, and Signal Processing, vol. 35, no. 4, pp. 1283-1311, 2016.
    • (2016) Circuits, Systems, and Signal Processing, vol. 35, no. 4, pp. 1283-1311
    • Ramani, B.; Jeeva, M.A.; Vijayalakshmi, P.; Nagarajan, T.
  • 28
    • 84976902575
    • WORLD: A vocoder-based high-quality speech synthesis system for real-time applications
    • M. Morise, F. Yokomori, and K. Ozawa, “WORLD: A vocoder-based high-quality speech synthesis system for real-time applications,” IEICE Transactions on Information and Systems, vol. 99, no. 7, pp. 1877-1884, 2016.
    • (2016) IEICE Transactions on Information and Systems, vol. 99, no. 7, pp. 1877-1884
    • Morise, M.; Yokomori, F.; Ozawa, K.
  • 29
    • 0031573117
    • Long short-term memory
    • S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
    • (1997) Neural Computation, vol. 9, no. 8, pp. 1735-1780
    • Hochreiter, S.; Schmidhuber, J.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.