



Volume 18, Issue 5, 2010, Pages 932-943

Supervisory data alignment for text-independent voice conversion

Author keywords

Data alignment; Self organized learning; Supervisory phonetic restriction; Text independent voice conversion

Indexed keywords

ALIGNMENT ACCURACY; CROSS-LINGUAL; DATA ALIGNMENTS; EVALUATION RESULTS; LINEAR ALIGNMENTS; NON-LINEAR METHODS; PARALLEL TRAINING; PARAMETER SPACES; PHONETIC INFORMATION; SELF ORGANIZED LEARNING; SELF-ORGANIZING LEARNING; SOURCE DATA; SOURCE PARAMETERS; TARGET SPACE; TARGET SPEAKER; TOPOLOGICAL STRUCTURE; TOPOLOGY PRESERVATION; VOICE CONVERSION;

EID: 77953724495     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2010.2041688     Document Type: Article
Times cited : (24)

References (41)
  • 2
    • 0024874920
    • Speaker adaptation applied to HMM and neural networks
    • Glasgow, U.K., May
    • S. Nakamura and K. Shikano, "Speaker adaptation applied to HMM and neural networks," in Proc. ICASSP, Glasgow, U.K., May 1989, pp. 89-92.
    • (1989) Proc. ICASSP , pp. 89-92
    • Nakamura, S.1    Shikano, K.2
  • 3
    • 84863268465
    • Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum
    • Rhodes, Greece
    • L. M. Arslan and D. Talkin, "Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum," in Proc. Eurospeech'97, Rhodes, Greece, 1997.
    • (1997) Proc. Eurospeech'97
    • Arslan, L.M.1    Talkin, D.2
  • 4
    • 0032026483
    • Continuous probabilistic transform for voice conversion
    • Mar.
    • Y. Stylianou, O. Cappé, and E. Moulines, "Continuous probabilistic transform for voice conversion," IEEE Trans. Speech Audio Process., vol.6, no.2, pp. 131-142, Mar. 1998.
    • (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.2 , pp. 131-142
    • Stylianou, Y.1    Cappé, O.2    Moulines, E.3
  • 5
    • 0031623661
    • Spectral voice conversion for text-to-speech synthesis
    • Seattle, WA, May
    • A. Kain and M. W. Macon, "Spectral voice conversion for text-to-speech synthesis," in Proc. ICASSP, Seattle, WA, May 1998, pp. 285-288.
    • (1998) Proc. ICASSP , pp. 285-288
    • Kain, A.1    Macon, M.W.2
  • 6
    • 0034842552
    • Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum
    • T. Toda, H. Saruwatari, and K. Shikano, "Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum," in Proc. ICASSP, 2001, pp. 841-844.
    • (2001) Proc. ICASSP , pp. 841-844
    • Toda, T.1    Saruwatari, H.2    Shikano, K.3
  • 7
    • 57749193836
    • Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
    • Nov.
    • T. Toda, A. W. Black, and K. Tokuda, "Voice conversion based on maximum likelihood estimation of spectral parameter trajectory," IEEE Trans. Audio, Speech, Lang. Process., vol.15, no.8, pp. 2222-2235, Nov. 2007.
    • (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.8 , pp. 2222-2235
    • Toda, T.1    Black, A.W.2    Tokuda, K.3
  • 8
    • 0026880275
    • Voice transformation using PSOLA technique
    • H. Valbret, E. Moulines, and J. P. Tubach, "Voice transformation using PSOLA technique," Speech Commun., vol.11, no.2-3, pp. 175-187, 1992.
    • (1992) Speech Commun. , vol.11 , Issue.2-3 , pp. 175-187
    • Valbret, H.1    Moulines, E.2    Tubach, J.P.3
  • 9
    • 0029254176
    • Transformation of formants for voice conversion using artificial neural networks
    • M. Narendranath, H. A. Murthy, S. Rajendran, and B. Yegnanarayana, "Transformation of formants for voice conversion using artificial neural networks," Speech Commun., vol.16, no.2, pp. 207-216, 1995.
    • (1995) Speech Commun. , vol.16 , Issue.2 , pp. 207-216
    • Narendranath, M.1    Murthy, H.A.2    Rajendran, S.3    Yegnanarayana, B.4
  • 10
    • 0029251946
    • Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks
    • N. Iwahashi and Y. Sagisaka, "Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks," Speech Commun., vol.16, no.2, pp. 139-151, 1995.
    • (1995) Speech Commun. , vol.16 , Issue.2 , pp. 139-151
    • Iwahashi, N.1    Sagisaka, Y.2
  • 11
    • 21544463108
    • VTLN-Based cross-language voice conversion
    • Virgin Islands
    • D. Suendermann, H. Ney, and H. Hoege, "VTLN-Based cross-language voice conversion," in Proc. ASRU'03, Virgin Islands, 2003.
    • (2003) Proc. ASRU'03
    • Suendermann, D.1    Ney, H.2    Hoege, H.3
  • 13
    • 85009102954
    • Voice conversion for unknown speakers
    • H. Ye and S. J. Young, "Voice conversion for unknown speakers," in Proc. ICSLP'04.
    • Proc. ICSLP'04
    • Ye, H.1    Young, S.J.2
  • 14
    • 0033154052
    • Speaker transformation algorithm using segmental codebooks
    • L. M. Arslan, "Speaker transformation algorithm using segmental codebooks," Speech Commun., vol.28, pp. 211-226, 1999.
    • (1999) Speech Commun. , vol.28 , pp. 211-226
    • Arslan, L.M.1
  • 16
    • 51449121435
    • Text-independent voice conversion based on state mapped codebook
    • M. Zhang, J. Tao, J. Tian, and X. Wang, "Text-independent voice conversion based on state mapped codebook," in Proc. ICASSP'08, 2008.
    • (2008) Proc. ICASSP'08
    • Zhang, M.1    Tao, J.2    Tian, J.3    Wang, X.4
  • 20
    • 0003009750
    • Acoustic phonetics
    • M. Joos, "Acoustic phonetics," Language, vol.24, pp. 1-136, 1948.
    • (1948) Language , vol.24 , pp. 1-136
    • Joos, M.1
  • 22
    • 0026400231
    • Robust and efficient quantization of speech LSF parameters using structured vector quantizers
    • R. Laroia, N. Phamdo, and N. Farvardin, "Robust and efficient quantization of speech LSF parameters using structured vector quantizers," in Proc. ICASSP'91, 1991.
    • (1991) Proc. ICASSP'91
    • Laroia, R.1    Phamdo, N.2    Farvardin, N.3
  • 23
    • 0028997003
    • Vector-field-smoothed Bayesian learning for incremental speaker adaptation
    • J. Takahashi and S. Sagayama, "Vector-field-smoothed Bayesian learning for incremental speaker adaptation," in Proc. ICASSP'95, 1995, vol.1, pp. 696-699.
    • (1995) Proc. ICASSP'95 , vol.1 , pp. 696-699
    • Takahashi, J.1    Sagayama, S.2
  • 25
    • 17744361925
    • New York: Springer, Graduate Texts in Mathematics
    • J. M. Lee, Introduction to Topological Manifolds. New York: Springer, 2000, vol.202, Graduate Texts in Mathematics.
    • (2000) Introduction to Topological Manifolds , vol.202
    • Lee, J.M.1
  • 26
    • 4544326792
    • Manifold learning using euclidean K-nearest neighbor graphs
    • J. A. Costa and A. O. Hero, "Manifold learning using euclidean K-nearest neighbor graphs," in Proc. ICASSP, 2004.
    • (2004) Proc. ICASSP
    • Costa, J.A.1    Hero, A.O.2
  • 27
    • 0003410791
    • Berlin/Heidelberg, Germany: Springer
    • T. Kohonen, Self-Organizing Maps. Berlin/Heidelberg, Germany: Springer, 1995, vol.30.
    • (1995) Self-Organizing Maps , vol.30
    • Kohonen, T.1
  • 28
    • 0001798623
    • Convergence in distribution of the one-dimensional Kohonen algorithms when the stimuli are not uniform
    • C. Bouton and G. Pagès, "Convergence in distribution of the one-dimensional Kohonen algorithms when the stimuli are not uniform," Adv. Appl. Probab., vol.26, no.1, pp. 80-103, 1994.
    • (1994) Adv. Appl. Probab. , vol.26 , Issue.1 , pp. 80-103
    • Bouton, C.1    Pagès, G.2
  • 29
    • 21344436353
    • On the a.s. convergence of the Kohonen algorithm with a general neighborhood function
    • J. C. Fort and G. Pagès, "On the a.s. convergence of the Kohonen algorithm with a general neighborhood function," Ann. Appl. Probab., vol.5, no.4, pp. 1177-1216, 1995.
    • (1995) Ann. Appl. Probab. , vol.5 , Issue.4 , pp. 1177-1216
    • Fort, J.C.1    Pagès, G.2
  • 32
    • 0030359624
    • Voice conversion based on topological feature maps and time-variant filtering
    • R. Ansgar, "Voice conversion based on topological feature maps and time-variant filtering," in Proc. ICSLP'96, pp. 1445-1448.
    • Proc. ICSLP'96 , pp. 1445-1448
    • Ansgar, R.1
  • 33
    • 34547806096
    • A self-organizing map with twin units capable of describing a nonlinear input-output relation applied to speech code vector mapping
    • Nov.
    • E. Uchino, K. Yano, and T. Azetsu, "A self-organizing map with twin units capable of describing a nonlinear input-output relation applied to speech code vector mapping," Inf. Sci.: Int. J., vol.177, no.21, pp. 4634-4644, Nov. 2007.
    • (2007) Inf. Sci.: Int. J. , vol.177 , Issue.21 , pp. 4634-4644
    • Uchino, E.1    Yano, K.2    Azetsu, T.3
  • 34
    • 77953701779
    • Embedding new data points for manifold learning via coordinate propagation
    • Long version: Knowledge and Information Systems Journal
    • S. Xiang, F. Nie, Y. Song, C. Zhang, and C. Zhang, "Embedding new data points for manifold learning via coordinate propagation," in Proc. PAKDD'07, 2007, Long version: Knowledge and Information Systems Journal.
    • (2007) Proc. PAKDD'07
    • Xiang, S.1    Nie, F.2    Song, Y.3    Zhang, C.4    Zhang, C.5
  • 35
    • 33947233031
    • Out-of-sample extensions for LLE, Isomap, MDS, Eigenmaps and spectral clustering
    • Cambridge, MA: MIT Press
    • Y. Bengio, J. Paiement, and P. Vincent, "Out-of-sample extensions for LLE, Isomap, MDS, Eigenmaps and spectral clustering," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 2004, vol.16.
    • (2004) Advances in Neural Information Processing Systems , vol.16
    • Bengio, Y.1    Paiement, J.2    Vincent, P.3
  • 36
    • 31544473466
    • Incremental nonlinear dimensionality reduction by manifold learning
    • Mar.
    • M. Law and A. K. Jain, "Incremental nonlinear dimensionality reduction by manifold learning," IEEE Trans. Pattern Anal. Mach. Intell., vol.28, no.3, pp. 377-391, Mar. 2006.
    • (2006) IEEE Trans. Pattern Anal. Mach. Intell. , vol.28 , Issue.3 , pp. 377-391
    • Law, M.1    Jain, A.K.2
  • 37
    • 22844435049
    • Incremental locally linear embedding
    • O. Kouropteva, O. Okun, and M. Pietikaeinen, "Incremental locally linear embedding," Pattern Recognition, vol.38, no.10, pp. 1764-1767, 2005.
    • (2005) Pattern Recognition , vol.38 , Issue.10 , pp. 1764-1767
    • Kouropteva, O.1    Okun, O.2    Pietikaeinen, M.3
  • 38
    • 67249142662
    • Phonetic Anchor based state mapping for text-independent voice conversion
    • M. Zhang, J. Tao, J. Nurminen, J. Tian, and X. Wang, "Phonetic Anchor based state mapping for text-independent voice conversion," in Proc. ICSP'08, 2008.
    • (2008) Proc. ICSP'08
    • Zhang, M.1    Tao, J.2    Nurminen, J.3    Tian, J.4    Wang, X.5
  • 39
    • 0032673049
    • Restructuring speech representations using pitch-adaptive time frequency smoothing and instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds
    • H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigne, "Restructuring speech representations using pitch-adaptive time frequency smoothing and instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds," Speech Commun., vol.27, pp. 187-207, 1999.
    • (1999) Speech Commun. , vol.27 , pp. 187-207
    • Kawahara, H.1    Masuda-Katsuse, I.2    De Cheveigne, A.3
  • 41
    • 0542366491
    • Efficient vector quantization of LPC parameters at 24 bits/frame
    • Jan.
    • K. Paliwal and B. Atal, "Efficient vector quantization of LPC parameters at 24 bits/frame," IEEE Trans. Speech Audio Process., vol.1, no.1, pp. 3-14, Jan. 1993.
    • (1993) IEEE Trans. Speech Audio Process. , vol.1 , Issue.1 , pp. 3-14
    • Paliwal, K.1    Atal, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.