SCOPUS Information Search Platform

Eurasip Journal on Audio, Speech, and Music Processing

Volume 2014, 2014

A preliminary demonstration of exemplar-based voice conversion for articulation disorders using an individuality-preserving dictionary

Authors (4): Aihara, Ryo (a); Takashima, Ryoichi (a); Takiguchi, Tetsuya (a); Ariki, Yasuo (a)

(a) Kobe University (Japan)

Author keywords

Articulation disorders; Assistive technologies; NMF; Voice conversion; Voice reconstruction

Indexed keywords

LINGUISTICS;

ASSISTIVE TECHNOLOGY; CEREBRAL PALSY; EXEMPLAR-BASED; NMF; NONNEGATIVE MATRIX FACTORIZATION; SPECTRAL CONVERSION; TARGET SPEAKER; VOICE CONVERSION;

SPEECH PROCESSING;

EID: 84901801701
PISSN: 16874714
EISSN: 16874722
Source Type: Journal
DOI: 10.1186/1687-4722-2014-5
Document Type: Article

Times cited: 17

References (30)

1
- 84964319691
- Capturing human hand motion in image sequences
- J Lin, W Ying, TS Huang, Capturing human hand motion in image sequences, in IEEE Workshop on Motion and Video Computing, Orlando, 5-6 Dec 2002 (IEEE, Piscataway, 2002), pp. 99-104
- (2002) IEEE Workshop on Motion and Video Computing , pp. 99-104
- Lin, J.¹ Ying, W.² Huang, T.S.³

2
- 0032304547
- Real-time American sign language recognition using desk and wearable computer based video
- T Starner, J Weaver, A Pentland, Real-time American sign language recognition using desk and wearable computer based video. IEEE Trans. Pattern Anal. Mach. Intell. 20(12), 1371-1375 (1998) (Pubitemid 128741382)
- (1998) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.20 , Issue.12 , pp. 1371-1375
- Starner, T.¹ Weaver, J.² Pentland, A.³

3
- 18844375067
- Large vocabulary sign language recognition based on hierarchical decision trees
- G Fang, W Gao, D Zhao, Large vocabulary sign language recognition based on hierarchical decision trees. 5th International Conference on Multimodal Interfaces. 34(3), 125-131 (2004)
- (2004) 5th International Conference on Multimodal Interfaces. , vol.34 , Issue.3 , pp. 125-131
- Fang, G.¹ Gao, W.² Zhao, D.³

4
- 10044237641
- Text detection from natural scene images: Towards a system for visually impaired persons
- N Ezaki, M Bulacu, L Schomaker, Text detection from natural scene images: towards a system for visually impaired persons. Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004). 2, 683-686 (2004)
- (2004) Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004) , vol.2 , pp. 683-686
- Ezaki, N.¹ Bulacu, M.² Schomaker, L.³

5
- 1542359445
- Unsupervised texture segmentation via wavelet-based locally orderless images (WLOIs) and SOM
- MK Bashar, T Matsumoto, Y Takeuchi, H Kudo, N Ohnishi, Unsupervised texture segmentation via wavelet-based locally orderless images (WLOIs) and SOM, in Computer Graphics and Imaging (IASTED/ACTA Press, Calgary, 2003), pp. 279-284
- (2003) Computer Graphics and Imaging (IASTED/ACTA Press, Calgary) , pp. 279-284
- Bashar, M.K.¹ Matsumoto, T.² Takeuchi, Y.³ Kudo, H.⁴ Ohnishi, N.⁵

6
- 0033220888
- TextFinder: An automatic system to detect and recognize text in images
- DOI 10.1109/34.809116
- V Wu, R Manmatha, EM Riseman, Textfinder: an automatic system to detect and recognize text in images. IEEE Trans. Pattern Anal. Mach. Intell. 21(11), 1224-1229 (1999) (Pubitemid 32211200)
- (1999) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.21 , Issue.11 , pp. 1224-1229
- Wu, V.¹ Manmatha, R.² Riseman, E.M.³

7
- 84901802715
- A basic design of wearable speech synthesizer for voice disorders [Japanese]
- K Yabu, T Ifukube, S Aomura, A basic design of wearable speech synthesizer for voice disorders [Japanese]. EIC Technical report (Institute Electron. Inf. Commun. Eng). 105(686), 59-64 (2006)
- (2006) EIC Technical Report (Institute Electron. Inf. Commun. Eng) , vol.105 , Issue.686 , pp. 59-64
- Yabu, K.¹ Ifukube, T.² Aomura, S.³

8
- 0003658713
- Campbell's Operative Orthopaedics
- ST Canale, WC Campbell, Campbell's Operative Orthopaedics, vol. 4 (Mosby-Year Book, St. Louis, 2002)
- (2002) Campbell's Operative Orthopaedics , vol.4
- Canale, S.T.¹ Campbell, W.C.²

9
- 78651564957
- Integration of metamodel and acoustic model for dysarthric speech recognition
- H Matsumasa, T Takiguchi, Y Ariki, I Li, T Nakabayashi, Integration of metamodel and acoustic model for dysarthric speech recognition. J. Multimedia. 4(4), 254-261 (2009)
- (2009) J. Multimedia. , vol.4 , Issue.4 , pp. 254-261
- Matsumasa, H.¹ Takiguchi, T.² Ariki, Y.³ Li, I.⁴ Nakabayashi, T.⁵

10
- 78650877992
- Multimodal speech recognition of a person with articulation disorders using AAM and MAF
- C Miyamoto, Y Komai, T Takiguchi, Y Ariki, I Li, Multimodal speech recognition of a person with articulation disorders using AAM and MAF, in IEEE International Workshop on Multimedia Signal Processing (MMSP'10), St. Malo, 4-6 Oct 2010 (IEEE, Piscataway, 2010), pp. 517-520
- (2010) IEEE International Workshop on Multimedia Signal Processing (MMSP'10) , pp. 517-520
- Miyamoto, C.¹ Komai, Y.² Takiguchi, T.³ Ariki, Y.⁴ Li, I.⁵

11
- 80051557209
- Automatic speech recognition systems for the evaluation of voice and speech disorders in head and neck cancer
- A Maier, T Haderlein, F Stelzle, E Noth, E Nkenke, F Rosanowski, A Schutzenberger, M Schuster, Automatic speech recognition systems for the evaluation of voice and speech disorders in head and neck cancer. EURASIP J. Audio Speech Music Process. 2010, 926951 (2010)
- (2010) EURASIP J. Audio Speech Music Process , vol.2010 , pp. 926951
- Maier, A.¹ Haderlein, T.² Stelzle, F.³ Noth, E.⁴ Nkenke, E.⁵ Rosanowski, F.⁶ Schutzenberger, A.⁷ Schuster, M.⁸

12
- 84898964201
- Algorithms for non-negative matrix factorization
- D Lee, HS Seung, Algorithms for non-negative matrix factorization, in Advances in Neural Information Processing 13 (NIPS 2000) (MIT Press, Massachusetts, 2001), pp. 556-562
- (2001) Advances in Neural Information Processing 13 (NIPS 2000) , pp. 556-562
- Lee, D.¹ Seung, H.S.²

13
- 50249152311
- Monaural sound source separation by non-negative matrix factorization with temporal continuity and sparseness criteria
- T Virtanen, Monaural sound source separation by non-negative matrix factorization with temporal continuity and sparseness criteria. IEEE Trans. Audio Speech Lang. Process. 15(3), 1066-1074 (2007)
- (2007) IEEE Trans. Audio Speech Lang. Process , vol.15 , Issue.3 , pp. 1066-1074
- Virtanen, T.¹

14
- 78049412911
- Noise robust exemplar-based connected digit recognition
- JF Gemmeke, T Virtanen, Noise robust exemplar-based connected digit recognition, in 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), Dallas, 14-19 March 2010 (IEEE, Piscataway, 2010), pp. 4546-4549
- (2010) 2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) , pp. 4546-4549
- Gemmeke, J.F.¹ Virtanen, T.²

15
- 44949110218
- Single-channel speech separation using sparse non-negative matrix factorization
- MN Schmidt, RK Olsson, Single-channel speech separation using sparse non-negative matrix factorization, in Interspeech 2006-ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, 17-21 Sept 2006 (Curran Associates, Inc., New York, 2006), pp. 2614-2617
- (2006) Interspeech 2006-ICSLP, Ninth International Conference on Spoken Language Processing , pp. 2614-2617
- Schmidt, M.N.¹ Olsson, R.K.²

16
- 57749193836
- Voice conversion based on maximum likelihood estimation of spectral parameter trajectory
- T Toda, A Black, K Tokuda, Voice conversion based on maximum likelihood estimation of spectral parameter trajectory. IEEE Trans. Audio Speech Lang. Process. 15(8), 2222-2235 (2007)
- (2007) IEEE Trans. Audio Speech Lang. Process , vol.15 , Issue.8 , pp. 2222-2235
- Toda, T.¹ Black, A.² Tokuda, K.³

17
- 84876497245
- GMM-based voice conversion applied to emotional speech synthesis
- Y Iwami, T Toda, H Saruwatari, K Shikano, GMM-based voice conversion applied to emotional speech synthesis. IEEE Trans. Speech Audio Process. 7, 2401-2404 (1999)
- (1999) IEEE Trans. Speech Audio Process , vol.7 , pp. 2401-2404
- Iwami, Y.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

18
- 84890451203
- GMM-Based emotional voice conversion using spectrum and prosody features
- R Aihara, R Takashima, T Takiguchi, Y Ariki, GMM-Based emotional voice conversion using spectrum and prosody features. Am. J. Signal Process. 2(5), 135-138 (2012)
- (2012) Am. J. Signal Process , vol.2 , Issue.5 , pp. 135-138
- Aihara, R.¹ Takashima, R.² Takiguchi, T.³ Ariki, Y.⁴

19
- 0032026483
- Continuous probabilistic transform for voice conversion
- PII S1063667698017386
- Y Stylianou, O Cappe, E Moulines, Continuous probabilistic transform for voice conversion. IEEE Trans. Speech Audio Process. 6(2), 131-142 (1998) (Pubitemid 128720639)
- (1998) IEEE Transactions on Speech and Audio Processing , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappe, O.² Moulines, E.³

20
- 77953712499
- Voice conversion using partial least squares regression
- E Helander, T Virtanen, J Nurminen, M Gabbouj, Voice conversion using partial least squares regression. IEEE Trans. Audio Speech Lang. Process. 18(5), 912-921 (2010)
- (2010) IEEE Trans. Audio Speech Lang. Process , vol.18 , Issue.5 , pp. 912-921
- Helander, E.¹ Virtanen, T.² Nurminen, J.³ Gabbouj, M.⁴

21
- 44949210554
- Map-based adaptation for speech conversion using adaptation data selection and non-parallel training
- CH Lee, CH Wu, Map-based adaptation for speech conversion using adaptation data selection and non-parallel training, in Interspeech 2006-ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, 17-21 Sept 2006 (Curran Associates, Inc., New York, 2006), pp. 2254-2257
- (2006) Interspeech 2006-ICSLP, Ninth International Conference on Spoken Language Processing , pp. 2254-2257
- Lee, C.H.¹ Wu, C.H.²

22
- 34547512822
- Eigenvoice conversion based on Gaussian mixture model
- T Toda, Y Ohtani, K Shikano, Eigenvoice conversion based on Gaussian mixture model, in Interspeech 2006-ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, 17-21 Sept 2006 (Curran Associates, Inc., New York, 2006), pp. 2446-2449
- (2006) Interspeech 2006-ICSLP, Ninth International Conference on Spoken Language Processing , pp. 2446-2449
- Toda, T.¹ Ohtani, Y.² Shikano, K.³

23
- 80052698826
- Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech
- K Nakamura, T Toda, H Saruwatari, K Shikano, Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech. Speech Commun. 54(1), 134-146 (2012)
- (2012) Speech Commun. , vol.54 , Issue.1 , pp. 134-146
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

24
- 44949265538
- Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speech
- K Nakamura, T Toda, H Saruwatari, K Shikano, Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speech, in Interspeech 2006-ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, 17-21 Sept 2006 (Curran Associates, Inc., New York, 2006), pp. 1395-1398
- (2006) Interspeech 2006-ICSLP, Ninth International Conference on Spoken Language Processing , pp. 1395-1398
- Nakamura, K.¹ Toda, T.² Saruwatari, H.³ Shikano, K.⁴

25
- 84878397216
- Using HMM-based speech synthesis to reconstruct the voice of individuals with degenerative speech disorders
- C Veaux, J Yamagishi, S King, Using HMM-based speech synthesis to reconstruct the voice of individuals with degenerative speech disorders, in 13th Annual Conference of the International Speech Communication Association 2012 (INTERSPEECH 2012), Portland, 9-13 September 2012 (Curran Associates, Inc. New York, 2012), pp. 966-969
- (2012) 13th Annual Conference of the International Speech Communication Association 2012 (INTERSPEECH 2012) , pp. 966-969
- Veaux, C.¹ Yamagishi, J.² King, S.³

26
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency based F0 extraction: Possible role of a repetitive structure in sounds
- H Kawahara, I Masuda-Katsuse, A Cheveigne, Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency based F0 extraction: possible role of a repetitive structure in sounds. Speech Commun. 27(3-4), 187-207 (1999)
- (1999) Speech Commun. , vol.27 , Issue.3-4 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² Cheveigne, A.³

27
- 79960657803
- Exemplar-based sparse representations for noise robust automatic speech recognition
- JF Gemmeke, T Virtanen, A Hurmalainen, Exemplar-based sparse representations for noise robust automatic speech recognition. IEEE Trans. Audio Speech Lang. Process. 19(7), 2067-2080 (2011)
- (2011) IEEE Trans. Audio Speech Lang. Process , vol.19 , Issue.7 , pp. 2067-2080
- Gemmeke, J.F.¹ Virtanen, T.² Hurmalainen, A.³

28
- 0025475528
- ATR Japanese speech database as a tool of speech recognition and synthesis
- A Kurematsu, K Takeda, Y Sagisaka, S Katagiri, H Kuwabara, K Shikano, ATR Japanese speech database as a tool of speech recognition and synthesis. Speech Commun. 9, 357-363 (1990)
- (1990) Speech Commun. , vol.9 , pp. 357-363
- Kurematsu, A.¹ Takeda, K.² Sagisaka, Y.³ Katagiri, S.⁴ Kuwabara, H.⁵ Shikano, K.⁶

29
- 78649238036
- Synthesizer voice quality of new languages calibrated with mean mel cepstral distortion
- J Kominek, T Schultz, AW Black, Synthesizer voice quality of new languages calibrated with mean mel cepstral distortion, in The International Workshop on Spoken Language Technology for Under-Resourced Languages (SLTU) (Hanoi University of Technology, Hanoi, 5-7 May 2008)
- (2008) The International Workshop on Spoken Language Technology for Under-Resourced Languages (SLTU)
- Kominek, J.¹ Schultz, T.² Black, A.W.³

30
- 84901814867
- ITU-T Recommendation P.800-P.899: Methods for Objective and Subjective Assessment of Quality
- International Telecommunication Union, ITU-T Recommendation P.800-P.899: Methods for Objective and Subjective Assessment of Quality (ITU, Geneva, 2003)
- (2003)

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.