SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 20, Issue 8, 2012, Pages 2301-2312

Foreign accent conversion through concatenative synthesis in the articulatory domain

(3) Felps, Daniel (a); Geng, Christian (b); Gutierrez Osuna, Ricardo (a)

a TEXAS A AND M UNIVERSITY (United States)

b UNIVERSITY OF EDINBURGH (United Kingdom)

Author keywords

Accent conversion; speaker recognition; speech perception; speech synthesis

Indexed keywords

ACOUSTIC FEATURES; ARTICULATORY FEATURES; ELECTROMAGNETIC ARTICULOGRAPHY; LISTENING TESTS; MEL-FREQUENCY CEPSTRAL COEFFICIENTS; NON-NATIVE SPEAKERS; SPEAKER DEPENDENTS; SPEAKER RECOGNITION; SPEECH PERCEPTION; STRONG COUPLING; VOCAL TRACT LENGTHS;

SPEECH RECOGNITION; SPEECH SYNTHESIS;

SPEECH PROCESSING;

EID: 84865392230 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2012.2201474 Document Type: Article

Times cited : (34)

References (43)

1
- 84935468520
- Cambridge U.K.: Newbury House
- T. Scovel, A Time to Speak: A Psycholinguistic Inquiry Into the Critical Period for Human Speech. Cambridge, U.K.: Newbury House, 1988.
- (1988) A Time to Speak: A Psycholinguistic Inquiry Into the Critical Period for Human Speech
- Scovel, T.¹

2
- 84972167927
- Native reactions to non-native speech: A review of empirical research
- M. Eisenstein, "Native reactions to non-native speech: A review of empirical research," Studies in Second Lang. Acquisit., vol. 5, no. 02, pp. 160-176, 1983.
- (1983) Studies in Second Lang. Acquisit. , vol.5 , Issue.2 , pp. 160-176
- Eisenstein, M.¹

3
- 67650581742
- An overview of spoken language technology for education
- M. Eskenazi, "An overview of spoken language technology for education," Speech Commun., vol. 51, no. 10, pp. 832-844, 2009.
- (2009) Speech Commun. , vol.51 , Issue.10 , pp. 832-844
- Eskenazi, M.¹

4
- 84937381349
- The pedagogy-technology interface in computer assisted pronunciation training
- A. Neri, C. Cucchiarini, and H. Strik et al., "The pedagogy-technology interface in computer assisted pronunciation training," Comput. Assist. Lang. Learn., vol. 15, no. 5, pp. 441-467, 2002.
- (2002) Comput. Assist. Lang. Learn. , vol.15 , Issue.5 , pp. 441-467
- Neri, A.¹ Cucchiarini, C.² Strik, H.³

5
- 4544358888
- Automatic speech recognition for second language learning: How and why it actually works
- A. Neri, C. Cucchiarini, and H. Strik, "Automatic speech recognition for second language learning: How and why it actually works," in Proc. Int. Congr. Phon. Sci., 2003, pp. 1157-1160.
- (2003) Proc. Int. Congr. Phon. Sci. , pp. 1157-1160
- Neri, A.¹ Cucchiarini, C.² Strik, H.³

6
- 23844449782
- Software that listens: It's not a question of whether, it's a question of how
- K. A. Wachowicz and B. Scott, "Software that listens: It's not a question of whether, it's a question of how," CALICO J., vol. 16, no. 3, pp. 253-276, 1999.
- (1999) CALICO J. , vol.16 , Issue.3 , pp. 253-276
- Wachowicz, K.A.¹ Scott, B.²

7
- 0040485015
- Negotiation of form, recasts, and explicit correction in relation to error types and learner repair in immersion classrooms
- R. Lyster, "Negotiation of form, recasts, and explicit correction in relation to error types and learner repair in immersion classrooms," Lang. Learn., vol. 51, no. s1, pp. 265-301, 2001. (Pubitemid 33281959)
- (2001) Language Learning , vol.51 , Issue.SUPPL. 1 , pp. 265-301
- Lyster, R.¹

8
- 67650668657
- English speech training using voice conversion
- K. Nagano and K. Ozawa, "English speech training using voice conversion," in Proc. ICSLP, 1990, pp. 1169-1172.
- (1990) Proc. ICSLP , pp. 1169-1172
- Nagano, K.¹ Ozawa, K.²

9
- 67650602764
- Lexical stress training of German compounds for Italian speakers by means of resynthesis and emphasis
- M. P. Bissiri, H. R. Pfitzinger, and H. G. Tillmann, "Lexical stress training of German compounds for Italian speakers by means of resynthesis and emphasis," in Proc. Austral. Int. Conf. Speech Sci. Tech., 2006, pp. 24-29.
- (2006) Proc. Austral. Int. Conf. Speech Sci. Tech. , pp. 24-29
- Bissiri, M.P.¹ Pfitzinger, H.R.² Tillmann, H.G.³

10
- 0036642569
- Enhancing foreign language tutors - In search of the golden speaker
- DOI 10.1016/S0167-6393(01)00009-7, PII S0167639301000097
- K. Probst, Y. Ke, and M. Eskenazi, "Enhancing foreign language tutors - In search of the golden speaker," Speech Commun., vol. 37, no. 3-4, pp. 161-173, 2002. (Pubitemid 34524837)
- (2002) Speech Communication , vol.37 , Issue.3-4 , pp. 161-173
- Probst, K.¹ Ke, Y.² Eskenazi, M.³

11
- 67650657780
- Foreign accent conversion in computer assisted pronunciation training
- D. Felps, H. Bortfeld, and R. Gutierrez-Osuna, "Foreign accent conversion in computer assisted pronunciation training," Speech Commun., vol. 51, no. 10, pp. 920-932, 2009.
- (2009) Speech Commun. , vol.51 , Issue.10 , pp. 920-932
- Felps, D.¹ Bortfeld, H.² Gutierrez-Osuna, R.³

12
- 25044464569
- The front-cavity/F2[prime] hypothesis tested by data on tongue movements
- D. J. Broad and H. Hermansky, "The front-cavity/F2[prime] hypothesis tested by data on tongue movements," J. Acoust. Soc. Amer., vol. 86, no. S1, pp. S113-S114, 1989.
- (1989) J. Acoust. Soc. Amer. , vol.86 , Issue.S1
- Broad, D.J.¹ Hermansky, H.²

13
- 85027461504
- Using articulatory position data in voice transformation
- A. Toth and A. Black, "Using articulatory position data in voice transformation," in Proc. ISCA Speech Synth. Workshop, 2007, pp. 182-185.
- (2007) Proc. ISCA Speech Synth. Workshop , pp. 182-185
- Toth, A.¹ Black, A.²

14
- 84937181324
- Listeners and disguised voices: The imitation and perception of dialectal accent
- D. Markham, "Listeners and disguised voices: The imitation and perception of dialectal accent," Forensic Linguist., vol. 6, no. 2, pp. 290-299, 1999.
- (1999) Forensic Linguist. , vol.6 , Issue.2 , pp. 290-299
- Markham, D.¹

15
- 84937385165
- Passing for a native speaker: Identity and success in second language learning
- I. Piller, "Passing for a native speaker: Identity and success in second language learning," J. Sociolinguist., vol. 6, no. 2, pp. 179-208, 2002.
- (2002) J. Sociolinguist. , vol.6 , Issue.2 , pp. 179-208
- Piller, I.¹

16
- 84910085270
- Foreign-language speech synthesis
- N. Campbell, "Foreign-language speech synthesis," in Proc. ISCA Speech Synth. Workshop, 1998, pp. 117-180.
- (1998) Proc. ISCA Speech Synth. Workshop , pp. 117-180
- Campbell, N.¹

17
- 67650666088
- Spoken language conversion with accent morphing
- M. Huckvale and K. Yanagisawa, "Spoken language conversion with accent morphing," in Proc. ISCA Speech Synth. Workshop, 2007, pp. 64-70.
- (2007) Proc. ISCA Speech Synth. Workshop , pp. 64-70
- Huckvale, M.¹ Yanagisawa, K.²

18
- 64349124465
- Analysis and synthesis of formant spaces of British, Australian, and American accents
- Q. Yan, S. Vaseghi, and D. Rentzos et al., "Analysis and synthesis of formant spaces of British, Australian, and American accents," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 2, pp. 676-689, 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.2 , pp. 676-689
- Yan, Q.¹ Vaseghi, S.² Rentzos, D.³

19
- 0032680858
- Implications of glottal source for speaker and dialect identification
- Phoenix, AZ
- L. R. Yanguas, T. F. Quatieri, and F. Goodman, "Implications of glottal source for speaker and dialect identification," in Proc. ICASSP, Phoenix, AZ, 1999, pp. 813-816.
- (1999) Proc. ICASSP , pp. 813-816
- Yanguas, L.R.¹ Quatieri, T.F.² Goodman, F.³

20
- 84865357645
- [Online]. Available:
- A. Wrench, MOCHA-TIMIT. [Online]. Available: http://www.cstr.ed.ac.uk/research/projects/artic/mocha.html
- MOCHA-TIMIT
- Wrench, A.¹

21
- 0003652255
- report Univ. of Wisconsin, Madison, WI
- J. R. Westbury, "X-ray microbeam speech production database tech. report," Univ. of Wisconsin, Madison, WI, 1994.
- (1994) X-Ray Microbeam Speech Production Database Tech
- Westbury, J.R.¹

22
- 33846669825
- Beyond 2D in articulatory data acquisition and analysis
- P. Hoole, A. Zierdt, and C. Geng, "Beyond 2D in articulatory data acquisition and analysis," in Proc. Int. Conf. Phon. Sci., 2003, pp. 265-268.
- (2003) Proc. Int. Conf. Phon. Sci. , pp. 265-268
- Hoole, P.¹ Zierdt, A.² Geng, C.³

23
- 79960575108
- Five-dimensional articulography
- P. Hoole and A. Zierdt, "Five-dimensional articulography," Speech Motor Control, pp. 331-349, 2010.
- (2010) Speech Motor Control , pp. 331-349
- Hoole, P.¹ Zierdt, A.²

24
- 51449085174
- Analysis-by-synthesis features for speech recognition
- Z. Al Bawab, R. Bhiksha, and R. M. Stern, "Analysis-by-synthesis features for speech recognition," in Proc. ICASSP, 2008, pp. 4185-4188.
- (2008) Proc. ICASSP , pp. 4185-4188
- Al Bawab, Z.¹ Bhiksha, R.² Stern, R.M.³

25
- 34247634965
- An articulatory model of the tongue based on a statistical analysis
- S. Maeda, "An articulatory model of the tongue based on a statistical analysis," J. Acoust. Soc. Amer., vol. 65, p. S22, 1979.
- (1979) J. Acoust. Soc. Amer. , vol.65
- Maeda, S.¹

26
- 0030677481
- Speech representation and transformation using adaptive interpolation of weighted spectrum: Vocoder revisited
- H. Kawahara, "Speech representation and transformation using adaptive interpolation of weighted spectrum: Vocoder revisited," in Proc. ICASSP, 1997, pp. 1303-1306.
- (1997) Proc. ICASSP , pp. 1303-1306
- Kawahara, H.¹

27
- 0032141206
- Cepstral domain segmental feature vector normalization for noise robust speech recognition
- PII S0167639398000338
- O. Viikki and K. Laurila, "Cepstral domain segmental feature vector normalization for noise robust speech recognition," Speech Commun., vol. 25, no. 1-3, pp. 133-147, 1998. (Pubitemid 128413638)
- (1998) Speech Communication , vol.25 , Issue.1-3 , pp. 133-147
- Viikki, O.¹ Laurila, K.²

28
- 51449115975
- Univ. of Cambridge, U.K. Tech. Rep
- K. Vertanen, "Baseline WSJ acoustic models for HTK and Sphinx: Training recipes and recognition experiments," Univ. of Cambridge, U.K., 2006, Tech. Rep.
- (2006) Baseline WSJ acoustic models for HTK and Sphinx: Training recipes and recognition experiments
- Vertanen, K.¹

29
- 34248705249
- Castilian Spanish
- E. Martínez-Celdrán, A. M. Fernández-Planas, and J. Carrera-Sabaté, "Castilian Spanish," J. Int. Phon. Assoc., vol. 33, no. 02, pp. 255-259, 2003.
- (2003) J. Int. Phon. Assoc. , vol.33 , Issue.2 , pp. 255-259
- Martínez-Celdrán, E.¹ Fernández-Planas, A.M.² Carrera-Sabaté, J.³

30
- 0029765811
- Unit selection in a concatenative speech synthesis system using a large speech database
- A. J. Hunt and A. W. Black, "Unit selection in a concatenative speech synthesis system using a large speech database," in Proc. ICASSP, 1996, pp. 373-376.
- (1996) Proc. ICASSP , pp. 373-376
- Hunt, A.J.¹ Black, A.W.²

31
- 34047123652
- Multisyn: Open-domain unit selection for the Festival speech synthesis system
- DOI 10.1016/j.specom.2007.01.014, PII S0167639307000398
- R. A. J. Clark, K. Richmond, and S. King, "Multisyn: Open-domain unit selection for the Festival speech synthesis system," Speech Commun., vol. 49, no. 4, pp. 317-330, 2007. (Pubitemid 46517714)
- (2007) Speech Communication , vol.49 , Issue.4 , pp. 317-330
- Clark, R.A.J.¹ Richmond, K.² King, S.³

32
- 0002609530
- Optimal coupling of diphones
- A. Conkie and S. Isard, "Optimal coupling of diphones," Progress in Speech Synth., pp. 293-304, 1997.
- (1997) Progress in Speech Synth. , pp. 293-304
- Conkie, A.¹ Isard, S.²

33
- 0036497601
- A comparison of spectral smoothing methods for segment concatenation based speech synthesis
- D. T. Chappell and J. H. L. Hansen, "A comparison of spectral smoothing methods for segment concatenation based speech synthesis," Speech Commun., vol. 36, no. 3-4, pp. 343-373, 2002.
- (2002) Speech Commun. , vol.36 , Issue.3-4 , pp. 343-373
- Chappell, D.T.¹ Hansen, J.H.L.²

34
- 70450161677
- Pulse density representation of spectrum for statistical speech processing
- Y. Shiga, "Pulse density representation of spectrum for statistical speech processing," in Proc. Interspeech, 2009, pp. 1771-1774.
- (2009) Proc. Interspeech , pp. 1771-1774
- Shiga, Y.¹

35
- 84965511190
- Evaluations of foreign accent in extemporaneous and read material
- M. Munro and T. Derwing, "Evaluations of foreign accent in extemporaneous and read material," Lang. Testing, vol. 11, pp. 253-266, 1994.
- (1994) Lang. Testing , vol.11 , pp. 253-266
- Munro, M.¹ Derwing, T.²

36
- 0031647824
- A frequency warping approach to speaker normalization
- Jan
- L. Lee and R. Rose, "A frequency warping approach to speaker normalization," IEEE Trans. Speech Audio Process., vol. 6, no. 1, pp. 49-60, Jan. 1998.
- (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.1 , pp. 49-60
- Lee, L.¹ Rose, R.²

37
- 84936526529
- On the quantal nature of speech
- K. N. Stevens, "On the quantal nature of speech," Phonetics, vol. 17, no. 1, pp. 3-45, 1989.
- (1989) Phonetics , vol.17 , Issue.1 , pp. 3-45
- Stevens, K.N.¹

38
- 77955426516
- Automatic voice onset time detection for unvoiced stops (/p/,/t/,/k/) with application to accent classification
- J. H. L. Hansen, S. S. Gray, and W. Kim, "Automatic voice onset time detection for unvoiced stops (/p/,/t/,/k/) with application to accent classification," Speech Commun., vol. 52, no. 10, pp. 777-789, 2010.
- (2010) Speech Commun. , vol.52 , Issue.10 , pp. 777-789
- Hansen, J.H.L.¹ Gray, S.S.² Kim, W.³

39
- 84966440972
- Integration of rule-based formant synthesis and waveform concatenation: A hybrid approach to text-to-speech synthesis
- S. R. Hertz, "Integration of rule-based formant synthesis and waveform concatenation: A hybrid approach to text-to-speech synthesis," in Proc. IEEE Workshop Speech Synth., 2002, pp. 87-90.
- (2002) Proc. IEEE Workshop Speech Synth. , pp. 87-90
- Hertz, S.R.¹

40
- 84865372824
- Evaluation of cross-language voice conversion using bilingual and non-bilingual databases
- M. Mashimo, T. Toda, and H. Kawanami et al., "Evaluation of cross-language voice conversion using bilingual and non-bilingual databases," in Proc. Interspeech, 2002.
- (2002) Proc. Interspeech
- Mashimo, M.¹ Toda, T.² Kawanami, H.³

41
- 0027227401
- Individual differences in vowel production
- K. Johnson, P. Ladefoged, and M. Lindau, "Individual differences in vowel production," J. Acoust. Soc. Amer., vol. 94, no. 2, pp. 701-714, 1993. (Pubitemid 23236850)
- (1993) Journal of the Acoustical Society of America , vol.94 , Issue.2 , pp. 701-714
- Johnson, K.¹ Ladefoged, P.² Lindau, M.³

42
- 68149181313
- A comparison of acoustic features for articulatory inversion
- M. Á. Carreira-Perpiñán and C. Qin, "A comparison of acoustic features for articulatory inversion," in Proc. Interspeech, 2007, pp. 2469-2472.
- (2007) Proc. Interspeech , pp. 2469-2472
- Carreira-Perpiñán, M.A.¹ Qin, C.²

43
- 79959852489
- Estimating missing data sequences in X-ray microbeam recordings
- Makuhari, Japan
- C. Qin and M. Á. Carreira-Perpiñán, "Estimating missing data sequences in X-ray microbeam recordings," in Proc. Interspeech, Makuhari, Japan, 2010, pp. 1592-1595.
- (2010) Proc. Interspeech , pp. 1592-1595
- Qin, C.¹ Carreira-Perpiñán, M.A.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.