SCOPUS Information Search Platform

Speech Communication

Volume 50, Issue 3, 2008, Pages 215-227

Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model

(3) Toda, Tomoki a; Black, Alan W b; Tokuda, Keiichi c

a Nara Institute of Science and Technology (Japan)

b Carnegie Mellon University (United States)

c Nagoya Institute of Technology (Japan)

Author keywords

Acoustic to articulatory inversion mapping; Articulatory to acoustic mapping; Dynamic features; GMM; MMSE

Indexed keywords

DATABASE SYSTEMS; GAUSSIAN DISTRIBUTION; MEAN SQUARE ERROR; PARAMETER ESTIMATION; SPEECH ANALYSIS; SPEECH SYNTHESIS;

ARTICULATORY MOVEMENTS; GAUSSIAN MIXTURE MODEL (GMM); SPEECH DATABASES;

ACOUSTIC SPECTROSCOPY;

EID: 38649140222 PISSN: 01676393 EISSN: None Source Type: Journal
DOI: 10.1016/j.specom.2007.09.001 Document Type: Article

Times cited : (217)

References (36)

1
- 0017968519
- Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer-sorting technique
- Atal B.S., Chang J.J., Mathews M.V., and Tukey J.W. Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer-sorting technique. J. Acoust. Soc. Amer. 63 (1978) 1535-1555
- (1978) J. Acoust. Soc. Amer. , vol.63 , pp. 1535-1555
- Atal, B.S.¹ Chang, J.J.² Mathews, M.V.³ Tukey, J.W.⁴

2
- 0034840906
- Chu, M., Peng, H., Yang, H., Chang, E., 2001. Selecting non-uniform units from a very large corpus for concatenative speech synthesizer. In: Proc. ICASSP. Salt Lake City, USA, pp. 785-788.

3
- 84994254645
- Frankel, J., Richmond, K., King, S., Taylor, P., 2000. An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces. In: Proc. ICSLP, Beijing, China, Vol. 4, pp. 254-257.

4
- 2142659020
- Estimation of articulatory movements from speech acoustics using an HMM-based speech production model
- Hiroya S., and Honda M. Estimation of articulatory movements from speech acoustics using an HMM-based speech production model. IEEE Trans. Speech Audio Process. 12 2 (2004) 175-185
- (2004) IEEE Trans. Speech Audio Process. , vol.12 , Issue.2 , pp. 175-185
- Hiroya, S.¹ Honda, M.²

5
- 2642528734
- Speaker adaptation method for acoustic-to-articulatory inversion using an HMM-based speech production model
- Hiroya S., and Honda M. Speaker adaptation method for acoustic-to-articulatory inversion using an HMM-based speech production model. IEICE Trans. Inf. Systems E87-D 5 (2004) 1071-1078
- (2004) IEICE Trans. Inf. Systems , vol.E87-D , Issue.5 , pp. 1071-1078
- Hiroya, S.¹ Honda, M.²

6
- 0029843107
- Accurate recovery of articulator positions from acoustics: new conclusions based on human data
- Hogden J., Lofqvist A., Gracco V., Zlokarnik I., Rubin P., and Saltzman E. Accurate recovery of articulator positions from acoustics: new conclusions based on human data. J. Acoust. Soc. Amer. 100 (1996) 1819-1834
- (1996) J. Acoust. Soc. Amer. , vol.100 , pp. 1819-1834
- Hogden, J.¹ Lofqvist, A.² Gracco, V.³ Zlokarnik, I.⁴ Rubin, P.⁵ Saltzman, E.⁶

7
- 0029765811
- Hunt, A.J., Black, A.W., 1996. Unit selection in a concatenative speech synthesis system using a large speech database. In: Proc. ICASSP, Atlanta, USA, pp. 373-376.

8
- 38649097792
- Kaburagi, T., Honda, M., 1998. Determination of the vocal tract spectrum from the articulatory movements based on the search of an articulatory-acoustic database. In: Proc. ICSLP, Sydney, Australia, pp. 433-436.

9
- 0031623661
- Kain, A., Macon, M.W., 1998. Spectral voice conversion for text-to-speech synthesis. In: Proc. ICASSP, Seattle, USA, pp. 285-288.

10
- 38649084966
- Kain, A., Niu, X., Hosom, J.-P., Miao, Q., van Santen, J., 2004. Formant re-synthesis of dysarthric speech. In: Proc. 5th ISCA Speech Synthesis Workshop, Pittsburgh, USA, pp. 25-30.

11
- 0032673049
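- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds
- Kawahara H., Masuda-Katsuse I., and de Cheveigné A. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds. Speech Comm. 27 3-4 (1999) 187-207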
- (1999) Speech Comm. , vol.27 , Issue.3-4 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² de Cheveigné, A.³

12
- 38649102765
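- Kawahara, H., Katayose, H., de Cheveigné, A., Patterson, R.D., 1999. Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity. In: Proc. EUROSPEECH, Budapest, Hungary, pp. 2781-2784.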

13
- 38649108449
- Kawai, H., Toda, T., Ni, J., Tsuzaki, M., Tokuda, K. 2004. XIMERA: a new TTS from ATR based on corpus-based technologies. In: Proc. 5th ISCA Speech Synthesis Workshop (SSW5). Pittsburgh, USA, pp. 179-184.

14
- 6344254321
- A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters
- Kello C.T., and Plaut D.C. A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters. J. Acoust. Soc. Amer. 116 4 (2004) 2354-2364
- (2004) J. Acoust. Soc. Amer. , vol.116 , Issue.4 , pp. 2354-2364
- Kello, C.T.¹ Plaut, D.C.²

15
- 33645796901
- Minami, Y., McDermott, E., Nakamura, A., Katagiri, S., 2004. A theoretical analysis of speech recognition based on feature trajectory models. In: Proc. INTERSPEECH, Jeju, Korea, pp. 549-552.

16
- 42649146508
- Nakamura, K., Toda, T., Nankaku, Y., Tokuda, K., 2006. On the use of phonetic information for mapping from articulatory movements to vocal tract spectrum. In: Proc. ICASSP. Toulouse, France, pp. 93-96.

17
- 0033692729
- Park, K.Y., Kim, H.S., 2000. Narrowband to wideband conversion of speech using GMM based transformation. In: Proc. ICASSP, Istanbul, pp. 1847-1850.

18
- 38649110974
- Richmond, K., 2001. Estimating articulatory parameters from the acoustic speech signal. Ph.D. Thesis, The Centre for Speech Technology Research, University of Edinburgh.

19
- 44949185845
- Richmond, K., 2006. A trajectory mixture density network for the acoustic-articulatory inversion mapping. In: Proc. INTERSPEECH, Pittsburgh, USA, pp. 577-580.

20
- 0038359547
- Modelling the uncertainty in recovering articulation from acoustics
- Richmond K., King S., and Taylor P. Modelling the uncertainty in recovering articulation from acoustics. Computer Speech Language 17 2 (2003) 153-172
- (2003) Computer Speech Language , vol.17 , Issue.2 , pp. 153-172
- Richmond, K.¹ King, S.² Taylor, P.³

21
- 0023756465
- Sagisaka, Y., 1988. Speech synthesis by rule using an optimal selection of non-uniform synthesis units. In: Proc. ICASSP, New York, USA, pp. 679-682.

22
- 0014077928
- Determination of the geometry of the human vocal tract by acoustic measurements
- Schroeder M.R. Determination of the geometry of the human vocal tract by acoustic measurements. J. Acoust. Soc. Amer. 41 (1967) 1002-1010
- (1967) J. Acoust. Soc. Amer. , vol.41 , pp. 1002-1010
- Schroeder, M.R.¹

23
- 0001736204
- Speech coding based on physiological models of speech production
- Furui S., and Sondhi M.M. (Eds), Marcel Dekker, New York
- Schroeter J., and Sondhi M.M. Speech coding based on physiological models of speech production. In: Furui S., and Sondhi M.M. (Eds). Advances in Speech Signal Processing (1992), Marcel Dekker, New York 231-267
- (1992) Advances in Speech Signal Processing , pp. 231-267
- Schroeter, J.¹ Sondhi, M.M.²

24
- 0028259480
- Techniques for estimating vocal-tract shapes from the speech signal
- Schroeter J., and Sondhi M.M. Techniques for estimating vocal-tract shapes from the speech signal. IEEE Trans. Speech Audio Process. 2 (1994) 133-150
- (1994) IEEE Trans. Speech Audio Process. , vol.2 , pp. 133-150
- Schroeter, J.¹ Sondhi, M.M.²

25
- 38649089738
- Shiga, Y., King, S. 2004. Accurate spectral envelope estimation for articulation-to-speech synthesis. In: Proc. 5th ISCA Speech Synthesis Workshop. Pittsburgh, USA, pp. 19-24.

26
- 84966270944
- Sondhi, M.M. 2002. Articulatory modeling: a possible role in concatenative text-to-speech synthesis. IEEE 2002 Workshop on Speech Synthesis, Santa Monica, USA.

27
- 0032026483
- Continuous probabilistic transform for voice conversion
- Stylianou Y., Cappé O., and Moulines E. Continuous probabilistic transform for voice conversion. IEEE Trans. Speech Audio Process. 6 2 (1998) 131-142
- (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.2 , pp. 131-142
- Stylianou, Y.¹ Cappé, O.² Moulines, E.³

28
- 38649115954
- Suzuki, S., Okadome, T., Honda, M., 1998. Determination of articulatory positions from speech acoustics by applying dynamic articulatory constraints. In: Proc. ICSLP. Sydney, Australia, pp. 2251-2254.

29
- 85001632375
- Syrdal, A.K., Wightman, C.W., Conkie, A., Stylianou, Y., Beutnagel, M., Schroeter, J., Strom, V., Lee, K.-S., Makashay, M.J., 2000. Corpus-based techniques in the AT&T NextGen synthesis system. In: Proc. ICSLP, Beijing, China, Vol. 3, pp. 410-415.

30
- 38649122984
- Toda, T., Black, A.W., Tokuda, K., 2004. Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis. In: Proc. 5th ISCA Speech Synthesis Workshop. Pittsburgh, USA, pp. 31-36.

31
- 85009061214
- Toda, T., Black, A.W., Tokuda, K., 2004. Acoustic-to-articulatory inversion mapping with Gaussian mixture model. In: Proc. INTERSPEECH. Jeju, Korea, pp. 1129-1132.

32
- 0033708106
- Tokuda, K., Yoshimura, T., Masuko, T., Kobayashi, T., Kitamura, T., 2000. Speech parameter generation algorithms for HMM-based speech synthesis. In: Proc. ICASSP, Istanbul, Turkey, pp. 1315-1318.

33
- 38649128116
- Wrench, A. 1999. The MOCHA-TIMIT articulatory database. http://www.cstr.ed.ac.uk/research/projects/artic/mocha.html, Queen Margaret University College.

34
- 85009089757
- Wrench, A.A., Richmond, K., 2000. Continuous speech recognition using articulatory data. In: Proc. ICSLP. Beijing, China, pp. 145-148.

35
- 33749573927
- Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
- Zen H., Tokuda K., and Kitamura T. Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences. Computer Speech Language 21 (2007) 153-173
- (2007) Computer Speech Language , vol.21 , pp. 153-173
- Zen, H.¹ Tokuda, K.² Kitamura, T.³

36
- 84946719891
- Zheng, Y., Liu, Z., Zhang, Z., Sinclair, M., Droppo, J., Deng, L., Acero, A., Huang, X., 2003. Air- and bone-conductive integrated microphones for robust speech detection and enhancement. In: Proc. ASRU, St. Thomas, USA, pp. 249-254.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.