SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2014, Pages 1504-1508

Measuring the perceptual effects of modelling assumptions in speech synthesis using stimuli constructed from repeated natural speech

(5) Henter, Gustav Eje (a); Merritt, Thomas (a); Shannon, Matt (b); Mayo, Catherine (a); King, Simon (a)

(a) UNIVERSITY OF EDINBURGH (United Kingdom)

(b) UNIVERSITY OF CAMBRIDGE (United Kingdom)

Author keywords

Acoustic modelling; Diagonal covariance matrices; Repeated speech; Speech synthesis; Stream independence

Indexed keywords

COVARIANCE MATRIX; SPEECH SYNTHESIS;

ACOUSTIC MODELLING; COVARIANCE MATRICES; NATURAL SPEECH; PERCEPTUAL EFFECTS; SPEECH CORPORA; STATISTICAL PARAMETRIC SPEECH SYNTHESIS; STREAM INDEPENDENCE; SUBJECTIVE LISTENING TEST;

SPEECH COMMUNICATION;

EID: 84910028520 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (35)

References (25)

1
- 84890516589
- The Blizzard challenge 2012
- S. King and V. Karaiskos, "The Blizzard Challenge 2012," in Proc. Blizzard Chall. Workshop, 2012.
- (2012) Proc. Blizzard Chall. Workshop
- King, S.¹ Karaiskos, V.²

2
- 84865801900
- The effect of using normalized models in statistical speech synthesis
- M. Shannon, H. Zen, and W. Byrne, "The effect of using normalized models in statistical speech synthesis," in Proc. Interspeech, 2011.
- (2011) Proc. Interspeech
- Shannon, M.¹ Zen, H.² Byrne, W.³

3
- 0033708106
- Speech parameter generation algorithms for HMM-based speech synthesis
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in Proc. ICASSP, vol. 3, 2000, pp. 1315-1318.
- (2000) Proc. ICASSP , vol.3 , pp. 1315-1318
- Tokuda, K.¹ Yoshimura, T.² Masuko, T.³ Kobayashi, T.⁴ Kitamura, T.⁵

4
- 84905253193
- An experimental comparison of multiple vocoder types
- Q. Hu, K. Richmond, J. Yamagishi, and J. Latorre, "An experimental comparison of multiple vocoder types," in Proc. SSW8, 2013, pp. 155-160.
- (2013) Proc. SSW8 , pp. 155-160
- Hu, Q.¹ Richmond, K.² Yamagishi, J.³ Latorre, J.⁴

5
- 0032638856
- Semi-tied covariance matrices for hidden Markov models
- M. J. F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE T. Speech Audi. P., vol. 7, no. 3, pp. 272-281, 1999.
- (1999) IEEE T. Speech Audi. P. , vol.7 , Issue.3 , pp. 272-281
- Gales, M.J.F.¹

6
- 85133720638
- The HMM-based speech synthesis system (HTS) version 2.0
- H. Zen, T. Nose, J. Yamagishi, S. Sako, T. Masuko, A. Black, and K. Tokuda, "The HMM-based speech synthesis system (HTS) version 2.0," in Proc. SSW 6, 2007, pp. 294-299.
- (2007) Proc. SSW , vol.6 , pp. 294-299
- Zen, H.¹ Nose, T.² Yamagishi, J.³ Sako, S.⁴ Masuko, T.⁵ Black, A.⁶ Tokuda, K.⁷

7
- 84910063941
- Investigating the shortcomings of HMM synthesis
- T. Merritt and S. King, "Investigating the shortcomings of HMM synthesis," in Proc. SSW8, 2013, pp. 185-190.
- (2013) Proc. SSW8 , pp. 185-190
- Merritt, T.¹ King, S.²

8
- 0004056285
- Spoken Language Processing
- X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing. Upper Saddle River, NJ: Prentice Hall, 2001, p. 12.
- (2001) Spoken Language Processing
- Huang, X.¹ Acero, A.² Hon, H.-W.³

9
- 70450184166
- An assessment of automatic recognition techniques for spontaneous speech in comparison with human performance
- T. Shinozaki and S. Furui, "An assessment of automatic recognition techniques for spontaneous speech in comparison with human performance," in Proc. SSPR, 2003.
- (2003) Proc. SSPR
- Shinozaki, T.¹ Furui, S.²

10
- 84858986605
- A comparison of automatic and human speech recognition in null grammar
- A. Juneja, "A comparison of automatic and human speech recognition in null grammar," J. Acoust. Soc. Am., vol. 131, no. 3, pp. EL256-EL261, 2012.
- (2012) J. Acoust. Soc. Am. , vol.131 , Issue.3 , pp. EL256-EL261
- Juneja, A.¹

11
- 84943154470
- Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch
- D. McAllaster, L. Gillick, F. Scattone, and M. Newman, "Fabricating conversational speech data with acoustic models: A program to examine model-data mismatch," in Proc. ICSLP, 1998.
- (1998) Proc. ICSLP
- McAllaster, D.¹ Gillick, L.² Scattone, F.³ Newman, M.⁴

12
- 84858952478
- Don't multiply lightly: Quantifying problems with the acoustic model assumptions in speech recognition
- D. Gillick, L. Gillick, and S. Wegmann, "Don't multiply lightly: Quantifying problems with the acoustic model assumptions in speech recognition," in Proc. ASRU, 2011, pp. 71-76.
- (2011) Proc. ASRU , pp. 71-76
- Gillick, D.¹ Gillick, L.² Wegmann, S.³

13
- 84856237844
- An introduction to statistical parametric speech synthesis
- S. King, "An introduction to statistical parametric speech synthesis," Sadhana, vol. 36, no. 5, pp. 837-852, 2011.
- (2011) Sadhana , vol.36 , Issue.5 , pp. 837-852
- King, S.¹

14
- 67651002140
- Statistical parametric speech synthesis
- H. Zen, K. Tokuda, and A. W. Black, "Statistical parametric speech synthesis," Speech Commun., vol. 51, no. 11, pp. 1039-1064, 2009.
- (2009) Speech Commun. , vol.51 , Issue.11 , pp. 1039-1064
- Zen, H.¹ Tokuda, K.² Black, A.W.³

15
- 0003418124
- Acoustic Theory of Speech Production: With Calculations Based on X-Ray Studies of Russian Articulations
- G. Fant, Acoustic Theory of Speech Production: With Calculations Based on X-Ray Studies of Russian Articulations, 2nd ed. The Hague, The Netherlands: Mouton & Co., 1970.
- (1970) Acoustic Theory of Speech Production: With Calculations Based on X-Ray Studies of Russian Articulations
- Fant, G.¹

16
- 33749573927
- Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
- H. Zen, K. Tokuda, and T. Kitamura, "Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences," Comput. Speech Lang., vol. 21, no. 1, pp. 153-173, 2007.
- (2007) Comput. Speech Lang. , vol.21 , Issue.1 , pp. 153-173
- Zen, H.¹ Tokuda, K.² Kitamura, T.³

17
- 84872190545
- Autoregressive models for statistical parametric speech synthesis
- M. Shannon, H. Zen, and W. Byrne, "Autoregressive models for statistical parametric speech synthesis," IEEE T. Audio Speech, vol. 21, no. 3, pp. 587-597, 2013.
- (2013) IEEE T. Audio Speech , vol.21 , Issue.3 , pp. 587-597
- Shannon, M.¹ Zen, H.² Byrne, W.³

18
- 0014568991
- IEEE recommended practice for speech quality measurements
- E. H. Rothauser, W. D. Chapman, N. Guttman, K. S. Nordby, H. R. Silbiger, G. E. Urbanek, and M. Weinstock, "IEEE recommended practice for speech quality measurements," IEEE T. Acoust. Speech, vol. 17, no. 3, pp. 225-246, 1969.
- (1969) IEEE T. Acoust. Speech , vol.17 , Issue.3 , pp. 225-246
- Rothauser, E.H.¹ Chapman, W.D.² Guttman, N.³ Nordby, K.S.⁴ Silbiger, H.R.⁵ Urbanek, G.E.⁶ Weinstock, M.⁷

19
- 84910047268
- Objective measurement of active speech level
- Objective measurement of active speech level, ITU Recommendation ITU-T P.56, International Telecommunication Union, Telecommunication Standardization Sector, Geneva, Switzerland, March 2011.
- (2011) ITU Recommendation ITU-T P.56, International Telecommunication Union

20
- 33750915991
- STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds
- H. Kawahara, "STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds," Acoust. Sci. Technol., vol. 27, no. 6, pp. 349-353, 2006.
- (2006) Acoust. Sci. Technol. , vol.27 , Issue.6 , pp. 349-353
- Kawahara, H.¹

21
- 84910053549
- Method for the subjective assessment of intermediate quality level of coding systems
- Method for the subjective assessment of intermediate quality level of coding systems, ITU Recommendation ITU-R BS.1534-1, International Telecommunication Union Radiocommunication Assembly, Geneva, Switzerland, March 2003.
- (2003) ITU Recommendation ITU-R BS.1534-1

22
- 38549096029
- A speech parameter generation algorithm considering global variance for HMM-based speech synthesis
- T. Tomoki and K. Tokuda, "A speech parameter generation algorithm considering global variance for HMM-based speech synthesis," IEICE Trans. Inf. Syst., vol. E90-D, no. 5, pp. 816-824, 2007.
- (2007) IEICE Trans. Inf. Syst. , vol.E90-D , Issue.5 , pp. 816-824
- Tomoki, T.¹ Tokuda, K.²

23
- 84878384520
- Ways to implement global variance in statistical speech synthesis
- H. Silén, E. Helander, J. Nurminen, and M. Gabbouj, "Ways to implement global variance in statistical speech synthesis," in Proc. Interspeech, 2012.
- (2012) Proc. Interspeech
- Silén, H.¹ Helander, E.² Nurminen, J.³ Gabbouj, M.⁴

24
- 84890495160
- Fast, low-artifact speech synthesis considering global variance
- M. Shannon and W. Byrne, "Fast, low-artifact speech synthesis considering global variance," in Proc. ICASSP, 2013, pp. 7869-7873.
- (2013) Proc. ICASSP , pp. 7869-7873
- Shannon, M.¹ Byrne, W.²

25
- 0003562364
- Modern Multidimensional Scaling: Theory and Applications
- I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, 2nd ed. New York, NY: Springer, 2005.
- (2005) Modern Multidimensional Scaling: Theory and Applications
- Borg, I.¹ Groenen, P.J.F.²

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS DB.