-
1
-
-
0015112070
-
Speech analysis and synthesis by linear prediction of speech wave
-
doi:10.1121/1.1912679
-
Atal, B. S., & Hanauer, S. L. (1971). Speech analysis and synthesis by linear prediction of speech wave. The Journal of the Acoustical Society of America, 50(2b), 637-655. doi:10.1121/1.1912679
-
(1971)
The Journal of the Acoustical Society of America
, vol.50
, Issue.2
, pp. 637-655
-
-
Atal, B.S.1
Hanauer, S.L.2
-
2
-
-
0030166343
-
The SUS test: A method for the assessment of text-to-speech synthesis intelligibility using semantically unpredictable sentences
-
doi:10.1016/0167-6393(96)00026-X
-
Benoît, C., Grice, M., & Hazan, V. (1996). The SUS test: A method for the assessment of text-to-speech synthesis intelligibility using Semantically Unpredictable Sentences. Speech Communication, 18(4), 381-392. doi:10.1016/0167-6393(96)00026-X
-
(1996)
Speech Communication
, vol.18
, Issue.4
, pp. 381-392
-
-
Benoît, C.1
Grice, M.2
Hazan, V.3
-
3
-
-
33745216749
-
The Blizzard Challenge - 2005: Evaluating corpus-based speech synthesis on common datasets
-
Black, A., & Tokuda, K. (2005). The Blizzard Challenge - 2005: Evaluating corpus-based speech synthesis on common datasets. INTERSPEECH-2005, 77-80.
-
(2005)
INTERSPEECH-2005
, pp. 77-80
-
-
Black, A.1
Tokuda, K.2
-
4
-
-
85039153976
-
A biphone constrained concatenation method for diphone synthesis
-
Bunnell, H. T., Hoskins, S. R., & Yarrington, D. M. (1998). A biphone constrained concatenation method for diphone synthesis. SSW3-1998, 171-176.
-
(1998)
SSW3-1998
, pp. 171-176
-
-
Bunnell, H.T.1
Hoskins, S.R.2
Yarrington, D.M.3
-
5
-
-
84899187749
-
Schwa variants in American English
-
Bunnell, H. T., & Lilley, J. (2008). Schwa variants in American English. Proceedings: Interspeech, 2008, 1159-1162.
-
(2008)
Proceedings: Interspeech
, vol.2008
, pp. 1159-1162
-
-
Bunnell, H.T.1
Lilley, J.2
-
6
-
-
33745218768
-
Automatic personal synthetic voice construction
-
Bunnell, H. T., Pennington, C., Yarrington, D., & Gray, J. (2005). Automatic personal synthetic voice construction. INTERSPEECH-2005, 89-92.
-
(2005)
INTERSPEECH-2005
, pp. 89-92
-
-
Bunnell, H.T.1
Pennington, C.2
Yarrington, D.3
Gray, J.4
-
7
-
-
84941167756
-
Optimal coupling of diphones
-
Conkie, A., & Isard, S. (1994). Optimal coupling of diphones. SSW2-1994, 119-122.
-
(1994)
SSW2-1994
, pp. 119-122
-
-
Conkie, A.1
Isard, S.2
-
8
-
-
0023222647
-
Intelligibility of average talkers in typical listening environments
-
doi:10.1121/1.394512
-
Cox, R. M., Alexander, G. C., & Gilmore, C. (1987). Intelligibility of average talkers in typical listening environments. The Journal of the Acoustical Society of America, 81(5), 1598-1608. doi:10.1121/1.394512
-
(1987)
The Journal of the Acoustical Society of America
, vol.81
, Issue.5
, pp. 1598-1608
-
-
Cox, R.M.1
Alexander, G.C.2
Gilmore, C.3
-
10
-
-
0001109477
-
Coarticulation and theories of extrinsic timing
-
Fowler, C. A. (1980). Coarticulation and theories of extrinsic timing. Journal of Phonetics, 8, 113-133.
-
(1980)
Journal of Phonetics
, vol.8
, pp. 113-133
-
-
Fowler, C.A.1
-
11
-
-
56849127909
-
The breadth of coarticulatory units in children and adults
-
doi:10.1044/1092-4388(2008/07-0020)
-
Goffman, L., Smith, A., Heisler, L., & Ho, M. (2008). The breadth of coarticulatory units in children and adults. Journal of Speech, Language, and Hearing Research: JSLHR, 51(6), 1424-1437. doi:10.1044/1092-4388(2008/07-0020)
-
(2008)
Journal of Speech, Language, and Hearing Research: JSLHR
, vol.51
, Issue.6
, pp. 1424-1437
-
-
Goffman, L.1
Smith, A.2
Heisler, L.3
Ho, M.4
-
12
-
-
0006660131
-
-
Bloomington, IN: Indiana University Speech Research Laboratory
-
Greene, B. G., Manous, L. M., & Pisoni, D. B. (1984). Perceptual evaluation of DECtalk: A final report on version 1.8 (Progress Report No. 10). Bloomington, IN: Indiana University Speech Research Laboratory.
-
(1984)
Perceptual Evaluation of DECtalk: A Final Report on Version 1.8 (Progress Report No. 10)
-
-
Greene, B.G.1
Manous, L.M.2
Pisoni, D.B.3
-
13
-
-
0036711819
-
A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn
-
doi:10.1121/1.1498851
-
Hanson, H. M., & Stevens, K. N. (2002). A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn. The Journal of the Acoustical Society of America, 112(3), 1158-1182. doi:10.1121/1.1498851
-
(2002)
The Journal of the Acoustical Society of America
, vol.112
, Issue.3
, pp. 1158-1182
-
-
Hanson, H.M.1
Stevens, K.N.2
-
14
-
-
0001074490
-
From phoneme to morpheme
-
doi:10.2307/411036
-
Harris, Z. S. (1955). From phoneme to morpheme. Language, 31(2), 190-222. doi:10.2307/411036
-
(1955)
Language
, vol.31
, Issue.2
, pp. 190-222
-
-
Harris, Z.S.1
-
15
-
-
44949180195
-
A nucleus-based timing model applied to multi-dialect speech synthesis by rule
-
Hertz, S. R., & Huffman, M. K. (1992). A nucleus-based timing model applied to multi-dialect speech synthesis by rule. ICSLP-1992, 1171-1174.
-
(1992)
ICSLP-1992
, pp. 1171-1174
-
-
Hertz, S.R.1
Huffman, M.K.2
-
16
-
-
84899161284
-
Research on speech synthesis carried out during a visit to the Royal Institute of Technology, Stockholm, from November 1960 to March 1961
-
Holmes, J. N. (1961). Research on Speech Synthesis Carried out during a Visit to the Royal Institute of Technology, Stockholm, from November 1960 to March 1961. Joint Speech Research Unit Report JU 11.4, British Post Office, Eastcote, England.
-
(1961)
Joint Speech Research Unit Report JU 11.4, British Post Office, Eastcote, England
-
-
Holmes, J.N.1
-
17
-
-
0015699693
-
The influence of the glottal waveform on the naturalness of speech from a parallel formant synthesizer
-
Holmes, J. N. (1973). The influence of the glottal waveform on the naturalness of speech from a parallel formant synthesizer. IEEE Trans., AU-21, 298-305.
-
(1973)
IEEE Trans., AU-21
, pp. 298-305
-
-
Holmes, J.N.1
-
18
-
-
72249121867
-
VocaliD: Personalizing text-to-speech synthesis for individuals with severe speech impairment
-
-
Jreige, C., Patel, R., & Bunnell, H. T. (2009). VocaliD: Personalizing Text-to-Speech Synthesis for Individuals with Severe Speech Impairment. In Proceedings of ASSETS 2009.
-
(2009)
Proceedings of ASSETS 2009
-
-
Jreige, C.1
Patel, R.2
Bunnell, H.T.3
-
19
-
-
0018986665
-
Software for a cascade/parallel formant synthesizer
-
doi:10.1121/1.383940
-
Klatt, D. H. (1980). Software for a cascade/parallel formant synthesizer. The Journal of the Acoustical Society of America, 67(3), 971-995. doi:10.1121/1.383940
-
(1980)
The Journal of the Acoustical Society of America
, vol.67
, Issue.3
, pp. 971-995
-
-
Klatt, D.H.1
-
20
-
-
0023407575
-
Review of text-to-speech conversion for English
-
doi:10.1121/1.395275
-
Klatt, D. H. (1987). Review of text-to-speech conversion for English. The Journal of the Acoustical Society of America, 82(3), 737-793. doi:10.1121/1.395275
-
(1987)
The Journal of the Acoustical Society of America
, vol.82
, Issue.3
, pp. 737-793
-
-
Klatt, D.H.1
-
21
-
-
0026206653
-
Comparing discrimination and recognition of unfamiliar voices
-
doi:10.1016/0167-6393(91)90016-M
-
Kreiman, J., & Papcun, G. (1991). Comparing discrimination and recognition of unfamiliar voices. Speech Communication, 10(3), 265-275. doi:10.1016/0167-6393(91)90016-M
-
(1991)
Speech Communication
, vol.10
, Issue.3
, pp. 265-275
-
-
Kreiman, J.1
Papcun, G.2
-
22
-
-
0015404068
-
On the perception of coarticulation effects in English VCV syllables
-
Lehiste, I., & Shockey, L. (1972). On the perception of coarticulation effects in English VCV syllables. Journal of Speech and Hearing Research, 15(3), 500-506.
-
(1972)
Journal of Speech and Hearing Research
, vol.15
, Issue.3
, pp. 500-506
-
-
Lehiste, I.1
Shockey, L.2
-
23
-
-
0024344665
-
Segmental intelligibility of synthetic speech produced by rule
-
doi:10.1121/1.398236
-
Logan, J. S., Greene, B. G., & Pisoni, D. B. (1989). Segmental intelligibility of synthetic speech produced by rule. The Journal of the Acoustical Society of America, 86(2), 566-581. doi:10.1121/1.398236
-
(1989)
The Journal of the Acoustical Society of America
, vol.86
, Issue.2
, pp. 566-581
-
-
Logan, J.S.1
Greene, B.G.2
Pisoni, D.B.3
-
24
-
-
0038201502
-
Posterior pharyngeal wall position in the production of speech
-
doi:10.1044/1092-4388(2003/019)
-
Magen, H. S., Kang, A. M., Tiede, M. K., & Whalen, D. H. (2003). Posterior pharyngeal wall position in the production of speech. Journal of Speech, Language, and Hearing Research: JSLHR, 46(1), 241-251. doi:10.1044/1092-4388(2003/019)
-
(2003)
Journal of Speech, Language, and Hearing Research: JSLHR
, vol.46
, Issue.1
, pp. 241-251
-
-
Magen, H.S.1
Kang, A.M.2
Tiede, M.K.3
Whalen, D.H.4
-
25
-
-
0019531333
-
Perception of anticipatory coarticulation effects
-
doi:10.1121/1.385484
-
Martin, J. G., & Bunnell, H. T. (1981). Perception of anticipatory coarticulation effects. The Journal of the Acoustical Society of America, 69(2), 559-567. doi:10.1121/1.385484
-
(1981)
The Journal of the Acoustical Society of America
, vol.69
, Issue.2
, pp. 559-567
-
-
Martin, J.G.1
Bunnell, H.T.2
-
26
-
-
0020145847
-
Perception of anticipatory coarticulation effects in vowel-stop consonant-vowel sequences
-
doi:10.1037/0096-1523.8.3.473
-
Martin, J. G., & Bunnell, H. T. (1982). Perception of anticipatory coarticulation effects in vowel-stop consonant-vowel sequences. Journal of Experimental Psychology. Human Perception and Performance, 8(3), 473-488. doi:10.1037/0096-1523.8.3.473
-
(1982)
Journal of Experimental Psychology. Human Perception and Performance
, vol.8
, Issue.3
, pp. 473-488
-
-
Martin, J.G.1
Bunnell, H.T.2
-
27
-
-
0015613574
-
Articulatory model for the study of speech production
-
doi:10.1121/1.1913427
-
Mermelstein, P. (1973). Articulatory model for the study of speech production. The Journal of the Acoustical Society of America, 53(4), 1070-1082. doi:10.1121/1.1913427
-
(1973)
The Journal of the Acoustical Society of America
, vol.53
, Issue.4
, pp. 1070-1082
-
-
Mermelstein, P.1
-
28
-
-
0025543906
-
Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
-
doi:10.1016/0167-6393(90)90021-Z
-
Moulines, E., & Charpentier, F. (1990). Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Communication, 9(5-6), 453-467. doi:10.1016/0167-6393(90)90021-Z
-
(1990)
Speech Communication
, vol.9
, Issue.5-6
, pp. 453-467
-
-
Moulines, E.1
Charpentier, F.2
-
29
-
-
0026660215
-
The influence of talker differences on vowel identification by normal-hearing and hearing-impaired listeners
-
doi:10.1121/1.403973
-
Nabelek, A. K., Czyzewski, Z., & Krishnan, L. A. (1992). The influence of talker differences on vowel identification by normal-hearing and hearing-impaired listeners. The Journal of the Acoustical Society of America, 92(3), 1228-1246. doi:10.1121/1.403973
-
(1992)
The Journal of the Acoustical Society of America
, vol.92
, Issue.3
, pp. 1228-1246
-
-
Nabelek, A.K.1
Czyzewski, Z.2
Krishnan, L.A.3
-
31
-
-
0013871855
-
Coarticulation in VCV utterances: Spectrographic measurements
-
doi:10.1121/1.1909864
-
Öhman, S. E. G. (1966). Coarticulation in VCV Utterances: Spectrographic Measurements. The Journal of the Acoustical Society of America, 39(1), 151-168. doi:10.1121/1.1909864
-
(1966)
The Journal of the Acoustical Society of America
, vol.39
, Issue.1
, pp. 151-168
-
-
Öhman, S.E.G.1
-
32
-
-
84955017725
-
Segmentation techniques in speech synthesis
-
doi:10.1121/1.1909746
-
Peterson, G., Wang, W., & Sivertsen, E. (1958). Segmentation techniques in speech synthesis. The Journal of the Acoustical Society of America, 30, 739-742. doi:10.1121/1.1909746
-
(1958)
The Journal of the Acoustical Society of America
, vol.30
, pp. 739-742
-
-
Peterson, G.1
Wang, W.2
Sivertsen, E.3
-
33
-
-
34047275265
-
The IBM expressive text-to-speech synthesis system for American English
-
doi:10.1109/TASL.2006.876123
-
Pitrelli, J. F., Bakis, R., Eide, E. M., Fernandez, R., Hamza, W., & Picheny, M. A. (2006). The IBM expressive text-to-speech synthesis system for American English. IEEE Transactions on Audio Speech and Language Processing, 14(4), 1099-1108. doi:10.1109/TASL.2006.876123
-
(2006)
IEEE Transactions on Audio Speech and Language Processing
, vol.14
, Issue.4
, pp. 1099-1108
-
-
Pitrelli, J.F.1
Bakis, R.2
Eide, E.M.3
Fernandez, R.4
Hamza, W.5
Picheny, M.A.6
-
34
-
-
84878395202
-
Data-driven approach to rapid prototyping Xhosa speech synthesis
-
Roux, J. C., & Visagie, A. S. (2007). Data-driven approach to rapid prototyping Xhosa speech synthesis. SSW6-2007, 143-147.
-
(2007)
SSW6-2007
, pp. 143-147
-
-
Roux, J.C.1
Visagie, A.S.2
-
35
-
-
0023756465
-
Speech synthesis by rule using an optimal selection of non-uniform synthesis units
-
Sagisaka, Y. (1988). Speech synthesis by rule using an optimal selection of non-uniform synthesis units. IEEE ICASSP1988, 679-682.
-
(1988)
IEEE ICASSP1988
, pp. 679-682
-
-
Sagisaka, Y.1
-
36
-
-
85119213703
-
ToBI: A standard for labeling English prosody
-
Silverman, K., Beckman, M. E., Pitrelli, J., Ostendorf, M., Wightman, C., Price, P., et al. (1992). ToBI: a standard for labeling English prosody. Proceedings of the Second International Conference on Spoken Language Processing, 867-870.
-
(1992)
Proceedings of the Second International Conference on Spoken Language Processing
, pp. 867-870
-
-
Silverman, K.1
Beckman, M.E.2
Pitrelli, J.3
Ostendorf, M.4
Wightman, C.5
Price, P.6
-
37
-
-
84964193368
-
Segment inventories for speech synthesis
-
Sivertsen, E. (1961). Segment inventories for speech synthesis. Language and Speech, 4(1), 27-90.
-
(1961)
Language and Speech
, vol.4
, Issue.1
, pp. 27-90
-
-
Sivertsen, E.1
-
38
-
-
84912906590
-
Constraints among parameters simplify control of Klatt formant synthesizer
-
Stevens, K. N., & Bickley, C. A. (1991). Constraints among parameters simplify control of Klatt formant synthesizer. Journal of Phonetics, 19, 161-174.
-
(1991)
Journal of Phonetics
, vol.19
, pp. 161-174
-
-
Stevens, K.N.1
Bickley, C.A.2
-
39
-
-
84955022381
-
Development of a quantitative description of vowel articulation
-
doi:10.1121/1.1907943
-
Stevens, K. N., & House, A. S. (1955). Development of a quantitative description of vowel articulation. The Journal of the Acoustical Society of America, 27(3), 484-493. doi:10.1121/1.1907943
-
(1955)
The Journal of the Acoustical Society of America
, vol.27
, Issue.3
, pp. 484-493
-
-
Stevens, K.N.1
House, A.S.2
-
40
-
-
0003058857
-
On the basic scheme and algorithms in non-uniform unit speech synthesis
-
In G. Bailly, C. Benoît & T. R. Sawallis (Eds.), Amsterdam, The Netherlands: North-Holland Publishing Co
-
Takeda, K., Abe, K., & Sagisaka, Y. (1992). On the basic scheme and algorithms in non-uniform unit speech synthesis. In G. Bailly, C. Benoît & T. R. Sawallis (Eds.), Talking machines: Theories, models, and designs (pp. 93-105). Amsterdam, The Netherlands: North-Holland Publishing Co.
-
(1992)
Talking Machines: Theories, Models, and Designs
, pp. 93-105
-
-
Takeda, K.1
Abe, K.2
Sagisaka, Y.3
-
41
-
-
6344264628
-
Deriving text-to-speech durations from natural speech
-
In G. Bailly, C. Benoît & T. R. Sawallis (Eds.), Amsterdam, The Netherlands: North-Holland Publishing Co
-
van Santen, J. P. H. (1992). Deriving text-to-speech durations from natural speech. In G. Bailly, C. Benoît & T. R. Sawallis (Eds.), Talking machines: Theories, models, and designs (pp. 275-285). Amsterdam, The Netherlands: North-Holland Publishing Co.
-
(1992)
Talking Machines: Theories, Models, and Designs
, pp. 275-285
-
-
van Santen, J.P.H.1
-
42
-
-
0017877485
-
Correlates of psychological dimensions in talker similarity
-
Walden, B. E., Montgomery, A. A., Gibeily, G. J., Prosek, R. A., & Schwartz, D. M. (1978). Correlates of psychological dimensions in talker similarity. Journal of Speech and Hearing Research, 21(2), 265-275.
-
(1978)
Journal of Speech and Hearing Research
, vol.21
, Issue.2
, pp. 265-275
-
-
Walden, B.E.1
Montgomery, A.A.2
Gibeily, G.J.3
Prosek, R.A.4
Schwartz, D.M.5
-
43
-
-
84959174906
-
HMM-based synthesis of child speech
-
Watts, O., Yamagishi, J., Berkling, K., & King, S. (2008). HMM-Based Synthesis of Child Speech. 1st Workshop on Child, Computer and Interaction (ICMI'08 post-conference workshop).
-
(2008)
1st Workshop on Child, Computer and Interaction (ICMI'08 Post-conference Workshop)
-
-
Watts, O.1
Yamagishi, J.2
Berkling, K.3
King, S.4
-
44
-
-
0345189494
-
Predicting midsagittal pharynx shape from tongue position during vowel production
-
Whalen, D. H., Kang, A. M., Magen, H. S., Fulbright, R. K., & Gore, J. C. (1999). Predicting midsagittal pharynx shape from tongue position during vowel production. Journal of Speech, Language, and Hearing Research: JSLHR, 42(3), 592-603.
-
(1999)
Journal of Speech, Language, and Hearing Research: JSLHR
, vol.42
, Issue.3
, pp. 592-603
-
-
Whalen, D.H.1
Kang, A.M.2
Magen, H.S.3
Fulbright, R.K.4
Gore, J.C.5
-
45
-
-
0002489485
-
Context-sensitive coding, associative memory, and serial order in (speech) behavior
-
doi:10.1037/h0026823
-
Wickelgren, W. A. (1969). Context-sensitive coding, associative memory, and serial order in (speech) behavior. Psychological Review, 76, 1-15. doi:10.1037/h0026823
-
(1969)
Psychological Review
, vol.76
, pp. 1-15
-
-
Wickelgren, W.A.1
-
46
-
-
0348153016
-
Robust automatic extraction of diphones with variable boundaries
-
Yarrington, D., Bunnell, H. T., & Ball, G. (1995). Robust automatic extraction of diphones with variable boundaries. EUROSPEECH, 95, 1845-1848.
-
(1995)
EUROSPEECH
, vol.95
, pp. 1845-1848
-
-
Yarrington, D.1
Bunnell, H.T.2
Ball, G.3
-
47
-
-
67651002140
-
Statistical parametric speech synthesis
-
doi:10.1016/j.specom.2009.04.004
-
Zen, H., Tokuda, K., & Black, A. W. (2009). Statistical parametric speech synthesis. Speech Communication, 51(11), 1039-1064. doi:10.1016/j.specom.2009.04.004
-
(2009)
Speech Communication
, vol.51
, Issue.11
, pp. 1039-1064
-
-
Zen, H.1
Tokuda, K.2
Black, A.W.3
-
48
-
-
33749573927
-
Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
-
doi:10.1016/j.csl.2006.01.002
-
Zen, H., Tokuda, K., & Kitamura, T. (2007). Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences. Computer Speech & Language, 21(1), 153-173. doi:10.1016/j.csl.2006.01.002
-
(2007)
Computer Speech & Language
, vol.21
, Issue.1
, pp. 153-173
-
-
Zen, H.1
Tokuda, K.2
Kitamura, T.3