-
1
-
-
0002212488
-
Chunks and dependencies: Bringing processing evidence to bear on syntax
-
J. Cole, G. Green, and J. Morgan, Eds. CSLI
-
Abney, S. Chunks and dependencies: Bringing processing evidence to bear on syntax. In Computational Linguistics and the Foundations of Linguistic Theory, J. Cole, G. Green, and J. Morgan, Eds. CSLI (1995), pp. 145-164.
-
(1995)
Computational Linguistics and the Foundations of Linguistic Theory
, pp. 145-164
-
-
Abney, S.1
-
4
-
-
85135264071
-
Formant analysis and synthesis using hidden Markov models
-
Acero, A. Formant analysis and synthesis using hidden Markov models. In Proceedings of Eurospeech 1999 (1999).
-
(1999)
Proceedings of Eurospeech 1999
-
-
Acero, A.1
-
9
-
-
85009210634
-
Evolutionary weight tuning based on diphone pairs for unit selection speech synthesis
-
Alias, F., and Llora, X. Evolutionary weight tuning based on diphone pairs for unit selection speech synthesis. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Alias, F.1
Llora, X.2
-
13
-
-
0031055369
-
Towards articulatory-acoustic models for liquid consonants based on MRI and EPG data. Part II: The rhotics
-
Alwan, A., Narayanan, S., and Haker, K. Towards articulatory-acoustic models for liquid consonants based on MRI and EPG data. Part II: The rhotics. Journal of the Acoustical Society of America 101, 2 (1997), 1078-1089.
-
(1997)
Journal of the Acoustical Society of America
, vol.101
, Issue.2
, pp. 1078-1089
-
-
Alwan, A.1
Narayanan, S.2
Haker, K.3
-
15
-
-
85032415626
-
Unit selection synthesis database development using utterance verification
-
Amdal, I., and Svendsen, T. Unit selection synthesis database development using utterance verification. In Proceedings of Eurospeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech 2005
-
-
Amdal, I.1
Svendsen, T.2
-
18
-
-
85009153344
-
Long vowel detection for letter-to-sound conversion for Japanese sourced words transliterated into the alphabet
-
Asano, H., Nakajima, H., Mizuno, H., and Oku, M. Long vowel detection for letter-to-sound conversion for Japanese sourced words transliterated into the alphabet. In Proceedings of Interspeech 2004 (2004).
-
(2004)
Proceedings of Interspeech 2004
-
-
Asano, H.1
Nakajima, H.2
Mizuno, H.3
Oku, M.4
-
20
-
-
0015112070
-
Speech analysis and synthesis by linear prediction of the speech wave
-
Atal, B. S., and Hanauer, L. Speech analysis and synthesis by linear prediction of the speech wave. Journal of the Acoustical Society of America 50 (1971), 637-655.
-
(1971)
Journal of the Acoustical Society of America
, vol.50
, pp. 637-655
-
-
Atal, B.S.1
Hanauer, L.2
-
21
-
-
2942726537
-
On the phonetics and phonology of “segmental anchoring” of F0: Evidence from German
-
Atterer, M., and Ladd, D. R. On the phonetics and phonology of “segmental anchoring” of F0: Evidence from German. Journal of Phonetics 32 (2004), 177-197.
-
(2004)
Journal of Phonetics
, vol.32
, pp. 177-197
-
-
Atterer, M.1
Ladd, D.R.2
-
24
-
-
85009067857
-
Stochastic Suprasegmentals: Relationships between redundancy, prosodic structure and care of articulation in spontaneous speech
-
Aylett, M. P. Stochastic suprasegmentals: Relationships between redundancy, prosodic structure and care of articulation in spontaneous speech. In Proceedings of the International Conference on Speech and Language Processing 2000 (2000).
-
(2000)
Proceedings of the International Conference on Speech and Language Processing 2000
-
-
Aylett, M.P.1
-
25
-
-
85032425410
-
Synthesising hyperarticulation in unit selection TTS
-
Aylett, M. P. Synthesising hyperarticulation in unit selection TTS. In Proceedings of Eurospeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech 2005
-
-
Aylett, M.P.1
-
26
-
-
84930562270
-
A computational grammar of discourse-neutral prosodic phrasing in English
-
Bachenko, J., and Fitzpatrick, E. A computational grammar of discourse-neutral prosodic phrasing in English. Computational Linguistics 16, 3 (1990), 155-170.
-
(1990)
Computational Linguistics
, vol.16
, Issue.3
, pp. 155-170
-
-
Bachenko, J.1
Fitzpatrick, E.2
-
27
-
-
0032045825
-
Phonemic transcription by analogy in text-to-speech synthesis: Novel word pronunciation and lexicon compression
-
Bagshaw, P. Phonemic transcription by analogy in text-to-speech synthesis: Novel word pronunciation and lexicon compression. Computer Speech & Language 12, 2 (1998), 119-142.
-
(1998)
Computer Speech & Language
, vol.12
, Issue.2
, pp. 119-142
-
-
Bagshaw, P.1
-
28
-
-
85093707396
-
Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching
-
Bagshaw, P. C., Hiller, S. M., and Jack, M. A. Enhanced pitch tracking and the processing of F0 contours for computer aided intonation teaching. In Proceedings of Eurospeech 1993 (1993), pp. 1003-1006.
-
(1993)
Proceedings of Eurospeech
, vol.1993
, pp. 1003-1006
-
-
Bagshaw, P.C.1
Hiller, S.M.2
Jack, M.A.3
-
29
-
-
21844464777
-
No future for comprehensive models of intonation?
-
Y. Sagisaka, N. Campbell and N. Higuchi, Eds. Berlin: Springer-Verlag
-
Bailly, G. No future for comprehensive models of intonation? In Computing Prosody: Computational Models for Processing Spontaneous Speech, Y. Sagisaka, N. Campbell and N. Higuchi, Eds. Berlin: Springer-Verlag (1997), pp. 157-164.
-
(1997)
Computing Prosody: Computational Models for Processing Spontaneous Speech
, pp. 157-164
-
-
Bailly, G.1
-
30
-
-
0142216141
-
Audiovisual speech synthesis
-
Bailly, G., Berar, M., Elisei, F., and Odisio, M. Audiovisual speech synthesis. International Journal of Speech Technology 6 (2003), 331-346.
-
(2003)
International Journal of Speech Technology
, vol.6
, pp. 331-346
-
-
Bailly, G.1
Berar, M.2
Elisei, F.3
Odisio, M.4
-
31
-
-
21844440585
-
SFC: A trainable prosodic model
-
Bailly, G., and Holm, B. SFC: A trainable prosodic model. Speech Communication 46, 3-4 (2005), 348-364.
-
(2005)
Speech Communication
, vol.46
, Issue.3-4
, pp. 348-364
-
-
Bailly, G.1
Holm, B.2
-
32
-
-
0342484597
-
Compost: A rule compiler for speech synthesis
-
Bailly, G., and Tran, A. Compost: A rule compiler for speech synthesis. In Proceedings of Eurospeech 1989 (1989), pp. 136-139.
-
(1989)
Proceedings of Eurospeech
, vol.1989
, pp. 136-139
-
-
Bailly, G.1
Tran, A.2
-
33
-
-
0016663359
-
The DRAGON system - an overview
-
Baker, J. K. The DRAGON system - an overview. IEEE Transactions on Acoustics, Speech, and Signal Processing 23, 1 (1975), 24-29.
-
(1975)
IEEE Transactions on Acoustics, Speech, and Signal Processing
, vol.23
, Issue.1
, pp. 24-29
-
-
Baker, J.K.1
-
34
-
-
34748917145
-
Is there an emotion signature in intonational patterns? And can it be used in synthesis?
-
Banziner, T., Morel, M., and Scherer, K. Is there an emotion signature in intonational patterns? And can it be used in synthesis? In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Banziner, T.1
Morel, M.2
Scherer, K.3
-
35
-
-
0028531866
-
Characterization of rhythmic patterns for text-to-speech synthesis
-
Barbosa, P., and Bailly, G. Characterization of rhythmic patterns for text-to-speech synthesis. Speech Communication 15, 1 (1994), 127-137.
-
(1994)
Speech Communication
, vol.15
, Issue.1
, pp. 127-137
-
-
Barbosa, P.1
Bailly, G.2
-
36
-
-
0000353178
-
A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chain
-
Baum, L. E., Peterie, T., Souled, G., and Weiss, N. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chain. The Annals of Mathematical Statistics 41, 1 (1970), 249-336.
-
(1970)
The Annals of Mathematical Statistics
, vol.41
, Issue.1
, pp. 249-336
-
-
Baum, L.E.1
Peterie, T.2
Souled, G.3
Weiss, N.4
-
38
-
-
0141703269
-
Unsupervised, language-independent grapheme-to-phoneme conversion by latent analogy
-
Bellegarda, J. Unsupervised, language-independent grapheme-to-phoneme conversion by latent analogy. In Acoustics, Speech, and Signal Processing, 2003. Proceedings (ICASSP’03). 2003 IEEE International Conference (2003).
-
(2003)
Acoustics, Speech, and Signal Processing, 2003. Proceedings (ICASSP’03). 2003 IEEE International Conference
-
-
Bellegarda, J.1
-
39
-
-
85032421249
-
A novel discontinuity metric for unit selection text-to-speech synthesis
-
Bellegarda, J. R. A novel discontinuity metric for unit selection text-to-speech synthesis. In 5th ISCA Workshop on Speech Synthesis (2004).
-
(2004)
5th ISCA Workshop on Speech Synthesis
-
-
Bellegarda, J.R.1
-
40
-
-
33846442606
-
Large scale evaluation of corpus-based synthesizers: Results and lessons from the Blizzard Challenge 2005
-
Bennett, C. L. Large scale evaluation of corpus-based synthesizers: Results and lessons from the Blizzard Challenge 2005. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Bennett, C.L.1
-
49
-
-
85133526552
-
Automatically clustering similar units for unit selection in speech synthesis
-
Black, A., and Taylor, P. Automatically clustering similar units for unit selection in speech synthesis. In Proceedings of Eurospeech 1997 (1997), vol. 2, pp. 601-604.
-
(1997)
Proceedings of Eurospeech 1997
, vol.2
, pp. 601-604
-
-
Black, A.1
Taylor, P.2
-
50
-
-
33947682675
-
Blizzard Challenge 2006: Evaluating corpus-based speech synthesis on common datasets
-
Black, A., and Tokuda, K. Blizzard Challenge 2006: Evaluating corpus-based speech synthesis on common datasets. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Black, A.1
Tokuda, K.2
-
51
-
-
0030355540
-
Generating F0 contours from ToBI labels using linear regression
-
Black, A. W., and Hunt, A. J. Generating F0 contours from ToBI labels using linear regression. In Computer Speech and Language (1996).
-
(1996)
Computer Speech and Language
-
-
Black, A.W.1
Hunt, A.J.2
-
53
-
-
0342918775
-
CHATR: A generic speech synthesis system
-
Black, A. W., and Taylor, P. CHATR: A generic speech synthesis system. In COLING 1994 (1994), pp. 983-986.
-
(1994)
COLING
, vol.1994
, pp. 983-986
-
-
Black, A.W.1
Taylor, P.2
-
55
-
-
84925038181
-
-
The Festival Speech Synthesis System. Manual and source code available at
-
Black, A. W., Taylor, P., and Caley, R. The Festival Speech Synthesis System. Manual and source code available at http://www.cstr.ed.ac.uk/projects/festival.html, 1996-2006.
-
(1996)
-
-
Black, A.W.1
Taylor, P.2
Caley, R.3
-
57
-
-
0004123567
-
-
New York: Henry Holt
-
Bloomfield, L. Language. New York: Henry Holt (1933).
-
(1933)
Language
-
-
Bloomfield, L.1
-
58
-
-
84925038179
-
Speech perception: Phonetic aspects
-
W. J. Frawley, Ed., Oxford: Oxford University Press
-
Blumstein, S., and Cutler, A. Speech perception: Phonetic aspects. In International Encyclopedia of Language, W. J. Frawley, Ed., vol. 4. Oxford: Oxford University Press (2003).
-
(2003)
International Encyclopedia of Language
, vol.4
-
-
Blumstein, S.1
Cutler, A.2
-
59
-
-
33745089688
-
-
Cambridge, MA: MIT Press
-
Bod, R., Hay, J., and Jannedy, S. Probabilistic Linguistics. Cambridge, MA: MIT Press (1999).
-
(1999)
Probabilistic Linguistics
-
-
Bod, R.1
Hay, J.2
Jannedy, S.3
-
62
-
-
0042868018
-
Evaluation of grapheme-to-phoneme conversion for text-to-speech synthesis in French
-
Boula de Mareuil, P., Yvon, F., d’Alessandro, C. et al. Evaluation of grapheme-to-phoneme conversion for text-to-speech synthesis in French. Proceedings of First International Conference on Language Resources & Evaluation (1998), pp. 641-645.
-
(1998)
Proceedings of First International Conference on Language Resources & Evaluation
, pp. 641-645
-
-
Boula De Mareuil, P.1
Yvon, F.2
D’Alessandro, C.3
-
63
-
-
0030142722
-
Towards increasing speech recognition error rates
-
Boulard, H., Hermansky, H., and Morgan, N. Towards increasing speech recognition error rates. Speech Communication 18 (1996), 205-255.
-
(1996)
Speech Communication
, vol.18
, pp. 205-255
-
-
Boulard, H.1
Hermansky, H.2
Morgan, N.3
-
64
-
-
0004316316
-
-
Princeton, NJ: Princeton University Press
-
Boyer, C. B. History of Mathematics. Princeton, NJ: Princeton University Press (1985).
-
(1985)
History of Mathematics
-
-
Boyer, C.B.1
-
67
-
-
84867919822
-
Transformation-based error-driven learning and natural language processing: A case study in part of speech tagging
-
Brill, E. Transformation-based error-driven learning and natural language processing: A case study in part of speech tagging. Computational Linguistics 21, 4 (1995), 543-565.
-
(1995)
Computational Linguistics
, vol.21
, Issue.4
, pp. 543-565
-
-
Brill, E.1
-
68
-
-
0032688795
-
Modelling energy flow in the vocal tract with applications to glottal closure and opening detection
-
Brookes, D. M., and Loke, H. P. Modelling energy flow in the vocal tract with applications to glottal closure and opening detection. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1999 (1999).
-
(1999)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1999
-
-
Brookes, D.M.1
Loke, H.P.2
-
70
-
-
0027024362
-
Articulatory phonology: An overview
-
Browman, C. P., and Goldstein, L. Articulatory phonology: an overview. Phonetica 49 (1992), 155-180.
-
(1992)
Phonetica
, vol.49
, pp. 155-180
-
-
Browman, C.P.1
Goldstein, L.2
-
71
-
-
32244434943
-
An automatic extraction method of F0 generation model parameters
-
Bu, S., Yamamoto, M., and Itahashi, S. An automatic extraction method of F0 generation model parameters. IEICE Transactions on Information and Systems 89, 1 (2006), 305.
-
(2006)
IEICE Transactions on Information and Systems
, vol.89
, Issue.1
, pp. 305
-
-
Bu, S.1
Yamamoto, M.2
Itahashi, S.3
-
72
-
-
85009062747
-
Data driven intonation modelling of 6 languages
-
Buhmann, J., Vereecken, H., Fackrell, J., Martens, J. P., and Coile, B. V. Data driven intonation modelling of 6 languages. In Proceedings of the International Conference on Spoken Language Processing 2000 (2000).
-
(2000)
Proceedings of the International Conference on Spoken Language Processing 2000
-
-
Buhmann, J.1
Vereecken, H.2
Fackrell, J.3
Martens, J.P.4
Coile, B.V.5
-
74
-
-
85009102972
-
Unit selection for speech synthesis using splicing costs with weighted finite state transducers
-
Bulyko, I., and Ostendorf, M. Unit selection for speech synthesis using splicing costs with weighted finite state transducers. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Bulyko, I.1
Ostendorf, M.2
-
78
-
-
84936824214
-
Regularity and idiomaticity in grammatical constructions: The case of let alone
-
Fillmore, C. J., Kay, P., and O’Connor, C. Regularity and idiomaticity in grammatical constructions: The case of let alone. Language 64 (1988), 501-538.
-
(1988)
Language
, vol.64
, pp. 501-538
-
-
Fillmore, C.J.1
Kay, P.2
O’Connor, C.3
-
79
-
-
0002515370
-
The generation of affect in synthesized speech
-
Cahn, J. The generation of affect in synthesized speech. Journal of the American Voice I/O Society 8 (1990), 1-19.
-
(1990)
Journal of the American Voice I/O Society
, vol.8
, pp. 1-19
-
-
Cahn, J.1
-
81
-
-
85009207979
-
Towards synthesizing expressive speech; designing and collecting expressive speech data
-
Campbell, N. Towards synthesizing expressive speech; designing and collecting expressive speech data. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Campbell, N.1
-
83
-
-
0001717383
-
Syllable-based segmental duration
-
C. B. G. Bailly and T. R. Sawallis, Eds. Amsterdam: Elsevier Science Publishers
-
Campbell, W. N. Syllable-based segmental duration. In Talking Machines: Theories, Models and Designs, C. B. G. Bailly and T. R. Sawallis, Eds. Amsterdam: Elsevier Science Publishers (1992), pp. 211-224.
-
(1992)
Talking Machines: Theories, Models and Designs
, pp. 211-224
-
-
Campbell, W.N.1
-
85
-
-
0033677157
-
Speech reconstruction from mel frequency cepstral coefficients and pitch
-
Chazan, D., Hoory, R., Cohen, G., and Zibulsk, M. Speech reconstruction from mel frequency cepstral coefficients and pitch. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2000 (2000).
-
(2000)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2000
-
-
Chazan, D.1
Hoory, R.2
Cohen, G.3
Zibulsk, M.4
-
86
-
-
85009227369
-
Conditional and joint models for grapheme-to-phoneme conversion
-
Chen, S. F. Conditional and joint models for grapheme-to-phoneme conversion. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Chen, S.F.1
-
92
-
-
0034840906
-
Selecting non-uniform units from a very large corpus for concatenative speech synthesizer
-
Chu, M., Peng, H., Yang, H. Y., and Chang, E. Selecting non-uniform units from a very large corpus for concatenative speech synthesizer. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2001 (2001).
-
(2001)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2001
-
-
Chu, M.1
Peng, H.2
Yang, H.Y.3
Chang, E.4
-
93
-
-
0141480034
-
Microsoft Mulan - a bilingual TTS system
-
Chu, M., Peng, H., Zhao, Y., Niu, Z., and Chan, E. Microsoft Mulan - a bilingual TTS system. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2003 (2003).
-
(2003)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2003
-
-
Chu, M.1
Peng, H.2
Zhao, Y.3
Niu, Z.4
Chan, E.5
-
94
-
-
0025750735
-
A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams
-
Church, K. W., and Gale, W. A. A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams. Computer Speech and Language 5 (1991), 19-54.
-
(1991)
Computer Speech and Language
, vol.5
, pp. 19-54
-
-
Church, K.W.1
Gale, W.A.2
-
95
-
-
1842653460
-
-
Cambridge: Cambridge University Press
-
Clark, H. H. Using Language. Cambridge: Cambridge University Press (1996).
-
(1996)
Using Language
-
-
Clark, H.H.1
-
100
-
-
84958907242
-
The geometry of phonological features
-
Clements, G. N. The geometry of phonological features. Phonology Yearbook 2 (1985), pp. 225-252.
-
(1985)
Phonology Yearbook
, vol.2
, pp. 225-252
-
-
Clements, G.N.1
-
102
-
-
85009168875
-
Roadmaps, journeys and destinations: Speculations on the future of speech technology research
-
Cole, R. Roadmaps, journeys and destinations: Speculations on the future of speech technology research. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Cole, R.R.1
-
103
-
-
79959826505
-
Linguistic features weighting for a text-to-speech system without prosody model
-
Colotte, V., and Beaufort, R. Linguistic features weighting for a text-to-speech system without prosody model. In Proceedings of Eurospeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech 2005
-
-
Colotte, V.1
Beaufort, R.2
-
106
-
-
0003677575
-
The interconversion of audible and visible patterns as a basis for research in the perception of speech
-
Cooper, F. S., Liberman, A. M., and Borst, J. M. The interconversion of audible and visible patterns as a basis for research in the perception of speech. Proceedings of the National Academy of Science 37, 5 (1951), 318-325.
-
(1951)
Proceedings of the National Academy of Science
, vol.37
, Issue.5
, pp. 318-325
-
-
Cooper, F.S.1
Liberman, A.M.2
Borst, J.M.3
-
108
-
-
84985926077
-
Segment selection in the LH RealSpeak Laboratory TTS system
-
Coorman, G., Fackrell, J., Rutten, P., and Coile, B. V. Segment selection in the LH RealSpeak Laboratory TTS system. In Proceedings of the International Conference on Spoken Language Processing 2000 (2000).
-
(2000)
Proceedings of the International Conference on Spoken Language Processing 2000
-
-
Coorman, G.1
Fackrell, J.2
Rutten, P.3
Coile, B.V.4
-
109
-
-
0012236013
-
Automatic modeling of duration in a Spanish text-to-speech system using neural networks
-
Cordoba, R., Vallejo, J. A., Montero, J. M. et al. Automatic modeling of duration in a Spanish text-to-speech system using neural networks. In Proceedings of Eurospeech 1999 (1999).
-
(1999)
Proceedings of Eurospeech 1999
-
-
Cordoba, R.1
Vallejo, J.2
-
111
-
-
34249753618
-
Support-vector networks
-
Cortes, C., and Vapnik, V. Support-vector networks. Machine Learning 20, 3 (1995), 273-297.
-
(1995)
Machine Learning
, vol.20
, Issue.3
, pp. 273-297
-
-
Cortes, C.1
Vapnik, V.2
-
112
-
-
85032408336
-
Multimodal databases of everyday emotion: Facing up to complexity
-
Cowie, R., Devillers, L., Martin, J.-C. et al. Multimodal databases of everyday emotion: Facing up to complexity. In Proceedings of Eurospeech, Interspeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech, Interspeech 2005
-
-
Cowie, R.1
Devillers, L.2
Martin, J.-C.3
-
113
-
-
85032751766
-
Emotion recognition in human-computer interaction
-
Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N. et al. Emotion recognition in human-computer interaction. IEEE Signal Processing Magazine (2001), 32-80.
-
(2001)
IEEE Signal Processing Magazine
, pp. 32-80
-
-
Cowie, R.1
Douglas-Cowie, E.2
Tsapatsoulis, N.3
-
116
-
-
0023419762
-
A globally optimising formant tracker using generalised centroids
-
Crowe, A., and Jack, M. A. A globally optimising formant tracker using generalised centroids. Electronics Letters 23 (1987), 1019-1020.
-
(1987)
Electronics Letters
, vol.23
, pp. 1019-1020
-
-
Crowe, A.1
Jack, M.A.2
-
118
-
-
0029342671
-
Automatic pitch contour stylization using a model of tonal perception
-
d’Alessandro, C., and Mertens, P. Automatic pitch contour stylization using a model of tonal perception. In Computer Speech and Language (1995).
-
(1995)
Computer Speech and Language
-
-
D’Alessandro, C.1
Mertens, P.2
-
119
-
-
0032624182
-
Forgetting exceptions is harmful in language learning
-
Daelemans, W., Van Den Bosch, A., and Zavrel, J. Forgetting exceptions is harmful in language learning. Machine Learning 34, 1 (1999), 11-41.
-
(1999)
Machine Learning
, vol.34
, Issue.1
, pp. 11-41
-
-
Daelemans, W.1
Van Den Bosch, A.2
Zavrel, J.3
-
121
-
-
0029342671
-
Automatic pitch contour stylization using a model of tonal perception
-
d’Alessandro, C., and Mertens, P. Automatic pitch contour stylization using a model of tonal perception. Computer Speech & Language 9, 3 (1995), 257-288.
-
(1995)
Computer Speech & Language
, vol.9
, Issue.3
, pp. 257-288
-
-
D’Alessandro, C.1
Mertens, P.2
-
122
-
-
0039666139
-
Pronunciation By Analogy: Impact of implementational choices on performance
-
Damper, R., and Eastmond, J. Pronunciation by analogy: Impact of implementational choices on performance. Language and Speech 40, 1 (1997), 1-23.
-
(1997)
Language and Speech
, vol.40
, Issue.1
, pp. 1-23
-
-
Damper, R.1
Eastmond, J.2
-
123
-
-
0033106614
-
Evaluating the pronunciation component of text-to-speech systems for English: A performance comparison of different approaches
-
Damper, R., Marchand, Y., Adamson, M., and Gustafson, K. Evaluating the pronunciation component of text-to-speech systems for English: A performance comparison of different approaches. Computer Speech and Language 13, 2 (1999), 155-176.
-
(1999)
Computer Speech and Language
, vol.13
, Issue.2
, pp. 155-176
-
-
Damper, R.1
Marchand, Y.2
Adamson, M.3
Gustafson, K.4
-
127
-
-
0025796358
-
Pronounce: A program for pronunciation by analogy
-
Dedina, M., and Nusbaum, H. Pronounce: A program for pronunciation by analogy. Computer Speech & Language (Print) 5, 1 (1991), 55-64.
-
(1991)
Computer Speech & Language (Print)
, vol.5
, Issue.1
, pp. 55-64
-
-
Dedina, M.1
Nusbaum, H.P.2
-
129
-
-
85009211881
-
Tracking vocal tract resonances using an analytical nonlinear predictor and a target guided temporal constraint
-
Deng, L., Bazzi, I., and Acero, A. Tracking vocal tract resonances using an analytical nonlinear predictor and a target guided temporal constraint. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Deng, L.1
Bazzi, I.2
Acero, A.3
-
130
-
-
84979899767
-
Prosodic cues for emotion characterization in real-life spoken dialogs
-
Devillers, L., and Vasilescu, I. Prosodic cues for emotion characterization in real-life spoken dialogs. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Devillers, L.1
Vasilescu, I.2
-
131
-
-
85032421967
-
A neural network approach for the design of the target cost function in unit-selection speech synthesis
-
Diaz, F. C., Alba, J. L., and Banga, E. R. A neural network approach for the design of the target cost function in unit-selection speech synthesis. In Proceedings of Eurospeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech 2005
-
-
Diaz, F.C.1
Alba, J.L.2
Banga, E.R.3
-
132
-
-
14844352803
-
Alignment of L and H in bitonal pitch accents: Testing two hypotheses
-
Dilley, L., Ladd, D., and Schepman, A. Alignment of L and H in bitonal pitch accents: Testing two hypotheses. Journal of Phonetics 33, 1 (2005), 115-119.
-
(2005)
Journal of Phonetics
, vol.33
, Issue.1
, pp. 115-119
-
-
Dilley, L.1
Ladd, D.2
Schepman, A.3
-
135
-
-
85041486134
-
Optimising unit selection with voice source and formants in the CHATR speech synthesis system
-
Ding, W., and Campbell, N. Optimising unit selection with voice source and formants in the CHATR speech synthesis system. In Proceedings of Eurospeech 1997 (1997).
-
(1997)
Proceedings of Eurospeech 1997
-
-
Ding, W.1
Campbell, N.2
-
136
-
-
0012266740
-
Algorithms for grapheme-phoneme translation for English and French: Applications for database searches and speech synthesis
-
Divay, M., and Vitale, A. J. Algorithms for grapheme-phoneme translation for English and French: Applications for database searches and speech synthesis. Computational Linguistics 23, 4 (1997), 495-523.
-
(1997)
Computational Linguistics
, vol.23
, Issue.4
, pp. 495-523
-
-
Divay, M.1
Vitale, A.J.2
-
137
-
-
0002869769
-
The study of natural phonology
-
D. Dinnsen, Ed. Indiana: Indiana University Press
-
Donegan, P. J., and Stampe, D. The study of natural phonology. In Current Approaches to Phonological Theory, D. Dinnsen, Ed. Indiana: Indiana University Press (1979), pp. 126-173.
-
(1979)
Current Approaches to Phonological Theory
, pp. 126-173
-
-
Donegan, P.J.1
Stampe, D.2
-
139
-
-
0032665403
-
Phrase splicing and variable substitution using the IBM trainable speech synthesis system
-
Donovan, R. E., Franz, M., Sorensen, J. S., and Roukos, S. Phrase splicing and variable substitution using the IBM trainable speech synthesis system. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 1999 (1999), pp. 373-376.
-
(1999)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing
, vol.1999
, pp. 373-376
-
-
Donovan, R.E.1
Franz, M.2
Sorensen, J.S.3
Roukos, S.4
-
144
-
-
85032410801
-
Dependency and non-linear phonology
-
Durand, J. Dependency and Non-Linear Phonology. Croom Helm (1986).
-
(1986)
Croom Helm
-
-
Durand, J.1
-
145
-
-
85032411920
-
Explorations in dependency phonology
-
Durand, J., and Anderson, J. Explorations in Dependency Phonology. Foris (1987).
-
(1987)
Foris
-
-
Durand, J.1
Anderson, J.2
-
146
-
-
85032407598
-
Generating F0 contours for speech synthesis using the Tilt intonation theory
-
Dusterhoff, K., and Black, A. Generating F0 contours for speech synthesis using the Tilt intonation theory. In Proceedings of Eurospeech 1997 (1997).
-
(1997)
Proceedings of Eurospeech 1997
-
-
Dusterhoff, K.1
Black, A.2
-
149
-
-
0027839344
-
Text-to-speech synthesis based on an MBE re-synthesis of the segments database
-
Dutoit, T., and Leich, H. Text-to-speech synthesis based on an MBE re-synthesis of the segments database. Speech Communication 13 (1993), 435-440.
-
(1993)
Speech Communication
, vol.13
, pp. 435-440
-
-
Dutoit, T.1
Leich, H.2
-
150
-
-
0141702290
-
Recent improvements to the IBM trainable speech synthesis system
-
Eide, E., Aaron, A., Bakis, R. et al. Recent improvements to the IBM trainable speech synthesis system. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2003 (2003).
-
(2003)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2003
-
-
Eide, E.1
Aaron, A.2
Bakis, R.3
-
151
-
-
33947635494
-
A corpus-based approach to AHEM expressive speech synthesis
-
Eide, E., Aaron, A., Bakis, R., Hamza, W., and Picheny, M. J. A corpus-based approach to AHEM expressive speech synthesis. In Proceedings of the 5th ISCA Workshop on Speech Synthesis (2005).
-
(2005)
Proceedings of the 5th ISCA Workshop on Speech Synthesis
-
-
Eide, E.1
Aaron, A.2
Bakis, R.3
Hamza, W.4
Picheny, M.J.5
-
155
-
-
0017269304
-
Letter-to-sound rules for automatic translation of English text to phonetics
-
Elovitz, H. S., Johnson, R., McHugh, A., and Shore, J. Letter-to-sound rules for automatic translation of English text to phonetics. IEEE Transactions on Acoustics, Speech, and Signal Processing 24 (1976), 446-459.
-
(1976)
IEEE Transactions on Acoustics, Speech, and Signal Processing
, vol.24
, pp. 446-459
-
-
Elovitz, H.S.1
Johnson, R.2
McHugh, A.3
Shore, J.4
-
156
-
-
0002540664
-
The patterns of silence: Performance structures in sentence production
-
Grosjean, L. G., and Lane, H. The patterns of silence: Performance structures in sentence production. Cognitive Psychology 11 (1979), 58-81.
-
(1979)
Cognitive Psychology
, vol.11
, pp. 58-81
-
-
Grosjean, L.G.1
Lane, H.2
-
157
-
-
85135273903
-
Multilingual prosody modelling using cascades of regression trees and neural networks
-
Fackrell, J. W. A., Vereecken, H., Martens, J. P., and Coile, B. V. Multilingual prosody modelling using cascades of regression trees and neural networks. In Proceedings of Eurospeech 1999 (1999).
-
(1999)
Proceedings of Eurospeech 1999
-
-
Fackrell, J.1
Vereecken, H.2
Martens, J.P.3
Coile, B.V.4
-
159
-
-
84928451959
-
Glottal flow: Models and interaction
-
Fant, G. Glottal flow: models and interaction. Journal of Phonetics 14 (1986), 393-399.
-
(1986)
Journal of Phonetics
, vol.14
, pp. 393-399
-
-
Fant, G.1
-
160
-
-
0000764772
-
The use of multiple measures in taxonomic problems
-
Fisher, R. A. The use of multiple measures in taxonomic problems. Annals of Eugenics 7 (1936), 179-188.
-
(1936)
Annals of Eugenics
, vol.7
, pp. 179-188
-
-
Fisher, R.A.1
-
161
-
-
85032419628
-
The generation of regional pronunciations of English for speech synthesis
-
Fitt, S., and Isard, S. The generation of regional pronunciations of English for speech synthesis. In Proceedings of Eurospeech 1997 (1997).
-
(1997)
Proceedings of Eurospeech 1997
-
-
Fitt, S.1
Isard, S.2
-
163
-
-
85032418675
-
The treatment of vowels preceding ‘r’ in a keyword lexicon of English
-
Fitt, S., and Isard, S. The treatment of vowels preceding ‘r’ in a keyword lexicon of English. In Proceedings of ICPhS 99 (1999).
-
(1999)
Proceedings of ICPhS 99
-
-
Fitt, S.1
Isard, S.2
-
165
-
-
85011187169
-
Analysis of voice fundamental frequency contours for declarative sentences of Japanese
-
Fujisaki, H., and Hirose, K. Analysis of voice fundamental frequency contours for declarative sentences of Japanese. Journal of the Acoustical Society of Japan 5, 4 (1984), 233-241.
-
(1984)
Journal of the Acoustical Society of Japan
, vol.5
, Issue.4
, pp. 233-241
-
-
Fujisaki, H.1
Hirose, K.2
-
166
-
-
0010987926
-
Modeling the dynamic characteristics of voice fundamental frequency with applications to analysis and synthesis of intonation
-
Fujisaki, H., and Kawai, H. Modeling the dynamic characteristics of voice fundamental frequency with applications to analysis and synthesis of intonation. In Working Group on Intonation, 13th International Congress of Linguists (1982).
-
(1982)
Working Group on Intonation, 13th International Congress of Linguists
-
-
Fujisaki, H.1
Kawai, H.2
-
169
-
-
0003128462
-
Using bilingual materials to develop word sense disambiguation methods
-
Gale, W. A., Church, K. W., and Yarowsky, D. Using bilingual materials to develop word sense disambiguation methods. In International Conference on Theoretical and Methodological Issues in Machine Translation (1992), pp. 101-112.
-
(1992)
International Conference on Theoretical and Methodological Issues in Machine Translation
, pp. 101-112
-
-
Gale, W.A.1
Church, K.W.2
Yarowsky, D.3
-
170
-
-
33646821390
-
Development of the CU-HTK 2004 Mandarin conversational telephone speech transcription system
-
Gales, M. J. F., Jia, B., Liu, X. et al. Development of the CU-HTK 2004 Mandarin conversational telephone speech transcription system. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2005 (2005).
-
(2005)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2005
-
-
Gales, M.1
Jia, B.2
Liu, X.3
-
171
-
-
34047266379
-
Progress in the CU-HTK Broadcast News transcription system
-
Gales, M. J. F., Kim, D. Y., Woodland, P. C. et al. Progress in the CU-HTK Broadcast News transcription system. IEEE Transactions on Audio, Speech, and Language Processing 14, 5 (2006), 1513-1525.
-
(2006)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.14
, Issue.5
, pp. 1513-1525
-
-
Gales, M.1
Kim, D.Y.2
Woodland, P.C.3
-
172
-
-
85009238150
-
Name pronunciation with a joint N-gram model for bidirectional grapheme-to-phoneme conversion
-
Galescu, L., and Allen, J. Name pronunciation with a joint N-gram model for bidirectional grapheme-to-phoneme conversion. Proceedings of ICSLP (2002), pp. 109-112.
-
(2002)
Proceedings of ICSLP
, pp. 109-112
-
-
Galescu, L.1
Allen, J.2
-
173
-
-
0003548585
-
-
Gaithersburg, MD (CD-ROM)
-
Garofolo, J. S., Lamel, L. F., Fisher, W. M. et al. The DARPA-TIMIT acoustic-phonetic continuous speech corpus. Technical report, US Department of Commerce, Gaithersburg, MD (CD-ROM, 1990).
-
(1990)
The DARPA-TIMIT Acoustic-Phonetic Continuous Speech Corpus. Technical Report, US Department of Commerce
-
-
Garofolo, J.S.1
Lamel, L.F.2
Fisher, W.M.3
-
179
-
-
84895711707
-
An overview of autosegmental phonology
-
Goldsmith, J. An overview of autosegmental phonology. Linguistic Analysis 2, 1 (1976), 23-68.
-
(1976)
Linguistic Analysis
, vol.2
, Issue.1
, pp. 23-68
-
-
Goldsmith, J.1
-
181
-
-
0029292169
-
Classification of methods used for the assessment of text-to-speech systems according to the demands placed on the listener
-
Goldstein, M. Classification of methods used for the assessment of text-to-speech systems according to the demands placed on the listener. Speech Communication 16 (1995), 225-244.
-
(1995)
Speech Communication
, vol.16
, pp. 225-244
-
-
Goldstein, M.1
-
183
-
-
0026940107
-
The use of speech synthesis in exploring different speaking styles
-
Granstrom, B. The use of speech synthesis in exploring different speaking styles. Speech Communication 11, 4-5 (1992), 347-355.
-
(1992)
Speech Communication
, vol.11
, Issue.4-5
, pp. 347-355
-
-
Granstrom, B.1
-
184
-
-
0000534475
-
Logic and conversation
-
P. Cole and J. Morgan, Eds. New York: Academic Press
-
Grice, H. P. Logic and conversation. In Syntax and Semantics: Speech Acts, P. Cole and J. Morgan, Eds. New York: Academic Press (1975), vol. 3, pp. 41-58.
-
(1975)
Syntax and Semantics: Speech Acts
, vol.3
, pp. 41-58
-
-
Grice, H.P.1
-
185
-
-
0024060644
-
Multiband excitation vocoder
-
Griffin, D., and Lim, J. Multiband excitation vocoder. IEEE Transactions on Acoustics, Speech, and Signal Processing 36, 8 (1988).
-
(1988)
IEEE Transactions on Acoustics, Speech, and Signal Processing
, vol.36
, Issue.8
-
-
Griffin, D.1
Lim, J.2
-
186
-
-
0023304697
-
Prosodic structure and spoken word recognition
-
Grosjean, F., and Gee, J. P. Prosodic structure and spoken word recognition. Cognition, 156 (1987).
-
(1987)
Cognition
, pp. 156
-
-
Grosjean, F.1
Gee, J.P.2
-
187
-
-
6344250064
-
Designing prosodic databases for automatic modelling in 6 languages
-
Grover, C., Fackrell, J., Vereecken, H., Martens, J., and Van Coile, B. Designing prosodic databases for automatic modelling in 6 languages. In Proceedings of ICSLP 1998 (1998).
-
(1998)
Proceedings of ICSLP 1998
-
-
Grover, C.1
Fackrell, J.2
Vereecken, H.3
Martens, J.4
Van Coile, B.5
-
189
-
-
0032673049
-
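Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds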
-
Kawahara, I. M.-K., and de Cheveigne, A. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds. Speech Communication 27, 187-207.
-
Speech Communication
, vol.27
, pp. 187-207
-
-
Kawahara, I.M.1
De Cheveigne, A.2
-
191
-
-
34547326497
-
A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network
-
Hain, H.-U. A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network. In Proceedings of the International Conference on Speech and Language Processing (2000).
-
(2000)
Proceedings of the International Conference on Speech and Language Processing
-
-
Hain, H.-U.1
-
192
-
-
27744599401
-
Automatic transcription of conversational telephone speech - development of the CU-HTK 2002 system
-
Hain, T., Woodland, P. C., Evermann, G. et al. Automatic transcription of conversational telephone speech - development of the CU-HTK 2002 system. IEEE Transactions on Audio, Speech, and Language Processing (2005).
-
(2005)
IEEE Transactions on Audio, Speech, and Language Processing
-
-
Hain, T.1
Woodland, P.C.2
Evermann, G.3
-
193
-
-
85032406374
-
Intonation and grammar in British English
-
Halliday, M. A. Intonation and Grammar in British English. Mouton (1967).
-
(1967)
Mouton
-
-
Halliday, M.1
-
194
-
-
0024906968
-
A diphone synthesis system based on time-domain modifications of speech
-
Hamon, C., Moulines, E., and Charpentier, F. A diphone synthesis system based on time-domain modifications of speech. In Proceedings of International Conference on Acoustics, Speech, and Signal Processing 1989 (1989).
-
(1989)
Proceedings of International Conference on Acoustics, Speech, and Signal Processing 1989
-
-
Hamon, C.1
Moulines, E.2
Charpentier, F.3
-
195
-
-
56149096472
-
The IBM expressive speech synthesis system
-
Hamza, W., Bakis, R., Eide, E. M., Picheny, M. A., and Pitrelli, J. F. The IBM expressive speech synthesis system. In Proceedings of the International Conference on Spoken Language Processing 2004 (2004).
-
(2004)
Proceedings of the International Conference on Spoken Language Processing 2004
-
-
Hamza, W.1
Bakis, R.2
Eide, E.M.3
Picheny, M.A.4
Pitrelli, J.F.5
-
196
-
-
85032414131
-
On building a concatenative speech synthesis system from the Blizzard Challenge speech databases
-
Hamza, W., Bakis, R., Shuang, Z. W., and Zen, H. On building a concatenative speech synthesis system from the Blizzard Challenge speech databases. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Hamza, W.1
Bakis, R.2
Shuang, Z.W.3
Zen, H.4
-
198
-
-
85032403628
-
Letter-to-sound for small-footprint multilingual TTS engine
-
Han, K., and Chen, G. Letter-to-sound for small-footprint multilingual TTS engine. In Proceedings of Interspeech 2004 (2004).
-
(2004)
Proceedings of Interspeech 2004
-
-
Han, K.1
Chen, G.2
-
200
-
-
85009062928
-
Transformation-based learning of Danish stress assignment
-
Henrichsen, P. J. Transformation-based learning of Danish stress assignment. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Henrichsen, P.J.1
-
205
-
-
85009291529
-
Improved corpus-based synthesis of fundamental frequency contours using generation process model
-
Hirose, K., Eto, M., and Minematsu, N. Improved corpus-based synthesis of fundamental frequency contours using generation process model. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2002 (2002).
-
(2002)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2002
-
-
Hirose, K.1
Eto, M.2
Minematsu, N.3
-
206
-
-
21844459140
-
Corpus-based synthesis of fundamental frequency contours based on a generation process model
-
Hirose, K., Eto, M., Minematsu, N., and Sakurai, A. Corpus-based synthesis of fundamental frequency contours based on a generation process model. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Hirose, K.1
Eto, M.2
Minematsu, N.3
Sakurai, A.4
-
207
-
-
0027684991
-
Pitch accent in context: Predicting intonational prominence from text
-
Hirschberg, J. Pitch accent in context: Predicting intonational prominence from text. Artificial Intelligence 63 (1993), 305-340.
-
(1993)
Artificial Intelligence
, vol.63
, pp. 305-340
-
-
Hirschberg, J.1
-
208
-
-
0036027583
-
Communication and prosody: Functional aspects of prosody
-
Hirschberg, J. Communication and prosody: Functional aspects of prosody. Speech Communication 36 (2002), 31-43.
-
(2002)
Speech Communication
, vol.36
, pp. 31-43
-
-
Hirschberg, J.C.1
-
214
-
-
72949135793
-
The origin of speech
-
Hockett, C. F. The origin of speech. Scientific American 203 (1960), 88-96.
-
(1960)
Scientific American
, vol.203
, pp. 88-96
-
-
Hockett, C.F.1
-
217
-
-
21844457336
-
Learning the hidden structure of intonation: Implementing various functions of prosody
-
Holm, B., and Bailly, G. Learning the hidden structure of intonation: Implementing various functions of prosody. In Speech Prosody (2002), 399-402.
-
(2002)
Speech Prosody
, pp. 399-402
-
-
Holm, B.1
Bailly, G.2
-
218
-
-
0015699693
-
The influence of the glottal waveform on the naturalness of speech from a parallel formant synthesizer
-
Holmes, J. N. The influence of the glottal waveform on the naturalness of speech from a parallel formant synthesizer. IEEE Transactions on Audio and Electroacoustics 21 (1973), 298-305.
-
(1973)
IEEE Transactions on Audio and Electroacoustics
, vol.21
, pp. 298-305
-
-
Holmes, J.N.1
-
219
-
-
84964175806
-
Speech synthesis by rule
-
Holmes, J. N., Mattingly, I. G., and Shearme, J. N. Speech synthesis by rule. Language and Speech 7 (1964), 127-143.
-
(1964)
Language and Speech
, vol.7
, pp. 127-143
-
-
Holmes, J.N.1
Mattingly, I.G.2
Shearme, J.N.3
-
220
-
-
0031642265
-
Automatic generation of synthesis units for trainable text-to-speech systems
-
Hon, H., Acero, A., Huang, X., Liu, J., and Plumpe, M. Automatic generation of synthesis units for trainable text-to-speech systems. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing 1998 (1998).
-
(1998)
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing 1998
-
-
Hon, H.1
Acero, A.2
Huang, X.3
Liu, J.4
Plumpe, M.5
-
221
-
-
0001562208
-
Articulation-testing methods: Consonantal differentiation with a closed-response set
-
House, A., Williams, C., Hecker, M., and Kryter, K. Articulation-testing methods: Consonantal differentiation with a closed-response set. The Journal of the Acoustical Society of America 37 (1965), 158.
-
(1965)
The Journal of the Acoustical Society of America
, vol.37
, pp. 158
-
-
House, A.1
Williams, C.2
Hecker, M.3
Kryter, K.4
-
224
-
-
0004056285
-
-
Englewood Cliffs, NJ: Prentice-Hall
-
Huang, X., Acero, A., and Hon, H.-W. Spoken Language Processing: A Guide to Theory, Algorithm and System Development. Englewood Cliffs, NJ: Prentice-Hall (2001).
-
(2001)
Spoken Language Processing: A Guide to Theory, Algorithm and System Development
-
-
Huang, X.1
Acero, A.2
Hon, H.-W.3
-
228
-
-
70450167047
-
Issues in high quality LPC analysis and synthesis
-
Hunt, M., Zwierynski, D., and Carr, R. Issues in high quality LPC analysis and synthesis. In Proceedings of Eurospeech 1989 (1989), pp. 348-351.
-
(1989)
Proceedings of Eurospeech 1989
, pp. 348-351
-
-
Hunt, M.1
Zwierynski, D.2
Carr, R.3
-
232
-
-
85032424085
-
Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis
-
Isogai, J., Yamagishi, J., and Kobayashi, T. Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis. In Proceedings of Eurospeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech 2005
-
-
Isogai, J.1
Yamagishi, J.2
Kobayashi, T.3
-
234
-
-
0027699809
-
Speech segment selection for concatenative synthesis based on spectral distortion minimization
-
Iwahashi, N., Kaiki, N., and Sagisaka, Y. Speech segment selection for concatenative synthesis based on spectral distortion minimization. Transactions of the Institute of Electronics, Information and Communication Engineers E76A (1993), 1942-1948.
-
(1993)
Transactions of the Institute of Electronics, Information and Communication Engineers
, pp. 1942-1948
-
-
Iwahashi, N.1
Kaiki, N.2
Sagisaka, Y.3
-
235
-
-
85009152114
-
Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction
-
Matousek, D. T., and Psutka, J. Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Matousek, D.T.1
Psutka, J.2
-
237
-
-
0016939124
-
Continuous speech recognition by statistical methods
-
Jelinek, F. Continuous speech recognition by statistical methods. Proceedings of the IEEE 64 (1976), 532-556.
-
(1976)
Proceedings of the IEEE
, vol.64
, pp. 532-556
-
-
Jelinek, F.1
-
241
-
-
0345580812
-
Rules for the generation of ToBI-based American English intonation
-
Jilka, M., Mohler, G., and Dogil, G. Rules for the generation of ToBI-based American English intonation. Speech Communication 28 (1999), 83-108.
-
(1999)
Speech Communication
, vol.28
, pp. 83-108
-
-
Jilka, M.1
Mohler, G.2
Dogil, G.3
-
243
-
-
0003847769
-
-
Englewood Cliffs, NJ: Prentice-Hall
-
Jurafsky, D., and Martin, J. H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Englewood Cliffs, NJ: Prentice-Hall (2000).
-
(2000)
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
-
-
Jurafsky, D.1
Martin, J.H.2
-
246
-
-
84876497245
-
GMM-based voice conversion applied to emotional speech synthesis
-
Kawanami, H., Iwami, Y., Toda, T., Saruwatari, H., and Shikano, K. GMM-based voice conversion applied to emotional speech synthesis. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Kawanami, H.1
Iwami, Y.2
Toda, T.3
Saruwatari, H.4
Shikano, K.5
-
247
-
-
84946981738
-
One process, not two, in reading aloud: Lexical analogies do the work of non-lexical rules
-
Kay, J., and Marcel, A. One process, not two, in reading aloud: Lexical analogies do the work of non-lexical rules. Quarterly Journal of Experimental Psychology 33A (1981), 397-413.
-
(1981)
Quarterly Journal of Experimental Psychology
, vol.33
, pp. 397-413
-
-
Kay, J.1
Marcel, A.2
-
248
-
-
84928223726
-
The internal structure of phonological elements: A theory of charm and government
-
Kaye, J., Lowenstamm, J., and Vergnaud, J. R. The internal structure of phonological elements: A theory of charm and government. Phonology Yearbook 2 (1985), pp. 305-328.
-
(1985)
Phonology Yearbook
, vol.2
, pp. 305-328
-
-
Kaye, J.1
Lowenstamm, J.2
Vergnaud, J.R.3
-
249
-
-
34249324336
-
Designing very compact decision trees for grapheme-to-phoneme transcription
-
Kienappel, A. K., and Kneser, R. Designing very compact decision trees for grapheme-to-phoneme transcription. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Kienappel, A.K.1
Kneser, R.2
-
251
-
-
0035127353
-
Reducing audible spectral discontinuities
-
Klabbers, E., and Veldhuis, R. Reducing audible spectral discontinuities. IEEE Transactions on Speech and Audio Processing 9, 1 (2001), 39-51.
-
(2001)
IEEE Transactions on Speech and Audio Processing
, vol.9
, Issue.1
, pp. 39-51
-
-
Klabbers, E.1
Veldhuis, R.2
-
252
-
-
84855929084
-
Acoustic theory of terminal analog speech synthesis
-
Klatt, D. H. Acoustic theory of terminal analog speech synthesis. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 1972 (1972), vol. 1, pp. 131-135.
-
(1972)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing
, vol.1
, pp. 131-135
-
-
Klatt, D.H.1
-
253
-
-
0015676852
-
Interaction between two factors that influence vowel duration
-
Klatt, D. H. Interaction between two factors that influence vowel duration. Journal of the Acoustical Society of America 54 (1973), 1102-1104.
-
(1973)
Journal of the Acoustical Society of America
, vol.54
, pp. 1102-1104
-
-
Klatt, D.H.1
-
254
-
-
0018986665
-
Software for a cascade/parallel formant synthesizer
-
Klatt, D. H. Software for a cascade/parallel formant synthesizer. Journal of the Acoustical Society of America 67 (1980), 971-995.
-
(1980)
Journal of the Acoustical Society of America
, vol.67
, pp. 971-995
-
-
Klatt, D.H.1
-
255
-
-
0023407575
-
Review of text-to-speech conversion for English
-
Klatt, D. H. Review of text-to-speech conversion for English. Journal of the Acoustical Society of America 82, 3 (1987), 793-850.
-
(1987)
Journal of the Acoustical Society of America
, vol.82
, Issue.3
, pp. 793-850
-
-
Klatt, D.H.1
-
257
-
-
0033719622
-
Improving intonational phrasing with syntactic information
-
Koehn, P., Abney, S., Hirschberg, J., and Collins, M. Improving intonational phrasing with syntactic information. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2000 (2000).
-
(2000)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2000
-
-
Koehn, P.1
Abney, S.2
Hirschberg, J.3
Collins, M.4
-
258
-
-
29744447077
-
A model of German intonation
-
K. J. Kohler, Ed. Kiel: Universitat Kiel
-
Kohler, K. J. A model of German intonation. In Studies in German Intonation, K. J. Kohler, Ed. Kiel: Universitat Kiel (1991).
-
(1991)
Studies in German Intonation
-
-
Kohler, K.J.1
-
259
-
-
33744686495
-
The perception of accents: Peak height versus peak position
-
K. J. Kohler, Ed. Kiel: Universitat Kiel
-
Kohler, K. J. The perception of accents: Peak height versus peak position. In Studies in German Intonation, K. J. Kohler, Ed. Kiel: Universitat Kiel (1991), pp. 72-96.
-
(1991)
Studies in German Intonation
, pp. 72-96
-
-
Kohler, K.J.1
-
260
-
-
0006552509
-
Terminal intonation patterns in single-accent utterances of German: Phonetics, phonology and semantics
-
K. J. Kohler, Ed. Kiel: Universitat Kiel
-
Kohler, K. J. Terminal intonation patterns in single-accent utterances of German: Phonetics, phonology and semantics. In Studies in German Intonation, K. J. Kohler, Ed. Kiel: Universitat Kiel (1991), pp. 53-71.
-
(1991)
Studies in German Intonation
, pp. 53-71
-
-
Kohler, K.J.1
-
261
-
-
0028996842
-
CELP coding based on mel cepstral analysis
-
Koishida, K., Tokuda, K., and Imai, S. CELP coding based on mel cepstral analysis. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 1995 (1995).
-
(1995)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 1995
-
-
Koishida, K.1
Tokuda, K.2
Imai, S.3
-
264
-
-
85009064374
-
Duration modeling for Hindi text-to-speech synthesis system
-
Krishna, N. S., Talukdar, P. P., Bali, K., and Ramakrishnam, A. G. Duration modeling for Hindi text-to-speech synthesis system. In Proceedings of the International Conference on Speech and Language Processing 2004 (2004).
-
(2004)
Proceedings of the International Conference on Speech and Language Processing 2004
-
-
Krishna, N.S.1
Talukdar, P.P.2
Bali, K.3
Ramakrishnam, A.G.4
-
266
-
-
0034108523
-
Phonological conditioning of peak alignment in rising pitch accents in Dutch
-
Ladd, D., Mennen, I., and Schepman, A. Phonological conditioning of peak alignment in rising pitch accents in Dutch. The Journal of the Acoustical Society of America 107 (2000), 2685.
-
(2000)
The Journal of the Acoustical Society of America
, vol.107
-
-
Ladd, D.1
Mennen, I.2
Schepman, A.3
-
267
-
-
0023715072
-
Declination reset and the hierarchical organization of utterances
-
Ladd, D. R. Declination reset and the hierarchical organization of utterances. Journal of the Acoustical Society of America 84, 2 (1988), 530-544.
-
(1988)
Journal of the Acoustical Society of America
, vol.84
, Issue.2
, pp. 530-544
-
-
Ladd, D.R.1
-
269
-
-
0004190969
-
-
Cambridge: Cambridge University Press
-
Ladd, D. R. Intonational Phonology. Cambridge: Cambridge University Press (1996).
-
(1996)
Intonational Phonology
-
-
Ladd, D.R.1
-
270
-
-
84927457556
-
Vowel intrinsic pitch in connected speech
-
Ladd, D. R., and Silverman, K. E. A. Vowel intrinsic pitch in connected speech. Phonetica 41 (1984), 31-40.
-
(1984)
Phonetica
, vol.41
, pp. 31-40
-
-
Ladd, D.R.1
Silverman, K.E.A.2
-
275
-
-
80054370614
-
-
Cambridge: Cambridge University Press
-
Laver, J. Principles of Phonetics. Cambridge: Cambridge University Press (1995).
-
(1995)
Principles of Phonetics
-
-
Laver, J.1
-
276
-
-
85032401207
-
Speech segmentation criteria for the ATR/CSTR database
-
Laver, J., Alexander, M., Bennet, C. et al. Speech segmentation criteria for the ATR/CSTR database. Technical report, Centre for Speech Technology Research, University of Edinburgh (1988).
-
(1988)
Technical report, Centre for Speech Technology Research, University of Edinburgh
-
-
Laver, J.1
Alexander, M.2
Bennet, C.3
-
278
-
-
33646639815
-
The synthesis of speech from signals which have a low information rate
-
W. Jackson, Ed. London: Butterworth & Co., Ltd
-
Lawrence, W. The synthesis of speech from signals which have a low information rate. In Communication Theory, W. Jackson, Ed. London: Butterworth & Co., Ltd (1953), pp. 460-469.
-
(1953)
Communication Theory
, pp. 460-469
-
-
Lawrence, W.1
-
280
-
-
0005282123
-
A computational algorithm for F0 contour generation in Korean developed with prosodically labeled databases using K-ToBI system
-
Lee, Y. J., Lee, S., Kim, J. J., and Ko, H. J. A computational algorithm for F0 contour generation in Korean developed with prosodically labeled databases using K-ToBI system. In Proceedings of the International Conference on Spoken Language Processing 1998 (1998).
-
(1998)
Proceedings of the International Conference on Spoken Language Processing 1998
-
-
Lee, Y.J.1
Lee, S.2
Kim, J.J.3
Ko, H.J.4
-
281
-
-
85032421246
-
A new quantization technique for LSP parameters and its application to low bit rate multi-band excited vocoders
-
Leich, H., Deketelaere, S., Dbman, I., Dothey, M., and Wery, B. A new quantization technique for LSP parameters and its application to low bit rate multi-band excited vocoders. In EUSIPCO (1992).
-
(1992)
EUSIPCO
-
-
Leich, H.1
Deketelaere, S.2
Dbman, I.3
Dothey, M.4
Wery, B.5
-
282
-
-
0022685753
-
Continuously variable duration hidden Markov models for automatic speech recognition
-
Levinson, S. Continuously variable duration hidden Markov models for automatic speech recognition. Computer Speech and Language 1 (1986), 29-45.
-
(1986)
Computer Speech and Language
, vol.1
, pp. 29-45
-
-
Levinson, S.1
-
285
-
-
0001469109
-
Intonational invariance under changes in pitch range and length
-
M. Aronoff and R. T. Oehrle, Eds. Cambridge, MA: MIT Press
-
Liberman, M., and Pierrehumbert, J. Intonational invariance under changes in pitch range and length. In Language Sound Structure, M. Aronoff and R. T. Oehrle, Eds. Cambridge, MA: MIT Press (1984), pp. 157-233.
-
(1984)
Language Sound Structure
, pp. 157-233
-
-
Liberman, M.1
Pierrehumbert, J.2
-
286
-
-
0000106333
-
On stress and linguistic rhythm
-
Liberman, M. Y., and Prince, A. On stress and linguistic rhythm. Linguistic Inquiry 8 (1977), 249-336.
-
(1977)
Linguistic Inquiry
, vol.8
, pp. 249-336
-
-
Liberman, M.Y.1
Prince, A.2
-
289
-
-
84995620255
-
Knowledge of language origin improves pronunciation accuracy of proper names
-
Llitjos, A., and Black, A. Knowledge of language origin improves pronunciation accuracy of proper names. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Llitjos, A.1
Black, A.2
-
291
-
-
84966366503
-
Rapid unit selection from a large speech corpus for concatenative speech synthesis
-
Beutnagel, M. M., and Riley, M. Rapid unit selection from a large speech corpus for concatenative speech synthesis. In Proceedings of Eurospeech 1999 (1999).
-
(1999)
Proceedings of Eurospeech 1999
-
-
Beutnagel, M.M.1
Riley, M.2
-
292
-
-
33745216020
-
A text-to-speech platform for variable length optimal unit searching using perceptual cost functions
-
Lee, D. P. L., and Olive, J. P. A text-to-speech platform for variable length optimal unit searching using perceptual cost functions. In Proceedings of the Fourth ISCA Workshop on Speech Synthesis (2001).
-
(2001)
Proceedings of the Fourth ISCA Workshop on Speech Synthesis
-
-
Lee, D.1
Olive, J.P.2
-
295
-
-
0016495091
-
Linear prediction: A tutorial review
-
Makhoul, J. Linear prediction: A tutorial review. Proceedings of the IEEE 63, 4 (1975), 561-580.
-
(1975)
Proceedings of the IEEE
, vol.63
, Issue.4
, pp. 561-580
-
-
Makhoul, J.1
-
297
-
-
33646790880
-
A graphical model for formant tracking
-
Malkin, J., Li, X., and Bilmes, J. A graphical model for formant tracking. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing 2005 (2005).
-
(2005)
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing 2005
-
-
Malkin, J.1
Li, X.2
Bilmes, J.3
-
300
-
-
0039255896
-
A multistrategy approach to improving pronunciation by analogy
-
Marchand, Y., and Damper, R. A multistrategy approach to improving pronunciation by analogy. Computational Linguistics 26, 2 (2000), 195-219.
-
(2000)
Computational Linguistics
, vol.26
, Issue.2
, pp. 195-219
-
-
Marchand, Y.1
Damper, R.2
-
305
-
-
0029725605
-
Speech synthesis using HMMs with dynamic features
-
Masuko, T., Tokuda, K., Kobayashi, T., and Imai, S. Speech synthesis using HMMs with dynamic features. In Proceedings of the International Conference on Acoustics Speech and Signal Processing 1996 (1996).
-
(1996)
Proceedings of the International Conference on Acoustics Speech and Signal Processing 1996
-
-
Masuko, T.1
Tokuda, K.2
Kobayashi, T.3
Imai, S.4
-
306
-
-
84940803687
-
Feature geometry and dependency: A review
-
McCarthy, J. Feature geometry and dependency: A review. Phonetica 43 (1988), 84-108.
-
(1988)
Phonetica
, vol.43
, pp. 84-108
-
-
McCarthy, J.1
-
308
-
-
85032414384
-
User attitudes to concatenated natural speech and text-to-speech synthesis in an automated information service
-
McInnes, F. R., Attwater, D. J., Edgington, M. D., Schmidt, M. S., and Jack, M. A. User attitudes to concatenated natural speech and text-to-speech synthesis in an automated information service. In Proceedings of Eurospeech 1999 (1999).
-
(1999)
Proceedings of Eurospeech 1999
-
-
McInnes, F.R.1
Attwater, D.J.2
Edgington, M.D.3
Schmidt, M.S.4
Jack, M.A.5
-
309
-
-
0025807353
-
Super resolution pitch determination of speech signals
-
Medan, Y., Yair, E., and Chazan, D. Super resolution pitch determination of speech signals. IEEE Transactions on Signal Processing 39 (1991), 40-48.
-
(1991)
IEEE Transactions on Signal Processing
, vol.39
, pp. 40-48
-
-
Medan, Y.1
Yair, E.2
Chazan, D.3
-
312
-
-
0000241903
-
Toward a construction-based model of language function: The case of nominal extraposition
-
Michaelis, L. A., and Lambrecht, K. Toward a construction-based model of language function: The case of nominal extraposition. Language 72 (1996), 215-247.
-
(1996)
Language
, vol.72
, pp. 215-247
-
-
Michaelis, L.A.1
Lambrecht, K.2
-
316
-
-
0020816083
-
Suggested formulae for calculating auditory-filter bandwidths and excitation patterns
-
Moore, B. C. J., and Glasberg, B. R. Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. Journal of the Acoustical Society of America 74 (1983), 750-753.
-
(1983)
Journal of the Acoustical Society of America
, vol.74
, pp. 750-753
-
-
Moore, B.1
Glasberg, B.R.2
-
317
-
-
85032411560
-
Twenty things we still don’t know about speech
-
Moore, R. K. Twenty things we still don’t know about speech. In Proceedings of Eurospeech 1999 (1995).
-
(1995)
Proceedings of Eurospeech 1999
-
-
Moore, R.K.1
-
319
-
-
85032425439
-
Exploratory analysis of linguistic data based on genetic algorithm for robust modeling of the segmental duration of speech
-
Morais, E., and Violaro, F. Exploratory analysis of linguistic data based on genetic algorithm for robust modeling of the segmental duration of speech. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Morais, E.1
Violaro, F.2
-
321
-
-
0035283592
-
Generating prosodic attitudes in french: Data, model and evaluation
-
Morlec, Y., Bailly, G., and Aubergé, V. Generating prosodic attitudes in French: Data, model and evaluation. Speech Communication 33, 4 (2001), 357-371.
-
(2001)
Speech Communication
, vol.33
, Issue.4
, pp. 357-371
-
-
Morlec, Y.1
Bailly, G.2
Aubergé, V.3
-
322
-
-
0009151070
-
Time-domain and frequency-domain techniques for prosodic modification of speech
-
W. B. Kleijn and K. K. Paliwal, Eds. Amsterdam: Elsevier Science B.V.
-
Moulines, E., and Verhelst, W. Time-domain and frequency-domain techniques for prosodic modification of speech. In Speech Coding and Synthesis, W. B. Kleijn and K. K. Paliwal, Eds. Amsterdam: Elsevier Science B.V. (1995), pp. 519-555.
-
(1995)
Speech Coding and Synthesis
, pp. 519-555
-
-
Moulines, E.1
Verhelst, W.2
-
323
-
-
85009110171
-
Accent label prediction by time delay neural networks using gating clusters
-
Muller, A. F., and Hoffmann, R. Accent label prediction by time delay neural networks using gating clusters. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Muller, A.F.1
Hoffmann, R.2
-
324
-
-
0027447292
-
Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion
-
Murray, I. R., and Arnott, J. L. Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion. Journal of the Acoustical Society of America 93, 2 (1993), 1097-1108.
-
(1993)
Journal of the Acoustical Society of America
, vol.93
, Issue.2
, pp. 1097-1108
-
-
Murray, I.R.1
Arnott, J.L.2
-
331
-
-
0029097108
-
An articulatory study of fricative consonants using MRI
-
Narayanan, S., Alwan, A., and Haker, K. An articulatory study of fricative consonants using MRI. Journal of the Acoustical Society of America 98, 3 (1995), 1325-1347.
-
(1995)
Journal of the Acoustical Society of America
, vol.98
, Issue.3
, pp. 1325-1347
-
-
Narayanan, S.1
Alwan, A.2
Haker, K.3
-
332
-
-
0031042761
-
Towards articulatory-acoustic models for liquid consonants based on MRI and EPG data. Part I: The laterals
-
Narayanan, S., Alwan, A., and Haker, K. Towards articulatory-acoustic models for liquid consonants based on MRI and EPG data. Part I: The laterals. Journal of the Acoustical Society of America 101, 2 (1997), 1064-1077.
-
(1997)
Journal of the Acoustical Society of America
, vol.101
, Issue.2
, pp. 1064-1077
-
-
Narayanan, S.1
Alwan, A.2
Haker, K.3
-
334
-
-
0034224125
-
Prosynth: An integrated prosodic approach to device-independent, natural-sounding speech synthesis
-
Ogden, R., Hawkins, S., House, J. et al. Prosynth: An integrated prosodic approach to device-independent, natural-sounding speech synthesis. Computer Speech and Language 14 (2000), 177-210.
-
(2000)
Computer Speech and Language
, vol.14
, pp. 177-210
-
-
Ogden, R.1
Hawkins, S.2
House, J.3
-
335
-
-
0008904919
-
Word and sentence intonation: A quantitative model
-
Ohman, S. Word and sentence intonation: A quantitative model. STL-Quarterly Progress and Status Report 8, 2-3 (1967), 20-54.
-
(1967)
STL-Quarterly Progress and Status Report
, vol.8
, Issue.2-3
, pp. 20-54
-
-
Ohman, S.1
-
337
-
-
85009065136
-
Aperiodicity control in ARX-based speech analysis-synthesis method
-
Ohtsuka, T., and Kasuya, H. Aperiodicity control in ARX-based speech analysis-synthesis method. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Ohtsuka, T.1
Kasuya, H.2
-
340
-
-
85032422534
-
Modelling pitch accent types for Polish speech synthesis
-
Oliver, D., and Clark, R. Modelling pitch accent types for Polish speech synthesis. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Oliver, D.1
Clark, R.2
-
341
-
-
0014568991
-
-
IEEE recommended practices for speech quality measurements
-
IEEE Subcommittee on Subjective Measurements. IEEE recommended practices for speech quality measurements. IEEE Transactions on Audio and Electroacoustics 17 (1969), 227-246.
-
(1969)
IEEE Transactions on Audio and Electroacoustics
, vol.17
, pp. 227-246
-
-
-
345
-
-
77956899791
-
Discontinuity detection in concatenated speech synthesis based on nonlinear speech analysis
-
Pantazis, Y., Stylianou, Y., and Klabbers, E. Discontinuity detection in concatenated speech synthesis based on nonlinear speech analysis. In Proceedings of Eurospeech, Interspeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech, Interspeech 2005
-
-
Pantazis, Y.1
Stylianou, Y.2
Klabbers, E.3
-
346
-
-
85018094829
-
Computer generated animation of faces
-
Parke, F. I. Computer generated animation of faces. In ACM National Conference (1972).
-
(1972)
ACM National Conference
-
-
Parke, F.I.1
-
348
-
-
85009168738
-
DTW-based phonetic alignment using multiple acoustic features
-
Paulo, S., and Oliveira, L. C. DTW-based phonetic alignment using multiple acoustic features. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Paulo, S.1
Oliveira, L.C.2
-
354
-
-
0019603077
-
Animating facial expression
-
Platt, S. M., and Badler, N. I. Animating facial expression. Computer Graphics 15, 3 (1981), 245-252.
-
(1981)
Computer Graphics
, vol.15
, Issue.3
, pp. 245-252
-
-
Platt, S.M.1
Badler, N.I.2
-
355
-
-
0347463071
-
Which is more important in a concatenative speech synthesis system-pitch duration or spectral discontinuity?
-
Plumpe, M., and Meredith, S. Which is more important in a concatenative speech synthesis system-pitch duration or spectral discontinuity? In Third ESCA/IEEE Workshop on Speech Synthesis (1998).
-
(1998)
Third ESCA/IEEE Workshop on Speech Synthesis
-
-
Plumpe, M.1
Meredith, S.2
-
356
-
-
0032595183
-
Modeling of the glottal flow derivative waveform with application to speaker identification
-
Plumpe, M. D., Quatieri, T. F., and Reynolds, D. A. Modeling of the glottal flow derivative waveform with application to speaker identification. IEEE Transactions on Speech and Audio Processing 7, 5 (1999), 569-586.
-
(1999)
IEEE Transactions on Speech and Audio Processing
, vol.7
, Issue.5
, pp. 569-586
-
-
Plumpe, M.D.1
Quatieri, T.F.2
Reynolds, D.A.3
-
359
-
-
0030362834
-
Generation of multiple synthesis inventories by a bootstrapping procedure
-
Portele, T., Stober, K. H., Meyer, H., and Hess, W. Generation of multiple synthesis inventories by a bootstrapping procedure. In Proceedings of the International Conference on Speech and Language Processing 1996 (1996).
-
(1996)
Proceedings of the International Conference on Speech and Language Processing 1996
-
-
Portele, T.1
Stober, K.H.2
Meyer, H.3
Hess, W.4
-
361
-
-
84941609666
-
The organisation of purposeful dialogues
-
Power, R. J. D. The organisation of purposeful dialogues. Linguistics 17 (1979), 107-152.
-
(1979)
Linguistics
, vol.17
, pp. 107-152
-
-
Power, R.1
-
363
-
-
0026850770
-
Automatic classification of intonational phrase boundaries
-
Wang, M. Q., and Hirschberg, J. Automatic classification of intonational phrase boundaries. Computer Speech and Language 6 (1992), 175-196.
-
(1992)
Computer Speech and Language
, vol.6
, pp. 175-196
-
-
Wang, M.Q.1
Hirschberg, J.2
-
365
-
-
0026771806
-
The derivation of prosody for text-to-speech from prosodic sentence structure
-
Quene, H. The derivation of prosody for text-to-speech from prosodic sentence structure. Computer Speech & Language 6, 1 (1992), 77-98.
-
(1992)
Computer Speech & Language
, vol.6
, Issue.1
, pp. 77-98
-
-
Quene, H.1
-
369
-
-
85032417079
-
Stochastic and syntactic techniques for predicting phrase breaks
-
Read, I., and Cox, S. Stochastic and syntactic techniques for predicting phrase breaks. In Proceedings of Eurospeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech 2005
-
-
Read, I.1
Cox, S.2
-
370
-
-
85032423350
-
Improving data driven part-of-speech tagging by morphologic knowledge induction
-
Reichel, U. Improving data driven part-of-speech tagging by morphologic knowledge induction. In Proceedings of Advances in Speech Technology (2005).
-
(2005)
Proceedings of Advances in Speech Technology
-
-
Reichel, U.1
-
372
-
-
0002069313
-
Tree-based modelling of segmental duration
-
G. Bailly, C. Benoit, and T. R. Sawallis, Eds. Amsterdam: Elsevier Science Publishers
-
Riley, M. Tree-based modelling of segmental duration. In Talking Machines: Theories, Models and Designs, G. Bailly, C. Benoit, and T. R. Sawallis, Eds. Amsterdam: Elsevier Science Publishers (1992), pp. 265-273.
-
(1992)
Talking Machines: Theories, Models and Designs
, pp. 265-273
-
-
Riley, M.1
-
373
-
-
1442267080
-
Learning decision lists
-
Rivest, R. L. Learning decision lists. Machine Learning 2 (1987), 229-246.
-
(1987)
Machine Learning
, vol.2
, pp. 229-246
-
-
Rivest, R.L.1
-
375
-
-
0000329355
-
A recurrent error propagation network speech recognition system
-
Robinson, T., and Fallside, F. A recurrent error propagation network speech recognition system. Computer Speech and Language 5, 3 (1991).
-
(1991)
Computer Speech and Language
, vol.5
, Issue.3
-
-
Robinson, T.1
Fallside, F.2
-
376
-
-
0015008817
-
Effect of glottal pulse shape on the quality of natural vowels
-
Rosenberg, A. E. Effect of glottal pulse shape on the quality of natural vowels. Journal of the Acoustical Society of America 49 (1971), 583-590.
-
(1971)
Journal of the Acoustical Society of America
, vol.49
, pp. 583-590
-
-
Rosenberg, A.E.1
-
378
-
-
0030181584
-
Prediction of abstract prosodic labels for speech synthesis
-
Ross, K., and Ostendorf, M. Prediction of abstract prosodic labels for speech synthesis. Computer Speech and Language 10, 3 (1996), 155-185.
-
(1996)
Computer Speech and Language
, vol.10
, Issue.3
, pp. 155-185
-
-
Ross, K.1
Ostendorf, M.2
-
379
-
-
4544238841
-
A method for automatic extraction of Fujisaki-model parameters
-
Rossi, P., Palmieri, F., and Cutugno, F. A method for automatic extraction of Fujisaki-model parameters. In Proceedings of Speech Prosody 2002 (2002), pp. 615-618.
-
(2002)
Proceedings of Speech Prosody
, vol.2002
, pp. 615-618
-
-
Rossi, P.1
Palmieri, F.2
Cutugno, F.3
-
381
-
-
85032405806
-
Unit selection for speech synthesis based on a new acoustic target cost
-
Rouibia, S., and Rosec, O. Unit selection for speech synthesis based on a new acoustic target cost. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Rouibia, S.1
Rosec, O.2
-
382
-
-
0019606728
-
An articulatory synthesizer for perceptual research
-
Rubin, P., Baer, T., and Mermelstein, P. An articulatory synthesizer for perceptual research. Journal of the Acoustical Society of America 70 (1981), 321-328.
-
(1981)
Journal of the Acoustical Society of America
, vol.70
, pp. 321-328
-
-
Rubin, P.1
Baer, T.2
Mermelstein, P.3
-
383
-
-
85009273523
-
A statistically motivated database pruning technique for unit selection synthesis
-
Rutten, P., Aylett, M., Fackrell, J., and Taylor, P. A statistically motivated database pruning technique for unit selection synthesis. In Proceedings of the ICSLP (2002), pp. 125-128.
-
(2002)
Proceedings of the ICSLP
, pp. 125-128
-
-
Rutten, P.1
Aylett, M.2
Fackrell, J.3
Taylor, P.4
-
385
-
-
85009231318
-
Discriminative weight training for unit-selection based speech synthesis
-
Park, C. K. K., and Kim, N. S. Discriminative weight training for unit-selection based speech synthesis. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Park, C.1
Kim, N.S.2
-
388
-
-
0029747042
-
High-quality speech synthesis using context-dependent syllabic units
-
Saito, T., Hashimoto, Y., and Sakamoto, M. High-quality speech synthesis using context-dependent syllabic units. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 1996 (1996).
-
(1996)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 1996
-
-
Saito, T.1
Hashimoto, Y.2
Sakamoto, M.3
-
389
-
-
85009143747
-
Stress assignment in Spanish proper names
-
San-Segundo, R., Montero, J. M., Cordoba, R., and Gutierrez-Arriola, J. Stress assignment in Spanish proper names. In Proceedings of the International Conference on Speech and Language Processing 2000 (2000).
-
(2000)
Proceedings of the International Conference on Speech and Language Processing 2000
-
-
San-Segundo, R.1
Montero, J.M.2
Cordoba, R.3
Gutierrez-Arriola, J.4
-
390
-
-
85009084534
-
Two features to check phonetic transcriptions in text to speech systems
-
Sandri, S., and Zovato, E. Two features to check phonetic transcriptions in text to speech systems. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Sandri, S.1
Zovato, E.2
-
393
-
-
0002697853
-
Three dimensions of emotion
-
Schlossberg, H. Three dimensions of emotion. Psychological Review 61, 2 (1954), 81-88.
-
(1954)
Psychological Review
, vol.61
, Issue.2
, pp. 81-88
-
-
Schlossberg, H.1
-
394
-
-
85032409834
-
Speech parameter generation algorithm considering global variance for HMM-based speech synthesis
-
Schweitzer, A. Speech parameter generation algorithm considering global variance for HMM-based speech synthesis. In Proceedings of Eurospeech, Interspeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech, Interspeech 2005
-
-
Schweitzer, A.1
-
395
-
-
0004097793
-
-
Cambridge: Cambridge University Press
-
Searle, J. R. Speech Acts. Cambridge: Cambridge University Press (1969).
-
(1969)
Speech Acts
-
-
Searle, J.R.1
-
396
-
-
0000383868
-
Parallel networks that learn to pronounce English text
-
Sejnowski, T., and Rosenberg, C. Parallel networks that learn to pronounce English text. Complex Systems 1, 1 (1987), 145-168.
-
(1987)
Complex Systems
, vol.1
, Issue.1
, pp. 145-168
-
-
Sejnowski, T.1
Rosenberg, C.2
-
400
-
-
0000163989
-
A mathematical theory of communication
-
Shannon, C. E. A mathematical theory of communication. Bell System Technical Journal (1948).
-
(1948)
Bell System Technical Journal
-
-
Shannon, C.E.1
-
402
-
-
85032406881
-
Prosodic phrasing with inductive learning
-
Sheng, Z., Jianhua, T., and Lianhong, C. Prosodic phrasing with inductive learning. In Proceedings of the International Conference on Spoken Language Processing, Interspeech 2002 (2002).
-
(2002)
Proceedings of the International Conference on Spoken Language Processing, Interspeech 2002
-
-
Sheng, Z.1
Jianhua, T.2
Lianhong, C.3
-
407
-
-
33947703664
-
The CU-HTK Mandarin Broadcast News transcription system
-
et al
-
Sinha, R., Gales, M. J. F., Kim, D. Y. et al. The CU-HTK Mandarin Broadcast News transcription system. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2006 (2006).
-
(2006)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2006
-
-
Sinha, R.1
Gales, M.2
Kim, D.Y.3
-
410
-
-
0028405433
-
English noun-phrase accent prediction for text-to-speech
-
Sproat, R. English noun-phrase accent prediction for text-to-speech. Computer Speech and Language 8 (1994), 79-94.
-
(1994)
Computer Speech and Language
, vol.8
, pp. 79-94
-
-
Sproat, R.1
-
414
-
-
85009212430
-
Experimental tools to evaluate intelligibility of text-to-speech (TTS) synthesis: Effects of voice gender and signal quality
-
Stevens, C., Lees, N., and Vonwiller, J. Experimental tools to evaluate intelligibility of text-to-speech (TTS) synthesis: Effects of voice gender and signal quality. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Stevens, C.1
Lees, N.2
Vonwiller, J.3
-
415
-
-
84955035459
-
A scale for the measurement of the psychological magnitude of pitch
-
Stevens, S., Volkman, J., and Newman, E. A scale for the measurement of the psychological magnitude of pitch. Journal of the Acoustical Society of America 8 (1937), 185-190.
-
(1937)
Journal of the Acoustical Society of America
, vol.8
, pp. 185-190
-
-
Stevens, S.1
Volkman, J.2
Newman, E.3
-
416
-
-
0001441770
-
Synthesis by word concatenation
-
Stober, K., Portele, T., Wagner, P., and Hess, W. Synthesis by word concatenation. In Proceedings of Eurospeech 1999 (1999).
-
(1999)
Proceedings of Eurospeech 1999
-
-
Stober, K.1
Portele, T.2
Wagner, P.3
Hess, W.4
-
417
-
-
0026881761
-
On the relation between voice source parameters and prosodic features in connected speech
-
Strik, H., and Boves, L. On the relation between voice source parameters and prosodic features in connected speech. Speech Communication 11, 2 (1992), 167-174.
-
(1992)
Speech Communication
, vol.11
, Issue.2
, pp. 167-174
-
-
Strik, H.1
Boves, L.2
-
422
-
-
0027813728
-
Novel-word pronunciation: A cross-language study
-
Sullivan, K., and Damper, R. Novel-word pronunciation: A cross-language study. Speech Communication 13, 3-4 (1993), 441-452.
-
(1993)
Speech Communication
, vol.13
, Issue.3-4
, pp. 441-452
-
-
Sullivan, K.1
Damper, R.2
-
423
-
-
85009106023
-
Intonational phrase break prediction using decision tree and n-gram model
-
Sun, X., and Applebaum, T. H. Intonational phrase break prediction using decision tree and n-gram model. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Sun, X.1
Applebaum, T.H.2
-
424
-
-
0035283546
-
The effect of speech melody on voice quality
-
Swerts, M., and Veldhuis, R. The effect of speech melody on voice quality. Speech Communication 33, 4 (2001), 297-303.
-
(2001)
Speech Communication
, vol.33
, Issue.4
, pp. 297-303
-
-
Swerts, M.1
Veldhuis, R.2
-
425
-
-
0026880274
-
Evaluation of speech synthesis techniques in a comprehension task
-
Sydeserff, H. A., Caley, R. J., Isard, S. D. et al. Evaluation of speech synthesis techniques in a comprehension task. Speech Communication 11, 2-3 (1992), 189-194.
-
(1992)
Speech Communication
, vol.11
, Issue.2-3
, pp. 189-194
-
-
Sydeserff, H.A.1
Caley, R.J.2
Isard, S.D.3
-
426
-
-
85009151528
-
Prosodic effects on listener detection of vowel concatenation
-
Syrdal, A. K. Prosodic effects on listener detection of vowel concatenation. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Syrdal, A.K.1
-
429
-
-
29144475179
-
Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing
-
Tachibana, M., Yamagishi, J., Masuko, T., and Kobayashi, T. Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing. IEICE Transactions on Information and Systems E88-D, 11 (2005), 2484-2491.
-
(2005)
IEICE Transactions on Information and Systems
, vol.E88-D
, Issue.11
, pp. 2484-2491
-
-
Tachibana, M.1
Yamagishi, J.2
Masuko, T.3
Kobayashi, T.4
-
430
-
-
44949120030
-
Voice and emotional expression transformation based on statistics of vowel parameters in an emotional speech database
-
et al
-
Takahashi, T., Takeshi, F., Nishi, M. et al. Voice and emotional expression transformation based on statistics of vowel parameters in an emotional speech database. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Takahashi, T.1
Takeshi, F.2
Nishi, M.3
-
431
-
-
0001455934
-
A robust algorithm for pitch tracking (RAPT)
-
W. B. Kleijn and K. K. Paliwal, Eds. Amsterdam: Elsevier
-
Talkin, D. A robust algorithm for pitch tracking (RAPT). In Speech Coding and Synthesis, W. B. Kleijn and K. K. Paliwal, Eds. Amsterdam: Elsevier (1995), pp. 495-518.
-
(1995)
Speech Coding and Synthesis
, pp. 495-518
-
-
Talkin, D.1
-
433
-
-
34547547622
-
Hidden Markov models for grapheme to phoneme conversion
-
Taylor, P. Hidden Markov models for grapheme to phoneme conversion. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Taylor, P.1
-
435
-
-
0028529843
-
The rise/fall/connection model of intonation
-
Taylor, P. A. The rise/fall/connection model of intonation. Speech Communication 15 (1995), 169-186.
-
(1995)
Speech Communication
, vol.15
, pp. 169-186
-
-
Taylor, P.A.1
-
436
-
-
0034008810
-
Analysis and synthesis of intonation using the tilt model
-
Taylor, P. A. Analysis and synthesis of intonation using the tilt model. Journal of the Acoustical Society of America 107, 4 (2000), 1697-1714.
-
(2000)
Journal of the Acoustical Society of America
, vol.107
, Issue.4
, pp. 1697-1714
-
-
Taylor, P.A.1
-
440
-
-
85135272129
-
Speech synthesis by phonological structure matching
-
Taylor, P. A., and Black, A. W. Speech synthesis by phonological structure matching. In Proceedings of Eurospeech 1999 (1999), pp. 623-626.
-
(1999)
Proceedings of Eurospeech
, vol.1999
, pp. 623-626
-
-
Taylor, P.A.1
Black, A.W.2
-
441
-
-
0035155093
-
Heterogeneous relation graphs as a mechanism for representing linguistic information
-
Taylor, P. A., Black, A. W., and Caley, R. J. Heterogeneous relation graphs as a mechanism for representing linguistic information. Speech Communication special issue on annotation, 1-2 (2000), 153-174.
-
(2000)
Speech Communication special issue on annotation
, Issue.1-2
, pp. 153-174
-
-
Taylor, P.A.1
Black, A.W.2
Caley, R.J.3
-
442
-
-
0347128737
-
Intonation and dialogue context as constraints for speech recognition
-
Taylor, P. A., King, S., Isard, S. D., and Wright, H. Intonation and dialogue context as constraints for speech recognition. Language and Speech 41, 3-4 (1998), 491-512.
-
(1998)
Language and Speech
, vol.41
, Issue.3-4
, pp. 491-512
-
-
Taylor, P.A.1
King, S.2
Isard, S.D.3
Wright, H.4
-
443
-
-
85135152214
-
Using intonation to constrain language models in speech recognition
-
Taylor, P. A., King, S., Isard, S. D., Wright, H., and Kowtko, J. Using intonation to constrain language models in speech recognition. In Proceedings of Eurospeech 1997 (1997).
-
(1997)
Proceedings of Eurospeech 1997
-
-
Taylor, P.A.1
King, S.2
Isard, S.D.3
Wright, H.4
Kowtko, J.5
-
444
-
-
85032400113
-
A real time speech synthesis system
-
Taylor, P. A., Nairn, I. A., Sutherland, A. M., and Jack, M. A. A real time speech synthesis system. In Proceedings of Eurospeech 1991 (1991).
-
(1991)
Proceedings of Eurospeech 1991
-
-
Taylor, P.A.1
Nairn, I.A.2
Sutherland, A.M.3
Jack, M.A.4
-
445
-
-
0002872229
-
Intonation by rule: A perceptual quest
-
’t Hart, J., and Cohen, A. Intonation by rule: A perceptual quest. Journal of Phonetics 1 (1973), 309-327.
-
(1973)
Journal of Phonetics
, vol.1
, pp. 309-327
-
-
’t Hart, J.1
Cohen, A.2
-
446
-
-
0000372476
-
Integrating different levels of intonation analysis
-
’t Hart, J., and Collier, R. Integrating different levels of intonation analysis. Journal of Phonetics 3 (1975), 235-255.
-
(1975)
Journal of Phonetics
, vol.3
, pp. 235-255
-
-
’t Hart, J.1
Collier, R.2
-
447
-
-
85032420364
-
Symbolic prosody driven unit selection for highly natural synthetic speech
-
Tihelka, D. Symbolic prosody driven unit selection for highly natural synthetic speech. In Proceedings of Eurospeech, Interspeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech, Interspeech 2005
-
-
Tihelka, D.1
-
448
-
-
4544270859
-
Optimizing sub-cost functions for segment selection based on perceptual evaluations in concatenative speech synthesis
-
Toda, T., Kawai, H., and Tsuzaki, M. Optimizing sub-cost functions for segment selection based on perceptual evaluations in concatenative speech synthesis. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2004 (2004).
-
(2004)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2004
-
-
Toda, T.1
Kawai, H.2
Tsuzaki, M.3
-
449
-
-
0141590580
-
Segment selection considering local degradation of naturalness in concatenative speech synthesis
-
Toda, T., Kawai, H., Tsuzaki, M., and Shikano, K. Segment selection considering local degradation of naturalness in concatenative speech synthesis. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2003 (2003).
-
(2003)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2003
-
-
Toda, T.1
Kawai, H.2
Tsuzaki, M.3
Shikano, K.4
-
450
-
-
33846410497
-
Speech parameter generation algorithm considering global variance for HMM-based speech synthesis
-
Toda, T., and Tokuda, K. Speech parameter generation algorithm considering global variance for HMM-based speech synthesis. In Proceedings of Eurospeech, Interspeech 2005 (2005).
-
(2005)
Proceedings of Eurospeech, Interspeech 2005
-
-
Toda, T.1
Tokuda, K.2
-
451
-
-
0028996993
-
Speech parameter generation from HMM using dynamic features
-
Tokuda, K., Kobayashi, T., and Imai, S. Speech parameter generation from HMM using dynamic features. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 1995 (1995).
-
(1995)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 1995
-
-
Tokuda, K.1
Kobayashi, T.2
Imai, S.3
-
452
-
-
85031628788
-
An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features
-
Tokuda, K., Masuko, T., and Yamada, T. An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features. In Proceedings of Eurospeech 1995 (1995).
-
(1995)
Proceedings of Eurospeech 1995
-
-
Tokuda, K.1
Masuko, T.2
Yamada, T.3
-
453
-
-
0033708106
-
Speech parameter generation algorithms for HMM-based speech synthesis
-
Tokuda, K., Yoshimura, T., Masuko, T., Kobayashi, T., and Kitamura, T. Speech parameter generation algorithms for HMM-based speech synthesis. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2000 (2000).
-
(2000)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2000
-
-
Tokuda, K.1
Yoshimura, T.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
454
-
-
85009231267
-
Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features
-
Tokuda, K., Zen, H., and Kitamura, T. Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Tokuda, K.1
Zen, H.2
Kitamura, T.3
-
456
-
-
85009083859
-
Feature extraction by auditory modeling for unit selection in concatenative speech synthesis
-
Tsuzaki, M. Feature extraction by auditory modeling for unit selection in concatenative speech synthesis. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Tsuzaki, M.1
-
457
-
-
85032413724
-
Constructing emotional speech synthesizers with limited speech database
-
Tsuzuki, R., Zen, H., Tokuda, K. et al. Constructing emotional speech synthesizers with limited speech database. In Proceedings of Interspeech 2004 (2004).
-
(2004)
Proceedings of Interspeech 2004
-
-
Tsuzuki, R.1
Zen, H.2
Tokuda, K.3
-
458
-
-
85089833692
-
Voice quality interpolation for emotional speech synthesis
-
Turk, O., Schroder, M., Bozkurt, B., and Arslan, L. Voice quality interpolation for emotional speech synthesis. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Turk, O.1
Schroder, M.2
Bozkurt, B.3
Arslan, L.4
-
459
-
-
85032410013
-
Grapheme-to-phoneme conversion using pseudo-morphological units
-
Uebler, U. Grapheme-to-phoneme conversion using pseudo-morphological units. In Proceedings of Interspeech 2002 (2002).
-
(2002)
Proceedings of Interspeech 2002
-
-
Uebler, U.1
-
460
-
-
85032410315
-
A method for fully automatic analysis and modelling of voice source characteristics
-
Darsinos, D. G., and Kokkinakis, G. A method for fully automatic analysis and modelling of voice source characteristics. In Proceedings of Eurospeech 1995 (1995).
-
(1995)
Proceedings of Eurospeech 1995
-
-
Darsinos, D.G.1
Kokkinakis, G.2
-
461
-
-
0032118314
-
Towards a blackboard model of accenting
-
Van Deemter, K. Towards a blackboard model of accenting. Computer Speech and Language 12, 3 (1998), 143-164.
-
(1998)
Computer Speech and Language
, vol.12
, Issue.3
, pp. 143-164
-
-
Van Deemter, K.1
-
462
-
-
85009080290
-
Evaluation of pros-3 for the assignment of prosodic structure, compared to assignment by human experts
-
van Herwijnen, O., and Terken, J. Evaluation of pros-3 for the assignment of prosodic structure, compared to assignment by human experts. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Van Herwijnen, O.1
Terken, J.2
-
463
-
-
0028405296
-
Assignment of segmental duration in text-to-speech synthesis
-
van Santen, J. Assignment of segmental duration in text-to-speech synthesis. Computer Speech and Language 8 (1994), 95-128.
-
(1994)
Computer Speech and Language
, vol.8
, pp. 95-128
-
-
Van Santen, J.1
-
465
-
-
85009187535
-
Applications and computer generated expressive speech for communication disorders
-
van Santen, J., Black, L., Cohen, G. et al. Applications and computer generated expressive speech for communication disorders. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Van Santen, J.1
Black, L.2
Cohen, G.3
-
466
-
-
21844466234
-
Synthesis of prosody using multi-level unit sequences
-
van Santen, J., Kain, A., Klabbers, E., and Mishra, T. Synthesis of prosody using multi-level unit sequences. Speech Communication 46 (2005), 365-375.
-
(2005)
Speech Communication
, vol.46
, pp. 365-375
-
-
Van Santen, J.1
Kain, A.2
Klabbers, E.3
Mishra, T.4
-
469
-
-
85009167944
-
Kalman-filter based join cost for unit-selection speech synthesis
-
Vepa, J., and King, S. Kalman-filter based join cost for unit-selection speech synthesis. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Vepa, J.1
King, S.2
-
472
-
-
85128348481
-
Automatic prosodic labeling of 6 languages
-
Vereecken, H., Martens, J., Grover, C., Fackrell, J., and Van Coile, B. Automatic prosodic labeling of 6 languages. In Proceedings of the International Conference on Speech and Language Processing 1998 (1998), pp. 1399-1402.
-
(1998)
Proceedings of the International Conference on Speech and Language Processing
, vol.1998
, pp. 1399-1402
-
-
Vereecken, H.1
Martens, J.2
Grover, C.3
Fackrell, J.4
Van Coile, B.5
-
473
-
-
85009061166
-
A unified view on synchronized overlap-add methods for prosodic modification of speech
-
Verhelst, W., Compernolle, D. V., and Wambacq, P. A unified view on synchronized overlap-add methods for prosodic modification of speech. In Proceedings of the International Conference on Spoken Language Processing 2000 (2000), vol. 2, pp. 63-66.
-
(2000)
Proceedings of the International Conference on Spoken Language Processing
, vol.2
, pp. 63-66
-
-
Verhelst, W.1
Compernolle, D.V.2
Wambacq, P.3
-
474
-
-
0032296808
-
A stochastic model of intonation for text-to-speech synthesis
-
Véronis, J., Di Cristo, P., Courtois, F., and Chaumette, C. A stochastic model of intonation for text-to-speech synthesis. Speech Communication 26, 4 (1998), 233-244.
-
(1998)
Speech Communication
, vol.26
, Issue.4
, pp. 233-244
-
-
Véronis, J.1
Di Cristo, P.2
Courtois, F.3
Chaumette, C.4
-
475
-
-
33745214458
-
Estimation of LF glottal source parameters based on an ARX model
-
Vincent, D., Rosec, O., and Chonavel, T. Estimation of LF glottal source parameters based on an ARX model. In Proceedings of Eurospeech, Interspeech 2005 (2005), pp. 333-336.
-
(2005)
Proceedings of Eurospeech, Interspeech
, pp. 333-336
-
-
Vincent, D.1
Rosec, O.2
Chonavel, T.3
-
476
-
-
84935113569
-
Error bounds for convolutional codes and an asymptotically optimum decoding algorithm
-
Viterbi, A. J. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory 13, 2 (1967), 260-269.
-
(1967)
IEEE Transactions on Information Theory
, vol.13
, Issue.2
, pp. 260-269
-
-
Viterbi, A.J.1
-
478
-
-
85032419083
-
Comprehension of prosody in synthesized speech
-
Vonwiller, J. P., King, R. W., Stevens, K., and Latimer, C. R. Comprehension of prosody in synthesized speech. In Proceedings of the Third International Australian Conference on Speech Science and Technology (1990).
-
(1990)
Proceedings of the Third International Australian Conference on Speech Science and Technology
-
-
Vonwiller, J.P.1
King, R.W.2
Stevens, K.3
Latimer, C.R.4
-
479
-
-
85009070758
-
Use of clustering information for coarticulation compensation in speech synthesis by word concatenation
-
Vosnidis, C., and Digalakis, V. Use of clustering information for coarticulation compensation in speech synthesis by word concatenation. In Proceedings of Eurospeech 2001 (2001).
-
(2001)
Proceedings of Eurospeech 2001
-
-
Vosnidis, C.1
Digalakis, V.2
-
481
-
-
4544373879
-
Refining segmental boundaries for TTS database using fine contextual-dependent boundary models
-
Wang, L., Zhao, Y., Chu, M., Zhou, J., and Cao, Z. Refining segmental boundaries for TTS database using fine contextual-dependent boundary models. In Proceedings of the International Conference on Acoustics Speech and Signal Processing 2004 (2004).
-
(2004)
Proceedings of the International Conference on Acoustics Speech and Signal Processing 2004
-
-
Wang, L.1
Zhao, Y.2
Chu, M.3
Zhou, J.4
Cao, Z.5
-
482
-
-
85032401540
-
Improving letter-to-pronunciation accuracy with automatic morphologically-based stress prediction
-
Webster, G. Improving letter-to-pronunciation accuracy with automatic morphologically-based stress prediction. In Proceedings of Interspeech 2004 (2004).
-
(2004)
Proceedings of Interspeech 2004
-
-
Webster, G.1
-
484
-
-
85032421589
-
A new parametric speech analysis and synthesis technique in the frequency domain
-
Wer, B. R., Leroux, A., Delbrouck, H. P., and Leclercs, J. A new parametric speech analysis and synthesis technique in the frequency domain. In Proceedings of Eurospeech 1995 (1995).
-
(1995)
Proceedings of Eurospeech 1995
-
-
Wer, B.R.1
Leroux, A.2
Delbrouck, H.P.3
Leclercs, J.4
-
486
-
-
21844471192
-
ToBI or not ToBI
-
Wightman, C. ToBI or not ToBI. Speech Prosody (2002), 25-29.
-
(2002)
Speech Prosody
, pp. 25-29
-
-
Wightman, C.1
-
488
-
-
84925037939
-
-
Main page - Wikipedia, the free encyclopedia, [Online; accessed 30 January 2007]
-
Wikipedia. Main page - Wikipedia, the free encyclopedia (2007). [Online; accessed 30 January 2007].
-
(2007)
-
-
-
489
-
-
0029000585
-
Physiological modeling of speech production: Methods for modeling soft-tissue articulators
-
Wilhelms-Tricarico, R. Physiological modeling of speech production: Methods for modeling soft-tissue articulators. Journal of the Acoustical Society of America 97, 5 (1995), 3085-3098.
-
(1995)
Journal of the Acoustical Society of America
, vol.97
, Issue.5
, pp. 3085-3098
-
-
Wilhelms-Tricarico, R.1
-
490
-
-
0029704037
-
A biomechanical and physiologically-based vocal tract model and its control
-
Wilhelms-Tricarico, R. A biomechanical and physiologically-based vocal tract model and its control. Journal of Phonetics 24 (1996), 23-28.
-
(1996)
Journal of Phonetics
, vol.24
, pp. 23-28
-
-
Wilhelms-Tricarico, R.1
-
493
-
-
84926270981
-
A model of standard English intonation patterns
-
Willems, N. J. A model of standard English intonation patterns. IPO Annual Progress Report (1983).
-
(1983)
IPO Annual Progress Report
-
-
Willems, N.J.1
-
494
-
-
0036461035
-
Large scale discriminative training of hidden Markov models for speech recognition
-
Woodland, P. C., and Povey, D. Large scale discriminative training of hidden Markov models for speech recognition. Computer Speech and Language 16 (2002), 25-47.
-
(2002)
Computer Speech and Language
, vol.16
, pp. 25-47
-
-
Woodland, P.C.1
Povey, D.2
-
497
-
-
0040494557
-
An investigation of sagittal velar movement and its correlation with lip, tongue and jaw movement
-
Wrench, A. A. An investigation of sagittal velar movement and its correlation with lip, tongue and jaw movement. In Proceedings of the International Congress of Phonetic Sciences (1999), pp. 435-438.
-
(1999)
Proceedings of the International Congress of Phonetic Sciences
, pp. 435-438
-
-
Wrench, A.A.1
-
498
-
-
4544270860
-
Minimum segmentation error based discriminative training for speech synthesis application
-
Wu, Y., Kawai, H., Ni, J., and Wang, R.-H. Minimum segmentation error based discriminative training for speech synthesis application. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2004 (2004).
-
(2004)
Proceedings of the International Conference on Acoustics, Speech, and Signal Processing 2004
-
-
Wu, Y.1
Kawai, H.2
Ni, J.3
Wang, R.-H.4
-
499
-
-
85009228508
-
On unit analysis for Cantonese corpus-based TTS
-
Xu, J., Choy, T., Dong, M., Guan, C., and Li, H. On unit analysis for Cantonese corpus-based TTS. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Xu, J.1
Choy, T.2
Dong, M.3
Guan, C.4
Li, H.5
-
500
-
-
21844447847
-
Speech melody as articulatory implemented communicative functions
-
Xu, Y. Speech melody as articulatory implemented communicative functions. Speech Communication 46 (2005), 220-251.
-
(2005)
Speech Communication
, vol.46
, pp. 220-251
-
-
Xu, Y.1
-
501
-
-
79959816265
-
Speech prosody as articulated communicative functions
-
Xu, Y. Speech prosody as articulated communicative functions. In Proceedings of Speech Prosody 2006 (2006).
-
(2006)
Proceedings of Speech Prosody 2006
-
-
Xu, Y.1
-
502
-
-
85135109865
-
ATR - v-TALK speech synthesis system
-
Sagisaka, Y. Kaiki, N. I., and Mimura, K. ATR - v-TALK speech synthesis system. In Proceedings of the International Conference on Speech and Language Processing 1992 (1992), vol. 1, pp. 483-486.
-
(1992)
Proceedings of the International Conference on Speech and Language Processing
, vol.1
, pp. 483-486
-
-
Sagisaka, Y.K.1
Mimura, K.2
-
504
-
-
85009177437
-
Modeling of various speaking styles and emotions for HMM-based speech synthesis
-
Yamagishi, J., Onishi, K., Masuko, T., and Kobayashi, T. Modeling of various speaking styles and emotions for HMM-based speech synthesis. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Yamagishi, J.1
Onishi, K.2
Masuko, T.3
Kobayashi, T.4
-
506
-
-
85032399236
-
Homograph disambiguation in text-to-speech synthesis
-
Yarowsky, D. Homograph disambiguation in text-to-speech synthesis. In Computer Speech and Language (1996).
-
(1996)
Computer Speech and Language
-
-
Yarowsky, D.1
-
508
-
-
85009139544
-
Simultaneous modelling of spectrum, pitch and duration in HMM-based speech synthesis
-
Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., and Kitamura, T. Simultaneous modelling of spectrum, pitch and duration in HMM-based speech synthesis. In Proceedings of Eurospeech 1999 (1999).
-
(1999)
Proceedings of Eurospeech 1999
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
509
-
-
85009097254
-
Mixed excitation for HMM-based speech synthesis
-
Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., and Kitamura, T. Mixed excitation for HMM-based speech synthesis. In Proceedings of the European Conference on Speech Communication and Technology 2001, vol. 3 (2001), pp. 2259-2262.
-
(2001)
Proceedings of the European Conference on Speech Communication and Technology
, vol.3
, pp. 2259-2262
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
511
-
-
85032414707
-
Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems
-
Yu, Z.-L., Wang, K.-Z., Zu, Y.-Q., Yue, D.-J., and Chen, G.-L. Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems. In Proceedings of Interspeech 2004 (2004).
-
(2004)
Proceedings of Interspeech 2004
-
-
Yu, Z.-L.1
Wang, K.-Z.2
Zu, Y.-Q.3
Yue, D.-J.4
Chen, G.-L.5
-
513
-
-
33846446896
-
An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005
-
Zen, H., and Toda, T. An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Zen, H.1
Toda, T.2
-
514
-
-
85009111560
-
Hidden semi-Markov model based speech synthesis
-
Zen, H., Tokuda, K., Masuko, T., Kobayashi, T., and Kitamura, T. Hidden semi-Markov model based speech synthesis. In Proceedings of the 8th International Conference on Spoken Language Processing, Interspeech 2004 (2004).
-
(2004)
Proceedings of the 8th International Conference on Spoken Language Processing, Interspeech 2004
-
-
Zen, H.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
515
-
-
85009103324
-
Bayesian induction of intonational phrase breaks
-
Zervas, P., Maragoudakis, M., Fakotakis, N., and Kokkinakis, G. Bayesian induction of intonational phrase breaks. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Zervas, P.1
Maragoudakis, M.2
Fakotakis, N.3
Kokkinakis, G.4
-
517
-
-
85032412721
-
Refining phoneme segmentations using speaker-adaptive context dependent boundary models
-
Zhao, Y., Wang, L., Chu, M., Soong, F. K., and Cao, Z. Refining phoneme segmentations using speaker-adaptive context dependent boundary models. In Proceedings of Interspeech 2005 (2005).
-
(2005)
Proceedings of Interspeech 2005
-
-
Zhao, Y.1
Wang, L.2
Chu, M.3
Soong, F.K.4
Cao, Z.5
-
518
-
-
26944460802
-
Grapheme-to-phoneme conversion based on a fast TBL algorithm in Mandarin TTS systems
-
Berlin: Springer-Verlag
-
Zheng, M., Shi, Q., Zhang, W., and Cai, L. Grapheme-to-phoneme conversion based on a fast TBL algorithm in Mandarin TTS systems. Lecture Notes in Computer Science. Berlin: Springer-Verlag (2005), p. 600.
-
(2005)
Lecture Notes in Computer Science
, pp. 600
-
-
Zheng, M.1
Shi, Q.2
Zhang, W.3
Cai, L.4
-
519
-
-
4544350112
-
Glottal closure instant synchronous sinusoidal model for high quality speech analysis/synthesis
-
Zolfaghari, P., Nakatani, T., and Irino, T. Glottal closure instant synchronous sinusoidal model for high quality speech analysis/synthesis. In Proceedings of Eurospeech 2003 (2003).
-
(2003)
Proceedings of Eurospeech 2003
-
-
Zolfaghari, P.1
Nakatani, T.2
Irino, T.3
-
520
-
-
85009092350
-
Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesis
-
Zovato, E., Sandri, S., Quazza, S., and Badino, L. Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesis. In Proceedings of Interspeech 2004 (2004).
-
(2004)
Proceedings of Interspeech 2004
-
-
Zovato, E.1
Sandri, S.2
Quazza, S.3
Badino, L.4