-
1
-
-
33749543418
-
-
Acero, A., 1999. Formant analysis and synthesis using hidden Markov models. In: Proceedings of European Conference on Speech Communication and Technology'99. pp. 1047-1050.
-
-
-
-
2
-
-
0022890536
-
-
Bahl, L., Brown, P., de Souza, P., Mercer, R., 1986. Maximum mutual information estimation of hidden Markov model parameters for speech recognition. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'86. pp. 49-52.
-
-
-
-
3
-
-
0033677172
-
-
Bilmes, J., 2000. Factored sparse inverse covariance matrices. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing 2000, vol. 2. pp. 1009-1012.
-
-
-
-
4
-
-
0038021376
-
Buried Markov models: a graphical modeling approach for automatic speech recognition
-
Bilmes J. Buried Markov models: a graphical modeling approach for automatic speech recognition. Computer Speech and Language 17 2-3 (2003) 213-231
-
(2003)
Computer Speech and Language
, vol.17
, Issue.2-3
, pp. 213-231
-
-
Bilmes, J.1
-
5
-
-
33749552622
-
-
Black, A., Taylor, P., 1997. The festival speech synthesis system: system documentation. Tech. Rep. HCRC/TR-83, University of Edinburgh.
-
-
-
-
6
-
-
85009062911
-
-
Bridle, J., 2004. Towards better understanding of the model implied by the use of dynamic features in HMMs. In: Proceedings of International Conference on Spoken Language Processing 2004, vol. 1. pp. 725-728.
-
-
-
-
7
-
-
33749570542
-
-
Brown, P., 1987. The acoustic modeling problem in automatic speech recognition. Ph.D. thesis, Carnegie Mellon University.
-
-
-
-
9
-
-
0032119268
-
A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition
-
Deng L. A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition. Speech Communication 24 4 (1998) 299-323
-
(1998)
Speech Communication
, vol.24
, Issue.4
, pp. 299-323
-
-
Deng, L.1
-
10
-
-
0031185482
-
Speaker-independent phonetic classification using hidden Markov models with mixture of trend functions
-
Deng L., and Aksmanovic M. Speaker-independent phonetic classification using hidden Markov models with mixture of trend functions. IEEE Transactions on Speech & Audio Processing 5 4 (1997) 319-324
-
(1997)
IEEE Transactions on Speech & Audio Processing
, vol.5
, Issue.4
, pp. 319-324
-
-
Deng, L.1
Aksmanovic, M.2
-
11
-
-
0033623527
-
Spontaneous speech recognition using a statistical coarticulatory model for the hidden vocal-tract-resonance dynamics
-
Deng L., and Ma J. Spontaneous speech recognition using a statistical coarticulatory model for the hidden vocal-tract-resonance dynamics. Journal of the Acoustical Society of America 108 6 (2000) 3036-3048
-
(2000)
Journal of the Acoustical Society of America
, vol.108
, Issue.6
, pp. 3036-3048
-
-
Deng, L.1
Ma, J.2
-
12
-
-
0028516022
-
Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states
-
Deng L., Aksmanovic M., Sun X., and Wu J. Speech recognition using hidden Markov models with polynomial regression functions as nonstationary states. IEEE Transactions on Speech & Audio Processing 2 4 (1994) 507-520
-
(1994)
IEEE Transactions on Speech & Audio Processing
, vol.2
, Issue.4
, pp. 507-520
-
-
Deng, L.1
Aksmanovic, M.2
Sun, X.3
Wu, J.4
-
13
-
-
85009211881
-
-
Deng, L., Bazzi, L., Acero, A., 2003. Tracking vocal tract resonances using an analytical nonlinear predictor and a target-guided temporal constraint. In: Proceedings of European Conference on Speech Communication and Technology, 2003. pp. 73-76.
-
-
-
-
14
-
-
0026397875
-
-
Digalakis, V., Rohlicek, J., Ostendorf, M., 1991. A dynamical system approach to continuous speech recognition. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'91, pp. 282-292.
-
-
-
-
15
-
-
33749552390
-
-
Donovan, R., Eide, E., 1998. The IBM trainable speech synthesis system. In: Proceedings of International Conference on Spoken Language Processing'98, vol. 5. pp. 1703-1706.
-
-
-
-
16
-
-
0028996983
-
-
Donovan, R., Woodland, P., 1995. Automatic speech synthesizer parameter estimation using HMMs. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'95. pp. 640-643.
-
-
-
-
17
-
-
85016140477
-
-
Fukada, T., Tokuda, K., Kobayashi, T., Imai, S., 1992. An adaptive algorithm for mel-cepstral analysis of speech. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'92, vol. 1. pp. 137-140.
-
-
-
-
18
-
-
0022667694
-
Speaker independent isolated word recognition using dynamic features of speech spectrum
-
Furui S. Speaker independent isolated word recognition using dynamic features of speech spectrum. IEEE Transactions on Acoustics, Speech, & Signal Processing 34 (1986) 52-59
-
(1986)
IEEE Transactions on Acoustics, Speech, & Signal Processing
, vol.34
, pp. 52-59
-
-
Furui, S.1
-
19
-
-
0032050110
-
Maximum likelihood linear transformations for HMM-based speech recognition
-
Gales M. Maximum likelihood linear transformations for HMM-based speech recognition. Computer Speech & Language 12 2 (1998) 75-98
-
(1998)
Computer Speech & Language
, vol.12
, Issue.2
, pp. 75-98
-
-
Gales, M.1
-
20
-
-
0032638856
-
Semi-tied covariance matrices for hidden Markov models
-
Gales M. Semi-tied covariance matrices for hidden Markov models. IEEE Transactions on Speech & Audio Processing 7 3 (1999) 272-281
-
(1999)
IEEE Transactions on Speech & Audio Processing
, vol.7
, Issue.3
, pp. 272-281
-
-
Gales, M.1
-
21
-
-
33749543199
-
-
Gales, M., Airey, S., 2003. Product of Gaussians for speech recognition. Tech. Rep. CUED/F-INFENG/TR.458, Cambridge University.
-
-
-
-
22
-
-
33749555773
-
-
Gales, M., Young, S., 1993. The theory of segmental hidden Markov models. Tech. Rep. CUED/F-INFENG/TR.133, Cambridge University.
-
-
-
-
23
-
-
0028419019
-
Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
-
Gauvain J., and Lee C.-H. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Transactions on Speech & Audio Processing 2 2 (1994) 291-298
-
(1994)
IEEE Transactions on Speech & Audio Processing
, vol.2
, Issue.2
, pp. 291-298
-
-
Gauvain, J.1
Lee, C.-H.2
-
24
-
-
0030371122
-
-
Gish, H., Ng, K., 1996. Parametric trajectory models for speech recognition. In: Proceedings of International Conference on Spoken Language Processing'96, vol. 1. pp. 466-469.
-
-
-
-
25
-
-
33749579647
-
-
Hinton, G., 1999. Product of experts. In: Proceedings of ICANN, vol. 1. pp. 1-6.
-
-
-
-
28
-
-
0022097649
-
Maximum likelihood estimation for mixture of multivariate stochastic observations of Markov chains
-
Juang B.-H. Maximum likelihood estimation for mixture of multivariate stochastic observations of Markov chains. AT&T Technical Journal 64 6 (1985) 1235-1249
-
(1985)
AT&T Technical Journal
, vol.64
, Issue.6
, pp. 1235-1249
-
-
Juang, B.-H.1
-
30
-
-
0031103570
-
Frame-correlated hidden Markov model based on extended logarithmic pool
-
Kim N.-S., and Un C.-K. Frame-correlated hidden Markov model based on extended logarithmic pool. IEEE Transactions on Speech & Audio Processing 5 2 (1997) 149-160
-
(1997)
IEEE Transactions on Speech & Audio Processing
, vol.5
, Issue.2
, pp. 149-160
-
-
Kim, N.-S.1
Un, C.-K.2
-
31
-
-
0032649321
-
-
Kobayashi, T., Masumitsu, K., Furuyama, J., 1999. Partly hidden Markov model and its application to speech recognition. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'99, vol. 1. pp. 121-124.
-
-
-
-
32
-
-
33749564954
-
-
Koishida, K., Tokuda, K., Masuko, T., Kobayashi, T., 1997. Vector quantization of speech spectral parameters using statistics of dynamic features. In: Proceedings of International Conference on Signal Processing'97. pp. 247-252.
-
-
-
-
33
-
-
33749571719
-
-
Kominek, J., Black, A., 2003. CMU ARCTIC databases for speech synthesis. Tech. Rep. CMU-LTI-03-177, Carnegie Mellon University.
-
-
-
-
34
-
-
0034320005
-
Rapid speaker adaptation in eigenvoice space
-
Kuhn R., Junqua J., Nguyen P., and Niedzielski N. Rapid speaker adaptation in eigenvoice space. IEEE Transactions on Speech and Audio Processing 8 6 (2000) 695-707
-
(2000)
IEEE Transactions on Speech and Audio Processing
, vol.8
, Issue.6
, pp. 695-707
-
-
Kuhn, R.1
Junqua, J.2
Nguyen, P.3
Niedzielski, N.4
-
35
-
-
0025475528
-
ATR Japanese speech database as a tool of speech recognition and synthesis
-
Kurematsu A., Takeda K., Sagisaka Y., Katagiri S., Kuwabara H., and Shikano K. ATR Japanese speech database as a tool of speech recognition and synthesis. Speech Communication 9 (1990) 357-363
-
(1990)
Speech Communication
, vol.9
, pp. 357-363
-
-
Kurematsu, A.1
Takeda, K.2
Sagisaka, Y.3
Katagiri, S.4
Kuwabara, H.5
Shikano, K.6
-
36
-
-
0026370307
-
-
Lee, C.-H., Giachin, E., 1991. Improved acoustic modeling for speaker independent large vocabulary continuous speech recognition. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing, vol. 1. pp. 161-164.
-
-
-
-
37
-
-
4544260432
-
-
Lee, L., Attias, H., Deng, L., Fieguth, P., 2004. A multimodal variational approach to learning and inference in switching state space models. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing 2004. pp. 505-508.
-
-
-
-
38
-
-
0346892469
-
Automatic speech segmentation for concatenative inventory selection
-
van Santen J., Sproat R., Olive J., and Hirschberg J. (Eds), Springer-Verlag
-
Ljolje A., Hirschberg J., and van Santen J. Automatic speech segmentation for concatenative inventory selection. In: van Santen J., Sproat R., Olive J., and Hirschberg J. (Eds). Progress in Speech Synthesis (1997), Springer-Verlag 305-311
-
(1997)
Progress in Speech Synthesis
, pp. 305-311
-
-
Ljolje, A.1
Hirschberg, J.2
van Santen, J.3
-
39
-
-
33749556047
-
-
Ma, J., 2000. Spontaneous speech recognition using statistical dynamic models for the vocal tract resonance dynamics. Ph.D. thesis, University of Waterloo.
-
-
-
-
40
-
-
0742307392
-
Target-directed mixture linear dynamic models for spontaneous speech recognition
-
Ma J., and Deng L. Target-directed mixture linear dynamic models for spontaneous speech recognition. IEEE Transactions on Speech & Audio Processing 12 1 (2004) 47-58
-
(2004)
IEEE Transactions on Speech & Audio Processing
, vol.12
, Issue.1
, pp. 47-58
-
-
Ma, J.1
Deng, L.2
-
41
-
-
0036293703
-
-
Minami, Y., McDermott, E., Nakamura, A., Katagiri, S., 2002. A recognition method with parametric trajectory synthesized using direct relations between static and dynamic feature vector time series. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing 2002, vol. 1. pp. 957-960.
-
-
-
-
42
-
-
0141480073
-
-
Minami, Y., McDermott, E., Nakamura, A., Katagiri, S., 2003. Recognition method with parametric trajectory generated from mixture distribution HMMs. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing, 2003, vol. 1. pp. 124-127.
-
-
-
-
43
-
-
33645796901
-
-
Minami, Y., McDermott, E., Nakamura, A., Katagiri, S., 2004. A theoretical analysis of speech recognition based on feature trajectory models. In: Proceedings of International Conference on Spoken Language Processing 2004, vol. 1. pp. 549-552.
-
-
-
-
44
-
-
0036522866
-
A survey on automatic speech recognition
-
Nakagawa S. A survey on automatic speech recognition. IEICE Transactions on Information & Systems E85-D 3 (2002) 465-486
-
(2002)
IEICE Transactions on Information & Systems
, vol.E85-D
, Issue.3
, pp. 465-486
-
-
Nakagawa, S.1
-
46
-
-
33749565410
-
-
Odell, J., 1995. The use of context in large vocabulary speech recognition. Ph.D. thesis, Cambridge University.
-
-
-
-
50
-
-
18544404092
-
-
Paliwal, K., 1993. Use of temporal correlation between successive frames in hidden Markov model based speech recognizer. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'93. pp. 215-218.
-
-
-
-
51
-
-
0032639922
-
-
Picone, J., Pike, S., Regan, R., Kamm, T., Bridle, J., Deng, L., Ma, Z., Richards, H., Schuster, M., 1999. Initial evaluation of hidden dynamic models on conversational speech. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing, vol. 1. pp. 109-112.
-
-
-
-
52
-
-
0032627031
-
-
Qing, G., Fang, Z., Jian, W., Wenhu, W., 1999. A new method used in HMM for modeling frame correlation. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'99, vol. 1. pp. 169-172.
-
-
-
-
53
-
-
0024610919
-
-
Rabiner, L., 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, vol. 77. pp. 257-286.
-
-
-
-
54
-
-
0032675736
-
-
Richards, H., Bridle, J., 1999. The HDM: a segmental hidden dynamic model of coarticulation. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'99, vol. 1. pp. 357-360.
-
-
-
-
56
-
-
33749577566
-
-
Rosti, A., Gales, M., 2003. Switching linear dynamical systems for speech recognition. Tech. Rep. CUED/F-INFENG/TR.461, Cambridge University.
-
-
-
-
57
-
-
0027228741
-
-
Russell, M., 1993. A segmental HMM for speech pattern modeling. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'93. pp. 499-502.
-
-
-
-
58
-
-
33749580231
-
-
Sagayama, S., Itakura, F., 1979. On individuality in a dynamic measure of speech. In: Proceedings of Spring Conference of Acoustical Society of Japan. pp. 589-590 (in Japanese).
-
-
-
-
59
-
-
85009257840
-
-
Shichiri, K., Sawabe, A., Tokuda, K., Masuko, T., Kobayashi, T., Kitamura, T., 2002. Eigenvoices for HMM-based speech synthesis. In: Proceedings of International Conference on Spoken Language Processing 2002. pp. 1269-1272.
-
-
-
-
60
-
-
33749575074
-
-
Shinoda, K., Watanabe, T., 1997. Acoustic modeling based on the MDL criterion for speech recognition. In: Proceedings of European Conference on Speech Communication and Technology'97. pp. 99-102.
-
-
-
-
61
-
-
33749563058
-
-
Sim, K.-C., Gales, M., 2004. Precision matrix modeling for large vocabulary continuous speech recognition. Tech. Rep. CUED/F-INFENG/TR.485, Cambridge University.
-
-
-
-
62
-
-
33745200057
-
-
Sim, K.-C., Gales, M., 2005. Temporally varying model parameters for large vocabulary continuous speech recognition. In: Proceedings of Interspeech'05. pp. 2137-2140.
-
-
-
-
63
-
-
0027309782
-
-
Takahashi, S., 1993. Phoneme HMM's constrained by frame correlations. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'93. pp. 219-222.
-
-
-
-
64
-
-
0001455934
-
A robust algorithm for pitch tracking (RAPT)
-
Kleijn W., and Paliwal K. (Eds), Elsevier
-
Talkin D. A robust algorithm for pitch tracking (RAPT). In: Kleijn W., and Paliwal K. (Eds). Speech Coding and Synthesis (1995), Elsevier 497-518
-
(1995)
Speech Coding and Synthesis
, pp. 497-518
-
-
Talkin, D.1
-
65
-
-
85143189909
-
-
Tamura, M., Masuko, T., Tokuda, K., Kobayashi, T., 2001. Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing 2001, vol. 2. pp. 805-808.
-
-
-
-
66
-
-
0028996993
-
-
Tokuda, K., Kobayashi, T., Imai, S., 1995a. Speech parameter generation from HMM using dynamic features. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'95. pp. 660-663.
-
-
-
-
67
-
-
33749551677
-
-
Tokuda, K., Masuko, T., Yamada, Y., Kobayashi, T., Imai, S., 1995b. An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features. In: Proceedings of European Conference on Speech Communication and Technology'95. pp. 757-760.
-
-
-
-
68
-
-
0032678076
-
-
Tokuda, K., Masuko, T., Miyazaki, N., Kobayashi, T., 1999. Hidden Markov models based on multi-space probability distribution for pitch pattern modeling. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'99. pp. 229-232.
-
-
-
-
69
-
-
0033708106
-
-
Tokuda, K., Yoshimura, T., Masuko, T., Kobayashi, T., Kitamura, T., 2000. Speech parameter generation algorithms for HMM-based speech synthesis. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing 2000, vol. 3. pp. 1315-1318.
-
-
-
-
70
-
-
33749575075
-
-
Vanhoucke, V., 2003. Mixtures of inverse covariances: covariance modeling for Gaussian mixtures with applications to automatic speech recognition. Ph.D. thesis, Stanford University.
-
-
-
-
71
-
-
84935113569
-
Error bounds for convolutional codes and an asymptotically optimal decoding algorithm
-
Viterbi A. Error bounds for convolutional codes and an asymptotically optimal decoding algorithm. IEEE Transactions on Information Theory 13 (1967) 260-269
-
(1967)
IEEE Transactions on Information Theory
, vol.13
, pp. 260-269
-
-
Viterbi, A.1
-
72
-
-
33749541446
-
-
Wellekens, C., 1987. Explicit correlation in hidden Markov model for speech recognition. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing'87. pp. 383-386.
-
-
-
-
73
-
-
14544300108
-
How to pretend that correlated variables are independent by using difference observations
-
Williams C. How to pretend that correlated variables are independent by using difference observations. Neural Computation 17 1 (2005) 1-7
-
(2005)
Neural Computation
, vol.17
, Issue.1
, pp. 1-7
-
-
Williams, C.1
-
74
-
-
33749547148
-
Products of Gaussians
-
Dietterich T., Becker S., and Ghahramani Z. (Eds), MIT Press
-
Williams C., Agakov E., and Felderhof S. Products of Gaussians. In: Dietterich T., Becker S., and Ghahramani Z. (Eds). Advances in Neural Information Processing Systems vol. 14 (2002), MIT Press 1014-1017
-
(2002)
Advances in Neural Information Processing Systems
, vol.14
, pp. 1014-1017
-
-
Williams, C.1
Agakov, E.2
Felderhof, S.3
-
75
-
-
0026382580
-
-
Wilpon, J., Lee, C.-H., Rabiner, L., 1991. Improvements in connected digit recognition using higher order spectral and energy features. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing, vol. 1. pp. 349-352.
-
-
-
-
76
-
-
33749547146
-
-
Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., Kitamura, T., 1997. Speaker interpolation in HMM-based speech synthesis system. In: Proceedings of European Conference on Speech Communication and Technology'97, vol. 5. pp. 2523-2526.
-
-
-
-
77
-
-
33749541221
-
-
Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., Kitamura, T., 1999. Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis. In: Proceedings of European Conference on Speech Communication and Technology'99, vol. 5. pp. 2347-2350.
-
-
-
-
78
-
-
0003571976
-
-
Cambridge University
-
Young S., Evermann G., Gales M., Hain T., Kershaw D., Moore G., Odell J., Ollason D., Povey D., Valtchev V., and Woodland P. The HTK Book (for HTK Version 3.3) (2005), Cambridge University
-
(2005)
The HTK Book (for HTK Version 3.3)
-
-
Young, S.1
Evermann, G.2
Gales, M.3
Hain, T.4
Kershaw, D.5
Moore, G.6
Odell, J.7
Ollason, D.8
Povey, D.9
Valtchev, V.10
Woodland, P.11
-
79
-
-
0141702226
-
-
Zhou, J.-L., Seide, F., Deng, L., 2003. Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM-model and training. In: Proceedings of International Conference on Acoustics, Speech, and Signal Processing 2003, vol. 1. pp. 744-747.
-
-
-
-
80
-
-
33749544707
-
-
Zweig, G., 1998. Speech recognition using dynamic Bayesian networks. Ph.D. thesis, University of California, Berkeley.
-
-
-