-
1
-
-
85004448479
-
Voice conversion through vector quantization
-
M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization", J. Acoust. Soc. Jpn. (E), vol. 11, no. 2, pp. 71-76, 1990.
-
(1990)
J. Acoust. Soc. Jpn. (E)
, vol.11
, Issue.2
, pp. 71-76
-
-
Abe, M.1
Nakamura, S.2
Shikano, K.3
Kuwabara, H.4
-
2
-
-
34547550766
-
Stereo-based stochastic mapping for robust speech recognition
-
M. Afify, X.-D. Cui, and Y. Gao, "Stereo-based stochastic mapping for robust speech recognition", in Proc. ICASSP, 2007, pp. 377-380.
-
(2007)
Proc. ICASSP
, pp. 377-380
-
-
Afify, M.1
Cui, X.-D.2
Gao, Y.3
-
3
-
-
84905560807
-
Voice conversion with smoothed GMM and MAP adaptation
-
Y. Chen, M. Chu, E. Chang, and J. Liu, "Voice conversion with smoothed GMM and MAP adaptation", in Proc. Interspeech, 2003, pp. 2413-2416.
-
(2003)
Proc. Interspeech
, pp. 2413-2416
-
-
Chen, Y.1
Chu, M.2
Chang, E.3
Liu, J.4
-
4
-
-
51449114531
-
MMSE-based stereo feature stochastic mapping for noise robust speech recognition
-
X.-D. Cui, M. Afify, and Y. Gao, "MMSE-based stereo feature stochastic mapping for noise robust speech recognition", in Proc. ICASSP, 2008, pp. 4077-4080.
-
(2008)
Proc. ICASSP
, pp. 4077-4080
-
-
Cui, X.-D.1
Afify, M.2
Gao, Y.3
-
5
-
-
0034855352
-
High-performance robust speech recognition using stereo training data
-
L. Deng, A. Acero, L. Jiang, J. Droppo, and X. Huang, "High-performance robust speech recognition using stereo training data", in Proc. ICASSP, 2001, pp. 301-304.
-
(2001)
Proc. ICASSP
, pp. 301-304
-
-
Deng, L.1
Acero, A.2
Jiang, L.3
Droppo, J.4
Huang, X.5
-
6
-
-
0036291376
-
Uncertainty decoding with SPLICE for noise robust speech recognition
-
J. Droppo, A. Acero, and L. Deng, "Uncertainty decoding with SPLICE for noise robust speech recognition", in Proc. ICASSP, 2002, pp. 57-60.
-
(2002)
Proc. ICASSP
, pp. 57-60
-
-
Droppo, J.1
Acero, A.2
Deng, L.3
-
7
-
-
78149261566
-
Bandwidth extension of cellular phone speech based on maximum likelihood estimation with GMM
-
W. Fujitsuru, H. Sekimoto, T. Toda, H. Saruwatari, and K. Shikano, "Bandwidth extension of cellular phone speech based on maximum likelihood estimation with GMM", in Proc. NCSP, 2008, pp. 283-286.
-
(2008)
Proc. NCSP
, pp. 283-286
-
-
Fujitsuru, W.1
Sekimoto, H.2
Toda, T.3
Saruwatari, H.4
Shikano, K.5
-
8
-
-
85016140477
-
An adaptive algorithm for mel-cepstral analysis of speech
-
T. Fukada, K. Tokuda, T. Kobayashi, and S. Imai, "An adaptive algorithm for mel-cepstral analysis of speech", in Proc. ICASSP, 1992, pp. 137-140.
-
(1992)
Proc. ICASSP
, pp. 137-140
-
-
Fukada, T.1
Tokuda, K.2
Kobayashi, T.3
Imai, S.4
-
9
-
-
0022667694
-
Speaker independent isolated word recognition using dynamic features of speech spectrum
-
S. Furui, "Speaker independent isolated word recognition using dynamic features of speech spectrum", IEEE Trans. Acoust., Speech, Signal Process., vol. 34, pp. 52-59, 1986.
-
(1986)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.34
, pp. 52-59
-
-
Furui, S.1
-
10
-
-
2142659020
-
Estimation of articulatory movements from speech acoustics using an HMM-based speech production model
-
S. Hiroya and M. Honda, "Estimation of articulatory movements from speech acoustics using an HMM-based speech production model", IEEE Trans. Speech Audio Process., vol. 12, no. 2, pp. 175-185, Mar. 2004.
-
(2004)
IEEE Trans. Speech Audio Process.
, vol.12
, Issue.2
, pp. 175-185
-
-
Hiroya, S.1
Honda, M.2
-
11
-
-
0038669544
-
The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions
-
H. Hirsch and D. Pearce, "The AURORA experimental framework for the performance evaluations of speech recognition systems under noisy conditions", in Proc. ISCA ITRW ASR'00, 2000, pp. 181-188.
-
(2000)
Proc. ISCA ITRW ASR'00
, pp. 181-188
-
-
Hirsch, H.1
Pearce, D.2
-
12
-
-
0020596154
-
Cepstral analysis synthesis on the mel frequency scale
-
S. Imai, "Cepstral analysis synthesis on the mel frequency scale", in Proc. ICASSP, 1983, pp. 93-96.
-
(1983)
Proc. ICASSP
, pp. 93-96
-
-
Imai, S.1
-
13
-
-
0031623661
-
Spectral voice conversion for text-to-speech synthesis
-
A. Kain and M. Macon, "Spectral voice conversion for text-to-speech synthesis", in Proc. ICASSP, 1998, pp. 285-288.
-
(1998)
Proc. ICASSP
, pp. 285-288
-
-
Kain, A.1
Macon, M.2
-
14
-
-
85133413596
-
Formant re-synthesis of dysarthric speech
-
A. Kain, X. Niu, J.-P. Hosom, Q. Miao, and J. van Santen, "Formant re-synthesis of dysarthric speech", in Proc. ISCA SSW5, 2003, pp. 25-30.
-
(2003)
Proc. ISCA SSW5
, pp. 25-30
-
-
Kain, A.1
Niu, X.2
Hosom, J.-P.3
Miao, Q.4
Van Santen, J.5
-
15
-
-
0006682104
-
Vector quantization of speech spectral parameters using statistics of dynamic features
-
K. Koishida, K. Tokuda, T. Masuko, and T. Kobayashi, "Vector quantization of speech spectral parameters using statistics of dynamic features", in Proc. Int. Conf. Signal Process.'97, 1997, pp. 247-252.
-
(1997)
Proc. Int. Conf. Signal Process.'97
, pp. 247-252
-
-
Koishida, K.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
-
16
-
-
33646773080
-
-
Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-LTI-03-177
-
J. Kominek and A. Black, "CMU ARCTIC databases for speech synthesis", Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-LTI-03-177, 2003.
-
(2003)
CMU ARCTIC Databases for Speech Synthesis
-
-
Kominek, J.1
Black, A.2
-
17
-
-
34547503417
-
HMM-based unit selection using frame sized speech segments
-
Z.-H. Ling and R.-H. Wang, "HMM-based unit selection using frame sized speech segments", in Proc. Interspeech, 2006, pp. 2034-2037.
-
(2006)
Proc. Interspeech
, pp. 2034-2037
-
-
Ling, Z.-H.1
Wang, R.-H.2
-
18
-
-
33646887390
-
On the limited memory BFGS method for large scale optimization
-
D. Liu and J. Nocedal, "On the limited memory BFGS method for large scale optimization", Math. Program. B, vol. 45, no. 3, pp. 503-528, 1989.
-
(1989)
Math. Program. B
, vol.45
, Issue.3
, pp. 503-528
-
-
Liu, D.1
Nocedal, J.2
-
19
-
-
84867211725
-
Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory
-
T. Muramatsu, Y. Ohtani, T. Toda, H. Saruwatari, and K. Shikano, "Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory", in Proc. Interspeech, 2008, pp. 1076-1079.
-
(2008)
Proc. Interspeech
, pp. 1076-1079
-
-
Muramatsu, T.1
Ohtani, Y.2
Toda, T.3
Saruwatari, H.4
Shikano, K.5
-
20
-
-
44949187612
-
Improving body transmitted unvoiced speech with statistical voice conversion
-
M. Nakagiri, T. Toda, H. Kashioka, and K. Shikano, "Improving body transmitted unvoiced speech with statistical voice conversion", in Proc. Interspeech, 2006, pp. 2270-2273.
-
(2006)
Proc. Interspeech
, pp. 2270-2273
-
-
Nakagiri, M.1
Toda, T.2
Kashioka, H.3
Shikano, K.4
-
21
-
-
42649146508
-
On the use of phonetic information for mapping from articulatory movements to vocal tract spectrum
-
K. Nakamura, T. Toda, Y. Nankaku, and K. Tokuda, "On the use of phonetic information for mapping from articulatory movements to vocal tract spectrum", in Proc. ICASSP, 2006, pp. 93-96.
-
(2006)
Proc. ICASSP
, pp. 93-96
-
-
Nakamura, K.1
Toda, T.2
Nankaku, Y.3
Tokuda, K.4
-
22
-
-
44949265538
-
Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speech
-
K. Nakamura, T. Toda, H. Saruwatari, and K. Shikano, "Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speech", in Proc. Interspeech, 2006, pp. 1395-1398.
-
(2006)
Proc. Interspeech
, pp. 1395-1398
-
-
Nakamura, K.1
Toda, T.2
Saruwatari, H.3
Shikano, K.4
-
23
-
-
78149241363
-
Spectral conversion based on statistical models including time-frequency matching
-
Y. Nankaku, K. Nakamura, T. Toda, and K. Tokuda, "Spectral conversion based on statistical models including time-frequency matching", in Proc. ISCA SSW6, 2007, pp. 333-338.
-
(2007)
Proc. ISCA SSW6
, pp. 333-338
-
-
Nankaku, Y.1
Nakamura, K.2
Toda, T.3
Tokuda, K.4
-
24
-
-
0033692729
-
Narrowband to wideband conversion of speech using GMM based transformation
-
K.-Y. Park and H.-S. Kim, "Narrowband to wideband conversion of speech using GMM based transformation", in Proc. ICASSP, 2000, pp. 1847-1850.
-
(2000)
Proc. ICASSP
, pp. 1847-1850
-
-
Park, K.-Y.1
Kim, H.-S.2
-
25
-
-
4243714433
-
-
Ph. D. dissertation, Centre for Speech Technol. Res., Edinburgh Univ., Edinburgh, U. K.
-
K. Richmond, "Estimating articulatory parameters from the acoustic speech signal", Ph. D. dissertation, Centre for Speech Technol. Res., Edinburgh Univ., Edinburgh, U. K., 2002.
-
(2002)
Estimating Articulatory Parameters From the Acoustic Speech Signal
-
-
Richmond, K.1
-
26
-
-
67650105018
-
Trajectory mixture density network with multiple mixtures for acoustic-articulatory inversion
-
K. Richmond, "Trajectory mixture density network with multiple mixtures for acoustic-articulatory inversion", in Proc. NOLISP, 2007, pp. 67-70.
-
(2007)
Proc. NOLISP
, pp. 67-70
-
-
Richmond, K.1
-
27
-
-
0038359547
-
Modelling the uncertainty in recovering articulation from acoustics
-
K. Richmond, S. King, and P. Taylor, "Modelling the uncertainty in recovering articulation from acoustics", Comput. Speech Lang., vol. 17, pp. 153-172, 2003.
-
(2003)
Comput. Speech Lang.
, vol.17
, pp. 153-172
-
-
Richmond, K.1
King, S.2
Taylor, P.3
-
29
-
-
33745199156
-
Robust bandwidth extension of noise-corrupted narrowband speech
-
M. Seltzer, A. Acero, and J. Droppo, "Robust bandwidth extension of noise-corrupted narrowband speech", in Proc. Interspeech, 2005, pp. 1509-1512.
-
(2005)
Proc. Interspeech
, pp. 1509-1512
-
-
Seltzer, M.1
Acero, A.2
Droppo, J.3
-
30
-
-
64149122631
-
Accurate spectral envelope estimation for articulation-to-speech synthesis
-
Y. Shiga and S. King, "Accurate spectral envelope estimation for articulation-to-speech synthesis", in Proc. ISCA SSW5, 2004, pp. 19-24.
-
(2004)
Proc. ISCA SSW5
, pp. 19-24
-
-
Shiga, Y.1
King, S.2
-
31
-
-
0032026483
-
Continuous probabilistic transform for voice conversion
-
Y. Stylianou, O. Cappe, and E. Moulines, "Continuous probabilistic transform for voice conversion", IEEE Trans. Speech Audio Process., vol. 6, no. 2, pp. 131-142, Mar. 1998.
-
(1998)
IEEE Trans. Speech Audio Process.
, vol.6
, Issue.2
, pp. 131-142
-
-
Stylianou, Y.1
Cappe, O.2
Moulines, E.3
-
32
-
-
0001455934
-
A robust algorithm for pitch tracking (RAPT)
-
W. Kleijn and K. Paliwal, Eds. Amsterdam, The Netherlands: Elsevier
-
D. Talkin, "A robust algorithm for pitch tracking (RAPT)", in Speech Coding and Synthesis, W. Kleijn and K. Paliwal, Eds. Amsterdam, The Netherlands: Elsevier, 1995.
-
(1995)
Speech Coding and Synthesis
-
-
Talkin, D.1
-
33
-
-
85027459007
-
Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis
-
T. Toda, A. W. Black, and K. Tokuda, "Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis", in Proc. ISCA SSW5, 2004, pp. 31-36.
-
(2004)
Proc. ISCA SSW5
, pp. 31-36
-
-
Toda, T.1
Black, A.W.2
Tokuda, K.3
-
34
-
-
57749193836
-
Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory
-
T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory", IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 8, pp. 2222-2235, Nov. 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.15
, Issue.8
, pp. 2222-2235
-
-
Toda, T.1
Black, A.2
Tokuda, K.3
-
35
-
-
38649140222
-
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
-
T. Toda, A. Black, and K. Tokuda, "Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model", Speech Comm., vol. 50, no. 3, pp. 215-227, 2008.
-
(2008)
Speech Comm.
, vol.50
, Issue.3
, pp. 215-227
-
-
Toda, T.1
Black, A.2
Tokuda, K.3
-
36
-
-
33745214435
-
NAM-to-speech conversion with Gaussian mixture models
-
T. Toda and K. Shikano, "NAM-to-speech conversion with Gaussian mixture models", in Proc. Interspeech, 2005, pp. 1957-1960.
-
(2005)
Proc. Interspeech
, pp. 1957-1960
-
-
Toda, T.1
Shikano, K.2
-
37
-
-
0028996993
-
Speech parameter generation from HMM using dynamic features
-
K. Tokuda, T. Kobayashi, and S. Imai, "Speech parameter generation from HMM using dynamic features", in Proc. ICASSP, 1995, pp. 660-663.
-
(1995)
Proc. ICASSP
, pp. 660-663
-
-
Tokuda, K.1
Kobayashi, T.2
Imai, S.3
-
38
-
-
85031628788
-
An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features
-
K. Tokuda, T. Masuko, Y. Yamada, T. Kobayashi, and S. Imai, "An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features", in Proc. Eurospeech, 1995, pp. 757-760.
-
(1995)
Proc. Eurospeech
, pp. 757-760
-
-
Tokuda, K.1
Masuko, T.2
Yamada, Y.3
Kobayashi, T.4
Imai, S.5
-
39
-
-
0033708106
-
Speech parameter generation algorithms for HMM-based speech synthesis
-
K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis", in Proc. ICASSP, 2000, pp. 1315-1318.
-
(2000)
Proc. ICASSP
, pp. 1315-1318
-
-
Tokuda, K.1
Yoshimura, T.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
40
-
-
33646815712
-
-
Online. Available
-
A. Wrench, The MOCHA-TIMIT Database, 1999. [Online]. Available: http://www.cstr.ed.ac.uk/artic/mocha.html
-
(1999)
The MOCHA-TIMIT Database
-
-
Wrench, A.1
-
41
-
-
85009139544
-
Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
-
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis", in Proc. Eurospeech, 1999, pp. 2347-2350.
-
(1999)
Proc. Eurospeech
, pp. 2347-2350
-
-
Yoshimura, T.1
Tokuda, K.2
Masuko, T.3
Kobayashi, T.4
Kitamura, T.5
-
42
-
-
78149252505
-
-
Online. Available
-
S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X.-Y. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The Hidden Markov Model Toolkit (HTK) Version 3.4, 2006. [Online]. Available: http://htk.eng.cam.ac.uk/
-
(2006)
The Hidden Markov Model Toolkit (HTK) Version 3.4
-
-
Young, S.1
Evermann, G.2
Gales, M.3
Hain, T.4
Kershaw, D.5
Liu, X.-Y.6
Moore, G.7
Odell, J.8
Ollason, D.9
Povey, D.10
Valtchev, V.11
Woodland, P.12
-
43
-
-
67650826180
-
Model-space MLLR for trajectory HMMs
-
H. Zen, Y. Nankaku, and K. Tokuda, "Model-space MLLR for trajectory HMMs", in Proc. Interspeech, 2007, pp. 2065-2068.
-
(2007)
Proc. Interspeech
, pp. 2065-2068
-
-
Zen, H.1
Nankaku, Y.2
Tokuda, K.3
-
44
-
-
44949197937
-
Speaker adaptation of trajectory HMMs using feature-space MLLR
-
H. Zen, Y. Nankaku, K. Tokuda, and T. Kitamura, "Speaker adaptation of trajectory HMMs using feature-space MLLR", in Proc. Interspeech, 2006, pp. 2274-2277.
-
(2006)
Proc. Interspeech
, pp. 2274-2277
-
-
Zen, H.1
Nankaku, Y.2
Tokuda, K.3
Kitamura, T.4
-
45
-
-
33947642095
-
Estimating trajectory HMM parameters by Monte Carlo EM with Gibbs sampler
-
H. Zen, K. Tokuda, and T. Kitamura, "Estimating trajectory HMM parameters by Monte Carlo EM with Gibbs sampler", in Proc. ICASSP, 2006, pp. 1173-1176.
-
(2006)
Proc. ICASSP
, pp. 1173-1176
-
-
Zen, H.1
Tokuda, K.2
Kitamura, T.3
-
46
-
-
33749573927
-
Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic features
-
H. Zen, K. Tokuda, and T. Kitamura, "Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic features", Comput. Speech Lang., vol. 21, no. 1, pp. 153-173, 2007.
-
(2007)
Comput. Speech Lang.
, vol.21
, Issue.1
, pp. 153-173
-
-
Zen, H.1
Tokuda, K.2
Kitamura, T.3
-
47
-
-
67650829355
-
-
Ph. D. dissertation, Univ. of Edinburgh, Edinburgh, U. K.
-
L. Zhang, "Modelling speech dynamics with trajectory-HMMs", Ph. D. dissertation, Univ. of Edinburgh, Edinburgh, U. K., 2009.
-
(2009)
Modelling Speech Dynamics with Trajectory-HMMs
-
-
Zhang, L.1
-
48
-
-
67650153217
-
Acoustic-articulatory modelling with the trajectory HMM
-
L. Zhang and S. Renals, "Acoustic-articulatory modelling with the trajectory HMM", IEEE Signal Process. Lett., vol. 15, pp. 245-248, 2008.
-
(2008)
IEEE Signal Process. Lett.
, vol.15
, pp. 245-248
-
-
Zhang, L.1
Renals, S.2
-
49
-
-
84946719891
-
Air- and bone-conductive integrated microphones for robust speech detection and enhancement
-
Y. Zheng, Z. Liu, Z. Zhang, M. Sinclair, J. Droppo, L. Deng, A. Acero, and X. Huang, "Air- and bone-conductive integrated microphones for robust speech detection and enhancement", in Proc. ASRU, 2003, pp. 249-254.
-
(2003)
Proc. ASRU
, pp. 249-254
-
-
Zheng, Y.1
Liu, Z.2
Zhang, Z.3
Sinclair, M.4
Droppo, J.5
Deng, L.6
Acero, A.7
Huang, X.8