-
1
-
-
0033693063
-
Conversational speech recognition using acoustic and articulatory input
-
K. Kirchhoff, G. A. Fink, and G. Sagerer, "Conversational speech recognition using acoustic and articulatory input," in Proc. IEEE ICASSP, 2000, pp. 1435-1438.
-
(2000)
Proc. IEEE ICASSP
, pp. 1435-1438
-
-
Kirchhoff, K.1
Fink, G.A.2
Sagerer, G.3
-
2
-
-
85009097225
-
On using MLP features for LVCSR
-
Q. Zhu, B. Chen, N. Morgan, and A. Stolcke, "On using MLP features for LVCSR," in Proc. Eurospeech, 2004, pp. 921-924.
-
(2004)
Proc. Eurospeech
, pp. 921-924
-
-
Zhu, Q.1
Chen, B.2
Morgan, N.3
Stolcke, A.4
-
3
-
-
34250015828
-
Using multiple acoustic feature sets for speech recognition
-
A. Zolnay, D. Kocharov, R. Schlüter, and H. Ney, "Using multiple acoustic feature sets for speech recognition," Speech Commun., vol. 49, pp. 514-525, 2007.
-
(2007)
Speech Commun
, vol.49
, pp. 514-525
-
-
Zolnay, A.1
Kocharov, D.2
Schlüter, R.3
Ney, H.4
-
4
-
-
34547539413
-
-
R. Schlüter, I. Bezrukov, H. Wagner, and H. Ney, "Gammatone features and feature combination for large vocabulary speech recognition," in Proc. IEEE ICASSP, 2007, pp. IV-649-IV-652.
-
-
-
-
5
-
-
85119697883
-
iROVER: Improving system combination with classification
-
D. Hillard, B. Hoffmeister, M. Ostendorf, R. Schlüter, and H. Ney, "iROVER: Improving system combination with classification," in Proc. NAACL-HLT Companion Volume Short Papers, 2007, pp. 65-68.
-
(2007)
Proc. NAACL-HLT Companion Volume Short Papers
, pp. 65-68
-
-
Hillard, D.1
Hoffmeister, B.2
Ostendorf, M.3
Schlüter, R.4
Ney, H.5
-
6
-
-
85080018809
-
-
J. Cohen, T. Kamm, and A. Andreou, "Vocal tract normalization in speech recognition: Compensating for systematic speaker variability," J. Acoust. Soc. Amer., vol. 97, no. 5, pt. 2, pp. 3246-3247, 1995.
-
-
-
-
7
-
-
0029747183
-
Speaker normalization using efficient frequency warping procedures
-
L. Lee and R. Rose, "Speaker normalization using efficient frequency warping procedures," in Proc. IEEE ICASSP, 1996, pp. 353-356.
-
(1996)
Proc. IEEE ICASSP
, pp. 353-356
-
-
Lee, L.1
Rose, R.2
-
8
-
-
0034847002
-
The 1998 HTK system for transcription of conversational telephone speech
-
T. Hain, P. Woodland, T. Niesler, and E. Whittaker, "The 1998 HTK system for transcription of conversational telephone speech," in Proc. IEEE ICASSP, 1999, pp. 57-60.
-
(1999)
Proc. IEEE ICASSP
, pp. 57-60
-
-
Hain, T.1
Woodland, P.2
Niesler, T.3
Whittaker, E.4
-
9
-
-
0036753897
-
Speaker adaptive modeling by vocal tract normalization
-
Sep
-
L. Welling, H. Ney, and S. Kanthak, "Speaker adaptive modeling by vocal tract normalization," IEEE Trans. Speech Audio Process., vol. 10, no. 6, pp. 415-426, Sep. 2002.
-
(2002)
IEEE Trans. Speech Audio Process
, vol.10
, Issue.6
, pp. 415-426
-
-
Welling, L.1
Ney, H.2
Kanthak, S.3
-
10
-
-
0029764708
-
Speaker normalization on conversational telephone speech
-
S. Wegmann, D. McAllaster, J. Orloff, and B. Peskin, "Speaker normalization on conversational telephone speech," in Proc. IEEE ICASSP, 1996, pp. 339-341.
-
(1996)
Proc. IEEE ICASSP
, pp. 339-341
-
-
Wegmann, S.1
McAllaster, D.2
Orloff, J.3
Peskin, B.4
-
11
-
-
0029725604
-
A parametric approach to vocal tract length normalization
-
E. Eide and H. Gish, "A parametric approach to vocal tract length normalization," in Proc. IEEE ICASSP, 1996, pp. 346-348.
-
(1996)
Proc. IEEE ICASSP
, pp. 346-348
-
-
Eide, E.1
Gish, H.2
-
12
-
-
0032673049
-
Restructuring speech representations using pitch adaptive time-frequency smoothing and instantaneous-frequency-based F0 extraction: Possible role of repetitive structure in sounds
-
H. Kawahara, I. Masuda-Katsuse, and A. de Cheveigné, "Restructuring speech representations using pitch adaptive time-frequency smoothing and instantaneous-frequency-based F0 extraction: Possible role of repetitive structure in sounds," Speech Commun., vol. 27, no. 3-4, pp. 187-207, 1999.
-
(1999)
Speech Commun
, vol.27
, Issue.3-4
, pp. 187-207
-
-
Kawahara, H.1
Masuda-Katsuse, I.2
de Cheveigné, A.3
-
14
-
-
0034841228
-
Perceptual harmonic cepstral coefficients for speech recognition in noisy environments
-
L. Gu and K. Rose, "Perceptual harmonic cepstral coefficients for speech recognition in noisy environments," in Proc. IEEE ICASSP, 2001, pp. 125-128.
-
(2001)
Proc. IEEE ICASSP
, pp. 125-128
-
-
Gu, L.1
Rose, K.2
-
15
-
-
0347968155
-
Pitch adaptive windows for improved excitation coding in low-rate CELP coders
-
Nov
-
A. V. Rao, S. Ahmadi, J. Linden, A. Gersho, V. Cuperman, and R. Heidari, "Pitch adaptive windows for improved excitation coding in low-rate CELP coders," IEEE Trans. Speech Audio Process., vol. 11, no. 6, pp. 648-659, Nov. 2003.
-
(2003)
IEEE Trans. Speech Audio Process
, vol.11
, Issue.6
, pp. 648-659
-
-
Rao, A.V.1
Ahmadi, S.2
Linden, J.3
Gersho, A.4
Cuperman, V.5
Heidari, R.6
-
16
-
-
85135241246
-
Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification
-
H. Ezzaidi and J. Rouat, "Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification," in Proc. ICSLP, 2000, vol. 8, pp. 318-321.
-
(2000)
Proc. ICSLP
, vol.8
, pp. 318-321
-
-
Ezzaidi, H.1
Rouat, J.2
-
17
-
-
4544386226
-
A pitch synchronous feature extraction method for speaker recognition
-
pp. I-405-I-408
-
S. Kim, T. Eriksson, H. Kang, and D. H. Youn, "A pitch synchronous feature extraction method for speaker recognition," in Proc. IEEE ICASSP, 2004, pp. I-405-I-408.
-
(2004)
Proc. IEEE ICASSP
, pp. I-405-I-408
-
-
Kim, S.1
Eriksson, T.2
Kang, H.3
Youn, D.H.4
-
18
-
-
85143191773
-
-
R. Zilca, J. Navratil, and G. N. Ramaswamy, "Depitch and the role of fundamental frequency in speaker recognition," in Proc. IEEE ICASSP, 2003, pp. II-81-II-84.
-
-
-
-
19
-
-
7544241146
-
Dynamic Bayesian network based speech recognition with pitch and energy as auxiliary variables
-
T. A. Stephenson, J. Escofet, M. Magimai-Doss, and H. Bourlard, "Dynamic Bayesian network based speech recognition with pitch and energy as auxiliary variables," in Proc. IEEE Workshop Neural Netw. Signal Process., 2002, pp. 637-646.
-
(2002)
Proc. IEEE Workshop Neural Netw. Signal Process
, pp. 637-646
-
-
Stephenson, T.A.1
Escofet, J.2
Magimai-Doss, M.3
Bourlard, H.4
-
20
-
-
84863687026
-
On the use of phase information for speech recognition
-
Online, Available
-
B. Bozkurt and L. Couvreur, "On the use of phase information for speech recognition," in Proc. EUSIPCO, 2005. [Online]. Available: http://www.eurasip.org/Proceedings/Eusipco/Eusipco2005/defevent/papers/cr1390.pdf
-
(2005)
Proc. EUSIPCO
-
-
Bozkurt, B.1
Couvreur, L.2
-
21
-
-
85009135069
-
Improving the representation of time structure in front-ends for automatic speech recognition
-
W. J. Holmes, "Improving the representation of time structure in front-ends for automatic speech recognition," in Proc. ICSLP, 2000, vol. 2, pp. 1073-1076.
-
(2000)
Proc. ICSLP
, vol.2
, pp. 1073-1076
-
-
Holmes, W.J.1
-
22
-
-
85009118262
-
A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR
-
M. Ghulam, T. Fukuda, I. Horikawa, and T. Nitta, "A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR," in Proc. ICSLP, 2004, vol. 1, pp. 133-136.
-
(2004)
Proc. ICSLP
, vol.1
, pp. 133-136
-
-
Ghulam, M.1
Fukuda, T.2
Horikawa, I.3
Nitta, T.4
-
23
-
-
33645781551
-
Evaluation of a speech recognition/generation method based on HMM and STRAIGHT
-
T. Irino, Y. Minami, T. Nakatani, M. Tsuzaki, and H. Tagawa, "Evaluation of a speech recognition/generation method based on HMM and STRAIGHT," in Proc. ICSLP, 2002, pp. 2545-2548.
-
(2002)
Proc. ICSLP
, pp. 2545-2548
-
-
Irino, T.1
Minami, Y.2
Nakatani, T.3
Tsuzaki, M.4
Tagawa, H.5
-
24
-
-
0001455934
-
A robust algorithm for pitch tracking (RAPT)
-
D. Talkin, W. B. Kleijn and K. K. Paliwal, Eds, New York: Elsevier
-
D. Talkin, "A robust algorithm for pitch tracking (RAPT)," in Speech Coding and Synthesis, W. B. Kleijn and K. K. Paliwal, Eds. New York: Elsevier, 1995, pp. 495-518.
-
(1995)
Speech Coding and Synthesis
, pp. 495-518
-
-
-
25
-
-
85009141768
-
Combination of speech features using smoothed heteroscedastic linear discriminant analysis
-
L. Burget, "Combination of speech features using smoothed heteroscedastic linear discriminant analysis," in Proc. ICSLP, 2004, pp. 2549-2552.
-
(2004)
Proc. ICSLP
, pp. 2549-2552
-
-
Burget, L.1
-
26
-
-
0030638031
-
A post-processing system to yield reduced word error rates: Recognition Output Voting Error Reduction (ROVER)
-
J. G. Fiscus, "A post-processing system to yield reduced word error rates: Recognition Output Voting Error Reduction (ROVER)," in Proc. IEEE Workshop ASRU, 1997, pp. 347-354.
-
(1997)
Proc. IEEE Workshop ASRU
, pp. 347-354
-
-
Fiscus, J.G.1
-
27
-
-
0038133932
-
A statistical approach to metrics for word and syllable recognition
-
M. J. Hunt, "A statistical approach to metrics for word and syllable recognition," J. Acoust. Soc. Amer., vol. 66, pp. S35-S36, 1979.
-
(1979)
J. Acoust. Soc. Amer
, vol.66
-
-
Hunt, M.J.1
-
28
-
-
0023867341
-
Speaker dependent and independent speech recognition experiments with an auditory model
-
M. J. Hunt and C. Lefebvre, "Speaker dependent and independent speech recognition experiments with an auditory model," in Proc. IEEE ICASSP, 1988, vol. 1, pp. 215-218.
-
(1988)
Proc. IEEE ICASSP
, vol.1
, pp. 215-218
-
-
Hunt, M.J.1
Lefebvre, C.2
-
29
-
-
0032289099
-
Heteroscedastic discriminant analysis and reduced rank HMMs for improved recognition
-
N. Kumar and A. G. Andreou, "Heteroscedastic discriminant analysis and reduced rank HMMs for improved recognition," Speech Commun., vol. 26, pp. 283-297, 1998.
-
(1998)
Speech Commun
, vol.26
, pp. 283-297
-
-
Kumar, N.1
Andreou, A.G.2
-
30
-
-
33947643073
-
Complementarity of speech recognition systems and system combination
-
Ph.D. dissertation, Brno Univ. of Technol., Brno, Czech Republic
-
L. Burget, "Complementarity of speech recognition systems and system combination," Ph.D. dissertation, Brno Univ. of Technol., Brno, Czech Republic, 2004.
-
(2004)
-
-
Burget, L.1
-
31
-
-
0032638856
-
Semi-tied covariance matrices for hidden Markov models
-
May
-
M. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. Speech Audio Process., vol. 7, no. 3, pp. 272-281, May 1999.
-
(1999)
IEEE Trans. Speech Audio Process
, vol.7
, Issue.3
, pp. 272-281
-
-
Gales, M.1
-
32
-
-
33745199182
-
Applying vocal tract length normalization to meeting recordings
-
G. Garau, S. Renals, and T. Hain, "Applying vocal tract length normalization to meeting recordings," in Proc. Eurospeech, 2005, pp. 265-268.
-
(2005)
Proc. Eurospeech
, pp. 265-268
-
-
Garau, G.1
Renals, S.2
Hain, T.3
-
33
-
-
85079978469
-
Feature combination using linear discriminant analysis and its pitfalls
-
R. Schlüter, A. Zolnay, and H. Ney, "Feature combination using linear discriminant analysis and its pitfalls," in Proc. Interspeech, 2006, pp. 1077-1081.
-
(2006)
Proc. Interspeech
, pp. 1077-1081
-
-
Schlüter, R.1
Zolnay, A.2
Ney, H.3
-
34
-
-
0003822743
-
-
Cambridge, U.K, Cambridge Univ. Press, Dec
-
S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK book (v3.4). Cambridge, U.K.: Cambridge Univ. Press, Dec. 2006.
-
(2006)
The HTK book (v3.4)
-
-
Young, S.1
Evermann, G.2
Gales, M.3
Hain, T.4
Kershaw, D.5
Liu, X.6
Moore, G.7
Odell, J.8
Ollason, D.9
Povey, D.10
Valtchev, V.11
Woodland, P.12
-
35
-
-
33745536025
-
The 2005 AMI system for the transcription of speech in meetings
-
Proc. MLMI'05 Machine Learning for Multimodal Interaction, Springer
-
T. Hain, L. Burget, J. Dines, G. Garau, M. Karafiat, M. Lincoln, I. McCowan, D. Moore, V. Wan, R. Ordelman, and S. Renals, "The 2005 AMI system for the transcription of speech in meetings," in Proc. MLMI'05 Machine Learning for Multimodal Interaction, 2006, no. 3869, pp. 450-462, ser. Lecture Notes in Computer Science. Springer.
-
(2006)
ser. Lecture Notes in Computer Science
, vol.3869
, pp. 450-462
-
-
Hain, T.1
Burget, L.2
Dines, J.3
Garau, G.4
Karafiat, M.5
Lincoln, M.6
McCowan, I.7
Moore, D.8
Wan, V.9
Ordelman, R.10
Renals, S.11
-
36
-
-
85090317334
-
A pitch extraction reference database
-
F. Plante, G. F. Meyer, and W. A. Ainsworth, "A pitch extraction reference database," in Proc. Eurospeech, 1995, pp. 837-840.
-
(1995)
Proc. Eurospeech
, pp. 837-840
-
-
Plante, F.1
Meyer, G.F.2
Ainsworth, W.A.3
-
37
-
-
0028996854
-
WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition
-
Detroit, MI
-
T. Robinson, J. Fransen, D. Pye, J. Foote, and S. Renals, "WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition," in Proc. IEEE ICASSP, Detroit, MI, 1995, pp. 81-84.
-
(1995)
Proc. IEEE ICASSP
, pp. 81-84
-
-
Robinson, T.1
Fransen, J.2
Pye, D.3
Foote, J.4
Renals, S.5
-
38
-
-
33745531876
-
An investigation into transcription of conference room meetings
-
T. Hain, J. Dines, G. Garau, M. Karafiat, D. Moore, V. Wan, R. Ordelman, I. McCowan, J. Vepa, and S. Renals, "An investigation into transcription of conference room meetings," in Proc. Eurospeech, 2005, pp. 1661-1664.
-
(2005)
Proc. Eurospeech
, pp. 1661-1664
-
-
Hain, T.1
Dines, J.2
Garau, G.3
Karafiat, M.4
Moore, D.5
Wan, V.6
Ordelman, R.7
McCowan, I.8
Vepa, J.9
Renals, S.10
-
39
-
-
0017097478
-
A comparative performance study of several pitch detection algorithms
-
Oct
-
L. R. Rabiner, M. J. Cheng, A. E. Rosenberg, and C. A. McGonegal, "A comparative performance study of several pitch detection algorithms," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-24, no. 5, pp. 399-418, Oct. 1976.
-
(1976)
IEEE Trans. Acoust., Speech, Signal Process
, vol.ASSP-24
, Issue.5
, pp. 399-418
-
-
Rabiner, L.R.1
Cheng, M.J.2
Rosenberg, A.E.3
McGonegal, C.A.4
-
40
-
-
34547548247
-
-
T. Hain, L. Burget, J. Dines, G. Garau, M. Karafiat, M. Lincoln, J. Vepa, and V. Wan, "The AMI system for the transcription of speech in meetings," in Proc. IEEE ICASSP, 2007, pp. IV-357-IV-360.
-
-
-
-
41
-
-
77249114287
-
The Rich Transcription 2006 spring meeting recognition evaluation
-
Proc. MLMI'06 Machine Learning for Multimodal Interaction, Springer
-
J. G. Fiscus, J. Ajot, M. Michel, and J. S. Garofolo, "The Rich Transcription 2006 spring meeting recognition evaluation," in Proc. MLMI'06 Machine Learning for Multimodal Interaction, 2006, no. 4299, pp. 309-322, ser. Lecture Notes in Computer Science. Springer.
-
(2006)
ser. Lecture Notes in Computer Science
, vol.4299
, pp. 309-322
-
-
Fiscus, J.G.1
Ajot, J.2
Michel, M.3
Garofolo, J.S.4