-
1
-
-
70450190034
-
PodCastle: Collaborative training of acoustic models on the basis of wisdom of crowds for podcast transcription
-
J. Ogata and M. Goto, "PodCastle: Collaborative training of acoustic models on the basis of wisdom of crowds for podcast transcription," in Proc. Interspeech, 2009.
-
(2009)
Proc. Interspeech
-
-
Ogata, J.1
Goto, M.2
-
2
-
-
70349217247
-
An audio indexing system for election video material
-
C. Alberti, M. Bacchiani, A. Bezman, C. Chelba, A. Drofa, H. Liao, P. Moreno, T. Power, A. Sahuguet, M. Shugrina, and O. Siohan, "An audio indexing system for election video material," in Proc. ICASSP, 2009, pp. 4873-4876.
-
(2009)
Proc. ICASSP
, pp. 4873-4876
-
-
Alberti, C.1
Bacchiani, M.2
Bezman, A.3
Chelba, C.4
Drofa, A.5
Liao, H.6
Moreno, P.7
Power, T.8
Sahuguet, A.9
Shugrina, M.10
Siohan, O.11
-
3
-
-
84874275817
-
-
Tech. Rep. cued/f-infeng/tr.676, Cambridge University Engineering Department
-
R. C. van Dalen, J. Yang, and M. J. F. Gales, "Generative kernels and score-spaces for classification of speech: Progress report," Tech. Rep. cued/f-infeng/tr.676, Cambridge University Engineering Department, 2012.
-
(2012)
Generative Kernels and Score-spaces for Classification of Speech: Progress Report
-
-
Van Dalen, R.C.1
Yang, J.2
Gales, M.J.F.3
-
4
-
-
84877728825
-
Overview of MediaEval 2011 rich speech retrieval task and genre tagging task
-
M. Larson, M. Eskevich, R. Ordelman, C. Kofler, S. Schmiedeke, and G. J. F. Jones, "Overview of MediaEval 2011 rich speech retrieval task and genre tagging task," in Working Notes Proceedings of the MediaEval 2011 Workshop, 2011.
-
(2011)
Working Notes Proceedings of the MediaEval 2011 Workshop
-
-
Larson, M.1
Eskevich, M.2
Ordelman, R.3
Kofler, C.4
Schmiedeke, S.5
Jones, G.J.F.6
-
5
-
-
84861016333
-
Automated semantic tagging of speech audio
-
Y. Raimond, C. Lowis, R. Hodgson, and J. Tweed, "Automated semantic tagging of speech audio," in Proc. WWW 2012, 2012.
-
(2012)
Proc. WWW 2012
-
-
Raimond, Y.1
Lowis, C.2
Hodgson, R.3
Tweed, J.4
-
6
-
-
0033709098
-
Tandem connectionist feature extraction for conventional HMM systems
-
H. Hermansky, D.P.W. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proc. ICASSP, 2000, pp. 1635-1638.
-
(2000)
Proc. ICASSP
, pp. 1635-1638
-
-
Hermansky, H.1
Ellis, D.P.W.2
Sharma, S.3
-
7
-
-
84055222005
-
Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
-
G.E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Transactions on Audio, Speech and Language Processing, vol. 20, no. 1, pp. 30-42, 2012.
-
(2012)
IEEE Transactions on Audio, Speech and Language Processing
, vol.20
, Issue.1
, pp. 30-42
-
-
Dahl, G.E.1
Yu, D.2
Deng, L.3
Acero, A.4
-
8
-
-
84055211743
-
Acoustic modeling using deep belief networks
-
A. Mohamed, G.E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Transactions on Audio, Speech and Language Processing, vol. 20, no. 1, pp. 14-22, 2012.
-
(2012)
IEEE Transactions on Audio, Speech and Language Processing
, vol.20
, Issue.1
, pp. 14-22
-
-
Mohamed, A.1
Dahl, G.E.2
Hinton, G.3
-
10
-
-
4544236237
-
On use of task independent training data in tandem feature extraction
-
S. Sivadas and H. Hermansky, "On use of task independent training data in tandem feature extraction," in Proc. ICASSP, 2004.
-
(2004)
Proc. ICASSP
-
-
Sivadas, S.1
Hermansky, H.2
-
11
-
-
33947619591
-
Cross-domain and cross-language portability of acoustic features estimated by multilayer perceptrons
-
A. Stolcke, F. Grézl, M.-Y. Hwang, X. Lei, N. Morgan, and D. Vergyri, "Cross-domain and cross-language portability of acoustic features estimated by multilayer perceptrons," in Proc. ICASSP, 2006.
-
(2006)
Proc. ICASSP
-
-
Stolcke, A.1
Grézl, F.2
Hwang, M.-Y.3
Lei, X.4
Morgan, N.5
Vergyri, D.6
-
12
-
-
78049384951
-
Multi-style MLP features for BN transcription
-
V.-B. Le, L. Lamel, and J.-L. Gauvain, "Multi-style MLP features for BN transcription," in Proc. ICASSP, 2010, pp. 4866-4869.
-
(2010)
Proc. ICASSP
, pp. 4866-4869
-
-
Le, V.-B.1
Lamel, L.2
Gauvain, J.-L.3
-
13
-
-
79959819891
-
Crosslingual and multi-stream posterior features for low resource LVCSR systems
-
S. Thomas, S. Ganapathy, and H. Hermansky, "Crosslingual and multi-stream posterior features for low resource LVCSR systems," in Proc. Interspeech, 2010.
-
(2010)
Proc. Interspeech
-
-
Thomas, S.1
Ganapathy, S.2
Hermansky, H.3
-
14
-
-
33745805403
-
A fast learning algorithm for deep belief nets
-
G. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, vol. 18, pp. 1527-1554, 2006.
-
(2006)
Neural Computation
, vol.18
, pp. 1527-1554
-
-
Hinton, G.1
Osindero, S.2
Teh, Y.3
-
15
-
-
77949522811
-
Why does unsupervised pre-training help deep learning?
-
D. Erhan, Y. Bengio, A. Courville, P.-A. Manzagol, and P. Vincent, "Why does unsupervised pre-training help deep learning?," Journal of Machine Learning Research, vol. 11, pp. 625-660, 2010.
-
(2010)
Journal of Machine Learning Research
, vol.11
, pp. 625-660
-
-
Erhan, D.1
Bengio, Y.2
Courville, A.3
Manzagol, P.-A.4
Vincent, P.5
-
16
-
-
34547530011
-
Combining discriminative feature, transform, and model training for large vocabulary speech recognition
-
J. Zheng, O. Cetin, M.-Y. Hwang, X. Lei, A. Stolcke, and N. Morgan, "Combining discriminative feature, transform, and model training for large vocabulary speech recognition," in Proc. ICASSP, 2007.
-
(2007)
Proc. ICASSP
-
-
Zheng, J.1
Cetin, O.2
Hwang, M.-Y.3
Lei, X.4
Stolcke, A.5
Morgan, N.6
-
17
-
-
0036296863
-
Minimum phone error and I-smoothing for improved discriminative training
-
D. Povey and P.C. Woodland, "Minimum phone error and I-smoothing for improved discriminative training," in Proc. ICASSP. IEEE, 2002, vol. I, pp. 105-108.
-
(2002)
Proc. ICASSP IEEE
, vol.1
, pp. 105-108
-
-
Povey, D.1
Woodland, P.C.2
-
18
-
-
84878392008
-
Data-driven posterior features for low resource speech recognition applications
-
to appear
-
S. Thomas, S. Ganapathy, A. Jansen, and H. Hermansky, "Data-driven posterior features for low resource speech recognition applications," in Proc. Interspeech, 2012, to appear.
-
(2012)
Proc. Interspeech
-
-
Thomas, S.1
Ganapathy, S.2
Jansen, A.3
Hermansky, H.4
-
19
-
-
79959817774
-
Lightly supervised recognition for automatic alignment of large coherent speech recordings
-
N. Braunschweiler, M.J.F. Gales, and S. Buchholz, "Lightly supervised recognition for automatic alignment of large coherent speech recordings," in Proc. Interspeech, 2010, pp. 2222-2225.
-
(2010)
Proc. Interspeech
, pp. 2222-2225
-
-
Braunschweiler, N.1
Gales, M.J.F.2
Buchholz, S.3
-
20
-
-
33745219648
-
The development of the Cambridge University RT-04 diarisation system
-
Nov.
-
S. E. Tranter, M. J. F. Gales, R. Sinha, S. Umesh, and P. C. Woodland, "The development of the Cambridge University RT-04 diarisation system," in Proc. Fall 2004 Rich Transcription Workshop (RT-04), Nov. 2004.
-
(2004)
Proc. Fall 2004 Rich Transcription Workshop (RT-04)
-
-
Tranter, S.E.1
Gales, M.J.F.2
Sinha, R.3
Umesh, S.4
Woodland, P.C.5
-
21
-
-
85008520364
-
Transcribing meetings with the AMIDA systems
-
T. Hain, L. Burget, J. Dines, P.N. Garner, F. Grezl, A.E. Hannani, M. Huijbregts, M. Karafiat, M. Lincoln, and V. Wan, "Transcribing meetings with the AMIDA systems," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 2, pp. 486-498, 2012.
-
(2012)
IEEE Transactions on Audio, Speech, and Language Processing
, vol.20
, Issue.2
, pp. 486-498
-
-
Hain, T.1
Burget, L.2
Dines, J.3
Garner, P.N.4
Grezl, F.5
Hannani, A.E.6
Huijbregts, M.7
Karafiat, M.8
Lincoln, M.9
Wan, V.10
-
22
-
-
0032638856
-
Semi-tied covariance matrices for hidden Markov models
-
May
-
M.J.F. Gales, "Semi-tied covariance matrices for hidden Markov models," IEEE Trans. on Speech and Audio Processing, vol. 7, pp. 272-281, May 1999.
-
(1999)
IEEE Trans. on Speech and Audio Processing
, vol.7
, pp. 272-281
-
-
Gales, M.J.F.1