-
1
-
-
85032751593
-
Research developments and directions in speech recognition and understanding, Part 1
-
May
-
J. M. Baker, L. Deng, J. Glass, S. Khudanpur, C. -H. Lee, N. Morgan, and D. O'Shaughnessy, "Research developments and directions in speech recognition and understanding, Part 1, " IEEE Signal Process. Mag. , vol. 26, no. 3, pp. 75-80, May 2009.
-
(2009)
IEEE Signal Process. Mag.
, vol.26
, Issue.3
, pp. 75-80
-
-
Baker, J.M.1
Deng, L.2
Glass, J.3
Khudanpur, S.4
Lee, C.-H.5
Morgan, N.6
O'Shaughnessy, D.7
-
2
-
-
0026882842
-
Experiments with a nonlinear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars
-
P. Lockwood and J. Boudy, "Experiments with a nonlinear spectral subtractor (NSS), hidden Markov models and the projection, for robust speech recognition in cars, " Speech Commun. , vol. 11, pp. 215-228, 1992. (Pubitemid 23572493)
-
(1992)
Speech Communication
, vol.11
, Issue.2-3
, pp. 215-228
-
-
Lockwood, P.1
Boudy, J.2
-
3
-
-
0035396555
-
Noise power spectral density estimation based on optimal smoothing and minimum statistics
-
DOI 10.1109/89.928915, PII S106366760104980X
-
R. Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics, " IEEE Trans. Speech. Audio Process. , vol. 9, no. 5, pp. 504-512, Jul. 2001. (Pubitemid 32631178)
-
(2001)
IEEE Transactions on Speech and Audio Processing
, vol.9
, Issue.5
, pp. 504-512
-
-
Martin, R.1
-
4
-
-
0035342414
-
Robust automatic speech recognition with missing and unreliable acoustic data
-
DOI 10.1016/S0167-6393(00)00034-0, PII S0167639300000340
-
M. Cooke, P. Green, L. Josifovski, and A. Vizinho, "Robust automatic speech recognition with missing and unreliable acoustic data, " Speech Commun. , vol. 34, pp. 267-285, 2001. (Pubitemid 32284867)
-
(2001)
Speech Communication
, vol.34
, Issue.3
, pp. 267-285
-
-
Cooke, M.1
Green, P.2
Josifovski, L.3
Vizinho, A.4
-
5
-
-
4644317224
-
A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
-
M. Seltzer, B. Raj, and R. Stern, "A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition, " Speech Commun. , vol. 43, pp. 379-393, 2004.
-
(2004)
Speech Commun.
, vol.43
, pp. 379-393
-
-
Seltzer, M.1
Raj, B.2
Stern, R.3
-
6
-
-
11144316019
-
Decoding speech in the presence of other sources
-
DOI 10.1016/j.specom.2004.05.002, PII S0167639304000615
-
J. Barker, M. Cooke, and D. Ellis, "Decoding speech in the presence of other sources, " Speech Commun. , vol. 45, pp. 5-25, 2005. (Pubitemid 40034706)
-
(2005)
Speech Communication
, vol.45
, Issue.1
, pp. 5-25
-
-
Barker, J.P.1
Cooke, M.P.2
Ellis, D.P.W.3
-
7
-
-
0036291376
-
Uncertainty decoding with splice for noise robust speech recognition
-
J. Droppo, L. Deng, and A. Acero, "Uncertainty decoding with splice for noise robust speech recognition, " in Proc. IEEE Int. Conf. Acoust. , Speech, Signal Process. , 2002, pp. 57-60.
-
(2002)
Proc. IEEE Int. Conf. Acoust. , Speech, Signal Process.
, pp. 57-60
-
-
Droppo, J.1
Deng, L.2
Acero, A.3
-
8
-
-
18744401086
-
Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion
-
DOI 10.1109/TSA.2005.845814
-
L. Deng, J. Droppo, and A. Acero, "Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion, " IEEE Trans. Speech. Audio Process. , vol. 13, no. 3, pp. 412-421, May 2005. (Pubitemid 40666175)
-
(2005)
IEEE Transactions on Speech and Audio Processing
, vol.13
, Issue.3
, pp. 412-421
-
-
Deng, L.1
Droppo, J.2
Acero, A.3
-
9
-
-
40249103761
-
Issues with uncertainty decoding for noise robust automatic speech recognition
-
H. Liao and M. Gales, "Issues with uncertainty decoding for noise robust automatic speech recognition, " Speech Commun. , vol. 50, pp. 265-277, 2008.
-
(2008)
Speech Commun.
, vol.50
, pp. 265-277
-
-
Liao, H.1
Gales, M.2
-
10
-
-
0025681008
-
Hidden Markov model decomposition of speech and noise
-
A. Varga and R. Moore, "Hidden Markov model decomposition of speech and noise, " in Proc. IEEE Int. Conf. Acoust. , Speech, Signal Process. , 1990, pp. 845-848.
-
(1990)
Proc. IEEE Int. Conf. Acoust. , Speech, Signal Process.
, pp. 845-848
-
-
Varga, A.1
Moore, R.2
-
11
-
-
85135375893
-
HMM recognition in noise using parallel model combination
-
Berlin
-
M. Gales and S. Young, "HMM recognition in noise using parallel model combination, " in Proc. Eurospeech, Berlin, 1993.
-
(1993)
Proc. Eurospeech
-
-
Gales, M.1
Young, S.2
-
12
-
-
85009074657
-
ALGONQUIN: Iterating Laplace's method to remove multiple types of distortion for robust speech recognition
-
Aalborg, Denmark
-
B. Frey, L. Deng, A. Acero, and T. Kristjansson, "ALGONQUIN: Iterating Laplace's method to remove multiple types of distortion for robust speech recognition, " in Proc. Eurospeech, Aalborg, Denmark, 2001, pp. 901-904.
-
(2001)
Proc. Eurospeech
, pp. 901-904
-
-
Frey, B.1
Deng, L.2
Acero, A.3
Kristjansson, T.4
-
13
-
-
69249222720
-
Super-human multi-talker speech recognition: A graphical modeling approach
-
J. R. Hershey, S. J. Rennie, and P. A. Olsen, "Super-human multi-talker speech recognition: A graphical modeling approach, " Comput. Speech. Lang. , vol. 24, pp. 45-66, 2010.
-
(2010)
Comput. Speech. Lang.
, vol.24
, pp. 45-66
-
-
Hershey, J.R.1
Rennie, S.J.2
Olsen, P.A.3
-
14
-
-
85032751986
-
Single-channel multitalker speech recognition
-
S. J. Rennie, J. R. Hershey, and P. A. Olsen, "Single-channel multitalker speech recognition, " IEEE Signal Process. Mag. , vol. 27, pp. 66-80, 2010.
-
(2010)
IEEE Signal Process. Mag.
, vol.27
, pp. 66-80
-
-
Rennie, S.J.1
Hershey, J.R.2
Olsen, P.A.3
-
15
-
-
69249202377
-
Monaural speech separation and recognition challenge
-
M. Cooke, J. Hershey, and S. Rennie, "Monaural speech separation and recognition challenge, " Comput. Speech. Lang. , vol. 24, pp. 1-15, 2010.
-
(2010)
Comput. Speech. Lang.
, vol.24
, pp. 1-15
-
-
Cooke, M.1
Hershey, J.2
Rennie, S.3
-
16
-
-
50249152311
-
Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
-
Mar
-
T. Virtanen, "Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria, " IEEE Trans. Audio, Speech, Lang. Process. , vol. 15, no. 3, pp. 1066-1074, Mar. 2007.
-
(2007)
IEEE Trans. Audio, Speech, Lang. Process.
, vol.15
, Issue.3
, pp. 1066-1074
-
-
Virtanen, T.1
-
17
-
-
44949110218
-
Single-channel speech separation using sparse non-negative matrix factorization
-
Pittsburgh, PA
-
M. N. Schmidt and R. K. Olsson, "Single-channel speech separation using sparse non-negative matrix factorization, " in Proc. Interspeech, Pittsburgh, PA, 2006, pp. 2614-2617.
-
(2006)
Proc. Interspeech
, pp. 2614-2617
-
-
Schmidt, M.N.1
Olsson, R.K.2
-
18
-
-
4344607755
-
Likelihood-maximizing beamforming for robust hands-free speech recognition
-
Sep
-
M. Seltzer, B. Raj, and R. Stern, "Likelihood-maximizing beamforming for robust hands-free speech recognition, " IEEE Trans. Speech. Audio Process. , vol. 12, no. 5, pp. 489-498, Sep. 2004.
-
(2004)
IEEE Trans. Speech. Audio Process.
, vol.12
, Issue.5
, pp. 489-498
-
-
Seltzer, M.1
Raj, B.2
Stern, R.3
-
19
-
-
34250689497
-
Missing-feature based speech recognition for two simultaneous speech signals separated by ICA with a pair of humanoid ears
-
DOI 10.1109/IROS.2006.281741, 4058472, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2006
-
R. Takeda, S. Yamamoto, K. Komatani, T. Ogata, and H. Okuno, "Missing-feature based speech recognition for two simultaneous speech signals separated by ICA with a pair of humanoid ears, " in IEEE/RSJ Int. Conf. Intell. Robots Syst. , 2006, pp. 878-885. (Pubitemid 46927954)
-
(2006)
IEEE International Conference on Intelligent Robots and Systems
, pp. 878-885
-
-
Takeda, R.1
Yamamoto, S.2
Komatani, K.3
Ogata, T.4
Okuno, H.G.5
-
20
-
-
79959845286
-
The CHiME corpus: A resource and a challenge for Computational Hearing in Multisource Environments
-
H. Christensen, J. Barker, N. Ma, and P. Green, "The CHiME corpus: A resource and a challenge for Computational Hearing in Multisource Environments, " in Proc. Interspeech, 2010.
-
(2010)
Proc. Interspeech
-
-
Christensen, H.1
Barker, J.2
Ma, N.3
Green, P.4
-
21
-
-
0002059527
-
Understanding speech understanding: Towards a unified theory of speech perception
-
U. K.
-
S. Greenberg, "Understanding speech understanding: Towards a unified theory of speech perception, " in Proc. ESCA Workshop Auditory Basis Speech Percept. , W. Ainsworth and S. Greenberg, Eds. , U. K. , 1996, pp. 1-8.
-
(1996)
Proc. ESCA Workshop Auditory Basis Speech Percept.
, pp. 1-8
-
-
Greenberg, S.1
Ainsworth, W.2
Greenberg, S.3
-
22
-
-
0002296637
-
On the importance of time - A temporal representation of sound
-
M. Cooke, S. Beet, and M. Crawford, Eds. Sussex, U. K. : Wiley
-
M. Slaney and R. Lyon, "On the importance of time - A temporal representation of sound, " in Visual Representations of Speech Signals, M. Cooke, S. Beet, and M. Crawford, Eds. Sussex, U. K. : Wiley, 1993, pp. 95-116.
-
(1993)
Visual Representations of Speech Signals
, pp. 95-116
-
-
Slaney, M.1
Lyon, R.2
-
23
-
-
0344581050
-
Temporal integration and context effects in hearing
-
DOI 10.1016/S0095-4470(03)00011-1
-
B. C. J. Moore, "Temporal integration and context effects in hearing, " J. Phonetics, vol. 31, pp. 563-574, 2003. (Pubitemid 37495928)
-
(2003)
Journal of Phonetics
, vol.31
, Issue.3-4
, pp. 563-574
-
-
Moore, B.C.J.1
-
25
-
-
0029249228
-
Spectral redundancy: Intelligibility of sentences heard through narrow spectral slits
-
R. Warren, K. Riener, J. Bashford, and B. Brubaker, "Spectral redundancy: Intelligibility of sentences heard through narrow spectral slits, " Percept. Psychophys. , vol. 57, pp. 175-182, 1995.
-
(1995)
Percept. Psychophys.
, vol.57
, pp. 175-182
-
-
Warren, R.1
Riener, K.2
Bashford, J.3
Brubaker, B.4
-
26
-
-
0036713102
-
The intelligibility of speech with "holes" in the spectrum
-
K. Kasturi, P. C. Loizou, M. Dorman, and T. Spahr, "The intelligibility of speech with "holes" in the spectrum, " J. Acoust. Soc. Amer. , vol. 112, pp. 1102-1111, 2002.
-
(2002)
J. Acoust. Soc. Amer.
, vol.112
, pp. 1102-1111
-
-
Kasturi, K.1
Loizou, P.C.2
Dorman, M.3
Spahr, T.4
-
27
-
-
33644661135
-
A glimpsing model of speech perception in noise
-
DOI 10.1121/1.2166600
-
M. Cooke, "A glimpsing model of speech perception in noise, " J. Acoust. Soc. Amer. , vol. 119, pp. 1562-1573, 2006. (Pubitemid 43326025)
-
(2006)
Journal of the Acoustical Society of America
, vol.119
, Issue.3
, pp. 1562-1573
-
-
Cooke, M.1
-
28
-
-
4644336054
-
Reconstruction of missing features for robust speech recognition
-
B. Raj, M. Seltzer, and R. Stern, "Reconstruction of missing features for robust speech recognition, " Speech Commun. , vol. 43, pp. 275-296, 2004.
-
(2004)
Speech Commun.
, vol.43
, pp. 275-296
-
-
Raj, B.1
Seltzer, M.2
Stern, R.3
-
29
-
-
85009063707
-
Soft decisions in missing data techniques for robust automatic speech recognition
-
Beijing, China
-
J. Barker, L. Josifovski, M. Cooke, and P. Green, "Soft decisions in missing data techniques for robust automatic speech recognition, " in Proc. ICSLP, Beijing, China, 2000, pp. 373-376.
-
(2000)
Proc. ICSLP
, pp. 373-376
-
-
Barker, J.1
Josifovski, L.2
Cooke, M.3
Green, P.4
-
30
-
-
11144343436
-
Detection of reliable features for speech recognition in noisy conditions using a statistical criterion
-
Aalborg, Denmark
-
P. Renevey and A. Drygajlo, "Detection of reliable features for speech recognition in noisy conditions using a statistical criterion, " in Proc. CRAC, Aalborg, Denmark, 2001.
-
(2001)
Proc. CRAC
-
-
Renevey, P.1
Drygajlo, A.2
-
31
-
-
33847629729
-
On noise masking for automatic missing data speech recognition: A survey and discussion
-
DOI 10.1016/j.csl.2006.08.001, PII S0885230806000301
-
C. Cerisara, S. Demange, and J. Haton, "On noise masking for automatic missing data speech recognition: A survey and discussion, " Comput. Speech. Lang. , vol. 21, pp. 443-457, 2007. (Pubitemid 46367508)
-
(2007)
Computer Speech and Language
, vol.21
, Issue.3
, pp. 443-457
-
-
Cerisara, C.1
Demange, S.2
Haton, J.-P.3
-
32
-
-
0041360463
-
Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging
-
Sep
-
I. Cohen, "Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging, " IEEE Trans. Speech. Audio Process. , vol. 11, no. 5, pp. 466-475, Sep. 2003.
-
(2003)
IEEE Trans. Speech. Audio Process.
, vol.11
, Issue.5
, pp. 466-475
-
-
Cohen, I.1
-
33
-
-
29444448046
-
A noise-estimation algorithm for highly non-stationary environments
-
DOI 10.1016/j.specom.2005.08.005, PII S0167639305002001
-
S. Rangachari and P. C. Loizou, "A noise-estimation algorithm for highly non-stationary environments, " Speech Commun. , vol. 48, pp. 220-231, 2006. (Pubitemid 43012033)
-
(2006)
Speech Communication
, vol.48
, Issue.2
, pp. 220-231
-
-
Rangachari, S.1
Loizou, P.C.2
-
34
-
-
0034244889
-
Learning patterns of activity using real-time tracking
-
Aug
-
C. Stauffer and W. Grimson, "Learning patterns of activity using real-time tracking, " IEEE Trans. Pattern Anal. Mach. Intell. , vol. 22, no. 8, pp. 747-757, Aug. 2000.
-
(2000)
IEEE Trans. Pattern Anal. Mach. Intell.
, vol.22
, Issue.8
, pp. 747-757
-
-
Stauffer, C.1
Grimson, W.2
-
35
-
-
0025110885
-
Derivation of auditory filter shapes from notched-noise data
-
DOI 10.1016/0378-5955(90)90170-T
-
B. Glasberg and B. Moore, "Derivation of auditory filter shapes from notched-noise data, " Hearing Res. , vol. 47, pp. 103-138, 1990. (Pubitemid 20244652)
-
(1990)
Hearing Research
, vol.47
, Issue.1-2
, pp. 103-138
-
-
Glasberg, B.R.1
Moore, B.C.J.2
-
36
-
-
34748817500
-
Exploiting correlogram structure for robust speech recognition with multiple speech sources
-
DOI 10.1016/j.specom.2007.05.003, PII S016763930700088X
-
N. Ma, P. Green, J. Barker, and A. Coy, "Exploiting correlogram structure for robust speech recognition with multiple speech sources, " Speech Commun. , vol. 49, pp. 874-891, 2007. (Pubitemid 47488511)
-
(2007)
Speech Communication
, vol.49
, Issue.12
, pp. 874-891
-
-
Ma, N.1
Green, P.2
Barker, J.3
Coy, A.4
-
37
-
-
85009106519
-
Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise
-
Aalborg, Denmark
-
J. Barker, M. Cooke, and P. Green, "Robust ASR based on clean speech models: An evaluation of missing data techniques for connected digit recognition in noise, " in Proc. Eurospeech, Aalborg, Denmark, 2001, pp. 213-216.
-
(2001)
Proc. Eurospeech
, pp. 213-216
-
-
Barker, J.1
Cooke, M.2
Green, P.3
-
38
-
-
0001463644
-
A duplex theory of pitch perception
-
J. Licklider, "A duplex theory of pitch perception, " Experientia, vol. 7, pp. 128-134, 1951.
-
(1951)
Experientia
, vol.7
, pp. 128-134
-
-
Licklider, J.1
-
39
-
-
0025623060
-
A perceptual pitch detector
-
Albuquerque, NM
-
M. Slaney and R. Lyon, "A perceptual pitch detector, " in Proc. IEEE Int. Conf. Acoust. , Speech, Signal Process. , Albuquerque, NM, 1990, pp. 357-360.
-
(1990)
Proc. IEEE Int. Conf. Acoust. , Speech, Signal Process.
, pp. 357-360
-
-
Slaney, M.1
Lyon, R.2
-
40
-
-
33750368310
-
An audio-visual corpus for speech perception and automatic speech recognition
-
DOI 10.1121/1.2229005
-
M. Cooke, J. Barker, S. Cunningham, and X. Shao, "An audio-visual corpus for speech perception and automatic speech recognition, " J. Acoust. Soc. Amer. , vol. 120, pp. 2421-2424, 2006. (Pubitemid 44631681)
-
(2006)
Journal of the Acoustical Society of America
, vol.120
, Issue.5
, pp. 2421-2424
-
-
Cooke, M.1
Barker, J.2
Cunningham, S.3
Shao, X.4
-
41
-
-
0024909979
-
Some statistical issues in the comparison of speech recognition algorithms
-
L. Gillick and S. Cox, "Some statistical issues in the comparison of speech recognition algorithms, " in Proc. IEEE Int. Conf. Acoust. , Speech, Signal Process. , 1989, pp. 532-535. (Pubitemid 20604171)
-
(1989)
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
, vol.1
, pp. 532-535
-
-
Gillick, L.1
Cox, S.J.2
-
42
-
-
0031268341
-
Factorial hidden Markov models
-
Z. Ghahramani and M. I. Jordan, "Factorial hidden Markov models, " Mach. Learn. , vol. 29, pp. 245-273, 1997. (Pubitemid 127510040)
-
(1997)
Machine Learning
, vol.29
, Issue.2-3
, pp. 245-273
-
-
Ghahramani, Z.1
Jordan, M.I.2