-
1
-
-
3442876970
-
Phase-based dual-microphone robust speech enhancement
-
P. Aarabi and G. Shi, "Phase-based dual-microphone robust speech enhancement," IEEE Trans. Syst., Man, Cybern., B, vol. 34, no. 4, pp. 1763-1773, 2004.
-
(2004)
IEEE Trans. Syst., Man, Cybern., B
, vol.34
, Issue.4
, pp. 1763-1773
-
-
Aarabi, P.1
Shi, G.2
-
2
-
-
0028516073
-
How do humans process and recognize speech?
-
J. B. Allen, "How do humans process and recognize speech?," IEEE Trans. Speech Audio Processing, vol. 2, no. 4, pp. 567-577, 1994.
-
(1994)
IEEE Trans. Speech Audio Processing
, vol.2
, Issue.4
, pp. 567-577
-
-
Allen, J.B.1
-
3
-
-
84863773378
-
Frequency-domain linear prediction for temporal features
-
M. Athineos and D. Ellis, "Frequency-domain linear prediction for temporal features," in Proc. IEEE ASRU Workshop, 2003, pp. 261-266.
-
(2003)
Proc. IEEE ASRU Workshop
, pp. 261-266
-
-
Athineos, M.1
Ellis, D.2
-
4
-
-
23744508888
-
Multiresolution spectrotemporal analysis of complex sounds
-
DOI 10.1121/1.1945807
-
T. Chi, P. Ru, and S. A. Shamma, "Multiresolution spectrotemporal analysis of complex sounds," J. Acoust. Soc. Amer., vol. 118, no. 2, pp. 887-906, 2005. (Pubitemid 41129224)
-
(2005)
Journal of the Acoustical Society of America
, vol.118
, Issue.2
, pp. 887-906
-
-
Chi, T.1
Ru, P.2
Shamma, S.A.3
-
5
-
-
0024392496
-
Application of an auditory model to speech recognition
-
DOI 10.1121/1.397756
-
J. R. Cohen, "Application of an auditory model to speech recognition," J. Acoust. Soc. Amer., vol. 85, no. 6, pp. 2623-2629, 1989. (Pubitemid 19160389)
-
(1989)
Journal of the Acoustical Society of America
, vol.85
, Issue.6
, pp. 2623-2629
-
-
Cohen, J.R.1
-
6
-
-
33745224873
-
Vocal tract normalization in speech recognition: Compensating for systematic speaker variability
-
J. R. Cohen, T. Kamm, and A. G. Andreou, "Vocal tract normalization in speech recognition: Compensating for systematic speaker variability," J. Acoust. Soc. Amer., vol. 97, no. 5, pp. 3246-3247, 1995.
-
(1995)
J. Acoust. Soc. Amer.
, vol.97
, Issue.5
, pp. 3246-3247
-
-
Cohen, J.R.1
Kamm, T.2
Andreou, A.G.3
-
7
-
-
0019053271
-
Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
-
S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Trans. Acoust. Speech Signal Processing, vol. 28, no. 4, pp. 357-366, 1980. (Pubitemid 11464930)
-
(1980)
IEEE Transactions on Acoustics, Speech, and Signal Processing
, vol.ASSP-28
, Issue.4
, pp. 357-366
-
-
Davis, S.B.1
Mermelstein, P.2
-
8
-
-
0023516708
-
A composite auditory model for processing speech sounds
-
L. Deng and D. C. Geisler, "A composite auditory model for processing speech sounds," J. Acoust. Soc. Amer., vol. 82, no. 6, pp. 2001-2012, 1987.
-
(1987)
J. Acoust. Soc. Amer.
, vol.82
, Issue.6
, pp. 2001-2012
-
-
Deng, L.1
Geisler, D.C.2
-
9
-
-
0035097825
-
Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex
-
D. A. Depireux, J. Z. Simon, D. J. Klein, and S. A. Shamma, "Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex," J. Neurophysiol., vol. 85, no. 3, pp. 1220-1234, 2001. (Pubitemid 32209608)
-
(2001)
Journal of Neurophysiology
, vol.85
, Issue.3
, pp. 1220-1234
-
-
Depireux, D.A.1
Simon, J.Z.2
Klein, D.J.3
Shamma, S.A.4
-
10
-
-
0002439510
-
Auditory patterns
-
H. Fletcher, "Auditory patterns," Rev. Mod. Phys., vol. 12, no. 1, pp. 47-65, 1940.
-
(1940)
Rev. Mod. Phys.
, vol.12
, Issue.1
, pp. 47-65
-
-
Fletcher, H.1
-
11
-
-
84955013022
-
Loudness, its definition, measurement, and calculation
-
H. Fletcher and W. A. Munson, "Loudness, its definition, measurement, and calculation," J. Acoust. Soc. Amer., vol. 5, no. 2, pp. 82-108, 1933.
-
(1933)
J. Acoust. Soc. Amer.
, vol.5
, Issue.2
, pp. 82-108
-
-
Fletcher, H.1
Munson, W.A.2
-
12
-
-
84953657538
-
Factors governing the intelligibility of speech sounds
-
N. R. French and J. C. Steinberg, "Factors governing the intelligibility of speech sounds," J. Acoust. Soc. Amer., vol. 19, no. 1, pp. 90-119, 1947.
-
(1947)
J. Acoust. Soc. Amer.
, vol.19
, Issue.1
, pp. 90-119
-
-
French, N.R.1
Steinberg, J.C.2
-
13
-
-
0022548705
-
On the role of spectral transition for speech perception
-
S. Furui, "On the role of spectral transition for speech perception," J. Acoust. Soc. Amer., vol. 80, no. 4, pp. 1016-1025, 1986. (Pubitemid 16023317)
-
(1986)
Journal of the Acoustical Society of America
, vol.80
, Issue.4
, pp. 1016-1025
-
-
Furui, S.1
-
14
-
-
84991416125
-
Auditory nerve representation as a front-end for speech recognition in a noisy environment
-
O. Ghitza, "Auditory nerve representation as a front-end for speech recognition in a noisy environment," Comput. Speech Lang., vol. 1, no. 2, pp. 109-130, 1986.
-
(1986)
Comput. Speech Lang.
, vol.1
, Issue.2
, pp. 109-130
-
-
Ghitza, O.1
-
15
-
-
85037528026
-
Transcription methods for consistency, volume and efficiency
-
M. Glenn, S. Strassel, H. Lee, K. Maeda, R. Zakhary, and X. Li, "Transcription methods for consistency, volume and efficiency," in Proc. LREC 2010, Valletta, Malta.
-
Proc. LREC 2010, Valletta, Malta
-
-
Glenn, M.1
Strassel, S.2
Lee, H.3
Maeda, K.4
Zakhary, R.5
Li, X.6
-
16
-
-
0025041264
-
Perceptual linear predictive (PLP) analysis of speech
-
DOI 10.1121/1.399423
-
H. Hermansky, "Perceptual linear predictive (PLP) analysis of speech," J. Acoust. Soc. Amer., vol. 87, no. 4, pp. 1738-1752, 1990. (Pubitemid 20256470)
-
(1990)
Journal of the Acoustical Society of America
, vol.87
, Issue.4
, pp. 1738-1752
-
-
Hermansky, H.1
-
17
-
-
0033709098
-
Tandem connectionist feature extraction for conventional HMM systems
-
H. Hermansky, D. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proc. IEEE ICASSP, Istanbul, Turkey, 2000, pp. 1635-1638.
-
(2000)
Proc. IEEE ICASSP, Istanbul, Turkey
, pp. 1635-1638
-
-
Hermansky, H.1
Ellis, D.2
Sharma, S.3
-
18
-
-
0028517164
-
RASTA processing of speech
-
H. Hermansky and N. Morgan, "RASTA processing of speech," IEEE Trans. Speech Audio Processing, vol. 2, no. 4, pp. 578-589, 1994.
-
(1994)
IEEE Trans. Speech Audio Processing
, vol.2
, Issue.4
, pp. 578-589
-
-
Hermansky, H.1
Morgan, N.2
-
19
-
-
0032658253
-
Temporal patterns (TRAPS) in ASR of noisy speech
-
H. Hermansky and S. Sharma, "Temporal patterns (TRAPS) in ASR of noisy speech," in Proc. IEEE ICASSP, Phoenix, AZ, 1999, pp. 1255-1258.
-
(1999)
Proc. IEEE ICASSP, Phoenix, AZ
, pp. 1255-1258
-
-
Hermansky, H.1
Sharma, S.2
-
20
-
-
0030365517
-
Towards ASR on partially corrupted speech
-
H. Hermansky, S. Tibrewala, and M. Pavel, "Towards ASR on partially corrupted speech," in Proc. ICSLP, Philadelphia, PA, 1996, vol. 1, pp. 462-465.
-
(1996)
Proc. ICSLP, Philadelphia, PA
, vol.1
, pp. 462-465
-
-
Hermansky, H.1
Tibrewala, S.2
Pavel, M.3
-
21
-
-
0001887874
-
A place theory of sound localization
-
L. A. Jeffress, "A place theory of sound localization," J. Comp. Physiol. Psychol., vol. 41, no. 1, pp. 35-39, 1948.
-
(1948)
J. Comp. Physiol. Psychol.
, vol.41
, Issue.1
, pp. 35-39
-
-
Jeffress, L.A.1
-
22
-
-
1642342844
-
Neural Processing of Amplitude-Modulated Sounds
-
DOI 10.1152/physrev.00029.2003
-
P. X. Joris, C. E. Schreiner, and A. Rees, "Neural processing of amplitude-modulated sounds," Physiol. Rev., vol. 84, no. 2, pp. 541-577, 2004. (Pubitemid 38365492)
-
(2004)
Physiological reviews
, vol.84
, Issue.2
, pp. 541-577
-
-
Joris, P.X.1
Schreiner, C.E.2
Rees, A.3
-
24
-
-
79959834164
-
Automatic selection of thresholds for signal separation algorithms based on interaural delay
-
C. Kim, R. M. Stern, K. Eom, and J. Lee, "Automatic selection of thresholds for signal separation algorithms based on interaural delay," in Proc. Interspeech, Makuhari, Japan, 2010.
-
(2010)
Proc. Interspeech, Makuhari, Japan
-
-
Kim, C.1
Stern, R.M.2
Eom, K.3
Lee, J.4
-
25
-
-
84910064377
-
Power-normalized cepstral coefficients (PNCC) for robust speech recognition
-
to be published
-
C. Kim and R. M. Stern, "Power-normalized cepstral coefficients (PNCC) for robust speech recognition," IEEE Trans. Audio, Speech, Lang. Processing, to be published.
-
IEEE Trans. Audio, Speech, Lang. Processing
-
-
Kim, C.1
Stern, R.M.2
-
26
-
-
0032785783
-
Auditory processing of speech signals for robust speech recognition in real world noisy environments
-
D. Kim, S. Lee, and R. M. Kil, "Auditory processing of speech signals for robust speech recognition in real world noisy environments," IEEE Trans. Speech Audio Processing, vol. 7, no. 1, pp. 55-69, 1999.
-
(1999)
IEEE Trans. Speech Audio Processing
, vol.7
, Issue.1
, pp. 55-69
-
-
Kim, D.1
Lee, S.2
Kil, R.M.3
-
28
-
-
85009227802
-
Localized spectro-temporal features for automatic speech recognition
-
M. Kleinschmidt, "Localized spectro-temporal features for automatic speech recognition," in Proc. Eurospeech, 2003, pp. 2573-2576.
-
(2003)
Proc. Eurospeech
, pp. 2573-2576
-
-
Kleinschmidt, M.1
-
29
-
-
0001463644
-
A duplex theory of pitch perception
-
J. C. R. Licklider, "A duplex theory of pitch perception," Experientia, vol. 7, no. 4, pp. 128-134, 1951.
-
(1951)
Experientia
, vol.7
, Issue.4
, pp. 128-134
-
-
Licklider, J.C.R.1
-
30
-
-
70450168923
-
Subband temporal modulation spectrum normalization for automatic speech recognition in reverberant environments
-
X. Lu, M. Unoki, and S. Nakamura, "Subband temporal modulation spectrum normalization for automatic speech recognition in reverberant environments," in Proc. Interspeech 2009.
-
(2009)
Proc. Interspeech
-
-
Lu, X.1
Unoki, M.2
Nakamura, S.3
-
31
-
-
79251542316
-
A computational model of filtering, detection and compression in the cochlea
-
R. F. Lyon, "A computational model of filtering, detection and compression in the cochlea," in Proc. ICASSP, Paris, France, 1982, pp. 1282-1285.
-
(1982)
Proc. ICASSP, Paris, France
, pp. 1282-1285
-
-
Lyon, R.F.1
-
32
-
-
0020497765
-
A computational model of binaural localization and separation
-
R. F. Lyon, "A computational model of binaural localization and separation," in Proc. ICASSP, Boston, MA, 1983, pp. 1148-1151.
-
(1983)
Proc. ICASSP, Boston, MA
, pp. 1148-1151
-
-
Lyon, R.F.1
-
33
-
-
78049405087
-
A comparative study on system combination schemes for LVCSR
-
Dallas, TX
-
C. Ma, K.-K. J. Kuo, H. Soltau, X. Cui, U. Chaudhari, L. Mangu, and C.-H. Lee, "A comparative study on system combination schemes for LVCSR," in Proc. ICASSP 2010, Dallas, TX, pp. 4394-4397.
-
Proc. ICASSP 2010
, pp. 4394-4397
-
-
Ma, C.1
Kuo, K.-K.J.2
Soltau, H.3
Cui, X.4
Chaudhari, U.5
Mangu, L.6
Lee, C.-H.7
-
34
-
-
0034296009
-
Finding consensus in speech recognition; Word error minimization and other applications of confusion networks
-
L. Mangu, E. Brill, and A. Stolcke, "Finding consensus in speech recognition; word error minimization and other applications of confusion networks," Comput. Speech Lang., vol. 14, no. 4, pp. 373-400, 2000.
-
(2000)
Comput. Speech Lang.
, vol.14
, Issue.4
, pp. 373-400
-
-
Mangu, L.1
Brill, E.2
Stolcke, A.3
-
36
-
-
34047272330
-
Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations
-
N. Mesgarani, M. Slaney, and S. Shamma, "Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations," IEEE Trans. Audio, Speech, Lang. Processing, vol. 14, no. 3, pp. 920-929, 2006.
-
(2006)
IEEE Trans. Audio, Speech, Lang. Processing
, vol.14
, Issue.3
, pp. 920-929
-
-
Mesgarani, N.1
Slaney, M.2
Shamma, S.3
-
38
-
-
0020816083
-
Suggested formulae for calculating auditory-filter bandwidths and excitation patterns
-
B. C. J. Moore and B. R. Glasberg, "Suggested formulae for calculating auditory-filter bandwidths and excitation patterns," J. Acoust. Soc. Amer., vol. 74, no. 3, pp. 750-753, 1983. (Pubitemid 13019047)
-
(1983)
Journal of the Acoustical Society of America
, vol.74
, Issue.3
, pp. 750-753
-
-
Moore, B.C.J.1
Glasberg, B.R.2
-
39
-
-
84255177123
-
Deep and wide: Multiple layers in automatic speech recognition
-
Jan.
-
N. Morgan, "Deep and wide: Multiple layers in automatic speech recognition," IEEE Trans. Audio, Speech, Lang. Processing (Special Issue on Deep Learning), vol. 20, no. 1, pp. 7-13, Jan. 2012.
-
(2012)
IEEE Trans. Audio, Speech, Lang. Processing (Special Issue on Deep Learning)
, vol.20
, Issue.1
, pp. 7-13
-
-
Morgan, N.1
-
40
-
-
44949110409
-
Environmental robustness in automatic speech recognition using physiologically motivated signal processing
-
Y. Ohshima and R. M. Stern, "Environmental robustness in automatic speech recognition using physiologically motivated signal processing," in Proc. ICSLP 1994, pp. 1619-1622.
-
(1994)
Proc. ICSLP
, pp. 1619-1622
-
-
Ohshima, Y.1
Stern, R.M.2
-
41
-
-
0031187171
-
Speech recognition by machines and humans
-
PII S0167639397000216
-
R. P. Lippmann, "Speech recognition by machines and humans," Speech Commun., vol. 22, no. 1, pp. 1-15, 1997. (Pubitemid 127403436)
-
(1997)
Speech Communication
, vol.22
, Issue.1
, pp. 1-15
-
-
Lippmann, R.P.1
-
42
-
-
0000460671
-
Complex sounds and auditory images
-
Y. Cazals, L. Demany, and K. Horner, Eds. Oxford: Pergamon
-
R. D. Patterson, K. Robinson, J. Holdsworth, D. McKeown, C. Zhang, and M. Allerhand, "Complex sounds and auditory images," in Auditory Physiology and Perception (Proc. 9th Int. Symp. Hearing), Y. Cazals, L. Demany, and K. Horner, Eds. Oxford: Pergamon, 1992, pp. 429-446.
-
(1992)
Auditory Physiology and Perception (Proc. 9th Int. Symp. Hearing)
, pp. 429-446
-
-
Patterson, R.D.1
Robinson, K.2
Holdsworth, J.3
McKeown, D.4
Zhang, C.5
Allerhand, M.6
-
43
-
-
0023841401
-
Vowel processing by a model of the auditory periphery: A comparison to eighth-nerve responses
-
K. L. Payton, "Vowel processing by a model of the auditory periphery: A comparison to eighth-nerve responses," J. Acoust. Soc. Amer., vol. 83, no. 1, pp. 145-162, 1988. (Pubitemid 18036631)
-
(1988)
Journal of the Acoustical Society of America
, vol.83
, Issue.1
, pp. 145-162
-
-
Payton, K.L.1
-
45
-
-
84867817382
-
-
M.S. thesis, Dept. Elect. Eng. Comput. Sciences, Univ. California, Berkeley, Spring
-
S. Ravuri, "On the use of spectro-temporal features in noise-additive speech," M.S. thesis, Dept. Elect. Eng. Comput. Sciences, Univ. California, Berkeley, Spring 2011.
-
(2011)
On the Use of Spectro-temporal Features in Noise-additive Speech
-
-
Ravuri, S.1
-
46
-
-
0142026377
-
Speech segregation based on sound localization
-
DOI 10.1121/1.1610463
-
N. Roman, DeL. Wang, and G. J. Brown, "Speech segregation based on sound localization," J. Acoust. Soc. Amer., vol. 114, no. 4, pp. 2236-2252, 2003. (Pubitemid 37266649)
-
(2003)
Journal of the Acoustical Society of America
, vol.114
, Issue.4
, pp. 2236-2252
-
-
Roman, N.1
Wang, D.2
Brown, G.J.3
-
48
-
-
17344367464
-
Recognition of complex acoustic signals
-
T. H. Bullock, Ed. Abakon Verlag
-
M. R. Schroeder, "Recognition of complex acoustic signals," in Life Sciences Research Report 5, T. H. Bullock, Ed. Abakon Verlag, 1977.
-
(1977)
Life Sciences Research Report 5
-
-
Schroeder, M.R.1
-
49
-
-
84928837806
-
A joint synchrony/mean-rate model of auditory speech processing
-
S. Seneff, "A joint synchrony/mean-rate model of auditory speech processing," J. Phonet., vol. 15, no. 1, pp. 55-76, 1988.
-
(1988)
J. Phonet.
, vol.15
, Issue.1
, pp. 55-76
-
-
Seneff, S.1
-
50
-
-
0031647650
-
Speech analysis and recognition using interval statistics generated from a composite auditory model
-
H. Sheikhzadeh and L. Deng, "Speech analysis and recognition using interval statistics generated from a composite auditory model," IEEE Trans. Speech Audio Processing, vol. 6, no. 1, pp. 50-54, 1998.
-
(1998)
IEEE Trans. Speech Audio Processing
, vol.6
, Issue.1
, pp. 50-54
-
-
Sheikhzadeh, H.1
Deng, L.2
-
51
-
-
84868663836
-
Binaural sound localization
-
D. Wang and G. J. Brown, Eds. New York: IEEE Press
-
R. M. Stern, G. J. Brown, and D. Wang, "Binaural sound localization," in Computational Auditory Scene Analysis, D. Wang and G. J. Brown, Eds. New York: IEEE Press, 2006, pp. 147-185.
-
(2006)
Computational Auditory Scene Analysis
, pp. 147-185
-
-
Stern, R.M.1
Brown, G.J.2
Wang, D.3
-
52
-
-
56149126779
-
'Polyaural' array processing for automatic speech recognition in degraded environments
-
R. M. Stern, E. Gouvêa, and G. Thattai, "'Polyaural' array processing for automatic speech recognition in degraded environments," in Proc. Interspeech 2007.
-
(2007)
Proc. Interspeech
-
-
Stern, R.M.1
Gouvêa, E.2
Thattai, G.3
-
53
-
-
34447546202
-
On the psychophysical law
-
S. S. Stevens, "On the psychophysical law," Psychol. Rev., vol. 64, pp. 153-181, 1957.
-
(1957)
Psychol. Rev.
, vol.64
, pp. 153-181
-
-
Stevens, S.S.1
-
54
-
-
84955035459
-
A scale for the measurement of the psychological magnitude pitch
-
S. S. Stevens, J. Volkman, and E. Newman, "A scale for the measurement of the psychological magnitude pitch," J. Acoust. Soc. Amer., vol. 8, no. 3, pp. 185-190, 1937.
-
(1937)
J. Acoust. Soc. Amer.
, vol.8
, Issue.3
, pp. 185-190
-
-
Stevens, S.S.1
Volkman, J.2
Newman, E.3
-
55
-
-
0032828464
-
A model of auditory perception as front end for automatic speech recognition
-
J. Tchorz and B. Kollmeier, "A model of auditory perception as front end for automatic speech recognition," J. Acoust. Soc. Amer., vol. 106, no. 4, pp. 2040-2060, 1999.
-
(1999)
J. Acoust. Soc. Amer.
, vol.106
, Issue.4
, pp. 2040-2060
-
-
Tchorz, J.1
Kollmeier, B.2
-
56
-
-
84867771218
-
-
Hoboken, NJ: Wiley
-
T. Virtanen, R. Singh, and B. Raj, Eds., Noise Robust Techniques for Automatic Speech Recognition. Hoboken, NJ: Wiley, 2012.
-
(2012)
Noise Robust Techniques for Automatic Speech Recognition
-
-
Virtanen, T.1
Singh, R.2
Raj, B.3
-
57
-
-
0035122055
-
A phenomenological model for the responses of auditory-nerve fibers: I. Nonlinear tuning with compression and suppression
-
DOI 10.1121/1.1336503
-
X. Zhang, M. G. Heinz, I. C. Bruce, and L. H. Carney, "A phenomenological model for the responses of auditory-nerve fibers: I. Nonlinear tuning with compression and suppression," J. Acoust. Soc. Amer., vol. 109, no. 2, pp. 648-670, 2001. (Pubitemid 32144001)
-
(2001)
Journal of the Acoustical Society of America
, vol.109
, Issue.2
, pp. 648-670
-
-
Zhang, X.1
Heinz, M.G.2
Bruce, I.C.3
Carney, L.H.4
-
58
-
-
84953656445
-
Subdivision of the audible frequency range into critical bands (frequenzgruppen)
-
E. Zwicker, "Subdivision of the audible frequency range into critical bands (frequenzgruppen)," J. Acoust. Soc. Amer., vol. 33, p. 248, 1961.
-
(1961)
J. Acoust. Soc. Amer.
, vol.33
, p. 248
-
-
Zwicker, E.1
-
59
-
-
0022976531
-
Extension of a binaural cross-correlation model by contralateral inhibition: I. Simulation of lateralization for stationary signals
-
W. Lindemann, "Extension of a binaural cross-correlation model by contralateral inhibition: I. Simulation of lateralization for stationary signals," J. Acoust. Soc. Amer., vol. 80, no. 6, pp. 1608-1622.
-
J. Acoust. Soc. Amer.
, vol.80
, Issue.6
, pp. 1608-1622
-
-
Lindemann, W.1