SCOPUS Information Search Platform

IEEE Journal on Selected Topics in Signal Processing

Volume 4, Issue 6, 2010, Pages 994-1006

Speech recognition with flat direct models

Authors (3): Nguyen, Patrick (a); Heigold, Georg (b); Zweig, Geoffrey (a)

a MICROSOFT RESEARCH (United States)

b GOOGLE INC (United States)

Author keywords

Direct model; features; log linear model; maximum mutual information (MMI); speech recognition

Indexed keywords

ACOUSTIC DETECTION; AUDIO SIGNAL; DIRECT MODEL; DIRECT MODELING; FEATURES; INHERENT STRUCTURES; KEY PROBLEMS; LINEAR MODELING; LOGLINEAR MODEL; MARKOV ASSUMPTIONS; MARKOV MODEL; MAXIMUM MUTUAL INFORMATION; MUTUAL INFORMATIONS; SENTENCE ERRORS; TEMPLATE-BASED;

FEATURE EXTRACTION; HIDDEN MARKOV MODELS; REGRESSION ANALYSIS; STEREOPHONIC BROADCASTING; TELEPHONE SETS;

SPEECH RECOGNITION;

EID: 78649280264 PISSN: 19324553 EISSN: None Source Type: Journal
DOI: 10.1109/JSTSP.2010.2080812 Document Type: Article

Times cited: (15)

References (25)

1
- 34047266376
- Advances in speech transcription at IBM under the DARPA EARS program
- Sep.
- S. Chen, B. Kingsbury, L. Mangu, D. Povey, G. Saon, H. Soltau, and G. Zweig, "Advances in speech transcription at IBM under the DARPA EARS program," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 5, pp. 1596-1608, Sep. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.5 , pp. 1596-1608
- Chen, S.¹ Kingsbury, B.² Mangu, L.³ Povey, D.⁴ Saon, G.⁵ Soltau, H.⁶ Zweig, G.⁷

2
- 34047266379
- Progress in the CU-HTK broadcast news transcription system
- DOI 10.1109/TASL.2006.878264
- M. Gales, D. Kim, P. Woodland, H. Chan, D. Mrva, R. Sinha, and S. Tranter, "Progress in the CU-HTK broadcast news transcription system," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 5, pp. 1513-1525, Sep. 2006. (Pubitemid 46547578)
- (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.5 , pp. 1513-1525
- Gales, M.J.F.¹ Kim, D.Y.² Woodland, P.C.³ Chan, H.Y.⁴ Mrva, D.⁵ Sinha, R.⁶ Tranter, S.E.⁷

3
- 34147119672
- Advances in transcription of broadcast news and conversational telephone speech within the combined EARS BBN/LIMSI system
- Sep.
- S. Matsoukas, J.-L. Gauvain, G. Adda, T. Colthurst, C.-L. Kao, O. Kimball, L. Lamel, F. Lefevre, J. Ma, J. Makhoul, L. Nguyen, R. Prasad, R. Schwartz, H. Schwenk, and B. Xiang, "Advances in transcription of broadcast news and conversational telephone speech within the combined EARS BBN/LIMSI system," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 5, pp. 1541-1556, Sep. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process. , vol.14 , Issue.5 , pp. 1541-1556
- Matsoukas, S.¹ Gauvain, J.-L.² Adda, G.³ Colthurst, T.⁴ Kao, C.-L.⁵ Kimball, O.⁶ Lamel, L.⁷ Lefevre, F.⁸ Ma, J.⁹ Makhoul, J.¹⁰ Nguyen, L.¹¹ Prasad, R.¹² Schwartz, R.¹³ Schwenk, H.¹⁴ Xiang, B.¹⁵

4
- 4544293504
- Moving beyond the 'beads-on-a-string' model of speech
- M. Ostendorf, "Moving beyond the 'beads-on-a-string' model of speech," in Proc. IEEE ASRU Workshop, 1999, pp. 79-84.
- (1999) Proc. IEEE ASRU Workshop , pp. 79-84
- Ostendorf, M.¹

5
- 85009110188
- Learning long-term temporal features in LVCSR using neural networks
- B. Y. Chen, Q. Zhu, and N. Morgan, "Learning long-term temporal features in LVCSR using neural networks," in Proc. ICSLP, 2004.
- (2004) Proc. ICSLP
- Chen, B.Y.¹ Zhu, Q.² Morgan, N.³

6
- 0032658253
- Temporal patterns (TRAPS) in ASR of noisy speech
- H. Hermansky and S. Sharma, "Temporal patterns (TRAPS) in ASR of noisy speech," in Proc. ICASSP, 1999, pp. 289-292.
- (1999) Proc. ICASSP , pp. 289-292
- Hermansky, H.¹ Sharma, S.²

7
- 85009227403
- Data driven example based continuous speech recognition
- M. De Wachter, K. Demuynck, D. Van Compernolle, and P. Wambacq, "Data driven example based continuous speech recognition," in Proc. Eurospeech, 2003, pp. 1133-1136.
- (2003) Proc. Eurospeech , pp. 1133-1136
- De Wachter, M.¹ Demuynck, K.² Van Compernolle, D.³ Wambacq, P.⁴

8
- 45549086638
- Template-based continuous speech recognition
- May
- M. De Wachter, M. Matton, K. Demuynck, P. Wambacq, R. Cools, and D. Van Compernolle, "Template-based continuous speech recognition," IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 4, pp. 1377-1390, May 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.4 , pp. 1377-1390
- De Wachter, M.¹ Matton, M.² Demuynck, K.³ Wambacq, P.⁴ Cools, R.⁵ Van Compernolle, D.⁶

9
- 51449113682
- Live search for mobile: Web services by voice on the cellphone
- A. Acero, N. Bernstein, R. Chambers, Y. Ju, X. Li, J. Odell, P. Nguyen, O. Scholz, and G. Zweig, "Live search for mobile: Web services by voice on the cellphone," in Proc. ICASSP, 2007, pp. 5256-5259.
- (2007) Proc. ICASSP , pp. 5256-5259
- Acero, A.¹ Bernstein, N.² Chambers, R.³ Ju, Y.⁴ Li, X.⁵ Odell, J.⁶ Nguyen, P.⁷ Scholz, O.⁸ Zweig, G.⁹

10
- 78649245809
- [Online] Available
- [Online]. Available: http://www.tellme.com/you

11
- 78649250686
- [Online] Available
- [Online]. Available: http://vlingo.com

12
- 78649297785
- [Online] Available
- [Online]. Available: http://www.google.com/mobile/apple/app.html

13
- 78649256443
- [Online] Available
- [Online]. Available: http://mobile.yahoo.com/onesearch

14
- 84946710255
- Maximum entropy direct models for speech recognition
- H.-K. J. Kuo and Y. Gao, "Maximum entropy direct models for speech recognition," in Proc. ASRU, 2003.
- (2003) Proc. ASRU
- Kuo, H.-K.J.¹ Gao, Y.²

15
- 70349208656
- A flat direct model for speech recognition
- G. Heigold, G. Zweig, X. Li, and P. Nguyen, "A flat direct model for speech recognition," in Proc. ICASSP, 2009, pp. 3861-3864.
- (2009) Proc. ICASSP , pp. 3861-3864
- Heigold, G.¹ Zweig, G.² Li, X.³ Nguyen, P.⁴

16
- 70450201983
- Maximum mutual information multiphone units in direct modeling
- G. Zweig and P. Nguyen, "Maximum mutual information multiphone units in direct modeling," in Proc. Interspeech, 2009.
- (2009) Proc. Interspeech
- Zweig, G.¹ Nguyen, P.²

17
- 33846253039
- Hidden conditional random fields for phone classification
- A. Gunawardana, M. Mahajan, A. Acero, and J. C. Platt, "Hidden conditional random fields for phone classification," in Proc. Interspeech, 2005.
- (2005) Proc. Interspeech
- Gunawardana, A.¹ Mahajan, M.² Acero, A.³ Platt, J.C.⁴

18
- 0033887568
- A survey of smoothing techniques for ME models
- Jan.
- S. Chen and R. Rosenfeld, "A survey of smoothing techniques for ME models," IEEE Trans. Speech Audio Process., vol. 8, no. 1, pp. 37-50, Jan. 2000.
- (2000) IEEE Trans. Speech Audio Process. , vol.8 , Issue.1 , pp. 37-50
- Chen, S.¹ Rosenfeld, R.²

19
- 0004109478
- Rprop - Description and implementation details
- M. Riedmiller, "Rprop - Description and implementation details," Univ. of Karlsruhe, Tech. Rep., Jan. 1994.
- (1994) Tech. Rep.
- Riedmiller, M.¹

20
- 85149106909
- Discriminative language modeling with conditional random fields and the perceptron algorithm
- B. Roark, M. Saraclar, M. Collins, and M. Johnson, "Discriminative language modeling with conditional random fields and the perceptron algorithm," in Proc. ACL, 2004.
- (2004) Proc. ACL
- Roark, B.¹ Saraclar, M.² Collins, M.³ Johnson, M.⁴

21
- 56149117265
- An investigation into a simulation of episodic memory for automatic speech recognition
- Sep.
- V. Maier and R. Moore, "An investigation into a simulation of episodic memory for automatic speech recognition," in Proc. Interspeech, Sep. 2005.
- (2005) Proc. Interspeech
- Maier, V.¹ Moore, R.²

22
- 0032165145
- A multispan language modeling framework for large vocabulary speech recognition
- J. R. Bellegarda, "A multispan language modeling framework for large vocabulary speech recognition," IEEE Trans. Speech Audio Process., vol. 6, no. 5, pp. 456-467, 1998.
- (1998) IEEE Trans. Speech Audio Process. , vol.6 , Issue.5 , pp. 456-467
- Bellegarda, J.R.¹

23
- 0035340439
- Syllable-based large vocabulary continuous speech recognition
- May
- A. Ganapathiraju, J. Hamaker, J. Picone, M. Ordowski, and G. Doddington, "Syllable-based large vocabulary continuous speech recognition," IEEE Trans. Speech and Audio Processing, vol. 9, no. 4, pp. 358-366, May 2001.
- (2001) IEEE Trans. Speech and Audio Processing , vol.9 , Issue.4 , pp. 358-366
- Ganapathiraju, A.¹ Hamaker, J.² Picone, J.³ Ordowski, M.⁴ Doddington, G.⁵

24
- 0029725372
- Design of a speech recognition system based on acoustically derived segmental units
- M. Bacchiani, M. Ostendorf, Y. Sagisaka, and K. Paliwal, "Design of a speech recognition system based on acoustically derived segmental units," in Proc. ICASSP, 1996, pp. 443-446.
- (1996) Proc. ICASSP , pp. 443-446
- Bacchiani, M.¹ Ostendorf, M.² Sagisaka, Y.³ Paliwal, K.⁴

25
- 0036476255
- Automatic generation of subword units for speech recognition systems
- Feb.
- R. Singh, B. Raj, and R. Stern, "Automatic generation of subword units for speech recognition systems," IEEE Trans. Speech and Audio Processing, vol. 10, no. 2, pp. 89-99, Feb. 2002.
- (2002) IEEE Trans. Speech and Audio Processing , vol.10 , Issue.2 , pp. 89-99
- Singh, R.¹ Raj, B.² Stern, R.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.