SCOPUS Information Search Platform

Studies in Fuzziness and Soft Computing

Volume 194, 2006, Pages 137-186

Neural probabilistic language models

(5) Bengio, Yoshua (a); Schwenk, Holger (b); Senécal, Jean Sébastien (a); Morin, Fréderic (a); Gauvain, Jean Luc (b)

(a) Université de Montréal (Canada)

(b) CNRS (France)

Author keywords

[No Author keywords available]

Indexed keywords

EID: 33845260073 PISSN: 14349922 EISSN: None Source Type: Book Series
DOI: 10.1007/10985687_6 Document Type: Article

Times cited : (364)

References (56)

1
- 33845266780
- Automatically tuned linear algebra software
- Automatically tuned linear algebra software, https://sourceforge.net/projects/math-atlas/atlas

2
- 0032264186
- Distributional clustering of words for text classification
- Baker, D. and McCallum, A. (1998). Distributional clustering of words for text classification. In SIGIR'98.
- (1998) SIGIR'98
- Baker, D.¹ McCallum, A.²

3
- 0006273786
- A latent semantic analysis framework for large-span language modeling
- Rhodes, Greece
- Bellegarda, J. (1997). A latent semantic analysis framework for large-span language modeling. In Proceedings of Eurospeech 97, pages 1451-1454, Rhodes, Greece.
- (1997) Proceedings of Eurospeech 97 , pp. 1451-1454
- Bellegarda, J.¹

4
- 0034187310
- Taking on the curse of dimensionality in joint distributions using neural networks
- Bengio, S. and Bengio, Y. (2000a). Taking on the curse of dimensionality in joint distributions using neural networks. IEEE Transactions on Neural Networks, special issue on Data Mining and Knowledge Discovery, 11(3), 550-557.
- (2000) IEEE Transactions on Neural Networks, Special Issue on Data Mining and Knowledge Discovery , vol.11 , Issue.3 , pp. 550-557
- Bengio, S.¹ Bengio, Y.²

5
- 0142223338
- Modeling high-dimensional discrete data with multi-layer neural networks
- In S. Solla, T. Leen, and K.-R. Muller, editors, pages 400-406. MIT Press
- Bengio, Y. and Bengio, S. (2000b). Modeling high-dimensional discrete data with multi-layer neural networks. In S. Solla, T. Leen, and K.-R. Muller, editors, Advances in Neural Information Processing Systems 12, pages 400-406. MIT Press.
- (2000) Advances in Neural Information Processing Systems , vol.12
- Bengio, Y.¹ Bengio, S.²

6
- 10944221006
- Quick training of probabilistic neural nets by sampling
- In AI and Statistics. Key West, Florida
- Bengio, Y. and Senecal, J.-S. (2003). Quick training of probabilistic neural nets by sampling. In Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, volume 9, Key West, Florida. AI and Statistics.
- (2003) Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics , vol.9
- Bengio, Y.¹ Senecal, J.-S.²

7
- 0142166851
- A neural probabilistic language model
- Bengio, Y., Ducharme, R., Vincent, P., and Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3, 1137-1155.
- (2003) Journal of Machine Learning Research , vol.3 , pp. 1137-1155
- Bengio, Y.¹ Ducharme, R.² Vincent, P.³ Jauvin, C.⁴

8
- 0002652285
- A maximum entropy approach to natural language processing
- Berger, A., Della Pietra, S., and Della Pietra, V. (1996). A maximum entropy approach to natural language processing. Computational Linguistics, 22, 39-71.
- (1996) Computational Linguistics , vol.22 , pp. 39-71
- Berger, A.¹ Della Pietra, S.² Della Pietra, V.³

9
- 0030676599
- Using PHiPAC to speed error back-propagation learning
- Bilmes, J., Asanovic, K., Chin, C.-W., and Demmel, J. (1997). Using PHiPAC to speed error back-propagation learning. In International Conference on Acoustics, Speech, and Signal Processing, pages V: 4153-4156.
- (1997) International Conference on Acoustics, Speech, and Signal Processing , vol.5 , pp. 4153-4156
- Bilmes, J.¹ Asanovic, K.² Chin, C.-W.³ Demmel, J.⁴

10
- 84899027412
- Hierarchical distributed representations for statistical language models
- In L. Saul, Y. Weiss, and L. Bottou, editors, MIT Press
- Blitzer, J., Weinberger, K. Q., Saul, L., and Pereira, F. (2005). Hierarchical distributed representations for statistical language models. In L. Saul, Y. Weiss, and L. Bottou, editors, Advances in Neural Information Processing Systems 17. MIT Press.
- (2005) Advances in Neural Information Processing Systems , vol.17
- Blitzer, J.¹ Weinberger, K.Q.² Saul, L.³ Pereira, F.⁴

11
- 0030211964
- Bagging predictors
- Breiman, L. (1994). Bagging predictors. Machine Learning, 24(2), 123-140.
- (1994) Machine Learning , vol.24 , Issue.2 , pp. 123-140
- Breiman, L.¹

12
- 0141698978
- Products of hidden Markov models
- Technical Report GCNU TR 2000-004, Gatsby Unit, University College London
- Brown, A. and Hinton, G. (2000). Products of hidden Markov models. Technical Report GCNU TR 2000-004, Gatsby Unit, University College London.
- (2000)
- Brown, A.¹ Hinton, G.²

13
- 85022919385
- Class-based n-gram models of natural language
- Brown, P., Pietra, V. D., DeSouza, P., Lai, J., and Mercer, R. (1992). Class-based n-gram models of natural language. Computational Linguistics, 18, 467-479.
- (1992) Computational Linguistics , vol.18 , pp. 467-479
- Brown, P.¹ Pietra, V.D.² DeSouza, P.³ Lai, J.⁴ Mercer, R.⁵

14
- 0033329799
- An empirical study of smoothing techniques for language modeling
- Chen, S. F. and Goodman, J. T. (1999). An empirical study of smoothing techniques for language modeling. Computer Speech and Language, 13(4), 359-393.
- (1999) Computer Speech and Language , vol.13 , Issue.4 , pp. 359-393
- Chen, S.F.¹ Goodman, J.T.²

15
- 0001249662
- AIS-BN: An adaptive importance sampling algorithm for evidential reasoning in large Bayesian networks
- Cheng, J. and Druzdzel, M. J. (2000). AIS-BN: An adaptive importance sampling algorithm for evidential reasoning in large Bayesian networks. Journal of Artificial Intelligence Research, 13, 155-188.
- (2000) Journal of Artificial Intelligence Research , vol.13 , pp. 155-188
- Cheng, J.¹ Druzdzel, M.J.²

16
- 84989525001
- Indexing by latent semantic analysis
- Deerwester, S., Dumais, S., Furnas, G., Landauer, T., and Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391-407.
- (1990) Journal of the American Society for Information Science , vol.41 , Issue.6 , pp. 391-407
- Deerwester, S.¹ Dumais, S.² Furnas, G.³ Landauer, T.⁴ Harshman, R.⁵

17
- 26444565569
- Finding structure in time
- Elman, J. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
- (1990) Cognitive Science , vol.14 , pp. 179-211
- Elman, J.¹

18
- 0141591496
- Using a connectionist model in a syntactical based language model
- Emami, A., Xu, P., and Jelinek, F. (2003). Using a connectionist model in a syntactical based language model. In International Conference on Acoustics, Speech, and Signal Processing, pages I: 372-375.
- (2003) International Conference on Acoustics, Speech, and Signal Processing , vol.1 , pp. 372-375
- Emami, A.¹ Xu, P.² Jelinek, F.³

19
- 0004289791
- MIT Press
- Fellbaum, C. (1998). WordNet: An Electronic Lexical Database. MIT Press.
- (1998) WordNet: An Electronic Lexical Database
- Fellbaum, C.¹

20
- 33845262875
- Results of the fall 2004 STT and MDE evaluation
- (Nov). In Palisades NY
- Fiscus, J., Garofolo, J., Lee, A., Martin, A., Pallett, D., Przybocki, M., and Sanders, G. (Nov 2004). Results of the fall 2004 STT and MDE evaluation. In DARPA Rich Transcription Workshop, Palisades NY.
- (2004) DARPA Rich Transcription Workshop
- Fiscus, J.¹ Garofolo, J.² Lee, A.³ Martin, A.⁴ Pallett, D.⁵ Przybocki, M.⁶ Sanders, G.⁷

21
- 58149321460
- Boosting a weak learning algorithm by majority
- Freund, Y. (1995). Boosting a weak learning algorithm by majority. Information and Computation, 121(2), 256-285.
- (1995) Information and Computation , vol.121 , Issue.2 , pp. 256-285
- Freund, Y.¹

22
- 0141703281
- Conversational telephone speech recognition
- Gauvain, J.-L., Lamel, L., Schwenk, H., Adda, G., Chen, L., and Lefevre, F. (2003). Conversational telephone speech recognition. In International Conference on Acoustics, Speech, and Signal Processing, pages I: 212-215.
- (2003) International Conference on Acoustics, Speech, and Signal Processing , vol.1 , pp. 212-215
- Gauvain, J.-L.¹ Lamel, L.² Schwenk, H.³ Adda, G.⁴ Chen, L.⁵ Lefevre, F.⁶

23
- 0012356157
- A bit of progress in language modeling
- Technical Report MSR-TR-2001-72, Microsoft Research
- Goodman, J. (2001a). A bit of progress in language modeling. Technical Report MSR-TR-2001-72, Microsoft Research.
- (2001)
- Goodman, J.¹

24
- 0034856455
- Classes for fast maximum entropy training
- Utah
- Goodman, J. (2001b). Classes for fast maximum entropy training. In International Conference on Acoustics, Speech, and Signal Processing, Utah.
- (2001) International Conference on Acoustics, Speech, and Signal Processing
- Goodman, J.¹

25
- 0002623785
- Learning distributed representations of concepts
- In Amherst 1986. Lawrence Erlbaum, Hillsdale
- Hinton, G. (1986). Learning distributed representations of concepts. In Proceedings of the Eighth Annual Conference of the Cognitive Science Society, pages 1-12, Amherst 1986. Lawrence Erlbaum, Hillsdale.
- (1986) Proceedings of the Eighth Annual Conference of the Cognitive Science Society , pp. 1-12
- Hinton, G.¹

26
- 0008602090
- Training products of experts by minimizing contrastive divergence
- Technical Report GCNU TR 2000-004, Gatsby Unit, University College London
- Hinton, G. (2000). Training products of experts by minimizing contrastive divergence. Technical Report GCNU TR 2000-004, Gatsby Unit, University College London.
- (2000)
- Hinton, G.¹

27
- 0013344078
- Training products of experts by minimizing contrastive divergence
- Hinton, G. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8), 1771-1800.
- (2002) Neural Computation , vol.14 , Issue.8 , pp. 1771-1800
- Hinton, G.¹

28
- 84898964829
- Stochastic neighbor embedding
- In S. Becker, S. Thrun, and K. Obermayer, editors, MIT Press, Cambridge, MA
- Hinton, G. and Roweis, S. (2003). Stochastic neighbor embedding. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15. MIT Press, Cambridge, MA.
- (2003) Advances in Neural Information Processing Systems , vol.15
- Hinton, G.¹ Roweis, S.²

29
- 0019114666
- Interpolated estimation of Markov source parameters from sparse data
- Jelinek, F. and Mercer, R. (2000). Interpolated estimation of Markov source parameters from sparse data. Pattern Recognition in Practice, pages 381-397.
- (2000) Pattern Recognition in Practice , pp. 381-397
- Jelinek, F.¹ Mercer, R.²

30
- 0019114666
- Interpolated estimation of Markov source parameters from sparse data
- In E. S. Gelsema and L. N. Kanal, editors, North-Holland, Amsterdam
- Jelinek, F. and Mercer, R. L. (1980). Interpolated estimation of Markov source parameters from sparse data. In E. S. Gelsema and L. N. Kanal, editors, Pattern Recognition in Practice. North-Holland, Amsterdam.
- (1980) Pattern Recognition in Practice
- Jelinek, F.¹ Mercer, R.L.²

31
- 85009143810
- Self-organizing letter code-book for text-to-phoneme neural network model
- Jensen, K. and Riis, S. (2000). Self-organizing letter code-book for text-to-phoneme neural network model. In Proceedings ICSLP.
- (2000) Proceedings ICSLP
- Jensen, K.¹ Riis, S.²

32
- 0023312404
- Estimation of probabilities from sparse data for the language model component of a speech recognizer
- Katz, S. M. (1987). Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-35(3), 400-401.
- (1987) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-35 , Issue.3 , pp. 400-401
- Katz, S.M.¹

33
- 0028996876
- Improved backing-off for m-gram language modeling
- Kneser, R. and Ney, H. (1995). Improved backing-off for m-gram language modeling. In International Conference on Acoustics, Speech, and Signal Processing, pages 181-184.
- (1995) International Conference on Acoustics, Speech, and Signal Processing , pp. 181-184
- Kneser, R.¹ Ney, H.²

34
- 33845250701
- A note on importance sampling using standardized weights
- Technical Report 348, Department of Statistics, University of Chicago
- Kong, A. (1992). A note on importance sampling using standardized weights. Technical Report 348, Department of Statistics, University of Chicago.
- (1992)
- Kong, A.¹

35
- 84950943564
- Sequential imputations and Bayesian missing data problems
- Kong, A., Liu, J. S., and Wong, W. H. (1994). Sequential imputations and Bayesian missing data problems. Journal of the American Statistical Association, 89, 278-288.
- (1994) Journal of the American Statistical Association , vol.89 , pp. 278-288
- Kong, A.¹ Liu, J.S.² Wong, W.H.³

36
- 33845239976
- Spring speech-to-text transcription evaluation results
- (May). In Boston
- Lee, A., Fiscus, J., Garofolo, J., Przybocki, M., Martin, A., Sanders, G., and Pallett, D. (May 2003). Spring speech-to-text transcription evaluation results. In Rich Transcription Workshop, Boston.
- (2003) Rich Transcription Workshop
- Lee, A.¹ Fiscus, J.² Garofolo, J.³ Przybocki, M.⁴ Martin, A.⁵ Sanders, G.⁶ Pallett, D.⁷

37
- 0004182828
- Springer
- Liu, J. S. (2001). Monte Carlo Strategies in Scientific Computing. Springer.
- (2001) Monte Carlo Strategies in Scientific Computing
- Liu, J.S.¹

38
- 0001861916
- Adaptive importance sampling for estimation in structured domains
- Luis, O. and Leslie, K. (2000). Adaptive importance sampling for estimation in structured domains. In Proceedings of the 16th Annual Conference on Uncertainty in Artificial Intelligence (UAI-00), pages 446-454.
- (2000) Proceedings of the 16th Annual Conference on Uncertainty in Artificial Intelligence (UAI-00) , pp. 446-454
- Luis, O.¹ Leslie, K.²

39
- 33845239725
- Intel math kernel library
- Intel math kernel library (2004). http://www.intel.com/software/products/mkl/
- (2004)

40
- 34248842385
- Natural language processing with modular neural networks and distributed lexicon
- Miikkulainen, R. and Dyer, M. (1991). Natural language processing with modular neural networks and distributed lexicon. Cognitive Science, 15, 343-399.
- (1991) Cognitive Science , vol.15 , pp. 343-399
- Miikkulainen, R.¹ Dyer, M.²

41
- 85123963268
- Improved clustering techniques for class-based statistical language modeling
- In Berlin
- Ney, H. and Kneser, R. (1993). Improved clustering techniques for class-based statistical language modeling. In European Conference on Speech Communication and Technology (Eurospeech), pages 973-976, Berlin.
- (1993) European Conference on Speech Communication and Technology (Eurospeech) , pp. 973-976
- Ney, H.¹ Kneser, R.²

42
- 0031628780
- Comparison of part-of-speech and automatically derived category-based language models for speech recognition
- Niesler, T., Whittaker, E., and Woodland, P. (1998). Comparison of part-of-speech and automatically derived category-based language models for speech recognition. In International Conference on Acoustics, Speech, and Signal Processing, pages 177-180.
- (1998) International Conference on Acoustics, Speech, and Signal Processing , pp. 177-180
- Niesler, T.¹ Whittaker, E.² Woodland, P.³

43
- 0033683815
- Extracting distributed representations of concepts and relations from positive and negative propositions
- In Como, Italy. IEEE, New York
- Paccanaro, A. and Hinton, G. (2000). Extracting distributed representations of concepts and relations from positive and negative propositions. In Proceedings of the International Joint Conference on Neural Network, IJCNN'2000, Como, Italy. IEEE, New York.
- (2000) Proceedings of the International Joint Conference on Neural Network, IJCNN'2000
- Paccanaro, A.¹ Hinton, G.²

44
- 85123966307
- Distributional clustering of English words
- In Columbus, Ohio
- Pereira, F., Tishby, N., and Lee, L. (1993). Distributional clustering of English words. In 30th Annual Meeting of the Association for Computational Linguistics, pages 183-190, Columbus, Ohio.
- (1993) 30th Annual Meeting of the Association for Computational Linguistics , pp. 183-190
- Pereira, F.¹ Tishby, N.² Lee, L.³

45
- 0029984070
- Improving protein secondary structure prediction using structured neural networks and multiple sequence profiles
- Riis, S. and Krogh, A. (1996). Improving protein secondary structure prediction using structured neural networks and multiple sequence profiles. Journal of Computational Biology, pages 163-183.
- (1996) Journal of Computational Biology , pp. 163-183
- Riis, S.¹ Krogh, A.²

46
- 0003919677
- Springer. Springer texts in statistics
- Robert, C. P. and Casella, G. (2000). Monte Carlo Statistical Methods. Springer. Springer texts in statistics.
- (2000) Monte Carlo Statistical Methods
- Robert, C.P.¹ Casella, G.²

47
- 45549117987
- Term weighting approaches in automatic text retrieval
- Salton, G. and Buckley, C. (1988). Term weighting approaches in automatic text retrieval. Information Processing and Management, 24(5), 513-523.
- (1988) Information Processing and Management , vol.24 , Issue.5 , pp. 513-523
- Salton, G.¹ Buckley, C.²

48
- 0029732478
- Sequential neural text compression
- Schmidhuber, J. (1996). Sequential neural text compression. IEEE Transactions on Neural Networks, 7(1), 142-146.
- (1996) IEEE Transactions on Neural Networks , vol.7 , Issue.1 , pp. 142-146
- Schmidhuber, J.¹

49
- 0142161367
- Word space
- In C. Giles, S. Hanson, and J. Cowan, editors, San Mateo CA. Morgan Kaufmann
- Schutze, H. (1993). Word space. In C. Giles, S. Hanson, and J. Cowan, editors, Advances in Neural Information Processing Systems 5, pages 895-902, San Mateo CA. Morgan Kaufmann.
- (1993) Advances in Neural Information Processing Systems , vol.5 , pp. 895-902
- Schutze, H.¹

50
- 0036293862
- Connectionist language modeling for large vocabulary continuous speech recognition
- Schwenk, H. and Gauvain, J.-L. (2002). Connectionist language modeling for large vocabulary continuous speech recognition. In International Conference on Acoustics, Speech, and Signal Processing, pages I: 765-768.
- (2002) International Conference on Acoustics, Speech, and Signal Processing , vol.1 , pp. 765-768
- Schwenk, H.¹ Gauvain, J.-L.²

51
- 34547547667
- Using continuous space language models for conversational speech recognition
- In Tokyo
- Schwenk, H. and Gauvain, J.-L. (2003). Using continuous space language models for conversational speech recognition. In ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition, Tokyo.
- (2003) ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition
- Schwenk, H.¹ Gauvain, J.-L.²

52
- 10944267136
- Efficient training of large neural networks for language modeling
- Schwenk, H. (2004). Efficient training of large neural networks for language modeling. In IEEE joint conference on neural networks, pages 3059-3062.
- (2004) IEEE Joint Conference on Neural Networks , pp. 3059-3062
- Schwenk, H.¹

53
- 85009143782
- Neural network language models for conversational speech recognition
- Schwenk, H. and Gauvain, J.-L. (2004). Neural network language models for conversational speech recognition. In International Conference on Speech and Language Processing, pages 1215-1218.
- (2004) International Conference on Speech and Language Processing , pp. 1215-1218
- Schwenk, H.¹ Gauvain, J.-L.²

54
- 33845265697
- Building Continuous Language Models for Transcribing European Languages
- In To appear
- Schwenk, H. and Gauvain, J.-L. (2005). Building Continuous Language Models for Transcribing European Languages. In Eurospeech. To appear.
- (2005) Eurospeech
- Schwenk, H.¹ Gauvain, J.-L.²

55
- 84891308106
- SRILM - An extensible language modeling toolkit
- In Denver, Colorado
- Stolcke, A. (2002). SRILM - an extensible language modeling toolkit. In Proceedings of the International Conference on Statistical Language Processing, Denver, Colorado.
- (2002) Proceedings of the International Conference on Statistical Language Processing
- Stolcke, A.¹

56
- 84988402904
- Can artificial neural network learn language models?
- In Beijing, China
- Xu, W. and Rudnicky, A. (2000). Can artificial neural network learn language models? In International Conference on Statistical Language Processing, pages M1-13, Beijing, China.
- (2000) International Conference on Statistical Language Processing
- Xu, W.¹ Rudnicky, A.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.