SCOPUS Information Retrieval Platform

Volume 19, Issue 4, 2008, Pages 713-722

Adaptive importance sampling to accelerate training of a neural probabilistic language model

(2) Bengio, Yoshua; Senécal, Jean-Sébastien

Author keywords

Energy based models; Fast training; Importance sampling; Language modeling; Monte Carlo methods; Probabilistic neural networks

Indexed keywords

APPROXIMATION THEORY; COMPUTER SIMULATION; MAXIMUM LIKELIHOOD; PROBABILITY; STATISTICAL METHODS;

NEURAL PROBABILISTIC LANGUAGE MODEL; STATISTICAL LANGUAGE MODELING;

NEURAL NETWORKS;

ARTICLE; ARTIFICIAL NEURAL NETWORK; COMPUTER LANGUAGE; COMPUTER SIMULATION; HUMAN; LANGUAGE; PROBABILITY; STATISTICAL MODEL;

COMPUTER SIMULATION; HUMANS; LANGUAGE; MARKOV CHAINS; MODELS, STATISTICAL; NEURAL NETWORKS (COMPUTER); PROGRAMMING LANGUAGES;

EID: 42549142788 PISSN: 10459227 EISSN: None Source Type: Journal
DOI: 10.1109/TNN.2007.912312 Document Type: Article

Times cited : (223)

References (34)

1
- 0142166851
- A neural probabilistic language model
- Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, "A neural probabilistic language model," J. Mach. Learn. Res., vol. 3, pp. 1137-1155, 2003.
- (2003) J. Mach. Learn. Res , vol.3 , pp. 1137-1155
- Bengio, Y.¹ Ducharme, R.² Vincent, P.³ Jauvin, C.⁴

2
- 0036293862
- Connectionist language modeling for large vocabulary continuous speech recognition
- Orlando, FL
- H. Schwenk and J.-L. Gauvain, "Connectionist language modeling for large vocabulary continuous speech recognition," in Proc. Int. Conf. Acoust. Speech Signal Process., Orlando, FL, 2002, pp. 765-768.
- (2002) Proc. Int. Conf. Acoust. Speech Signal Process , pp. 765-768
- Schwenk, H.¹ Gauvain, J.-L.²

3
- 10944267136
- Efficient training of large neural networks for language modeling
- Jul
- H. Schwenk, "Efficient training of large neural networks for language modeling," in Proc. IEEE Int. Joint Conf. Neural Netw., Jul. 2004, vol. 4, pp. 3059-3064.
- (2004) Proc. IEEE Int. Joint Conf. Neural Netw , vol.4 , pp. 3059-3064
- Schwenk, H.¹

4
- 33645488707
- Training connectionist models for the structured language model
- P. Xu, A. Emami, and F. Jelinek, "Training connectionist models for the structured language model," in Proc. Conf. Empiric. Methods Natural Lang. Process., 2001, vol. 10, pp. 160-167.
- (2001) Proc. Conf. Empiric. Methods Natural Lang. Process , vol.10 , pp. 160-167
- Xu, P.¹ Emami, A.² Jelinek, F.³

5
- 0003612818
- Cambridge, MA: MIT Press
- C. D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press, 1999.
- (1999) Foundations of Statistical Natural Language Processing
- Manning, C.D.¹ Schütze, H.²

6
- 0002553443
- Interpolated estimation of Markov source parameters from sparse data
- E. S. Gelsema and L. N. Kanal, Eds. Amsterdam, The Netherlands: North-Holland
- F. Jelinek and R. L. Mercer, "Interpolated estimation of Markov source parameters from sparse data," in Pattern Recognition in Practice, E. S. Gelsema and L. N. Kanal, Eds. Amsterdam, The Netherlands: North-Holland, 1980.
- (1980) Pattern Recognition in Practice

7
- 0023312404
- Estimation of probabilities from sparse data for the language model component of a speech recognizer
- Mar
- S. M. Katz, "Estimation of probabilities from sparse data for the language model component of a speech recognizer," IEEE Trans. Acoust. Speech Signal Process., vol. ASSP-35, no. 3, pp. 400-401, Mar. 1987.
- (1987) IEEE Trans. Acoust. Speech Signal Process , vol.ASSP-35 , Issue.3 , pp. 400-401
- Katz, S.M.¹

8
- 0028996876
- Improved backing-off for m-gram language modeling
- R. Kneser and H. Ney, "Improved backing-off for m-gram language modeling," in Proc. Int. Conf. Acoust. Speech Signal Process., 1995, pp. 181-184.
- (1995) Proc. Int. Conf. Acoust. Speech Signal Process , pp. 181-184
- Kneser, R.¹ Ney, H.²

9
- 84899005563
- A neural probabilistic language model
- T. K. Leen, T. G. Dietterich, and V. Tresp, Eds. Cambridge, MA: MIT Press
- Y. Bengio, R. Ducharme, and P. Vincent, "A neural probabilistic language model," in Advances in Neural Information Processing Systems 13, T. K. Leen, T. G. Dietterich, and V. Tresp, Eds. Cambridge, MA: MIT Press, 2001, pp. 932-938.
- (2001) Advances in Neural Information Processing Systems 13 , pp. 932-938
- Bengio, Y.¹ Ducharme, R.² Vincent, P.³

10
- 0002623785
- Learning distributed representations of concepts
- Amherst, Hillsdale
- G. Hinton, "Learning distributed representations of concepts," in Proc. 8th Annu. Conf. Cogn. Sci. Soc., Amherst, Hillsdale, 1986, pp. 1-12.
- (1986) Proc. 8th Annu. Conf. Cogn. Sci. Soc , pp. 1-12
- Hinton, G.¹

11
- 84988402904
- Can artificial neural network learn language models
- Beijing, China
- W. Xu and A. Rudnicky, "Can artificial neural network learn language models," in Proc. Int. Conf. Statist. Lang. Process., Beijing, China, 2000, pp. M1-13.
- (2000) Proc. Int. Conf. Statist. Lang. Process
- Xu, W.¹ Rudnicky, A.²

12
- 0006273786
- A latent semantic analysis framework for large-span language modeling
- Rhodes, Greece
- J. Bellegarda, "A latent semantic analysis framework for large-span language modeling," in Proc. Eurospeech, Rhodes, Greece, 1997, pp. 1451-1454.
- (1997) Proc. Eurospeech , pp. 1451-1454
- Bellegarda, J.¹

13
- 0029984070
- Improving protein secondary structure prediction using structured neural networks and multiple sequence profiles
- S. Riis and A. Krogh, "Improving protein secondary structure prediction using structured neural networks and multiple sequence profiles," J. Comput. Biol., pp. 163-183, 1996.
- (1996) J. Comput. Biol , pp. 163-183
- Riis, S.¹ Krogh, A.²

14
- 85009143810
- Self organizing letter code-book for text-to-phoneme neural network model
- K. Jensen and S. Riis, "Self organizing letter code-book for text-to-phoneme neural network model," in Proc. Int. Conf. Spoken Lang. Process., 2000, vol. 3, pp. 318-321.
- (2000) Proc. Int. Conf. Spoken Lang. Process , vol.3 , pp. 318-321
- Jensen, K.¹ Riis, S.²

15
- 0003966401
- Carnegie Mellon Univ, Pittsburgh, PA, Tech. Rep. CMU-CS-88-152
- K. J. Lang and G. E. Hinton, "The development of the time-delay neural network architecture for speech recognition," Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-CS-88-152, 1988.
- (1988) The development of the time-delay neural network architecture for speech recognition
- Lang, K.J.¹ Hinton, G.E.²

16
- 8344290493
- Energy-based models for sparse overcomplete representations
- Y.-W. Teh, M. Welling, S. Osindero, and G. E. Hinton, "Energy-based models for sparse overcomplete representations," J. Mach. Learn. Res., vol. 4, pp. 1235-1260, 2003.
- (2003) J. Mach. Learn. Res , vol.4 , pp. 1235-1260
- Teh, Y.-W.¹ Welling, M.² Osindero, S.³ Hinton, G.E.⁴

17
- 0003757760
- Fundamentals of statistical exponential families
- Bethesda, MD: Institute of Mathematical Statistics
- L. D. Brown, "Fundamentals of statistical exponential families," in Lecture Notes Monograph Series. Bethesda, MD: Institute of Mathematical Statistics, 1986, vol. 9.
- (1986) Lecture Notes Monograph Series , vol.9
- Brown, L.D.¹

18
- 42549093328
- G. E. Hinton and T. J. Sejnowski, Learning and relearning in Boltzmann machines, in Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press, 1986.
- G. E. Hinton and T. J. Sejnowski, "Learning and relearning in Boltzmann machines," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press, 1986.

19
- 0035059194
- Whole-sentence exponential language models: A vehicle for linguistic-statistical integration
- Online, Available
- R. Rosenfeld, S. F. Chen, and X. Zhu, "Whole-sentence exponential language models: A vehicle for linguistic-statistical integration," Comput. Speech Lang., vol. 15, no. 1, 2001. [Online]. Available: citeseer.nj.nec.com/448532.html
- (2001) Comput. Speech Lang , vol.15 , Issue.1
- Rosenfeld, R.¹ Chen, S.F.² Zhu, X.³

20
- 0002652285
- A maximum entropy approach to natural language processing
- A. Berger, S. Della Pietra, and V. Della Pietra, "A maximum entropy approach to natural language processing," Comput. Linguist., vol. 22, pp. 39-71, 1996.
- (1996) Comput. Linguist , vol.22 , pp. 39-71
- Berger, A.¹ Della Pietra, S.² Della Pietra, V.³

21
- 0003919677
- New York: Springer-Verlag
- C. P. Robert and G. Casella, Monte Carlo Statistical Methods, New York: Springer-Verlag, 2000.
- (2000) Monte Carlo Statistical Methods
- Robert, C.P.¹ Casella, G.²

22
- 84950943564
- Sequential imputations and Bayesian missing data problems
- A. Kong, J. S. Liu, and W. H. Wong, "Sequential imputations and Bayesian missing data problems," J. Amer. Statist. Assoc., vol. 89, pp. 278-288, 1994.
- (1994) J. Amer. Statist. Assoc , vol.89 , pp. 278-288
- Kong, A.¹ Liu, J.S.² Wong, W.H.³

23
- 0004182828
- New York: Springer-Verlag
- J. S. Liu, Monte Carlo Strategies in Scientific Computing. New York: Springer-Verlag, 2001.
- (2001) Monte Carlo Strategies in Scientific Computing
- Liu, J.S.¹

24
- 0001861916
- Adaptive importance sampling for estimation in structured domains
- L. E. Ortiz and L. P. Kaelbling, "Adaptive importance sampling for estimation in structured domains," in Proc. 16th Annu. Conf. Uncertainty Artif. Intell., 2000, pp. 446-454.
- (2000) Proc. 16th Annu. Conf. Uncertainty Artif. Intell , pp. 446-454
- Ortiz, L.E.¹ Kaelbling, L.P.²

25
- 10944221006
- Quick training of probabilistic neural nets by sampling
- Key West, FL
- Y. Bengio and J.-S. Senécal, "Quick training of probabilistic neural nets by sampling," in Proc. 9th Int. Workshop Artif. Intell. Statist., Key West, FL, 2003, vol. 9.
- (2003) Proc. 9th Int. Workshop Artif. Intell. Statist , vol.9
- Bengio, Y.¹ Senécal, J.-S.²

26
- 33845250701
- Dept. Statist, Univ. Chicago, Chicago, IL, Tech. Rep. 348
- A. Kong, "A note on importance sampling using standardized weights," Dept. Statist., Univ. Chicago, Chicago, IL, Tech. Rep. 348, 1992.
- (1992) A note on importance sampling using standardized weights
- Kong, A.¹

27
- 0012356157
- Mach. Learn. Appl. Statist. Group, Microsoft Res, Redmond, WA, Tech. Rep. MSR-TR-2001-72
- J. Goodman, "A bit of progress in language modeling-extended version," Mach. Learn. Appl. Statist. Group, Microsoft Res., Redmond, WA, Tech. Rep. MSR-TR-2001-72, 2003.
- (2003) A bit of progress in language modeling-extended version
- Goodman, J.¹

28
- 0001249662
- AIS-BN: An adaptive importance sampling algorithm for evidential reasoning in large Bayesian networks
- J. Cheng and M. J. Druzdzel, "AIS-BN: An adaptive importance sampling algorithm for evidential reasoning in large Bayesian networks," J. Artif. Intell. Res., vol. 13, pp. 155-188, 2000.
- (2000) J. Artif. Intell. Res , vol.13 , pp. 155-188
- Cheng, J.¹ Druzdzel, M.J.²

29
- 33646907991
- Two decades of statistical language modeling: Where do we go from here?
- Aug
- R. Rosenfeld, "Two decades of statistical language modeling: Where do we go from here?," Proc. IEEE, vol. 88, no. 8, pp. 1270-1278, Aug. 2000.
- (2000) Proc. IEEE , vol.88 , Issue.8 , pp. 1270-1278
- Rosenfeld, R.¹

30
- 0012356157
- A bit of progress in language modeling
- Microsoft Res, Tech. Rep. MSR-TR-2001-72
- J. Goodman, "A bit of progress in language modeling," Microsoft Res., Tech. Rep. MSR-TR-2001-72, 2001.
- (2001) A bit of progress in language modeling
- Goodman, J.¹

31
- 4544358964
- The super ARV language model: Investigating the effectiveness of tightly integrating multiple knowledge sources
- Morristown, NJ
- W. Wang and M. P. Harper, "The super ARV language model: Investigating the effectiveness of tightly integrating multiple knowledge sources," in Proc. ACL-02 Conf. Empirical Methods Natural Lang. Process., Morristown, NJ, 2002, pp. 238-247.
- (2002) Proc. ACL-02 Conf. Empirical Methods Natural Lang. Process , pp. 238-247
- Wang, W.¹ Harper, M.P.²

32
- 0036293862
- Connectionist language modeling for large vocabulary continuous speech recognition
- Orlando, FL
- H. Schwenk and J.-L. Gauvain, "Connectionist language modeling for large vocabulary continuous speech recognition," in Proc. Int. Conf. Acoust. Speech Signal Process., Orlando, FL, 2002, pp. 765-768.
- (2002) Proc. Int. Conf. Acoust. Speech Signal Process , pp. 765-768
- Schwenk, H.¹ Gauvain, J.-L.²

33
- 0033350721
- Products of experts
- Edinburgh, Scotland
- G. Hinton, "Products of experts," in Proc. 9th Int. Conf. Artif. Neural Netw., Edinburgh, Scotland, 1999, pp. 1-6.
- (1999) Proc. 9th Int. Conf. Artif. Neural Netw , pp. 1-6
- Hinton, G.¹

34
- 0142192256
- Dept. IRO, Université de Montréal, Montréal, QC, Canada, Tech Rep. 1215
- Y. Bengio, "New distributed probabilistic language models," Dept. IRO, Université de Montréal, Montréal, QC, Canada, Tech Rep. 1215, 2002.
- (2002) New distributed probabilistic language models
- Bengio, Y.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.