Volume 25, Issue 11, 2014, Pages 1953-1966

Online PLSA: Batch updating techniques including out-of-vocabulary words

Author keywords

Document clustering; document modeling; information retrieval; out of vocabulary (OOV) words; PLSA updating; probabilistic latent semantic analysis (PLSA); unsupervised learning.

Indexed keywords

UNSUPERVISED LEARNING

EID: 84908090239     PISSN: 2162237X     EISSN: 21622388     Source Type: Journal    
DOI: 10.1109/TNNLS.2014.2299806     Document Type: Article
Times cited: 30

References (45)
  • 1
    • 0001509519
    • Probabilistic latent semantic analysis
    • T. Hofmann, "Probabilistic latent semantic analysis," in Proc. Uncertainty Artif. Intell., 1999, pp. 286-296.
    • (1999) Proc. Uncertainty Artif. Intell. , pp. 286-296
    • Hofmann, T.1
  • 3
    • 0141607824
    • Latent Dirichlet allocation
    • D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet allocation," J. Mach. Learn. Res., vol. 3, no. 5, pp. 993-1022, 2003.
    • (2003) J. Mach. Learn. Res. , vol.3 , Issue.5 , pp. 993-1022
    • Blei, D.M.1    Ng, A.Y.2    Jordan, M.I.3
  • 6
    • 24044495247
    • Combining statistical language models via the latent maximum entropy principle
    • S. Wang, D. Schuurmans, F. Peng, and Y. Zhao, "Combining statistical language models via the latent maximum entropy principle," Mach. Learn., vol. 60, nos. 1-3, pp. 229-250, 2005.
    • (2005) Mach. Learn. , vol.60 , Issue.1-3 , pp. 229-250
    • Wang, S.1    Schuurmans, D.2    Peng, F.3    Zhao, Y.4
  • 7
    • 81855166765
    • Missing data imputation for time-frequency representations of audio signals
    • P. Smaragdis, B. Raj, and M. Shashanka, "Missing data imputation for time-frequency representations of audio signals," J. Signal Process. Syst., vol. 65, no. 3, pp. 361-370, 2011.
    • (2011) J. Signal Process. Syst. , vol.65 , Issue.3 , pp. 361-370
    • Smaragdis, P.1    Raj, B.2    Shashanka, M.3
  • 8
    • 78751675106
    • Correlated PLSA for image clustering
    • K.-T. Lee, W.-H. Tsai, H.-Y. Liao, T. Chen, J.-W. Hsieh, and C.-C. Tseng, Eds. Berlin, Germany: Springer-Verlag
    • P. Li, J. Cheng, Z. Li, and H. Lu, "Correlated PLSA for image clustering," in Advances in Multimedia Modeling, K.-T. Lee, W.-H. Tsai, H.-Y. Liao, T. Chen, J.-W. Hsieh, and C.-C. Tseng, Eds. Berlin, Germany: Springer-Verlag, 2011, pp. 307-316.
    • (2011) Advances in Multimedia Modeling , pp. 307-316
    • Li, P.1    Cheng, J.2    Li, Z.3    Lu, H.4
  • 9
    • 1542377542
    • Text categorization by boosting automatically extracted concepts
    • Toronto, ON, Canada
    • L. Cai and T. Hofmann, "Text categorization by boosting automatically extracted concepts," in Proc. 26th ACM SIGIR Conf. Res. Develop. Inf. Retr., Toronto, ON, Canada, 2003, pp. 182-189.
    • (2003) Proc. 26th ACM SIGIR Conf. Res. Develop. Inf. Retr. , pp. 182-189
    • Cai, L.1    Hofmann, T.2
  • 10
    • 84898996741
    • Learning the similarity of documents: An information-geometric approach to document retrieval and categorization
    • Cambridge, MA, USA: MIT Press
    • T. Hofmann, "Learning the similarity of documents: An information-geometric approach to document retrieval and categorization," in Advances in Neural Information Processing Systems 12. Cambridge, MA, USA: MIT Press, 2000, pp. 914-920.
    • (2000) Advances in Neural Information Processing Systems , vol.12 , pp. 914-920
    • Hofmann, T.1
  • 11
    • 0036498205
    • A probabilistic framework for the hierarchic organisation and classification of document collections
    • A. Vinokourov and M. Girolami, "A probabilistic framework for the hierarchic organisation and classification of document collections," J. Intell. Inf. Syst., vol. 18, nos. 2-3, pp. 153-172, 2002.
    • (2002) J. Intell. Inf. Syst. , vol.18 , Issue.2-3 , pp. 153-172
    • Vinokourov, A.1    Girolami, M.2
  • 12
    • 20244372124
    • A hierarchical model for clustering and categorising documents
    • Glasgow, U.K., Mar.
    • E. Gaussier, C. Goutte, K. Popat, and F. Chen, "A hierarchical model for clustering and categorising documents," in Proc. 24th BCS-IRSG ECIR Res., Glasgow, U.K., Mar. 2002, pp. 229-247.
    • (2002) Proc. 24th BCS-IRSG ECIR Res. , pp. 229-247
    • Gaussier, E.1    Goutte, C.2    Popat, K.3    Chen, F.4
  • 13
    • 70449585624
    • A novel approach to musical genre classification using probabilistic latent semantic analysis model
    • Jul.
    • Z. Zeng, S. Zhang, H. Li, W. Liang, and H. Zheng, "A novel approach to musical genre classification using probabilistic latent semantic analysis model," in Proc. IEEE ICME, Jul. 2009, pp. 486-489.
    • (2009) Proc. IEEE ICME , pp. 486-489
    • Zeng, Z.1    Zhang, S.2    Li, H.3    Liang, W.4    Zheng, H.5
  • 14
    • 0037481043
    • Topic-based document segmentation with probabilistic latent semantic analysis
    • Washington, DC, USA, Nov.
    • T. Brants, F. Chen, and I. Tsochantaridis, "Topic-based document segmentation with probabilistic latent semantic analysis," in Proc. 11th Int. Conf. Inf. Knowl. Manag., Washington, DC, USA, Nov. 2002, pp. 211-218.
    • (2002) Proc. 11th Int. Conf. Inf. Knowl. Manag. , pp. 211-218
    • Brants, T.1    Chen, F.2    Tsochantaridis, I.3
  • 17
    • 0036298547
    • Fast update of latent semantic spaces using a linear transform framework
    • J. R. Bellegarda, "Fast update of latent semantic spaces using a linear transform framework," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., vol. 1. 2002, pp. 769-772.
    • (2002) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , vol.1 , pp. 769-772
    • Bellegarda, J.R.1
  • 20
    • 85162005069
    • Online learning for latent Dirichlet allocation
    • M. Hoffman, D. Blei, and F. Bach, "Online learning for latent Dirichlet allocation," in Proc. Adv. NIPS, 2010, pp. 856-864.
    • (2010) Proc. Adv. NIPS , pp. 856-864
    • Hoffman, M.1    Blei, D.2    Bach, F.3
  • 22
    • 70449126967
    • Topic models over text streams: A study of batch and online unsupervised learning
    • A. Banerjee and S. Basu, "Topic models over text streams: A study of batch and online unsupervised learning," in Proc. 7th SIAM Int. Conf. Data Mining, 2007, pp. 437-442.
    • (2007) Proc. 7th SIAM Int. Conf. Data Mining , pp. 437-442
    • Banerjee, A.1    Basu, S.2
  • 24
    • 60549085346
    • Adaptive Bayesian latent semantic analysis
    • Jan.
    • J. T. Chien and M. S. Wu, "Adaptive Bayesian latent semantic analysis," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 1, pp. 198-207, Jan. 2008.
    • (2008) IEEE Trans. Audio, Speech, Lang. Process. , vol.16 , Issue.1 , pp. 198-207
    • Chien, J.T.1    Wu, M.S.2
  • 25
    • 38749103565
    • Using incremental PLSI for threshold-resilient online event analysis
    • Mar.
    • T. C. Chou and M. C. Chen, "Using incremental PLSI for threshold-resilient online event analysis," IEEE Trans. Knowl. Data Eng., vol. 20, no. 3, pp. 289-299, Mar. 2008.
    • (2008) IEEE Trans. Knowl. Data Eng. , vol.20 , Issue.3 , pp. 289-299
    • Chou, T.C.1    Chen, M.C.2
  • 26
    • 79952347069
    • RPLSA: A novel updating scheme for probabilistic latent semantic analysis
    • Oct.
    • N. Bassiou and C. Kotropoulos, "RPLSA: A novel updating scheme for probabilistic latent semantic analysis," Comput. Speech Lang., vol. 25, no. 4, pp. 741-760, Oct. 2011.
    • (2011) Comput. Speech Lang. , vol.25 , Issue.4 , pp. 741-760
    • Bassiou, N.1    Kotropoulos, C.2
  • 27
    • 0002788893
    • A view of the EM algorithm that justifies incremental, sparse, and other variants
    • M. I. Jordan, Ed. Norwell, MA, USA: Kluwer
    • R. M. Neal and G. E. Hinton, "A view of the EM algorithm that justifies incremental, sparse, and other variants," in Learning in Graphical Models, M. I. Jordan, Ed. Norwell, MA, USA: Kluwer, 1998, pp. 355-368.
    • (1998) Learning in Graphical Models , pp. 355-368
    • Neal, R.M.1    Hinton, G.E.2
  • 30
    • 0344031459
    • Unsupervised learning from dyadic data
    • Berkeley, CA, USA, Tech. Rep. TR-
    • T. Hofmann and J. Puzicha, "Unsupervised learning from dyadic data," Int. Comput. Sci. Inst., Berkeley, CA, USA, Tech. Rep. TR-98-042, 1998.
    • (1998) Int. Comput. Sci. Inst. , Tech. Rep. TR-98-042
    • Hofmann, T.1    Puzicha, J.2
  • 31
    • 0034818212
    • Unsupervised learning by probabilistic latent semantic analysis
    • Jan.
    • T. Hofmann, "Unsupervised learning by probabilistic latent semantic analysis," Mach. Learn., vol. 42, nos. 1-2, pp. 177-196, Jan. 2001.
    • (2001) Mach. Learn. , vol.42 , Issue.1-2 , pp. 177-196
    • Hofmann, T.1
  • 32
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm (with discussion)
    • A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," J. R. Statist. Soc., Ser. B, vol. 39, no. 1, pp. 1-38, 1977.
    • (1977) J. R. Statist. Soc., Ser. B , vol.39 , Issue.1 , pp. 1-38
    • Dempster, A.1    Laird, N.2    Rubin, D.3
  • 34
    • 17144419650
    • Test data likelihood for PLSA models
    • T. Brants, "Test data likelihood for PLSA models," Inf. Retr., vol. 8, no. 2, pp. 181-196, 2005.
    • (2005) Inf. Retr. , vol.8 , Issue.2 , pp. 181-196
    • Brants, T.1
  • 35
    • 63449138786
    • Incremental probabilistic latent semantic analysis for automatic question recommendation
    • Lausanne, Switzerland, Oct.
    • H. Wu, D. Zhang, Y. Wang, and X. Cheng, "Incremental probabilistic latent semantic analysis for automatic question recommendation," in Proc. ACM Conf. Recommender Syst., Lausanne, Switzerland, Oct. 2008, pp. 99-106.
    • (2008) Proc. ACM Conf. Recommender Syst. , pp. 99-106
    • Wu, H.1    Zhang, D.2    Wang, Y.3    Cheng, X.4
  • 41
    • 0012078715
    • Statistical language modeling using leaving-one-out
    • S. Young and G. Bloothooft, Eds. Dordrecht, The Netherlands: Kluwer
    • H. Ney, S. Martin, and F. Wessel, "Statistical language modeling using leaving-one-out," in Corpus-Based Methods in Language and Speech Processing, S. Young and G. Bloothooft, Eds. Dordrecht, The Netherlands: Kluwer, 1997, pp. 174-207.
    • (1997) Corpus-Based Methods in Language and Speech Processing , pp. 174-207
    • Ney, H.1    Martin, S.2    Wessel, F.3
  • 42
    • 84948481845
    • An algorithm for suffix stripping
    • Jul.
    • M. F. Porter, "An algorithm for suffix stripping," Program, vol. 14, no. 3, pp. 130-137, Jul. 1980.
    • (1980) Program , vol.14 , Issue.3 , pp. 130-137
    • Porter, M.F.1
  • 43
    • 78149283977
    • Detecting the number of clusters in n-way probabilistic clustering
    • Nov.
    • Z. He, A. Cichocki, S. Xie, and K. Choi, "Detecting the number of clusters in n-way probabilistic clustering," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 11, pp. 2006-2021, Nov. 2010.
    • (2010) IEEE Trans. Pattern Anal. Mach. Intell. , vol.32 , Issue.11 , pp. 2006-2021
    • He, Z.1    Cichocki, A.2    Xie, S.3    Choi, K.4
  • 44
    • 36749055295
    • Determining the number of clusters using the weighted gap statistic
    • Apr.
    • M. Yan and K. Ye, "Determining the number of clusters using the weighted gap statistic," Biometrics, vol. 63, no. 4, pp. 1031-1037, Apr. 2007.
    • (2007) Biometrics , vol.63 , Issue.4 , pp. 1031-1037
    • Yan, M.1    Ye, K.2
  • 45
    • 0035532141
    • Estimating the number of clusters in a data set via the gap statistic
    • R. Tibshirani, G. Walther, and T. Hastie, "Estimating the number of clusters in a data set via the gap statistic," J. R. Statist. Soc. (Ser. B), vol. 63, no. 2, pp. 411-423, 2001.
    • (2001) J. R. Statist. Soc. (Ser. B) , vol.63 , Issue.2 , pp. 411-423
    • Tibshirani, R.1    Walther, G.2    Hastie, T.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.