Volume 57, Issue 6, 2011, Pages 3207-3229

Probability estimation in the rare-events regime

Author keywords

Classification; entropy estimation; large alphabets; large number of rare events (LNRE); natural language; probability estimation

Indexed keywords

CLASSIFICATION; ENTROPY ESTIMATION; LARGE ALPHABETS; LARGE NUMBER OF RARE EVENTS (LNRE); NATURAL LANGUAGE; PROBABILITY ESTIMATION;

EID: 79957659601     PISSN: 00189448     EISSN: None     Source Type: Journal    
DOI: 10.1109/TIT.2011.2137210     Document Type: Article
Times cited : (22)

References (50)
  • 2
    • 0017994420
    • A convergent gambling estimate of the entropy of English
    • Jul.
    • T. M. Cover and R. C. King, "A convergent gambling estimate of the entropy of English," IEEE Trans. Inf. Theory, vol. IT-24, no. 4, pp. 413-421, Jul. 1978.
    • (1978) IEEE Trans. Inf. Theory , vol.IT-24 , Issue.4 , pp. 413-421
    • Cover, T.M.1    King, R.C.2
  • 3
    • 84944486544
    • Prediction and entropy of printed English
    • Jan.
    • C. E. Shannon, "Prediction and entropy of printed English," Bell Syst. Tech. J., vol. 30, pp. 50-64, Jan. 1951.
    • (1951) Bell Syst. Tech. J. , vol.30 , pp. 50-64
    • Shannon, C.E.1
  • 6
    • 0025750735
    • A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams
    • Jan.
    • K. W. Church and W. A. Gale, "A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams," Comput. Speech Lang., vol. 5, no. 1, pp. 19-54, Jan. 1991.
    • (1991) Comput. Speech Lang. , vol.5 , Issue.1 , pp. 19-54
    • Church, K.W.1    Gale, W.A.2
  • 7
    • 84953486351
    • Good-Turing frequency estimation without tears
    • W. A. Gale and G. Sampson, "Good-Turing frequency estimation without tears," J. Quant. Ling., vol. 2, pp. 217-237, 1995.
    • (1995) J. Quant. Ling. , vol.2 , pp. 217-237
    • Gale, W.A.1    Sampson, G.2
  • 8
    • 0001835919
    • Models for power law relations in linguistics and information science
    • S. Naranan and V. Balasubrahmanyan, "Models for power law relations in linguistics and information science," J. Quant. Ling., vol. 5, pp. 35-61, 1998.
    • (1998) J. Quant. Ling. , vol.5 , pp. 35-61
    • Naranan, S.1    Balasubrahmanyan, V.2
  • 9
    • 0000803388
    • The population frequencies of species and the estimation of population parameters
    • I. J. Good, "The population frequencies of species and the estimation of population parameters," Biometrika, vol. 40, no. 3/4, pp. 237-264, 1953.
    • (1953) Biometrika , vol.40 , Issue.3-4 , pp. 237-264
    • Good, I.J.1
  • 11
    • 0000353709
    • Statistical analysis of a large number of rare events and related problems
    • Russian
    • E. V. Khmaladze and R. Ya. Chitashvili, "Statistical analysis of a large number of rare events and related problems," in Proc. A. Razmadze Math. Inst., 1989, vol. 92, pp. 196-245, Russian.
    • (1989) Proc. A. Razmadze Math. Inst. , vol.92 , pp. 196-245
    • Khmaladze, E.V.1    Chitashvili, R.Ya.2
  • 12
    • 0034338224
    • Consistent estimation of the structural distribution function
    • C. A. J. Klaassen and R. M. Mnatsakanov, "Consistent estimation of the structural distribution function," Scand. J. Statist., vol. 27, pp. 733-746, 2000.
    • (2000) Scand. J. Statist. , vol.27 , pp. 733-746
    • Klaassen, C.A.J.1    Mnatsakanov, R.M.2
  • 13
    • 0000983592
    • Decomposable statistics and hypothesis testing. The case of small samples
    • G. I. Ivchenko and Y. I. Medvedev, "Decomposable statistics and hypothesis testing. The case of small samples," Theor. Probab. Appl., vol. 23, no. 4, pp. 764-775, 1978.
    • (1978) Theor. Probab. Appl. , vol.23 , Issue.4 , pp. 764-775
    • Ivchenko, G.I.1    Medvedev, Y.I.2
  • 14
    • 33644591301
    • The complexity of approximating the entropy
    • T. Batu, S. Dasgupta, R. Kumar, and R. Rubinfeld, "The complexity of approximating the entropy," SIAM J. Comput., vol. 35, no. 1, pp. 132-150, 2005.
    • (2005) SIAM J. Comput. , vol.35 , Issue.1 , pp. 132-150
    • Batu, T.1    Dasgupta, S.2    Kumar, R.3    Rubinfeld, R.4
  • 15
    • 0005770331
    • Note on the bias of entropy estimates
    • H. Quastler, Ed. Glencoe, IL: Free Press
    • G. A. Miller, "Note on the bias of entropy estimates," in Information Theory in Psychology II-B, H. Quastler, Ed. Glencoe, IL: Free Press, 1955, pp. 95-100.
    • (1955) Information Theory in Psychology II-B , pp. 95-100
    • Miller, G.A.1
  • 16
    • 0000090102
    • Finite sample corrections to entropy and dimension estimates
    • P. Grassberger, "Finite sample corrections to entropy and dimension estimates," Phys. Lett. A, vol. 128, no. 6-7, pp. 369-373, 1988.
    • (1988) Phys. Lett. A , vol.128 , Issue.6-7 , pp. 369-373
    • Grassberger, P.1
  • 18
    • 3142654255
    • Bias analysis in entropy estimation
    • DOI 10.1088/0305-4470/37/27/L02, PII S0305447004773801
    • T. Schürmann, "Bias analysis in entropy estimation," J. Phys. A: Math. Gen., vol. 37, pp. L295-L301, 2004. (Pubitemid 38904275)
    • (2004) Journal of Physics A: Mathematical and General , vol.37 , Issue.27 , pp. L295-L301
    • Schürmann, T.1
  • 20
    • 28444443140
    • Calculation of entropy from data of motion
    • Oct.
    • S.-K. Ma, "Calculation of entropy from data of motion," J. Stat. Phys., vol. 26, no. 2, pp. 221-240, Oct. 1981.
    • (1981) J. Stat. Phys. , vol.26 , Issue.2 , pp. 221-240
    • Ma, S.-K.1
  • 21
    • 0041877169
    • Estimation of entropy and mutual information
    • DOI 10.1162/089976603321780272
    • L. Paninski, "Estimation of entropy and mutual information," Neural Comput., vol. 15, no. 6, pp. 1191-1253, June 2003. (Pubitemid 37049793)
    • (2003) Neural Computation , vol.15 , Issue.6 , pp. 1191-1253
    • Paninski, L.1
  • 23
    • 0035539882
    • Convergence properties of functional estimates for discrete distributions
    • A. Antos and I. Kontoyiannis, "Convergence properties of functional estimates for discrete distributions," Random Structures Alg., vol. 19, no. 3-4, pp. 163-193, 2001.
    • (2001) Random Structures Alg. , vol.19 , Issue.3-4 , pp. 163-193
    • Antos, A.1    Kontoyiannis, I.2
  • 24
    • 3042515346
    • Universal entropy estimation via block sorting
    • Jul.
    • H. Cai, S. R. Kulkarni, and S. Verdú, "Universal entropy estimation via block sorting," IEEE Trans. Inf. Theory, vol. 50, no. 7, pp. 1551-1561, Jul. 2004.
    • (2004) IEEE Trans. Inf. Theory , vol.50 , Issue.7 , pp. 1551-1561
    • Cai, H.1    Kulkarni, S.R.2    Verdú, S.3
  • 25
    • 33746656746
    • Universal divergence estimation for finite-alphabet sources
    • DOI 10.1109/TIT.2006.878182
    • H. Cai, S. R. Kulkarni, and S. Verdú, "Universal divergence estimation for finite-alphabet sources," IEEE Trans. Inf. Theory, vol. 52, no. 8, pp. 3456-3475, Aug. 2006. (Pubitemid 44145110)
    • (2006) IEEE Transactions on Information Theory , vol.52 , Issue.8 , pp. 3456-3475
    • Cai, H.1    Kulkarni, S.R.2    Verdú, S.3
  • 26
    • 4544287683
    • Estimating entropy on m bins given fewer than m samples
    • Sep.
    • L. Paninski, "Estimating entropy on m bins given fewer than m samples," IEEE Trans. Inf. Theory, vol. 50, no. 9, pp. 2200-2203, Sep. 2004.
    • (2004) IEEE Trans. Inf. Theory , vol.50 , Issue.9 , pp. 2200-2203
    • Paninski, L.1
  • 27
    • 0142084741
    • Always Good Turing: Asymptotically optimal probability estimation
    • DOI 10.1126/science.1088284
    • A. Orlitsky, N. P. Santhanam, and J. Zhang, "Always Good Turing: Asymptotically optimal probability estimation," Science, vol. 302, pp. 427-431, Oct. 2003. (Pubitemid 37296264)
    • (2003) Science , vol.302 , Issue.5644 , pp. 427-431
    • Orlitsky, A.1    Santhanam, N.P.2    Zhang, J.3
  • 31
    • 4544279028
    • Large deviation asymptotics for occupancy problems
    • DOI 10.1214/009117904000000135
    • P. Dupuis, C. Nuzman, and P. Whiting, "Large deviation asymptotics for occupancy problems," Ann. Probab., vol. 32, no. 3B, pp. 2765-2818, 2004. (Pubitemid 39258787)
    • (2004) Annals of Probability , vol.32 , Issue.3 B , pp. 2765-2818
    • Dupuis, P.1    Nuzman, C.2    Whiting, P.3
  • 32
    • 0001456426
    • Estimating the total probability of the unobserved outcomes of an experiment
    • H. E. Robbins, "Estimating the total probability of the unobserved outcomes of an experiment," Ann. Math. Statist., vol. 39, no. 1, pp. 256-257, 1968.
    • (1968) Ann. Math. Statist. , vol.39 , Issue.1 , pp. 256-257
    • Robbins, H.E.1
  • 33
    • 0028381083
    • On the bias of the Turing-Good estimate of probabilities
    • Feb.
    • B. H. Juang and S. H. Lo, "On the bias of the Turing-Good estimate of probabilities," IEEE Trans. Signal Process., vol. 42, no. 2, pp. 496-498, Feb. 1994.
    • (1994) IEEE Trans. Signal Process. , vol.42 , Issue.2 , pp. 496-498
    • Juang, B.H.1    Lo, S.H.2
  • 34
    • 0000760438
    • The efficiency of Good's nonparametric coverage estimator
    • W. W. Esty, "The efficiency of Good's nonparametric coverage estimator," Ann. Statist., vol. 14, no. 3, pp. 1257-1260, 1986.
    • (1986) Ann. Statist. , vol.14 , Issue.3 , pp. 1257-1260
    • Esty, W.W.1
  • 35
    • 10844280272
    • A Poisson model for the coverage problem with a genomic application
    • DOI 10.1093/biomet/89.3.669
    • C. X. Mao and B. G. Lindsay, "A Poisson model for the coverage problem with a genomic application," Biometrika, vol. 89, no. 3, pp. 669-681, 2002. (Pubitemid 41311973)
    • (2002) Biometrika , vol.89 , Issue.3 , pp. 669-681
    • Mao, C.X.1    Lindsay, B.G.2
  • 36
    • 3142742865
    • On the convergence rate of Good-Turing estimators
    • San Francisco, CA, Morgan Kaufmann
    • D. McAllester and R. E. Schapire, "On the convergence rate of Good-Turing estimators," in Proc. 13th Annu. Conf. Comput. Learning Theory, San Francisco, CA, 2000, pp. 1-6, Morgan Kaufmann.
    • (2000) Proc. 13th Annu. Conf. Comput. Learning Theory , pp. 1-6
    • McAllester, D.1    Schapire, R.E.2
  • 37
    • 23744441936
    • Concentration bounds for unigram language models
    • E. Drukh and Y. Mansour, "Concentration bounds for unigram language models," J. Mach. Learn. Res., vol. 6, pp. 1231-1264, 2005.
    • (2005) J. Mach. Learn. Res. , vol.6 , pp. 1231-1264
    • Drukh, E.1    Mansour, Y.2
  • 38
    • 3142657660
    • Concentration inequalities for the missing mass and for histogram rule error
    • D. McAllester and L. Ortiz, "Concentration inequalities for the missing mass and for histogram rule error," J. Mach. Learn. Res., vol. 4, pp. 895-911, 2003.
    • (2003) J. Mach. Learn. Res. , vol.4 , pp. 895-911
    • McAllester, D.1    Ortiz, L.2
  • 39
    • 33646072127
    • Estimation of the number of operating sensors in large-scale sensor networks with mobile access
    • May
    • C. Budianu, S. Ben-David, and L. Tong, "Estimation of the number of operating sensors in large-scale sensor networks with mobile access," IEEE Trans. Signal Process., vol. 54, no. 5, pp. 1703-1715, May 2006.
    • (2006) IEEE Trans. Signal Process. , vol.54 , Issue.5 , pp. 1703-1715
    • Budianu, C.1    Ben-David, S.2    Tong, L.3
  • 40
    • 0017139943
    • Estimating the number of unseen species: How many words did Shakespeare know?
    • B. Efron and R. Thisted, "Estimating the number of unseen species: How many words did Shakespeare know?," Biometrika, vol. 63, no. 3, pp. 435-447, 1976.
    • (1976) Biometrika , vol.63 , Issue.3 , pp. 435-447
    • Efron, B.1    Thisted, R.2
  • 43
    • 0024627211
    • Asymptotically optimal classification for multiple tests with empirically observed statistics
    • Mar.
    • M. Gutman, "Asymptotically optimal classification for multiple tests with empirically observed statistics," IEEE Trans. Inf. Theory, vol. 35, no. 2, pp. 401-408, Mar. 1989.
    • (1989) IEEE Trans. Inf. Theory , vol.35 , Issue.2 , pp. 401-408
    • Gutman, M.1
  • 44
    • 0023979656
    • On classification with empirically observed statistics and universal data compression
    • DOI 10.1109/18.2636
    • J. Ziv, "On classification with empirically observed statistics and universal data compression," IEEE Trans. Inf. Theory, vol. 34, no. 2, pp. 278-286, Mar. 1988. (Pubitemid 18639980)
    • (1988) IEEE Transactions on Information Theory , vol.34 , Issue.2 , pp. 278-286
    • Ziv, J.1
  • 46
    • 0001035413
    • On the method of bounded differences
    • London Math. Soc. Lecture Note Ser., J. Siemons, Ed. Cambridge, U.K.: Cambridge Univ. Press
    • C. McDiarmid, "On the method of bounded differences," in Surveys in Combinatorics, ser. London Math. Soc. Lecture Note Ser., J. Siemons, Ed. Cambridge, U.K.: Cambridge Univ. Press, 1989, vol. 141, pp. 148-188.
    • (1989) Surveys in Combinatorics , vol.141 , pp. 148-188
    • McDiarmid, C.1
  • 50
    • 3042606358
    • Universal compression of memoryless sources over unknown alphabets
    • Jul.
    • A. Orlitsky, N. P. Santhanam, and J. Zhang, "Universal compression of memoryless sources over unknown alphabets," IEEE Trans. Inf. Theory, vol. 50, no. 7, pp. 1469-1481, Jul. 2004.
    • (2004) IEEE Trans. Inf. Theory , vol.50 , Issue.7 , pp. 1469-1481
    • Orlitsky, A.1    Santhanam, N.P.2    Zhang, J.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.