Volume 37, Issue 5, 2007, Pages 692-709

A novel framework for imputation of missing values in databases

Author keywords

Accuracy; Databases; Missing values; Multiple imputation (MI); Single imputation

Indexed keywords

ALGORITHMS; ASYMPTOTIC ANALYSIS; COMPUTATIONAL COMPLEXITY;

EID: 34548234315     PISSN: 10834427     EISSN: None     Source Type: Journal    
DOI: 10.1109/TSMCA.2007.902631     Document Type: Article
Times cited : (201)

References (59)
  • 1
    • 19644370395
    • The treatment of missing values and its effect in the classifier accuracy
    • D. Banks, L. House, F. R. McMorris, P. Arabie, and W. Gaul, Eds. Berlin, Germany: Springer-Verlag
    • E. Acuna and C. Rodriguez, "The treatment of missing values and its effect in the classifier accuracy," in Classification, Clustering and Data Mining Applications, D. Banks, L. House, F. R. McMorris, P. Arabie, and W. Gaul, Eds. Berlin, Germany: Springer-Verlag, 2004, pp. 639-648.
    • (2004) Classification, Clustering and Data Mining Applications , pp. 639-648
    • Acuna, E.1    Rodriguez, C.2
  • 3
    • 0032954507
    • Applications of multiple imputation in medical studies: From AIDS to NHANES
    • Jan
    • J. Barnard and X. L. Meng, "Applications of multiple imputation in medical studies: From AIDS to NHANES," Stat. Methods Med. Res., vol. 8, no. 1, pp. 17-36, Jan. 1999.
    • (1999) Stat. Methods Med. Res , vol.8 , Issue.1 , pp. 17-36
    • Barnard, J.1    Meng, X.L.2
  • 4
    • 0242498488
    • An analysis of four missing data treatment methods for supervised learning
    • G. Batista and M. Monard, "An analysis of four missing data treatment methods for supervised learning," Appl. Artif. Intell., vol. 17, no. 5/6, pp. 519-533, 2003.
    • (2003) Appl. Artif. Intell , vol.17 , Issue.5-6 , pp. 519-533
    • Batista, G.1    Monard, M.2
  • 5
    • 0003408496
    • Irvine, CA: Univ. California at Irvine, Dept. Inf. and Comput. Sci., 1998, Online, Available
    • C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases. Irvine, CA: Univ. California at Irvine, Dept. Inf. and Comput. Sci., 1998. [Online]. Available: http://www.ics.uci.edu/~mlearn/MLRepository.html
    • UCI Repository of Machine Learning Databases
    • Blake, C.L.1    Merz, C.J.2
  • 6
    • 0040322676
    • Development, implementation and evaluation of multiple imputation strategies for the statistical analysis of incomplete datasets
    • Ph.D. dissertation, Erasmus Univ, Rotterdam, The Netherlands
    • J. P. L. Brand, "Development, implementation and evaluation of multiple imputation strategies for the statistical analysis of incomplete datasets," Ph.D. dissertation, Erasmus Univ., Rotterdam, The Netherlands, 1999.
    • (1999)
    • Brand, J.P.L.1
  • 7
    • 84958528434
    • Characterizing the applicability of classification algorithms using meta level learning
    • P. Brazdil, J. Gama, and R. Henery, "Characterizing the applicability of classification algorithms using meta level learning," in Proc. ECML, 1994, pp. 83-102.
    • (1994) Proc. ECML , pp. 83-102
    • Brazdil, P.1    Gama, J.2    Henery, R.3
  • 8
    • 0001342385
    • A method of estimation of missing values in multivariate data suitable for use with an electronic computer
    • S. F. Buck, "A method of estimation of missing values in multivariate data suitable for use with an electronic computer," J. R. Stat. Soc., vol. B22, no. 2, pp. 302-306, 1960.
    • (1960) J. R. Stat. Soc , vol.B22 , Issue.2 , pp. 302-306
    • Buck, S.F.1
  • 12
    • 84937730674
    • Explaining the Gibbs sampler
    • Aug
    • G. Casella and E. L. George, "Explaining the Gibbs sampler," Amer. Stat., vol. 46, no. 3, pp. 167-174, Aug. 1992.
    • (1992) Amer. Stat , vol.46 , Issue.3 , pp. 167-174
    • Casella, G.1    George, E.L.2
  • 13
    • 0041324801
    • Variational Bayesian learning of ICA with missing data
    • Aug
    • K. Chan, T. W. Lee, and T. J. Sejnowski, "Variational Bayesian learning of ICA with missing data," Neural Comput., vol. 15, no. 8, pp. 1991-2011, Aug. 2003.
    • (2003) Neural Comput , vol.15 , Issue.8 , pp. 1991-2011
    • Chan, K.1    Lee, T.W.2    Sejnowski, T.J.3
  • 15
    • 0010594108
    • Hybrid inductive machine learning: An overview of CLIP algorithms
    • L. C. Jain and J. Kacprzyk, Eds. New York: Springer-Verlag
    • K. J. Cios and L. A. Kurgan, "Hybrid inductive machine learning: An overview of CLIP algorithms," in New Learning Paradigms in Soft Computing, L. C. Jain and J. Kacprzyk, Eds. New York: Springer-Verlag, 2001, pp. 276-322.
    • (2001) New Learning Paradigms in Soft Computing , pp. 276-322
    • Cios, K.J.1    Kurgan, L.A.2
  • 16
    • 2442649388
    • CLIP4: Hybrid inductive machine learning algorithm that generates inequality rules
    • K. J. Cios and L. A. Kurgan, "CLIP4: Hybrid inductive machine learning algorithm that generates inequality rules," Inf. Sci., vol. 163, no. 1-3, pp. 37-83, 2004.
    • (2004) Inf. Sci , vol.163 , Issue.1-3 , pp. 37-83
    • Cios, K.J.1    Kurgan, L.A.2
  • 17
    • 0036756222
    • Uniqueness of medical data mining
    • Sep./Oct
    • K. J. Cios and G. Moore, "Uniqueness of medical data mining," Artif. Intell. Med., vol. 26, no. 1/2, pp. 1-24, Sep./Oct. 2002.
    • (2002) Artif. Intell. Med , vol.26 , Issue.1-2 , pp. 1-24
    • Cios, K.J.1    Moore, G.2
  • 19
    • 34548286482
    • Missing data incremental imputation through tree-based methods
    • W. Hardle and B. Ronz, Eds
    • C. Conversano and C. Capelli, "Missing data incremental imputation through tree-based methods," in Proc. COMPSTAT, W. Hardle and B. Ronz, Eds., 2002, pp. 455-460.
    • (2002) Proc. COMPSTAT , pp. 455-460
    • Conversano, C.1    Capelli, C.2
  • 20
    • 0002629270
    • Maximum likelihood from incomplete data via the EM algorithm (with discussion)
    • A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," J. R. Stat. Soc., vol. 82, pp. 528-550, 1978.
    • (1978) J. R. Stat. Soc , vol.82 , pp. 528-550
    • Dempster, A.P.1    Laird, N.M.2    Rubin, D.B.3
  • 23
    • 84956855928
    • Handling missing data in trees: Surrogate splits or statistical imputation
    • A. J. Feelders, "Handling missing data in trees: Surrogate splits or statistical imputation," in Proc. 3rd Eur. Conf. PKDD, 1999, pp. 329-334.
    • (1999) Proc. 3rd Eur. Conf. PKDD , pp. 329-334
    • Feelders, A.J.1
  • 26
    • 34548263283
    • Mixture models for learning from incomplete data
    • Z. Ghahramani and M. I. Jordan, "Mixture models for learning from incomplete data," in Computational Learning Theory and Natural Learning Systems, vol. 4, Making Learning Systems Practical, R. Greiner, T. Petsche, and S. J. Hanson, Eds. Cambridge, MA: MIT Press, 1997, pp. 67-85.
  • 27
    • 0003704318
    • Irvine, CA: Univ. California, Dept. Inf. and Comput. Sci, Online, Available
    • S. Hettich and S. D. Bay, The UCI KDD Archive. Irvine, CA: Univ. California, Dept. Inf. and Comput. Sci., 1999. [Online]. Available: http://kdd.ics.uci.edu
    • (1999) The UCI KDD Archive
    • Hettich, S.1    Bay, S.D.2
  • 28
    • 23044525261
    • Multiple imputation in practice: Comparison of software packages for regression models with missing variables
    • N. J. Horton and S. R. Lipsitz, "Multiple imputation in practice: Comparison of software packages for regression models with missing variables," Amer. Stat., vol. 55, no. 3, pp. 244-254, 2001.
    • (2001) Amer. Stat , vol.55 , Issue.3 , pp. 244-254
    • Horton, N.J.1    Lipsitz, S.R.2
  • 29
    • 1942486882
    • Methods for imputation of missing values in air quality data sets
    • Jun
    • H. Junninen, H. Niska, K. Tuppurainen, J. Ruuskanen, and M. Kolehmainen, "Methods for imputation of missing values in air quality data sets," Atmos. Environ., vol. 38, no. 18, pp. 2895-2907, Jun. 2004.
    • (2004) Atmos. Environ , vol.38 , Issue.18 , pp. 2895-2907
    • Junninen, H.1    Niska, H.2    Tuppurainen, K.3    Ruuskanen, J.4    Kolehmainen, M.5
  • 31
    • 0033225727
    • Imputation of missing data in industrial databases
    • Nov./Dec
    • K. Lakshminarayan, S. A. Harp, and T. Samad, "Imputation of missing data in industrial databases," Appl. Intell., vol. 11, no. 3, pp. 259-275, Nov./Dec. 1999.
    • (1999) Appl. Intell , vol.11 , Issue.3 , pp. 259-275
    • Lakshminarayan, K.1    Harp, S.A.2    Samad, T.3
  • 32
    • 0002502286
    • Imputation using Markov chains
    • K.-H. Li, "Imputation using Markov chains," J. Comput. Simul., vol. 30, no. 1, pp. 57-79, 1988.
    • (1988) J. Comput. Simul , vol.30 , Issue.1 , pp. 57-79
    • Li, K.-H.1
  • 37
    • 15744384861
    • Options for handling missing data in the health utilities index mark 3
    • A. Naeim, E. B. Keeler, and C. M. Mangione, "Options for handling missing data in the health utilities index mark 3," Med. Decision Making, vol. 25, no. 2, pp. 186-198, 2005.
    • (2005) Med. Decision Making , vol.25 , Issue.2 , pp. 186-198
    • Naeim, A.1    Keeler, E.B.2    Mangione, C.M.3
  • 38
    • 0000628115
    • Weighting adjustments for unit nonresponse, incomplete data in sample survey
    • W. G. Madow, I. Olkin, and D. B. Rubin, Eds. New York: Academic
    • H. L. Oh and F. L. Scheuren, "Weighting adjustments for unit nonresponse, incomplete data in sample survey," in Theory and Bibliographies, vol. 2, W. G. Madow, I. Olkin, and D. B. Rubin, Eds. New York: Academic, 1983, pp. 143-183.
    • (1983) Theory and Bibliographies , vol.2 , pp. 143-183
    • Oh, H.L.1    Scheuren, F.L.2
  • 40
    • 33744584654
    • Induction of decision trees
    • J. R. Quinlan, "Induction of decision trees," Mach. Learn., vol. 1, no. 1, pp. 81-106, 1986.
    • (1986) Mach. Learn , vol.1 , Issue.1 , pp. 81-106
    • Quinlan, J.R.1
  • 41
    • 34548213753
    • A preprocessing method to treat missing values in knowledge discovery in databases
    • A. Ragel, "A preprocessing method to treat missing values in knowledge discovery in databases," Comput. Inf. Syst., vol. 2, pp. 66-72, 2000.
    • (2000) Comput. Inf. Syst , vol.2 , pp. 66-72
    • Ragel, A.1
  • 42
    • 0002415125
    • Efficiently creating multiple imputation for incomplete multivariate normal data
    • D. B. Rubin and J. L. Schafer, "Efficiently creating multiple imputation for incomplete multivariate normal data," in Proc. Stat. Comput. Section, 1990, pp. 83-88.
    • (1990) Proc. Stat. Comput. Section , pp. 83-88
    • Rubin, D.B.1    Schafer, J.L.2
  • 43
    • 0001354633
    • Formalizing subjective notions about the effect of nonrespondents in sample surveys
    • Sep
    • D. B. Rubin, "Formalizing subjective notions about the effect of nonrespondents in sample surveys," J. Amer. Stat. Assoc., vol. 72, no. 359, pp. 538-543, Sep. 1977.
    • (1977) J. Amer. Stat. Assoc , vol.72 , Issue.359 , pp. 538-543
    • Rubin, D.B.1
  • 44
    • 0030539070
    • Multiple imputation after 18+ years
    • Jun
    • D. B. Rubin, "Multiple imputation after 18+ years," J. Amer. Stat. Assoc., vol. 91, no. 434, pp. 473-489, Jun. 1996.
    • (1996) J. Amer. Stat. Assoc , vol.91 , Issue.434 , pp. 473-489
    • Rubin, D.B.1
  • 47
    • 14844365626
    • Indirect methods of imputation of missing data based on available units
    • May
    • M. M. Rueda, S. González, and A. Arcos, "Indirect methods of imputation of missing data based on available units," Appl. Math. Comput., vol. 164, no. 1, pp. 249-261, May 2005.
    • (2005) Appl. Math. Comput , vol.164 , Issue.1 , pp. 249-261
    • Rueda, M.M.1    González, S.2    Arcos, A.3
  • 49
    • 34548229532
    • Prediction with missing inputs
    • W. S. Sarle, "Prediction with missing inputs," in Proc. 4th JCIS, 1998, vol. 2, pp. 399-402.
    • (1998) Proc. 4th JCIS , vol.2 , pp. 399-402
    • Sarle, W.S.1
  • 52
    • 0032960273
    • Multiple imputations: A primer
    • J. L. Schafer, "Multiple imputations: A primer," Stat. Methods Med. Res., vol. 8, pp. 3-15, 1999.
    • (1999) Stat. Methods Med. Res , vol.8 , pp. 3-15
    • Schafer, J.L.1
  • 53
    • 84856043672
    • A mathematical theory of communication
    • C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. 27, pp. 379-423, 1948.
    • (1948) Bell Syst. Tech. J , vol.27 , pp. 379-423
    • Shannon, C.E.1
  • 55
    • 85153928966
    • Efficient methods for dealing with missing data in supervised learning
    • G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds. Cambridge, MA: MIT Press
    • V. Tresp, R. Neuneier, and S. Ahmad, "Efficient methods for dealing with missing data in supervised learning," in Advances in Neural Information Processing Systems 7, G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds. Cambridge, MA: MIT Press, 1995, pp. 689-696.
    • (1995) Advances in Neural Information Processing Systems 7 , pp. 689-696
    • Tresp, V.1    Neuneier, R.2    Ahmad, S.3
  • 56
    • 0003064420
    • Missing values: Statistical theory and computational practice
    • P. Dirschedl and R. Ostermann, Eds. Heidelberg, Germany: Physica-Verlag
    • W. Vach, "Missing values: Statistical theory and computational practice," in Computational Statistics, P. Dirschedl and R. Ostermann, Eds. Heidelberg, Germany: Physica-Verlag, 1994, pp. 345-354.
    • (1994) Computational Statistics , pp. 345-354
    • Vach, W.1
  • 57
    • 34548238789
    • May, Intelligent Enterprise, Online, Available
    • R. Winter and K. Auerbach, Contents Under Pressure, May 2004, Intelligent Enterprise. [Online]. Available: http://www.intelligententerprise.com/showArticle.jhtml?articleID=18902161
    • (2004) Contents Under Pressure
    • Winter, R.1    Auerbach, K.2
  • 58
    • 0033901936
    • Association based multiple imputation in multivariate datasets: A summary
    • W. Zhang, "Association based multiple imputation in multivariate datasets: A summary," in Proc. 16th ICDE, 2000, pp. 310-311.
    • (2000) Proc. 16th ICDE , pp. 310-311
    • Zhang, W.1
  • 59
    • 0345659235
    • Multiple imputation: Theory and method (with discussion)
    • P. Zhang, "Multiple imputation: Theory and method (with discussion)," Int. Stat. Rev., vol. 71, no. 3, pp. 581-592, 2003.
    • (2003) Int. Stat. Rev , vol.71 , Issue.3 , pp. 581-592
    • Zhang, P.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.