Volume 50, Issue 2, 2010, Pages 105-115

Missing data imputation using statistical and machine learning methods in a real breast cancer problem

Author keywords

Breast cancer prognosis; Early breast cancer; Machine learning imputation methods; Missing data; Statistical imputation techniques; Survival analysis

Indexed keywords

BREAST CANCER; EARLY BREAST CANCER; IMPUTATION TECHNIQUES; MACHINE-LEARNING; MISSING DATA; SURVIVAL ANALYSIS

EID: 77957130052     PISSN: 09333657     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.artmed.2010.05.002     Document Type: Article
Times cited: 402

References (62)
  • 2
    • 0031921607
    • Feed-forward neural networks for the analysis of censored survival data: a partial logistic regression approach
    • Biganzoli E., Boracchi P., Mariani L., Marubini E. Feed-forward neural networks for the analysis of censored survival data: a partial logistic regression approach. Statistics in Medicine 1998, 17(10):1169-1186.
    • (1998) Statistics in Medicine , vol.17 , Issue.10 , pp. 1169-1186
    • Biganzoli, E.1    Boracchi, P.2    Mariani, L.3    Marubini, E.4
  • 3
    • 0031047117
    • Artificial neural networks improve the accuracy of cancer survival prediction
    • Burke H.B., Goodman P.H., Rosen D.B., Henson D.E., Weinstein J.N., Harrell F.E., et al. Artificial neural networks improve the accuracy of cancer survival prediction. Cancer 1997, 79(4):857-862.
    • (1997) Cancer , vol.79 , Issue.4 , pp. 857-862
    • Burke, H.B.1    Goodman, P.H.2    Rosen, D.B.3    Henson, D.E.4    Weinstein, J.N.5    Harrell, F.E.6
  • 4
    • 0348112535
    • Predicting disease outcome of non-invasive transitional cell carcinoma of the urinary bladder using an artificial neural network model: results of patient follow-up for 15 years or longer
    • Fujikawa K., Matsui Y., Kobayashi T., Miura K., Oka H., Fukuzawa S., et al. Predicting disease outcome of non-invasive transitional cell carcinoma of the urinary bladder using an artificial neural network model: results of patient follow-up for 15 years or longer. International Journal of Urology 2003, 10(3):149-152.
    • (2003) International Journal of Urology , vol.10 , Issue.3 , pp. 149-152
    • Fujikawa, K.1    Matsui, Y.2    Kobayashi, T.3    Miura, K.4    Oka, H.5    Fukuzawa, S.6
  • 7
    • 0038162240
    • A Bayesian neural network approach for modelling censored data with an application to prognosis after surgery for breast cancer
    • Lisboa P.J.G., Wong H., Harris P., Swindell R. A Bayesian neural network approach for modelling censored data with an application to prognosis after surgery for breast cancer. Artificial Intelligence in Medicine 2003, 28(1):1-25.
    • (2003) Artificial Intelligence in Medicine , vol.28 , Issue.1 , pp. 1-25
    • Lisboa, P.J.G.1    Wong, H.2    Harris, P.3    Swindell, R.4
  • 8
    • 33746901239
    • The use of artificial neural networks in decision support in cancer: a systematic review
    • Lisboa P.J.G., Taktak A.F.G. The use of artificial neural networks in decision support in cancer: a systematic review. Neural Networks 2006, 19(4):408-415.
    • (2006) Neural Networks , vol.19 , Issue.4 , pp. 408-415
    • Lisboa, P.J.G.1    Taktak, A.F.G.2
  • 9
    • 0006117317
    • Neural networks as statistical methods in survival analysis. In: Clinical application of artificial neural networks
    • Ripley BD, Ripley RM. Neural networks as statistical methods in survival analysis. In: Clinical application of artificial neural networks. Cambridge University Press; 2001. p. 237-55.
    • (2001) Cambridge University Press , pp. 237-55
    • Ripley, B.D.1    Ripley, R.M.2
  • 13
    • 0004093524
    • Sage Publications, Inc., Thousand Oaks, CA
    • Allison P.D. Missing data 2001, Sage Publications, Inc., Thousand Oaks, CA.
    • (2001) Missing data
    • Allison, P.D.1
  • 15
    • 16344379897
    • Partial identification with missing data: concepts and findings
    • Manski C.F. Partial identification with missing data: concepts and findings. International Journal of Approximate Reasoning 2005, 39(2-3):151-165.
    • (2005) International Journal of Approximate Reasoning , vol.39 , Issue.2-3 , pp. 151-165
    • Manski, C.F.1
  • 17
    • 85047673373
    • Missing data: our view of the state of the art
    • Schafer J.L., Graham J.W. Missing data: our view of the state of the art. Psychological Methods 2002, 7(2):147-177.
    • (2002) Psychological Methods , vol.7 , Issue.2 , pp. 147-177
    • Schafer, J.L.1    Graham, J.W.2
  • 18
    • 0037203136
    • Use of the mean, hot deck and multiple imputation techniques to predict outcome in intensive care unit patients in Colombia
    • Pérez A., Dennis R.J., Gil J.F.A., Rondón M.A., López A. Use of the mean, hot deck and multiple imputation techniques to predict outcome in intensive care unit patients in Colombia. Statistics in Medicine 2002, 21(24):3885-3896.
    • (2002) Statistics in Medicine , vol.21 , Issue.24 , pp. 3885-3896
    • Pérez, A.1    Dennis, R.J.2    Gil, J.F.A.3    Rondón, M.A.4    López, A.5
  • 19
    • 38849125474
    • Multiple imputation using an iterative hot-deck with distance-based donor selection
    • Siddique J., Belin T.R. Multiple imputation using an iterative hot-deck with distance-based donor selection. Statistics in Medicine 2008, 27(1):83-102.
    • (2008) Statistics in Medicine , vol.27 , Issue.1 , pp. 83-102
    • Siddique, J.1    Belin, T.R.2
  • 20
  • 21
    • 23044525261
    • Multiple imputation in practice: comparison of software packages for regression models with missing variables
    • Horton N.J., Lipsitz S.R. Multiple imputation in practice: comparison of software packages for regression models with missing variables. The American Statistician 2001, 55(3):244-254.
    • (2001) The American Statistician , vol.55 , Issue.3 , pp. 244-254
    • Horton, N.J.1    Lipsitz, S.R.2
  • 22
    • 34347372013
    • A comparison of imputation techniques for handling missing predictor values in a risk model with a binary outcome
    • Ambler G., Omar R.Z., Royston P. A comparison of imputation techniques for handling missing predictor values in a risk model with a binary outcome. Statistical Methods in Medical Research 2007, 16(3):277-298.
    • (2007) Statistical Methods in Medical Research , vol.16 , Issue.3 , pp. 277-298
    • Ambler, G.1    Omar, R.Z.2    Royston, P.3
  • 25
    • 39749093807
    • Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes
    • Brock G.N., Shaffer J.R., Blakesley R.E., Lotz M.J., Tseng G.C. Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes. BMC Bioinformatics 2008, 9:12.
    • (2008) BMC Bioinformatics , vol.9 , pp. 12
    • Brock, G.N.1    Shaffer, J.R.2    Blakesley, R.E.3    Lotz, M.J.4    Tseng, G.C.5
  • 26
    • 19344371607
    • Evaluation of missing value estimation for microarray data
    • Nguyen D.V., Wang N., Carroll R.J. Evaluation of missing value estimation for microarray data. Journal of Data Science 2004, 2:347-370.
    • (2004) Journal of Data Science , vol.2 , pp. 347-370
    • Nguyen, D.V.1    Wang, N.2    Carroll, R.J.3
  • 27
    • 28444441249
    • The influence of missing value imputation on detection of differentially expressed genes from microarray data
    • Scheel I., Aldrin M., Glad I.K., Shrum R., Lyng H., Frigessi A. The influence of missing value imputation on detection of differentially expressed genes from microarray data. Bioinformatics 2005, 21(23):4272-4279.
    • (2005) Bioinformatics , vol.21 , Issue.23 , pp. 4272-4279
    • Scheel, I.1    Aldrin, M.2    Glad, I.K.3    Shrum, R.4    Lyng, H.5    Frigessi, A.6
  • 28
    • 0002812717
    • Dealing with missing values in neural-network-based diagnostic systems
    • Sharpe P.K., Solly R.J. Dealing with missing values in neural-network-based diagnostic systems. Neural Computing & Applications 1995, 3(2):73-77.
    • (1995) Neural Computing & Applications , vol.3 , Issue.2 , pp. 73-77
    • Sharpe, P.K.1    Solly, R.J.2
  • 29
    • 0043007739
    • Neural network imputation applied to the Norwegian 1990 census data
    • Nordbotten S. Neural network imputation applied to the Norwegian 1990 census data. Journal of Official Statistics 1996, 12(4):385-401.
    • (1996) Journal of Official Statistics , vol.12 , Issue.4 , pp. 385-401
    • Nordbotten, S.1
  • 31
    • 0033225727
    • Imputation of missing data in industrial databases
    • Lakshminarayan K., Harp S.A., Samad T. Imputation of missing data in industrial databases. Applied Intelligence 1999, 11(3):259-275.
    • (1999) Applied Intelligence , vol.11 , Issue.3 , pp. 259-275
    • Lakshminarayan, K.1    Harp, S.A.2    Samad, T.3
  • 32
    • 61849118651
    • A study of k-nearest neighbour as an imputation method
    • IOS Press, Santiago, Chile, A. Abraham, J.R. del Solar, M. Köppen (Eds.)
    • Batista G.E.A.P.A., Monard M.C. A study of k-nearest neighbour as an imputation method. HIS, vol. 87 of frontiers in artificial intelligence and applications 2002, 251-260. IOS Press, Santiago, Chile. A. Abraham, J.R. del Solar, M. Köppen (Eds.).
    • (2002) HIS, vol. 87 of frontiers in artificial intelligence and applications , pp. 251-260
    • Batista, G.E.A.P.A.1    Monard, M.C.2
  • 33
    • 22944460763
    • Towards efficient imputation by nearest-neighbors: a clustering-based approach
    • Hruschka ER, Hruschka ER, Ebecken NFF. Towards efficient imputation by nearest-neighbors: a clustering-based approach. In: AI 2004: advances in artificial intelligence, vol. 3339 of lecture notes in computer science. Springer Berlin/Heidelberg; 2005. p. 513-25.
    • (2005) AI 2004: Advances in Artificial Intelligence, Lecture Notes in Computer Science , vol.3339 , pp. 513-25
    • Hruschka, E.R.1    Hruschka, E.R.2    Ebecken, N.F.F.3
  • 35
    • 23044533360
    • Self-organising map for data imputation and correction in surveys
    • Fessant F., Midenet S. Self-organising map for data imputation and correction in surveys. Neural Computing & Applications 2002, 10(4):300-310.
    • (2002) Neural Computing & Applications , vol.10 , Issue.4 , pp. 300-310
    • Fessant, F.1    Midenet, S.2
  • 36
    • 77957144426
    • Introduction to self-organizing maps modelling for imputation-techniques and technology
    • Piela P. Introduction to self-organizing maps modelling for imputation-techniques and technology. Research in Official Statistics 2002, 2:5-19.
    • (2002) Research in Official Statistics , vol.2 , pp. 5-19
    • Piela, P.1
  • 38
    • 10744226143
    • Epidemiological study of the GEICAM group about breast cancer in Spain, El Álamo project
    • Martín M., Llombart-Cussac A., Lluch A., Alba E., Munarriz B., Tusquets I., et al. Epidemiological study of the GEICAM group about breast cancer in Spain, El Álamo project. Medicina Clínica 2004, 122(1):12-17.
    • (2004) Medicina Clínica , vol.122 , Issue.1 , pp. 12-17
    • Martín, M.1    Llombart-Cussac, A.2    Lluch, A.3    Alba, E.4    Munarriz, B.5    Tusquets, I.6
  • 39
    • 49749121314
    • Spanish breast cancer research group (GEICAM) population-based study on breast cancer outcomes: El Álamo project
    • Ruíz A., Lluch A., Martín M., Munárriz B., Antón A., Alba E., et al. Spanish breast cancer research group (GEICAM) population-based study on breast cancer outcomes: El Álamo project. Journal of Clinical Oncology 2005, 23(16S):585.
    • (2005) Journal of Clinical Oncology , vol.23 , Issue.16S , pp. 585
    • Ruíz, A.1    Lluch, A.2    Martín, M.3    Munárriz, B.4    Antón, A.5    Alba, E.6
  • 45
    • 77957146466
    • Amelia software website, [accessed 15.12.08]
    • Honaker J, King G, Blackwell M. Amelia software website, 2008 [accessed 15.12.08]. http://gking.harvard.edu/amelia.
    • (2008)
    • Honaker, J.1    King, G.2    Blackwell, M.3
  • 46
    • 77957148913
    • Pattern recognition and machine learning, information science and statistics. Springer Science+Business Media, LLC
    • Bishop CM. Pattern recognition and machine learning, information science and statistics. Springer Science+Business Media, LLC; 2007.
    • (2007)
    • Bishop, C.M.1
  • 49
    • 10944226441
    • Constructive learning techniques for designing neural network systems. In: Neural network systems, techniques and applications
    • Campbell C. Constructive learning techniques for designing neural network systems. In: Neural network systems, techniques and applications. San Diego: Academic Press; 1997. p. 1-54.
    • (1997) San Diego: Academic Press , pp. 1-54
    • Campbell, C.1
  • 50
    • 0000373566
    • A dynamic node architecture scheme for layered neural networks
    • Bartlett E.B. A dynamic node architecture scheme for layered neural networks. Journal of Artificial Neural Network 1994, 1:229-245.
    • (1994) Journal of Artificial Neural Network , vol.1 , pp. 229-245
    • Bartlett, E.B.1
  • 51
    • 0025964567
    • Back-propagation algorithm which varies the number of hidden units
    • Hirose Y., Yamashita K., Hijiya S. Back-propagation algorithm which varies the number of hidden units. Neural Networks 1991, 4(1):61-66.
    • (1991) Neural Networks , vol.4 , Issue.1 , pp. 61-66
    • Hirose, Y.1    Yamashita, K.2    Hijiya, S.3
  • 52
    • 0035654279
    • Feedforward neural network construction using cross validation
    • Setiono R. Feedforward neural network construction using cross validation. Neural Computation 2001, 13(12):2865-2877.
    • (2001) Neural Computation , vol.13 , Issue.12 , pp. 2865-2877
    • Setiono, R.1
  • 54
    • 0003410791
    • Springer Series in Information Sciences, Springer-Verlag, Berlin
    • Kohonen T. Self-organizing maps 2001, Springer Series in Information Sciences, Springer-Verlag, Berlin. 3rd ed.
    • (2001) Self-organizing maps
    • Kohonen, T.1
  • 55
    • 61849150502
    • K nearest neighbours with mutual information for simultaneous classification and missing data imputation
    • García-Laencina P.J., Sancho-Gómez J.L., Figueiras-Vidal A.R., Verleysen M. K nearest neighbours with mutual information for simultaneous classification and missing data imputation. Neurocomputing 2009, 72(7-9):1483-1493.
    • (2009) Neurocomputing , vol.72 , Issue.7-9 , pp. 1483-1493
    • García-Laencina, P.J.1    Sancho-Gómez, J.L.2    Figueiras-Vidal, A.R.3    Verleysen, M.4
  • 56
    • 0027205884
    • A scaled conjugate-gradient algorithm for fast supervised learning
    • Moller M.F. A scaled conjugate-gradient algorithm for fast supervised learning. Neural Networks 1993, 6(4):525-533.
    • (1993) Neural Networks , vol.6 , Issue.4 , pp. 525-533
    • Moller, M.F.1
  • 57
    • 77957130515
    • Neural network toolbox: for use with MATLAB: user's guide. Cochituate Place, 24 Prime Park Way, Natick, MA, USA: The Mathworks
    • Demuth H, Beale M. Neural network toolbox: for use with MATLAB: user's guide. Cochituate Place, 24 Prime Park Way, Natick, MA, USA: The Mathworks; 1993.
    • (1993)
    • Demuth, H.1    Beale, M.2
  • 58
    • 0020083498
    • The meaning and use of the area under a receiver operating characteristic (ROC) curve
    • Hanley J.A., McNeil B.J. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982, 143(1):29-36.
    • (1982) Radiology , vol.143 , Issue.1 , pp. 29-36
    • Hanley, J.A.1    McNeil, B.J.2
  • 60
    • 29644438050
    • Statistical comparisons of classifiers over multiple data sets
    • Demsar J. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 2006, 7:1-30.
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 1-30
    • Demsar, J.1
  • 61
    • 84944811700
    • The use of ranks to avoid the assumption of normality implicit in the analysis of variance
    • Friedman M. The use of ranks to avoid the assumption of normality implicit in the analysis of variance. Journal of the American Statistical Association 1937, 32(200):675-701.
    • (1937) Journal of the American Statistical Association , vol.32 , Issue.200 , pp. 675-701
    • Friedman, M.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.