Volume , Issue , 2009, Pages 796-804

Data mining techniques for data cleaning

Author keywords

Association rule; Bagging; Data Cleaning; Data Mining; Functional dependency; SVMs

Indexed keywords

BAGGING; DATA CLEANING; DATA MINING METHODS; DATA MINING TECHNIQUES; DATA QUALITY; FUNCTIONAL DEPENDENCY; INTERESTING INFORMATION; KEY TECHNIQUES; LARGE DATABASE; QUALITY INFORMATION MANAGEMENTS; SVMS;

EID: 84871499791     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (17)

References (39)
  • 1
    • 0001882616
    • Fast algorithms for mining association rules in large databases
    • (Jorge B. Bocca, Matthias Jarke, and Carlo Zaniolo, eds.), Morgan Kaufmann.
    • Rakesh Agrawal and Ramakrishnan Srikant, (1994) Fast algorithms for mining association rules in large databases, VLDB (Jorge B. Bocca, Matthias Jarke, and Carlo Zaniolo, eds.), Morgan Kaufmann. pp. 487-499.
    • (1994) VLDB , pp. 487-499
    • Agrawal, R.1    Srikant, R.2
  • 3
    • 0030211964
    • Bagging predictors
    • Leo Breiman, (1996) Bagging predictors, Machine Learning 24, no. 2, 123-140.
    • (1996) Machine Learning , vol.24 , Issue.2 , pp. 123-140
    • Breiman, L.1
  • 4
    • 84880905070
    • Identification constraints and functional dependencies in description logics
    • (Bernhard Nebel, ed.), Morgan Kaufmann
    • Diego Calvanese, Giuseppe De Giacomo, and Maurizio Lenzerini, (2001) Identification constraints and functional dependencies in description logics, IJCAI (Bernhard Nebel, ed.), Morgan Kaufmann, pp. 155-160.
    • (2001) IJCAI , pp. 155-160
    • Calvanese, D.1    De Giacomo, G.2    Lenzerini, M.3
  • 5
    • 0344811122
    • An overview of data warehousing and OLAP technology
    • Surajit Chaudhuri and Umeshwar Dayal, (1997) An overview of data warehousing and OLAP technology, SIGMOD Record 26, no. 1, 65-74.
    • (1997) SIGMOD Record , vol.26 , Issue.1 , pp. 65-74
    • Chaudhuri, S.1    Dayal, U.2
  • 6
    • 0030387631
    • Data mining: An overview from a database perspective
    • Ming-Syan Chen, Jiawei Han, and Philip S. Yu, (1996) Data mining: An overview from a database perspective, IEEE Trans. Knowl. Data Eng. 8, no. 6, 866-883.
    • (1996) IEEE Trans. Knowl. Data Eng. , vol.8 , Issue.6 , pp. 866-883
    • Chen, M.-S.1    Han, J.2    Yu, P.S.3
  • 7
    • 12244300519
    • A general approach to incorporate data quality matrices into data mining algorithms
    • (Won Kim, Ron Kohavi, Johannes Gehrke, and William DuMouchel, eds.), ACM
    • Ian Davidson, Ashish Grover, Ashwin Satyanarayana, and Giri Kumar Tayi, (2004) A general approach to incorporate data quality matrices into data mining algorithms, KDD (Won Kim, Ron Kohavi, Johannes Gehrke, and William DuMouchel, eds.), ACM, pp. 794-798.
    • (2004) KDD , pp. 794-798
    • Davidson, I.1    Grover, A.2    Satyanarayana, A.3    Tayi, G.K.4
  • 8
    • 0032218208
    • Investigating data quality problems in the PSP
    • Anne M. Disney and Philip M. Johnson, (1998) Investigating data quality problems in the PSP, SIGSOFT FSE, pp. 143-152.
    • (1998) SIGSOFT FSE , pp. 143-152
    • Disney, A.M.1    Johnson, P.M.2
  • 11
    • 0030289446
    • Data mining and knowledge discovery in databases (introduction to the special section)
    • Usama M. Fayyad and Ramasamy Uthurusamy, (1996) Data mining and knowledge discovery in databases (introduction to the special section), Commun. ACM 39, no. 11, 24-26.
    • (1996) Commun. ACM , vol.39 , Issue.11 , pp. 24-26
    • Fayyad, U.M.1    Uthurusamy, R.2
  • 13
    • 0035988591
    • Data requirements and data sources for biodiversity priority area selection
    • P. H. Williams, C. R. Margules, and D. W. Hilbert, (2002) Data requirements and data sources for biodiversity priority area selection, Journal of Biosciences (ISSN 0250-5991) 27, no. 4, 327-338.
    • (2002) Journal of Biosciences , vol.27 , Issue.4 , pp. 327-338
    • Williams, P.H.1    Margules, C.R.2    Hilbert, D.W.3
  • 14
    • 0013331361
    • Real-world data is dirty: Data cleansing and the merge/purge problem
    • M.A. Hernandez and S.J. Stolfo, (1998) Real-world data is dirty: Data cleansing and the merge/purge problem, Data Mining and Knowledge Discovery 2, 9-37.
    • (1998) Data Mining and Knowledge Discovery , vol.2 , pp. 9-37
    • Hernandez, M.A.1    Stolfo, S.J.2
  • 15
    • 84871515553
    • Data quality mining - Making a virtue of necessity
    • Jochen Hipp, Ulrich Guntzer, and Udo Grimmer, (2001) Data quality mining - Making a virtue of necessity, DMKD.
    • (2001) DMKD
    • Hipp, J.1    Guntzer, U.2    Grimmer, U.3
  • 16
    • 0345201769
    • TANE: An efficient algorithm for discovering functional and approximate dependencies
    • Ykä Huhtala, Juha Kärkkäinen, Pasi Porkka, and Hannu Toivonen, (1999) TANE: An efficient algorithm for discovering functional and approximate dependencies, Comput. J. 42, no. 2, 100-111.
    • (1999) Comput. J. , vol.42 , Issue.2 , pp. 100-111
    • Huhtala, Y.1    Kärkkäinen, J.2    Porkka, P.3    Toivonen, H.4
  • 17
    • 3142708793
    • CORDS: Automatic discovery of correlations and soft functional dependencies
    • (Gerhard Weikum, Arnd Christian Konig, and Stefan Deßloch, eds.), ACM
    • Ihab F. Ilyas, Volker Markl, Peter J. Haas, Paul Brown, and Ashraf Aboulnaga, (2004) CORDS: Automatic discovery of correlations and soft functional dependencies, SIGMOD Conference (Gerhard Weikum, Arnd Christian Konig, and Stefan Deßloch, eds.), ACM, pp. 647-658.
    • (2004) SIGMOD Conference , pp. 647-658
    • Ilyas, I.F.1    Markl, V.2    Haas, P.J.3    Brown, P.4    Aboulnaga, A.5
  • 18
    • 84957886632
    • Association rules... and what's next? Towards second generation data mining systems
    • (Witold Litwin, Tadeusz Morzy, and Gottfried Vossen, eds.), Lecture Notes in Computer Science, 1475, Springer
    • Tomasz Imielinski and Aashu Virmani, (1998) Association rules... and what's next? Towards second generation data mining systems, ADBIS (Witold Litwin, Tadeusz Morzy, and Gottfried Vossen, eds.), Lecture Notes in Computer Science, vol. 1475, Springer, pp. 6-25.
    • (1998) ADBIS , pp. 6-25
    • Imielinski, T.1    Virmani, A.2
  • 19
    • 34447121992
    • Principles of data mining
    • David J. Hand, (2007) Principles of data mining, Drug Safety 30, 621-622.
    • (2007) Drug Safety , vol.30 , pp. 621-622
    • Hand, D.J.1
  • 22
    • 0034173996
    • SEMINT: A tool for identifying attribute correspondences in heterogeneous databases using neural networks
    • Wen-Syan Li and Chris Clifton, (2000) SEMINT: A tool for identifying attribute correspondences in heterogeneous databases using neural networks, Data Knowl. Eng. 33, no. 1, 49-84.
    • (2000) Data Knowl. Eng. , vol.33 , Issue.1 , pp. 49-84
    • Li, W.-S.1    Clifton, C.2
  • 23
    • 0022130080
    • A data distortion by probability distribution
    • Chong K. Liew, Uinam J. Choi, and Chung J. Liew, (1985) A data distortion by probability distribution, ACM Trans. Database Syst. 10, no. 3, 395-411.
    • (1985) ACM Trans. Database Syst. , vol.10 , Issue.3 , pp. 395-411
    • Liew, C.K.1    Choi, U.J.2    Liew, C.J.3
  • 24
    • 0010274189
    • Data cleansing: Beyond integrity analysis
    • (Barbara D. Klein and Donald F. Rossin, eds.), MIT
    • Jonathan I. Maletic and Andrian Marcus, (2000) Data cleansing: Beyond integrity analysis, IQ (Barbara D. Klein and Donald F. Rossin, eds.), MIT, pp. 200-209.
    • (2000) IQ , pp. 200-209
    • Maletic, J.I.1    Marcus, A.2
  • 25
    • 0242288813
    • The support vector machine under test
    • David Meyer, Friedrich Leisch, and Kurt Hornik, (2003) The support vector machine under test, Neurocomputing 55, no. 1-2, 169-186.
    • (2003) Neurocomputing , vol.55 , Issue.1-2 , pp. 169-186
    • Meyer, D.1    Leisch, F.2    Hornik, K.3
  • 28
    • 0346801841
    • A theoretical basis for perturbation methods
    • Krishnamurty Muralidhar and Rathindra Sarathy, (2003) A theoretical basis for perturbation methods, Statistics and Computing 13, no. 4, 329-335.
    • (2003) Statistics and Computing , vol.13 , Issue.4 , pp. 329-335
    • Muralidhar, K.1    Sarathy, R.2
  • 29
    • 33644986740
    • Quality driven source selection using data envelope analysis
    • (InduShobha N. Chengalur-Smith and Leo Pipino, eds.), MIT
    • Felix Naumann, Johann Christoph Freytag, and Myra Spiliopoulou, (1998) Quality driven source selection using data envelope analysis, IQ (InduShobha N. Chengalur-Smith and Leo Pipino, eds.), MIT, pp. 137-152.
    • (1998) IQ , pp. 137-152
    • Naumann, F.1    Freytag, J.C.2    Spiliopoulou, M.3
  • 30
    • 4243102865
    • Assessing data quality with control matrices
    • Elizabeth M. Pierce, (2004) Assessing data quality with control matrices, Commun. ACM 47, no. 2, 82-86.
    • (2004) Commun. ACM , vol.47 , Issue.2 , pp. 82-86
    • Pierce, E.M.1
  • 31
    • 0002490026
    • Data cleaning: Problems and current approaches
    • Erhard Rahm and Hong Hai Do, (2000) Data cleaning: Problems and current approaches, IEEE Data Eng. Bull. 23, no. 4, 3-13.
    • (2000) IEEE Data Eng. Bull. , vol.23 , Issue.4 , pp. 3-13
    • Rahm, E.1    Do, H.H.2
  • 32
    • 84944315993
    • Potter's wheel: An interactive data cleaning system
    • (Peter M. G. Apers, Paolo Atzeni, Stefano Ceri, Stefano Paraboschi, Kotagiri Ramamohanarao, and Richard T. Snodgrass, eds.), Morgan Kaufmann
    • Vijayshankar Raman and Joseph M. Hellerstein, (2001) Potter's wheel: An interactive data cleaning system, VLDB (Peter M. G. Apers, Paolo Atzeni, Stefano Ceri, Stefano Paraboschi, Kotagiri Ramamohanarao, and Richard T. Snodgrass, eds.), Morgan Kaufmann, pp. 381-390.
    • (2001) VLDB , pp. 381-390
    • Raman, V.1    Hellerstein, J.M.2
  • 33
    • 52949101231
    • Mining imperfect data: Dealing with contamination and incomplete records
    • ISBN-10: 0898715828, ISBN-13: 978-0898715828, April 1
    • Ronald K. Pearson, (2005) Mining imperfect data: Dealing with contamination and incomplete records, SIAM, Society for Industrial and Applied Mathematics, ISBN-10: 0898715828, ISBN-13: 978-0898715828, April 1, 2005.
    • (2005) SIAM, Society for Industrial and Applied Mathematics
    • Pearson, R.K.1
  • 36
    • 0031144150
    • Data quality in context
    • Diane M. Strong, Yang W. Lee, and Richard Y. Wang, (1997) Data quality in context, Commun. ACM 40, no. 5, 103-110.
    • (1997) Commun. ACM , vol.40 , Issue.5 , pp. 103-110
    • Strong, D.M.1    Lee, Y.W.2    Wang, R.Y.3
  • 37
    • 0012793677
    • Towards a methodology for statistical disclosure control
    • T. Dalenius, (1977) Towards a methodology for statistical disclosure control, Statistisk tidskrift 5.
    • (1977) Statistisk tidskrift , vol.5
    • Dalenius, T.1
  • 39
    • 78650272559
    • On association, similarity and dependency of attributes
    • Springer Verlag Berlin Heidelberg
    • Yi Yu Yao and Ning Zhong, (2000) On association, similarity and dependency of attributes, PAKDD, Springer Verlag Berlin Heidelberg, pp. 138-141.
    • (2000) PAKDD , pp. 138-141
    • Yao, Y.Y.1    Zhong, N.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.