Volume 7, Issue 3, 2011, Pages 341-347

A comparative study in classification techniques for unsupervised record linkage model

Author keywords

Bayesian network; Classification techniques; Data integration; Duplicate detection; Heterogeneous data; ID3 algorithm; Longest Common Subsequence (LCS); Optical character recognition (OCR); Record linkage; Support vector machines

EID: 80052842512     PISSN: 15493636     EISSN: None     Source Type: Journal    
DOI: 10.3844/jcssp.2011.341.347     Document Type: Article
Times cited: 12



* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.