Volume 22, Issue 4, 2010, Pages 578-589

Record matching over query results from multiple web databases

Author keywords

Data deduplication; Data integration; Duplicate detection; Query result record; Record linkage; Record matching; SVM; Web database

Indexed keywords

CLASSIFICATION (OF INFORMATION); DATA INTEGRATION; DATABASE SYSTEMS; ITERATIVE METHODS;

EID: 77649261370     PISSN: 10414347     EISSN: None     Source Type: Journal    
DOI: 10.1109/TKDE.2009.90     Document Type: Article
Times cited: 48

References (38)
  • 2
    • 85059500505
    • R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval. ACM Press, 1999.
  • 5
    • 77952372966
    • Adaptive Duplicate Detection Using Learnable String Similarity Measures
    • M. Bilenko and R.J. Mooney, "Adaptive Duplicate Detection Using Learnable String Similarity Measures," Proc. ACM SIGKDD, pp. 39-48, 2003.
    • (2003) Proc. ACM SIGKDD , pp. 39-48
    • Bilenko, M.1    Mooney, R.J.2
  • 6
    • 1142279457
    • Robust and Efficient Fuzzy Match for Online Data Cleaning
    • S. Chaudhuri, K. Ganjam, V. Ganti, and R. Motwani, "Robust and Efficient Fuzzy Match for Online Data Cleaning," Proc. ACM SIGMOD, pp. 313-324, 2003.
    • (2003) Proc. ACM SIGMOD , pp. 313-324
    • Chaudhuri, S.1    Ganjam, K.2    Ganti, V.3    Motwani, R.4
  • 8
    • 65449139594
    • Automatic Record Linkage Using Seeded Nearest Neighbour and Support Vector Machine Classification
    • P. Christen, "Automatic Record Linkage Using Seeded Nearest Neighbour and Support Vector Machine Classification," Proc. ACM SIGKDD, pp. 151-159, 2008.
    • (2008) Proc. ACM SIGKDD , pp. 151-159
    • Christen, P.1
  • 10
    • 33846428121
    • Quality and Complexity Measures for Data Linkage and Deduplication
    • P. Christen and K. Goiser, "Quality and Complexity Measures for Data Linkage and Deduplication," Quality Measures in Data Mining, F. Guillet and H. Hamilton, eds., vol. 43, pp. 127-151, Springer, 2007.
    • (2007) Quality Measures in Data Mining , vol.43 , pp. 127-151
    • Christen, P.1    Goiser, K.2
  • 12
    • 0242540438
    • Learning to Match and Cluster Large High-Dimensional Datasets for Data Integration
    • W.W. Cohen and J. Richman, "Learning to Match and Cluster Large High-Dimensional Datasets for Data Integration," Proc. ACM SIGKDD, pp. 475-480, 2002.
    • (2002) Proc. ACM SIGKDD , pp. 475-480
    • Cohen, W.W.1    Richman, J.2
  • 13
    • 85059499008
    • A. Culotta and A. McCallum, "A Conditional Model of Deduplication for Multi-Type Relational Data," Technical Report IR-443, Dept. of Computer Science, Univ. of Massachusetts Amherst, 2005.
  • 16
    • 29844452555
    • Reference Reconciliation in Complex Information Spaces
    • X. Dong, A. Halevy, and J. Madhavan, "Reference Reconciliation in Complex Information Spaces," Proc. ACM SIGMOD, pp. 85-96, 2005.
    • (2005) Proc. ACM SIGMOD , pp. 85-96
    • Dong, X.1    Halevy, A.2    Madhavan, J.3
  • 19
    • 33745213976
    • Automatic Complex Schema Matching Across Web Query Interfaces: A Correlation Mining Approach
    • B. He and K.C.-C. Chang, "Automatic Complex Schema Matching Across Web Query Interfaces: A Correlation Mining Approach," ACM Trans. Database Systems, vol. 31, no. 1, pp. 346-396, 2006.
    • (2006) ACM Trans. Database Systems , vol.31 , Issue.1 , pp. 346-396
    • He, B.1    Chang, K.C.-C.2
  • 20
    • 84976856849
    • The Merge/Purge Problem for Large Databases
    • M.A. Hernandez and S.J. Stolfo, "The Merge/Purge Problem for Large Databases," ACM SIGMOD Record, vol. 24, no. 2, pp. 127-138, 1995.
    • (1995) ACM SIGMOD Record , vol.24 , Issue.2 , pp. 127-138
    • Hernandez, M.A.1    Stolfo, S.J.2
  • 21
    • 84950419860
    • Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa, Florida
    • M.A. Jaro, "Advances in Record-Linkage Methodology as Applied to Matching the 1985 Census of Tampa, Florida," J. Am. Statistical Assoc., vol. 84, no. 406, pp. 414-420, 1989.
    • (1989) J. Am. Statistical Assoc , vol.84 , Issue.406 , pp. 414-420
    • Jaro, M.A.1
  • 23
    • 34250670467
    • Record Linkage: Similarity Measures and Algorithms (Tutorial)
    • N. Koudas, S. Sarawagi, and D. Srivastava, "Record Linkage: Similarity Measures and Algorithms (Tutorial)," Proc. ACM SIGMOD, pp. 802-803, 2006.
    • (2006) Proc. ACM SIGMOD , pp. 802-803
    • Koudas, N.1    Sarawagi, S.2    Srivastava, D.3
  • 25
    • 85096855936
    • One-Class SVMs for Document Classification
    • L.M. Manevitz and M. Yousef, "One-Class SVMs for Document Classification," J. Machine Learning Research, vol. 2, pp. 139-154, 2001.
    • (2001) J. Machine Learning Research , vol.2 , pp. 139-154
    • Manevitz, L.M.1    Yousef, M.2
  • 26
    • 85059497637
    • A. McCallum, "Cora Citation Matching," http://www.cs.umass.edu/~mccallum/data/cora-refs.tar.gz, 2004.
  • 27
    • 0034592784
    • Efficient Clustering of High-Dimensional Datasets with Application to Reference Matching
    • A. McCallum, K. Nigam, and L.H. Ungar, "Efficient Clustering of High-Dimensional Datasets with Application to Reference Matching," Proc. ACM SIGKDD, pp. 169-178, 2000.
    • (2000) Proc. ACM SIGKDD , pp. 169-178
    • McCallum, A.1    Nigam, K.2    Ungar, L.H.3
  • 28
    • 0242456811
    • Interactive Deduplication Using Active Learning
    • S. Sarawagi and A. Bhamidipaty, "Interactive Deduplication Using Active Learning," Proc. ACM SIGKDD, pp. 269-278, 2002.
    • (2002) Proc. ACM SIGKDD , pp. 269-278
    • Sarawagi, S.1    Bhamidipaty, A.2
  • 31
    • 0242456803
    • Learning Domain-Independent String Transformation Weights for High Accuracy Object Identification
    • S. Tejada, C.A. Knoblock, and S. Minton, "Learning Domain-Independent String Transformation Weights for High Accuracy Object Identification," Proc. ACM SIGKDD, pp. 350-359, 2002.
    • (2002) Proc. ACM SIGKDD , pp. 350-359
    • Tejada, S.1    Knoblock, C.A.2    Minton, S.3
  • 32
    • 1842353812
    • The Discrimination Power of Dependency Structures in Record Linkage
    • Y. Thibaudeau, "The Discrimination Power of Dependency Structures in Record Linkage," Survey Methodology, vol. 19, pp. 31-38, 1993.
    • (1993) Survey Methodology , vol.19 , pp. 31-38
    • Thibaudeau, Y.1
  • 33
    • 85059500543
    • V. Vapnik, The Nature of Statistical Learning Theory, second ed. Springer, 2000.
  • 34
    • 0038208065
    • A Bayesian Decision Model for Cost Optimal Record Matching
    • V.S. Verykios, G.V. Moustakides, and M.G. Elfeky, "A Bayesian Decision Model for Cost Optimal Record Matching," The VLDB J., vol. 12, no. 1, pp. 28-40, 2003.
    • (2003) The VLDB J , vol.12 , Issue.1 , pp. 28-40
    • Verykios, V.S.1    Moustakides, G.V.2    Elfeky, M.G.3
  • 35
    • 0002940254
    • Using the EM Algorithm for Weight Computation in the Fellegi-Sunter Model of Record Linkage
    • W.E. Winkler, "Using the EM Algorithm for Weight Computation in the Fellegi-Sunter Model of Record Linkage," Proc. Section Survey Research Methods, pp. 667-671, 1988.
    • (1988) Proc. Section Survey Research Methods , pp. 667-671
    • Winkler, W.E.1
  • 36
    • 0742268826
    • PEBL: Web Page Classification without Negative Examples
    • H. Yu, J. Han, and C.C. Chang, "PEBL: Web Page Classification without Negative Examples," IEEE Trans. Knowledge and Data Eng., vol. 16, no. 1, pp. 70-81, Jan. 2004.
    • (2004) IEEE Trans. Knowledge and Data Eng , vol.16 , Issue.1 , pp. 70-81
    • Yu, H.1    Han, J.2    Chang, C.C.3
  • 37
    • 33750797710
    • Structured Data Extraction from the Web Based on Partial Tree Alignment
    • Y. Zhai and B. Liu, "Structured Data Extraction from the Web Based on Partial Tree Alignment," IEEE Trans. Knowledge and Data Eng., vol. 18, no. 12, pp. 1614-1628, Dec. 2006.
    • (2006) IEEE Trans. Knowledge and Data Eng , vol.18 , Issue.12 , pp. 1614-1628
    • Zhai, Y.1    Liu, B.2


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.