Volume 2, Issue 1, 2009, Pages 1282-1293

Framework for evaluating clustering algorithms in duplicate detection

Author keywords

[No Author keywords available]

Indexed keywords

QUALITY CONTROL; STRINGERS;

EID: 72649086387     PISSN: None     EISSN: 21508097     Source Type: Conference Proceeding    
DOI: 10.14778/1687627.1687771     Document Type: Article
Times cited: 194

References (51)
  • 4
    • 4644233828
    • The Star Clustering Algorithm For Static And Dynamic Information Organization
    • J. A. Aslam, E. Pelekhov, and D. Rus. The Star Clustering Algorithm For Static And Dynamic Information Organization. Journal of Graph Algorithms and Applications, 8(1):95-129, 2004.
    • (2004) Journal of Graph Algorithms and Applications , vol.8 , Issue.1 , pp. 95-129
    • Aslam, J.A.1    Pelekhov, E.2    Rus, D.3
  • 9
    • 36348996876
    • Collective Entity Resolution in Relational Data
    • I. Bhattacharya and L. Getoor. Collective Entity Resolution in Relational Data. IEEE Data Engineering Bulletin, 29(2):4-12, 2006.
    • (2006) IEEE Data Engineering Bulletin , vol.29 , Issue.2 , pp. 4-12
    • Bhattacharya, I.1    Getoor, L.2
  • 11
    • 33751255087
    • Evaluation of Clustering Algorithms for Protein-Protein Interaction Networks
    • S. Brohee and J. van Helden. Evaluation of Clustering Algorithms for Protein-Protein Interaction Networks. BMC Bioinformatics, 7:488+, 2006.
    • (2006) BMC Bioinformatics , vol.7
    • Brohee, S.1    van Helden, J.2
  • 14
    • 4243136572
    • On a Recursive Spectral Algorithm for Clustering from Pairwise Similarities
    • Technical Report MIT-LCS-TR-906, MIT LCS
    • D. Cheng, R. Kannan, S. Vempala, and G. Wang. On a Recursive Spectral Algorithm for Clustering from Pairwise Similarities. Technical Report MIT-LCS-TR-906, MIT LCS, 2003.
    • (2003)
    • Cheng, D.1    Kannan, R.2    Vempala, S.3    Wang, G.4
  • 18
    • 0002546287
    • Efficient Algorithms for Agglomerative Hierarchical Clustering Methods
    • W. H. Day and H. Edelsbrunner. Efficient Algorithms for Agglomerative Hierarchical Clustering Methods. Journal of Classification, 1(1):7-24, 1984.
    • (1984) Journal of Classification , vol.1 , Issue.1 , pp. 7-24
    • Day, W.H.1    Edelsbrunner, H.2
  • 19
    • 33746868385
    • Correlation Clustering In General Weighted Graphs
    • E. D. Demaine, D. Emanuel, A. Fiat, and N. Immorlica. Correlation Clustering In General Weighted Graphs. Theor. Comput. Sci., 361(2):172-187, 2006.
    • (2006) Theor. Comput. Sci. , vol.361 , Issue.2 , pp. 172-187
    • Demaine, E.D.1    Emanuel, D.2    Fiat, A.3    Immorlica, N.4
  • 20
    • 0000891810
    • Algorithm for Solution of a Problem Of Maximum Flow in Networks with Power Estimation
    • E. A. Dinic. Algorithm for Solution of a Problem Of Maximum Flow in Networks with Power Estimation. Soviet Math. Dokl, 11:1277-1280, 1970.
    • (1970) Soviet Math. Dokl , vol.11 , pp. 1277-1280
    • Dinic, E.A.1
  • 21
    • 0015330635
    • Theoretical Improvements in Algorithmic Efficiency for Network Flow Problems
    • J. Edmonds and R. M. Karp. Theoretical Improvements in Algorithmic Efficiency for Network Flow Problems. J. ACM, 19(2):248-264, 1972.
    • (1972) J. ACM , vol.19 , Issue.2 , pp. 248-264
    • Edmonds, J.1    Karp, R.M.2
  • 25
    • 0001261128
    • Maximal Flow Through a Network
    • L. Ford and D. Fulkerson. Maximal Flow Through a Network. Canadian J. Math, 8:399-404, 1956.
    • (1956) Canadian J. Math , vol.8 , pp. 399-404
    • Ford, L.1    Fulkerson, D.2
  • 27
    • 0024090156
    • A New Approach to the Maximum-Flow Problem
    • A. V. Goldberg and R. E. Tarjan. A New Approach to the Maximum-Flow Problem. Journal of the ACM, 35(4):921-940, 1988.
    • (1988) Journal of the ACM , vol.35 , Issue.4 , pp. 921-940
    • Goldberg, A.V.1    Tarjan, R.E.2
  • 28
    • 0035676057
    • On clustering validation techniques
    • M. Halkidi, Y. Batistakis, and M. Vazirgiannis. On clustering validation techniques. Journal of Intelligent Information Systems, 17(2-3):107-145, 2001.
    • (2001) Journal of Intelligent Information Systems , vol.17 , Issue.2-3 , pp. 107-145
    • Halkidi, M.1    Batistakis, Y.2    Vazirgiannis, M.3
  • 29
    • 70349814264
    • Benchmarking Declarative Approximate Selection Predicates
    • Master's thesis, University of Toronto
    • O. Hassanzadeh. Benchmarking Declarative Approximate Selection Predicates. Master's thesis, University of Toronto, February 2007.
    • (2007)
    • Hassanzadeh, O.1
  • 30
    • 84865063889
    • Creating Probabilistic Databases from Duplicated Data
    • Technical Report CSRG-568, University of Toronto, To appear in The VLDB Journal, Accepted on 26 June 2009
    • O. Hassanzadeh and R. J. Miller. Creating Probabilistic Databases from Duplicated Data. Technical Report CSRG-568, University of Toronto, To appear in The VLDB Journal, Accepted on 26 June 2009.
    • (2009)
    • Hassanzadeh, O.1    Miller, R.J.2
  • 33
    • 0013331361
    • Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem
    • M. A. Hernández and S. J. Stolfo. Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem. Data Mining and Knowledge Discovery, 2(1):9-37, 1998.
    • (1998) Data Mining and Knowledge Discovery , vol.2 , Issue.1 , pp. 9-37
    • Hernández, M.A.1    Stolfo, S.J.2
  • 34
    • 0032729435
    • Exploring Expression Data: Identification and Analysis of Coexpressed Genes
    • L. J. Heyer, S. Kruglyak, and S. Yooseph. Exploring Expression Data: Identification and Analysis of Coexpressed Genes. Genome Res., 9(11):1106-1115, 1999.
    • (1999) Genome Res. , vol.9 , Issue.11 , pp. 1106-1115
    • Heyer, L.J.1    Kruglyak, S.2    Yooseph, S.3
  • 37
    • 4243128193
    • On clusterings: Good, bad and spectral
    • R. Kannan, S. Vempala, and A. Vetta. On clusterings: Good, bad and spectral. Journal of the ACM, 51(3):497-515, 2004.
    • (2004) Journal of the ACM , vol.51 , Issue.3 , pp. 497-515
    • Kannan, R.1    Vempala, S.2    Vetta, A.3
  • 38
    • 33750410010
    • A Framework for Protein Structure Classification and Identification of Novel Protein Structures
    • Y. J. Kim and J. M. Patel. A Framework for Protein Structure Classification and Identification of Novel Protein Structures. BMC Bioinformatics, 7:456+, 2006.
    • (2006) BMC Bioinformatics , vol.7
    • Kim, Y.J.1    Patel, J.M.2
  • 39
    • 10244276179
    • Graph Clustering with Restricted Neighbourhood Search
    • Master's thesis, University of Toronto
    • A. D. King. Graph Clustering with Restricted Neighbourhood Search. Master's thesis, University of Toronto, 2004.
    • (2004)
    • King, A.D.1
  • 41
    • 85011032600
    • VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams
    • Vienna, Austria
    • C. Li, B. Wang, and X. Yang. VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams. In Proc. of the Int'l Conf. on Very Large Data Bases (VLDB), pages 303-314, Vienna, Austria, 2007.
    • (2007) Proc. of the Int'l Conf. on Very Large Data Bases (VLDB) , pp. 303-314
    • Li, C.1    Wang, B.2    Yang, X.3
  • 42
    • 0001820920
    • X-Means: Extending K-Means with Efficient Estimation of the Number of Clusters
    • San Francisco, CA, USA
    • D. Pelleg and A. Moore. X-Means: Extending K-Means with Efficient Estimation of the Number of Clusters. In Proc. of the Int'l Conf. on Machine Learning, pages 727-734, San Francisco, CA, USA, 2000.
    • (2000) Proc. of the Int'l Conf. on Machine Learning , pp. 727-734
    • Pelleg, D.1    Moore, A.2
  • 44
    • 9444222292
    • The Information Bottleneck: Theory And Applications
    • PhD thesis, The Hebrew University
    • N. Slonim. The Information Bottleneck: Theory And Applications. PhD thesis, The Hebrew University, 2003.
    • (2003)
    • Slonim, N.1
  • 45
    • 1842435182
    • Correlation Clustering: Maximizing Agreements Via Semidefinite Programming
    • New Orleans, Louisiana, USA
    • C. Swamy. Correlation Clustering: Maximizing Agreements Via Semidefinite Programming. In Proc. of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 526-527, New Orleans, Louisiana, USA, 2004.
    • (2004) Proc. of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pp. 526-527
    • Swamy, C.1
  • 47
    • 0005924596
    • Graph Clustering By Flow Simulation
    • PhD thesis, University of Utrecht
    • S. van Dongen. Graph Clustering By Flow Simulation. PhD thesis, University of Utrecht, 2000.
    • (2000)
    • van Dongen, S.1
  • 48
    • 84865100525
    • Graph Clustering With Overlap
    • Master's thesis, University of Toronto
    • J. A. Whitney. Graph Clustering With Overlap. Master's thesis, University of Toronto, 2006.
    • (2006)
    • Whitney, J.A.1
  • 51
    • 0004045546
    • Clustering of Large Data Sets
    • Research Studies Press
    • J. Zupan. Clustering of Large Data Sets. Research Studies Press, 1982.
    • (1982)
    • Zupan, J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.