SCOPUS Information Search Platform

Proceedings of the ACM SIGMOD International Conference on Management of Data

Volume , Issue , 2010, Pages 63-74

Sampling dirty data for matching attributes

Authors (5): Köhler, Henning (a); Zhou, Xiaofang (a); Sadiq, Shazia (a); Shu, Yanfeng (b); Taylor, Kerry (b)

(a) University of Queensland (Australia)

(b) CSIRO (Australia)

Author keywords

database integration; sampling; schema matching

Indexed keywords

DATABASE INTEGRATION; DIRTY DATA; EFFICIENT ALGORITHM; REAL WORLD DATA; RELATIONAL DATABASE; SCHEMA MATCHING; SIMILARITY COMPUTATION; SIMILARITY MEASURE; TEST RESULTS; TWO STAGE; VALUE SETS

ALGORITHMS; COMPUTATIONAL EFFICIENCY; INTEGRATION

DATABASE SYSTEMS

EID: 77954738593; PISSN: 07308078; EISSN: None; Source Type: Conference Proceeding
DOI: 10.1145/1807167.1807177; Document Type: Conference Paper

Times cited: 20

References (27)

1
- 35448951563
- Data integration: The teenage years
- A. Y. Halevy, A. Rajaraman, and J. J. Ordille, "Data integration: The teenage years," in VLDB, 2006, pp. 9-16.
- (2006) VLDB , pp. 9-16
- Halevy, A.Y.¹ Rajaraman, A.² Ordille, J.J.³

2
- 0035657983
- A survey of approaches to automatic schema matching
- E. Rahm and P. A. Bernstein, "A survey of approaches to automatic schema matching," VLDB J., vol. 10, no. 4, pp. 334-350, 2001.
- (2001) VLDB J. , vol.10 , Issue.4 , pp. 334-350
- Rahm, E.¹ Bernstein, P.A.²

3
- 31444453796
- From databases to dataspaces: A new abstraction for information management
- M. J. Franklin, A. Y. Halevy, and D. Maier, "From databases to dataspaces: a new abstraction for information management," SIGMOD Record, vol. 34, no. 4, pp. 27-33, 2005.
- (2005) SIGMOD Record , vol.34 , Issue.4 , pp. 27-33
- Franklin, M.J.¹ Halevy, A.Y.² Maier, D.³

4
- 34250660624
- Principles of dataspace systems
- A. Y. Halevy, M. J. Franklin, and D. Maier, "Principles of dataspace systems," in PODS, 2006, pp. 1-9.
- (2006) PODS , pp. 1-9
- Halevy, A.Y.¹ Franklin, M.J.² Maier, D.³

5
- 0036366837
- Mining database structure; or, how to build a data quality browser
- T. Dasu, T. Johnson, S. Muthukrishnan, and V. Shkapenyuk, "Mining database structure; or, how to build a data quality browser," in SIGMOD, 2002, pp. 240-251.
- (2002) SIGMOD , pp. 240-251
- Dasu, T.¹ Johnson, T.² Muthukrishnan, S.³ Shkapenyuk, V.⁴

6
- 0031346696
- On the resemblance and containment of documents
- IEEE Computer Society
- A. Broder, "On the resemblance and containment of documents," in SEQUENCES: Proceedings of the Compression and Complexity of Sequences. IEEE Computer Society, 1997, p. 21.
- (1997) SEQUENCES: Proceedings of the Compression and Complexity of Sequences , p. 21
- Broder, A.¹

7
- 0010362121
- Syntactic clustering of the web
- A. Z. Broder, S. C. Glassman, M. S. Manasse, and G. Zweig, "Syntactic clustering of the web," Computer Networks, vol. 29, no. 8-13, pp. 1157-1166, 1997.
- (1997) Computer Networks , vol.29 , Issue.8-13 , pp. 1157-1166
- Broder, A.Z.¹ Glassman, S.C.² Manasse, M.S.³ Zweig, G.⁴

8
- 85043988965
- Finding similar files in a large file system
- U. Manber, "Finding similar files in a large file system," in USENIX Winter, 1994, pp. 1-10.
- (1994) USENIX Winter , pp. 1-10
- Manber, U.¹

9
- 79956075292
- Identifying and filtering near-duplicate documents
- A. Z. Broder, "Identifying and filtering near-duplicate documents," in CPM, 2000, pp. 1-10.
- (2000) CPM , pp. 1-10
- Broder, A.Z.¹

10
- 0003685012
- A Mathematical Theory of Communication
- CSLI Publications
- C. E. Shannon, A Mathematical Theory of Communication. CSLI Publications, 1948.
- (1948) A Mathematical Theory of Communication
- Shannon, C.E.¹

11
- 85011032600
- Vgram: Improving performance of approximate queries on string collections using variable-length grams
- C. Li, B. Wang, and X. Yang, "Vgram: Improving performance of approximate queries on string collections using variable-length grams," in VLDB, 2007, pp. 303-314.
- (2007) VLDB , pp. 303-314
- Li, C.¹ Wang, B.² Yang, X.³

12
- 34548738941
- Efficiently detecting inclusion dependencies
- J. Bauckmann, U. Leser, F. Naumann, and V. Tietz, "Efficiently detecting inclusion dependencies," in ICDE, 2007, pp. 1448-1450.
- (2007) ICDE , pp. 1448-1450
- Bauckmann, J.¹ Leser, U.² Naumann, F.³ Tietz, V.⁴

13
- 33845667955
- Duplicate record detection: A survey
- A. K. Elmagarmid, P. G. Ipeirotis, and V. S. Verykios, "Duplicate record detection: A survey," IEEE Trans. Knowl. Data Eng., vol. 19, no. 1, pp. 1-16, 2007.
- (2007) IEEE Trans. Knowl. Data Eng. , vol.19 , Issue.1 , pp. 1-16
- Elmagarmid, A.K.¹ Ipeirotis, P.G.² Verykios, V.S.³

14
- 0002368671
- The New Jersey data reduction report
- D. Barbará, W. Dumouchel, C. Faloutsos, P. J. Haas, J. M. Hellerstein, Y. Ioannidis, H. V. Jagadish, T. Johnson, R. Ng, V. Poosala, K. A. Ross, and K. C. Sevcik, "The New Jersey data reduction report," IEEE Data Engineering Bulletin, vol. 20, pp. 3-45, 1997.
- (1997) IEEE Data Engineering Bulletin , vol.20 , pp. 3-45
- Barbará, D.¹ Dumouchel, W.² Faloutsos, C.³ Haas, P.J.⁴ Hellerstein, J.M.⁵ Ioannidis, Y.⁶ Jagadish, H.V.⁷ Johnson, T.⁸ Ng, R.⁹ Poosala, V.¹⁰ Ross, K.A.¹¹ Sevcik, K.C.¹²

15
- 0002513261
- Random sampling from databases - A survey
- F. Olken and D. Rotem, "Random sampling from databases - a survey," Statistics and Computing, vol. 5, pp. 25-42, 1994.
- (1994) Statistics and Computing , vol.5 , pp. 25-42
- Olken, F.¹ Rotem, D.²

16
- 0003229927
- Schema mapping as query discovery
- R. J. Miller, L. M. Haas, and M. A. Hernández, "Schema mapping as query discovery," in VLDB, 2000, pp. 77-88.
- (2000) VLDB , pp. 77-88
- Miller, R.J.¹ Haas, L.M.² Hernández, M.A.³

17
- 3142720555
- iMAP: Discovering complex mappings between database schemas
- R. Dhamankar, Y. Lee, A. Doan, A. Y. Halevy, and P. Domingos, "iMAP: Discovering complex mappings between database schemas," in SIGMOD, 2004, pp. 383-394.
- (2004) SIGMOD , pp. 383-394
- Dhamankar, R.¹ Lee, Y.² Doan, A.³ Halevy, A.Y.⁴ Domingos, P.⁵

18
- 52749083110
- Validating multi-column schema matchings by type
- B. T. Dai, N. Koudas, D. Srivastava, A. K. H. Tung, and S. Venkatasubramanian, "Validating multi-column schema matchings by type," in ICDE, 2008, pp. 120-129.
- (2008) ICDE , pp. 120-129
- Dai, B.T.¹ Koudas, N.² Srivastava, D.³ Tung, A.K.H.⁴ Venkatasubramanian, S.⁵

19
- 0032091575
- Integration of heterogeneous databases without common domains using queries based on textual similarity
- W. W. Cohen, "Integration of heterogeneous databases without common domains using queries based on textual similarity," in SIGMOD, 1998, pp. 201-212.
- (1998) SIGMOD , pp. 201-212
- Cohen, W.W.¹

20
- 84944318804
- Approximate string joins in a database (almost) for free
- L. Gravano, P. G. Ipeirotis, H. V. Jagadish, N. Koudas, S. Muthukrishnan, and D. Srivastava, "Approximate string joins in a database (almost) for free," in VLDB, 2001, pp. 491-500.
- (2001) VLDB , pp. 491-500
- Gravano, L.¹ Ipeirotis, P.G.² Jagadish, H.V.³ Koudas, N.⁴ Muthukrishnan, S.⁵ Srivastava, D.⁶

21
- 0022821574
- Simple random sampling from relational databases
- F. Olken and D. Rotem, "Simple random sampling from relational databases," in VLDB, 1986, pp. 160-169.
- (1986) VLDB , pp. 160-169
- Olken, F.¹ Rotem, D.²

22
- 0030157210
- Bifocal sampling for skew-resistant join size estimation
- S. Ganguly, P. B. Gibbons, Y. Matias, and A. Silberschatz, "Bifocal sampling for skew-resistant join size estimation," in SIGMOD, 1996, pp. 271-281.
- (1996) SIGMOD , pp. 271-281
- Ganguly, S.¹ Gibbons, P.B.² Matias, Y.³ Silberschatz, A.⁴

23
- 0347761807
- On random sampling over joins
- S. Chaudhuri, R. Motwani, and V. R. Narasayya, "On random sampling over joins," in SIGMOD, 1999, pp. 263-274.
- (1999) SIGMOD , pp. 263-274
- Chaudhuri, S.¹ Motwani, R.² Narasayya, V.R.³

24
- 0040885649
- Congressional samples for approximate answering of group-by queries
- S. Acharya, P. B. Gibbons, and V. Poosala, "Congressional samples for approximate answering of group-by queries," in SIGMOD, 2000, pp. 487-498.
- (2000) SIGMOD , pp. 487-498
- Acharya, S.¹ Gibbons, P.B.² Poosala, V.³

25
- 3142697062
- Effective use of block-level sampling in statistics estimation
- S. Chaudhuri, G. Das, and U. Srivastava, "Effective use of block-level sampling in statistics estimation," in SIGMOD Conf., 2004, pp. 287-298.
- (2004) SIGMOD Conf. , pp. 287-298
- Chaudhuri, S.¹ Das, G.² Srivastava, U.³

26
- 3142745395
- A bi-level Bernoulli scheme for database sampling
- P. J. Haas and C. Koenig, "A bi-level Bernoulli scheme for database sampling," in SIGMOD, 2004, pp. 275-286.
- (2004) SIGMOD , pp. 275-286
- Haas, P.J.¹ Koenig, C.²

27
- 3142748410
- Query sampling in DB2 universal database
- J. Gryz, J. Guo, L. Liu, and C. Zuzarte, "Query sampling in DB2 universal database," in SIGMOD, 2004, pp. 839-843.
- (2004) SIGMOD , pp. 839-843
- Gryz, J.¹ Guo, J.² Liu, L.³ Zuzarte, C.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.