



Volume 47, Issue 2, 2010, Pages 264-276

A survey of the research on similarity query technique of sequence data

Author keywords

Distance distribution; Filtering technique; Sequence data; Similarity metric; Similarity query

Indexed keywords

APPLICATION FIELDS; BIOLOGICAL DATABASE; DISTANCE DISTRIBUTIONS; FILTERING TECHNIQUE; HOT RESEARCH TOPICS; KEY ISSUES; KEY TECHNIQUES; LARGE-SCALE DATASETS; LARGE-SCALE SEQUENCES; QUERY ALGORITHMS; RANDOM SEQUENCE; RESEARCH TRENDS; SCIENTIFIC COMPUTING; SEQUENCE DATA; SEQUENCE SIMILARITY; SIMILARITY METRICS; SIMILARITY QUERY; STATISTICAL INFORMATION; WEB ACCESS;

EID: 77950577191     PISSN: 10001239     EISSN: None     Source Type: Journal    
DOI: None     Document Type: Review
Times cited : (6)

References (77)
  • 3
    • 37349093328
    • DNA sequence data mining technique
    • in Chinese
    • Zhu Yangyong, Xiong Yun. DNA sequence data mining technique[J]. Journal of Software, 2007, 18(11): 2766-2781 (in Chinese)
    • (2007) Journal of Software , vol.18 , Issue.11 , pp. 2766-2781
    • Zhu, Y.1    Xiong, Y.2
  • 12
    • 84944324113
    • An efficient index structure for string databases
    • Apers P, Ceri S, Paraboschi S, et al. San Francisco: Morgan Kaufmann
    • Kahveci T, Singh A K. An efficient index structure for string databases[C] //Apers P, Ceri S, Paraboschi S, et al. Proc of the 27th Int Conf on Very Large Data Bases(VLDB 2001). San Francisco: Morgan Kaufmann, 2001: 351-360
    • (2001) Proc of the 27th Int Conf on Very Large Data Bases(VLDB 2001) , pp. 351-360
    • Kahveci, T.1    Singh, A.K.2
  • 13
    • 33744735729
    • A sequence similarity query processing technique based on two-partitioning frequency transformation
    • in Chinese
    • Wang Guoren, Ge Jian, Xu Hengyu, et al. A sequence similarity query processing technique based on two-partitioning frequency transformation[J]. Journal of Software, 2006, 17(2): 232-241 (in Chinese)
    • (2006) Journal of Software , vol.17 , Issue.2 , pp. 232-241
    • Wang, G.1    Ge, J.2    Xu, H.3
  • 14
    • 0027113212
    • Approximate string matching with q-gram and maximal matching
    • Ukkonen E. Approximate string matching with q-gram and maximal matching[J]. Theoretical Computer Science, 1992, 92(1): 191-211
    • (1992) Theoretical Computer Science , vol.92 , Issue.1 , pp. 191-211
    • Ukkonen, E.1
  • 15
    • 0001678047
    • Longest common subsequences of two random sequences
    • Chvatal V, Sankoff D. Longest common subsequences of two random sequences[J]. Journal of Applied Probability, 1975, 12(2): 306-315
    • (1975) Journal of Applied Probability , vol.12 , Issue.2 , pp. 306-315
    • Chvatal, V.1    Sankoff, D.2
  • 16
    • 0345566149
    • A guided tour to approximate string matching
    • Navarro G. A guided tour to approximate string matching[J]. ACM Computing Surveys, 2001, 33(1): 31-88
    • (2001) ACM Computing Surveys , vol.33 , Issue.1 , pp. 31-88
    • Navarro, G.1
  • 17
    • 0031363556
    • On the approximate pattern occurrence in a text
    • Los Alamitos: IEEE Computer Society
    • Regnier M, Szpankowski W. On the approximate pattern occurrence in a text[C] //Proc of Compression and Complexity of Sequences. Los Alamitos: IEEE Computer Society, 1997: 253-264
    • (1997) Proc of Compression and Complexity of Sequences , pp. 253-264
    • Regnier, M.1    Szpankowski, W.2
  • 18
    • 0001114905
    • Faster approximate string matching
    • Baeza-Yates R, Navarro G. Faster approximate string matching[J]. Algorithmica, 1999, 23(2): 127-158
    • (1999) Algorithmica , vol.23 , Issue.2 , pp. 127-158
    • Baeza-Yates, R.1    Navarro, G.2
  • 19
    • 77950558047
    • National Center for Biotechnology Information. GenBank database
    • National Center for Biotechnology Information. GenBank database[OL]. [2008-01-23]. http://www.ncbi.nlm.nih.gov/
  • 20
    • 0017547820
    • A fast string searching algorithm
    • Boyer R S, Moore J S. A fast string searching algorithm[J]. Communications of the ACM, 1977, 20(10): 762-772
    • (1977) Communications of the ACM , vol.20 , Issue.10 , pp. 762-772
    • Boyer, R.S.1    Moore, J.S.2
  • 24
    • 0016518897
    • Efficient string matching: An aid to bibliographic search
    • Aho A V, Corasick M J. Efficient string matching: An aid to bibliographic search[J]. Communications of the ACM, 1975, 18(6): 333-340
    • (1975) Communications of the ACM , vol.18 , Issue.6 , pp. 333-340
    • Aho, A.V.1    Corasick, M.J.2
  • 25
    • 8344251916
    • Deterministic memory efficient string matching algorithms for intrusion detection
    • Li V O K. Piscataway: IEEE
    • Tuck N, Sherwood T, Calder B, et al. Deterministic memory efficient string matching algorithms for intrusion detection[C] //Li V O K. Proc of the IEEE INFOCOM 2004. Piscataway: IEEE, 2004: 333-340
    • (2004) Proc of the IEEE INFOCOM 2004 , pp. 333-340
    • Tuck, N.1    Sherwood, T.2    Calder, B.3
  • 26
    • 0036200624
    • Improved algorithms for matching multiple patterns
    • in Chinese
    • Wang Yongcheng, Shen Zhou, Xu Yizhen. Improved algorithms for matching multiple patterns[J]. Journal of Computer Research and Development, 2002, 39(1): 55-60 (in Chinese)
    • (2002) Journal of Computer Research and Development , vol.39 , Issue.1 , pp. 55-60
    • Wang, Y.1    Shen, Z.2    Xu, Y.3
  • 28
    • 84976654685
    • Fast text searching allowing errors
    • Wu S, Manber U. Fast text searching allowing errors[J]. Communications of the ACM, 1992, 35(10): 83-91
    • (1992) Communications of the ACM , vol.35 , Issue.10 , pp. 83-91
    • Wu, S.1    Manber, U.2
  • 30
    • 0022030599
    • Efficient randomized pattern matching algorithms
    • Karp R, Rabin M. Efficient randomized pattern matching algorithms[J]. IBM Journal of Research and Development, 1987, 31(2): 249-260
    • (1987) IBM Journal of Research and Development , vol.31 , Issue.2 , pp. 249-260
    • Karp, R.1    Rabin, M.2
  • 32
    • 0025702286
    • An analysis of the Karp-Rabin string matching algorithm
    • Gonnet G, Baeza-Yates R. An analysis of the Karp-Rabin string matching algorithm[J]. Information Processing Letters, 1992, 34(5): 271-274
    • (1992) Information Processing Letters , vol.34 , Issue.5 , pp. 271-274
    • Gonnet, G.1    Baeza-Yates, R.2
  • 34
    • 0020494998
    • Algorithms for approximate string matching
    • Ukkonen E. Algorithms for approximate string matching[J]. Information and Control, 1985, 64(3): 100-118
    • (1985) Information and Control , vol.64 , Issue.3 , pp. 100-118
    • Ukkonen, E.1
  • 35
    • 0023012946
    • An O(ND) difference algorithm and its variations
    • Myers E. An O(ND) difference algorithm and its variations[J]. Algorithmica, 1986, 1(1): 251-266
    • (1986) Algorithmica , vol.1 , Issue.1 , pp. 251-266
    • Myers, E.1
  • 36
    • 27144540187
    • OASIS: An online and accurate technique for local-alignment searches on biological sequences
    • Freytag J C, Lockemann P C, Abiteboul S, et al. San Francisco: Morgan Kaufmann
    • Meek C, Patel J M, Kasetty S. OASIS: An online and accurate technique for local-alignment searches on biological sequences[C] //Freytag J C, Lockemann P C, Abiteboul S, et al. Proc of the 29th Int Conf on Very Large Data Bases(VLDB 2003). San Francisco: Morgan Kaufmann, 2003: 910-921
    • (2003) Proc of the 29th Int Conf on Very Large Data Bases(VLDB 2003) , pp. 910-921
    • Meek, C.1    Patel, J.M.2    Kasetty, S.3
  • 38
    • 84883284555
    • CoMRI: A compressed multi-resolution index structure for sequence similarity queries
    • Los Alamitos: IEEE Computer Society
    • Sun H, Ozturk O, Ferhatosmanoglu H. CoMRI: A compressed multi-resolution index structure for sequence similarity queries[C] //Proc of the Computational Systems Bioinformatics (CSB). Los Alamitos: IEEE Computer Society, 2003: 553-559
    • (2003) Proc of the Computational Systems Bioinformatics (CSB) , pp. 553-559
    • Sun, H.1    Ozturk, O.2    Ferhatosmanoglu, H.3
  • 41
    • 84993661659
    • M-Tree: An efficient access method for similarity search in metric spaces
    • Jarke M, Carey M J, Dittrich K R, et al. San Francisco: Morgan Kaufmann
    • Ciaccia P, Patella M, Zezula P. M-Tree: An efficient access method for similarity search in metric spaces[C] //Jarke M, Carey M J, Dittrich K R, et al. Proc of the 23rd Int Conf on Very Large Data Bases (VLDB'97). San Francisco: Morgan Kaufmann, 1997: 426-435
    • (1997) Proc of the 23rd Int Conf on Very Large Data Bases (VLDB'97) , pp. 426-435
    • Ciaccia, P.1    Patella, M.2    Zezula, P.3
  • 42
    • 0344065611
    • Distance based indexing for string proximity search
    • Dayal U, Ramamritham K, Vijayaraman. Los Alamitos: IEEE Computer Society
    • Sahinalp S C, Tasan M, Macker J, et al. Distance based indexing for string proximity search[C] //Dayal U, Ramamritham K, Vijayaraman. Proc of the 19th Int Conf on Data Engineering(ICDE). Los Alamitos: IEEE Computer Society, 2003: 125-136
    • (2003) Proc of the 19th Int Conf on Data Engineering(ICDE) , pp. 125-136
    • Sahinalp, S.C.1    Tasan, M.2    Macker, J.3
  • 43
    • 84939567221
    • Reference-based indexing of sequence databases
    • Dayal U, Whang K Y, Lomet D B, et al. New York: ACM
    • Venkateswaran J, Lachwani D, Kahveci T, et al. Reference-based indexing of sequence databases[C] //Dayal U, Whang K Y, Lomet D B, et al. Proc of the VLDB. New York: ACM, 2006: 906-917
    • (2006) Proc of the VLDB , pp. 906-917
    • Venkateswaran, J.1    Lachwani, D.2    Kahveci, T.3
  • 44
    • 46749129434
    • Reference-based indexing for metric spaces with costly distance measures
    • Venkateswaran J, Kahveci T, Jermaine C, et al. Reference-based indexing for metric spaces with costly distance measures[J]. The VLDB Journal, 2008, 17(5): 1231-1251
    • (2008) The VLDB Journal , vol.17 , Issue.5 , pp. 1231-1251
    • Venkateswaran, J.1    Kahveci, T.2    Jermaine, C.3
  • 47
    • 84944318804
    • Approximate string joins in a database (Almost) for free
    • Apers P, Atzeni P, Ceri S, et al. San Francisco: Morgan Kaufmann
    • Gravano L, Ipeirotis P, Jagadish H V, et al. Approximate string joins in a database (Almost) for free[C] //Apers P, Atzeni P, Ceri S, et al. Proc of the VLDB. San Francisco: Morgan Kaufmann, 2001: 491-500
    • (2001) Proc of the VLDB , pp. 491-500
    • Gravano, L.1    Ipeirotis, P.2    Jagadish, H.V.3
  • 48
    • 52649086729
    • Efficient merging and filtering algorithms for approximate string searches
    • Los Alamitos: IEEE Computer Society
    • Li C, Lu J H, Lu Y M. Efficient merging and filtering algorithms for approximate string searches[C] //Proc of the 24th Int Conf on Data Engineering (ICDE). Los Alamitos: IEEE Computer Society, 2008: 257-266
    • (2008) Proc of the 24th Int Conf on Data Engineering (ICDE) , pp. 257-266
    • Li, C.1    Lu, J.H.2    Lu, Y.M.3
  • 49
    • 70849105253
    • Ed-Join: An efficient algorithm for similarity joins with edit distance constraints
    • Trondheim, Norway: VLDB Endowment
    • Xiao C, Wang W, Lin X M. Ed-Join: An efficient algorithm for similarity joins with edit distance constraints[C] //Proc of the 34th Int Conf on Very Large Data Bases(VLDB). Trondheim, Norway: VLDB Endowment, 2008: 933-944
    • (2008) Proc of the 34th Int Conf on Very Large Data Bases(VLDB) , pp. 933-944
    • Xiao, C.1    Wang, W.2    Lin, X.M.3
  • 50
    • 84947737449
    • On using q-gram locations in approximate string matching
    • Spirakis P G. Berlin: Springer
    • Sutinen E, Tarhio J. On using q-gram locations in approximate string matching[C] //Spirakis P G. Proc of the 3rd Annual European Symp on Algorithms. Berlin: Springer, 1995: 327-340
    • (1995) Proc of the 3rd Annual European Symp on Algorithms , pp. 327-340
    • Sutinen, E.1    Tarhio, J.2
  • 52
    • 0036202921
    • PatternHunter: Faster and more sensitive homology search
    • Ma Bin, Tromp J, Li Ming. PatternHunter: Faster and more sensitive homology search[J]. Bioinformatics, 2002, 18(3): 440-445
    • (2002) Bioinformatics , vol.18 , Issue.3 , pp. 440-445
    • Ma, B.1    Tromp, J.2    Li, M.3
  • 54
    • 85011032600
    • VGRAM: Improving performance of approximate queries on string collections using variable-length grams
    • Koch C, Gehrke J, Garofalakis M N, et al. New York: ACM
    • Li C, Wang B, Yang X C. VGRAM: Improving performance of approximate queries on string collections using variable-length grams[C] //Koch C, Gehrke J, Garofalakis M N, et al. Proc of the 33rd Int Conf on Very Large Data Bases(VLDB). New York: ACM, 2007: 303-314
    • (2007) Proc of the 33rd Int Conf on Very Large Data Bases(VLDB) , pp. 303-314
    • Li, C.1    Wang, B.2    Yang, X.C.3
  • 55
    • 57149130672
    • Cost-based variable-length-gram selection for string collections to support approximate queries efficiently
    • Wang J T L. New York: ACM
    • Yang X C, Wang B, Li C. Cost-based variable-length-gram selection for string collections to support approximate queries efficiently[C] //Wang J T L. Proc of the ACM SIGMOD Int Conf on Management of Data (SIGMOD 2008). New York: ACM, 2008: 353-364
    • (2008) Proc of the ACM SIGMOD Int Conf on Management of Data (SIGMOD 2008) , pp. 353-364
    • Yang, X.C.1    Wang, B.2    Li, C.3
  • 56
    • 24644523647
    • Sequence distance embeddings
    • Coventry: University of Warwick
    • Cormode G. Sequence distance embeddings[D]. Coventry: University of Warwick, 2003
    • (2003)
    • Cormode, G.1
  • 57
    • 51249178997
    • On Lipschitz embedding of finite metric spaces in Hilbert space
    • Bourgain J. On Lipschitz embedding of finite metric spaces in Hilbert space[J]. Israel Journal of Mathematics, 1985, 52(1-2): 46-52
    • (1985) Israel Journal of Mathematics , vol.52 , Issue.1-2 , pp. 46-52
    • Bourgain, J.1
  • 58
    • 84976803260
    • FastMap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets
    • Carey M J, Schneider D A. New York: ACM
    • Faloutsos C, Lin D I. FastMap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets[C] //Carey M J, Schneider D A. Proc of the 1995 ACM SIGMOD Int Conf on Management of Data. New York: ACM, 1995: 163-174
    • (1995) Proc of the 1995 ACM SIGMOD Int Conf on Management of Data , pp. 163-174
    • Faloutsos, C.1    Lin, D.I.2
  • 60
    • 0010011718
    • Cluster-preserving embedding of proteins, 99-50
    • New Brunswick: Rutgers University
    • Hristescu G, Farach-Colton M. Cluster-preserving embedding of proteins, 99-50[R]. New Brunswick: Rutgers University, 1999
    • (1999)
    • Hristescu, G.1    Farach-Colton, M.2
  • 61
    • 0038706882
    • Contractive embedding methods for similarity searching in metric spaces, CS-TR-4102
    • College Park: University of Maryland
    • Hjaltason G R, Samet H. Contractive embedding methods for similarity searching in metric spaces, CS-TR-4102[R]. College Park: University of Maryland, 2000
    • (2000)
    • Hjaltason, G.R.1    Samet, H.2
  • 62
    • 0035162484
    • Tutorial: Algorithmic applications of low-distortion geometric embeddings
    • Los Alamitos: IEEE Computer Society
    • Indyk P. Tutorial: Algorithmic applications of low-distortion geometric embeddings[C] //Proc of the 42nd Annual Symp on Foundations of Computer Science(FOCS). Los Alamitos: IEEE Computer Society, 2001: 10-33
    • (2001) Proc of the 42nd Annual Symp on Foundations of Computer Science(FOCS) , pp. 10-33
    • Indyk, P.1
  • 65
    • 35348998047
    • Low distortion embeddings for edit distance
    • Ostrovsky R, Rabani Y. Low distortion embeddings for edit distance[J]. Journal of the ACM, 2007, 54(5): 1-16
    • (2007) Journal of the ACM , vol.54 , Issue.5 , pp. 1-16
    • Ostrovsky, R.1    Rabani, Y.2
  • 66
    • 4944266407
    • Detecting protein sequence conservation via metric embeddings
    • Halperin E, Buhler J, Karp R, et al. Detecting protein sequence conservation via metric embeddings[J]. Bioinformatics, 2003, 19(Suppl1): 122-129
    • (2003) Bioinformatics , vol.19 , Issue.1 SUPPL. , pp. 122-129
    • Halperin, E.1    Buhler, J.2    Karp, R.3
  • 68
    • 50249132044
    • Approximate similarity search in genomic sequence databases using landmark-guided embedding
    • Los Alamitos: IEEE Computer Society
    • Sacan A, Toroslu L H. Approximate similarity search in genomic sequence databases using landmark-guided embedding[C] //Proc of the 24th Int Conf on Data Engineering Workshops. Los Alamitos: IEEE Computer Society, 2008: 338-345
    • (2008) Proc of the 24th Int Conf on Data Engineering Workshops , pp. 338-345
    • Sacan, A.1    Toroslu, L.H.2
  • 70
    • 0001944742
    • Similarity search in high dimensions via hashing
    • Atkinson M P, Orlowska M E, Valduriez P, et al. San Francisco: Morgan Kaufmann
    • Gionis A, Indyk P, Motwani R. Similarity search in high dimensions via hashing[C] //Atkinson M P, Orlowska M E, Valduriez P, et al. Proc of the 25th Int Conf on Very Large Data Bases(VLDB). San Francisco: Morgan Kaufmann, 1999: 518-529
    • (1999) Proc of the 25th Int Conf on Very Large Data Bases(VLDB) , pp. 518-529
    • Gionis, A.1    Indyk, P.2    Motwani, R.3
  • 71
    • 0035024494
    • Efficient large-scale sequence comparison by locality-sensitive hashing
    • Buhler J. Efficient large-scale sequence comparison by locality-sensitive hashing[J]. Bioinformatics, 2001, 17(5): 419-428
    • (2001) Bioinformatics , vol.17 , Issue.5 , pp. 419-428
    • Buhler, J.1
  • 72
  • 76
    • 37149053738
    • An efficient parallel implementation of the hidden Markov methods for genomic sequence search on a massively parallel system
    • Jiang K, Thosen O, Peters A, et al. An efficient parallel implementation of the hidden Markov methods for genomic sequence search on a massively parallel system[J]. IEEE Trans on Parallel and Distributed Systems, 2008, 19(1): 15-23
    • (2008) IEEE Trans on Parallel and Distributed Systems , vol.19 , Issue.1 , pp. 15-23
    • Jiang, K.1    Thosen, O.2    Peters, A.3
  • 77
    • 43349092363
    • CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment
    • Manavski S A, Valle G. CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment[J]. BMC Bioinformatics, 2008, 9(Suppl 2): S10
    • (2008) BMC Bioinformatics , vol.9 , Issue.2 SUPPL.
    • Manavski, S.A.1    Valle, G.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.