Volume 32, Issue 8, 2007, Pages 1145-1165

Indexing schemes for similarity search in datasets of short protein fragments

Author keywords

Indexing; Protein fragments; Quasi metrics; Similarity search

Indexed keywords

ALGORITHMS; AMINO ACIDS; COMPUTATIONAL GEOMETRY; INDEXING (MATERIALS WORKING); PROTEINS

EID: 34548513988     PISSN: 03064379     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.is.2007.03.001     Document Type: Article
Times cited: 13

References (57)
  • 1
    • 84944328392
    • A database index to large biological sequences
    • Hunt E., Atkinson M.P., and Irving R.W. A database index to large biological sequences. VLDB J. 11 3 (2001) 139-148
    • (2001) VLDB J. , vol.11 , Issue.3 , pp. 139-148
    • Hunt, E.1    Atkinson, M.P.2    Irving, R.W.3
  • 2
    • 33645529104
    • Indexed searching on proteins using a suffix sequoia
    • Hunt E. Indexed searching on proteins using a suffix sequoia. IEEE Data Eng. Bull. 27 (2004) 24-31
    • (2004) IEEE Data Eng. Bull. , vol.27 , pp. 24-31
    • Hunt, E.1
  • 3
    • 0035024494
    • Efficient large-scale sequence comparison by locality-sensitive hashing
    • Buhler J. Efficient large-scale sequence comparison by locality-sensitive hashing. Bioinformatics 17 (2001) 419-428
    • (2001) Bioinformatics , vol.17 , pp. 419-428
    • Buhler, J.1
  • 4
    • 0036306157
    • SST: an algorithm for finding near-exact sequence matches in time proportional to the logarithm of the database size
    • Giladi E., Walker M.G., Wang J.Z., and Volkmuth W. SST: an algorithm for finding near-exact sequence matches in time proportional to the logarithm of the database size. Bioinformatics 18 6 (2002) 873-877
    • (2002) Bioinformatics , vol.18 , Issue.6 , pp. 873-877
    • Giladi, E.1    Walker, M.G.2    Wang, J.Z.3    Volkmuth, W.4
  • 5
    • 84942563927
    • R. Mao, W. Xu, N. Singh, D.P. Miranker, An assessment of a metric space database index to support sequence homology, in: Third IEEE International Symposium on BioInformatics and BioEngineering (BIBE 2003), Bethesda, Maryland, March 2003, pp. 375-384.
  • 6
    • 0036226603
    • BLAT-the BLAST-like alignment tool
    • Kent W.J. BLAT-the BLAST-like alignment tool. Genome Res. 12 4 (2002) 656-664
    • (2002) Genome Res. , vol.12 , Issue.4 , pp. 656-664
    • Kent, W.J.1
  • 7
    • 84872065533
    • Z. Tan, X. Cao, B.C. Ooi, A.K.H. Tung, The ed-tree: an index for large DNA sequence databases, in: SSDBM, 2003, pp. 151-160.
  • 8
    • 0000228203
    • A model of evolutionary change in proteins
    • Dayhoff M.O. (Ed), National Biomedical Research Foundation (Chapter 22)
    • Dayhoff M.O., Schwartz R.M., and Orcutt B.C. A model of evolutionary change in proteins. In: Dayhoff M.O. (Ed). Atlas of Protein Sequence and Structure vol. 5 (1978), National Biomedical Research Foundation 345-352 (Chapter 22)
    • (1978) Atlas of Protein Sequence and Structure , vol.5 , pp. 345-352
    • Dayhoff, M.O.1    Schwartz, R.M.2    Orcutt, B.C.3
  • 9
    • 0026458378
    • Amino acid substitution matrices from protein blocks
    • Henikoff S., and Henikoff J.G. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89 (1992) 10915-10919
    • (1992) Proc. Natl. Acad. Sci. USA , vol.89 , pp. 10915-10919
    • Henikoff, S.1    Henikoff, J.G.2
  • 10
    • 0027113212
    • Approximate string matching with q-grams and maximal matches
    • Ukkonen E. Approximate string matching with q-grams and maximal matches. Theoretical Computer Science 92 1 (1992) 191-211
    • (1992) Theoretical Computer Science , vol.92 , Issue.1 , pp. 191-211
    • Ukkonen, E.1
  • 11
    • 0035658023
    • HIV-1 and Ebola virus encode small peptide motifs that recruit Tsg101 to sites of particle assembly to facilitate egress
    • Martin-Serrano J., Zang T., and Bieniasz P.D. HIV-1 and Ebola virus encode small peptide motifs that recruit Tsg101 to sites of particle assembly to facilitate egress. Nat. Med. 7 (2001) 1313-1319
    • (2001) Nat. Med. , vol.7 , pp. 1313-1319
    • Martin-Serrano, J.1    Zang, T.2    Bieniasz, P.D.3
  • 12
    • 0038058742
    • Bioactive proteins and peptides from food sources. Applications of bioprocesses used in isolation and recovery
    • Kitts D.D., and Weiler K. Bioactive proteins and peptides from food sources. Applications of bioprocesses used in isolation and recovery. Curr. Pharm. Des. 9 (2003) 1309-1323
    • (2003) Curr. Pharm. Des. , vol.9 , pp. 1309-1323
    • Kitts, D.D.1    Weiler, K.2
  • 13
    • 33646775288
    • Indexing schemes for similarity search: an illustrated paradigm
    • Pestov V., and Stojmirović A. Indexing schemes for similarity search: an illustrated paradigm. Fund. Inf. 70 (2006) 367-385
    • (2006) Fund. Inf. , vol.70 , pp. 367-385
    • Pestov, V.1    Stojmirović, A.2
  • 14
    • 0014757386
    • A general method applicable to the search for similarities in the amino acid sequence of two proteins
    • Needleman S.B., and Wunsch C.D. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48 (1970) 443-453
    • (1970) J. Mol. Biol. , vol.48 , pp. 443-453
    • Needleman, S.B.1    Wunsch, C.D.2
  • 15
    • 0019887799
    • Identification of common molecular subsequences
    • Smith T.F., and Waterman M.S. Identification of common molecular subsequences. J. Mol. Biol. 147 (1981) 195-197
    • (1981) J. Mol. Biol. , vol.147 , pp. 195-197
    • Smith, T.F.1    Waterman, M.S.2
  • 18
    • 0037930646
    • Nonsymmetric distances and their associated topologies: about the origins of basic ideas in the area of asymmetric topology
    • Kluwer Academic Publishers, Dordrecht
    • Künzi H.-P.A. Nonsymmetric distances and their associated topologies: about the origins of basic ideas in the area of asymmetric topology. Handbook of the History of General Topology, History of Topology vol. 3 (2001), Kluwer Academic Publishers, Dordrecht 853-968
    • (2001) Handbook of the History of General Topology, History of Topology , vol.3 , pp. 853-968
    • Künzi, H.-P.A.1
  • 19
    • 34347334727
    • Quasi-metric spaces with measure
    • Stojmirović A. Quasi-metric spaces with measure. Topology Proc. 28 2 (2004) 655-671
    • (2004) Topology Proc. , vol.28 , Issue.2 , pp. 655-671
    • Stojmirović, A.1
  • 21
    • 25844504934
    • The representation of weighted quasi-metric spaces
    • Vitolo P. The representation of weighted quasi-metric spaces. Rend. Istit. Mat. Univ. Trieste 31 1-2 (1999) 95-100
    • (1999) Rend. Istit. Mat. Univ. Trieste , vol.31 , Issue.1-2 , pp. 95-100
    • Vitolo, P.1
  • 23
    • 0041664272
    • Index-driven similarity search in metric spaces
    • Hjaltason G.R., and Samet H. Index-driven similarity search in metric spaces. ACM Trans. Database Syst. 28 4 (2003) 517-580
    • (2003) ACM Trans. Database Syst. , vol.28 , Issue.4 , pp. 517-580
    • Hjaltason, G.R.1    Samet, H.2
  • 24
    • 34548480073
    • T.K. Sellis, N. Roussopoulos, C. Faloutsos, The R+-tree: a dynamic index for multi-dimensional objects, in: Proceedings of 13th International Conference on Very Large Data Bases (VLDB'87), Brighton, England, September 1987, pp. 507-518.
  • 25
    • 84947205653
    • K.S. Beyer, J. Goldstein, R. Ramakrishnan, U. Shaft, When is "nearest neighbor" meaningful?, in: Proceedings of Seventh International Conference on Database Theory (ICDT'99), Jerusalem, Israel, January 1999, pp. 217-235.
  • 26
    • 0033909182
    • On the geometry of similarity search: dimensionality curse and concentration of measure
    • Pestov V. On the geometry of similarity search: dimensionality curse and concentration of measure. Inf. Process. Lett. 73 (2000) 47-51
    • (2000) Inf. Process. Lett. , vol.73 , pp. 47-51
    • Pestov, V.1
  • 27
    • 34548503090
    • M. Gromov, Metric Structures for Riemannian and non-Riemannian Spaces, Progress in Mathematics, vol. 152, Birkhäuser, Basel, 1999.
  • 28
    • 34548482163
    • M. Ledoux, The Concentration of Measure Phenomenon, Mathematical Surveys and Monographs, vol. 89, American Mathematical Society, 2001.
  • 29
    • 84993661659
    • P. Ciaccia, M. Patella, P. Zezula, M-tree: an efficient access method for similarity search in metric spaces, in: Proceedings of 23rd International Conference on Very Large Data Bases (VLDB'97), Athens, Greece, August 1997, pp. 426-435.
  • 30
    • 0031162001
    • T. Bozkaya, Z.M. Özsoyoglu, Distance-based indexing for high-dimensional metric spaces, in: Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, Tucson, Arizona, May 1997, pp. 357-368.
  • 32
    • 33745602932
    • Searching in metric spaces with user-defined and approximate distances
    • Ciaccia P., and Patella M. Searching in metric spaces with user-defined and approximate distances. ACM Trans. Database Syst. 27 4 (2002) 398-437
    • (2002) ACM Trans. Database Syst. , vol.27 , Issue.4 , pp. 398-437
    • Ciaccia, P.1    Patella, M.2
  • 33
    • 0027188633
    • P.N. Yianilos, Data structures and algorithms for nearest neighbor search in general metric spaces, in: Proceedings of the Fourth Annual ACM/SIGACT-SIAM Symposium on Discrete Algorithms, Austin, Texas, January 1993.
  • 35
    • 38149018071
    • Patricia-practical algorithm to retrieve information coded in alphanumeric
    • Morrison D.R. Patricia-practical algorithm to retrieve information coded in alphanumeric. J. ACM 15 4 (1968) 514-534
    • (1968) J. ACM , vol.15 , Issue.4 , pp. 514-534
    • Morrison, D.R.1
  • 36
    • 85043482370
    • P. Weiner, Linear pattern matching algorithms, in: Proceedings of the 14th Annual Symposium on Switching and Automata Theory, IEEE, 1973, pp. 1-11.
  • 37
    • 0016942292
    • A space-economical suffix tree construction algorithm
    • McCreight E.M. A space-economical suffix tree construction algorithm. J. ACM 23 2 (1976) 262-272
    • (1976) J. ACM , vol.23 , Issue.2 , pp. 262-272
    • McCreight, E.M.1
  • 38
    • 34548496597
    • E. Ukkonen, Constructing suffix trees on-line in linear time, in: Proceedings of the IFIP 12th World Computer Congress on Algorithms, Software, Architecture-Information Processing '92, vol. 1, Madrid, Spain, September 1992, pp. 484-492.
  • 40
    • 0033227559
    • Reducing the space requirements of suffix trees
    • Kurtz S. Reducing the space requirements of suffix trees. Software Pract. Exp. 29 13 (1999) 1149-1171
    • (1999) Software Pract. Exp. , vol.29 , Issue.13 , pp. 1149-1171
    • Kurtz, S.1
  • 41
    • 0027681165
    • Suffix arrays: a new method for on-line string searches
    • Manber U., and Myers E.W. Suffix arrays: a new method for on-line string searches. SIAM J. Comput. 22 5 (1993) 935-948
    • (1993) SIAM J. Comput. , vol.22 , Issue.5 , pp. 935-948
    • Manber, U.1    Myers, E.W.2
  • 42
    • 0026656815
    • Exhaustive matching of the entire protein sequence database
    • Gonnet G.H., Cohen M.A., and Benner S.A. Exhaustive matching of the entire protein sequence database. Science 256 (1992) 1443-1445
    • (1992) Science , vol.256 , pp. 1443-1445
    • Gonnet, G.H.1    Cohen, M.A.2    Benner, S.A.3
  • 43
    • 0025141443
    • Automatic generation of primary sequence patterns from sets of related protein sequences
    • Smith R.F., and Smith T.F. Automatic generation of primary sequence patterns from sets of related protein sequences. Proc. Natl. Acad. Sci. USA 87 (1990) 118-122
    • (1990) Proc. Natl. Acad. Sci. USA , vol.87 , pp. 118-122
    • Smith, R.F.1    Smith, T.F.2
  • 44
    • 34548482162
    • H.H. Seward, Information sorting in the application of electronic digital computers to business operations, Master's Thesis, MIT, 1954.
  • 46
    • 0029906607
    • Dirichlet mixtures: a method for improving detection of weak but significant protein sequence homology
    • Sjölander K., Karplus K., Brown M., Hughey R., Krogh A., Mian I.S., and Haussler D. Dirichlet mixtures: a method for improving detection of weak but significant protein sequence homology. Comput. Appl. Biosci. 12 4 (1996) 327-345
    • (1996) Comput. Appl. Biosci. , vol.12 , Issue.4 , pp. 327-345
    • Sjölander, K.1    Karplus, K.2    Brown, M.3    Hughey, R.4    Krogh, A.5    Mian, I.S.6    Haussler, D.7
  • 48
    • 0033940118
    • RSDB: representative protein sequence databases have high information content
    • Park J., Holm L., Heger A., and Chothia C. RSDB: representative protein sequence databases have high information content. Bioinformatics 16 5 (2000) 458-464
    • (2000) Bioinformatics , vol.16 , Issue.5 , pp. 458-464
    • Park, J.1    Holm, L.2    Heger, A.3    Chothia, C.4
  • 51
    • 0035789678
    • Z. Bi, C. Faloutsos, F. Korn, The "DGX" distribution for mining massive, skewed data, in: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, California, August 2001, pp. 17-26.
  • 52
    • 34548495714
    • P. Ciaccia, M. Patella, Bulk loading the M-tree, in: Proceedings of the 9th Australasian Database Conference (ADC'98), Perth, Australia, February 1998, pp. 15-26.
  • 53
    • 84944324113
    • T. Kahveci, A.K. Singh, An efficient index structure for string databases, in: Proceedings of the 2001 VLDB Conference, pp. 351-360.
  • 54
    • 2442701936
    • A. Bhattacharya, T. Can, T. Kahveci, A.K. Singh, Y.-F. Wang, ProGreSS: simultaneous searching of protein databases by sequence and structure, in: Proceedings of the 2004 Pacific Symposium on Biocomputing, pp. 264-275.
  • 55
    • 84947556531
    • S. Burkhardt, J. Kärkkäinen, Better filtering with gapped q-grams, in: Combinatorial Pattern Matching, 2001, pp. 73-85.
  • 56
    • 2442603117
    • A hybrid indexing method for approximate string matching
    • Navarro G., and Baeza-Yates R. A hybrid indexing method for approximate string matching. J. Discret. Algorithms 1 1 (2000) 205-239
    • (2000) J. Discret. Algorithms , vol.1 , Issue.1 , pp. 205-239
    • Navarro, G.1    Baeza-Yates, R.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.