Volume 17, Issue 1, 2010, Pages 1-20

Towards improved assessment of functional similarity in large-scale screens: A study on indel length

Author keywords

Algorithms; Biochemical networks; Computational molecular biology; Gene expression; HMM; Machine learning; RNA; Secondary structure

Indexed keywords

ESCHERICHIA COLI PROTEIN;

EID: 77149139160     PISSN: 10665277     EISSN: None     Source Type: Journal    
DOI: 10.1089/cmb.2009.0031     Document Type: Article
Times cited : (3)

References (53)
  • 1
    • 0029889221
    • Local alignment statistics
    • Altschul, S.F., and Gish, W. 1996. Local alignment statistics. Methods Enzymol. 266, 460-480.
    • (1996) Methods Enzymol. , vol.266 , pp. 460-480
    • Altschul, S.F.1    Gish, W.2
  • 2
    • 0001619220
    • A phase transition for the score in matching random sequences allowing deletions
    • Arratia, R., and Waterman, M. 1994. A phase transition for the score in matching random sequences allowing deletions. Ann. Appl. Probabil. 4, 200-225.
    • (1994) Ann. Appl. Probabil. , vol.4 , pp. 200-225
    • Arratia, R.1    Waterman, M.2
  • 3
    • 77149123889
    • Constructions for clumps statistics
    • Available at: www.arxiv.org/abs/0804.3671. Accessed November 1, 2009
    • Bassino, F., Clement, J., Fayolle, J., et al. 2008. Constructions for clumps statistics. Proc. MathInfo'08 (to appear). Available at: www.arxiv.org/abs/0804.3671. Accessed November 1, 2009.
    • (2008) Proc. MathInfo'08 (To Appear)
    • Bassino, F.1    Clement, J.2    Fayolle, J.3
  • 4
    • 0027483434
    • Empirical and structural models for insertions and deletions in the divergent evolution of proteins
    • Benner, S.A., Cohen, M.A., and Gonnet, G.H. 1993. Empirical and structural models for insertions and deletions in the divergent evolution of proteins. J. Mol. Biol. 229, 1065-1082.
    • (1993) J. Mol. Biol. , vol.229 , pp. 1065-1082
    • Benner, S.A.1    Cohen, M.A.2    Gonnet, G.H.3
  • 5
    • 34547101942
    • Relationship between insertion/deletion (indel) frequency of proteins and essentiality
    • Chan, S.K., Hsing, M., Hormozdiari, F., et al. 2007. Relationship between insertion/deletion (indel) frequency of proteins and essentiality. BMC Bioinform. 8, 227.
    • (2007) BMC Bioinform. , vol.8 , pp. 227
    • Chan, S.K.1    Hsing, M.2    Hormozdiari, F.3
  • 6
    • 3342888069
    • Empirical analysis of protein insertions and deletions determining parameters for the correct placement of gaps in protein sequence alignments
    • Chang, M.S.S., and Benner, S.A. 2004. Empirical analysis of protein insertions and deletions determining parameters for the correct placement of gaps in protein sequence alignments. J. Mol. Biol. 341, 617-631.
    • (2004) J. Mol. Biol. , vol.341 , pp. 617-631
    • Chang, M.S.S.1    Benner, S.A.2
  • 7
    • 30144444463
    • Large-scale survey for potentially targetable indels in bacterial and protozoan proteins
    • Cherkasov, A., Lee, S.J., Nandan, D., et al. 2005a. Large-scale survey for potentially targetable indels in bacterial and protozoan proteins. Proteins 62, 371-380.
    • (2005) Proteins , vol.62 , pp. 371-380
    • Cherkasov, A.1    Lee, S.J.2    Nandan, D.3
  • 8
    • 13944253296
    • Selective targetting of indel-inferred differences in 3D structures of highly homologous proteins
    • Cherkasov, A., Nandan, D., and Reiner, N.E. 2005b. Selective targetting of indel-inferred differences in 3D structures of highly homologous proteins. Proteins 58, 950-954.
    • (2005) Proteins , vol.58 , pp. 950-954
    • Cherkasov, A.1    Nandan, D.2    Reiner, N.E.3
  • 9
    • 33847103869
    • Measuring semantic similarity between Gene Ontology terms
    • Couto, F.M., Silva, M.J., and Coutinho, P.M. 2007. Measuring semantic similarity between Gene Ontology terms. Data Knowledge Eng. 61, 137-152.
    • (2007) Data Knowledge Eng. , vol.61 , pp. 137-152
    • Couto, F.M.1    Silva, M.J.2    Coutinho, P.M.3
  • 10
    • 0000387249
    • Strong limit theorem of empirical functions for large exceedances of partial sums of i.i.d. variables
    • Dembo, A., and Karlin, S. 1991. Strong limit theorem of empirical functions for large exceedances of partial sums of i.i.d. variables. Ann. Probabil. 19, 1737-1755.
    • (1991) Ann. Probabil. , vol.19 , pp. 1737-1755
    • Dembo, A.1    Karlin, S.2
  • 11
    • 3943059570
    • High mutation rate and predominance of insertions in the Caenorhabditis elegans nuclear genome
    • Denver, D.R., Morris, K., Lynch, M., et al. 2004. High mutation rate and predominance of insertions in the Caenorhabditis elegans nuclear genome. Nature 430, 679-682.
    • (2004) Nature , vol.430 , pp. 679-682
    • Denver, D.R.1    Morris, K.2    Lynch, M.3
  • 13
    • 14644430471
    • ProbCons: Probabilistic consistency-based multiple sequence alignment
    • Do, C.B., Mahabhashyam, M.S., Brudno, M., et al. 2005. ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Res. 15, 330-340.
    • (2005) Genome Res. , vol.15 , pp. 330-340
    • Do, C.B.1    Mahabhashyam, M.S.2    Brudno, M.3
  • 15
    • 0028856911
    • Prediction of protein three-dimensional structures in insertion and deletion regions: A procedure for searching data bases of representative protein fragments using geometric scoring criteria
    • Fechteler, T., Dengler, U., and Schomburg, D. 1995. Prediction of protein three-dimensional structures in insertion and deletion regions: a procedure for searching data bases of representative protein fragments using geometric scoring criteria. J. Mol. Biol. 253, 114-131.
    • (1995) J. Mol. Biol. , vol.253 , pp. 114-131
    • Fechteler, T.1    Dengler, U.2    Schomburg, D.3
  • 16
    • 84950460234
    • Distribution theory of runs: A Markov chain approach
    • Fu, J.C., and Koutras, M.V. 1994. Distribution theory of runs: a Markov chain approach. J. Am. Statist. Assoc. 89, 1050-1058.
    • (1994) J. Am. Statist. Assoc. , vol.89 , pp. 1050-1058
    • Fu, J.C.1    Koutras, M.V.2
  • 17
    • 0034069495
    • Gene Ontology: Tool for the unification of biology
    • Gene Ontology (GO) Consortium
    • Gene Ontology (GO) Consortium. 2000. Gene Ontology: tool for the unification of biology. Nat. Genet. 25, 25-29.
    • (2000) Nat. Genet. , vol.25 , pp. 25-29
  • 18
    • 33746664765
    • Can sequence determine function?
    • reviews0005.1-0005.10
    • Gerlt, J.A., and Babbitt, P.C. 2000. Can sequence determine function? Genome Biol. 1, reviews0005.1-0005.10.
    • (2000) Genome Biol. , vol.1
    • Gerlt, J.A.1    Babbitt, P.C.2
  • 19
    • 0020484488
    • An improved algorithm for matching biological sequences
    • Gotoh, O. 1982. An improved algorithm for matching biological sequences. J. Mol. Biol. 162, 705-708.
    • (1982) J. Mol. Biol. , vol.162 , pp. 705-708
    • Gotoh, O.1
  • 20
    • 0028943946
    • The size distribution of insertions and deletions in human and rodent pseudogenes suggests the logarithmic gap penalty for sequence alignment
    • Gu, X., and Li, W.-H. 1995. The size distribution of insertions and deletions in human and rodent pseudogenes suggests the logarithmic gap penalty for sequence alignment. J. Mol. Evol. 40, 464-473.
    • (1995) J. Mol. Evol. , vol.40 , pp. 464-473
    • Gu, X.1    Li, W.-H.2
  • 21
    • 15944375449
    • Applications of hidden Markov models for characterization of homologous DNA sequences with a common gene
    • Hobolth, A., and Jensen, J.L. 2005. Applications of hidden Markov models for characterization of homologous DNA sequences with a common gene. J. Comput. Biol. 12, 186-203.
    • (2005) J. Comput. Biol. , vol.12 , pp. 186-203
    • Hobolth, A.1    Jensen, J.L.2
  • 22
    • 55449094978
    • Evidence of a large novel gene pool associated with prokaryotic genomic islands
    • Hsiao, W.W.L., Ung, K., Aeschliman, D., et al. 2005. Evidence of a large novel gene pool associated with prokaryotic genomic islands. PLoS Genet. 1, e62.
    • (2005) PLoS Genet. , vol.1
    • Hsiao, W.W.L.1    Ung, K.2    Aeschliman, D.3
  • 24
    • 29144483361
    • An HMM posterior decoder for sequence feature prediction that includes homology information
    • Käll, L., Krogh, A., and Sonnhammer, E.L. 2005. An HMM posterior decoder for sequence feature prediction that includes homology information. Bioinformatics 21, Suppl. 1, i251-i257.
    • (2005) Bioinformatics , vol.21 , Issue.SUPPL. 1
    • Käll, L.1    Krogh, A.2    Sonnhammer, E.L.3
  • 25
    • 0025259313
    • Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes
    • Karlin, S., and Altschul, S.F. 1990. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl. Acad. Sci. U.S.A. 87, 2264-2268.
    • (1990) Proc. Natl. Acad. Sci. U.S.A. , vol.87 , pp. 2264-2268
    • Karlin, S.1    Altschul, S.F.2
  • 26
    • 0141869092
    • Sequence alignments and pair hidden Markov models using evolutionary history
    • Knudsen, B., and Miyamoto, M.M. 2003. Sequence alignments and pair hidden Markov models using evolutionary history. J. Mol. Biol. 333, 453-460.
    • (2003) J. Mol. Biol. , vol.333 , pp. 453-460
    • Knudsen, B.1    Miyamoto, M.M.2
  • 27
    • 1042265186
    • Context of deletions and insertions in human coding sequences
    • Kondrashov, A.S., and Rogozin, I.B. 2004. Context of deletions and insertions in human coding sequences. Hum. Mutat. 23, 177-185.
    • (2004) Hum. Mutat. , vol.23 , pp. 177-185
    • Kondrashov, A.S.1    Rogozin, I.B.2
  • 28
    • 0033616714
    • Horizontal gene transfer among genomes: The complexity hypothesis
    • Lake, J.A., and Rivera, M.C. 1999. Horizontal gene transfer among genomes: the complexity hypothesis. Proc. Natl. Acad. Sci. U.S.A. 96, 3801-3806.
    • (1999) Proc. Natl. Acad. Sci. U.S.A. , vol.96 , pp. 3801-3806
    • Lake, J.A.1    Rivera, M.C.2
  • 29
    • 0005180705
    • An information-theoretic definition of similarity
    • Lin, D. 1998. An information-theoretic definition of similarity. Proc. 15th Int. Conf. Mach. Learn. 296-304.
    • (1998) Proc. 15th Int. Conf. Mach. Learn. , pp. 296-304
    • Lin, D.1
  • 30
    • 0037480738
    • Investigating semantic similarity measures across the Gene Ontology: The relationship between sequence and annotation
    • Lord, P.W., Stevens, R.D., Brass, A., et al. 2003. Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation. Bioinformatics 19, 1275-1283.
    • (2003) Bioinformatics , vol.19 , pp. 1275-1283
    • Lord, P.W.1    Stevens, R.D.2    Brass, A.3
  • 31
    • 34547830856
    • Probabilistic whole-genome alignments reveal high indel rates in the human and mouse genomes
    • Lunter, G. 2007. Probabilistic whole-genome alignments reveal high indel rates in the human and mouse genomes. Bioinformatics 23, i289-i296.
    • (2007) Bioinformatics , vol.23
    • Lunter, G.1
  • 32
    • 39049145326
    • Uncertainty in homology inferences: Assessing and improving genomic sequence alignment
    • doi:10.1101/gr.6725608
    • Lunter, G., Rocco, A., Mimouni, N., et al. 2007. Uncertainty in homology inferences: assessing and improving genomic sequence alignment. Genome Res. 18, doi:10.1101/gr.6725608.
    • (2007) Genome Res. , vol.18
    • Lunter, G.1    Rocco, A.2    Mimouni, N.3
  • 33
    • 0002064379
    • Maximum likelihood estimation of the statistical distribution of Smith-Waterman local sequence similarity scores
    • Mott, R. 1992. Maximum likelihood estimation of the statistical distribution of Smith-Waterman local sequence similarity scores. Bull. Math. Biol. 54, 59-75.
    • (1992) Bull. Math. Biol. , vol.54 , pp. 59-75
    • Mott, R.1
  • 34
    • 33847377101
    • Indel-based targeting of essential proteins in human pathogens that have close host orthologue(s): Discovery of selective inhibitors for Leishmania donovani elongation factor-1-a
    • Nandan, D., Lopez, M., Ban, F., et al. 2007. Indel-based targeting of essential proteins in human pathogens that have close host orthologue(s): discovery of selective inhibitors for Leishmania donovani elongation factor-1-a. Proteins 67, 53-67.
    • (2007) Proteins , vol.67 , pp. 53-67
    • Nandan, D.1    Lopez, M.2    Ban, F.3
  • 35
    • 0014757386
    • A general method applicable to the search for similarities in the amino acid sequence of two proteins
    • Needleman, S.B., and Wunsch, C.D. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443-453.
    • (1970) J. Mol. Biol. , vol.48 , pp. 443-453
    • Needleman, S.B.1    Wunsch, C.D.2
  • 37
    • 27644471132
    • SIMPROT: Using an empirically determined indel distribution in simulations of protein evolution
    • Pang, A., Smith, A.D., Nuin, P.A.S., et al. 2005. SIMPROT: using an empirically determined indel distribution in simulations of protein evolution. BMC Bioinform. 6, 236.
    • (2005) BMC Bioinform. , vol.6 , pp. 236
    • Pang, A.1    Smith, A.D.2    Nuin, P.A.S.3
  • 38
    • 0023989064
    • Improved tools for biological sequence comparison
    • Pearson, W.R., and Lipman, D.J. 1988. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. U.S.A. 85, 2444-2448.
    • (1988) Proc. Natl. Acad. Sci. U.S.A. , vol.85 , pp. 2444-2448
    • Pearson, W.R.1    Lipman, D.J.2
  • 39
    • 21844512398
    • A simple derivation of exact reliability formulas for linear and circular consecutive-k-of-n F systems
    • Peköz, E.A., and Ross, S.M. 1995. A simple derivation of exact reliability formulas for linear and circular consecutive-k-of-n F systems. J. Appl. Probabil. 32, 554-557.
    • (1995) J. Appl. Probabil. , vol.32 , pp. 554-557
    • Peköz, E.A.1    Ross, S.M.2
  • 41
    • 0346652457
    • ProClust: Improved clustering of protein sequences with an extended graph-based approach
    • Pipenbacher, P., Schliep, A., Schneckener, S., et al. 2002. ProClust: improved clustering of protein sequences with an extended graph-based approach. Bioinformatics 18, Supp. 2, 182-191.
    • (2002) Bioinformatics , vol.18 , Issue.SUPPL. 2 , pp. 182-191
    • Pipenbacher, P.1    Schliep, A.2    Schneckener, S.3
  • 42
    • 0035479653
    • Distribution of indel lengths
    • Qian, B., and Goldstein, R.A. 2001. Distribution of indel lengths. Proteins 45, 102-104.
    • (2001) Proteins , vol.45 , pp. 102-104
    • Qian, B.1    Goldstein, R.A.2
  • 43
    • 0024480920
    • Development and application of a metric on semantic nets
    • Rada, R., Mili, H., Bicknell, E., et al. 1989. Development and application of a metric on semantic nets. IEEE Trans. Syst. Man Cybernet. 19, 17-30.
    • (1989) IEEE Trans. Syst. Man Cybernet. , vol.19 , pp. 17-30
    • Rada, R.1    Mili, H.2    Bicknell, E.3
  • 44
    • 0002016474
    • Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language
    • Resnik, P. 1999. Semantic similarity in a taxonomy: an information-based measure and its application to problems of ambiguity in natural language. Artific. Intell. Res. 11, 95-130.
    • (1999) Artific. Intell. Res. , vol.11 , pp. 95-130
    • Resnik, P.1
  • 45
    • 0032962457
    • Twilight zone of protein sequence alignments
    • Rost, B. 1999. Twilight zone of protein sequence alignments. Protein Eng. 12, 85-94.
    • (1999) Protein Eng. , vol.12 , pp. 85-94
    • Rost, B.1
  • 46
    • 43349091196
    • Combining statistical alignment and phylogenetic footprinting to detect regulatory elements
    • doi:10.1093/bioinformatics/btn104
    • Satija, R., Pachter, L., and Hein, J. 2008. Combining statistical alignment and phylogenetic footprinting to detect regulatory elements. Bioinformatics doi:10.1093/bioinformatics/btn104.
    • (2008) Bioinformatics
    • Satija, R.1    Pachter, L.2    Hein, J.3
  • 47
    • 33748335463
    • A new measure for functional similarity of gene products based on gene ontology
    • Schlicker, A., Domingues, F.S., Rahnenführer, J., et al. 2006. A new measure for functional similarity of gene products based on gene ontology. BMC Bioinform. 7, 302.
    • (2006) BMC Bioinform. , vol.7 , pp. 302
    • Schlicker, A.1    Domingues, F.S.2    Rahnenführer, J.3
  • 49
    • 0033674329
    • Models of protein sequence evolution and their applications
    • Thorne, J.L. 2000. Models of protein sequence evolution and their applications. Curr. Opin. Genet. Dev. 10, 602-605.
    • (2000) Curr. Opin. Genet. Dev. , vol.10 , pp. 602-605
    • Thorne, J.L.1
  • 50
    • 0026079507
    • An evolutionary model for maximum likelihood alignment of DNA sequences
    • Thorne, J.L., Kishino, H., and Felsenstein, J. 1991. An evolutionary model for maximum likelihood alignment of DNA sequences. J. Mol. Evol. 33, 114-124.
    • (1991) J. Mol. Evol. , vol.33 , pp. 114-124
    • Thorne, J.L.1    Kishino, H.2    Felsenstein, J.3
  • 51
    • 0026528734
    • Inching toward reality: An improved likelihood model of sequence evolution
    • Thorne, J.L., Kishino, H., and Felsenstein, J. 1992. Inching toward reality: an improved likelihood model of sequence evolution. J. Mol. Evol. 34, 3-16.
    • (1992) J. Mol. Evol. , vol.34 , pp. 3-16
    • Thorne, J.L.1    Kishino, H.2    Felsenstein, J.3
  • 52
    • 33846041078
    • The Universal Protein Resource (UniProt)
    • The UniProt Consortium.
    • The UniProt Consortium. 2007. The Universal Protein Resource (UniProt). Nucleic Acids Res. 35, D193-D197.
    • (2007) Nucleic Acids Res. , vol.35
  • 53
    • 84972546031
    • Sequence comparison significance and Poisson approximation
    • Waterman, M.S., and Vingron, M. 1994. Sequence comparison significance and Poisson approximation. Statist. Sci. 9, 367-381.
    • (1994) Statist. Sci. , vol.9 , pp. 367-381
    • Waterman, M.S.1    Vingron, M.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.