Volume 7, Issue 1, 2006, Pages 2-24

Statistical significance in biological sequence analysis

Author keywords

E value; Multiple alignment; P value; Pairwise alignment; Probabilistic model; Profile; Sequence analysis; Statistical significance

Indexed keywords

DNA

EID: 33748809505     PISSN: 14675463     EISSN: 14774054     Source Type: Journal    
DOI: 10.1093/bib/bbk001     Document Type: Review
Times cited : (64)

References (118)
  • 1
    • 0032169271
    • Conservation of gene order: A fingerprint of proteins that physically interact
    • Dandekar T, Snel B, Huynen M, et al. Conservation of gene order: A fingerprint of proteins that physically interact. Trends Biochem Sci 1998;23:324-28.
    • (1998) Trends Biochem Sci , vol.23 , pp. 324-328
    • Dandekar, T.1    Snel, B.2    Huynen, M.3
  • 2
    • 0033523989
    • Protein interaction maps for complete genomes based on gene fusion events
    • Enright AJ, Iliopoulos I, Kyrpides NC, et al. Protein interaction maps for complete genomes based on gene fusion events. Nature 1999;402:86-90.
    • (1999) Nature , vol.402 , pp. 86-90
    • Enright, A.J.1    Iliopoulos, I.2    Kyrpides, N.C.3
  • 3
    • 0033394535
    • Genes regulated cooperatively by one or more transcription factors and their identification in whole eukaryotic genomes
    • Wagner A. Genes regulated cooperatively by one or more transcription factors and their identification in whole eukaryotic genomes. Bioinformatics 1999;15:776-84.
    • (1999) Bioinformatics , vol.15 , pp. 776-784
    • Wagner, A.1
  • 4
    • 0022360113
    • Significance of nucleotide sequence alignments: A method for random sequence permutation that preserves dinucleotide and codon usage
    • Altschul SF, Erickson BW. Significance of nucleotide sequence alignments: A method for random sequence permutation that preserves dinucleotide and codon usage. Mol Biol Evol 1985;2:526-38.
    • (1985) Mol Biol Evol , vol.2 , pp. 526-538
    • Altschul, S.F.1    Erickson, B.W.2
  • 5
    • 0026600016
    • Statistical analyses of counts and distributions of restriction sites in DNA sequences
    • Karlin S, Burge C, Campbell AM. Statistical analyses of counts and distributions of restriction sites in DNA sequences. Nucleic Acids Res 1992;20:1363-70.
    • (1992) Nucleic Acids Res , vol.20 , pp. 1363-1370
    • Karlin, S.1    Burge, C.2    Campbell, A.M.3
  • 6
    • 2742569191
    • Comparative statistics for DNA and protein sequences: Single-sequence analysis
    • Karlin S, Ghandour G. Comparative statistics for DNA and protein sequences: Single-sequence analysis. Proc Natl Acad Sci USA 1985;82:5800-4.
    • (1985) Proc Natl Acad Sci USA , vol.82 , pp. 5800-5804
    • Karlin, S.1    Ghandour, G.2
  • 7
    • 0038673215
    • Patterns in DNA and amino acid sequences and their statistical significance
    • In: Waterman MS (ed). Boca Raton: CRC Press
    • Karlin S, Ost F, Blaisdell BE. Patterns in DNA and amino acid sequences and their statistical significance. In: Waterman MS (ed). Mathematical Methods for DNA Sequences. Boca Raton: CRC Press, 1989;133-57.
    • (1989) Mathematical Methods for DNA Sequences , pp. 133-157
    • Karlin, S.1    Ost, F.2    Blaisdell, B.E.3
  • 8
    • 0021760111
    • On the statistical significance of nucleic acid similarities
    • Lipman DJ, Wilbur WJ, Smith TF, et al. On the statistical significance of nucleic acid similarities. Nucleic Acids Res 1984;12:215-26.
    • (1984) Nucleic Acids Res , vol.12 , pp. 215-226
    • Lipman, D.J.1    Wilbur, W.J.2    Smith, T.F.3
  • 9
    • 0028289467
    • Issues in searching molecular sequence databases
    • Altschul SF, Boguski MS, Gish W, et al. Issues in searching molecular sequence databases. Nature Genet 1994;6:119-29.
    • (1994) Nature Genet , vol.6 , pp. 119-129
    • Altschul, S.F.1    Boguski, M.S.2    Gish, W.3
  • 10
    • 0040260756
    • Sequence alignments
    • In: Waterman MS (ed). Boca Raton: CRC Press
    • Waterman MS. Sequence alignments. In: Waterman MS (ed). Mathematical Methods for DNA Sequences. Boca Raton: CRC Press, 1989;53-92.
    • (1989) Mathematical Methods for DNA Sequences , pp. 53-92
    • Waterman, M.S.1
  • 11
    • 0041748378
    • Consensus patterns in sequences
    • In: Waterman MS (ed). Boca Raton: CRC Press
    • Waterman MS. Consensus patterns in sequences. In: Waterman MS (ed). Mathematical Methods for DNA Sequences. Boca Raton: CRC Press, 1989;93-115.
    • (1989) Mathematical Methods for DNA Sequences , pp. 93-115
    • Waterman, M.S.1
  • 13
    • 0030449157
    • The statistical significance of nucleotide position-weight matrix matches
    • Claverie JM, Audic S. The statistical significance of nucleotide position-weight matrix matches. CABIOS 1996;12:431-39.
    • (1996) CABIOS , vol.12 , pp. 431-439
    • Claverie, J.M.1    Audic, S.2
  • 14
    • 0034647416
    • Accurate formula for p-values of gapped local sequence and profile alignments
    • Mott R. Accurate formula for p-values of gapped local sequence and profile alignments. J Mol Biol 2000;300: 649-59.
    • (2000) J Mol Biol , vol.300 , pp. 649-659
    • Mott, R.1
  • 15
    • 0032512799
    • Empirical statistical estimates for sequence similarity searches
    • Pearson WR. Empirical statistical estimates for sequence similarity searches. J Mol Biol 1998;276:71-84.
    • (1998) J Mol Biol , vol.276 , pp. 71-84
    • Pearson, W.R.1
  • 16
    • 84972546031
    • Sequence comparison significance and Poisson approximation
    • Waterman MS, Vingron M. Sequence comparison significance and Poisson approximation. Statist Sci 1994;9: 367-81.
    • (1994) Statist Sci , vol.9 , pp. 367-381
    • Waterman, M.S.1    Vingron, M.2
  • 17
    • 0021760529
    • On the statistical assessment of similarities in DNA sequences
    • Reich JG, Drabsch H, Däumler A. On the statistical assessment of similarities in DNA sequences. Nucleic Acids Res 1984;12:5529-43.
    • (1984) Nucleic Acids Res , vol.12 , pp. 5529-5543
    • Reich, J.G.1    Drabsch, H.2    Däumler, A.3
  • 18
    • 0028234758
    • Rapid and accurate estimates of statistical significance for sequence database searches
    • Waterman MS, Vingron M. Rapid and accurate estimates of statistical significance for sequence database searches. Proc Natl Acad Sci USA 1994;91:4625-28.
    • (1994) Proc Natl Acad Sci USA , vol.91 , pp. 4625-4628
    • Waterman, M.S.1    Vingron, M.2
  • 20
    • 15844371193
    • CIS: Compound importance sampling method for protein-DNA binding site p-value estimation
    • Barash Y, Elidan G, Kaplan T, et al. CIS: Compound importance sampling method for protein-DNA binding site p-value estimation. Bioinformatics 2005;21:596-600.
    • (2005) Bioinformatics , vol.21 , pp. 596-600
    • Barash, Y.1    Elidan, G.2    Kaplan, T.3
  • 22
    • 0033563350
    • Significance of Z-value statistics of Smith-Waterman scores for protein alignments
    • Comet JP, Aude JC, Glémet E, et al. Significance of Z-value statistics of Smith-Waterman scores for protein alignments. Computers Chem 1999;23:317-31.
    • (1999) Computers Chem , vol.23 , pp. 317-331
    • Comet, J.P.1    Aude, J.C.2    Glémet, E.3
  • 23
    • 1542505031
    • Fundamentals of massive automatic pairwise alignments of protein sequences: Theoretical significance of Z-value statistics
    • Bastien O, Aude JC, Roy S, et al. Fundamentals of massive automatic pairwise alignments of protein sequences: Theoretical significance of Z-value statistics. Bioinformatics 2004;20:534-37.
    • (2004) Bioinformatics , vol.20 , pp. 534-537
    • Bastien, O.1    Aude, J.C.2    Roy, S.3
  • 24
    • 0022431785
    • The statistical distribution of nucleic acid similarities
    • Smith TF, Waterman MS, Burks C. The statistical distribution of nucleic acid similarities. Nucleic Acids Res 1985;13:645-56.
    • (1985) Nucleic Acids Res , vol.13 , pp. 645-656
    • Smith, T.F.1    Waterman, M.S.2    Burks, C.3
  • 25
    • 0032826179
    • Identifying DNA and protein patterns with statistically significant alignments of multiple sequences
    • Hertz GZ, Stormo GD. Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics 1999;15:563-77.
    • (1999) Bioinformatics , vol.15 , pp. 563-577
    • Hertz, G.Z.1    Stormo, G.D.2
  • 26
    • 0002064379
    • Maximum-likelihood estimation of the statistical distribution of Smith-Waterman local sequence similarity scores
    • Mott R. Maximum-likelihood estimation of the statistical distribution of Smith-Waterman local sequence similarity scores. Bull Math Biol 1992;54:59-75.
    • (1992) Bull Math Biol , vol.54 , pp. 59-75
    • Mott, R.1
  • 27
    • 0029889221
    • Local alignment statistics
    • Altschul SF, Gish W. Local alignment statistics. Meth Enzymol 1996;266:460-80.
    • (1996) Meth Enzymol , vol.266 , pp. 460-480
    • Altschul, S.F.1    Gish, W.2
  • 28
    • 0034125366
    • Probabilistic and statistical properties of words: An overview
    • Reinert G, Schbath S, Waterman MS. Probabilistic and statistical properties of words: An overview. J Comp Biol 2000;7:1-46.
    • (2000) J Comp Biol , vol.7 , pp. 1-46
    • Reinert, G.1    Schbath, S.2    Waterman, M.S.3
  • 29
    • 0003304127
    • Some statistical aspects of the primary structure of nucleotide sequences
    • In: Waterman MS (ed). Boca Raton: CRC Press
    • Tavaré S, Giddins BW. Some statistical aspects of the primary structure of nucleotide sequences. In: Waterman MS (ed). Mathematical Methods for DNA Sequences. Boca Raton: CRC Press, 1989;117-33.
    • (1989) Mathematical Methods for DNA Sequences , pp. 117-133
    • Tavaré, S.1    Giddins, B.W.2
  • 30
    • 0001695936
    • Statistical patterns in the primary structures of functional regions in the genome of E. coli. 3. Computer recognition of coding regions
    • Borodovskii MY, Sprizhitskii YA, Golovanov EI, et al. Statistical patterns in the primary structures of functional regions in the genome of E. coli. 3. Computer recognition of coding regions. Mol Biol 1986;20:1144-50.
    • (1986) Mol Biol , vol.20 , pp. 1144-1150
    • Borodovskii, M.Y.1    Sprizhitskii, Y.A.2    Golovanov, E.I.3
  • 31
    • 0002916795
    • Statistical patterns in the primary structures of functional regions in the genome of E. coli. 1. Frequency characteristics
    • Borodovskii MY, Sprizhitskii YA, Golovanov EI, et al. Statistical patterns in the primary structures of functional regions in the genome of E. coli. 1. Frequency characteristics. Mol Biol 1986;20:826-33.
    • (1986) Mol Biol , vol.20 , pp. 826-833
    • Borodovskii, M.Y.1    Sprizhitskii, Y.A.2    Golovanov, E.I.3
  • 32
    • 0002918043
    • Statistical patterns in the primary structures of functional regions in the genome of E. coli. 2. Nonuniform Markov models
    • Borodovskii MY, Sprizhitskii YA, Golovanov EI, et al. Statistical patterns in the primary structures of functional regions in the genome of E. coli. 2. Nonuniform Markov models. Mol Biol 1986;20:833-40.
    • (1986) Mol Biol , vol.20 , pp. 833-840
    • Borodovskii, M.Y.1    Sprizhitskii, Y.A.2    Golovanov, E.I.3
  • 33
    • 0024584990
    • Codon preference and primary sequence structure in protein-coding regions
    • Tavaré S, Song B. Codon preference and primary sequence structure in protein-coding regions. Bull Math Biol 1989; 51: 95-115.
    • (1989) Bull Math Biol , vol.51 , pp. 95-115
    • Tavaré, S.1    Song, B.2
  • 34
    • 0347503105
    • Statistical methods for DNA sequence segmentation
    • Braun JV, Müller HG. Statistical methods for DNA sequence segmentation. Statist Sci 1998;13:142-62.
    • (1998) Statist Sci , vol.13 , pp. 142-162
    • Braun, J.V.1    Müller, H.G.2
  • 35
    • 0031586003
    • Prediction of complete gene structures in human genomic DNA
    • Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol 1997;268:78-94.
    • (1997) J Mol Biol , vol.268 , pp. 78-94
    • Burge, C.1    Karlin, S.2
  • 36
    • 0027944605
    • A hidden Markov model that finds genes in E. coli DNA
    • Krogh A, Mian IS, Haussler D. A hidden Markov model that finds genes in E. coli DNA. Nucleic Acids Res 1994;22: 4768-78.
    • (1994) Nucleic Acids Res , vol.22 , pp. 4768-4778
    • Krogh, A.1    Mian, I.S.2    Haussler, D.3
  • 37
    • 0032519353
    • GeneMark.hmm: New solutions for gene finding
    • Lukashin AV, Borodovsky M. GeneMark.hmm: New solutions for gene finding. Nucleic Acids Res 1998;26: 1107-15.
    • (1998) Nucleic Acids Res , vol.26 , pp. 1107-1115
    • Lukashin, A.V.1    Borodovsky, M.2
  • 38
    • 0030801002
    • Gapped BLAST and PSI-BLAST: A new generation of protein database search programs
    • Altschul SF, Madden TL, Schäffer AA. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res 1997;25:3389-3402.
    • (1997) Nucleic Acids Res , vol.25 , pp. 3389-3402
    • Altschul, S.F.1    Madden, T.L.2    Schäffer, A.A.3
  • 39
    • 0026008859
    • Distribution of glutamine and asparagine residues and their near neighbors in peptides and proteins
    • Robinson AB, Robinson LR. Distribution of glutamine and asparagine residues and their near neighbors in peptides and proteins. Proc Natl Acad Sci USA 1991;88:8880-84.
    • (1991) Proc Natl Acad Sci USA , vol.88 , pp. 8880-8884
    • Robinson, A.B.1    Robinson, L.R.2
  • 40
    • 0035878724
    • Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements
    • Schäffer AA, Aravind L, Madden TL, et al. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res 2001;29:2994-3005.
    • (2001) Nucleic Acids Res , vol.29 , pp. 2994-3005
    • Schäffer, A.A.1    Aravind, L.2    Madden, T.L.3
  • 41
  • 42
    • 0026913444
    • Poisson, compound Poisson and process approximations for testing statistical significance in sequence comparisons
    • Goldstein L, Waterman MS. Poisson, compound Poisson and process approximations for testing statistical significance in sequence comparisons. Bull Math Biol 1992;54:785-812.
    • (1992) Bull Math Biol , vol.54 , pp. 785-812
    • Goldstein, L.1    Waterman, M.S.2
  • 43
    • 0025259313
    • Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes
    • Karlin S, Altschul SF. Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc Natl Acad Sci USA 1990;87: 2264-68.
    • (1990) Proc Natl Acad Sci USA , vol.87 , pp. 2264-2268
    • Karlin, S.1    Altschul, S.F.2
  • 44
    • 0000960430
    • The Erdös-Rényi law in distribution, for coin tossing and sequence matching
    • Arratia R, Gordon L, Waterman MS. The Erdös-Rényi law in distribution, for coin tossing and sequence matching. Ann Statist 1990;18:539-70.
    • (1990) Ann Statist , vol.18 , pp. 539-570
    • Arratia, R.1    Gordon, L.2    Waterman, M.S.3
  • 45
    • 0037100636
    • Statistical significance of clusters of motifs represented by position specific scoring matrices in nucleotide sequences
    • Frith MC, Spouge JL, Hansen U, et al. Statistical significance of clusters of motifs represented by position specific scoring matrices in nucleotide sequences. Nucleic Acids Res 2002;30: 3214-24.
    • (2002) Nucleic Acids Res , vol.30 , pp. 3214-3224
    • Frith, M.C.1    Spouge, J.L.2    Hansen, U.3
  • 46
    • 16644364007
    • In silico representation and discovery of transcription factor binding sites
    • Pavesi G, Mauri G, Pesole G. In silico representation and discovery of transcription factor binding sites. Brief Bioinform 2004;5:217-36.
    • (2004) Brief Bioinform , vol.5 , pp. 217-236
    • Pavesi, G.1    Mauri, G.2    Pesole, G.3
  • 47
    • 0029792949
    • Poisson process approximation for sequence repeats, and sequencing by hybridization
    • Arratia R, Martin D, Reinert G, et al. Poisson process approximation for sequence repeats, and sequencing by hybridization. J Comp Biol 1996;3:425-63.
    • (1996) J Comp Biol , vol.3 , pp. 425-463
    • Arratia, R.1    Martin, D.2    Reinert, G.3
  • 48
    • 2342647392
    • Statistical analysis of over-represented words in human promoter sequences
    • Mariño-Ramírez L, Spouge JL, Kanga GC, et al. Statistical analysis of over-represented words in human promoter sequences. Nucleic Acids Res 2004;32:949-58.
    • (2004) Nucleic Acids Res , vol.32 , pp. 949-958
    • Mariño-Ramírez, L.1    Spouge, J.L.2    Kanga, G.C.3
  • 49
    • 0037115891
    • Discovery of novel transcription factor binding sites by statistical overrepresentation
    • Sinha S, Tompa M. Discovery of novel transcription factor binding sites by statistical overrepresentation. Nucleic Acids Res 2002;30:5549-60.
    • (2002) Nucleic Acids Res , vol.30 , pp. 5549-5560
    • Sinha, S.1    Tompa, M.2
  • 50
    • 12744271927
    • DWE: Discriminating word enumerator
    • Sumazin P, Chen GX, Hata N, et al. DWE: Discriminating word enumerator. Bioinformatics 2005;21:31-38.
    • (2005) Bioinformatics , vol.21 , pp. 31-38
    • Sumazin, P.1    Chen, G.X.2    Hata, N.3
  • 51
    • 0037342499
    • Alignment-free sequence-comparison - A review
    • Vinga S, Almeida J. Alignment-free sequence-comparison - a review. Bioinformatics 2003;19:513-23.
    • (2003) Bioinformatics , vol.19 , pp. 513-523
    • Vinga, S.1    Almeida, J.2
  • 52
    • 0034048881
    • An overview on the distribution of word counts in Markov chains
    • Schbath S. An overview on the distribution of word counts in Markov chains. J Comp Biol 2000;7:193-201.
    • (2000) J Comp Biol , vol.7 , pp. 193-201
    • Schbath, S.1
  • 53
    • 0034786443
    • Numerical comparison of several approximations of the word count distribution in random sequences
    • Robin S, Schbath S. Numerical comparison of several approximations of the word count distribution in random sequences. J Comp Biol 2001;8:349-59.
    • (2001) J Comp Biol , vol.8 , pp. 349-359
    • Robin, S.1    Schbath, S.2
  • 54
    • 0026718403
    • Chance and statistical significance in protein and DNA sequence analysis
    • Karlin S, Brendel V. Chance and statistical significance in protein and DNA sequence analysis. Science 1992;257: 39-49.
    • (1992) Science , vol.257 , pp. 39-49
    • Karlin, S.1    Brendel, V.2
  • 55
    • 0001318108
    • Statistical composition of high-scoring segments from molecular sequences
    • Karlin S, Dembo A, Kawabata T. Statistical composition of high-scoring segments from molecular sequences. Ann Statist 1990;18:571-81.
    • (1990) Ann Statist , vol.18 , pp. 571-581
    • Karlin, S.1    Dembo, A.2    Kawabata, T.3
  • 56
    • 0001506434
    • Extreme values in the GI/G/1 queue
    • Iglehart DL. Extreme values in the GI/G/1 queue. Ann Math Statist 1972;43:627-35.
    • (1972) Ann Math Statist , vol.43 , pp. 627-635
    • Iglehart, D.L.1
  • 57
    • 0000586227
    • Limit distributions of maximal segmental score among Markov-dependent partial sums
    • Karlin S, Dembo A. Limit distributions of maximal segmental score among Markov-dependent partial sums. Adv Appl Probab 1992;24:113-40.
    • (1992) Adv Appl Probab , vol.24 , pp. 113-140
    • Karlin, S.1    Dembo, A.2
  • 58
    • 0027175241
    • Applications and statistics for multiple high-scoring segments in molecular sequences
    • Karlin S, Altschul SF. Applications and statistics for multiple high-scoring segments in molecular sequences. Proc Natl Acad Sci USA 1993;90:5873-77.
    • (1993) Proc Natl Acad Sci USA , vol.90 , pp. 5873-5877
    • Karlin, S.1    Altschul, S.F.2
  • 59
    • 0028782111
    • Statistical studies of biomolecular sequences: Score-based methods
    • Karlin S. Statistical studies of biomolecular sequences: Score-based methods. Philosophical Transactions: Biological Sciences 1994;344:391-402.
    • (1994) Philosophical Transactions: Biological Sciences , vol.344 , pp. 391-402
    • Karlin, S.1
  • 60
    • 0014757386
    • A general method applicable to the search for similarities in the amino acid sequence of two proteins
    • Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 1970;48:443-53.
    • (1970) J Mol Biol , vol.48 , pp. 443-453
    • Needleman, S.B.1    Wunsch, C.D.2
  • 61
    • 0020484488
    • An improved algorithm for matching biological sequences
    • Gotoh O. An improved algorithm for matching biological sequences. J Mol Biol 1982;162:705-8.
    • (1982) J Mol Biol , vol.162 , pp. 705-708
    • Gotoh, O.1
  • 62
    • 0036138287
    • Estimation of P-values for global alignments of protein sequences
    • Webber C, Barton GJ. Estimation of P-values for global alignments of protein sequences. Bioinformatics 2001;17: 1158-67.
    • (2001) Bioinformatics , vol.17 , pp. 1158-1167
    • Webber, C.1    Barton, G.J.2
  • 63
    • 0000228203
    • A model of evolutionary change in proteins
    • In: Dayhoff MO (ed). Washington, DC: National Biomedical Research Foundation
    • Dayhoff MO, Schwartz RM, Orcutt BC. A model of evolutionary change in proteins. In: Dayhoff MO (ed). Atlas of Protein Sequence and Structure; Vol. 5. Washington, DC: National Biomedical Research Foundation, 1978:345-58.
    • (1978) Atlas of Protein Sequence and Structure , vol.5 , pp. 345-358
    • Dayhoff, M.O.1    Schwartz, R.M.2    Orcutt, B.C.3
  • 64
    • 0026458378
    • Amino acid substitution matrices from protein blocks
    • Henikoff S, Henikoff JG. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA 1992; 89:10915-19.
    • (1992) Proc Natl Acad Sci USA , vol.89 , pp. 10915-10919
    • Henikoff, S.1    Henikoff, J.G.2
  • 65
    • 0026656815
    • Exhaustive matching of the entire protein sequence database
    • Gonnet GH, Cohen MA, Benner SA. Exhaustive matching of the entire protein sequence database. Science 1992;256: 1443-45.
    • (1992) Science , vol.256 , pp. 1443-1445
    • Gonnet, G.H.1    Cohen, M.A.2    Benner, S.A.3
  • 66
    • 0019887799
    • Identification of common molecular subsequences
    • Smith TF, Waterman MS. Identification of common molecular subsequences. J Mol Biol 1981;147:195-97.
    • (1981) J Mol Biol , vol.147 , pp. 195-197
    • Smith, T.F.1    Waterman, M.S.2
  • 67
    • 0025878149
    • Amino acid substitution matrices from an information theoretic perspective
    • Altschul SF. Amino acid substitution matrices from an information theoretic perspective. J Mol Biol 1991;219: 555-65.
    • (1991) J Mol Biol , vol.219 , pp. 555-565
    • Altschul, S.F.1
  • 69
    • 0000526802
    • Limit distribution of maximal non-aligned two-sequence segmental score
    • Dembo A, Karlin S, Zeitouni O. Limit distribution of maximal non-aligned two-sequence segmental score. Ann Probab 1994;22:2022-39.
    • (1994) Ann Probab , vol.22 , pp. 2022-2039
    • Dembo, A.1    Karlin, S.2    Zeitouni, O.3
  • 70
    • 0035863762
    • The estimation of statistical parameters for local alignment score distributions
    • Altschul SF, Bundschuh R, Olsen R, et al. The estimation of statistical parameters for local alignment score distributions. Nucleic Acids Res 2001;29:351-61.
    • (2001) Nucleic Acids Res , vol.29 , pp. 351-361
    • Altschul, S.F.1    Bundschuh, R.2    Olsen, R.3
  • 71
    • 0036108574
    • Rapid significance estimation in local sequence alignment with gaps
    • Bundschuh R. Rapid significance estimation in local sequence alignment with gaps. J Comp Biol 2002;9: 243-60.
    • (2002) J Comp Biol , vol.9 , pp. 243-260
    • Bundschuh, R.1
  • 72
    • 23044521397
    • Approximate p-values for local sequence alignments
    • Siegmund D, Yakir B. Approximate p-values for local sequence alignments. Ann Statist 2000;28:657-80.
    • (2000) Ann Statist , vol.28 , pp. 657-680
    • Siegmund, D.1    Yakir, B.2
  • 73
    • 2442465931
    • Correction: Approximate p-values for local sequence alignments
    • (vol 28, pg 657, 2000)
    • Siegmund D, Yakir B. Correction: Approximate p-values for local sequence alignments (vol 28, pg 657, 2000). Ann Statist 2003;31:1027-31.
    • (2003) Ann Statist , vol.31 , pp. 1027-1031
    • Siegmund, D.1    Yakir, B.2
  • 74
    • 0001619220
    • A phase transition for the score in matching random sequences allowing deletions
    • Arratia R, Waterman MS. A phase transition for the score in matching random sequences allowing deletions. Ann Appl Probab 1994;4:200-25.
    • (1994) Ann Appl Probab , vol.4 , pp. 200-225
    • Arratia, R.1    Waterman, M.S.2
  • 75
    • 0036740275
    • The correlation error and finite-size correction in an ungapped sequence alignment
    • Park Y, Spouge JL. The correlation error and finite-size correction in an ungapped sequence alignment. Bioinformatics 2002;18:1236-42.
    • (2002) Bioinformatics , vol.18 , pp. 1236-1242
    • Park, Y.1    Spouge, J.L.2
  • 76
    • 0031829618
    • Statistics of large-scale sequence searching
    • Spang R, Vingron M. Statistics of large-scale sequence searching. Bioinformatics 1998;14:279-84.
    • (1998) Bioinformatics , vol.14 , pp. 279-284
    • Spang, R.1    Vingron, M.2
  • 77
    • 0035614016
    • Finite-size corrections to Poisson approximations of rare events in renewal processes
    • Spouge JL. Finite-size corrections to Poisson approximations of rare events in renewal processes. J Appl Probab 2001; 38:554-69.
    • (2001) J Appl Probab , vol.38 , pp. 554-569
    • Spouge, J.L.1
  • 78
    • 26444458649
    • Large deviations for global maxima of independent superadditive processes with negative drift and an application to optimal sequence alignments
    • Grossmann S, Yakir B. Large deviations for global maxima of independent superadditive processes with negative drift and an application to optimal sequence alignments. Bernoulli 2004;10:829-45.
    • (2004) Bernoulli , vol.10 , pp. 829-845
    • Grossmann, S.1    Yakir, B.2
  • 79
    • 0035991777
    • Estimating and evaluating the statistics of gapped local-alignment scores
    • Bailey TL, Gribskov M. Estimating and evaluating the statistics of gapped local-alignment scores. J Comp Biol 2002; 9:575-93.
    • (2002) J Comp Biol , vol.9 , pp. 575-593
    • Bailey, T.L.1    Gribskov, M.2
  • 80
    • 0032943842
    • Approximate statistics of gapped alignments
    • Mott R, Tribe R. Approximate statistics of gapped alignments. J Comp Biol 1999;6:91-112.
    • (1999) J Comp Biol , vol.6 , pp. 91-112
    • Mott, R.1    Tribe, R.2
  • 81
    • 0034742452
    • Sequence alignment: An approximation law for the Z-value with applications to databank scanning
    • Bacro JN, Comet JP. Sequence alignment: An approximation law for the Z-value with applications to databank scanning. Computers Chem 2001;25:401-10.
    • (2001) Computers Chem , vol.25 , pp. 401-410
    • Bacro, J.N.1    Comet, J.P.2
  • 82
    • 0025183708
    • Basic local alignment search tool
    • Altschul SF, Gish W, Miller W, et al. Basic local alignment search tool. J Mol Biol 1990;215:403-10.
    • (1990) J Mol Biol , vol.215 , pp. 403-410
    • Altschul, S.F.1    Gish, W.2    Miller, W.3
  • 83
    • 0023989064
    • Improved tools for biological sequence comparison
    • Pearson WR, Lipman DJ. Improved tools for biological sequence comparison. Proc Natl Acad Sci USA 1988;85:2444-48.
    • (1988) Proc Natl Acad Sci USA , vol.85 , pp. 2444-2448
    • Pearson, W.R.1    Lipman, D.J.2
  • 84
    • 0038438514
    • IMPALA: Matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices
    • Schäffer AA, Wolf YI, Ponting CP, et al. IMPALA: Matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices. Bioinformatics 1999;15:1000-11.
    • (1999) Bioinformatics , vol.15 , pp. 1000-1011
    • Schäffer, A.A.1    Wolf, Y.I.2    Ponting, C.P.3
  • 85
    • 0029935866
    • Progressive alignment of amino acid sequences and construction of phylogenetic trees from them
    • Feng DF, Doolittle RF. Progressive alignment of amino acid sequences and construction of phylogenetic trees from them. Meth Enzymol 1996;266:368-82.
    • (1996) Meth Enzymol , vol.266 , pp. 368-382
    • Feng, D.F.1    Doolittle, R.F.2
  • 86
    • 0027968068
    • CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
    • Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994;22: 4673-80.
    • (1994) Nucleic Acids Res , vol.22 , pp. 4673-4680
    • Thompson, J.D.1    Higgins, D.G.2    Gibson, T.J.3
  • 87
    • 0034623005
    • T-Coffee: A novel method for fast and accurate multiple sequence alignment
    • Notredame C, Higgins DG, Heringa J. T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol 2000;302:205-17.
    • (2000) J Mol Biol , vol.302 , pp. 205-217
    • Notredame, C.1    Higgins, D.G.2    Heringa, J.3
  • 88
    • 13244255415
    • MUSCLE: A multiple sequence alignment method with reduced time and space complexity
    • Art. No. 113
    • Edgar RC. MUSCLE: A multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 2004;5, Art. No. 113.
    • (2004) BMC Bioinformatics , vol.5
    • Edgar, R.C.1
  • 89
    • 13744252890
    • MAFFT version 5: Improvement in accuracy of multiple sequence alignment
    • Katoh K, Kuma K, Toh H, et al. MAFFT version 5: Improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 2005;33:511-18.
    • (2005) Nucleic Acids Res , vol.33 , pp. 511-518
    • Katoh, K.1    Kuma, K.2    Toh, H.3
  • 90
    • 20444489707
    • DIALIGN-T: An improved algorithm for segment-based multiple sequence alignment
    • Art. No. 66
    • Subramanian AR, Weyer-Menkhoff J, Kaufmann M, et al. DIALIGN-T: An improved algorithm for segment-based multiple sequence alignment. BMC Bioinformatics 2005;6, Art. No. 66.
    • (2005) BMC Bioinformatics , vol.6
    • Subramanian, A.R.1    Weyer-Menkhoff, J.2    Kaufmann, M.3
  • 91
    • 0026100921
    • A workbench for multiple alignment construction and analysis
    • Schuler GD, Altschul SF, Lipman DJ. A workbench for multiple alignment construction and analysis. PROTEINS 1991;9:180-90.
    • (1991) PROTEINS , vol.9 , pp. 180-190
    • Schuler, G.D.1    Altschul, S.F.2    Lipman, D.J.3
  • 92
    • 0028685490
    • Fitting a mixture model by expectation maximization to discover motifs in biopolymers
    • In: Altman R, Brutlag D, Karp P, Lathrop R, Searls D (eds)
    • Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. In: Altman R, Brutlag D, Karp P, Lathrop R, Searls D (eds). Menlo Park, CA: AAAI, 1994;28-36.
    • (1994) Menlo Park, CA: AAAI , pp. 28-36
    • Bailey, T.L.1    Elkan, C.2
  • 93
    • 0027912333
    • Detecting subtle sequence signals: A Gibbs sampling strategy for multiple alignment
    • Lawrence CE, Altschul SF, Boguski MS. Detecting subtle sequence signals: A Gibbs sampling strategy for multiple alignment. Science 1993;262:208-14.
    • (1993) Science , vol.262 , pp. 208-214
    • Lawrence, C.E.1    Altschul, S.F.2    Boguski, M.S.3
  • 94
    • 13444309104
    • An Eulerian path approach to local multiple alignment for DNA sequences
    • Zhang Y, Waterman MS. An Eulerian path approach to local multiple alignment for DNA sequences. Proc Natl Acad Sci USA 2005;102:1285-90.
    • (2005) Proc Natl Acad Sci USA , vol.102 , pp. 1285-1290
    • Zhang, Y.1    Waterman, M.S.2
  • 95
    • 0035824873
    • Towards a reliable objective function for multiple sequence alignments
    • Thompson JD, Plewniak F, Ripp R, et al. Towards a reliable objective function for multiple sequence alignments. J Mol Biol 2001;314:937-51.
    • (2001) J Mol Biol , vol.314 , pp. 937-951
    • Thompson, J.D.1    Plewniak, F.2    Ripp, R.3
  • 96
    • 33748801366
    • A faster reliable algorithm to estimate the p-value of the multinomial llr statistic
    • Keich U, Nagarajan N. A faster reliable algorithm to estimate the p-value of the multinomial llr statistic. Lect Notes Comp Sci 2004;3240:111-22.
    • (2004) Lect Notes Comp Sci , vol.3240 , pp. 111-122
    • Keich, U.1    Nagarajan, N.2
  • 97
    • 0024604438
    • Methods for calculating the probabilities of finding patterns in sequences
    • Staden R. Methods for calculating the probabilities of finding patterns in sequences. CABIOS 1989;5:89-96.
    • (1989) CABIOS , vol.5 , pp. 89-96
    • Staden, R.1
  • 98
    • 0028502095
    • Some useful statistical properties of position weight matrices
    • Claverie JM. Some useful statistical properties of position weight matrices. Computers Chem 1994;18:287-94.
    • (1994) Computers Chem , vol.18 , pp. 287-294
    • Claverie, J.M.1
  • 101
    • 0029665409
    • Identification of sequence patterns with profile analysis
    • Gribskov M, Veretnik S. Identification of sequence patterns with profile analysis. Meth Enzymol 1996;266:198-212.
    • (1996) Meth Enzymol , vol.266 , pp. 198-212
    • Gribskov, M.1    Veretnik, S.2
  • 102
    • 0034072450
    • DNA binding sites: Representation and discovery
    • Stormo GD. DNA binding sites: Representation and discovery. Bioinformatics 2000;16:16-23.
    • (2000) Bioinformatics , vol.16 , pp. 16-23
    • Stormo, G.D.1
  • 103
    • 0031901903
    • Methods and statistics for combining motif match scores
    • Bailey TL, Gribskov M. Methods and statistics for combining motif match scores. J Comp Biol 1998;5:211-21.
    • (1998) J Comp Biol , vol.5 , pp. 211-221
    • Bailey, T.L.1    Gribskov, M.2
  • 104
    • 0031877016
    • Combining evidence using p-values: Application to sequence homology searches
    • Bailey TL, Gribskov M. Combining evidence using p-values: Application to sequence homology searches. Bioinformatics 1998;14:48-54.
    • (1998) Bioinformatics , vol.14 , pp. 48-54
    • Bailey, T.L.1    Gribskov, M.2
  • 105
    • 0028457448
    • Approximations to profile score distribution
    • Goldstein L, Waterman MS. Approximations to profile score distribution. J Comp Biol 1994;1:93-104.
    • (1994) J Comp Biol , vol.1 , pp. 93-104
    • Goldstein, L.1    Waterman, M.S.2
  • 106
    • 0030965735
    • Score distributions for simultaneous matching to multiple motifs
    • Bailey TL, Gribskov M. Score distributions for simultaneous matching to multiple motifs. J Comp Biol 1997;4:45-59.
    • (1997) J Comp Biol , vol.4 , pp. 45-59
    • Bailey, T.L.1    Gribskov, M.2
  • 107
    • 0025265665
    • Searching for patterns in protein and nucleic acid sequences
    • Staden R. Searching for patterns in protein and nucleic acid sequences. Meth Enzymol 1990;183:193-211.
    • (1990) Meth Enzymol , vol.183 , pp. 193-211
    • Staden, R.1
  • 108
    • 1842608248
    • Determination of local statistical significance of patterns in Markov sequences with application to promoter element identification
    • Huang HY, Kao MCH, Zhou XH, et al. Determination of local statistical significance of patterns in Markov sequences with application to promoter element identification. J Comp Biol 2004;11:1-14.
    • (2004) J Comp Biol , vol.11 , pp. 1-14
    • Huang, H.Y.1    Kao, M.C.H.2    Zhou, X.H.3
  • 109
    • 0041389077
    • Identification of functional clusters of transcription factor binding motifs in genome sequences: The MSCAN algorithm
    • Johansson Ö, Alkema W, Wasserman WW, et al. Identification of functional clusters of transcription factor binding motifs in genome sequences: The MSCAN algorithm. Bioinformatics 2003;19:i169-76.
    • (2003) Bioinformatics , vol.19
    • Johansson, Ö.1    Alkema, W.2    Wasserman, W.W.3
  • 110
    • 16644401595
    • Probabilistic methods of identifying genes in prokaryotic genomes: Connections to the HMM theory
    • Azad RK, Borodovsky M. Probabilistic methods of identifying genes in prokaryotic genomes: Connections to the HMM theory. Brief Bioinform 2004;5:118-30.
    • (2004) Brief Bioinform , vol.5 , pp. 118-130
    • Azad, R.K.1    Borodovsky, M.2
  • 111
    • 0642307369
    • EasyGene - A prokaryotic gene finder that ranks ORFs by statistical significance
    • Art. No. 21
    • Larsen TS, Krogh A. EasyGene - A prokaryotic gene finder that ranks ORFs by statistical significance. BMC Bioinformatics 2003;4, Art. No. 21.
    • (2003) BMC Bioinformatics , vol.4
    • Larsen, T.S.1    Krogh, A.2
  • 112
    • 0028011075
    • Hidden Markov models of biological primary sequence information
    • Baldi P, Chauvin Y, Hunkapiller T, et al. Hidden Markov models of biological primary sequence information. Proc Natl Acad Sci USA 1994;91:1059-63.
    • (1994) Proc Natl Acad Sci USA , vol.91 , pp. 1059-1063
    • Baldi, P.1    Chauvin, Y.2    Hunkapiller, T.3
  • 113
    • 0028181441
    • Hidden Markov models in computational biology. Applications to protein modeling
    • Krogh A, Brown M, Mian IS, et al. Hidden Markov models in computational biology. Applications to protein modeling. J Mol Biol 1994;235:1501-31.
    • (1994) J Mol Biol , vol.235 , pp. 1501-1531
    • Krogh, A.1    Brown, M.2    Mian, I.S.3
  • 114
    • 0031743421
    • Profile hidden Markov models
    • Eddy SR. Profile hidden Markov models. Bioinformatics 1998;14:755-63.
    • (1998) Bioinformatics , vol.14 , pp. 755-763
    • Eddy, S.R.1
  • 115
    • 0029887381
    • Hidden Markov models for sequence analysis: Extension and analysis of the basic method
    • Hughey R, Krogh A. Hidden Markov models for sequence analysis: Extension and analysis of the basic method. CABIOS 1996;12:95-107.
    • (1996) CABIOS , vol.12 , pp. 95-107
    • Hughey, R.1    Krogh, A.2
  • 116
    • 33748801857
    • ftp://ftp.genetics.wustl.edu/pub/eddy/hmmer/CURRENT/Userguide.pdf. Last accessed 13 January 2006.
    • ftp://ftp.genetics.wustl.edu/pub/eddy/hmmer/CURRENT/Userguide.pdf. Last accessed 13 January 2006.
    • (2006)
  • 117
    • 0030889183
    • Scoring hidden Markov models
    • Barrett C, Hughey R, Karplus K. Scoring hidden Markov models. CABIOS 1997;13:191-99.
    • (1997) CABIOS , vol.13 , pp. 191-199
    • Barrett, C.1    Hughey, R.2    Karplus, K.3
  • 118
    • 0027194328
    • Discovering simple DNA sequences by the algorithmic significance method
    • Milosavljević A, Jurka J. Discovering simple DNA sequences by the algorithmic significance method. CABIOS 1993;9:407-11.
    • (1993) CABIOS , vol.9 , pp. 407-411
    • Milosavljević, A.1    Jurka, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.