



Volume 17, Issue 4, 2010, Pages 581-592

The power of detecting enriched patterns: An HMM approach

Author keywords

Hidden Markov model; Motif; Pattern recognition; Statistical power

Indexed keywords

ARTICLE; AUTOMATED PATTERN RECOGNITION; CPG ISLAND; DNA BASE COMPOSITION; DNA SEQUENCE; GENETICS; INTERNET; MATHEMATICAL COMPUTING; METHODOLOGY; NUCLEOTIDE SEQUENCE; POISSON DISTRIBUTION; PROBABILITY; PROCEDURES;

EID: 77951997235     PISSN: 10665277     EISSN: None     Source Type: Journal    
DOI: 10.1089/cmb.2009.0218     Document Type: Article
Times cited : (13)

References (48)
  • 1
    • 0033539168
    • CpG islands as genomic footprints of promoters that are associated with replication origins
    • Antequera, F., and Bird, A. 1999. CpG islands as genomic footprints of promoters that are associated with replication origins. Curr. Biol. 9, 661-667.
    • (1999) Curr. Biol. , vol.9 , pp. 661-667
    • Antequera, F.1    Bird, A.2
  • 2
    • 0028685490
    • Fitting a mixture model by expectation maximization to discover motifs in biopolymers
    • Bailey, T. L., and Elkan, C. 1994. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. 2nd Int. Conf. Intell. Syst. Mol. Biol. 28-36.
    • (1994) Proc. 2nd Int. Conf. Intell. Syst. Mol. Biol. , pp. 28-36
    • Bailey, T.L.1    Elkan, C.2
  • 3
    • 0002759539
    • Unsupervised learning of multiple motifs in biopolymers using expectation maximization
    • Bailey, T. L., and Elkan, C. 1995. Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Mach. Learn. 21, 51-80.
    • (1995) Mach. Learn , vol.21 , pp. 51-80
    • Bailey, T.L.1    Elkan, C.2
  • 4
    • 37849021289
    • Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules
    • Boeva, V., Clément, J., Régnier, M., et al. 2007. Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules. Algorithms Mol. Biol. 2, 13.
    • (2007) Algorithms Mol. Biol. , vol.2 , pp. 13
    • Boeva, V.1    Clément, J.2    Régnier, M.3
  • 5
    • 24344476038
    • YAKUSA: A fast structural database scanning method
    • Carpentier, M., Brouillet, S., and Pothier, J. 2005. YAKUSA: a fast structural database scanning method. Proteins 61, 137-151.
    • (2005) Proteins , vol.61 , pp. 137-151
    • Carpentier, M.1    Brouillet, S.2    Pothier, J.3
  • 7
    • 34248164513
    • On the normal approximation for the distribution of the number of simple or compound patterns in a random sequence of multi-state trials
    • Fu, J. C., and Lou, W. Y. W. 2007. On the normal approximation for the distribution of the number of simple or compound patterns in a random sequence of multi-state trials. Methodol. Comput. Appl. Probabil. 9, 195-205.
    • (2007) Methodol. Comput. Appl. Probabil , vol.9 , pp. 195-205
    • Fu, J.C.1    Lou, W.Y.W.2
  • 8
    • 0035049521
    • Genome-scale compositional comparisons in eukaryotes
    • Gentles, A. J., and Karlin, S. 2001. Genome-scale compositional comparisons in eukaryotes. Genome Res. 11, 540-546.
    • (2001) Genome Res. , vol.11 , pp. 540-546
    • Gentles, A.J.1    Karlin, S.2
  • 9
    • 0000309739
    • Poisson approximations for runs and patterns of rare events
    • Godbole, A. P. 1991. Poisson approximations for runs and patterns of rare events. Adv. Appl. Probabil. 23, 851-865.
    • (1991) Adv. Appl. Probabil. , vol.23 , pp. 851-865
    • Godbole, A.P.1
  • 10
    • 0036755739
    • Error bounds on multivariate normal approximations for word count statistics
    • Huang, H. 2002. Error bounds on multivariate normal approximations for word count statistics. Adv. Appl. Probabil. 34, 559-586.
    • (2002) Adv. Appl. Probabil. , vol.34 , pp. 559-586
    • Huang, H.1
  • 11
    • 0026600016
    • Statistical analyses of counts and distributions of restriction sites in DNA sequences
    • Karlin, S., Burge, C., and Campbell, A. M. 1992. Statistical analyses of counts and distributions of restriction sites in DNA sequences. Nucleic Acids Res. 20, 1363-1370.
    • (1992) Nucleic Acids Res. , vol.20 , pp. 1363-1370
    • Karlin, S.1    Burge, C.2    Campbell, A.M.3
  • 12
    • 0026778661
    • First and second moment of count of words in random texts generated by Markov chains
    • Kleffe, J., and Borodovsky, M. 1992. First and second moment of count of words in random texts generated by Markov chains. Comput. Appl. Biosci. 8, 433-441.
    • (1992) Comput. Appl. Biosci , vol.8 , pp. 433-441
    • Kleffe, J.1    Borodovsky, M.2
  • 13
    • 0025053733
    • Exact computation of pattern probabilities in random sequences generated by Markov chains
    • Kleffe, J., and Langbecker, U. 1990. Exact computation of pattern probabilities in random sequences generated by Markov chains. Comput. Appl. Biosci. 6, 347-353.
    • (1990) Comput. Appl. Biosci , vol.6 , pp. 347-353
    • Kleffe, J.1    Langbecker, U.2
  • 14
    • 0037439997
    • Structural classification of zinc fingers: Survey and summary
    • Krishna, S., Majumdar, I., and Grishin, N. V. 2003. Structural classification of zinc fingers: survey and summary. Nucleic Acids Res. 31, 532-550.
    • (2003) Nucleic Acids Res. , vol.31 , pp. 532-550
    • Krishna, S.1    Majumdar, I.2    Grishin, N.V.3
  • 15
    • 51249123996
    • Protein structure search and local structure characterization
    • Ku, S. Y., and Hu, Y. J. 2008. Protein structure search and local structure characterization. BMC Bioinform. 9, 349.
    • (2008) BMC Bioinform , vol.9 , pp. 349
    • Ku, S.Y.1    Hu, Y.J.2
  • 16
    • 0027912333
    • Detecting subtle sequence signals-a Gibbs sampling strategy for multiple alignment
    • Lawrence, C. E., Altschul, S. F., Boguski, M. S., et al. 1993. Detecting subtle sequence signals-a Gibbs sampling strategy for multiple alignment. Science 262, 208-214.
    • (1993) Science , vol.262 , pp. 208-214
    • Lawrence, C.E.1    Altschul, S.F.2    Boguski, M.S.3
  • 17
    • 79951870792
    • The collapsed Gibbs sampler in Bayesian computations with applications to a gene-regulation problem
    • Liu, J. S. 1994. The collapsed Gibbs sampler in Bayesian computations with applications to a gene-regulation problem. J. Am. Statist. Assoc. 89, 958-966.
    • (1994) J. Am. Statist. Assoc. , vol.89 , pp. 958-966
    • Liu, J.S.1
  • 18
    • 0036324753
    • An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments
    • Liu, X. L., Brutlag, D. L., and Liu, J. S. 2002. An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nat. Biotechnol. 20, 835-839.
    • (2002) Nat. Biotechnol. , vol.20 , pp. 835-839
    • Liu, X.L.1    Brutlag, D.L.2    Liu, J.S.3
  • 19
    • 38049165999
    • Protein structural similarity search by Ramachandran codes
    • Lo, W. C., Chang, C. H., Huang, P. J., et al. 2007. Protein structural similarity search by Ramachandran codes. BMC Bioinform. 8, 307.
    • (2007) BMC Bioinform , vol.8 , pp. 307
    • Lo, W.C.1    Chang, C.H.2    Huang, P.J.3
  • 20
    • 0035802289
    • Susceptibility of nonpromoter CpG islands to de novo methylation in normal and neoplastic cells
    • Nguyen, C., Liang, G. M., Nguyen, T. T., et al. 2001. Susceptibility of nonpromoter CpG islands to de novo methylation in normal and neoplastic cells. J. Natl. Cancer Inst. 93, 1465-1472.
    • (2001) J. Natl. Cancer Inst , vol.93 , pp. 1465-1472
    • Nguyen, C.1    Liang, G.M.2    Nguyen, T.T.3
  • 21
    • 12844258978
    • LD-SPatt: Large deviations statistics for patterns on Markov chains
    • Nuel, G. 2004. LD-SPatt: large deviations statistics for patterns on Markov chains. J. Comp. Biol. 11, 1023-1033.
    • (2004) J. Comp. Biol. , vol.11 , pp. 1023-1033
    • Nuel, G.1
  • 22
    • 34248358126
    • Effective P-value computations using finite Markov chain imbedding (FMCI): Application to local score and to pattern statistics
    • Nuel, G. 2006a. Effective P-value computations using finite Markov chain imbedding (FMCI): application to local score and to pattern statistics. Algorithms Mol. Biol. 1, 5.
    • (2006) Algorithms Mol. Biol. , vol.1 , pp. 5
    • Nuel, G.1
  • 23
    • 33750243061
    • Numerical solutions for pattern statistics on Markov chains
    • Nuel, G. 2006b. Numerical solutions for pattern statistics on Markov chains. Statist. Appl. Genet. Mol. Biol. 5, 26.
    • (2006) Statist. Appl. Genet. Mol. Biol. , vol.5 , pp. 26
    • Nuel, G.1
  • 24
    • 43049111077
    • Pattern Markov chains: Optimal Markov chain embedding through deterministic finite automata
    • Nuel, G. 2007. Pattern Markov chains: optimal Markov chain embedding through deterministic finite automata. J. Appl. Probabil. 45, 226-243.
    • (2007) J. Appl. Probabil , vol.45 , pp. 226-243
    • Nuel, G.1
  • 25
    • 84958437977
    • Recursive evaluation of a family of compound distributions
    • Panjer, H. H. 1981. Recursive evaluation of a family of compound distributions. Astin Bull. 12, 22-26.
    • (1981) Astin Bull. , vol.12 , pp. 22-26
    • Panjer, H.H.1
  • 26
    • 3242884167
    • Weeder Web: Discovery of transcription factor binding sites in a set of sequences from co-regulated genes
    • Pavesi, G., Mereghetti, P., Mauri, G., et al. 2004. Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes. Nucleic Acids Res. 32, W199-W203.
    • (2004) Nucleic Acids Res. , vol.32 , pp. W199-W203
    • Pavesi, G.1    Mereghetti, P.2    Mauri, G.3
  • 27
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • Rabiner, L. R. 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77, 257-286.
    • (1989) Proc. IEEE , vol.77 , pp. 257-286
    • Rabiner, L.R.1
  • 28
    • 0000794292
    • A unified approach to word occurrence probabilities
    • Regnier, M. 2000. A unified approach to word occurrence probabilities. Discr. Appl. Math. 104, 259-280.
    • (2000) Discr. Appl. Math , vol.104 , pp. 259-280
    • Regnier, M.1
  • 29
    • 0031902984
    • Compound Poisson and Poisson process approximations for occurrences of multiple words in Markov chains
    • Reinert, G., and Schbath, S. 1998. Compound Poisson and Poisson process approximations for occurrences of multiple words in Markov chains. J. Comput. Biol. 5, 223-253.
    • (1998) J. Comput. Biol. , vol.5 , pp. 223-253
    • Reinert, G.1    Schbath, S.2
  • 30
    • 0034125366
    • Probabilistic and statistical properties of words: An overview
    • Reinert, G., Schbath, S., and Waterman, M. S. 2000. Probabilistic and statistical properties of words: an overview. J. Comput. Biol. 7, 1-46.
    • (2000) J. Comput. Biol. , vol.7 , pp. 1-46
    • Reinert, G.1    Schbath, S.2    Waterman, M.S.3
  • 31
    • 33847756846
    • Statistics on words with applications to biological sequences
    • In Berstel, J., and Perrin, D., eds., Lothaire: Applied Combinatorics on Words. Cambridge University Press, New York
    • Reinert, G., Schbath, S., and Waterman, M. S. 2005. Statistics on words with applications to biological sequences, 251-328. In Berstel, J., and Perrin, D., eds., Lothaire: Applied Combinatorics on Words. Cambridge University Press, New York.
    • (2005) Applied Combinatorics on Words , pp. 251-328
    • Reinert, G.1    Schbath, S.2    Waterman, M.S.3
  • 32
    • 75149164526
    • Alignment free sequence comparison (I): Statistics and power
    • Reinert, G., Chew, D., Sun, F. Z., et al. 2009. Alignment free sequence comparison (I): statistics and power. J. Comput. Biol. 16, 1-20.
    • (2009) J. Comput. Biol. , vol.16 , pp. 1-20
    • Reinert, G.1    Chew, D.2    Sun, F.Z.3
  • 33
    • 57249116908
    • Faster exact Markovian probability functions for motif occurrences: A DFA-only approach
    • Ribeca, P., and Raineri, E. 2008. Faster exact Markovian probability functions for motif occurrences: a DFA-only approach. Bioinformatics 24, 2839-2848.
    • (2008) Bioinformatics , vol.24 , pp. 2839-2848
    • Ribeca, P.1    Raineri, E.2
  • 34
    • 0033238297
    • Exact distribution of word occurrences in a random sequence of letters
    • Robin, S., and Daudin, J. J. 1999. Exact distribution of word occurrences in a random sequence of letters. J. Appl. Probabil. 36, 179-193.
    • (1999) J. Appl. Probabil , vol.36 , pp. 179-193
    • Robin, S.1    Daudin, J.J.2
  • 35
    • 0034786443
    • Numerical comparison of several approximations of the word count distribution in random sequences
    • Robin, S., and Schbath, S. 2001. Numerical comparison of several approximations of the word count distribution in random sequences. J. Comput. Biol. 8, 349-359.
    • (2001) J. Comput. Biol. , vol.8 , pp. 349-359
    • Robin, S.1    Schbath, S.2
  • 36
    • 0003655416
    • Prentice Hall, Englewood Cliffs, NJ
    • Royden, H. L. 1988. Real Analysis. Prentice Hall, Englewood Cliffs, NJ.
    • (1988) Real Analysis
    • Royden, H.L.1
  • 37
    • 84996141000
    • Compound Poisson approximation of word counts in DNA sequences
    • Schbath, S. 1995. Compound Poisson approximation of word counts in DNA sequences. ESAIM Probabil. Statist. 1, 1-16.
    • (1995) ESAIM Probabil. Statist , vol.1 , pp. 1-16
    • Schbath, S.1
  • 38
    • 0034048881
    • An overview on the distribution of word counts in Markov chains
    • Schbath, S. 2000. An overview on the distribution of word counts in Markov chains. J. Comput. Biol. 7, 193-201.
    • (2000) J. Comput. Biol. , vol.7 , pp. 193-201
    • Schbath, S.1
  • 39
    • 65449171149
    • Counting of oligomers in sequences generated by Markov chains for DNA motif discovery
    • Shan, G., and Zheng, W. M. 2009. Counting of oligomers in sequences generated by Markov chains for DNA motif discovery. Bioinform. Comput. Biol. 7, 39-54.
    • (2009) Bioinform. Comput. Biol. , vol.7 , pp. 39-54
    • Shan, G.1    Zheng, W.M.2
  • 40
    • 0037133565
    • Comprehensive analysis of CpG islands in human chromosomes 21 and 22
    • Takai, D., and Jones, P. A. 2002. Comprehensive analysis of CpG islands in human chromosomes 21 and 22. Proc. Natl. Acad. Sci. USA 99, 3740-3745.
    • (2002) Proc. Natl. Acad. Sci. USA , vol.99 , pp. 3740-3745
    • Takai, D.1    Jones, P.A.2
  • 41
    • 0025331709
    • Target detection assay (TDA): A versatile procedure to determine DNA binding sites as demonstrated on SP1 protein
    • Thiesen, H., and Bach, C. 1990. Target detection assay (TDA): a versatile procedure to determine DNA binding sites as demonstrated on SP1 protein. Nucleic Acids Res. 18, 3203-3209.
    • (1990) Nucleic Acids Res. , vol.18 , pp. 3203-3209
    • Thiesen, H.1    Bach, C.2
  • 42
    • 21144439147
    • Assessing computational tools for the discovery of transcription factor binding sites
    • Tompa, M., Li, N., Bailey, T. L., et al. 2005. Assessing computational tools for the discovery of transcription factor binding sites. Nat. Biotechnol. 23, 137-144.
    • (2005) Nat. Biotechnol. , vol.23 , pp. 137-144
    • Tompa, M.1    Li, N.2    Bailey, T.L.3
  • 43
    • 33747831751
    • Protein Block Expert (PBE): A web-based protein structure analysis server using a structural alphabet
    • Tyagi, M., Sharma, P., Swamy, C. S., et al. 2006. Protein Block Expert (PBE): a web-based protein structure analysis server using a structural alphabet. Nucleic Acids Res. 34, W119-W123.
    • (2006) Nucleic Acids Res. , vol.34 , pp. W119-W123
    • Tyagi, M.1    Sharma, P.2    Swamy, C.S.3
  • 44
    • 58149356459
    • Poisson approximation for search of rare words in DNA sequences
    • Vergne, N., and Abadi, M. 2008. Poisson approximation for search of rare words in DNA sequences. Latin Am. J. Probabil. Math. Statist. 4, 223-244.
    • (2008) Latin Am. J. Probabil. Math. Statist , vol.4 , pp. 223-244
    • Vergne, N.1    Abadi, M.2
  • 45
    • 0003512471
    • Introduction to computational biology
    • Chapman & Hall, New York
    • Waterman, M. S. 1995. Introduction to Computational Biology. Maps, Sequences and Genomes. Chapman & Hall, New York.
    • (1995) Introduction to Computational Biology: Maps, Sequences and Genomes
    • Waterman, M.S.1
  • 46
    • 38249036214
    • Difference equation approaches in evaluation of compound distributions
    • Willmot, G. E., and Panjer, H. H. 1987. Difference equation approaches in evaluation of compound distributions. Insurance Math. Econ. 6, 43-56.
    • (1987) Insurance Math. Econ , vol.6 , pp. 43-56
    • Willmot, G.E.1    Panjer, H.H.2
  • 47
    • 33747096251
    • Protein structure database search and evolutionary classification
    • Yang, J. M., and Tung, C. H. 2006. Protein structure database search and evolutionary classification. Nucleic Acids Res. 34, 3646-3659.
    • (2006) Nucleic Acids Res. , vol.34 , pp. 3646-3659
    • Yang, J.M.1    Tung, C.H.2
  • 48
    • 34047159557
    • Computing exact P-values for DNA motifs
    • Zhang, J., Jiang, B., Li, M., et al. 2007. Computing exact P-values for DNA motifs. Bioinformatics 23, 531-537.
    • (2007) Bioinformatics , vol.23 , pp. 531-537
    • Zhang, J.1    Jiang, B.2    Li, M.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.