Volume 10, Issue 5, 2009, Pages 525-536

Finding sequence motifs in prokaryotic genomes - A brief practical guide for a microbiologist

Author keywords

Phylogenetic footprinting; Protein binding sites; r scan statistics; Software review; Supervised motif finding; Unsupervised motif finding

Indexed keywords

ALGORITHM; BINDING SITE; COMPUTER PROGRAM; GENE STRUCTURE; GENETIC DATABASE; NONHUMAN; NUCLEOTIDE SEQUENCE; PHYLOGENY; PROKARYOTE; PROTEIN BINDING; PROTEIN MOTIF; REVIEW; STATISTICAL ANALYSIS;

EID: 69249132587     PISSN: 14675463     EISSN: 14774054     Source Type: Journal    
DOI: 10.1093/bib/bbp032     Document Type: Review
Times cited : (13)

References (53)
  • 1
    • 63349085642
    • PSI-BLAST pseudocounts and the minimum description length principle
    • Altschul SF, Gertz EM, Agarwala R, et al. PSI-BLAST pseudocounts and the minimum description length principle. Nucleic Acids Res 2009;37:815-24.
    • (2009) Nucleic Acids Res , vol.37 , pp. 815-824
    • Altschul, S.F.1    Gertz, E.M.2    Agarwala, R.3
  • 2
    • 43349089047
    • Finn R, Griffiths-Jones S, Bateman A. Identifying protein domains with the Pfam database. Curr Protoc Bioinformatics 2003; Chapter 2: Unit 2.5.
  • 3
    • 36549060028
    • Pfam: A domain-centric method for analyzing proteins and proteomes
    • Mistry J, Finn R. Pfam: A domain-centric method for analyzing proteins and proteomes. Methods Mol Biol 2007; 396:43-58.
    • (2007) Methods Mol Biol , vol.396 , pp. 43-58
    • Mistry, J.1    Finn, R.2
  • 5
    • 5044223128
    • What is a hidden Markov model?
    • Eddy SR. What is a hidden Markov model? Nat Biotechnol 2004;22:1315-6.
    • (2004) Nat Biotechnol , vol.22 , pp. 1315-1316
    • Eddy, S.R.1
  • 6
    • 43349108444
    • Schuster-Böckler B, Bateman A. An introduction to hidden Markov models. Curr Protoc Bioinformatics 2007;Appendix 3:Appendix 3A.
  • 7
    • 0031438685
    • Searching for patterns in genomic data
    • Dsouza M, Larsen N, Overbeek R. Searching for patterns in genomic data. Trends Genet 1997;13:497-8.
    • (1997) Trends Genet , vol.13 , pp. 497-498
    • Dsouza, M.1    Larsen, N.2    Overbeek, R.3
  • 8
    • 33845351083
    • Pattern locator: A new tool for finding local sequence patterns in genomic DNA sequences
    • Mrázek J, Xie S. Pattern locator: A new tool for finding local sequence patterns in genomic DNA sequences. Bioinformatics 2006; 22:3099-3100.
    • (2006) Bioinformatics , vol.22 , pp. 3099-3100
    • Mrázek, J.1    Xie, S.2
  • 9
    • 0025008168
    • Sequence logos: A new way to display consensus sequences
    • Schneider TD, Stephens RM. Sequence logos: A new way to display consensus sequences. Nucleic Acids Res 1990;18: 6097-100.
    • (1990) Nucleic Acids Res , vol.18 , pp. 6097-6100
    • Schneider, T.D.1    Stephens, R.M.2
  • 10
    • 2142738304
    • WebLogo: A sequence logo generator
    • Crooks GE, Hon G, Chandonia JM, et al. WebLogo: A sequence logo generator. Genome Res 2004;14:1188-90.
    • (2004) Genome Res , vol.14 , pp. 1188-1190
    • Crooks, G.E.1    Hon, G.2    Chandonia, J.M.3
  • 12
    • 33747819880
    • CorreLogo: An online server for 3D sequence logos of RNA and DNA alignments
    • Bindewald E, Schneider TD, Shapiro BA. CorreLogo: An online server for 3D sequence logos of RNA and DNA alignments. Nucleic Acids Res 2006;34:W405-11.
    • (2006) Nucleic Acids Res , vol.34 , pp. W405-W411
    • Bindewald, E.1    Schneider, T.D.2    Shapiro, B.A.3
  • 13
    • 0027912333
    • Detecting subtle sequence signals: A Gibbs sampling strategy for multiple alignment
    • Lawrence CE, Altschul SF, Boguski MS, et al. Detecting subtle sequence signals: A Gibbs sampling strategy for multiple alignment. Science 1993;262:208-14.
    • (1993) Science , vol.262 , pp. 208-214
    • Lawrence, C.E.1    Altschul, S.F.2    Boguski, M.S.3
  • 14
    • 0043123063
    • Gibbs recursive sampler: Finding transcription factor binding sites
    • Thompson W, Rouchka EC, Lawrence CE. Gibbs recursive sampler: Finding transcription factor binding sites. Nucleic Acids Res 2003;31:3580-5.
    • (2003) Nucleic Acids Res , vol.31 , pp. 3580-3585
    • Thompson, W.1    Rouchka, E.C.2    Lawrence, C.E.3
  • 16
    • 0002759539
    • Unsupervised learning of multiple motifs in biopolymers using expectation maximization
    • Bailey TL, Elkan C. Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning 1996;21:51-80.
    • (1996) Machine Learning , vol.21 , pp. 51-80
    • Bailey, T.L.1    Elkan, C.2
  • 17
    • 0025320805
    • An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences
    • Lawrence CE, Reilly AA. An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences. Proteins 1990;7:41-51.
    • (1990) Proteins , vol.7 , pp. 41-51
    • Lawrence, C.E.1    Reilly, A.A.2
  • 18
    • 1242264319
    • Finding functional sequence elements by multiple local alignment
    • Frith MC, Hansen U, Spouge JL, et al. Finding functional sequence elements by multiple local alignment. Nucleic Acids Res 2004;32:189-200.
    • (2004) Nucleic Acids Res , vol.32 , pp. 189-200
    • Frith, M.C.1    Hansen, U.2    Spouge, J.L.3
  • 19
    • 46149103439
    • Finding sequence motifs with Bayesian models incorporating positional information: An application to transcription factor binding sites
    • Kim NK, Tharakaraman K, Marino-Ramirez L, et al. Finding sequence motifs with Bayesian models incorporating positional information: An application to transcription factor binding sites. BMC Bioinformatics 2008; 9:262.
    • (2008) BMC Bioinformatics , vol.9 , pp. 262
    • Kim, N.K.1    Tharakaraman, K.2    Marino-Ramirez, L.3
  • 20
    • 38549144819
    • A survey of DNA motif finding algorithms
    • Das MK, Dai HK. A survey of DNA motif finding algorithms. BMC Bioinformatics 2007;8(Suppl 7):S21.
    • (2007) BMC Bioinformatics , vol.8 , Issue.SUPPL. 7 , pp. S21
    • Das, M.K.1    Dai, H.K.2
  • 21
    • 0742306815
    • Computational prediction of transcription-factor binding site locations
    • Bulyk ML. Computational prediction of transcription-factor binding site locations. Genome Biol 2003;5:201.
    • (2003) Genome Biol , vol.5 , pp. 201
    • Bulyk, M.L.1
  • 22
    • 33747044757
    • Analysis of computational approaches for motif discovery
    • Li N, Tompa M. Analysis of computational approaches for motif discovery. Algorithms Mol Biol 2006;1:8.
    • (2006) Algorithms Mol Biol , vol.1 , pp. 8
    • Li, N.1    Tompa, M.2
  • 23
    • 21144439147
    • Assessing computational tools for the discovery of transcription factor binding sites
    • Tompa M, Li N, Bailey TL, et al. Assessing computational tools for the discovery of transcription factor binding sites. Nat Biotechnol 2005;23:137-44.
    • (2005) Nat Biotechnol , vol.23 , pp. 137-144
    • Tompa, M.1    Li, N.2    Bailey, T.L.3
  • 24
    • 33644876958
    • TRANSFAC and its module TRANSCompel: Transcriptional gene regulation in eukaryotes
    • Matys V, Kel-Margoulis OV, Fricke E, et al. TRANSFAC and its module TRANSCompel: Transcriptional gene regulation in eukaryotes. Nucleic Acids Res 2006;34: D108-10.
    • (2006) Nucleic Acids Res , vol.34 , pp. D108-D110
    • Matys, V.1    Kel-Margoulis, O.V.2    Fricke, E.3
  • 25
    • 0030927867
    • Searching for regulatory elements in human noncoding sequences
    • Duret L, Bucher P. Searching for regulatory elements in human noncoding sequences. Curr Opin Struct Biol 1997;7: 399-406.
    • (1997) Curr Opin Struct Biol , vol.7 , pp. 399-406
    • Duret, L.1    Bucher, P.2
  • 26
    • 0034330226
    • Comparative analysis of regulatory patterns in bacterial genomes
    • Gelfand MS, Novichkov PS, Novichkova ES, et al. Comparative analysis of regulatory patterns in bacterial genomes. Brief Bioinform 2000;1:357-71.
    • (2000) Brief Bioinform , vol.1 , pp. 357-371
    • Gelfand, M.S.1    Novichkov, P.S.2    Novichkova, E.S.3
  • 27
    • 0036798220
    • Factors influencing the identification of transcription factor binding sites by cross-species comparison
    • McCue LA, Thompson W, Carmack CS, et al. Factors influencing the identification of transcription factor binding sites by cross-species comparison. Genome Res 2002;12: 1523-32.
    • (2002) Genome Res , vol.12 , pp. 1523-1532
    • McCue, L.A.1    Thompson, W.2    Carmack, C.S.3
  • 28
    • 13944268797
    • A novel method for accurate operon predictions in all sequenced prokaryotes
    • Price MN, Huang KH, Alm EJ, et al. A novel method for accurate operon predictions in all sequenced prokaryotes. Nucleic Acids Res 2005;33:880-92.
    • (2005) Nucleic Acids Res , vol.33 , pp. 880-892
    • Price, M.N.1    Huang, K.H.2    Alm, E.J.3
  • 29
    • 33747816168
    • MicroFootPrinter: A tool for phylogenetic footprinting in prokaryotic genomes
    • Neph S, Tompa M. MicroFootPrinter: A tool for phylogenetic footprinting in prokaryotic genomes. Nucleic Acids Res 2006;34:W366-8.
    • (2006) Nucleic Acids Res , vol.34 , pp. W366-W368
    • Neph, S.1    Tompa, M.2
  • 30
    • 0043025430
    • Typing of nonencapsulated Haemophilus strains by repetitive-element sequence-based PCR using intergenic dyad sequences
    • Bruant G, Watt S, Quentin R, et al. Typing of nonencapsulated Haemophilus strains by repetitive-element sequence-based PCR using intergenic dyad sequences. J Clin Microbiol 2003;41:3473-80.
    • (2003) J Clin Microbiol , vol.41 , pp. 3473-3480
    • Bruant, G.1    Watt, S.2    Quentin, R.3
  • 31
    • 55949130302
    • Long simple sequence repeats in host-adapted pathogens localize near genes encoding antigens, housekeeping genes, and pseudogenes
    • Guo X, Mrázek J. Long simple sequence repeats in host-adapted pathogens localize near genes encoding antigens, housekeeping genes, and pseudogenes. J Mol Evol 2008;67: 497-509.
    • (2008) J Mol Evol , vol.67 , pp. 497-509
    • Guo, X.1    Mrázek, J.2
  • 32
    • 33745597862
    • Analysis of distribution indicates diverse functions of simple sequence repeats in Mycoplasma genomes
    • Mrázek J. Analysis of distribution indicates diverse functions of simple sequence repeats in Mycoplasma genomes. Mol Biol Evol 2006;23:1370-85.
    • (2006) Mol Biol Evol , vol.23 , pp. 1370-1385
    • Mrázek, J.1
  • 33
    • 0036796553
    • Frequent oligonucleotide motifs in genomes of three streptococci
    • Mrázek J, Gaynon LH, Karlin S. Frequent oligonucleotide motifs in genomes of three streptococci. Nucleic Acids Res 2002;30:4216-21.
    • (2002) Nucleic Acids Res , vol.30 , pp. 4216-4221
    • Mrázek, J.1    Gaynon, L.H.2    Karlin, S.3
  • 34
    • 22444439358
    • A-tract clusters may facilitate DNA packaging in bacterial nucleoid
    • Tolstorukov MY, Virnik KM, Adhya S, et al. A-tract clusters may facilitate DNA packaging in bacterial nucleoid. Nucleic Acids Res 2005;33:3907-18.
    • (2005) Nucleic Acids Res , vol.33 , pp. 3907-3918
    • Tolstorukov, M.Y.1    Virnik, K.M.2    Adhya, S.3
  • 35
    • 0026718403
    • Chance and statistical significance in protein and DNA sequence analysis
    • Karlin S, Brendel V. Chance and statistical significance in protein and DNA sequence analysis. Science 1992;257: 39-49.
    • (1992) Science , vol.257 , pp. 39-49
    • Karlin, S.1    Brendel, V.2
  • 36
    • 10344260141
    • Frequent oligonucleotides and peptides of the Haemophilus influenzae genome
    • Karlin S, Mrázek J, Campbell AM. Frequent oligonucleotides and peptides of the Haemophilus influenzae genome. Nucleic Acids Res 1996;24:4263-72.
    • (1996) Nucleic Acids Res , vol.24 , pp. 4263-4272
    • Karlin, S.1    Mrázek, J.2    Campbell, A.M.3
  • 37
    • 0001184792
    • Poisson approximations for r-scan processes
    • Dembo A, Karlin S. Poisson approximations for r-scan processes. Ann Appl Probab 1988;2:329-57.
    • (1988) Ann Appl Probab , vol.2 , pp. 329-357
    • Dembo, A.1    Karlin, S.2
  • 38
    • 0033382706
    • DNA uptake signal sequences in naturally transformable bacteria
    • Smith HO, Gwinn ML, Salzberg SL. DNA uptake signal sequences in naturally transformable bacteria. Res Microbiol 1999;150:603-16.
    • (1999) Res Microbiol , vol.150 , pp. 603-616
    • Smith, H.O.1    Gwinn, M.L.2    Salzberg, S.L.3
  • 39
    • 0034943923
    • A variable genetic island specific for Neisseria gonorrhoeae is involved in providing DNA for natural transformation and is found more often in disseminated infection isolates
    • Dillard JP, Seifert HS. A variable genetic island specific for Neisseria gonorrhoeae is involved in providing DNA for natural transformation and is found more often in disseminated infection isolates. Mol Microbiol 2001;41: 263-77.
    • (2001) Mol Microbiol , vol.41 , pp. 263-277
    • Dillard, J.P.1    Seifert, H.S.2
  • 40
    • 41949094499
    • AIMIE: A web-based environment for detection and interpretation of significant sequence motifs in prokaryotic genomes
    • Mrázek J, Xie S, Guo X, et al. AIMIE: A web-based environment for detection and interpretation of significant sequence motifs in prokaryotic genomes. Bioinformatics 2008;24:1041-8.
    • (2008) Bioinformatics , vol.24 , pp. 1041-1048
    • Mrázek, J.1    Xie, S.2    Guo, X.3
  • 41
    • 0027293240
    • Comparative DNA sequence features in two long Escherichia coli contigs
    • Cardon LR, Burge C, Schachtel GA, et al. Comparative DNA sequence features in two long Escherichia coli contigs. Nucleic Acids Res 1993;21:3875-84.
    • (1993) Nucleic Acids Res , vol.21 , pp. 3875-3884
    • Cardon, L.R.1    Burge, C.2    Schachtel, G.A.3
  • 42
    • 0024213532
    • Repetitive extragenic palindromic sequences, mRNA stability and gene expression: Evolution by gene conversion? A review
    • Higgins CF, McLaren RS, Newbury SF. Repetitive extragenic palindromic sequences, mRNA stability and gene expression: Evolution by gene conversion? A review. Gene 1988;72:3-14.
    • (1988) Gene , vol.72 , pp. 3-14
    • Higgins, C.F.1    McLaren, R.S.2    Newbury, S.F.3
  • 43
    • 0030942832
    • HIP1 propagates in cyanobacterial DNA via nucleotide substitutions but promotes excision at similar frequencies in Escherichia coli and Synechococcus PCC 7942
    • Robinson PJ, Cranenburgh RM, Head IM, et al. HIP1 propagates in cyanobacterial DNA via nucleotide substitutions but promotes excision at similar frequencies in Escherichia coli and Synechococcus PCC 7942. Mol Microbiol 1997;24:181-189.
    • (1997) Mol Microbiol , vol.24 , pp. 181-189
    • Robinson, P.J.1    Cranenburgh, R.M.2    Head, I.M.3
  • 44
    • 39149142575
    • CRISPR - a widespread system that provides acquired resistance against phages in bacteria and archaea
    • Sorek R, Kunin V, Hugenholtz P. CRISPR - a widespread system that provides acquired resistance against phages in bacteria and archaea. Nat Rev Microbiol 2008;6:181-6.
    • (2008) Nat Rev Microbiol , vol.6 , pp. 181-186
    • Sorek, R.1    Kunin, V.2    Hugenholtz, P.3
  • 45
    • 34247618196
    • Improved compound Poisson approximation for the number of occurrences of any rare word family in a stationary Markov chain
    • Roquain E, Schbath S. Improved compound Poisson approximation for the number of occurrences of any rare word family in a stationary Markov chain. Adv Appl Probab 2007;39:128-40.
    • (2007) Adv Appl Probab , vol.39 , pp. 128-140
    • Roquain, E.1    Schbath, S.2
  • 46
    • 0030839806
    • An efficient statistic to detect over- and under-represented words in DNA sequences
    • Schbath S. An efficient statistic to detect over- and under-represented words in DNA sequences. J Comput Biol 1997;4: 189-92.
    • (1997) J Comput Biol , vol.4 , pp. 189-192
    • Schbath, S.1
  • 47
    • 0035890910
    • REPuter: The manifold applications of repeat analysis on a genomic scale
    • Kurtz S, Choudhuri JV, Ohlebusch E, et al. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res 2001;29:4633-42.
    • (2001) Nucleic Acids Res , vol.29 , pp. 4633-4642
    • Kurtz, S.1    Choudhuri, J.V.2    Ohlebusch, E.3
  • 48
    • 2942538300
    • Versatile and open software for comparing large genomes
    • Kurtz S, Phillippy A, Delcher AL, et al. Versatile and open software for comparing large genomes. Genome Biol 2004;5:R12.
    • (2004) Genome Biol , vol.5 , pp. R12
    • Kurtz, S.1    Phillippy, A.2    Delcher, A.L.3
  • 49
    • 84860513986
    • PILER: Identification and classification of genomic repeats
    • Edgar RC, Myers EW. PILER: Identification and classification of genomic repeats. Bioinformatics 2005;21(Suppl 1): I152-8.
    • (2005) Bioinformatics , vol.21 , Issue.SUPPL. 1 , pp. I152-I158
    • Edgar, R.C.1    Myers, E.W.2
  • 50
    • 33846975418
    • PILER-CR: Fast and accurate identification of CRISPR repeats
    • Edgar RC. PILER-CR: Fast and accurate identification of CRISPR repeats. BMC Bioinformatics 2007;8:18.
    • (2007) BMC Bioinformatics , vol.8 , pp. 18
    • Edgar, R.C.1
  • 51
    • 34547579396
    • CRISPRFinder: A web tool to identify clustered regularly interspaced short palindromic repeats
    • Grissa I, Vergnaud G, Pourcel C. CRISPRFinder: A web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res 2007;35:W52-7.
    • (2007) Nucleic Acids Res , vol.35 , pp. W52-W57
    • Grissa, I.1    Vergnaud, G.2    Pourcel, C.3
  • 52
    • 46249097882
    • Swelfe: A detector of internal repeats in sequences and structures
    • Abraham AL, Rocha EP, Pothier J. Swelfe: A detector of internal repeats in sequences and structures. Bioinformatics 2008;24:1536-7.
    • (2008) Bioinformatics , vol.24 , pp. 1536-1537
    • Abraham, A.L.1    Rocha, E.P.2    Pothier, J.3
  • 53
    • 0032573426
    • A comprehensive library of DNA-binding site matrices for 55 proteins applied to the complete Escherichia coli K-12 genome
    • Robison K, McGuire AM, Church GM. A comprehensive library of DNA-binding site matrices for 55 proteins applied to the complete Escherichia coli K-12 genome. J Mol Biol 1998;284:241-54.
    • (1998) J Mol Biol , vol.284 , pp. 241-254
    • Robison, K.1    McGuire, A.M.2    Church, G.M.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.