Volume 5, Issue 1, 2010, Pages

Exact distribution of a pattern in a set of random sequences generated by a Markov source: Applications to biological data

Author keywords

[No Author keywords available]

Indexed keywords

BACTERIA (MICROORGANISMS);

EID: 77649093400     PISSN: None     EISSN: 17487188     Source Type: Journal    
DOI: 10.1186/1748-7188-5-15     Document Type: Article
Times cited : (19)

References (57)
  • 1
    • 38549144166
    • The Genomes On Line Database (GOLD) in 2007: status of genomic and metagenomic projects and their associated metadata
    • 10.1093/nar/gkm884, 2238992, 17981842
    • Liolios K, Mavromatis K, Tavernarakis N, Kyrpides NC. The Genomes On Line Database (GOLD) in 2007: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2008, 36:D475-479. 10.1093/nar/gkm884, 2238992, 17981842.
    • (2008) Nucleic Acids Res , vol.36
    • Liolios, K.1    Mavromatis, K.2    Tavernarakis, N.3    Kyrpides, N.C.4
  • 4
    • 0029795578
    • Over and underrepresentation of short DNA words in Herpesvirus genomes
    • Leung MY, Marsh GM, Speed TP. Over and underrepresentation of short DNA words in Herpesvirus genomes. J Comp Biol 1996, 3:345-360.
    • (1996) J Comp Biol , vol.3 , pp. 345-360
    • Leung, M.Y.1    Marsh, G.M.2    Speed, T.P.3
  • 5
    • 0032526323
    • Oligonucleotide bias in Bacillus subtilis: general trends and taxonomic comparisons
    • 10.1093/nar/26.12.2971, 147636, 9611243
    • Rocha E, Viari A, Danchin A. Oligonucleotide bias in Bacillus subtilis: general trends and taxonomic comparisons. Nucl Acids Res 1998, 26:2971-2980. 10.1093/nar/26.12.2971, 147636, 9611243.
    • (1998) Nucl Acids Res , vol.26 , pp. 2971-2980
    • Rocha, E.1    Viari, A.2    Danchin, A.3
  • 6
    • 0026600016
    • Statistical analyses of counts and distributions of restriction sites in DNA sequences
    • 10.1093/nar/20.6.1363, 312184, 1313968
    • Karlin S, Burge C, Campbell A. Statistical analyses of counts and distributions of restriction sites in DNA sequences. Nucl Acids Res 1992, 20(6):1363-1370. 10.1093/nar/20.6.1363, 312184, 1313968.
    • (1992) Nucl Acids Res , vol.20 , Issue.6 , pp. 1363-1370
    • Karlin, S.1    Burge, C.2    Campbell, A.3
  • 7
    • 0031782376
    • Identification of the Chi site of Haemophilus influenzae as several sequences related to Escherichia coli Chi site
    • 10.1046/j.1365-2958.1998.00749.x, 9535091
    • Sourice S, Biaudet V, El Karoui M, Ehrlich S, Gruss A. Identification of the Chi site of Haemophilus influenzae as several sequences related to Escherichia coli Chi site. Mol Microbiol 1998, 27:1021-1029. 10.1046/j.1365-2958.1998.00749.x, 9535091.
    • (1998) Mol Microbiol , vol.27 , pp. 1021-1029
    • Sourice, S.1    Biaudet, V.2    El Karoui, M.3    Ehrlich, S.4    Gruss, A.5
  • 8
    • 0034651804
    • Statistical analysis of yeast genomic downstream sequences reveals putative polyadenylation signals
    • 10.1093/nar/28.4.1000, 102588, 10648794
    • Van Helden J, Olmo M, Perez-Ortin JE. Statistical analysis of yeast genomic downstream sequences reveals putative polyadenylation signals. Nucl Acids Res 2000, 28(4):1000-1010. 10.1093/nar/28.4.1000, 102588, 10648794.
    • (2000) Nucl Acids Res , vol.28 , Issue.4 , pp. 1000-1010
    • Van Helden, J.1    Olmo, M.2    Perez-Ortin, J.E.3
  • 10
    • 0034072450
    • DNA binding sites: representation and discovery
    • 10.1093/bioinformatics/16.1.16, 10812473
    • Stormo GD. DNA binding sites: representation and discovery. Bioinformatics 2000, 16:16-23. 10.1093/bioinformatics/16.1.16, 10812473.
    • (2000) Bioinformatics , vol.16 , pp. 16-23
    • Stormo, G.D.1
  • 11
    • 0030449157
    • The statistical significance of nucleotide position-weight matrix matches
    • Claverie JM, Audic S. The statistical significance of nucleotide position-weight matrix matches. Comput Appl Biosci 1996, 12:431-439.
    • (1996) Comput Appl Biosci , vol.12 , pp. 431-439
    • Claverie, J.M.1    Audic, S.2
  • 12
    • 0037100636
    • Statistical significance of clusters of motifs represented by position specific scoring matrices in nucleotide sequences
    • Frith MC, Spouge JL, Hansen U, Weng Z. Statistical significance of clusters of motifs represented by position specific scoring matrices in nucleotide sequences. Nuc Acids Res 2002, 30(14):3214-3224.
    • (2002) Nuc Acids Res , vol.30 , Issue.14 , pp. 3214-3224
    • Frith, M.C.1    Spouge, J.L.2    Hansen, U.3    Weng, Z.4
  • 13
    • 0033675259
    • Compositional bias in DNA
    • 10.1016/S0959-437X(00)00144-1, 11088017
    • Gautier C. Compositional bias in DNA. Curr Opin Genet Dev 2000, 10:656-661. 10.1016/S0959-437X(00)00144-1, 11088017.
    • (2000) Curr Opin Genet Dev , vol.10 , pp. 656-661
    • Gautier, C.1
  • 14
    • 0037085737
    • Mining Bacillus subtilis chromosome heterogeneities using hidden Markov models
    • 10.1093/nar/30.6.1418, 101363, 11884641
    • Nicolas P, Bize L, Muri F, Hoebeke M, Rodolphe F, Ehrlich S, Prum B, Bessières P. Mining Bacillus subtilis chromosome heterogeneities using hidden Markov models. Nucleic Acids Res 2002, 30:1418-1426. 10.1093/nar/30.6.1418, 101363, 11884641.
    • (2002) Nucleic Acids Res , vol.30 , pp. 1418-1426
    • Nicolas, P.1    Bize, L.2    Muri, F.3    Hoebeke, M.4    Rodolphe, F.5    Ehrlich, S.6    Prum, B.7    Bessières, P.8
  • 15
    • 33745775924
    • Computational approaches to gene prediction
    • Do J, Choi D. Computational approaches to gene prediction. J Microbiol 2006, 44:137-144.
    • (2006) J Microbiol , vol.44 , pp. 137-144
    • Do, J.1    Choi, D.2
  • 16
    • 34547726221
    • Contribution of horizontally acquired genomic islands to the evolution of the tubercle bacilli
    • 10.1093/molbev/msm111, 17545187
    • Becq J, Gutierrez M, Rosas-Magallanes V, Rauzier J, Gicquel B, Neyrolles O, Deschavanne P. Contribution of horizontally acquired genomic islands to the evolution of the tubercle bacilli. Mol Biol Evol 2007, 24:1861-1871. 10.1093/molbev/msm111, 17545187.
    • (2007) Mol Biol Evol , vol.24 , pp. 1861-1871
    • Becq, J.1    Gutierrez, M.2    Rosas-Magallanes, V.3    Rauzier, J.4    Gicquel, B.5    Neyrolles, O.6    Deschavanne, P.7
  • 17
    • 33847001252
    • Analysis of an optimal hidden Markov model for secondary structure prediction
    • 10.1186/1472-6807-6-25, 1769381, 17166267
    • Martin J, Gibrat J, Rodolphe F. Analysis of an optimal hidden Markov model for secondary structure prediction. BMC Struct Biol 2006, 6:25. 10.1186/1472-6807-6-25, 1769381, 17166267.
    • (2006) BMC Struct Biol , vol.6 , pp. 25
    • Martin, J.1    Gibrat, J.2    Rodolphe, F.3
  • 18
    • 0000191544
    • Stochastic models for heterogeneous DNA sequences
    • Churchill G. Stochastic models for heterogeneous DNA sequences. Bull Math Biol 1989, 268:8-14.
    • (1989) Bull Math Biol , vol.268 , pp. 8-14
    • Churchill, G.1
  • 19
    • 0026758856
    • Base compositional Structure of Genomes
    • 10.1016/0888-7543(92)90019-O, 1505943
    • Fickett JW, Torney DC, Wolf DR. Base compositional Structure of Genomes. Genomics 1992, 13:1056-1064. 10.1016/0888-7543(92)90019-O, 1505943.
    • (1992) Genomics , vol.13 , pp. 1056-1064
    • Fickett, J.W.1    Torney, D.C.2    Wolf, D.R.3
  • 20
    • 71249162959
    • Distributions associated with general runs and patterns in hidden Markov models
    • Aston JAD, Martin DEK. Distributions associated with general runs and patterns in hidden Markov models. Ann Appl Stat 2007, 1:585-611.
    • (2007) Ann Appl Stat , vol.1 , pp. 585-611
    • Aston, J.A.D.1    Martin, D.E.K.2
  • 21
    • 70349847569
    • Counting patterns in degenerated sequences
    • Nuel G. Counting patterns in degenerated sequences. PRIB 2009, of Lec. Notes in Bioinfo 2009, 5780:222-232.
    • (2009) PRIB 2009, of Lec. Notes in Bioinfo , vol.5780 , pp. 222-232
    • Nuel, G.1
  • 22
    • 0000794292
    • A unified approach to word occurrences probabilities
    • Regnier M. A unified approach to word occurrences probabilities. Discrete Applied Mathematics 2000, 104:259-280.
    • (2000) Discrete Applied Mathematics , vol.104 , pp. 259-280
    • Regnier, M.1
  • 23
    • 0034125366
    • Probabilistic and Statistical Properties of Words: An Overview
    • Reinert G, Schbath S. Probabilistic and Statistical Properties of Words: An Overview. J of Comp Biol 2000, 7(1-2):1-46.
    • (2000) J of Comp Biol , vol.7 , Issue.1-2 , pp. 1-46
    • Reinert, G.1    Schbath, S.2
  • 25
    • 33750243061
    • Numerical solutions for Patterns Statistics on Markov chains
    • Nuel G. Numerical solutions for Patterns Statistics on Markov chains. Stat App in Genet and Mol Biol 2006, 5:26.
    • (2006) Stat App in Genet and Mol Biol , vol.5 , pp. 26
    • Nuel, G.1
  • 26
    • 21444442040
    • Distribution theory of runs and patterns associated with a sequence of multi-state trials
    • Fu JC. Distribution theory of runs and patterns associated with a sequence of multi-state trials. Statistica Sinica 1996, 6(4):957-974.
    • (1996) Statistica Sinica , vol.6 , Issue.4 , pp. 957-974
    • Fu, J.C.1
  • 27
    • 0031526698
    • Explicit distributional results in pattern formation
    • Stefanov V, Pakes AG. Explicit distributional results in pattern formation. Ann Appl Probab 1997, 7:666-678.
    • (1997) Ann Appl Probab , vol.7 , pp. 666-678
    • Stefanov, V.1    Pakes, A.G.2
  • 28
    • 0035596069
    • Waiting times for patterns in a sequence of multistate trials
    • Antzoulakos DL. Waiting times for patterns in a sequence of multistate trials. J Appl Prob 2001, 38:508-518.
    • (2001) J Appl Prob , vol.38 , pp. 508-518
    • Antzoulakos, D.L.1
  • 29
    • 26644452306
    • Distribution of waiting time until the rth occurrence of a compound pattern
    • Chang YM. Distribution of waiting time until the rth occurrence of a compound pattern. Statistics and Probability Letters 2005, 75:29-38.
    • (2005) Statistics and Probability Letters , vol.75 , pp. 29-38
    • Chang, Y.M.1
  • 31
    • 34248358126
    • Effective p-value computations using Finite Markov Chain Imbedding (FMCI): application to local score and to pattern statistics
    • 10.1186/1748-7188-1-5, 1479348, 16722531
    • Nuel G. Effective p-value computations using Finite Markov Chain Imbedding (FMCI): application to local score and to pattern statistics. Algorithms for Molecular Biology 2006, 1:5. 10.1186/1748-7188-1-5, 1479348, 16722531.
    • (2006) Algorithms for Molecular Biology , vol.1 , pp. 5
    • Nuel, G.1
  • 33
    • 37849021289
    • Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules
    • 10.1186/1748-7188-2-13, 2174486, 17927813
    • Boeva V, Clement J, Regnier M, Roytberg M, Makeev V. Exact p-value calculation for heterotypic clusters of regulatory motifs and its application in computational annotation of cis-regulatory modules. Algorithms for Molecular Biology 2007, 2:13. 10.1186/1748-7188-2-13, 2174486, 17927813.
    • (2007) Algorithms for Molecular Biology , vol.2 , pp. 13
    • Boeva, V.1    Clement, J.2    Regnier, M.3    Roytberg, M.4    Makeev, V.5
  • 34
    • 0024514063
    • Linguistic of nucleotide sequences: The significance of deviation from mean statistical characteristics and prediction of frequencies of occurrence of words
    • Pevzner P, Borodovski M, Mironov A. Linguistic of nucleotide sequences: The significance of deviation from mean statistical characteristics and prediction of frequencies of occurrence of words. J Biomol Struct Dyn 1989, 6:1013-1026.
    • (1989) J Biomol Struct Dyn , vol.6 , pp. 1013-1026
    • Pevzner, P.1    Borodovski, M.2    Mironov, A.3
  • 35
    • 0001169815
    • Expected frequencies of DNA patterns using Whittle's formula
    • Cowan R. Expected frequencies of DNA patterns using Whittle's formula. J Appl Prob 1991, 28:886-892.
    • (1991) J Appl Prob , vol.28 , pp. 886-892
    • Cowan, R.1
  • 36
    • 0026778661
    • First and second moment of counts of words in random texts generated by Markov chains
    • Kleffe J, Borodovski M. First and second moment of counts of words in random texts generated by Markov chains. Bioinformatics 1997, 8(5):433-441.
    • (1997) Bioinformatics , vol.8 , Issue.5 , pp. 433-441
    • Kleffe, J.1    Borodovski, M.2
  • 37
    • 0344808332
    • Finding words with unexpected frequencies in DNA sequences
    • Prum B, Rodolphe F, de Turckheim E. Finding words with unexpected frequencies in DNA sequences. J R Statist Soc B 1995, 11:190-192.
    • (1995) J R Statist Soc B , vol.11 , pp. 190-192
    • Prum, B.1    Rodolphe, F.2    de Turckheim, E.3
  • 38
    • 0000309739
    • Poisson approximations for runs and patterns of rare events
    • Godbole AP. Poisson approximations for runs and patterns of rare events. Adv Appl Prob 1991, 23.
    • (1991) Adv Appl Prob , vol.23
    • Godbole, A.P.1
  • 40
    • 0031902984
    • Compound Poisson and Poisson process approximations for occurrences of multiple words in markov chains
    • Reinert G, Schbath S. Compound Poisson and Poisson process approximations for occurrences of multiple words in markov chains. J of Comp Biol 1999, 5:223-254.
    • (1999) J of Comp Biol , vol.5 , pp. 223-254
    • Reinert, G.1    Schbath, S.2
  • 41
    • 0034376430
    • Compound Poisson approximation for counts of rare patterns in Markov chains and extreme sojourns in birth-death chains
    • Erhardsson T. Compound Poisson approximation for counts of rare patterns in Markov chains and extreme sojourns in birth-death chains. Ann Appl Probab 2000, 10(2):573-591.
    • (2000) Ann Appl Probab , vol.10 , Issue.2 , pp. 573-591
    • Erhardsson, T.1
  • 42
    • 40749108573
    • Cumulative distribution function of a geometric Poisson distribution
    • Nuel G. Cumulative distribution function of a geometric Poisson distribution. J Stat Comp and Sim 2008, 78(3):211-220.
    • (2008) J Stat Comp and Sim , vol.78 , Issue.3 , pp. 211-220
    • Nuel, G.1
  • 44
    • 12844258978
    • LD-SPatt: Large Deviations Statistics for Patterns on Markov Chains
    • Nuel G. LD-SPatt: Large Deviations Statistics for Patterns on Markov Chains. J Comp Biol 2004, 11(6):1023-1033.
    • (2004) J Comp Biol , vol.11 , Issue.6 , pp. 1023-1033
    • Nuel, G.1
  • 45
    • 67649647591
    • Approximate Probabilities for Runs and Patterns in i.i.d. and Markov Dependent Multi-state Trials
    • Fu J, Johnson B. Approximate Probabilities for Runs and Patterns in i.i.d. and Markov Dependent Multi-state Trials. Adv in Appl Prob 2009, 41:292-308.
    • (2009) Adv in Appl Prob , vol.41 , pp. 292-308
    • Fu, J.1    Johnson, B.2
  • 47
    • 0038347439
    • Waiting time and complexity for matching patterns with automata
    • Crochemore M, Stefanov V. Waiting time and complexity for matching patterns with automata. Info Proc Letters 2003, 87(3):119-125.
    • (2003) Info Proc Letters , vol.87 , Issue.3 , pp. 119-125
    • Crochemore, M.1    Stefanov, V.2
  • 49
    • 43049111077
    • Pattern Markov chains: optimal Markov chain embedding through deterministic finite automata
    • Nuel G. Pattern Markov chains: optimal Markov chain embedding through deterministic finite automata. J of Applied Prob 2008, 45:226-243.
    • (2008) J of Applied Prob , vol.45 , pp. 226-243
    • Nuel, G.1
  • 50
    • 57249116908
    • Faster exact Markovian probability functions for motif occurrences: a DFA-only approach
    • 10.1093/bioinformatics/btn525, 18845582
    • Ribeca P, Raineri E. Faster exact Markovian probability functions for motif occurrences: a DFA-only approach. Bioinformatics 2008, 24(24):2839-2848. 10.1093/bioinformatics/btn525, 18845582.
    • (2008) Bioinformatics , vol.24 , Issue.24 , pp. 2839-2848
    • Ribeca, P.1    Raineri, E.2
  • 51
    • 77649155677
    • On the first k moments of the random count of a pattern in a multi-states sequence generated by a Markov source
    • ArXiv
    • Nuel G. On the first k moments of the random count of a pattern in a multi-states sequence generated by a Markov source. ArXiv., http://arxiv.org/pdf/0909.4071
    • Nuel, G.1
  • 52
    • 84950460234
    • Distribution theory of runs: a Markov chain approach
    • Fu JC, Koutras MV. Distribution theory of runs: a Markov chain approach. J Amer Statist Assoc 1994, 89:1050-1058.
    • (1994) J Amer Statist Assoc , vol.89 , pp. 1050-1058
    • Fu, J.C.1    Koutras, M.V.2
  • 53
    • 2442592993
    • A hidden Markov model derived structural alphabet for proteins
    • Camproux AC, Gautier R, Tufféry T. A hidden Markov model derived structural alphabet for proteins. J Mol Biol 2004, 339:561-605.
    • (2004) J Mol Biol , vol.339 , pp. 561-605
    • Camproux, A.C.1    Gautier, R.2    Tufféry, T.3
  • 57
    • 33846796002
    • Waiting times for clumps of patterns and for structured motifs in random sequences
    • Stefanov V, Robin S, Schbath S. Waiting times for clumps of patterns and for structured motifs in random sequences. Discrete Applied Mathematics 2007, 155:868-880.
    • (2007) Discrete Applied Mathematics , vol.155 , pp. 868-880
    • Stefanov, V.1    Robin, S.2    Schbath, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.