



Volume 20, Issue 11, 2008, Pages 1472-1489

Mining loosely structured motifs from biological data

Author keywords

Bioinformatics (genome or protein) databases; Data mining; Mining methods and algorithms

Indexed keywords

BIOINFORMATICS (GENOME OR PROTEIN) DATABASES; BIOLOGICAL DATA; BIOLOGICAL MECHANISMS; BIOLOGICAL SEQUENCES; DATA MINING; GENETIC DISEASES; LINEAR TIME; MINING METHODS AND ALGORITHMS; MOTIF DISCOVERY; PERFORMANCE RESULTS; REAL DATA SETS; REGULAR STRUCTURES; SPACE COMPLEXITIES;

EID: 52949134517     PISSN: 10414347     EISSN: None     Source Type: Journal    
DOI: 10.1109/TKDE.2008.65     Document Type: Article
Times cited : (15)

References (59)
  • 3
    • 33846934067
    • String Pattern Matching for a Deluge Survival Kit
    • J. Abello, P.M. Pardalos and M.G.C. Resende, eds, Kluwer Academic
    • A. Apostolico and M. Crochemore, "String Pattern Matching for a Deluge Survival Kit," Handbook of Massive Data Sets, J. Abello, P.M. Pardalos and M.G.C. Resende, eds., Kluwer Academic, 2000.
    • (2000) Handbook of Massive Data Sets
    • Apostolico, A.1    Crochemore, M.2
  • 4
    • 0030986188
    • The Hardwiring of Development: Organization and Function of Genomic Regulatory Systems
    • M.I. Arnone and E.H. Davidson, "The Hardwiring of Development: Organization and Function of Genomic Regulatory Systems," Development, vol. 124, pp. 1851-1864, 1997.
    • (1997) Development , vol.124 , pp. 1851-1864
    • Arnone, M.I.1    Davidson, E.H.2
  • 8
    • 0002759539
    • Unsupervised Learning of Multiple Motifs in Biopolymers Using Expectation Maximization
    • T.L. Bailey and C. Elkan, "Unsupervised Learning of Multiple Motifs in Biopolymers Using Expectation Maximization," Machine Learning, vol. 21, nos. 1-2, pp. 51-80, 1995.
    • (1995) Machine Learning , vol.21 , Issue.1-2 , pp. 51-80
    • Bailey, T.L.1    Elkan, C.2
  • 9
    • 0026553672
    • PROSITE: A Dictionary of Protein Sites and Patterns
    • A. Bairoch, "PROSITE: A Dictionary of Protein Sites and Patterns," Nucleic Acids Research, vol. 20, pp. 2013-2018, 1992.
    • (1992) Nucleic Acids Research , vol.20 , pp. 2013-2018
    • Bairoch, A.1
  • 10
    • 0031878759
    • Approaches to the Automatic Discovery of Patterns in Biosequences
    • A. Brazma, I. Jonassen, I. Eidhammer, and D. Gilbert, "Approaches to the Automatic Discovery of Patterns in Biosequences," J. Computational Biology, vol. 5, no. 2, pp. 277-304, 1998.
    • (1998) J. Computational Biology , vol.5 , Issue.2 , pp. 277-304
    • Brazma, A.1    Jonassen, I.2    Eidhammer, I.3    Gilbert, D.4
  • 11
    • 0032413411
    • Predicting Gene Regulatory Elements in Silico on a Genomic Scale
    • A. Brazma, I. Jonassen, J. Vilo, and E. Ukkonen, "Predicting Gene Regulatory Elements in Silico on a Genomic Scale," Genome Research, vol. 8, pp. 1202-1215, 1998.
    • (1998) Genome Research , vol.8 , pp. 1202-1215
    • Brazma, A.1    Jonassen, I.2    Vilo, J.3    Ukkonen, E.4
  • 14
    • 13444294231
    • Meta-Analysis of Gross Insertions Causing Human Genetic Disease: Novel Mutational Mechanisms and the Role of Replication Slippage
    • J.M. Chen, N. Chuzhanova, P.D. Stenson, C. Ferec, and D.N. Cooper, "Meta-Analysis of Gross Insertions Causing Human Genetic Disease: Novel Mutational Mechanisms and the Role of Replication Slippage," Human Mutation, vol. 25, no. 2, pp. 207-221, 2005.
    • (2005) Human Mutation , vol.25 , Issue.2 , pp. 207-221
    • Chen, J.M.1    Chuzhanova, N.2    Stenson, P.D.3    Ferec, C.4    Cooper, D.N.5
  • 16
  • 17
    • 14844367057
    • An Improved Data Stream Summary: The Count-Min Sketch and Its Applications
    • G. Cormode and S. Muthukrishnan, "An Improved Data Stream Summary: The Count-Min Sketch and Its Applications," J. Algorithms, vol. 55, no. 1, pp. 58-75, 2005.
    • (2005) J. Algorithms , vol.55 , Issue.1 , pp. 58-75
    • Cormode, G.1    Muthukrishnan, S.2
  • 19
    • 0344464768
    • In Silico Analysis Reveals Substantial Variability in the Gene Contents of the Gamma Proteobacteria LexA-Regulon
    • I. Erill, M. Escribano, S. Campoy, and J. Barbé, "In Silico Analysis Reveals Substantial Variability in the Gene Contents of the Gamma Proteobacteria LexA-Regulon," Bioinformatics, vol. 19, no. 17, pp. 2225-2236, 2003.
    • (2003) Bioinformatics , vol.19 , Issue.17 , pp. 2225-2236
    • Erill, I.1    Escribano, M.2    Campoy, S.3    Barbé, J.4
  • 21
    • 2942618802
    • A Top-Down Method for Mining Most-Specific Frequent Patterns in Biological Sequences
    • M. Ester and X. Zhang, "A Top-Down Method for Mining Most-Specific Frequent Patterns in Biological Sequences," Proc. SIAM Int'l Conf. Data Mining (SDM), 2004.
    • (2004) Proc. SIAM Int'l Conf. Data Mining (SDM)
    • Ester, M.1    Zhang, X.2
  • 22
    • 0001905015
    • Synopsis Data Structures for Massive Data Sets
    • P.B. Gibbons and Y. Matias, "Synopsis Data Structures for Massive Data Sets," External Memory Algorithms, pp. 39-70, 1999.
    • (1999) External Memory Algorithms , pp. 39-70
    • Gibbons, P.B.1    Matias, Y.2
  • 25
    • 0034894539
    • Identifying Target Sites for Cooperatively Binding Factors
    • D. GuhaThakurta and G.D. Stormo, "Identifying Target Sites for Cooperatively Binding Factors," Bioinformatics, vol. 17, no. 7, pp. 608-621, 2001.
    • (2001) Bioinformatics , vol.17 , Issue.7 , pp. 608-621
    • GuhaThakurta, D.1    Stormo, G.D.2
  • 27
    • 0034656326
    • Discovering Regulatory Elements in Non-Coding Sequences by Analysis of Spaced Dyads
    • J. van Helden, A.F. Rios, and J. Collado-Vides, "Discovering Regulatory Elements in Non-Coding Sequences by Analysis of Spaced Dyads," Nucleic Acids Research, vol. 28, no. 8, pp. 1808-1818, 2000.
    • (2000) Nucleic Acids Research , vol.28 , Issue.8 , pp. 1808-1818
    • van Helden, J.1    Rios, A.F.2    Collado-Vides, J.3
  • 28
    • 0032826179
    • Identifying DNA and Protein Patterns with Statistically Significant Alignments of Multiple Sequences
    • G. Hertz and G. Stormo, "Identifying DNA and Protein Patterns with Statistically Significant Alignments of Multiple Sequences," Bioinformatics, vol. 15, nos. 7-8, pp. 563-577, 1999.
    • (1999) Bioinformatics , vol.15 , Issue.7-8 , pp. 563-577
    • Hertz, G.1    Stormo, G.2
  • 30
    • 0034628901
    • Computational Identification of CIS-Regulatory Elements Associated with Groups of Functionally Related Genes in Saccharomyces Cerevisiae
    • J.D. Hughes, P.W. Estep, S. Tavazoie, and G.M. Church, "Computational Identification of CIS-Regulatory Elements Associated with Groups of Functionally Related Genes in Saccharomyces Cerevisiae," J. Molecular Biology, vol. 296, no. 5, pp. 1205-1214, 2000.
    • (2000) J. Molecular Biology , vol.296 , Issue.5 , pp. 1205-1214
    • Hughes, J.D.1    Estep, P.W.2    Tavazoie, S.3    Church, G.M.4
  • 32
    • 0029159799
    • Finding Flexible Patterns in Unaligned Protein Sequences
    • I. Jonassen, J.F. Collins, and D.G. Higgins, "Finding Flexible Patterns in Unaligned Protein Sequences," Protein Science, vol. 4, pp. 1587-1595, 1995.
    • (1995) Protein Science , vol.4 , pp. 1587-1595
    • Jonassen, I.1    Collins, J.F.2    Higgins, D.G.3
  • 34
    • 34447340240
    • GAPWM: A Genetic Algorithm Method for Optimizing a Position Weight Matrix
    • L. Li, Y. Liang, and R.L. Bass, "GAPWM: A Genetic Algorithm Method for Optimizing a Position Weight Matrix," Bioinformatics, vol. 23, no. 10, pp. 1188-1194, 2007.
    • (2007) Bioinformatics , vol.23 , Issue.10 , pp. 1188-1194
    • Li, L.1    Liang, Y.2    Bass, R.L.3
  • 36
    • 0033677426
    • Algorithms for Extracting Structured Motifs Using a Suffix Tree with Application to Promoter and Regulatory Site Consensus Identification
    • L. Marsan and M.F. Sagot, "Algorithms for Extracting Structured Motifs Using a Suffix Tree with Application to Promoter and Regulatory Site Consensus Identification," J. Computational Biology, vol. 7, pp. 345-360, 2000.
    • (2000) J. Computational Biology , vol.7 , pp. 345-360
    • Marsan, L.1    Sagot, M.F.2
  • 37
    • 33845359995
    • MUSA: A Parameter Free Algorithm for the Identification of Biologically Significant Motifs
    • N.D. Mendes, A.C. Casimiro, P.M. Santos, I. Sà-Correia, A.L. Oliveira, and A.T. Freitas, "MUSA: A Parameter Free Algorithm for the Identification of Biologically Significant Motifs," Bioinformatics, vol. 22, no. 24, pp. 2996-3002, 2006.
    • (2006) Bioinformatics , vol.22 , Issue.24 , pp. 2996-3002
    • Mendes, N.D.1    Casimiro, A.C.2    Santos, P.M.3    Sà-Correia, I.4    Oliveira, A.L.5    Freitas, A.T.6
  • 38
    • 0345566149
    • A Guided Tour to Approximate String Matching
    • G. Navarro, "A Guided Tour to Approximate String Matching," ACM Computing Surveys, vol. 33, no. 1, pp. 31-88, 2001.
    • (2001) ACM Computing Surveys , vol.33 , Issue.1 , pp. 31-88
    • Navarro, G.1
  • 39
    • 0029144601
    • Gibbs Motif Sampling: Detection of Bacterial Outer Membrane Repeats
    • A. Neuwald, J. Liu, and C. Lawrence, "Gibbs Motif Sampling: Detection of Bacterial Outer Membrane Repeats," Protein Science, vol. 4, pp. 1618-1632, 1995.
    • (1995) Protein Science , vol.4 , pp. 1618-1632
    • Neuwald, A.1    Liu, J.2    Lawrence, C.3
  • 40
    • 0028270992
    • Detecting Patterns in Protein Sequences
    • A.F. Neuwald and P. Green, "Detecting Patterns in Protein Sequences," J. Molecular Biology, vol. 239, pp. 698-712, 1994.
    • (1994) J. Molecular Biology , vol.239 , pp. 698-712
    • Neuwald, A.F.1    Green, P.2
  • 41
    • 4444301746
    • Essential Motifs in the 3' Untranslated Region Required for Retrotransposition and the Precise Start of Reverse Transcription in Non-Long-Terminal-Repeat Retrotransposon SART1
    • M. Osanai, H. Takahashi, K.K. Kojima, M. Hamada, and H. Fujiwara, "Essential Motifs in the 3' Untranslated Region Required for Retrotransposition and the Precise Start of Reverse Transcription in Non-Long-Terminal-Repeat Retrotransposon SART1," Molecular and Cellular Biology, vol. 24, no. 19, pp. 7902-7913, 2004.
    • (2004) Molecular and Cellular Biology , vol.24 , Issue.19 , pp. 7902-7913
    • Osanai, M.1    Takahashi, H.2    Kojima, K.K.3    Hamada, M.4    Fujiwara, H.5
  • 42
    • 16644364007
    • In Silico Representation and Discovery of Transcription Factor Binding Sites
    • G. Pavesi, G. Mauri, and G. Pesole, "In Silico Representation and Discovery of Transcription Factor Binding Sites," Briefings in Bioinformatics, vol. 5, pp. 217-236, 2004.
    • (2004) Briefings in Bioinformatics , vol.5 , pp. 217-236
    • Pavesi, G.1    Mauri, G.2    Pesole, G.3
  • 45
    • 34347339593
    • Improved Benchmarks for Computational Motif Discovery
    • G.K. Sandve, O. Abul, V. Walseng, and F. Drablos, "Improved Benchmarks for Computational Motif Discovery," BMC Bioinformatics, vol. 8, no. 193, pp. 1-13, 2007.
    • (2007) BMC Bioinformatics , vol.8 , Issue.193 , pp. 1-13
    • Sandve, G.K.1    Abul, O.2    Walseng, V.3    Drablos, F.4
  • 46
    • 34247579156
    • A Survey of Motif Discovery Methods in an Integrated Framework
    • G.K. Sandve and F. Drablos, "A Survey of Motif Discovery Methods in an Integrated Framework," Biology Direct, vol. 1, no. 11, pp. 1-16, 2006.
    • (2006) Biology Direct , vol.1 , Issue.11 , pp. 1-16
    • Sandve, G.K.1    Drablos, F.2
  • 47
    • 85026984223
    • Composite Motifs in Promoter Regions of Genes: Models and Algorithms
    • S. Sinha, "Composite Motifs in Promoter Regions of Genes: Models and Algorithms," General Report, 2002.
    • (2002) General Report
    • Sinha, S.1
  • 48
    • 0042905768
    • YMF: A Program for Discovery of Novel Transcription Factor Binding Sites by Statistical Overrepresentation
    • S. Sinha and M. Tompa, "YMF: A Program for Discovery of Novel Transcription Factor Binding Sites by Statistical Overrepresentation," Nucleic Acids Research, vol. 31, no. 13, pp. 3586-3588, 2003.
    • (2003) Nucleic Acids Research , vol.31 , Issue.13 , pp. 3586-3588
    • Sinha, S.1    Tompa, M.2
  • 51
    • 11244303622
    • The Changing Tails of a Novel Short Interspersed Element in Aedes Aegypti: Genomic Evidence for Slippage Retrotransposition and the Relationship between 3' Tandem Repeats and the Poly(dA) Tail
    • Z. Tu, S. Li, and C. Mao, "The Changing Tails of a Novel Short Interspersed Element in Aedes Aegypti: Genomic Evidence for Slippage Retrotransposition and the Relationship between 3' Tandem Repeats and the Poly(dA) Tail," Genetics, vol. 168, no. 4, pp. 2037-2047, 2004.
    • (2004) Genetics , vol.168 , Issue.4 , pp. 2037-2047
    • Tu, Z.1    Li, S.2    Mao, C.3
  • 53
    • 0033379116
    • Promoter Sequences and Algorithmical Methods for Identifying Them
    • A. Vanet, L. Marsan, and M.-F. Sagot, "Promoter Sequences and Algorithmical Methods for Identifying Them," Research in Microbiology, vol. 150, no. 9, pp. 779-799, 1999.
    • (1999) Research in Microbiology , vol.150 , Issue.9 , pp. 779-799
    • Vanet, A.1    Marsan, L.2    Sagot, M.-F.3
  • 55
    • 0032776529
    • Models for Prediction and Recognition of Eukaryotic Promoters
    • T. Werner, "Models for Prediction and Recognition of Eukaryotic Promoters," Mammalian Genome, vol. 10, no. 2, pp. 168-175, 1999.
    • (1999) Mammalian Genome , vol.10 , Issue.2 , pp. 168-175
    • Werner, T.1
  • 56
    • 0037869270
    • The State of the Art of Mammalian Promoter Recognition
    • T. Werner, "The State of the Art of Mammalian Promoter Recognition," Briefings in Bioinformatics, vol. 4, no. 1, pp. 22-30, 2003.
    • (2003) Briefings in Bioinformatics , vol.4 , Issue.1 , pp. 22-30
    • Werner, T.1
  • 57
    • 0034826102
    • Spade: An Efficient Algorithm for Mining Frequent Sequences
    • M.J. Zaki, "Spade: An Efficient Algorithm for Mining Frequent Sequences," Machine Learning, vol. 42, nos. 1-2, pp. 31-60, 2001.
    • (2001) Machine Learning , vol.42 , Issue.1-2 , pp. 31-60
    • Zaki, M.J.1
  • 58
    • 34248356673
    • EXMOTIF: Efficient Structured Motif Extraction
    • Y. Zhang and M.J. Zaki, "EXMOTIF: Efficient Structured Motif Extraction," Algorithms for Molecular Biology, vol. 1, no. 1, rec.No 21, 2006.
  • 59
    • 0032861774
    • SCPD: A Promoter Database for the Yeast Saccharomyces Cerevisiae
    • J. Zhu and M. Zhang, "SCPD: A Promoter Database for the Yeast Saccharomyces Cerevisiae," Bioinformatics, vol. 15, nos. 7-8, pp. 607-611, 1999.
    • (1999) Bioinformatics , vol.15 , Issue.7-8 , pp. 607-611
    • Zhu, J.1    Zhang, M.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.