



Volume 5, Issue 2, 1998, Pages 279-305

Approaches to the automatic discovery of patterns in biosequences

Author keywords

Automatic discovery; Bioinformatics; Biosequences; Machine learning; Patterns

Indexed keywords

ALGORITHM; AMINO ACID SEQUENCE; ARTICLE; CALCULATION; GENE SEQUENCE; INFORMATION SCIENCE; LANGUAGE; PRIORITY JOURNAL; PROBLEM SOLVING;

EID: 0031878759     PISSN: 10665277     EISSN: None     Source Type: Journal    
DOI: 10.1089/cmb.1998.5.279     Document Type: Article
Times cited : (194)

References (77)
  • 5
    • 5744242176
    • Protein motif discovery from positive examples by minimal multiple generalization over regular patterns
    • Arimura, H., Fujino, R., Shinohara, T., and Arikawa, S. 1994. Protein motif discovery from positive examples by minimal multiple generalization over regular patterns. In Proc. of the 5th Genome Informatics Workshop, 39-48.
    • (1994) Proc. of the 5th Genome Informatics Workshop , pp. 39-48
    • Arimura, H.1    Fujino, R.2    Shinohara, T.3    Arikawa, S.4
  • 8
    • 0026553672
    • PROSITE: A dictionary of sites and patterns in proteins
    • Bairoch, A. 1992. PROSITE: a dictionary of sites and patterns in proteins. Nucl. Acids Res. 20:2013-2018.
    • (1992) Nucl. Acids Res. , vol.20 , pp. 2013-2018
    • Bairoch, A.1
  • 10
    • 84990206939
    • Discovering unbounded unions of regular pattern languages from positive examples
    • Proceedings of 7th Annual International Symposium on Algorithms and Computation (ISAAC-96)
    • Brazma, A., Ukkonen, E., and Vilo, J. 1996a. Discovering unbounded unions of regular pattern languages from positive examples. In Proceedings of 7th Annual International Symposium on Algorithms and Computation (ISAAC-96), Lecture Notes in Computer Science 1178, 95-104.
    • (1996) Lecture Notes in Computer Science , vol.1178 , pp. 95-104
    • Brazma, A.1    Ukkonen, E.2    Vilo, J.3
  • 14
    • 0026887079
    • A survey of multiple sequence comparison methods
    • Chan, S.C., Wong, A.K.C., and Chiu, D.K.Y. 1992. A survey of multiple sequence comparison methods. Bull. Math. Biol. 54(4):563-598.
    • (1992) Bull. Math. Biol. , vol.54 , Issue.4 , pp. 563-598
    • Chan, S.C.1    Wong, A.K.C.2    Chiu, D.K.Y.3
  • 17
    • 0018031196
    • Synthesizing constraint expressions
    • Freuder, E.C. 1978. Synthesizing constraint expressions. Comm. ACM 21(11):958-966.
    • (1978) Comm. ACM , vol.21 , Issue.11 , pp. 958-966
    • Freuder, E.C.1
  • 19
    • 49949150022
    • Language identification in the limit
    • Gold, E.M. 1967. Language identification in the limit. Information and Control 10:447-474.
    • (1967) Information and Control , vol.10 , pp. 447-474
    • Gold, E.M.1
  • 21
    • 0026410103
    • Automated assembly of protein blocks for database searching
    • Henikoff, S., and Henikoff, J.G. 1991. Automated assembly of protein blocks for database searching. Nucl. Acids Res. 19(23):6565-6572.
    • (1991) Nucl. Acids Res. , vol.19 , Issue.23 , pp. 6565-6572
    • Henikoff, S.1    Henikoff, J.G.2
  • 22
    • 0026458378
    • Amino acid substitution matrices from protein blocks
    • Henikoff, S., and Henikoff, J.G. 1992. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89:10915-10919.
    • (1992) Proc. Natl. Acad. Sci. USA , vol.89 , pp. 10915-10919
    • Henikoff, S.1    Henikoff, J.G.2
  • 23
    • 85006627991
    • Color set size problem with application to string matching
    • A. Apostolico, M. Crochemore, Z. Galil, and U. Manber, eds., Springer-Verlag
    • Hui, L.C.K. 1992. Color set size problem with application to string matching. In A. Apostolico, M. Crochemore, Z. Galil, and U. Manber, eds., Proc. of Combinatorial Pattern Matching, 230-243. Springer-Verlag.
    • (1992) Proc. of Combinatorial Pattern Matching , pp. 230-243
    • Hui, L.C.K.1
  • 25
    • 0029159799
    • Finding flexible patterns in unaligned protein sequences
    • Jonassen, I., Collins, J.F., and Higgins, D.G. 1995. Finding flexible patterns in unaligned protein sequences. Protein Sci. 4(8):1587-1595.
    • (1995) Protein Sci. , vol.4 , Issue.8 , pp. 1587-1595
    • Jonassen, I.1    Collins, J.F.2    Higgins, D.G.3
  • 26
    • 34548006824
    • Scoring function for pattern discovery programs taking into account sequence diversity
    • Dept. of Informatics, University of Bergen
    • Jonassen, I., Helgesen, C., and Higgins, D.G. 1996. Scoring function for pattern discovery programs taking into account sequence diversity. Reports in Informatics 116, Dept. of Informatics, University of Bergen.
    • (1996) Reports in Informatics , pp. 116
    • Jonassen, I.1    Helgesen, C.2    Higgins, D.G.3
  • 27
    • 0030670589
    • Efficient discovery of conserved patterns using a pattern graph
    • Jonassen, I. 1997. Efficient discovery of conserved patterns using a pattern graph. Comput. Applic. Biosci. 13:509-522.
    • (1997) Comput. Applic. Biosci. , vol.13 , pp. 509-522
    • Jonassen, I.1
  • 28
    • 0022065258
    • The use of multiple alphabets in kappa-gene immunoglobulin DNA sequence comparison
    • Karlin, S., and Ghandour, G. 1985. The use of multiple alphabets in kappa-gene immunoglobulin DNA sequence comparison. The EMBO Journal 4:1217-1223.
    • (1985) The EMBO Journal , vol.4 , pp. 1217-1223
    • Karlin, S.1    Ghandour, G.2
  • 30
    • 0026619392
    • An estimate of the sequencing error frequency in the DNA sequence databases
    • Kristensen, T., Lopez, R.S., and Prydz, H. 1992. An estimate of the sequencing error frequency in the DNA sequence databases. DNA Seq. 2:343-346.
    • (1992) DNA Seq. , vol.2 , pp. 343-346
    • Kristensen, T.1    Lopez, R.S.2    Prydz, H.3
  • 31
    • 0028181441
    • Hidden Markov model in computational biology. Applications to protein modelling
    • Krogh, A., Brown, M., Mian, I. S., Sjoelander, K., and Haussler, D. 1994. Hidden Markov model in computational biology. Applications to protein modelling. J. Mol. Biol. 235:1501-1531.
    • (1994) J. Mol. Biol. , vol.235 , pp. 1501-1531
    • Krogh, A.1    Brown, M.2    Mian, I.S.3    Sjoelander, K.4    Haussler, D.5
  • 32
    • 0026625944
    • Analysis of context of 5′-splice site sequences in mammalian mRNA precursors by subclass method
    • Kudo, M., Kitamura-Abe, S., Shimbo, M., and Iida, Y. 1992. Analysis of context of 5′-splice site sequences in mammalian mRNA precursors by subclass method. Comput. Applic. Biosci. 8(4):367-376.
    • (1992) Comput. Applic. Biosci. , vol.8 , Issue.4 , pp. 367-376
    • Kudo, M.1    Kitamura-Abe, S.2    Shimbo, M.3    Iida, Y.4
  • 35
    • 0025320805
    • An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences
    • Lawrence, C.E., and Reilly, A. A. 1990. An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences. Proteins: Struct. Funct. Genet. 7:41-51.
    • (1990) Proteins: Struct. Funct. Genet. , vol.7 , pp. 41-51
    • Lawrence, C.E.1    Reilly, A.A.2
  • 36
    • 0027912333
    • Detecting subtle sequence signals: A Gibbs sampling strategy for multiple alignment
    • Lawrence, C.E., Altschul, S.F., Boguski, M. S., Liu, J.S., Neuwald, A.F., and Wootton, J. C. 1993. Detecting subtle sequence signals: A Gibbs sampling strategy for multiple alignment. Science 262:208-214.
    • (1993) Science , vol.262 , pp. 208-214
    • Lawrence, C.E.1    Altschul, S.F.2    Boguski, M.S.3    Liu, J.S.4    Neuwald, A.F.5    Wootton, J.C.6
  • 37
    • 0001116877
    • Binary codes capable of correcting deletions, insertions, and reversals
    • Levenshtein, V. I. 1966. Binary codes capable of correcting deletions, insertions, and reversals. Cybernetics and Control Theory 10:707-710.
    • (1966) Cybernetics and Control Theory , vol.10 , pp. 707-710
    • Levenshtein, V.I.1
  • 39
    • 0021919480
    • Rapid and sensitive protein similarity searches
    • Lipman, D.J., and Pearson, W.R. 1985. Rapid and sensitive protein similarity searches. Science 227:1435-1441.
    • (1985) Science , vol.227 , pp. 1435-1441
    • Lipman, D.J.1    Pearson, W.R.2
  • 40
    • 0024284078
    • A flexible multiple sequence alignment program
    • Martinez, H.M. 1988. A flexible multiple sequence alignment program. Nucl. Acids Res. 16(5):1683-1691.
    • (1988) Nucl. Acids Res. , vol.16 , Issue.5 , pp. 1683-1691
    • Martinez, H.M.1
  • 41
    • 0016942292
    • A space-economical suffix tree construction algorithm
    • McCreight, E.M. 1976. A space-economical suffix tree construction algorithm. J. ACM 23:262-272.
    • (1976) J. ACM , vol.23 , pp. 262-272
    • McCreight, E.M.1
  • 42
    • 0014757386
    • A general method applicable to the search for similarities in the amino acid sequence of two proteins
    • Needleman, S., and Wunsch, C. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48:443-454.
    • (1970) J. Mol. Biol. , vol.48 , pp. 443-454
    • Needleman, S.1    Wunsch, C.2
  • 43
    • 0028270992
    • Detecting patterns in protein sequences
    • Neuwald, A.F., and Green, P. 1994. Detecting patterns in protein sequences. J. Mol. Biol. 239:689-712.
    • (1994) J. Mol. Biol. , vol.239 , pp. 689-712
    • Neuwald, A.F.1    Green, P.2
  • 44
    • 0004520251
    • Ph.D. Dissertation, Yale University, Xerox Palo Alto Research Center, California
    • Nix, R.P. 1983. Editing by Example. Ph.D. Dissertation, Yale University, Xerox Palo Alto Research Center, California.
    • (1983) Editing by Example
    • Nix, R.P.1
  • 45
    • 0026728019
    • Construction of a dictionary of sequence motifs that characterize groups of related proteins
    • Ogiwara, A., Uchiyama, I., Seto, Y., and Kanehisa, M. 1992. Construction of a dictionary of sequence motifs that characterize groups of related proteins. Protein Engng. 5(6):479-488.
    • (1992) Protein Engng. , vol.5 , Issue.6 , pp. 479-488
    • Ogiwara, A.1    Uchiyama, I.2    Seto, Y.3    Kanehisa, M.4
  • 46
    • 0024604147
    • Prediction motifs derived from cytosine methyltransferases
    • Posfai, J., Bhagwat, A.S., Posfai, G., and Roberts, R.J. 1989. Prediction motifs derived from cytosine methyltransferases. Nucl. Acids Res. 17(7):2421-2435.
    • (1989) Nucl. Acids Res. , vol.17 , Issue.7 , pp. 2421-2435
    • Posfai, J.1    Bhagwat, A.S.2    Posfai, G.3    Roberts, R.J.4
  • 47
    • 0020055592
    • Improvements to a program for DNA analysis: A procedure to find homologies among many sequences
    • Queen, C., Wegman, M.N., and Korn, L.J. 1982. Improvements to a program for DNA analysis: a procedure to find homologies among many sequences. Nucl. Acids Res. 10:449-456.
    • (1982) Nucl. Acids Res. , vol.10 , pp. 449-456
    • Queen, C.1    Wegman, M.N.2    Korn, L.J.3
  • 48
    • 33744584654
    • Induction of decision trees
    • Quinlan, J.R. 1986. Induction of decision trees. Machine Learning 1:81-106.
    • (1986) Machine Learning , vol.1 , pp. 81-106
    • Quinlan, J.R.1
  • 49
    • 0018015137
    • Modeling by the shortest data description
    • Rissanen, J. 1978. Modeling by the shortest data description. Automatica-J.IFAC 14:465-471.
    • (1978) Automatica-J.IFAC , vol.14 , pp. 465-471
    • Rissanen, J.1
  • 50
    • 0026571052
    • A search for common patterns in many sequences
    • Roytberg, M.A. 1992. A search for common patterns in many sequences. Comput. Applic. Biosci. 8(1):57-64.
    • (1992) Comput. Applic. Biosci. , vol.8 , Issue.1 , pp. 57-64
    • Roytberg, M.A.1
  • 51
    • 84957713103
    • A double combinatorial approach to discovering patterns in biological sequences
    • Hirschberg, D., and Myers, G., eds., Springer-Verlag
    • Sagot, M.F., and Viari, A. 1996. A double combinatorial approach to discovering patterns in biological sequences. In Hirschberg, D., and Myers, G., eds., Combinatorial Pattern Matching, 186-208. Springer-Verlag.
    • (1996) Combinatorial Pattern Matching , pp. 186-208
    • Sagot, M.F.1    Viari, A.2
  • 55
    • 0028174103
    • Identification of sequence motifs from a set of proteins with related function
    • Saqi, M.A.S., and Sternberg, M.J.E. 1994. Identification of sequence motifs from a set of proteins with related function. Protein Engng. 7(2):165-171.
    • (1994) Protein Engng. , vol.7 , Issue.2 , pp. 165-171
    • Saqi, M.A.S.1    Sternberg, M.J.E.2
  • 57
    • 0029258764
    • Method for calculation of probability of matching a bounded regular expression in a random data string
    • Sewell, R.F., and Durbin, R. 1995. Method for calculation of probability of matching a bounded regular expression in a random data string. J. Comp. Biol. 2:25-31.
    • (1995) J. Comp. Biol. , vol.2 , pp. 25-31
    • Sewell, R.F.1    Durbin, R.2
  • 59
    • 0020916765
    • Polynomial time inference of extended regular pattern languages
    • Shinohara, T. 1983. Polynomial time inference of extended regular pattern languages. Lecture Notes in Computer Science 147:115-127.
    • (1983) Lecture Notes in Computer Science , vol.147 , pp. 115-127
    • Shinohara, T.1
  • 60
    • 0025141443
    • Automatic generation of primary sequence patterns from sets of related protein sequences
    • Smith, R.F., and Smith, T.F. 1990. Automatic generation of primary sequence patterns from sets of related protein sequences. In Proc. Natl. Acad. Sci. USA, 118-122.
    • (1990) Proc. Natl. Acad. Sci. USA , pp. 118-122
    • Smith, R.F.1    Smith, T.F.2
  • 61
    • 0019887799
    • Identification of common molecular subsequences
    • Smith, T., and Waterman, M. 1981. Identification of common molecular subsequences. J. Mol. Biol. 147:195-197.
    • (1981) J. Mol. Biol. , vol.147 , pp. 195-197
    • Smith, T.1    Waterman, M.2
  • 62
    • 0025017156
    • Finding sequence motifs in groups of functionally related proteins
    • Smith, H.O., Annau, T.M., and Chandrasegaran, S. 1990. Finding sequence motifs in groups of functionally related proteins. In Proc. Natl. Acad. Sci. USA, vol. 87, 826-830.
    • (1990) Proc. Natl. Acad. Sci. USA , vol.87 , pp. 826-830
    • Smith, H.O.1    Annau, T.M.2    Chandrasegaran, S.3
  • 63
    • 0024604438
    • Methods for calculating the probabilities of finding patterns in sequences
    • Staden, R. 1989a. Methods for calculating the probabilities of finding patterns in sequences. Comput. Applic. Biosci. 5:89-96.
    • (1989) Comput. Applic. Biosci. , vol.5 , pp. 89-96
    • Staden, R.1
  • 64
    • 0024385974
    • Methods for discovering novel motifs in nucleic acid sequences
    • Staden, R. 1989b. Methods for discovering novel motifs in nucleic acid sequences. Comput. Applic. Biosci. 5(4):293-298.
    • (1989) Comput. Applic. Biosci. , vol.5 , Issue.4 , pp. 293-298
    • Staden, R.1
  • 65
    • 0029557778
    • Searching for common sequence patterns among distantly related proteins
    • Suyama, M., Nishioka, T., and Oda, J. 1995. Searching for common sequence patterns among distantly related proteins. Protein Engng. 8(11):1075-1080.
    • (1995) Protein Engng. , vol.8 , Issue.11 , pp. 1075-1080
    • Suyama, M.1    Nishioka, T.2    Oda, J.3
  • 69
    • 0022591495
    • The classification of amino-acid conservation
    • Taylor, W.R. 1986. The classification of amino-acid conservation. J. Theoret. Biol. 119(2):205-218.
    • (1986) J. Theoret. Biol. , vol.119 , Issue.2 , pp. 205-218
    • Taylor, W.R.1
  • 70
    • 0006455454
    • Constructing suffix trees on-line in linear time
    • Ukkonen, E. 1992. Constructing suffix trees on-line in linear time. Information Processing 1:484-492.
    • (1992) Information Processing , vol.1 , pp. 484-492
    • Ukkonen, E.1
  • 71
    • 0021518106
    • A Theory of the Learnable
    • Valiant, L.G. 1984. A Theory of the Learnable. Comm. ACM 27(11):1134-1142.
    • (1984) Comm. ACM , vol.27 , Issue.11 , pp. 1134-1142
    • Valiant, L.G.1
  • 72
    • 0026036377
    • Motif recognition and alignment for many sequences by comparison of dot-matrices
    • Vingron, M., and Argos, P. 1991. Motif recognition and alignment for many sequences by comparison of dot-matrices. J. Mol. Biol. 218:33-43.
    • (1991) J. Mol. Biol. , vol.218 , pp. 33-43
    • Vingron, M.1    Argos, P.2
  • 73
    • 0028679709
    • On the complexity of multiple sequence alignment
    • Wang, L., and Jiang, T. 1994. On the complexity of multiple sequence alignment. J. Comp. Biol. 1(4):337-348.
    • (1994) J. Comp. Biol. , vol.1 , Issue.4 , pp. 337-348
    • Wang, L.1    Jiang, T.2
  • 74
    • 0027941109
    • Discovering active motifs in sets of related protein sequences and using them for classification
    • Wang, J.T.L., Marr, T.G., Shasha, D., Shapiro, B.A., and Chirn, G.-W. 1994. Discovering active motifs in sets of related protein sequences and using them for classification. Nucl. Acids Res. 22(14):2769-2775.
    • (1994) Nucl. Acids Res. , vol.22 , Issue.14 , pp. 2769-2775
    • Wang, J.T.L.1    Marr, T.G.2    Shasha, D.3    Shapiro, B.A.4    Chirn, G.-W.5
  • 75
    • 0021665479
    • Pattern recognition in several sequences: Consensus and alignment
    • Waterman, M.S., Arratia, R., and Galas, D.J. 1984. Pattern recognition in several sequences: Consensus and alignment. Bull. Math. Biol. 46(4):515-527.
    • (1984) Bull. Math. Biol. , vol.46 , Issue.4 , pp. 515-527
    • Waterman, M.S.1    Arratia, R.2    Galas, D.J.3
  • 76
    • 0029867505
    • Identification of functional elements in unaligned nucleic acid sequences by a novel tuple search algorithm
    • Wolfertstetter, F., Frech, K., Herrmann, G., and Werner, T. 1996. Identification of functional elements in unaligned nucleic acid sequences by a novel tuple search algorithm. Comput. Applic. Biosci. 12(1):71-80.
    • (1996) Comput. Applic. Biosci. , vol.12 , Issue.1 , pp. 71-80
    • Wolfertstetter, F.1    Frech, K.2    Herrmann, G.3    Werner, T.4
  • 77
    • 0029185456
    • Identification of protein motifs using conserved amino acid properties and partitioning techniques
    • Menlo Park, California: AAAI Press
    • Wu, T.D., and Brutlag, D.L. 1995. Identification of protein motifs using conserved amino acid properties and partitioning techniques. In Proc. of Third International Conference on Intelligent Systems for Molecular Biology, 402-410. Menlo Park, California: AAAI Press.
    • (1995) Proc. of Third International Conference on Intelligent Systems for Molecular Biology , pp. 402-410
    • Wu, T.D.1    Brutlag, D.L.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.