Volume 20, Issue 9, 2004, Pages 1335-1360

Automatic prediction of protein domains from sequence information using a hybrid learning system

Author keywords

[No Author keywords available]

Indexed keywords

ACCURACY; AMINO ACID SEQUENCE; ARTICLE; ARTIFICIAL NEURAL NETWORK; AUTOMATION; CALCULATION; CONTROLLED STUDY; CORRELATION ANALYSIS; INTERMETHOD COMPARISON; LEARNING; PREDICTION; PRIORITY JOURNAL; PROBABILITY; PROTEIN ANALYSIS; PROTEIN DOMAIN; PROTEIN STRUCTURE; QUANTITATIVE ANALYSIS; SCORING SYSTEM; SENSITIVITY ANALYSIS; SEQUENCE ALIGNMENT; SEQUENCE DATABASE

EID: 3142680264     PISSN: 13674803     EISSN: 13674811     Source Type: Journal    
DOI: 10.1093/bioinformatics/bth086     Document Type: Article
Times cited: 64

References (49)
  • 3
    • 0032893084
    • The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999
    • Bairoch,A. and Apweiler,R. (1999) The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999. Nucleic Acids Res., 27, 49-54.
    • (1999) Nucleic Acids Res. , vol.27 , pp. 49-54
    • Bairoch, A.1    Apweiler, R.2
  • 5
    • 0026069154
    • Development of hydrophobicity parameters to analyze proteins which bear post or cotranslational modifications
    • Black,S.D. and Mould,D.R. (1991) Development of hydrophobicity parameters to analyze proteins which bear post or cotranslational modifications. Anal. Biochem., 193, 72-82.
    • (1991) Anal. Biochem. , vol.193 , pp. 72-82
    • Black, S.D.1    Mould, D.R.2
  • 7
    • 0000387249
    • Strong limit theorems of empirical functionals for large exceedances of partial sums of i.i.d. variables
    • Dembo,A. and Karlin,S. (1991) Strong limit theorems of empirical functionals for large exceedances of partial sums of i.i.d. variables. Ann. Prob., 19, 1737-1755.
    • (1991) Ann. Prob. , vol.19 , pp. 1737-1755
    • Dembo, A.1    Karlin, S.2
  • 8
    • 0028279918
    • Self-organized neural maps of human protein sequences
    • Ferran,E.A., Pflugfelder,B. and Ferrara,P. (1994) Self-organized neural maps of human protein sequences. Protein Sci., 3, 507-521.
    • (1994) Protein Sci. , vol.3 , pp. 507-521
    • Ferran, E.A.1    Pflugfelder, B.2    Ferrara, P.3
  • 9
    • 0036721254
    • Protein domain identification and improved sequence similarity searching using PSI-BLAST
    • George,R.A. and Heringa,J. (2002a) Protein domain identification and improved sequence similarity searching using PSI-BLAST. Proteins, 48, 672-681.
    • (2002) Proteins , vol.48 , pp. 672-681
    • George, R.A.1    Heringa, J.2
  • 10
    • 0036306348
    • SnapDRAGON: A method to delineate protein structural domains from sequence data
    • George,R.A. and Heringa,J. (2002b) SnapDRAGON: a method to delineate protein structural domains from sequence data. J. Mol. Biol., 316, 839-851.
    • (2002) J. Mol. Biol. , vol.316 , pp. 839-851
    • George, R.A.1    Heringa, J.2
  • 12
    • 0027771868
    • On the ancient nature of introns
    • Gilbert,W. and Glynias,M. (1993) On the ancient nature of introns. Gene, 135, 137-144.
    • (1993) Gene , vol.135 , pp. 137-144
    • Gilbert, W.1    Glynias, M.2
  • 14
    • 0033563522
    • Whole genome protein domain analysis using a new method for domain clustering
    • Gouzy,J., Corpet,F. and Kahn,D. (1999) Whole genome protein domain analysis using a new method for domain clustering. Comput. Chem., 23, 333-340.
    • (1999) Comput. Chem. , vol.23 , pp. 333-340
    • Gouzy, J.1    Corpet, F.2    Kahn, D.3
  • 15
    • 0031876699
    • Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search and multiple sequence alignment. II. Delineation of domain boundaries from sequence similarity
    • Gracy,J. and Argos,P. (1998) Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search and multiple sequence alignment. II. Delineation of domain boundaries from sequence similarity. Bioinformatics, 14, 164-187.
    • (1998) Bioinformatics , vol.14 , pp. 164-187
    • Gracy, J.1    Argos, P.2
  • 16
    • 0031744176
    • Domain identification by clustering sequence alignments
    • Guan,X. and Du,L. (1998) Domain identification by clustering sequence alignments. Bioinformatics, 14, 783-788.
    • (1998) Bioinformatics , vol.14 , pp. 783-788
    • Guan, X.1    Du, L.2
  • 17
    • 0014256178
    • Contingency tables with given marginals
    • Ireland,C.T. and Kullback,S. (1968) Contingency tables with given marginals. Biometrika, 55, 179-189.
    • (1968) Biometrika , vol.55 , pp. 179-189
    • Ireland, C.T.1    Kullback, S.2
  • 19
    • 0026458378
    • Amino acid substitution matrices from protein blocks
    • Henikoff,S. and Henikoff,J.G. (1992) Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci., USA, 89, 10915-10919.
    • (1992) Proc. Natl. Acad. Sci. USA , vol.89 , pp. 10915-10919
    • Henikoff, S.1    Henikoff, J.G.2
  • 20
    • 0028043552
    • Position-based sequence weights
    • Henikoff,S. and Henikoff,J.G. (1994) Position-based sequence weights. J. Mol. Biol., 243, 574-578.
    • (1994) J. Mol. Biol. , vol.243 , pp. 574-578
    • Henikoff, S.1    Henikoff, J.G.2
  • 21
    • 0029977162
    • Using substitution probabilities to improve position-specific scoring matrices
    • Henikoff,J.G. and Henikoff,S. (1996) Using substitution probabilities to improve position-specific scoring matrices. Comput. Appl. Biosci., 12:2, 135-143.
    • (1996) Comput. Appl. Biosci. , vol.12 , Issue.2 , pp. 135-143
    • Henikoff, J.G.1    Henikoff, S.2
  • 22
    • 0028290005
    • Parser for protein folding units
    • Holm,L. and Sander,C. (1994) Parser for protein folding units. Proteins, 19, 256-268.
    • (1994) Proteins , vol.19 , pp. 256-268
    • Holm, L.1    Sander, C.2
  • 24
    • 0025259313
    • Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes
    • Karlin,S. and Altschul,S.F. (1990) Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc. Natl Acad. Sci., USA, 87, 2264-2268.
    • (1990) Proc. Natl. Acad. Sci. USA , vol.87 , pp. 2264-2268
    • Karlin, S.1    Altschul, S.F.2
  • 26
    • 0034493084
    • Automated search of natively folded protein fragments for high-throughput structure determination in structural genomics
    • Kuroda,Y., Tani,K., Matsuo,Y. and Yokoyama,S. (2000) Automated search of natively folded protein fragments for high-throughput structure determination in structural genomics. Protein Sci., 9, 2313-2321.
    • (2000) Protein Sci. , vol.9 , pp. 2313-2321
    • Kuroda, Y.1    Tani, K.2    Matsuo, Y.3    Yokoyama, S.4
  • 27
    • 0019588920
    • Folding units in globular proteins
    • Lesk,A.M. and Rose,G.D. (1981) Folding units in globular proteins. Proc. Natl Acad. Sci., USA, 78, 4304-4308.
    • (1981) Proc. Natl. Acad. Sci. USA , vol.78 , pp. 4304-4308
    • Lesk, A.M.1    Rose, G.D.2
  • 28
    • 0025952277
    • Divergence measures based on the Shannon entropy
    • Lin,J. (1991) Divergence measures based on the Shannon entropy. IEEE Trans. Info. Theory, 37, 145-151.
    • (1991) IEEE Trans. Info. Theory , vol.37 , pp. 145-151
    • Lin, J.1
  • 29
    • 0034044314
    • The PSIPRED protein structure prediction server
    • McGuffin,L.J., Bryson,K. and Jones,D.T. (2000) The PSIPRED protein structure prediction server. Bioinformatics, 16, 404-405.
    • (2000) Bioinformatics , vol.16 , pp. 404-405
    • McGuffin, L.J.1    Bryson, K.2    Jones, D.T.3
  • 30
    • 0036288851
    • Characterization and prediction of linker sequences of multi-domain proteins by a neural network
    • Miyazaki,S., Kuroda,Y. and Yokoyama,S. (2002) Characterization and prediction of linker sequences of multi-domain proteins by a neural network. J. Struct. Func. Genom., 15, 37-51.
    • (2002) J. Struct. Func. Genom. , vol.15 , pp. 37-51
    • Miyazaki, S.1    Kuroda, Y.2    Yokoyama, S.3
  • 31
    • 0034887296
    • Prediction of protein functional domains from sequences using artificial neural networks
    • Murvai,J., Vlahovicek,K., Szepesvari,C. and Pongor,S. (2001) Prediction of protein functional domains from sequences using artificial neural networks. Genome Res., 11, 1410-1417.
    • (2001) Genome Res. , vol.11 , pp. 1410-1417
    • Murvai, J.1    Vlahovicek, K.2    Szepesvari, C.3    Pongor, S.4
  • 32
    • 0028961335
    • SCOP: A structural classification of proteins database for the investigation of sequences and structures
    • Murzin,A.G., Brenner,S.E., Hubbard,T. and Chothia,C. (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol., 247, 536-540.
    • (1995) J. Mol. Biol. , vol.247 , pp. 536-540
    • Murzin, A.G.1    Brenner, S.E.2    Hubbard, T.3    Chothia, C.4
  • 34
    • 0031874788
    • DIVCLUS: An automatic method in the GEANFAMMER package that finds homologous domains in single- and multi-domain proteins
    • Park,J. and Teichmann,S.A. (1998) DIVCLUS: an automatic method in the GEANFAMMER package that finds homologous domains in single- and multi-domain proteins. Bioinformatics, 14, 144-150.
    • (1998) Bioinformatics , vol.14 , pp. 144-150
    • Park, J.1    Teichmann, S.A.2
  • 35
    • 0030821675
    • Correlated mutations contain information about protein-protein interaction
    • Pazos,F., Helmer-Citterich,M., Ausiello,G. and Valencia,A. (1997) Correlated mutations contain information about protein-protein interaction. J. Mol. Biol., 271, 511-523.
    • (1997) J. Mol. Biol. , vol.271 , pp. 511-523
    • Pazos, F.1    Helmer-Citterich, M.2    Ausiello, G.3    Valencia, A.4
  • 37
    • 0023989064
    • Improved tools for biological sequence comparison
    • Pearson,W.R. and Lipman,D.J. (1988) Improved tools for biological sequence comparison. Proc. Natl Acad. Sci., USA, 85, 2444-2448.
    • (1988) Proc. Natl. Acad. Sci. USA , vol.85 , pp. 2444-2448
    • Pearson, W.R.1    Lipman, D.J.2
  • 38
    • 0032919370
    • SMART: Identification and annotation of domains from signalling and extracellular protein sequences
    • Ponting,C.P., Schultz,J., Milpetz,F. and Bork,P. (1999) SMART: identification and annotation of domains from signalling and extracellular protein sequences. Nucleic Acids Res., 27, 229-232.
    • (1999) Nucleic Acids Res. , vol.27 , pp. 229-232
    • Ponting, C.P.1    Schultz, J.2    Milpetz, F.3    Bork, P.4
  • 39
    • 0036220048
    • Use of covariance analysis for the prediction of structural domain boundaries from multiple protein sequence alignments
    • Rigden,D.J. (2002) Use of covariance analysis for the prediction of structural domain boundaries from multiple protein sequence alignments. Protein Eng., 15, 65-77.
    • (2002) Protein Eng. , vol.15 , pp. 65-77
    • Rigden, D.J.1
  • 40
    • 0018782641
    • Hierarchic organization of domains in globular proteins
    • Rose,G.D. (1979) Hierarchic organization of domains in globular proteins. J. Mol. Biol., 134, 447-470.
    • (1979) J. Mol. Biol. , vol.134 , pp. 447-470
    • Rose, G.D.1
  • 41
    • 0033977982
    • EID: The Exon-Intron Database - An exhaustive database of protein-coding intron-containing genes
    • Saxonov,S., Daizadeh,I., Fedorov,A. and Gilbert,W. (2000) EID: the Exon-Intron Database - an exhaustive database of protein-coding intron-containing genes. Nucleic Acids Res., 28, 185-190.
    • (2000) Nucleic Acids Res. , vol.28 , pp. 185-190
    • Saxonov, S.1    Daizadeh, I.2    Fedorov, A.3    Gilbert, W.4
  • 42
    • 0028218683
    • Modular arrangement of proteins as inferred from analysis of homology
    • Sonnhammer,E.L.L. and Kahn,D. (1994) Modular arrangement of proteins as inferred from analysis of homology. Protein Sci., 3, 482-492.
    • (1994) Protein Sci. , vol.3 , pp. 482-492
    • Sonnhammer, E.L.L.1    Kahn, D.2
  • 43
    • 0030925920
    • PFam: A comprehensive database of protein domain families based on seed alignments
    • Sonnhammer,E.L., Eddy,S.R. and Durbin,R. (1997) PFam: a comprehensive database of protein domain families based on seed alignments. Proteins, 28, 405-420.
    • (1997) Proteins , vol.28 , pp. 405-420
    • Sonnhammer, E.L.1    Eddy, S.R.2    Durbin, R.3
  • 44
    • 0028914725
    • An automatic method involving cluster analysis of secondary structures for the identification of domains in proteins
    • Sowdhamini,R. and Blundell,T.L. (1995) An automatic method involving cluster analysis of secondary structures for the identification of domains in proteins. Protein Sci., 4, 506-520.
    • (1995) Protein Sci. , vol.4 , pp. 506-520
    • Sowdhamini, R.1    Blundell, T.L.2
  • 45
    • 0032951163
    • Protein structural domain identification
    • Taylor,W.R. (1999) Protein structural domain identification. Protein Eng., 12, 203-216.
    • (1999) Protein Eng. , vol.12 , pp. 203-216
    • Taylor, W.R.1
  • 46
    • 18444373540
    • The Protein Data Bank: Unifying the archive
    • Westbrook,J., Feng,Z., Jain,S. et al. (2002) The Protein Data Bank: unifying the archive. Nucleic Acids Res., 30, 245-248.
    • (2002) Nucleic Acids Res. , vol.30 , pp. 245-248
    • Westbrook, J.1    Feng, Z.2    Jain, S.3
  • 47
    • 0033753811
    • Domain size distributions can predict domain boundaries
    • Wheelan,S.J., Marchler-Bauer,A. and Bryant,S.H. (2000) Domain size distributions can predict domain boundaries. Bioinformatics, 16, 613-618.
    • (2000) Bioinformatics , vol.16 , pp. 613-618
    • Wheelan, S.J.1    Marchler-Bauer, A.2    Bryant, S.H.3
  • 48
    • 0034566188
    • Towards a complete map of the protein space based on a unified sequence and structure analysis of all known proteins
    • AAAI Press, Menlo Park
    • Yona,G. and Levitt,M. (2000) Towards a complete map of the protein space based on a unified sequence and structure analysis of all known proteins. Proceedings of ISMB 2000. AAAI Press, Menlo Park, pp. 395-406.
    • (2000) Proceedings of ISMB 2000 , pp. 395-406
    • Yona, G.1    Levitt, M.2
  • 49
    • 0032726692
    • ProtoMap: Automatic classification of protein sequences, a hierarchy of protein families, and local maps of the protein space
    • Yona,G., Linial,N. and Linial,M. (1999) ProtoMap: automatic classification of protein sequences, a hierarchy of protein families, and local maps of the protein space. Proteins, 37, 360-378.
    • (1999) Proteins , vol.37 , pp. 360-378
    • Yona, G.1    Linial, N.2    Linial, M.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.