Volume 353, Issue 3, 2005, Pages 744-759

Protein family clustering for structural genomics

Author keywords

Protein sequence clustering; Protein universe; SCOP benchmark; Structural genomics; Structure coverage

Indexed keywords

AMINO ACID SEQUENCE; ARTICLE; BACTERIAL GENOME; GENE CLUSTER; GENE STRUCTURE; GENETIC VARIABILITY; GENOME ANALYSIS; PRIORITY JOURNAL; PROTEIN DOMAIN; PROTEIN STRUCTURE; SENSITIVITY ANALYSIS; STRUCTURAL GENOMICS;

EID: 26244437278     PISSN: 00222836     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.jmb.2005.08.058     Document Type: Article
Times cited: 24

References (61)
  • 1
    • 4344592748
    • The impact of structural genomics on the protein data bank
    • H.M. Berman, and J.D. Westbrook The impact of structural genomics on the protein data bank Am. J. Pharmacogenomics 4 2004 247 252
    • (2004) Am. J. Pharmacogenomics , vol.4 , pp. 247-252
    • Berman, H.M.1    Westbrook, J.D.2
  • 2
    • 0014693695
    • A possible three-dimensional structure of bovine alpha-lactalbumin based on that of hen's egg-white lysozyme
    • W.J. Browne, A.C. North, D.C. Phillips, K. Brew, T.C. Vanaman, and R.L. Hill A possible three-dimensional structure of bovine alpha-lactalbumin based on that of hen's egg-white lysozyme J. Mol. Biol. 42 1969 65 86
    • (1969) J. Mol. Biol. , vol.42 , pp. 65-86
    • Browne, W.J.1    North, A.C.2    Phillips, D.C.3    Brew, K.4    Vanaman, T.C.5    Hill, R.L.6
  • 3
    • 0035487690
    • A tour of structural genomics
    • S.E. Brenner A tour of structural genomics Nature Rev. Genet. 2 2001 801 809
    • (2001) Nature Rev. Genet. , vol.2 , pp. 801-809
    • Brenner, S.E.1
  • 4
    • 0034674175
    • Estimating the number of protein folds and families from complete genome data
    • Y.I. Wolf, N.V. Grishin, and E.V. Koonin Estimating the number of protein folds and families from complete genome data J. Mol. Biol. 299 2000 897 905
    • (2000) J. Mol. Biol. , vol.299 , pp. 897-905
    • Wolf, Y.I.1    Grishin, N.V.2    Koonin, E.V.3
  • 7
    • 0036322695
    • Target space for structural genomics revisited
    • J. Liu, and B. Rost Target space for structural genomics revisited Bioinformatics 18 2002 922 933
    • (2002) Bioinformatics , vol.18 , pp. 922-933
    • Liu, J.1    Rost, B.2
  • 9
  • 10
    • 0035812694
    • Protein structure prediction and structural genomics
    • D. Baker, and A. Sali Protein structure prediction and structural genomics Science 294 2001 93 96
    • (2001) Science , vol.294 , pp. 93-96
    • Baker, D.1    Sali, A.2
  • 12
    • 0036384350
    • One fold with many functions: The evolutionary relationships between TIM barrel families based on their sequences, structures and functions
    • N. Nagano, C.A. Orengo, and J.M. Thornton One fold with many functions: the evolutionary relationships between TIM barrel families based on their sequences, structures and functions J. Mol. Biol. 321 2002 741 765
    • (2002) J. Mol. Biol. , vol.321 , pp. 741-765
    • Nagano, N.1    Orengo, C.A.2    Thornton, J.M.3
  • 13
    • 0016990971
    • The origin and evolution of protein superfamilies
    • M.O. Dayhoff The origin and evolution of protein superfamilies Fed. Proc. Fed. Am. Soc. Expt. Biol. 35 1976 2132 2138
    • (1976) Fed. Proc. Fed. Am. Soc. Expt. Biol. , vol.35 , pp. 2132-2138
    • Dayhoff, M.O.1
  • 14
    • 0037249633
    • The TIGRFAMs database of protein families
    • D.H. Haft, J.D. Selengut, and O. White The TIGRFAMs database of protein families Nucl. Acids Res. 31 2003 371 373
    • (2003) Nucl. Acids Res. , vol.31 , pp. 371-373
    • Haft, D.H.1    Selengut, J.D.2    White, O.3
  • 16
    • 0031857779
    • Automated protein sequence database classification. II. Delineation of domain boundaries from sequence similarities
    • J. Gracy, and P. Argos Automated protein sequence database classification. II. Delineation of domain boundaries from sequence similarities Bioinformatics 14 1998 174 187
    • (1998) Bioinformatics , vol.14 , pp. 174-187
    • Gracy, J.1    Argos, P.2
  • 17
    • 0031876699
    • Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search, and multiple sequence alignment
    • J. Gracy, and P. Argos Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search, and multiple sequence alignment Bioinformatics 14 1998 164 173
    • (1998) Bioinformatics , vol.14 , pp. 164-173
    • Gracy, J.1    Argos, P.2
  • 18
    • 0036529479
    • An efficient algorithm for large-scale detection of protein families
    • A.J. Enright, S. Van Dongen, and C.A. Ouzounis An efficient algorithm for large-scale detection of protein families Nucl. Acids Res. 30 2002 1575 1584
    • (2002) Nucl. Acids Res. , vol.30 , pp. 1575-1584
    • Enright, A.J.1    Van Dongen, S.2    Ouzounis, C.A.3
  • 19
    • 4444270864
    • Cluster-C, an algorithm for the large-scale clustering of protein sequences based on the extraction of maximal cliques
    • S. Mohseni-Zadeh, P. Brezellec, and J.L. Risler Cluster-C, an algorithm for the large-scale clustering of protein sequences based on the extraction of maximal cliques Comput. Biol. Chem. 28 2004 211 218
    • (2004) Comput. Biol. Chem. , vol.28 , pp. 211-218
    • Mohseni-Zadeh, S.1    Brezellec, P.2    Risler, J.L.3
  • 20
    • 0036850213
    • The SUPERFAMILY database in structural genomics
    • J. Gough The SUPERFAMILY database in structural genomics Acta Crystallog. Sect. D 58 2002 1897 1900
    • (2002) Acta Crystallog. Sect. D , vol.58 , pp. 1897-1900
    • Gough, J.1
  • 21
    • 0035070578
    • Picasso: Generating a covering set of protein family profiles
    • A. Heger, and L. Holm Picasso: generating a covering set of protein family profiles Bioinformatics 17 2001 272 279
    • (2001) Bioinformatics , vol.17 , pp. 272-279
    • Heger, A.1    Holm, L.2
  • 22
    • 0033965852
    • ProtoMap: Automatic classification of protein sequences and hierarchy of protein families
    • G. Yona, N. Linial, and M. Linial ProtoMap: automatic classification of protein sequences and hierarchy of protein families Nucl. Acids Res. 28 2000 49 55
    • (2000) Nucl. Acids Res. , vol.28 , pp. 49-55
    • Yona, G.1    Linial, N.2    Linial, M.3
  • 23
    • 0035167151
    • iProClass: An integrated, comprehensive and annotated protein classification database
    • C.H. Wu, C. Xiao, Z. Hou, H. Huang, and W.C. Barker iProClass: an integrated, comprehensive and annotated protein classification database Nucl. Acids Res. 29 2001 52 54
    • (2001) Nucl. Acids Res. , vol.29 , pp. 52-54
    • Wu, C.H.1    Xiao, C.2    Hou, Z.3    Huang, H.4    Barker, W.C.5
  • 27
    • 4444298729
    • Profile-profile methods provide improved fold-recognition: A study of different profile-profile alignment methods
    • T. Ohlson, B. Wallner, and A. Elofsson Profile-profile methods provide improved fold-recognition: a study of different profile-profile alignment methods Proteins: Struct. Funct. Genet. 57 2004 188 197
    • (2004) Proteins: Struct. Funct. Genet. , vol.57 , pp. 188-197
    • Ohlson, T.1    Wallner, B.2    Elofsson, A.3
  • 28
    • 2942544229
    • COACH: Profile-profile alignment of protein families using hidden Markov models
    • R.C. Edgar, and K. Sjolander COACH: profile-profile alignment of protein families using hidden Markov models Bioinformatics 20 2004 1309 1318
    • (2004) Bioinformatics , vol.20 , pp. 1309-1318
    • Edgar, R.C.1    Sjolander, K.2
  • 29
    • 0037423702
    • COMPASS: A tool for comparison of multiple protein alignments with assessment of statistical significance
    • R. Sadreyev, and N. Grishin COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance J. Mol. Biol. 326 2003 317 336
    • (2003) J. Mol. Biol. , vol.326 , pp. 317-336
    • Sadreyev, R.1    Grishin, N.2
  • 30
    • 0030983386
    • Population statistics of protein structures: Lessons from structural classifications
    • S.E. Brenner, C. Chothia, and T.J. Hubbard Population statistics of protein structures: lessons from structural classifications Curr. Opin. Struct. Biol. 7 1997 369 376
    • (1997) Curr. Opin. Struct. Biol. , vol.7 , pp. 369-376
    • Brenner, S.E.1    Chothia, C.2    Hubbard, T.J.3
  • 31
    • 0027122748
    • Proteins. One thousand families for the molecular biologist
    • C. Chothia Proteins. One thousand families for the molecular biologist Nature 357 1992 543 544
    • (1992) Nature , vol.357 , pp. 543-544
    • Chothia, C.1
  • 32
    • 0029785147
    • Mapping the protein universe
    • L. Holm, and C. Sander Mapping the protein universe Science 273 1996 595 603
    • (1996) Science , vol.273 , pp. 595-603
    • Holm, L.1    Sander, C.2
  • 33
    • 0031815861
    • A re-estimation for the total numbers of protein folds and superfamilies
    • Z.X. Wang A re-estimation for the total numbers of protein folds and superfamilies Protein Eng. 11 1998 621 626
    • (1998) Protein Eng. , vol.11 , pp. 621-626
    • Wang, Z.X.1
  • 34
    • 0032545151
    • Estimating the number of protein folds
    • C. Zhang, and C. DeLisi Estimating the number of protein folds J. Mol. Biol. 284 1998 1301 1305
    • (1998) J. Mol. Biol. , vol.284 , pp. 1301-1305
    • Zhang, C.1    DeLisi, C.2
  • 36
    • 2942585636
    • Progress towards mapping the universe of protein folds
    • A. Grant, D. Lee, and C. Orengo Progress towards mapping the universe of protein folds Genome Biol. 5 2004 107
    • (2004) Genome Biol. , vol.5 , pp. 107
    • Grant, A.1    Lee, D.2    Orengo, C.3
  • 38
    • 0141815541
    • Analysis of singleton ORFans in fully sequenced microbial genomes
    • N. Siew, and D. Fischer Analysis of singleton ORFans in fully sequenced microbial genomes Proteins: Struct. Funct. Genet. 53 2003 241 251
    • (2003) Proteins: Struct. Funct. Genet. , vol.53 , pp. 241-251
    • Siew, N.1    Fischer, D.2
  • 40
    • 0031793928
    • Iterated profile searches with PSI-BLAST-a tool for discovery in protein databases
    • S.F. Altschul, and E.V. Koonin Iterated profile searches with PSI-BLAST-a tool for discovery in protein databases Trends Biochem. Sci. 23 1998 444 447
    • (1998) Trends Biochem. Sci. , vol.23 , pp. 444-447
    • Altschul, S.F.1    Koonin, E.V.2
  • 41
    • 0028501914
    • Non-globular domains in protein sequences: Automated segmentation using complexity measures
    • J.C. Wootton Non-globular domains in protein sequences: automated segmentation using complexity measures Comput. Chem. 18 1994 269 285
    • (1994) Comput. Chem. , vol.18 , pp. 269-285
    • Wootton, J.C.1
  • 42
    • 3242891265
    • CHOP: Parsing proteins into structural domains
    • J. Liu, and B. Rost CHOP: parsing proteins into structural domains Nucl. Acids Res. 32 2004 W569 W571
    • (2004) Nucl. Acids Res. , vol.32 , pp. W569-W571
    • Liu, J.1    Rost, B.2
  • 43
    • 3042810008
    • Sequence-based prediction of protein domains
    • J. Liu, and B. Rost Sequence-based prediction of protein domains Nucl. Acids Res. 32 2004 3522 3530
    • (2004) Nucl. Acids Res. , vol.32 , pp. 3522-3530
    • Liu, J.1    Rost, B.2
  • 45
    • 0032726692
    • ProtoMap: Automatic classification of protein sequences, a hierarchy of protein families, and local maps of the protein space
    • G. Yona, N. Linial, and M. Linial ProtoMap: automatic classification of protein sequences, a hierarchy of protein families, and local maps of the protein space Proteins: Struct. Funct. Genet. 37 1999 360 378
    • (1999) Proteins: Struct. Funct. Genet. , vol.37 , pp. 360-378
    • Yona, G.1    Linial, N.2    Linial, M.3
  • 46
    • 0036209245
    • CASA: A server for the critical assessment of protein sequence alignment accuracy
    • R.Y. Kahsay, G. Wang, N. Dongre, G. Gao, and R.L. Dunbrack Jr CASA: a server for the critical assessment of protein sequence alignment accuracy Bioinformatics 18 2002 496 497
    • (2002) Bioinformatics , vol.18 , pp. 496-497
    • Kahsay, R.Y.1    Wang, G.2    Dongre, N.3    Gao, G.4    Dunbrack Jr., R.L.5
  • 47
    • 0032568596
    • Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships
    • S.E. Brenner, C. Chothia, and T.J. Hubbard Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships Proc. Natl Acad. Sci. USA 95 1998 6073 6078
    • (1998) Proc. Natl Acad. Sci. USA , vol.95 , pp. 6073-6078
    • Brenner, S.E.1    Chothia, C.2    Hubbard, T.J.3
  • 48
    • 0031576361
    • Intermediate sequences increase the detection of homology between sequences
    • J. Park, S.A. Teichmann, T. Hubbard, and C. Chothia Intermediate sequences increase the detection of homology between sequences J. Mol. Biol. 273 1997 349 354
    • (1997) J. Mol. Biol. , vol.273 , pp. 349-354
    • Park, J.1    Teichmann, S.A.2    Hubbard, T.3    Chothia, C.4
  • 49
    • 0033940118
    • RSDB: Representative protein sequence databases have high information content
    • J. Park, L. Holm, A. Heger, and C. Chothia RSDB: representative protein sequence databases have high information content Bioinformatics 16 2000 458 464
    • (2000) Bioinformatics , vol.16 , pp. 458-464
    • Park, J.1    Holm, L.2    Heger, A.3    Chothia, C.4
  • 50
    • 0033944826
    • GeneRAGE: A robust algorithm for sequence clustering and domain detection
    • A.J. Enright, and C.A. Ouzounis GeneRAGE: a robust algorithm for sequence clustering and domain detection Bioinformatics 16 2000 451 457
    • (2000) Bioinformatics , vol.16 , pp. 451-457
    • Enright, A.J.1    Ouzounis, C.A.2
  • 51
    • 0034855972
    • Evaluation of protein multiple alignments by SAM-T99 using the BAliBASE multiple alignment test set
    • K. Karplus, and B. Hu Evaluation of protein multiple alignments by SAM-T99 using the BAliBASE multiple alignment test set Bioinformatics 17 2001 713 720
    • (2001) Bioinformatics , vol.17 , pp. 713-720
    • Karplus, K.1    Hu, B.2
  • 52
    • 0035910270
    • Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes
    • A. Krogh, B. Larsson, G. von Heijne, and E.L. Sonnhammer Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes J. Mol. Biol. 305 2001 567 580
    • (2001) J. Mol. Biol. , vol.305 , pp. 567-580
    • Krogh, A.1    Larsson, B.2    Von Heijne, G.3    Sonnhammer, E.L.4
  • 54
    • 0032509105
    • Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods
    • J. Park, K. Karplus, C. Barrett, R. Hughey, D. Haussler, T. Hubbard, and C. Chothia Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods J. Mol. Biol. 284 1998 1201 1210
    • (1998) J. Mol. Biol. , vol.284 , pp. 1201-1210
    • Park, J.1    Karplus, K.2    Barrett, C.3    Hughey, R.4    Haussler, D.5    Hubbard, T.6    Chothia, C.7
  • 56
    • 0022706389
    • The relation between the divergence of sequence and structure in proteins
    • C. Chothia, and A.M. Lesk The relation between the divergence of sequence and structure in proteins EMBO J. 5 1986 823 826
    • (1986) EMBO J. , vol.5 , pp. 823-826
    • Chothia, C.1    Lesk, A.M.2
  • 57
    • 0036140266
    • A unifold, mesofold, and superfold model of protein fold use
    • A.F. Coulson, and J. Moult A unifold, mesofold, and superfold model of protein fold use Proteins: Struct. Funct. Genet. 46 2002 61 71
    • (2002) Proteins: Struct. Funct. Genet. , vol.46 , pp. 61-71
    • Coulson, A.F.1    Moult, J.2
  • 59
    • 0036307754
    • The geometry of domain combination in proteins
    • M. Bashton, and C. Chothia The geometry of domain combination in proteins J. Mol. Biol. 315 2002 927 939
    • (2002) J. Mol. Biol. , vol.315 , pp. 927-939
    • Bashton, M.1    Chothia, C.2
  • 61
    • 0035065485
    • SNPs, protein structure, and disease
    • Z. Wang, and J. Moult SNPs, protein structure, and disease Hum. Mutat. 17 2001 263 270
    • (2001) Hum. Mutat. , vol.17 , pp. 263-270
    • Wang, Z.1    Moult, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.