-
1
-
-
0034708480
-
-
M. D. Adams et al., Science 287, 2185 (2000); C. elegans Sequencing Consortium, Science 282, 2012 (1998);
-
(2000)
Science
, vol.287
, pp. 2185
-
-
Adams, M.D.1
-
2
-
-
0032509302
-
-
M. D. Adams et al., Science 287, 2185 (2000); C. elegans Sequencing Consortium, Science 282, 2012 (1998);
-
(1998)
Science
, vol.282
, pp. 2012
-
-
-
3
-
-
10244239321
-
-
A. Goffeau et al., Science 274, 546 (1996).
-
(1996)
Science
, vol.274
, pp. 546
-
-
Goffeau, A.1
-
5
-
-
0343252565
-
-
note
-
C. elegans data were taken from A C. Elegans Database (ACEDB) release WS8.
-
-
-
-
6
-
-
0343688238
-
-
note
-
Local gene duplications were determined by searching for N similar genes within 2N genes on each arm. For example, if three similar genes are found within a region containing six genes, this counts as one cluster of three genes. Genes were judged to be similar if a BLASTP High Scoring Pair (HSP) with a score of 200 or more existed between them. Histone gene clusters were not included. C. elegans data were taken from ACEDB release WS8, containing 18,424 genes.
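The "N similar genes within 2N genes" rule can be illustrated with a minimal sketch. It assumes similarity has already been reduced to one family label per gene (for instance from BLASTP HSP scores of 200 or more); the function name `local_clusters`, the family labels, and the greedy choice of the largest qualifying run are illustrative assumptions, not the procedure actually used.

```python
from collections import defaultdict

def local_clusters(families, min_size=2):
    """Find local clusters on one chromosome arm.

    families: one entry per gene, in chromosomal order; each entry is a
    hypothetical family label, or None for a gene with no similar partner.
    A run of N same-family genes counts as a cluster when the region from
    its first to its last member contains at most 2N genes.
    """
    positions = defaultdict(list)
    for idx, fam in enumerate(families):
        if fam is not None:
            positions[fam].append(idx)

    clusters = []
    for fam, pos in positions.items():
        i = 0
        while i < len(pos):
            best = None
            # Look for the longest run starting at pos[i] that fits in 2N genes.
            for j in range(i + min_size - 1, len(pos)):
                n = j - i + 1
                span = pos[j] - pos[i] + 1
                if span <= 2 * n:
                    best = j
            if best is None:
                i += 1
            else:
                clusters.append((fam, pos[i:best + 1]))
                i = best + 1
    return clusters

# Three "a" genes inside a six-gene region form one cluster of three;
# the two "b" genes are too far apart (six genes for N = 2).
arm = ["a", None, "a", None, None, "a", "b", None, None, None, None, "b"]
clusters = local_clusters(arm)
```

Here `clusters` holds a single entry for family "a" at positions 0, 2, and 5, matching the worked example in the note.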
-
-
-
-
7
-
-
0342383035
-
-
More information about GO is available at http://www.geneontology.org/. The Gene Ontology project provides terms for categorizing gene products on the basis of their molecular function, biological role, and cellular location using controlled vocabularies.
-
-
-
-
8
-
-
0342383034
-
-
note
-
Initial results came from an NxN BLASTP analysis performed for each fly, worm, and yeast sequence in a combined data set of these completed proteomes. The databases used are as follows: Celera-Berkeley Drosophila Genome Project (BDGP), 14,195 predicted protein sequences (1/5/2000); WormPep 18, Sanger Centre, 18,576 protein sequences; and Saccharomyces Genome Database (SGD), 6306 protein sequences (1/7/2000). A version of NCBI-BLAST2 was used with the SEG filter and with the effective search space length (Y option) set to 17,973,263. Pairs were formed between every query sequence with a significant BLASTP hit to one of the other organisms' sequences. Significance was based on E-value cutoffs and length of match. These pairs were then independently grouped using single linkage clustering (61). Finally, the number of proteins from each proteome was counted. The requirement for 80% alignment of sequences makes this method of defining orthology particularly sensitive to errors that arise from incorrect protein prediction. However, the results comparing yeast and worm are essentially identical to those previously reported (61), even though the effective database size was different, the data sets have changed (Chervitz: yeast 6217 and worm 19,099; this study: yeast 6306 and worm 18,576), and the version of BLAST used is quite different (Chervitz: WashU BLAST 2.0a19MP; this study: NCBI BLAST 2.08).
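The grouping step described above (single linkage clustering of significant pairs, followed by a per-proteome count) can be sketched with a union-find structure. This is only an illustration: the identifiers, the `species:name` naming scheme, and the function names are assumptions, and the E-value and match-length filtering is taken to have happened before the pairs are passed in.

```python
from collections import Counter

def single_linkage(pairs):
    """Group sequences so that any two joined by a chain of pairs share a group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())

def count_by_proteome(group):
    """Count members per proteome, assuming hypothetical 'species:name' IDs."""
    return Counter(member.split(":", 1)[0] for member in group)

# Hypothetical significant cross-species pairs.
pairs = [("fly:A", "worm:B"), ("worm:B", "yeast:C"), ("fly:D", "yeast:E")]
groups = sorted(single_linkage(pairs), key=len, reverse=True)
counts = [count_by_proteome(g) for g in groups]
```

Because linkage is transitive, fly:A and yeast:C end up in the same group even though they were never paired directly, which is why a single mispredicted protein can merge otherwise separate groups.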
-
-
-
-
10
-
-
0033977579
-
-
J. G. Henikoff, E. A. Greene, S. Pietrokovski, S. Henikoff, Nucleic Acids Res. 28, 228 (2000).
-
(2000)
Nucleic Acids Res.
, vol.28
, pp. 228
-
-
Henikoff, J.G.1
Greene, E.A.2
Pietrokovski, S.3
Henikoff, S.4
-
11
-
-
0032944238
-
-
InterPro (Integrated resource for protein domains and functional sites) is a collaborative effort of the SWISS-PROT, TrEMBL, PROSITE, PRINTS, Pfam, and ProDom databases to integrate the different pattern databases into a single resource. The database and a detailed description of the project can be found under http://www.ebi.ac.uk/interpro/. PROSITE is described in K. Hofmann, P. Bucher, L. Falquet, A. Bairoch, Nucleic Acids Res. 27, 215 (1999); PFAM is described in A. Bateman et al., Nucleic Acids Res. 27, 260 (1999); and PRINTS is described in T. K. Attwood et al., Nucleic Acids Res. 27, 220 (1999).
-
(1999)
Nucleic Acids Res.
, vol.27
, pp. 215
-
-
Hofmann, K.1
Bucher, P.2
Falquet, L.3
Bairoch, A.4
-
12
-
-
0032952229
-
-
InterPro (Integrated resource for protein domains and functional sites) is a collaborative effort of the SWISS-PROT, TrEMBL, PROSITE, PRINTS, Pfam, and ProDom databases to integrate the different pattern databases into a single resource. The database and a detailed description of the project can be found under http://www.ebi.ac.uk/interpro/. PROSITE is described in K. Hofmann, P. Bucher, L. Falquet, A. Bairoch, Nucleic Acids Res. 27, 215 (1999); PFAM is described in A. Bateman et al., Nucleic Acids Res. 27, 260 (1999); and PRINTS is described in T. K. Attwood et al., Nucleic Acids Res. 27, 220 (1999).
-
(1999)
Nucleic Acids Res.
, vol.27
, pp. 260
-
-
Bateman, A.1
-
13
-
-
0032918028
-
-
InterPro (Integrated resource for protein domains and functional sites) is a collaborative effort of the SWISS-PROT, TrEMBL, PROSITE, PRINTS, Pfam, and ProDom databases to integrate the different pattern databases into a single resource. The database and a detailed description of the project can be found under http://www.ebi.ac.uk/interpro/. PROSITE is described in K. Hofmann, P. Bucher, L. Falquet, A. Bairoch, Nucleic Acids Res. 27, 215 (1999); PFAM is described in A. Bateman et al., Nucleic Acids Res. 27, 260 (1999); and PRINTS is described in T. K. Attwood et al., Nucleic Acids Res. 27, 220 (1999).
-
(1999)
Nucleic Acids Res.
, vol.27
, pp. 220
-
-
Attwood, T.K.1
-
14
-
-
0033598830
-
-
G. D. Plowman, S. Sudarsanam, J. Bingham, D. Whyte, T. Hunter, Proc. Natl. Acad. Sci. U.S.A. 96, 13603 (1999).
-
(1999)
Proc. Natl. Acad. Sci. U.S.A.
, vol.96
, pp. 13603
-
-
Plowman, G.D.1
Sudarsanam, S.2
Bingham, J.3
Whyte, D.4
Hunter, T.5
-
15
-
-
0003931970
-
-
Academic Press, San Diego, CA
-
J. Barrett, N. D. Rawlings, J. F. Woessner, Eds., Handbook of Proteolytic Enzymes (Academic Press, San Diego, CA, 1998).
-
(1998)
Handbook of Proteolytic Enzymes
-
-
Barrett, J.1
Rawlings, N.D.2
Woessner, J.F.3
-
16
-
-
0028278901
-
-
C. L. Smith and R. DeLotto, Nature 368, 548 (1994); K. D. Konrad, T. J. Goralski, A. P. Mahowald, J. L. Marsh, Proc. Natl. Acad. Sci. U.S.A. 95, 6819 (1998); E. K. LeMosy, C. C. Hong, C. Hashimoto, Trends Cell Biol. 9, 102 (1999).
-
(1994)
Nature
, vol.368
, pp. 548
-
-
Smith, C.L.1
DeLotto, R.2
-
17
-
-
0032499717
-
-
C. L. Smith and R. DeLotto, Nature 368, 548 (1994); K. D. Konrad, T. J. Goralski, A. P. Mahowald, J. L. Marsh, Proc. Natl. Acad. Sci. U.S.A. 95, 6819 (1998); E. K. LeMosy, C. C. Hong, C. Hashimoto, Trends Cell Biol. 9, 102 (1999).
-
(1998)
Proc. Natl. Acad. Sci. U.S.A.
, vol.95
, pp. 6819
-
-
Konrad, K.D.1
Goralski, T.J.2
Mahowald, A.P.3
Marsh, J.L.4
-
18
-
-
0033106265
-
-
C. L. Smith and R. DeLotto, Nature 368, 548 (1994); K. D. Konrad, T. J. Goralski, A. P. Mahowald, J. L. Marsh, Proc. Natl. Acad. Sci. U.S.A. 95, 6819 (1998); E. K. LeMosy, C. C. Hong, C. Hashimoto, Trends Cell Biol. 9, 102 (1999).
-
(1999)
Trends Cell Biol.
, vol.9
, pp. 102
-
-
LeMosy, E.K.1
Hong, C.C.2
Hashimoto, C.3
-
20
-
-
0029777325
-
-
P. Bork, A. K. Downing, B. Kieffer, I. D. Campbell, Quart. Rev. Biophys. 29, 119 (1996).
-
(1996)
Quart. Rev. Biophys.
, vol.29
, pp. 119
-
-
Bork, P.1
Downing, A.K.2
Kieffer, B.3
Campbell, I.D.4
-
21
-
-
0028845598
-
-
P. Vernier, B. Cardinaud, O. Valdenaire, H. Philippe, J.-D. Vincent, Trends Pharmacol. Sci. 16, 375 (1995); J. Colas, J. Launay, J. Vonesch, P. Hickel, L. Maroteaux, Mech. Dev. 87, 77 (1999); M. R. Costa, E. T. Wilson, E. Wieschaus, Cell 76, 1075 (1994).
-
(1995)
Trends Pharmacol. Sci.
, vol.16
, pp. 375
-
-
Vernier, P.1
Cardinaud, B.2
Valdenaire, O.3
Philippe, H.4
Vincent, J.-D.5
-
22
-
-
0032819494
-
-
P. Vernier, B. Cardinaud, O. Valdenaire, H. Philippe, J.-D. Vincent, Trends Pharmacol. Sci. 16, 375 (1995); J. Colas, J. Launay, J. Vonesch, P. Hickel, L. Maroteaux, Mech. Dev. 87, 77 (1999); M. R. Costa, E. T. Wilson, E. Wieschaus, Cell 76, 1075 (1994).
-
(1999)
Mech. Dev.
, vol.87
, pp. 77
-
-
Colas, J.1
Launay, J.2
Vonesch, J.3
Hickel, P.4
Maroteaux, L.5
-
23
-
-
0028226299
-
-
P. Vernier, B. Cardinaud, O. Valdenaire, H. Philippe, J.-D. Vincent, Trends Pharmacol. Sci. 16, 375 (1995); J. Colas, J. Launay, J. Vonesch, P. Hickel, L. Maroteaux, Mech. Dev. 87, 77 (1999); M. R. Costa, E. T. Wilson, E. Wieschaus, Cell 76, 1075 (1994).
-
(1994)
Cell
, vol.76
, pp. 1075
-
-
Costa, M.R.1
Wilson, E.T.2
Wieschaus, E.3
-
26
-
-
0033082410
-
-
P. J. Clyne et al., Neuron 22, 327 (1999); L. B. Vosshall, H. Amrein, P. S. Morozov, A. Rzhetsky, R. Axel, Cell 96, 725 (1999); P. P. Laissue et al., J. Comp. Neurol. 405, 543 (1999).
-
(1999)
Neuron
, vol.22
, pp. 327
-
-
Clyne, P.J.1
-
27
-
-
0033525896
-
-
P. J. Clyne et al., Neuron 22, 327 (1999); L. B. Vosshall, H. Amrein, P. S. Morozov, A. Rzhetsky, R. Axel, Cell 96, 725 (1999); P. P. Laissue et al., J. Comp. Neurol. 405, 543 (1999).
-
(1999)
Cell
, vol.96
, pp. 725
-
-
Vosshall, L.B.1
Amrein, H.2
Morozov, P.S.3
Rzhetsky, A.4
Axel, R.5
-
28
-
-
0033594145
-
-
P. J. Clyne et al., Neuron 22, 327 (1999); L. B. Vosshall, H. Amrein, P. S. Morozov, A. Rzhetsky, R. Axel, Cell 96, 725 (1999); P. P. Laissue et al., J. Comp. Neurol. 405, 543 (1999).
-
(1999)
J. Comp. Neurol.
, vol.405
, pp. 543
-
-
Laissue, P.P.1
-
31
-
-
0028834902
-
-
S. N. Jones, A. E. Roe, L. A. Donehower, A. Bradley, Nature 378, 206 (1995).
-
(1995)
Nature
, vol.378
, pp. 206
-
-
Jones, S.N.1
Roe, A.E.2
Donehower, L.A.3
Bradley, A.4
-
32
-
-
0030996903
-
-
I. The et al., Science 276, 791 (1997).
-
(1997)
Science
, vol.276
, pp. 791
-
-
The, I.1
-
36
-
-
0028783413
-
-
P. R. Mueller, T. R. Coleman, A. Kumagai, W. G. Dunphy, Science 270, 86 (1995).
-
(1995)
Science
, vol.270
, pp. 86
-
-
Mueller, P.R.1
Coleman, T.R.2
Kumagai, A.3
Dunphy, W.G.4
-
37
-
-
0028298169
-
-
B. D. Dynlacht, A. Brook, M. Dembski, L. Yenush, N. Dyson, Proc. Natl. Acad. Sci. U.S.A. 91, 6359 (1994); W. Du, M. Vidal, J.-E. Xie, N. Dyson, Genes Dev. 10, 1206 (1996); T. Sawado et al., Biochem. Biophys. Res. Commun. 251, 409 (1998).
-
(1994)
Proc. Natl. Acad. Sci. U.S.A.
, vol.91
, pp. 6359
-
-
Dynlacht, B.D.1
Brook, A.2
Dembski, M.3
Yenush, L.4
Dyson, N.5
-
38
-
-
0029894254
-
-
B. D. Dynlacht, A. Brook, M. Dembski, L. Yenush, N. Dyson, Proc. Natl. Acad. Sci. U.S.A. 91, 6359 (1994); W. Du, M. Vidal, J.-E. Xie, N. Dyson, Genes Dev. 10, 1206 (1996); T. Sawado et al., Biochem. Biophys. Res. Commun. 251, 409 (1998).
-
(1996)
Genes Dev.
, vol.10
, pp. 1206
-
-
Du, W.1
Vidal, M.2
Xie, J.-E.3
Dyson, N.4
-
39
-
-
0032552957
-
-
B. D. Dynlacht, A. Brook, M. Dembski, L. Yenush, N. Dyson, Proc. Natl. Acad. Sci. U.S.A. 91, 6359 (1994); W. Du, M. Vidal, J.-E. Xie, N. Dyson, Genes Dev. 10, 1206 (1996); T. Sawado et al., Biochem. Biophys. Res. Commun. 251, 409 (1998).
-
(1998)
Biochem. Biophys. Res. Commun.
, vol.251
, pp. 409
-
-
Sawado, T.1
-
44
-
-
0033534575
-
-
A. Desai, S. Verma, T. J. Mitchison, C. E. Walczak, Cell 96, 69 (1999).
-
(1999)
Cell
, vol.96
, pp. 69
-
-
Desai, A.1
Verma, S.2
Mitchison, T.J.3
Walczak, C.E.4
-
45
-
-
0342383026
-
-
K. Weber, in (29), pp. 291-293.
-
, vol.29
, pp. 291-293
-
-
Weber, K.1
-
49
-
-
0029730738
-
-
M. P. Belvin and K. V. Anderson, Annu. Rev. Cell Dev. Biol. 12, 393 (1995); M. Hammerschmidt, A. Brook, A. P. McMahon, Trends Genet. 13, 14 (1997);
-
(1995)
Annu. Rev. Cell Dev. Biol.
, vol.12
, pp. 393
-
-
Belvin, M.P.1
Anderson, K.V.2
-
50
-
-
0031051385
-
-
M. P. Belvin and K. V. Anderson, Annu. Rev. Cell Dev. Biol. 12, 393 (1995); M. Hammerschmidt, A. Brook, A. P. McMahon, Trends Genet. 13, 14 (1997);
-
(1997)
Trends Genet.
, vol.13
, pp. 14
-
-
Hammerschmidt, M.1
Brook, A.2
McMahon, A.P.3
-
56
-
-
0028598861
-
-
P. W. H. Holland, J. Garcia-Fernandez, N. A. Williams, A. Sidow, Development (suppl.) (1994), p. 125.
-
(1994)
Development
, Issue.SUPPL.
, pp. 125
-
-
Holland, P.W.H.1
Garcia-Fernandez, J.2
Williams, N.A.3
Sidow, A.4
-
58
-
-
0032885388
-
-
W. C. Earnshaw, L. M. Martins, S. H. Kaufmann, Annu. Rev. Biochem. 68, 383 (1999); A. Zeuner, A. Eramo, C. Peschle, R. DeMaria, Cell Death Diff. 6, 1075 (1999).
-
(1999)
Annu. Rev. Biochem.
, vol.68
, pp. 383
-
-
Earnshaw, W.C.1
Martins, L.M.2
Kaufmann, S.H.3
-
59
-
-
0032805107
-
-
W. C. Earnshaw, L. M. Martins, S. H. Kaufmann, Annu. Rev. Biochem. 68, 383 (1999); A. Zeuner, A. Eramo, C. Peschle, R. DeMaria, Cell Death Diff. 6, 1075 (1999).
-
(1999)
Cell Death Diff.
, vol.6
, pp. 1075
-
-
Zeuner, A.1
Eramo, A.2
Peschle, C.3
DeMaria, R.4
-
60
-
-
0030581151
-
-
X. Liu, C. N. Kim, J. Yang, R. Jemmerson, X. Wang, Cell 86, 147 (1996); S. A. Susin et al., Nature 397, 441 (1999).
-
(1996)
Cell
, vol.86
, pp. 147
-
-
Liu, X.1
Kim, C.N.2
Yang, J.3
Jemmerson, R.4
Wang, X.5
-
61
-
-
0033521741
-
-
X. Liu, C. N. Kim, J. Yang, R. Jemmerson, X. Wang, Cell 86, 147 (1996); S. A. Susin et al., Nature 397, 441 (1999).
-
(1999)
Nature
, vol.397
, pp. 441
-
-
Susin, S.A.1
-
62
-
-
0030715323
-
-
P. Li et al., Cell 91, 479 (1997).
-
(1997)
Cell
, vol.91
, pp. 479
-
-
Li, P.1
-
63
-
-
0040936837
-
-
A. G. Park, Trends Cell Biol. 10, 394 (2000); S. Sahara et al., Nature 401, 168 (1999).
-
(2000)
Trends Cell Biol.
, vol.10
, pp. 394
-
-
Park, A.G.1
-
64
-
-
0033539067
-
-
A. G. Park, Trends Cell Biol. 10, 394 (2000); S. Sahara et al., Nature 401, 168 (1999).
-
(1999)
Nature
, vol.401
, pp. 168
-
-
Sahara, S.1
-
68
-
-
0032476659
-
-
K. Thress, W. Henzel, W. Shillinglaw, S. Kornbluth, EMBO J. 17, 6135 (1998).
-
(1998)
EMBO J.
, vol.17
, pp. 6135
-
-
Thress, K.1
Henzel, W.2
Shillinglaw, W.3
Kornbluth, S.4
-
69
-
-
0033584339
-
-
J. T. Littleton, T. L. Serano, G. M. Rubin, B. Ganetzky, E. R. Chapman, Nature 400, 757 (1999).
-
(1999)
Nature
, vol.400
, pp. 757
-
-
Littleton, J.T.1
Serano, T.L.2
Rubin, G.M.3
Ganetzky, B.4
Chapman, E.R.5
-
70
-
-
0027413655
-
-
T. Söllner et al., Nature 362, 318 (1993).
-
(1993)
Nature
, vol.362
, pp. 318
-
-
Söllner, T.1
-
72
-
-
0029036374
-
-
K. Ichtchenko et al., Cell 81, 435 (1995).
-
(1995)
Cell
, vol.81
, pp. 435
-
-
Ichtchenko, K.1
-
74
-
-
0029916999
-
-
A. Pearson, Current Opin. Immunol. 8, 20 (1996); N. C. Franc et al., Immunity 4, 431 (1996); D. Kang et al., Proc. Natl. Acad. Sci. U.S.A. 95, 10078 (1998); W. J. Lee et al., Proc. Natl. Acad. Sci. U.S.A. 93, 7888 (1996).
-
(1996)
Current Opin. Immunol.
, vol.8
, pp. 20
-
-
Pearson, A.1
-
75
-
-
0030152160
-
-
A. Pearson, Current Opin. Immunol. 8, 20 (1996); N. C. Franc et al., Immunity 4, 431 (1996); D. Kang et al., Proc. Natl. Acad. Sci. U.S.A. 95, 10078 (1998); W. J. Lee et al., Proc. Natl. Acad. Sci. U.S.A. 93, 7888 (1996).
-
(1996)
Immunity
, vol.4
, pp. 431
-
-
Franc, N.C.1
-
76
-
-
0032544089
-
-
A. Pearson, Current Opin. Immunol. 8, 20 (1996); N. C. Franc et al., Immunity 4, 431 (1996); D. Kang et al., Proc. Natl. Acad. Sci. U.S.A. 95, 10078 (1998); W. J. Lee et al., Proc. Natl. Acad. Sci. U.S.A. 93, 7888 (1996).
-
(1998)
Proc. Natl. Acad. Sci. U.S.A.
, vol.95
, pp. 10078
-
-
Kang, D.1
-
77
-
-
0029846206
-
-
A. Pearson, Current Opin. Immunol. 8, 20 (1996); N. C. Franc et al., Immunity 4, 431 (1996); D. Kang et al., Proc. Natl. Acad. Sci. U.S.A. 95, 10078 (1998); W. J. Lee et al., Proc. Natl. Acad. Sci. U.S.A. 93, 7888 (1996).
-
(1996)
Proc. Natl. Acad. Sci. U.S.A.
, vol.93
, pp. 7888
-
-
Lee, W.J.1
-
79
-
-
0033966411
-
-
J. A. Hoffmann and J.-M. Reichhart, Trends Cell Biol. 7, 309 (1997); K. V. Anderson, Curr. Opin. Immun. 12, 13 (2000).
-
(2000)
Curr. Opin. Immun.
, vol.12
, pp. 13
-
-
Anderson, K.V.1
-
82
-
-
18544392423
-
-
J. M. Warrick et al., Cell 93, 939 (1998); G. R. Jackson et al., Neuron 21, 633 (1998).
-
(1998)
Cell
, vol.93
, pp. 939
-
-
Warrick, J.M.1
-
83
-
-
0032168160
-
-
J. M. Warrick et al., Cell 93, 939 (1998); G. R. Jackson et al., Neuron 21, 633 (1998).
-
(1998)
Neuron
, vol.21
, pp. 633
-
-
Jackson, G.R.1
-
88
-
-
0032509179
-
-
S. A. Chervitz et al., Science 282, 2022 (1998).
-
(1998)
Science
, vol.282
, pp. 2022
-
-
Chervitz, S.A.1
-
89
-
-
0343688230
-
-
See www.sciencemag.org/feature/data/1049664.shl for complete protein domain analysis.
-
-
-
-
90
-
-
0342383014
-
-
note
-
-6 was used for all of the analyses reported here). The resulting graph is then split into subgraphs that contain at least two-thirds of all possible arcs between vertices. The algorithm is "greedy"; that is, it arbitrarily chooses a starting sequence and adds new sequences to the subgraph as long as this criterion is met. An interesting property of this algorithm is that it inherently respects the multidomain nature of proteins: For example, two multidomain proteins may have significant similarity to one another but share only one or a few domains. In such a case, the two proteins will not be clustered if the unshared domains introduce a large number of other arcs.
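The greedy splitting step in this note can be sketched as follows. The opening of the note is truncated, so the definition of an arc (presumably a significance cutoff on pairwise similarity) is taken as given input here. The function name, the rule that a candidate must share at least one arc with the current subgraph, and the order in which sequences are tried are all assumptions made for illustration.

```python
from itertools import combinations

def greedy_dense_clusters(vertices, arcs):
    """Split a similarity graph into subgraphs that keep at least
    two-thirds of all possible arcs between their vertices.

    vertices: sequence identifiers, in the order they will be tried.
    arcs: iterable of (u, v) pairs judged significantly similar.
    """
    arcset = {frozenset(a) for a in arcs}

    def dense(members):
        possible = len(members) * (len(members) - 1) // 2
        present = sum(
            1 for u, v in combinations(members, 2) if frozenset((u, v)) in arcset
        )
        # Integer form of present / possible >= 2/3.
        return 3 * present >= 2 * possible

    unassigned = list(vertices)
    clusters = []
    while unassigned:
        # Arbitrary starting sequence: simply the next one in input order.
        cluster = [unassigned.pop(0)]
        changed = True
        while changed:
            changed = False
            for v in list(unassigned):
                linked = any(frozenset((v, m)) in arcset for m in cluster)
                if linked and dense(cluster + [v]):
                    cluster.append(v)
                    unassigned.remove(v)
                    changed = True
        clusters.append(cluster)
    return clusters

# A chain A-B-C-D: adding D would leave only 3 of 6 possible arcs.
result = greedy_dense_clusters(["A", "B", "C", "D"],
                               [("A", "B"), ("B", "C"), ("C", "D")])
```

In this toy chain, A, B, and C stay together (2 of 3 possible arcs) while D is left in its own subgraph. A different starting sequence can give a different split, which is the sense in which the procedure is greedy.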
-
-
-
-
91
-
-
0343688228
-
-
note
-
An NxN BLASTP analysis was performed for each fly, worm, and yeast sequence in a combined data set of these completed proteomes. The databases used are as follows: Celera-BDGP, 14,195 predicted protein sequences (1/5/2000); WormPep18, Sanger Centre, 18,424 protein sequences; and SGD, 6246 protein sequences (1/7/2000). BLASTP analysis was also performed against known mammalian proteins (2/1/2000, GenBank nonredundant amino acid, Human, Mouse, and Rat, 75,236 protein sequences), and TBLASTN analysis was performed against a database of mammalian ESTs (2/1/00, GenBank dbEST, Human, Mouse, and Rat). A version of NCBI-BLAST2 optimized for the Compaq Alpha architecture was used with the SEG filter and the effective search space length (Y option) set to 17,973,263.
-
-
-
-
92
-
-
0343688229
-
-
note
-
The many participants from academic institutions are grateful for their various sources of support. Participants from the Berkeley Drosophila Genome Project are supported by NIH grant P50HG00750 (G.M.R.) and grant P41HG00739 (W.M.G.).
-
-
-