-
1
-
-
10244246754
-
-
Gene maps are available from GenBank Genomes Division at http://www.ncbi.nlm.nih.gov/
-
-
-
-
3
-
-
10244232983
-
-
note
-
The consortium consisted of genome mapping centers or groups at the Whitehead Institute for Biomedical Research, the Sanger Centre, Généthon, Stanford University, Oxford University, the University of Colorado Health Sciences Center, and informatics groups at the National Center for Biotechnology Information and the European Bioinformatics Institute. Additional laboratories that contributed to this mapping effort are the National Center for Human Genome Research and Kazusa DNA Research Institute.
-
-
-
-
4
-
-
84966176406
-
-
F. Antequera and A. Bird, Nature Genet. 8, 114 (1994); C. Fields et al., ibid. 7, 345 (1994); R. Nowak, Science 263, 608 (1994).
-
(1994)
Nature Genet.
, vol.8
, pp. 114
-
-
Antequera, F.1
Bird, A.2
-
5
-
-
0028301313
-
-
F. Antequera and A. Bird, Nature Genet. 8, 114 (1994); C. Fields et al., ibid. 7, 345 (1994); R. Nowak, Science 263, 608 (1994).
-
(1994)
Nature Genet.
, vol.7
, pp. 345
-
-
Fields, C.1
-
6
-
-
0028296185
-
-
F. Antequera and A. Bird, Nature Genet. 8, 114 (1994); C. Fields et al., ibid. 7, 345 (1994); R. Nowak, Science 263, 608 (1994).
-
(1994)
Science
, vol.263
, pp. 608
-
-
Nowak, R.1
-
7
-
-
0029916911
-
-
Swiss-Prot [A. Bairoch and R. Apweiler, Nucleic Acids Res. 24, 21 (1996)], a minimally redundant protein database, contained only 1790 human sequences in the release of 19 August 1991 (A. Bairoch, personal communication).
-
(1996)
Nucleic Acids Res.
, vol.24
, pp. 21
-
-
Bairoch, A.1
Apweiler, R.2
-
8
-
-
0029916911
-
-
personal communication
-
Swiss-Prot [A. Bairoch and R. Apweiler, Nucleic Acids Res. 24, 21 (1996)], a minimally redundant protein database, contained only 1790 human sequences in the release of 19 August 1991 (A. Bairoch, personal communication).
-
-
-
Bairoch, A.1
-
10
-
-
0025879424
-
-
M. D. Adams et al., Science 252, 1651 (1991).
-
(1991)
Science
, vol.252
, pp. 1651
-
-
Adams, M.D.1
-
11
-
-
0026947551
-
-
A. S. Khan et al., Nature Genet. 2, 180 (1992); K. Okubo et al., ibid., p. 173; J. M. Sikela and C. Auffray, ibid. 3, 189 (1993).
-
(1992)
Nature Genet.
, vol.2
, pp. 180
-
-
Khan, A.S.1
-
12
-
-
0026947551
-
-
A. S. Khan et al., Nature Genet. 2, 180 (1992); K. Okubo et al., ibid., p. 173; J. M. Sikela and C. Auffray, ibid. 3, 189 (1993).
-
Nature Genet.
, pp. 173
-
-
Okubo, K.1
-
13
-
-
0027408526
-
-
A. S. Khan et al., Nature Genet. 2, 180 (1992); K. Okubo et al., ibid., p. 173; J. M. Sikela and C. Auffray, ibid. 3, 189 (1993).
-
(1993)
Nature Genet.
, vol.3
, pp. 189
-
-
Sikela, J.M.1
Auffray, C.2
-
16
-
-
0024980741
-
-
M. Olson et al., Science 245, 1434 (1989).
-
(1989)
Science
, vol.245
, pp. 1434
-
-
Olson, M.1
-
17
-
-
0025796584
-
-
A. S. Wilcox et al., Nucleic Acids Res. 19, 1837 (1991). The majority of ESTs have been derived from cDNAs that were directionally cloned from their 3′ ends {using an oligo(dT) primer that anneals to the polyadenylate [poly(A)] tail found at the end of most human mRNAs}. About half of all EST sequences represent putative 3′ ends. Two advantages of using the 3′ UTRs are that they rarely contain introns and they usually display less sequence conservation than do coding regions [W. Makalowski et al., Genome Res. 6, 846 (1996)]. The former feature leads to PCR product sizes that are small enough to amplify; the latter feature makes it easier to discriminate among gene family members that are very similar in their coding regions.
-
(1991)
Nucleic Acids Res.
, vol.19
, pp. 1837
-
-
Wilcox, A.S.1
-
18
-
-
0029809212
-
-
A. S. Wilcox et al., Nucleic Acids Res. 19, 1837 (1991). The majority of ESTs have been derived from cDNAs that were directionally cloned from their 3′ ends {using an oligo(dT) primer that anneals to the polyadenylate [poly(A)] tail found at the end of most human mRNAs}. About half of all EST sequences represent putative 3′ ends. Two advantages of using the 3′ UTRs are that they rarely contain introns and they usually display less sequence conservation than do coding regions [W. Makalowski et al., Genome Res. 6, 846 (1996)]. The former feature leads to PCR product sizes that are small enough to amplify; the latter feature makes it easier to discriminate among gene family members that are very similar in their coding regions.
-
(1996)
Genome Res.
, vol.6
, pp. 846
-
-
Makalowski, W.1
-
19
-
-
0026693729
-
-
For efficiency, a hashing scheme was used to rapidly identify pairs of sequences having a potential relation before subjecting them to full optimal alignment. Any pair of sequences sharing at least two 13-base words separated by no more than two bases was considered an initial candidate. These pairs were analyzed with a variant of the basic Smith-Waterman algorithm [K.-M. Chao et al., Comput. Appl. Biosci. 8, 481 (1992)] in which the search for the optimal alignment was constrained to a band encompassing all observed word hits plus an additional 10 diagonals to each side. Alignment scores were calculated by summing +1 for a match, -2 for a mismatch, -1 for a gap position, and zero for an ambiguous position. The alignment quality was the score divided by the alignment length and was required to be at least 0.91 to be accepted. By this measure, a 100-base alignment with 97 matches and 3 mismatches would just meet the cutoff. An additional constraint was that the observed alignment must extend to within 35 bases of the edge of the search space.
-
(1992)
Comput. Appl. Biosci.
, vol.8
, pp. 481
-
-
Chao, K.-M.1
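The scoring rule in the note above can be checked with a few lines of code. This is a minimal sketch, not the consortium's software: the function names, the use of "-" for a gap position, and the use of "N" for an ambiguous base are all assumptions made for illustration. It takes two already-aligned rows of equal length and applies the stated weights (+1 match, -2 mismatch, -1 gap, 0 ambiguous) and the 0.91 cutoff.

```python
def alignment_quality(row_a: str, row_b: str) -> float:
    """Score divided by alignment length for two aligned rows.

    Assumptions (not from the source): '-' marks a gap position and
    'N' marks an ambiguous base.
    """
    if len(row_a) != len(row_b) or not row_a:
        raise ValueError("aligned rows must be non-empty and equal in length")
    score = 0
    for x, y in zip(row_a.upper(), row_b.upper()):
        if x == "-" or y == "-":
            score -= 1          # gap position
        elif x == "N" or y == "N":
            score += 0          # ambiguous position contributes nothing
        elif x == y:
            score += 1          # match
        else:
            score -= 2          # mismatch
    return score / len(row_a)


def is_accepted(row_a: str, row_b: str, cutoff: float = 0.91) -> bool:
    """Apply the quality cutoff stated in the note."""
    return alignment_quality(row_a, row_b) >= cutoff


# The worked example from the note: 97 matches and 3 mismatches in 100 bases.
reference = "A" * 100
borderline = "A" * 97 + "C" * 3
quality = alignment_quality(reference, borderline)   # (97 - 2*3) / 100
```

With these weights the borderline case gives (97 - 6) / 100 = 0.91, which matches the note's statement that such an alignment just meets the cutoff; a fourth mismatch drops the quality to 0.88.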
-
20
-
-
0028970844
-
-
The Genexpress Index: R. Houlgatte et al., Genome Res. 5, 272 (1995); the Merck Gene Index: J. S. Aaronson et al., ibid. 6, 829 (1996); and the THC (TIGR human cDNA) collection: M. D. Adams et al., Nature 377, 3 (1995), available at http://www.tigr.org/tdb/hummap/hummap.html
-
(1995)
Genome Res.
, vol.5
, pp. 272
-
-
Houlgatte, R.1
-
21
-
-
0029845634
-
-
The Genexpress Index: R. Houlgatte et al., Genome Res. 5, 272 (1995); the Merck Gene Index: J. S. Aaronson et al., ibid. 6, 829 (1996); and the THC (TIGR human cDNA) collection: M. D. Adams et al., Nature 377, 3 (1995), available at http://www.tigr.org/tdb/hummap/hummap.html
-
(1996)
Genome Res.
, vol.6
, pp. 829
-
-
Aaronson, J.S.1
-
22
-
-
0029653613
-
-
The Genexpress Index: R. Houlgatte et al., Genome Res. 5, 272 (1995); the Merck Gene Index: J. S. Aaronson et al., ibid. 6, 829 (1996); and the THC (TIGR human cDNA) collection: M. D. Adams et al., Nature 377, 3 (1995), available at http://www.tigr.org/tdb/hummap/hummap.html
-
(1995)
Nature
, vol.377
, pp. 3
-
-
Adams, M.D.1
-
23
-
-
54249116230
-
-
F. Jacob et al., J. Mol. Biol. 3, 318 (1961); F. Jacob et al., C. R. Acad. Sci. Paris 258, 3125 (1964).
-
(1961)
J. Mol. Biol.
, vol.3
, pp. 318
-
-
Jacob, F.1
-
24
-
-
78651159012
-
-
F. Jacob et al., J. Mol. Biol. 3, 318 (1961); F. Jacob et al., C. R. Acad. Sci. Paris 258, 3125 (1964).
-
(1964)
C. R. Acad. Sci. Paris
, vol.258
, pp. 3125
-
-
Jacob, F.1
-
25
-
-
0018815726
-
-
For example, P. A. Sharp, A. J. Berk, S. M. Berget, Methods Enzymol. 65, 750 (1980); J. Battey and D. A. Clayton, Cell 14, 143 (1978).
-
(1980)
Methods Enzymol.
, vol.65
, pp. 750
-
-
Sharp, P.A.1
Berk, A.J.2
Berget, S.M.3
-
26
-
-
0017814542
-
-
For example, P. A. Sharp, A. J. Berk, S. M. Berget, Methods Enzymol. 65, 750 (1980); J. Battey and D. A. Clayton, Cell 14, 143 (1978).
-
(1978)
Cell
, vol.14
, pp. 143
-
-
Battey, J.1
Clayton, D.A.2
-
27
-
-
13344259999
-
-
C. Dib et al., Nature 380, 152 (1996).
-
(1996)
Nature
, vol.380
, pp. 152
-
-
Dib, C.1
-
28
-
-
0029653653
-
-
I. M. Chumakov et al., ibid. 377, 175 (1995).
-
(1995)
Nature
, vol.377
, pp. 175
-
-
Chumakov, I.M.1
-
30
-
-
0029416826
-
-
T. J. Hudson et al., Science 270, 1945 (1995).
-
(1995)
Science
, vol.270
, pp. 1945
-
-
Hudson, T.J.1
-
32
-
-
10244279313
-
-
personal communication
-
E. A. Stewart, personal communication.
-
-
-
Stewart, E.A.1
-
33
-
-
0028343233
-
-
Genbridge4 Framework: analysis of all 1549 Généthon markers mapped on the GB4 panel shows that 873 markers can be ordered with high confidence (>1000:1) on the basis of only the RH retention patterns. The MultiMap [T. C. Matise et al., Nature Genet. 6, 384 (1994)] implementation of RADMAP (T. Matise et al., available at http://linkage.rockefeller.edu/multimap) was used in the framework map construction by using a maximum-interval-theta of 0.5, a minimum-interval-theta of 0.05, and an odds threshold of 3 for adding markers to the map. In cases where the local order of markers on the RH map did not support the genetic map order with odds of at least 1000:1, markers were removed until such disagreements were resolved. This criterion was relaxed in 19 cases to odds as low as 10:1 to allow spanning of large gaps or at the telomeres. Additional reference markers that do not fulfill the 1000:1 confidence order were included by some mapping groups as reference markers for EST binning. Inclusion criteria used to select these additional markers included (i) markers separated by at least 2 cM and (ii) retention rates between 10 and 60%. This defined a scaffold map of 1038 reference markers. Genetic order was enforced for these reference markers. G3 Framework: reference set of 1000 Généthon markers typed in duplicate and incorporated into the Stanford G3 RH maps. All the framework markers are at 1000:1 odds on the Généthon map, and 707 of the 1000 are at 1000:1 odds on the Stanford map. YAC-Map Framework: genetic positions for the 1090 gene-based markers derived from the genetically anchored double-linked YAC contigs on the STS content YAC map generated at the Whitehead Institute/MIT Genome Center. This map, which has previously been described (17), has been expanded to include 566 additional Généthon markers, for a total of 4082 markers.
-
(1994)
Nature Genet.
, vol.6
, pp. 384
-
-
Matise, T.C.1
-
34
-
-
10244247890
-
-
Genbridge4 Framework: analysis of all 1549 Généthon markers mapped on the GB4 panel shows that 873 markers can be ordered with high confidence (>1000:1) on the basis of only the RH retention patterns. The MultiMap [T. C. Matise et al., Nature Genet. 6, 384 (1994)] implementation of RADMAP (T. Matise et al., available at http://linkage.rockefeller.edu/multimap) was used in the framework map construction by using a maximum-interval-theta of 0.5, a minimum-interval-theta of 0.05, and an odds threshold of 3 for adding markers to the map. In cases where the local order of markers on the RH map did not support the genetic map order with odds of at least 1000:1, markers were removed until such disagreements were resolved. This criterion was relaxed in 19 cases to odds as low as 10:1 to allow spanning of large gaps or at the telomeres. Additional reference markers that do not fulfill the 1000:1 confidence order were included by some mapping groups as reference markers for EST binning. Inclusion criteria used to select these additional markers included (i) markers separated by at least 2 cM and (ii) retention rates between 10 and 60%. This defined a scaffold map of 1038 reference markers. Genetic order was enforced for these reference markers. G3 Framework: reference set of 1000 Généthon markers typed in duplicate and incorporated into the Stanford G3 RH maps. All the framework markers are at 1000:1 odds on the Généthon map, and 707 of the 1000 are at 1000:1 odds on the Stanford map. YAC-Map Framework: genetic positions for the 1090 gene-based markers derived from the genetically anchored double-linked YAC contigs on the STS content YAC map generated at the Whitehead Institute/MIT Genome Center. This map, which has previously been described (17), has been expanded to include 566 additional Généthon markers, for a total of 4082 markers.
-
-
-
Matise, T.1
-
35
-
-
0028685657
-
-
Some mapping candidates were initially selected from the Genexpress, THC, and Kazusa [N. Nomura et al., DNA Res. 1, 27 (1994); N. Nomura et al., ibid., p. 223; T. Nagase et al., ibid. 2, 167 (1995); T. Nagase, N. Seki, K. Ishikawa, A. Tanaka, N. Nomura, ibid. 3, 17 (1996)] collections and retrospectively cross-referenced to UniGene entries.
-
(1994)
DNA Res.
, vol.1
, pp. 27
-
-
Nomura, N.1
-
36
-
-
0028685657
-
-
Some mapping candidates were initially selected from the Genexpress, THC, and Kazusa [N. Nomura et al., DNA Res. 1, 27 (1994); N. Nomura et al., ibid., p. 223; T. Nagase et al., ibid. 2, 167 (1995); T. Nagase, N. Seki, K. Ishikawa, A. Tanaka, N. Nomura, ibid. 3, 17 (1996)] collections and retrospectively cross-referenced to UniGene entries.
-
DNA Res.
, pp. 223
-
-
Nomura, N.1
-
37
-
-
0029655169
-
-
Some mapping candidates were initially selected from the Genexpress, THC, and Kazusa [N. Nomura et al., DNA Res. 1, 27 (1994); N. Nomura et al., ibid., p. 223; T. Nagase et al., ibid. 2, 167 (1995); T. Nagase, N. Seki, K. Ishikawa, A. Tanaka, N. Nomura, ibid. 3, 17 (1996)] collections and retrospectively cross-referenced to UniGene entries.
-
(1995)
DNA Res.
, vol.2
, pp. 167
-
-
Nagase, T.1
-
38
-
-
0030605415
-
-
Some mapping candidates were initially selected from the Genexpress, THC, and Kazusa [N. Nomura et al., DNA Res. 1, 27 (1994); N. Nomura et al., ibid., p. 223; T. Nagase et al., ibid. 2, 167 (1995); T. Nagase, N. Seki, K. Ishikawa, A. Tanaka, N. Nomura, ibid. 3, 17 (1996)] collections and retrospectively cross-referenced to UniGene entries.
-
(1996)
DNA Res.
, vol.3
, pp. 17
-
-
Nagase, T.1
Seki, N.2
Ishikawa, K.3
Tanaka, A.4
Nomura, N.5
-
39
-
-
10244256450
-
-
note
-
Assays were considered unsuccessful if they consistently yielded no product or multiple products in human DNA, interfering bands in hamster DNA (about 5%), abnormally low (<10%) or high (>60%) retention rates in RH panels, or more than four discrepants between duplicate tests. An additional 10% of assays meeting these criteria fail to map relative to the reference set of genetic markers, possibly because of their being placed past the end of the maps or because of a high proportion of errors.
-
-
-
-
40
-
-
10244222885
-
-
note
-
The majority of gene-based STSs were localized to an interval defined by two genetic framework markers or mapped at zero distance from (that is, were nonrecombinant with) a single reference marker, which allowed their positions to be resolved to centimorgan coordinates. In a small number of cases, markers were placed by two-point analysis, so that only the nearest framework marker is known but not an interval. For the purposes of drawing the histogram, a virtual interval was defined that extended half the distance to the nearest framework markers on each side. Sometimes markers mapped between a reference marker and the telomere, and these were plotted in separate bins above and below the maps. A uniform 1.5-cM bin size was used in plotting the histogram. If the interval determined for a marker spanned several of these bins, its contribution was split evenly among them. Duplicate mappings were counted only once, so the heights of the bars are proportional to distinct loci per centimorgan. In the case of mapping conflicts, a partial contribution was made at each of the possible locations.
-
-
-
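The binning rule described in the histogram note (uniform 1.5-cM bins, a marker's contribution split evenly among the bins its interval spans, and a partial contribution at each location for conflicting placements) can be sketched as follows. This is an illustrative sketch only: the function names, the bin indexing from 0 cM, and the treatment of an interval end that falls exactly on a bin boundary (counted as touching the next bin) are assumptions, not details given in the note.

```python
import math
from collections import defaultdict

BIN_SIZE_CM = 1.5  # uniform bin size stated in the note


def spanned_bins(start_cm: float, end_cm: float, bin_size: float = BIN_SIZE_CM):
    """Indices of all bins touched by the interval [start_cm, end_cm]."""
    if end_cm < start_cm:
        raise ValueError("interval end must not precede its start")
    first = math.floor(start_cm / bin_size)
    last = math.floor(end_cm / bin_size)
    return list(range(first, last + 1))


def add_marker(hist, start_cm: float, end_cm: float, weight: float = 1.0) -> None:
    """Split one marker's weight evenly among the bins its interval spans."""
    bins = spanned_bins(start_cm, end_cm)
    share = weight / len(bins)
    for index in bins:
        hist[index] += share


def add_conflicting_marker(hist, intervals) -> None:
    """For a mapping conflict, give each candidate location an equal partial weight."""
    for start_cm, end_cm in intervals:
        add_marker(hist, start_cm, end_cm, weight=1.0 / len(intervals))


hist = defaultdict(float)
add_marker(hist, 1.0, 4.0)                                  # spans bins 0, 1, 2
add_marker(hist, 2.0, 2.5)                                  # falls entirely in bin 1
add_conflicting_marker(hist, [(6.1, 6.2), (30.1, 30.4)])    # two candidate loci
```

Because every marker contributes a total weight of one regardless of how wide its interval is, the bar heights stay proportional to distinct loci per bin, which is the property the note relies on.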
-
41
-
-
0030008257
-
-
Most of the data were derived from a recent study [P. Bray-Ward et al., Genomics 32, 1 (1996)] in which a large number of CEPH YACs were mapped by FISH across the whole genome. A subset of the results were selected in which there was no evidence of the YAC having been chimeric, and the position was resolved to a cytogenetic interval (as opposed to fractional length only). A Généthon marker was given for each of these YACs, which allowed the position in centimorgans to be determined. Because this study contained no data for chromosomes 19, 21, and 22, an alternative strategy was used: Genes that are nonrecombinant with respect to Généthon markers [table 5 in C. Dib et al., Nature 380 (suppl.), iii (1996)] and have well-known cytogenetic locations [Online Mendelian Inheritance in Man (OMIM) at http://www.ncbi.nlm.nih.gov/] were used to establish the cross-reference.
-
(1996)
Genomics
, vol.32
, pp. 1
-
-
Bray-Ward, P.1
-
42
-
-
4243784019
-
-
Most of the data were derived from a recent study [P. Bray-Ward et al., Genomics 32, 1 (1996)] in which a large number of CEPH YACs were mapped by FISH across the whole genome. A subset of the results were selected in which there was no evidence of the YAC having been chimeric, and the position was resolved to a cytogenetic interval (as opposed to fractional length only). A Généthon marker was given for each of these YACs, which allowed the position in centimorgans to be determined. Because this study contained no data for chromosomes 19, 21, and 22, an alternative strategy was used: Genes that are nonrecombinant with respect to Généthon markers [table 5 in C. Dib et al., Nature 380 (suppl.), iii (1996)] and have well-known cytogenetic locations [Online Mendelian Inheritance in Man (OMIM) at http://www.ncbi.nlm.nih.gov/] were used to establish the cross-reference.
-
(1996)
Nature
, vol.380
, Issue.SUPPL.
-
-
Dib, C.1
-
44
-
-
10244279312
-
-
note
-
Detailed instructions and examples are provided on the Web site.
-
-
-
-
45
-
-
0030010970
-
-
G. D. Schuler, J. A. Epstein, H. Ohkawa, J. A. Kans, Methods Enzymol. 266, 141 (1996).
-
(1996)
Methods Enzymol.
, vol.266
, pp. 141
-
-
Schuler, G.D.1
Epstein, J.A.2
Ohkawa, H.3
Kans, J.A.4
-
46
-
-
10244267644
-
-
note
-
Five additional assays yielded ambiguous results as a result of the presence of an interfering mouse band or poor amplification.
-
-
-
-
47
-
-
10244261681
-
-
note
-
This could be the result of either a trivial primer tube labeling error or a sequence clustering error in which two separate genes were erroneously assigned to the same UniGene entry.
-
-
-
-
48
-
-
10244254014
-
-
note
-
Typing errors in a framework marker would allow a close EST to map with significant lod scores (logarithm of the odds ratio for linkage) to a correct chromosome location but would tend to localize the marker in a distant bin, in order to minimize "double-breaks" caused by the erroneous typings of the framework marker.
-
-
-
-
51
-
-
0028198760
-
-
S. Bahram, M. Bresnahan, D. E. Geraghty, T. Spies, Proc. Natl. Acad. Sci. U.S.A. 91, 6259 (1994).
-
(1994)
Proc. Natl. Acad. Sci. U.S.A.
, vol.91
, pp. 6259
-
-
Bahram, S.1
Bresnahan, M.2
Geraghty, D.E.3
Spies, T.4
-
52
-
-
0027414129
-
-
G. D. Billingsley et al., Am. J. Hum. Genet. 52, 343 (1993); S. S. Schneider et al., Proc. Natl. Acad. Sci. U.S.A. 92, 3147 (1995).
-
(1993)
Am. J. Hum. Genet.
, vol.52
, pp. 343
-
-
Billingsley, G.D.1
-
54
-
-
0029004341
-
-
R. Sherrington et al., Nature 375, 754 (1995); D. Levitan and I. Greenwald, ibid. 377, 351 (1995); S. A. Hahn et al., Science 271, 350 (1996).
-
(1995)
Nature
, vol.375
, pp. 754
-
-
Sherrington, R.1
-
55
-
-
0029116848
-
-
R. Sherrington et al., Nature 375, 754 (1995); D. Levitan and I. Greenwald, ibid. 377, 351 (1995); S. A. Hahn et al., Science 271, 350 (1996).
-
(1995)
Nature
, vol.377
, pp. 351
-
-
Levitan, D.1
Greenwald, I.2
-
56
-
-
0030593038
-
-
R. Sherrington et al., Nature 375, 754 (1995); D. Levitan and I. Greenwald, ibid. 377, 351 (1995); S. A. Hahn et al., Science 271, 350 (1996).
-
(1996)
Science
, vol.271
, pp. 350
-
-
Hahn, S.A.1
-
57
-
-
0028133514
-
-
S. Tugendreich et al., Hum. Mol. Genet. 3, 1509 (1994); D. E. Bassett Jr., M. S. Boguski, P. Hieter, Nature 379, 589 (1996); P. Hieter, D. E. Bassett Jr., D. Valle, Nature Genet. 13, 253 (1996).
-
(1994)
Hum. Mol. Genet.
, vol.3
, pp. 1509
-
-
Tugendreich, S.1
-
58
-
-
0030033794
-
-
S. Tugendreich et al., Hum. Mol. Genet. 3, 1509 (1994); D. E. Bassett Jr., M. S. Boguski, P. Hieter, Nature 379, 589 (1996); P. Hieter, D. E. Bassett Jr., D. Valle, Nature Genet. 13, 253 (1996).
-
(1996)
Nature
, vol.379
, pp. 589
-
-
Bassett Jr., D.E.1
Boguski, M.S.2
Hieter, P.3
-
59
-
-
0029957096
-
-
S. Tugendreich et al., Hum. Mol. Genet. 3, 1509 (1994); D. E. Bassett Jr., M. S. Boguski, P. Hieter, Nature 379, 589 (1996); P. Hieter, D. E. Bassett Jr., D. Valle, Nature Genet. 13, 253 (1996).
-
(1996)
Nature Genet.
, vol.13
, pp. 253
-
-
Hieter, P.1
Bassett Jr., D.E.2
Valle, D.3
-
60
-
-
0000228203
-
-
M. O. Dayhoff, Ed. National Biomedical Research Foundation, Washington, DC
-
The scoring systems were amino acid substitution matrices based on the PAM (point accepted mutation) model of evolutionary distance. PAM matrices may be generated for any number of PAMs by extrapolation of observed mutation frequencies [M. O. Dayhoff et al., in Atlas of Protein Sequence and Structure, M. O. Dayhoff, Ed. (National Biomedical Research Foundation, Washington, DC, 1978), vol. 5, suppl. 3, pp. 345-352; S. F. Altschul, J. Mol. Evol. 36, 290 (1993)]. PAM matrices were customized for scoring matches between sequences in each of the 15 species pairs; for the pool of remaining proteins ("Other organisms" in Table 4), the PAM120 matrix was used because it has been shown to be good for general-purpose searching [S. F. Altschul, J. Mol. Biol. 219, 555 (1991)]. The BLASTX program [W. Gish and D. J. States, Nature Genet. 3, 266 (1993)] takes a nucleotide sequence query (EST or gene), translates it into all six conceptual ORFs, and then compares these with protein sequences in the database; TBLASTN performs a similar function but instead searches a protein query sequence against six-frame translations of each entry in a nucleotide sequence database. Searches were performed with E = 1e-6 and E2 = 1e-5 as the primary and secondary expectation parameters.
-
(1978)
Atlas of Protein Sequence and Structure
, vol.5
, Issue.3 SUPPL.
, pp. 345-352
-
-
Dayhoff, M.O.1
-
61
-
-
0027516090
-
-
The scoring systems were amino acid substitution matrices based on the PAM (point accepted mutation) model of evolutionary distance. PAM matrices may be generated for any number of PAMs by extrapolation of observed mutation frequencies [M. O. Dayhoff et al., in Atlas of Protein Sequence and Structure, M. O. Dayhoff, Ed. (National Biomedical Research Foundation, Washington, DC, 1978), vol. 5, suppl. 3, pp. 345-352; S. F. Altschul, J. Mol. Evol. 36, 290 (1993)]. PAM matrices were customized for scoring matches between sequences in each of the 15 species pairs; for the pool of remaining proteins ("Other organisms" in Table 4), the PAM120 matrix was used because it has been shown to be good for general-purpose searching [S. F. Altschul, J. Mol. Biol. 219, 555 (1991)]. The BLASTX program [W. Gish and D. J. States, Nature Genet. 3, 266 (1993)] takes a nucleotide sequence query (EST or gene), translates it into all six conceptual ORFs, and then compares these with protein sequences in the database; TBLASTN performs a similar function but instead searches a protein query sequence against six-frame translations of each entry in a nucleotide sequence database. Searches were performed with E = 1e-6 and E2 = 1e-5 as the primary and secondary expectation parameters.
-
(1993)
J. Mol. Evol.
, vol.36
, pp. 290
-
-
Altschul, S.F.1
-
62
-
-
0025878149
-
-
The scoring systems were amino acid substitution matrices based on the PAM (point accepted mutation) model of evolutionary distance. PAM matrices may be generated for any number of PAMs by extrapolation of observed mutation frequencies [M. O. Dayhoff et al., in Atlas of Protein Sequence and Structure, M. O. Dayhoff, Ed. (National Biomedical Research Foundation, Washington, DC, 1978), vol. 5, suppl. 3, pp. 345-352; S. F. Altschul, J. Mol. Evol. 36, 290 (1993)]. PAM matrices were customized for scoring matches between sequences in each of the 15 species pairs; for the pool of remaining proteins ("Other organisms" in Table 4), the PAM120 matrix was used because it has been shown to be good for general-purpose searching [S. F. Altschul, J. Mol. Biol. 219, 555 (1991)]. The BLASTX program [W. Gish and D. J. States, Nature Genet. 3, 266 (1993)] takes a nucleotide sequence query (EST or gene), translates it into all six conceptual ORFs, and then compares these with protein sequences in the database; TBLASTN performs a similar function but instead searches a protein query sequence against six-frame translations of each entry in a nucleotide sequence database. Searches were performed with E = 1e-6 and E2 = 1e-5 as the primary and secondary expectation parameters.
-
(1991)
J. Mol. Biol.
, vol.219
, pp. 555
-
-
Altschul, S.F.1
-
63
-
-
0027399530
-
-
The scoring systems were amino acid substitution matrices based on the PAM (point accepted mutation) model of evolutionary distance. PAM matrices may be generated for any number of PAMs by extrapolation of observed mutation frequencies [M. O. Dayhoff et al., in Atlas of Protein Sequence and Structure, M. O. Dayhoff, Ed. (National Biomedical Research Foundation, Washington, DC, 1978), vol. 5, suppl. 3, pp. 345-352; S. F. Altschul, J. Mol. Evol. 36, 290 (1993)]. PAM matrices were customized for scoring matches between sequences in each of the 15 species pairs; for the pool of remaining proteins ("Other organisms" in Table 4), the PAM120 matrix was used because it has been shown to be good for general-purpose searching [S. F. Altschul, J. Mol. Biol. 219, 555 (1991)]. The BLASTX program [W. Gish and D. J. States, Nature Genet. 3, 266 (1993)] takes a nucleotide sequence query (EST or gene), translates it into all six conceptual ORFs, and then compares these with protein sequences in the database; TBLASTN performs a similar function but instead searches a protein query sequence against six-frame translations of each entry in a nucleotide sequence database. Searches were performed with E = 1e-6 and E2 = 1e-5 as the primary and secondary expectation parameters.
-
(1993)
Nature Genet.
, vol.3
, pp. 266
-
-
Gish, W.1
States, D.J.2
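The six-frame conceptual translation that the note attributes to BLASTX (and, in the reverse direction, to TBLASTN) can be illustrated with a short sketch. This is not BLASTX code: it only shows the translation step, it assumes the standard genetic code, and the function names, the frame labels (+1 to +3, -1 to -3), and the use of "X" for codons containing a non-ACGT character are choices made here for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (non-ACGT characters pass through)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the start of seq; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")   # 'X' for codons with ambiguous bases
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Conceptual translations in three forward and three reverse frames."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
```

For the toy query "ATGGCCTAA", frame +1 reads ATG GCC TAA and frame -1 reads TTA GGC CAT from the reverse complement. A search program would then compare each of the six peptide strings against the protein database using the chosen substitution matrix.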
-
64
-
-
0025735884
-
-
M. A. Pericak-Vance et al., Am. J. Hum. Genet. 48, 1034 (1991); W. J. Strittmatter et al., Proc. Natl. Acad. Sci. U.S.A. 90, 1977 (1993); R. Fishel et al., Cell 75, 1027 (1993); N. Papadopoulos et al., Science 263, 1625 (1994); R. Shiang et al., Cell 78, 335 (1994); L. M. Mulligan et al., Nature 363, 458 (1993); P. Edery et al., ibid. 367, 378 (1994).
-
(1991)
Am. J. Hum. Genet.
, vol.48
, pp. 1034
-
-
Pericak-Vance, M.A.1
-
65
-
-
0027407565
-
-
M. A. Pericak-Vance et al., Am. J. Hum. Genet. 48, 1034 (1991); W. J. Strittmatter et al., Proc. Natl. Acad. Sci. U.S.A. 90, 1977 (1993); R. Fishel et al., Cell 75, 1027 (1993); N. Papadopoulos et al., Science 263, 1625 (1994); R. Shiang et al., Cell 78, 335 (1994); L. M. Mulligan et al., Nature 363, 458 (1993); P. Edery et al., ibid. 367, 378 (1994).
-
(1993)
Proc. Natl. Acad. Sci. U.S.A.
, vol.90
, pp. 1977
-
-
Strittmatter, W.J.1
-
66
-
-
0027742295
-
-
M. A. Pericak-Vance et al., Am. J. Hum. Genet. 48, 1034 (1991); W. J. Strittmatter et al., Proc. Natl. Acad. Sci. U.S.A. 90, 1977 (1993); R. Fishel et al., Cell 75, 1027 (1993); N. Papadopoulos et al., Science 263, 1625 (1994); R. Shiang et al., Cell 78, 335 (1994); L. M. Mulligan et al., Nature 363, 458 (1993); P. Edery et al., ibid. 367, 378 (1994).
-
(1993)
Cell
, vol.75
, pp. 1027
-
-
Fishel, R.1
-
67
-
-
0028350601
-
-
M. A. Pericak-Vance et al., Am. J. Hum. Genet. 48, 1034 (1991); W. J. Strittmatter et al., Proc. Natl. Acad. Sci. U.S.A. 90, 1977 (1993); R. Fishel et al., Cell 75, 1027 (1993); N. Papadopoulos et al., Science 263, 1625 (1994); R. Shiang et al., Cell 78, 335 (1994); L. M. Mulligan et al., Nature 363, 458 (1993); P. Edery et al., ibid. 367, 378 (1994).
-
(1994)
Science
, vol.263
, pp. 1625
-
-
Papadopoulos, N.1
-
68
-
-
0027964261
-
-
M. A. Pericak-Vance et al., Am. J. Hum. Genet. 48, 1034 (1991); W. J. Strittmatter et al., Proc. Natl. Acad. Sci. U.S.A. 90, 1977 (1993); R. Fishel et al., Cell 75, 1027 (1993); N. Papadopoulos et al., Science 263, 1625 (1994); R. Shiang et al., Cell 78, 335 (1994); L. M. Mulligan et al., Nature 363, 458 (1993); P. Edery et al., ibid. 367, 378 (1994).
-
(1994)
Cell
, vol.78
, pp. 335
-
-
Shiang, R.1
-
69
-
-
0027231568
-
-
M. A. Pericak-Vance et al., Am. J. Hum. Genet. 48, 1034 (1991); W. J. Strittmatter et al., Proc. Natl. Acad. Sci. U.S.A. 90, 1977 (1993); R. Fishel et al., Cell 75, 1027 (1993); N. Papadopoulos et al., Science 263, 1625 (1994); R. Shiang et al., Cell 78, 335 (1994); L. M. Mulligan et al., Nature 363, 458 (1993); P. Edery et al., ibid. 367, 378 (1994).
-
(1993)
Nature
, vol.363
, pp. 458
-
-
Mulligan, L.M.1
-
70
-
-
0027972513
-
-
M. A. Pericak-Vance et al., Am. J. Hum. Genet. 48, 1034 (1991); W. J. Strittmatter et al., Proc. Natl. Acad. Sci. U.S.A. 90, 1977 (1993); R. Fishel et al., Cell 75, 1027 (1993); N. Papadopoulos et al., Science 263, 1625 (1994); R. Shiang et al., Cell 78, 335 (1994); L. M. Mulligan et al., Nature 363, 458 (1993); P. Edery et al., ibid. 367, 378 (1994).
-
(1994)
Nature
, vol.367
, pp. 378
-
-
Edery, P.1
-
71
-
-
10244231022
-
-
The potential effect of a comprehensive gene map is well illustrated by the fact that 82% of genes that have been positionally cloned to date are represented by one or more ESTs in GenBank (http://www.ncbi.nlm.nih.gov/dbEST/dbEST_genes/).
-
-
-
-
74
-
-
10244252863
-
-
note
-
Consensus sequences from at least three overlapping ESTs were generated from 294 UniGene clusters. Of the 230 (78%) redesigned primer pairs, 188 (82%) yielded successful PCR assays.
-
-
-
-
75
-
-
10244255258
-
-
note
-
We thank M. O. Anderson, A. J. Collymore, D. F. Courtney, R. Devine, D. Gray, L. T. Horton Jr., V. Kouyoumjian, J. Tam, W. Ye, and I. S. Zemsteva from the Whitehead Institute for technical assistance. We thank W. Miller, E. Myers, D. J. Lipman, and A. Schaffer for essential contributions toward the development of UniGene. Supported by NIH awards HG00098 to E.S.L., HG00206 to R.M.M., HG00835 to J.M.S., and HG00151 to T.C.M., and by the Whitehead Institute for Biomedical Research and the Wellcome Trust. T.J.H. is a recipient of a Clinician Scientist Award from the Medical Research Council of Canada. D.C.P. is an assistant investigator of the Howard Hughes Medical Institute. The Stanford Human Genome Center and the Whitehead Institute-MIT Genome Center are thankful for the oligonucleotides purchased with funds donated by Sandoz Pharmaceutical. Généthon is supported by the Association Francaise contre les Myopathies and the Groupement d'Etudes sur le Genome. The Sanger Centre, Généthon, and Oxford are grateful for support from the European Union EVRHEST programme. The Human Genome Organization and the Wellcome Trust sponsored a series of meetings from October 1994 to November 1995 without which this collaboration would not have been possible.
-
-
-