Volume 8, Issue 3, 1998, Pages 346-354

Finding the genes in genomic DNA

Author keywords

[No Author keywords available]

Indexed keywords

DNA;

EID: 0032104509     PISSN: 0959440X     EISSN: None     Source Type: Journal    
DOI: 10.1016/S0959-440X(98)80069-9     Document Type: Article
Times cited : (461)

References (49)
  • 1
    • 0030218799
    • Finding genes by computer: The state of the art
    • Fickett JW. Finding genes by computer: the state of the art. Trends Genet. 12:1996;316-320.
    • (1996) Trends Genet , vol.12 , pp. 316-320
    • Fickett, J.W.1
  • 2
    • 0030768930
    • Computational methods for the identification of genes in vertebrate genomic sequences
    • of special interest. This well-written review gives a brief history of the various methods that have been applied to computational gene identification, summarizes the methods used by current programs, and includes web addresses for most available gene finding software as well as fairly extensive references. The author also points out some of the limitations of the current methods, for example, the inability of current algorithms to handle the complexities of overlapping genes and alternative transcription or splicing patterns, and the difficulties in predicting the beginning and end of genes.
    • Claverie J-M. Computational methods for the identification of genes in vertebrate genomic sequences. Hum Mol Genet. 6:1997;1735-1744.
    • (1997) Hum Mol Genet , vol.6 , pp. 1735-1744
    • Claverie J-M1
  • 3
    • 0030585734
    • Evaluation of gene structure prediction programs
    • of outstanding interest. This landmark paper provided the first large-scale, systematic, unbiased comparison of available gene-finding methods. The authors describe the construction of a large reference data set of 570 vertebrate gene sequences, critically evaluate the usefulness of a variety of predictive accuracy measures proposed previously, and introduce some new accuracy measures. They also provide the results of a systematic test of all available exon and gene prediction programs and assess the current (as of 1996) state of the gene finding problem.
    • Burset M, Guigo R. Evaluation of gene structure prediction programs. Genomics. 34:1996;353-367.
    • (1996) Genomics , vol.34 , pp. 353-367
    • Burset, M.1    Guigo, R.2
  • 4
    • 0029258948
    • Prediction of function in DNA sequence analysis
    • Gelfand MS. Prediction of function in DNA sequence analysis. J Comput Biol. 2:1995;87-115.
    • (1995) J Comput Biol , vol.2 , pp. 87-115
    • Gelfand, M.S.1
  • 5
    • 0028826042
    • FANS-REF: A bibliography on statistics and functional analysis of nucleotide sequences
    • Gelfand MS. FANS-REF: a bibliography on statistics and functional analysis of nucleotide sequences. Comput Appl Biosci. 11:1995;541.
    • (1995) Comput Appl Biosci , vol.11 , pp. 541
    • Gelfand, M.S.1
  • 6
    • 0027944605
    • A hidden Markov model that finds genes in E. coli DNA
    • Krogh A, Mian IS, Haussler D. A hidden Markov model that finds genes in E. coli DNA. Nucleic Acids Res. 22:1994;4768-4778.
    • (1994) Nucleic Acids Res , vol.22 , pp. 4768-4778
    • Krogh, A.1    Mian, I.S.2    Haussler, D.3
  • 7
    • 0000241874
    • GENMARK: Parallel gene recognition for both DNA strands
    • Borodovsky M, McIninch J. GENMARK: parallel gene recognition for both DNA strands. Comput Chem. 17:1993;123-133.
    • (1993) Comput Chem , vol.17 , pp. 123-133
    • Borodovsky, M.1    McIninch, J.2
  • 8
    • 0032518163
    • Microbial gene identification using interpolated Markov models
    • Salzberg SL, Delcher AL, Kasif S, White O. Microbial gene identification using interpolated Markov models. Nucleic Acids Res. 26:1998;544-548.
    • (1998) Nucleic Acids Res , vol.26 , pp. 544-548
    • Salzberg, S.L.1    Delcher, A.L.2    Kasif, S.3    White, O.4
  • 9
    • 0028136885
    • Bacterial gene transfer by genetic transformation in the environment
    • Lorenz MG, Wackernagel W. Bacterial gene transfer by genetic transformation in the environment. Microbiol Rev. 58:1994;563-602.
    • (1994) Microbiol Rev , vol.58 , pp. 563-602
    • Lorenz, M.G.1    Wackernagel, W.2
  • 10
    • 0026332291
    • Evidence for horizontal gene transfer in Escherichia coli speciation
    • Medigue C, Rouxel T, Vigier P, Henaut A, Danchin A. Evidence for horizontal gene transfer in Escherichia coli speciation. J Mol Biol. 222:1991;851-856.
    • (1991) J Mol Biol , vol.222 , pp. 851-856
    • Medigue, C.1    Rouxel, T.2    Vigier, P.3    Henaut, A.4    Danchin, A.5
  • 12
    • 0030569016
    • What drives codon choices in human genes?
    • Karlin S, Mrazek J. What drives codon choices in human genes? J Mol Biol. 262:1996;459-472.
    • (1996) J Mol Biol , vol.262 , pp. 459-472
    • Karlin, S.1    Mrazek, J.2
  • 13
    • 0025278259
    • Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences
    • Bucher P. Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences. J Mol Biol. 212:1990;563-578.
    • (1990) J Mol Biol , vol.212 , pp. 563-578
    • Bucher, P.1
  • 15
    • 0029038960
    • Predicting pol II promoter sequences using transcription factor binding sites
    • Prestridge DS. Predicting pol II promoter sequences using transcription factor binding sites. J Mol Biol. 249:1995;923-932.
    • (1995) J Mol Biol , vol.249 , pp. 923-932
    • Prestridge, D.S.1
  • 16
    • 0030213227
    • Interpreting cDNA sequences: Some insights from studies on translation
    • Kozak M. Interpreting cDNA sequences: some insights from studies on translation. Mamm Genome. 7:1996;563-574.
    • (1996) Mamm Genome , vol.7 , pp. 563-574
    • Kozak, M.1
  • 17
    • 0031586003
    • Prediction of complete gene structures in human genomic DNA
    • of outstanding interest. The authors introduce a probabilistic model for the structural and sequence compositional properties of genes in human genomic DNA and describe the application of this model to gene finding using the program GENSCAN. The model architecture employed is quite general, allowing for multiple complete or partial gene structures occurring on either or both DNA strands. The model also captures sequence properties of some of the most important cis elements involved in transcription, translation, and pre-mRNA splicing, as well as the length distributions of gene components such as exons and introns. The results show significant improvements in predictive accuracy over other available gene-finding methods, as measured on standard test sets of human and vertebrate genomic sequences.
    • Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 268:1997;78-94.
    • (1997) J Mol Biol , vol.268 , pp. 78-94
    • Burge, C.1    Karlin, S.2
  • 18
    • 0001877802
    • Splicing of precursors to mRNAs by the spliceosome
    • Gesteland RF, Atkins JF, editors. Plainview, New York: Cold Spring Harbor Laboratory Press
    • Moore MJ, Query CC, Sharp PA. Splicing of precursors to mRNAs by the spliceosome. In: Gesteland RF, Atkins JF, editors. RNA World. 1993;305-358. Cold Spring Harbor Laboratory Press, Plainview, New York.
    • (1993) RNA World , pp. 305-358
    • Moore, M.J.1    Query, C.C.2    Sharp, P.A.3
  • 19
    • 0031456850
    • Classification of introns: U2-type or U12-type
    • of special interest. The authors summarize recent research that has shown: that a very small fraction of nuclear pre-mRNA introns have AT and AC dinucleotides at their 5′ and 3′ termini rather than the more common termini of GT and AG; that two distinct types of spliceosome, termed U2-type and U12-type, are present in both animal and plant cells; that individual introns are apparently spliced by only one type of spliceosome or the other; and that, contrary to what was initially thought, the type of spliceosome used is not determined simply by the terminal dinucleotides, but instead depends on the presence or absence of specific internal consensus sequences at both the 5′ splice site and branch site of the intron. Known U12-type AT-AC introns, U2-type AT-AC introns, and U12-type GT-AG introns are tabulated and consensus patterns are described.
    • Sharp PA, Burge CB. Classification of introns: U2-type or U12-type. Cell. 91:1997;875-879.
    • (1997) Cell , vol.91 , pp. 875-879
    • Sharp, P.A.1    Burge, C.B.2
  • 20
    • 0030810993
    • Finding genes in DNA with a hidden Markov model
    • Henderson J, Salzberg S, Fasman KH. Finding genes in DNA with a hidden Markov model. J Comput Biol. 4:1997;127-141.
    • (1997) J Comput Biol , vol.4 , pp. 127-141
    • Henderson, J.1    Salzberg, S.2    Fasman, K.H.3
  • 23
    • 0000093674
    • Modeling dependencies in pre-mRNA splicing signals
    • Salzberg S, Searls DB, Kasif S, editors. Amsterdam: Elsevier Science
    • Burge C. Modeling dependencies in pre-mRNA splicing signals. In: Salzberg S, Searls DB, Kasif S, editors. Computational Methods in Molecular Biology. 1998;127-163. Elsevier Science, Amsterdam.
    • (1998) Computational Methods in Molecular Biology , pp. 127-163
    • Burge, C.1
  • 24
    • 0028895417
    • Exon recognition in vertebrate splicing
    • Berget SM. Exon recognition in vertebrate splicing. J Biol Chem. 270:1995;2411-2414.
    • (1995) J Biol Chem , vol.270 , pp. 2411-2414
    • Berget, S.M.1
  • 25
    • 0031027525
    • Identification of protein coding regions in the human genome by quadratic discriminant analysis
    • of special interest. This paper describes a program called MZEF for the prediction of internal coding exons in genomic sequences using a weighted combination of factors related to splice sites and the composition of exons and introns. The method, using quadratic discriminant analysis, is a generalization of the linear discriminant analysis approach to exon prediction used by Solovyev et al. [34] with the widely used HEXON/FGENEH program.
    • Zhang MQ. Identification of protein coding regions in the human genome by quadratic discriminant analysis. Proc Natl Acad Sci USA. 94:1997;565-568.
    • (1997) Proc Natl Acad Sci USA , vol.94 , pp. 565-568
    • Zhang, M.Q.1
  • 26
    • 0027059264
    • Assessment of protein coding measures
    • Fickett JW, Tung C-S. Assessment of protein coding measures. Nucleic Acids Res. 20:1992;6441-6450.
    • (1992) Nucleic Acids Res , vol.20 , pp. 6441-6450
    • Fickett, J.W.1    Tung C-S2
  • 27
    • 0029587471
    • The human genome: Organization and evolutionary history
    • Bernardi G. The human genome: organization and evolutionary history. Annu Rev Genet. 29:1995;445-476.
    • (1995) Annu Rev Genet , vol.29 , pp. 445-476
    • Bernardi, G.1
  • 28
    • 0030560766
    • Base composition and gene distribution: Critical patterns in mammalian genome organization
    • Gardiner K. Base composition and gene distribution: critical patterns in mammalian genome organization. Trends Genet. 12:1996;519-524.
    • (1996) Trends Genet , vol.12 , pp. 519-524
    • Gardiner, K.1
  • 29
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • Rabiner LR. A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE. 77:1989;257-285.
    • (1989) Proc IEEE , vol.77 , pp. 257-285
    • Rabiner, L.R.1
  • 32
    • 0030333286
    • A generalized hidden Markov model for the recognition of human genes in DNA
    • of special interest. This paper pioneered the use of 'generalized' HMMs, in which gene structure is modeled by an underlying Markov process of generalized hidden states, each of which can emit one or more nucleotides, possibly according to probabilities derived from an internal model structure such as a HMM or neural network.
    • Kulp D, Haussler D, Reese MG, Eeckman FH. A generalized hidden Markov model for the recognition of human genes in DNA. Proceedings of the Fourth International Conference on Intelligent Systems for Molecular Biology. 1996; AAAI Press, Menlo Park.
    • (1996) Proceedings of the Fourth International Conference on Intelligent Systems for Molecular Biology
    • Kulp, D.1    Haussler, D.2    Reese, M.G.3    Eeckman, F.H.4
  • 33
    • 0003415703
    • Identification of genes in human genomic DNA
    • Stanford: Stanford University
    • Burge C. Identification of genes in human genomic DNA. PhD thesis. 1997;Stanford University, Stanford.
    • (1997) PhD Thesis
    • Burge, C.1
  • 34
    • 0029814920
    • A segment-based dynamic programming algorithm for predicting gene structure
    • Wu T. A segment-based dynamic programming algorithm for predicting gene structure. J Comput Biol. 3:1996;375-394.
    • (1996) J Comput Biol , vol.3 , pp. 375-394
    • Wu, T.1
  • 35
    • 0028618270
    • Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames
    • Solovyev VV, Salamov AA, Lawrence CB. Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames. Nucleic Acids Res. 22:1994;5156-5163.
    • (1994) Nucleic Acids Res , vol.22 , pp. 5156-5163
    • Solovyev, V.V.1    Salamov, A.A.2    Lawrence, C.B.3
  • 37
    • 0031573382
    • A tool for analyzing and annotating genomic sequences
    • Huang X, Adams MD, Zhou H, Kerlavage AR. A tool for analyzing and annotating genomic sequences. Genomics. 46:1997;37-45.
    • (1997) Genomics , vol.46 , pp. 37-45
    • Huang, X.1    Adams, M.D.2    Zhou, H.3    Kerlavage, A.R.4
  • 39
    • 0027968068
    • CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
    • Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:1994;4673-4680.
    • (1994) Nucleic Acids Res , vol.22 , pp. 4673-4680
    • Thompson, J.D.1    Higgins, D.G.2    Gibson, T.J.3
  • 40
    • 0028605559
    • Constructing gene models from accurately predicted exons: An application of dynamic programming
    • Xu Y, Mural RJ, Uberbacher EC. Constructing gene models from accurately predicted exons: an application of dynamic programming. Comput Appl Biosci. 10:1994;613-623.
    • (1994) Comput Appl Biosci , vol.10 , pp. 613-623
    • Xu, Y.1    Mural, R.J.2    Uberbacher, E.C.3
  • 41
    • 0030786487
    • Las Vegas algorithms for gene recognition: Suboptimal and error-tolerant spliced alignment
    • of special interest. The authors describe some variations and extensions to the 'spliced alignment' algorithm introduced by Gelfand et al. [38] and implemented in the PROCRUSTES program. The basic idea of PROCRUSTES is to identify the exon and intron structure of a gene in a genomic sequence by searching for the set of genomic segments (predicted exons) that maximize a global similarity measure to a pre-specified homologous protein. These homology-based methods may be extremely accurate when a sufficiently similar protein is available (e.g. human genomic DNA versus orthologous mouse protein).
    • Sze S-H, Pevzner PA. Las Vegas algorithms for gene recognition: suboptimal and error-tolerant spliced alignment. J Comput Biol. 4:1997;297-309.
    • (1997) J Comput Biol , vol.4 , pp. 297-309
    • Sze S-H1    Pevzner, P.A.2
  • 42
    • 0030479536
    • The origin of interspersed repeats in the human genome
    • Smit AFA. The origin of interspersed repeats in the human genome. Curr Opin Genet Dev. 6:1996;743-748.
    • (1996) Curr Opin Genet Dev , vol.6 , pp. 743-748
    • Smit, A.F.A.1
  • 43
    • 0030094146
    • CENSOR - A program for identification and elimination of repetitive elements from DNA sequences
    • Jurka J, Klonowski P, Dagman V, Pelton P. CENSOR - a program for identification and elimination of repetitive elements from DNA sequences. Comput Chem. 20:1996;119-122.
    • (1996) Comput Chem , vol.20 , pp. 119-122
    • Jurka, J.1    Klonowski, P.2    Dagman, V.3    Pelton, P.4
  • 44
    • 0030854739
    • tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence
    • Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25:1997;955-964.
    • (1997) Nucleic Acids Res , vol.25 , pp. 955-964
    • Lowe, T.M.1    Eddy, S.R.2
  • 45
    • 0026456701
    • The human XIST gene: Analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly localized within the nucleus
    • Brown CJ, Hendrich BD, Rupert JL, Lafreniere RG, Xing Y, Lawrence J, Willard HF. The human XIST gene: analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly localized within the nucleus. Cell. 71:1992;527-542.
    • (1992) Cell , vol.71 , pp. 527-542
    • Brown, C.J.1    Hendrich, B.D.2    Rupert, J.L.3    Lafreniere, R.G.4    Xing, Y.5    Lawrence, J.6    Willard, H.F.7
  • 46
    • 0026749475
    • DNA sequence analysis of 66 kb of human MHC class II region encoding a cluster of genes for antigen processing
    • Beck S, Kelly A, Radley E, Khurshid F, Alderton RP, Trowsdale J. DNA sequence analysis of 66 kb of human MHC class II region encoding a cluster of genes for antigen processing. J Mol Biol. 228:1992;433-441.
    • (1992) J Mol Biol , vol.228 , pp. 433-441
    • Beck, S.1    Kelly, A.2    Radley, E.3    Khurshid, F.4    Alderton, R.P.5    Trowsdale, J.6
  • 47
    • 0028782111
    • Statistical studies of biomolecular sequences: Score-based methods
    • Karlin S. Statistical studies of biomolecular sequences: score-based methods. Phil Trans R Soc Lond Biol. 344:1994;391-402.
    • (1994) Phil Trans R Soc Lond Biol , vol.344 , pp. 391-402
    • Karlin, S.1
  • 48
    • 0030786488
    • Automated gene identification in large-scale genomic sequences
    • Xu Y, Uberbacher EC. Automated gene identification in large-scale genomic sequences. J Comput Biol. 4:1997;325-338.
    • (1997) J Comput Biol , vol.4 , pp. 325-338
    • Xu, Y.1    Uberbacher, E.C.2
  • 49
    • 0031972648
    • Regulation of sex-specific selection of fruitless 5′ splice sites by transformer and transformer-2
    • Heinrichs V, Ryner LC, Baker BS. Regulation of sex-specific selection of fruitless 5′ splice sites by transformer and transformer-2. Mol Cell Biol. 18:1998;450-458.
    • (1998) Mol Cell Biol , vol.18 , pp. 450-458
    • Heinrichs, V.1    Ryner, L.C.2    Baker, B.S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.