



Volume 257, Issue 5066, 1992, Pages 39-49

Chance and statistical significance in protein and DNA sequence analysis

Author keywords

[No Author keywords available]

Indexed keywords

AMINO ACID SEQUENCE; ARTICLE; DNA SEQUENCE; NUCLEOTIDE SEQUENCE; PRIORITY JOURNAL; STATISTICAL ANALYSIS;

EID: 0026718403     PISSN: 00368075     EISSN: None     Source Type: Journal    
DOI: 10.1126/science.1621093     Document Type: Article
Times cited : (154)

References (118)
  • 3
    • 0003725141
    • Although our primary interests derive from studies of biomolecular sequences, the concepts and methods described in this article can be adapted to sequence comparisons of human speech and text collations, of bird songs and general musical scores, and to many contexts in computer science, including string editing, comparison of computer files, coding theory, and information theory. For extensive references and other discussions on these topics, see the work cited here (Addison-Wesley, Reading, MA)
    • (1983) Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparisons
    • Sankoff, D.1    Kruskal, J.B.2
  • 31
    • 0020475449
    • The hydropathy index of Kyte and Doolittle assigns the following values to the amino acids: I, 4.5; V, 4.2; L, 3.8; F, 2.8; C, 2.5; M, 1.9; A, 1.8; G, -0.4; T, -0.7; S, -0.8; W, -0.9; Y, -1.3; P, -1.6; H, -3.2; E, -3.5; Q, -3.5; D, -3.5; N, -3.5; K, -3.9; and R, -4.5. In a typical hydropathy plot these scores are averaged over a small fixed window and the resulting moving average is plotted along the sequence. Hydrophobic segments show up as positive peaks in the profile. The question is what constitutes a significant peak, that is, one higher or broader (or both) than would occur through chance fluctuations
    • (1982) J. Mol. Biol , vol.157 , pp. 105
    • Kyte, J.1    Doolittle, R.F.2
  • 41
    • 0001506434
    • The result of formula 1 can also be applied in characterizing the asymptotic maximal waiting time distribution for the single-server queue (GI/G/1)
    • (1972) Ann. Math. Stat , vol.43 , pp. 627
    • Iglehart, D.1
  • 42
    • 0001392225
    • for insurance risk models and traffic flow. The Markov chain extension of formula 1 is developed in (16)
    • (1982) Adv. Appl. Prob , vol.14 , pp. 143
    • Asmussen, S.1
  • 43
    • 0001621270
    • The parameter λ∗ is known as the conjugate or dual exponent associated with a process of partial sums of independently identically distributed real random variables [for example, (7, 19)]. Its Markov chain analog is set forth in (15, 16); see also
    • (1987) Ann. Prob , vol.15 , pp. 561
    • Ney, P.1    Nummelin, E.2
  • 53
    • 0025275577
    • For a sequence of length N and a letter occurring with frequency f, the probability of observing a run of this letter of length exceeding L = ln N/(-ln f) + z is asymptotically at most 1 - exp{-(1 - f) f^z}. Setting this probability equal to 0.01, we obtain the z and L corresponding to the length of runs significant at the 1% level. Formulas for estimating the significance of runs with errors and of periodic patterns (for example, charge occurring every second or third residue) are given in this reference; see also (57)
    • (1990) Methods Enzymol , vol.183 , pp. 388
    • Karlin, S.1    Blaisdell, B.E.2    Brendel, V.3
  • 85
    • 0002218919
    • Applications of scan-statistics analysis for a fixed-length sliding window pertain to phenomena such as clusters of disease in time, generalized birthday proximities, and rth nearest-neighbor problems. Early work on scan statistics focused mainly on exact formulae [see the bibliographic compilation in this reference], exploiting calculations of coincidence probabilities in diffusion stochastic processes.
    • (1979) Int. Stat. Rev , vol.47 , pp. 47
    • Naus, J.I.1
  • 86
    • 0023140234
    • More recent distributional studies of scan statistics concentrate largely on bounds and approximations [for example
    • (1987) Stat. Med , vol.6 , pp. 197
    • Wallenstein, S.1    Neff, N.2
  • 89
    • 84972496740
    • and (43)]. The theory extends to the case where the marker sites {X1} are generated in a Markov-dependent manner (43)
    • (1990) Stat. Sci , vol.5 , pp. 403
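
The hydropathy plot described in the note to reference 31 (per-residue scores averaged over a small fixed window, with the moving average read along the sequence) can be sketched roughly as follows. This is only an illustration: the function name `hydropathy_profile` and the default window of 9 residues are assumptions made here, not values taken from the record, and the sketch says nothing about which peaks are statistically significant.

```python
# Hydropathy values exactly as listed in the note to reference 31.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydropathy_profile(sequence, window=9):
    """Return the moving average of hydropathy scores along a sequence.

    `window` is a hypothetical default; the record only says "a small
    fixed window size". One value is returned per full window, so the
    result has len(sequence) - window + 1 entries.
    """
    scores = [HYDROPATHY[residue] for residue in sequence.upper()]
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]
```

Plotting the returned list against window position gives the profile in which hydrophobic segments appear as positive peaks.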

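The run-length threshold in the note to reference 53 can be worked through numerically. Setting the bound 1 - exp{-(1 - f) f^z} equal to a chosen level alpha and solving gives z = ln(-ln(1 - alpha)/(1 - f)) / ln f, and then L = ln N/(-ln f) + z. The sketch below assumes that reading of the formula; the function name `run_length_threshold` and its parameter names are hypothetical.

```python
import math


def run_length_threshold(n, f, alpha=0.01):
    """Solve 1 - exp(-(1 - f) * f**z) = alpha for z, then compute L.

    n     : sequence length N
    f     : frequency of the letter, 0 < f < 1
    alpha : significance level (0.01 matches the 1% level in the note)
    Returns (z, L) with L = ln N / (-ln f) + z.
    """
    if n <= 1 or not 0.0 < f < 1.0 or not 0.0 < alpha < 1.0:
        raise ValueError("need n > 1, 0 < f < 1 and 0 < alpha < 1")
    # (1 - f) * f**z = -ln(1 - alpha)  =>  z = ln(target / (1 - f)) / ln f
    target = -math.log1p(-alpha)
    z = math.log(target / (1.0 - f)) / math.log(f)
    length = math.log(n) / (-math.log(f)) + z
    return z, length
```

For instance, `run_length_threshold(1000, 0.1)` returns the z and L for a letter of frequency 0.1 in a sequence of 1000 letters; runs longer than L would be flagged at the 1% level under this asymptotic bound.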

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.