-
3
-
-
0003725141
-
-
Although our primary interests derive from studies of biomolecular sequences, the concepts and methods described in this article can be adapted to sequence comparisons in human speech and text collation, in bird songs and general musical scores, and in many areas of computer science, including string editing, comparison of computer files, coding theory, and information theory. For extensive references and other discussions on these topics, see Addison-Wesley, Reading, MA
-
(1983)
Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison
-
-
Sankoff, D.
Kruskal, J.B.
-
31
-
-
0020475449
-
-
The hydropathy index of Kyte and Doolittle associates the following values with the amino acids: I, 4.5; V, 4.2; L, 3.8; F, 2.8; C, 2.5; M, 1.9; A, 1.8; G, -0.4; T, -0.7; S, -0.8; W, -0.9; Y, -1.3; P, -1.6; H, -3.2; E, -3.5; Q, -3.5; D, -3.5; N, -3.5; K, -3.9; and R, -4.5. In a typical hydropathy plot these scores are averaged over a small fixed window and the resulting moving average is plotted along the sequence. Hydrophobic segments show up as positive peaks in the profile. The question is what constitutes a significant peak, that is, one higher or broader (or both) than would occur through chance fluctuations.
-
(1982)
J. Mol. Biol
, vol.157
, pp. 105
-
-
Kyte, J.
Doolittle, R.F.
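The moving-average profile described in this note can be sketched as follows. This is a minimal illustration, not the procedure of the cited paper: the function name `hydropathy_profile` and the default window of 9 residues are assumptions made here, and the table simply repeats the values listed in the note.

```python
# Hydropathy values exactly as listed in the note above.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydropathy_profile(sequence, window=9):
    """Return the moving average of hydropathy scores along a protein sequence.

    `window` (an illustrative default, not taken from the source) is the
    number of consecutive residues averaged at each position. The result
    has len(sequence) - window + 1 entries.
    """
    scores = [HYDROPATHY[residue] for residue in sequence.upper()]
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the sequence length")

    # Running sum: add the residue entering the window, drop the one leaving.
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile
```

Positive stretches in the returned list correspond to the hydrophobic peaks mentioned in the note. Whether a given peak is significant is a separate question that this sketch does not address.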
-
41
-
-
0001506434
-
-
The result of formula 1 can also be applied in characterizing the asymptotic maximal waiting time distribution for the single-server queue (GI/G/1)
-
(1972)
Ann. Math. Stat
, vol.43
, pp. 627
-
-
Iglehart, D.
-
42
-
-
0001392225
-
-
for insurance risk models and traffic flow. The Markov chain extension of formula 1 is developed in (16).
-
(1982)
Adv. Appl. Prob
, vol.14
, pp. 143
-
-
Asmussen, S.
-
43
-
-
0001621270
-
-
The parameter λ∗ is known as the conjugate or dual exponent associated with a process of partial sums of independent, identically distributed real random variables [for example, (7, 19)]. Its Markov chain analog is set forth in (15, 16); see also
-
(1987)
Ann. Prob
, vol.15
, pp. 561
-
-
Ney, P.
Nummelin, E.
-
53
-
-
0025275577
-
-
For a sequence of length N and a letter occurring with frequency f, the probability of observing a run of this letter of length exceeding $L = \ln N/(-\ln f) + z$ is asymptotically at most $1 - \exp\{-(1 - f) f^{z}\}$. Setting this probability equal to 0.01 we obtain z and L corresponding to the length of runs significant at the 1% level. Formulas for estimating the significance of runs with errors and of periodic patterns (for example, charge occurring every second or third residue) are given in the reference that follows; see also (57)
-
(1990)
Methods Enzymol
, vol.183
, pp. 388
-
-
Karlin, S.
Blaisdell, B.E.
Brendel, V.
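The 1% threshold described in this note can be worked out numerically. The sketch below assumes the bound reads 1 - exp{-(1 - f) f^z}, with f raised to the power z, which is an interpretation of the garbled source text. The function names and the `alpha` parameter are illustrative choices, not part of the original.

```python
import math


def run_tail_bound(f, z):
    """Asymptotic upper bound on P(longest run > ln N / (-ln f) + z).

    Assumes the bound in the note is 1 - exp(-(1 - f) * f**z).
    """
    return 1.0 - math.exp(-(1.0 - f) * f ** z)


def significant_run_length(n, f, alpha=0.01):
    """Solve the bound for z at level alpha and return (z, L).

    n     : sequence length N
    f     : frequency of the letter, strictly between 0 and 1
    alpha : significance level (0.01 corresponds to the 1% level in the note)
    """
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # 1 - exp(-(1 - f) f^z) = alpha  =>  f^z = -ln(1 - alpha) / (1 - f)
    z = math.log(-math.log(1.0 - alpha) / (1.0 - f)) / math.log(f)
    length = math.log(n) / (-math.log(f)) + z
    return z, length
```

For example, `significant_run_length(1000, 0.1)` returns the offset z and the threshold L for a letter of frequency 0.1 in a sequence of 1000 residues. Runs longer than L would be flagged at the 1% level under this bound.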
-
85
-
-
0002218919
-
-
Applications of scan-statistics analysis for a fixed-length sliding window pertain to phenomena such as clusters of disease in time, generalized birthday proximities, and rth nearest-neighbor problems. Early work on scan statistics focused mainly on exact formulae, exploiting calculations of coincidence probabilities in diffusion stochastic processes [see the bibliographic compilation of Naus].
-
(1979)
Int. Stat. Rev
, vol.47
, pp. 47
-
-
Naus, J.I.
-
86
-
-
0023140234
-
-
More recent distributional studies of scan statistics concentrate largely on bounds and approximations [for example
-
(1987)
Stat. Med
, vol.6
, pp. 197
-
-
Wallenstein, S.
Neff, N.
-
89
-
-
84972496740
-
-
and (43)]. The theory extends to the case where the marker sites {X_i} are generated in a Markov-dependent manner (43).
-
(1990)
Stat. Sci
, vol.5
, pp. 403
-
-