[1] J. Abounadi, D. P. Bertsekas, and V. S. Borkar (2001), Learning algorithms for Markov decision processes with average cost, SIAM J. Control Optim., 40, pp. 681-698.
[3] D. P. Bertsekas and J. N. Tsitsiklis (1991), An analysis of stochastic shortest path problems, Math. Oper. Res., 16, pp. 580-595.
[8] V. S. Borkar (1993), White noise representations in stochastic realization theory, SIAM J. Control Optim., 31, pp. 1093-1102.
[10] V. S. Borkar (1998), Asynchronous stochastic approximations, SIAM J. Control Optim., 36, pp. 840-851. Correction note in ibid., 38 (2000), pp. 662-663.
[11] V. S. Borkar and S. P. Meyn (2000), The O.D.E. method for convergence of stochastic approximation and reinforcement learning, SIAM J. Control Optim., 38, pp. 447-469.
[13] S. Csibi (1975), Learning under computational constraints from weakly dependent samples, Prob. Control Inform. Theory, 4, pp. 3-21.
[14] L. Gerencsér (1992), Rate of convergence of recursive estimators, SIAM J. Control Optim., 30, pp. 1200-1227.
[15] T. Jaakkola, M. I. Jordan, and S. P. Singh (1994), On the convergence of stochastic iterative dynamic programming algorithms, Neural Computation, 6, pp. 1185-1201.
[19] L. Ljung (1977), Analysis of recursive stochastic algorithms, IEEE Trans. Automat. Control, 22, pp. 551-575.
[21] P. Tseng, D. P. Bertsekas, and J. N. Tsitsiklis (1990), Partially asynchronous parallel algorithms for network flow and other problems, SIAM J. Control Optim., 28, pp. 678-710.
[22] J. N. Tsitsiklis (1994), Asynchronous stochastic approximation and Q-learning, Machine Learning, 16, pp. 185-202.
[23] C. J. C. H. Watkins (1989), Learning from delayed rewards, Ph.D. thesis, Cambridge University, Cambridge, England.
[25] F. W. Wilson (1969), Smoothing derivatives of functions and applications, Trans. Amer. Math. Soc., 139, pp. 413-428.