-
1
-
-
62949170421
-
Constrained Markov decision processes
-
Chapman & Hall/CRC, Boca Raton, FL
-
E. Altman. Constrained Markov decision processes. Stochastic Modeling. Chapman & Hall/CRC, Boca Raton, FL, 1999.
-
(1999)
Stochastic Modeling
-
-
Altman, E.1
-
3
-
-
0034437507
-
-
V. S. Borkar. Average cost dynamic programming equations for controlled Markov chains with partial observations. SIAM J. Control Optim., 39(3):673-681 (electronic), 2000.
-
-
-
-
-
4
-
-
0037290932
-
Dynamic programming for ergodic control with partial observations
-
V. S. Borkar. Dynamic programming for ergodic control with partial observations. Stoch. Proc. Applns., 103(2):293-310, 2003.
-
(2003)
Stoch. Proc. Applns
, vol.103
, Issue.2
, pp. 293-310
-
-
Borkar, V.S.1
-
5
-
-
0033876515
-
-
V. S. Borkar and S. P. Meyn. The O.D.E. method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control Optim., 38(2):447-469, 2000. (also presented at the IEEE CDC, December, 1998).
-
-
-
-
-
6
-
-
0037097924
-
Optimal prediction with memory
-
A. Chorin, O. Hald, and R. Kupferman. Optimal prediction with memory. Physica D, 166:239-257, 2002.
-
(2002)
Physica D
, vol.166
, pp. 239-257
-
-
Chorin, A.1
Hald, O.2
Kupferman, R.3
-
8
-
-
84980140517
-
-
M.D. Donsker and S.R.S. Varadhan. Asymptotic evaluation of certain Markov process expectations for large time. I. II. Comm. Pure Appl. Math., 28:1-47; ibid. 28 (1975), 279-301, 1975.
-
-
-
-
-
9
-
-
0003273397
-
Hidden Markov models
-
Springer-Verlag, New York. Estimation and control.
-
R. J. Elliott, L. Aggoun, and J. B. Moore. Hidden Markov models, volume 29 of Applications of Mathematics (New York). Springer-Verlag, New York, 1995. Estimation and control.
-
(1995)
Applications of Mathematics
, vol.29
-
-
Elliott, R.J.1
Aggoun, L.2
Moore, J.B.3
-
10
-
-
13244262450
-
Handbook of Markov decision processes
-
E. A. Feinberg and A. Shwartz, editors, Kluwer Academic Publishers, Boston, MA, Methods and applications
-
E. A. Feinberg and A. Shwartz, editors. Handbook of Markov decision processes. International Series in Operations Research & Management Science, 40. Kluwer Academic Publishers, Boston, MA, 2002. Methods and applications.
-
(2002)
International Series in Operations Research & Management Science
, vol.40
-
-
-
12
-
-
0343893613
-
-
V. R. Konda and V. S. Borkar. Actor-critic-type learning algorithms for Markov decision processes. SIAM J. Control Optim., 38(1):94-123 (electronic), 1999.
-
-
-
-
-
13
-
-
4043069840
-
-
V. R. Konda and J. N. Tsitsiklis. On actor-critic algorithms. SIAM J. Control Optim., 42(4):1143-1166 (electronic), 2003.
-
-
-
-
-
14
-
-
0037279497
-
-
I. Kontoyiannis and S. P. Meyn. Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab., 13:304-362, 2003. Presented at the INFORMS Applied Probability Conference, NYC, July, 2001.
-
-
-
-
-
15
-
-
0016533472
-
Applying a new device in the optimization of exponential queueing systems
-
S. Lippman. Applying a new device in the optimization of exponential queueing systems. Operations Res., 23:687-710, 1975.
-
(1975)
Operations Res
, vol.23
, pp. 687-710
-
-
Lippman, S.1
-
16
-
-
62949120272
-
-
G. Mathew and S. Meyn. Learning macroscopic dynamics for optimal prediction. Submitted to 2008 IEEE Conf. on Dec. and Control. Preliminary version presented at Info. Thy. & Appl. at ITA, UCSD 2008.
-
-
-
-
-
17
-
-
56449091120
-
An analysis of reinforcement learning with function approximation
-
F. S. Melo, S. Meyn, and M. Isabel Ribeiro. An analysis of reinforcement learning with function approximation. In Proceedings of ICML, pages 664-671, 2008.
-
(2008)
Proceedings of ICML
, pp. 664-671
-
-
Melo, F.S.1
Meyn, S.2
Isabel Ribeiro, M.3
-
19
-
-
0031344030
-
The policy iteration algorithm for average reward Markov decision processes with general state space
-
S. P. Meyn. The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Trans. Automat. Control, 42(12):1663-1680, 1997.
-
(1997)
IEEE Trans. Automat. Control
, vol.42
, Issue.12
, pp. 1663-1680
-
-
Meyn, S.P.1
-
21
-
-
0003637131
-
-
Springer-Verlag, London, second edition. 2008 edition to appear, Cambridge University Press, Cambridge Mathematical Library.
-
S. P. Meyn and R. L. Tweedie. Markov Chains and Stochastic Stability. Springer-Verlag, London, second edition, 1993. 2008 edition to appear, Cambridge University Press, Cambridge Mathematical Library. 1993 edition online: http://black.csl.uiuc.edu/~meyn/pages/book.html.
-
(1993)
Markov Chains and Stochastic Stability
-
-
Meyn, S.P.1
Tweedie, R.L.2
-
22
-
-
0000514837
-
Transport, collective motion, and Brownian motion
-
H. Mori. Transport, collective motion, and Brownian motion. Progress of Theoretical Physics, 33:423-455, 1965.
-
(1965)
Progress of Theoretical Physics
, vol.33
, pp. 423-455
-
-
Mori, H.1
-
23
-
-
0036832956
-
Kernel-based reinforcement learning
-
Dirk Ormoneit and Śaunak Sen. Kernel-based reinforcement learning. Mach. Learn., 49(2-3):161-178, 2002.
-
(2002)
Mach. Learn
, vol.49
, Issue.2-3
, pp. 161-178
-
-
Ormoneit, D.1
Sen, S.2
-
24
-
-
0001296683
-
Perturbation theory and finite Markov chains
-
P. J. Schweitzer. Perturbation theory and finite Markov chains. J. Appl. Prob., 5:401-403, 1968.
-
(1968)
J. Appl. Prob
, vol.5
, pp. 401-403
-
-
Schweitzer, P.J.1
-
25
-
-
84856043672
-
A mathematical theory of communication
-
C.E. Shannon. A mathematical theory of communication. Bell System Tech. J., 27:379-423, 623-656, 1948.
-
(1948)
Bell System Tech. J
, vol.27
, pp. 379-423, 623-656
-
-
Shannon, C.E.1
-
27
-
-
0026626024
-
Jointly optimal routing and scheduling in packet radio networks
-
L. Tassiulas and A. Ephremides. Jointly optimal routing and scheduling in packet radio networks. IEEE Trans. Inform. Theory, 38(1):165-168, 1992.
-
(1992)
IEEE Trans. Inform. Theory
, vol.38
, Issue.1
, pp. 165-168
-
-
Tassiulas, L.1
Ephremides, A.2
-
28
-
-
0029752470
-
Feature-based methods for large scale dynamic programming
-
J. N. Tsitsiklis and B. Van Roy. Feature-based methods for large scale dynamic programming. Mach. Learn., 22(1-3):59-94, 1996.
-
(1996)
Mach. Learn
, vol.22
, Issue.1-3
, pp. 59-94
-
-
Tsitsiklis, J.N.1
Van Roy, B.2