1. Aberdeen, D., J. Baxter. 2001. Internal-state policy-gradient algorithms for infinite-horizon POMDPs. Technical report, RSISE, Australian National University, Canberra, Australia.
4. Bierth, K. J. 1987. An expected average reward criterion. Stochastic Process. Appl. 26 123-140.
6. Feinberg, E. A. 1980. An ε-optimal control of a finite Markov chain with an average reward criterion. Theory Probab. Appl. 25 70-81.
7. Feinberg, E. A. 1982. Controlled Markov processes with arbitrary numerical criteria. Theory Probab. Appl. 27 486-503.
8. Feinberg, E. A. 1982. Nonrandomized Markov and semi-Markov strategies in dynamic programming. Theory Probab. Appl. 27 116-126.
9. Fernández-Gaucherand, E., A. Arapostathis, S. I. Marcus. 1991. On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes. Ann. Oper. Res. 29 439-470.
10. Hsu, S.-P., D.-M. Chuang, A. Arapostathis. 2006. On the existence of stationary optimal policies for partially observed MDPs under the long-run average cost criterion. Systems Control Lett. 55 165-173.
11. Jaakkola, T. S., S. P. Singh, M. I. Jordan. 1995. Reinforcement learning algorithm for partially observable Markov decision problems. Proc. Neural Inform. Processing Systems Conf., Denver, CO. MIT Press, Cambridge, MA.
12. Lauritzen, S. L. 1996. Graphical Models. Oxford University Press, Oxford, UK.
13. Meuleau, N., L. Peshkin, K.-E. Kim, L. P. Kaelbling. 1999. Learning finite-state controllers for partially observable environments. Proc. 15th Conf. Uncertainty in Artificial Intelligence, Stockholm, Sweden. Morgan Kaufmann, San Francisco.
14. Platzman, L. K. 1980. Optimal infinite-horizon undiscounted control of finite probabilistic systems. SIAM J. Control Optim. 18(4) 362-380.
16. Ross, S. M. 1968. Arbitrary state Markovian decision processes. Ann. Math. Statist. 39(6) 2118-2122.
17. Runggaldier, W. J., L. Stettner. 1994. Approximations of Discrete Time Partially Observable Control Problems. Applied Mathematics Monographs, Vol. 6. Giardini Editori e Stampatori, Pisa, Italy.
18. Yu, H. 2005. A function approximation approach to estimation of policy gradient for POMDP with structured policies. Proc. 21st Conf. Uncertainty in Artificial Intelligence, Edinburgh, UK. AUAI Press.