SCOPUS Information Search Platform

Probability in the Engineering and Informational Sciences

Volume 17, Issue 2, 2003, Pages 213-234

Convergence of simulation-based policy iteration

(3) Cooper, William L. (a); Henderson, Shane G. (b); Lewis, Mark E. (c)

a 4 174 EE CSCI Building (United States)

b School of Civil and Environmental Engineering (United States)

c UNIVERSITY OF MICHIGAN (United States)

Author keywords

[No Author keywords available]

Indexed keywords

MARKOV PROCESSES;

ALMOST SURE CONVERGENCE; AVERAGE REWARD; MARKOV DECISION PROCESSES; OPTIMAL DECISION-RULE; OPTIMAL POLICIES; POLICY ITERATION; POLICY ITERATION ALGORITHMS; SIMULATION RUN LENGTH;

ITERATIVE METHODS;

EID: 0038380746 PISSN: 02699648 EISSN: None Source Type: Journal
DOI: 10.1017/S0269964803172051 Document Type: Article

Times cited: 32

References (22)

1
- 0020970738
- Neuron-like elements that can solve difficult learning control problems
- Barto, A.G., Sutton, R.S., & Anderson, C.W. (1983). Neuron-like elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man and Cybernetics 13: 835-846.
- (1983) IEEE Transactions on Systems, Man and Cybernetics , vol.13 , pp. 835-846
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

2
- 0003565783
- Belmont, MA: Athena Scientific
- Bertsekas, D.P. (1995). Dynamic programming and optimal control, Vol. II. Belmont, MA: Athena Scientific.
- (1995) Dynamic Programming and Optimal Control , vol.2
- Bertsekas, D.P.¹

3
- 0003487482
- Belmont, MA: Athena Scientific
- Bertsekas, D.P. & Tsitsiklis, J.N. (1996). Neuro-dynamic programming, Belmont, MA: Athena Scientific.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

4
- 0003618624
- New York: Springer-Verlag
- Brémaud, P. (1999). Markov chains: Gibbs fields, Monte Carlo simulation, and queues. New York: Springer-Verlag.
- (1999) Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues
- Brémaud, P.¹

5
- 0032027940
- The relations among potentials, perturbation analysis, and Markov decision processes
- Cao, X.-R. (1998). The relations among potentials, perturbation analysis, and Markov decision processes. Discrete-Event Dynamic Systems: Theory and Applications 8: 71-87.
- (1998) Discrete-Event Dynamic Systems: Theory and Applications , vol.8 , pp. 71-87
- Cao, X.-R.¹

6
- 0033247533
- Single sample path-based optimization of Markov chains
- Cao, X.-R. (1999). Single sample path-based optimization of Markov chains. Journal of Optimization Theory and Applications 100: 527-548.
- (1999) Journal of Optimization Theory and Applications , vol.100 , pp. 527-548
- Cao, X.-R.¹

7
- 0033884215
- A unified approach to Markov decision problems and performance sensitivity analysis
- Cao, X.-R. (2000). A unified approach to Markov decision problems and performance sensitivity analysis. Automatica 36: 771-774.
- (2000) Automatica , vol.36 , pp. 771-774
- Cao, X.-R.¹

8
- 0031258478
- Perturbation realization, potentials, and sensitivity analysis of Markov processes
- Cao, X.-R. & Chen, H.-F. (1997). Perturbation realization, potentials, and sensitivity analysis of Markov processes. IEEE Transactions on Automatic Control 42(10): 1382-1393.
- (1997) IEEE Transactions on Automatic Control , vol.42 , Issue.10 , pp. 1382-1393
- Cao, X.-R.¹ Chen, H.-F.²

9
- 0004120309
- Boston: Birkhäuser
- Fristedt, B. & Gray, L. (1997). A modern approach to probability theory. Boston: Birkhäuser.
- (1997) A Modern Approach to Probability Theory
- Fristedt, B.¹ Gray, L.²

10
- 0003644124
- New York: Wiley
- Howard, R.A. (1960). Dynamic programming and Markov processes. New York: Wiley.
- (1960) Dynamic Programming and Markov Processes
- Howard, R.A.¹

11
- 0343893613
- Actor-critic-type learning algorithms for Markov decision processes
- Konda, V.R. & Borkar, V.S. (1999). Actor-critic-type learning algorithms for Markov decision processes. SIAM Journal on Control and Optimization 38: 94-123.
- (1999) SIAM Journal on Control and Optimization , vol.38 , pp. 94-123
- Konda, V.R.¹ Borkar, V.S.²

12
- 0035112214
- A probabilistic analysis of bias optimality in unichain Markov decision processes
- Lewis, M.E. & Puterman, M.L. (2001). A probabilistic analysis of bias optimality in unichain Markov decision processes. IEEE Transactions on Automatic Control 46(1): 96-100.
- (2001) IEEE Transactions on Automatic Control , vol.46 , Issue.1 , pp. 96-100
- Lewis, M.E.¹ Puterman, M.L.²

13
- 33646733560
- Bias optimality
- E. Feinberg & A. Shwartz (eds.), Boston: Kluwer Academic
- Lewis, M.E. & Puterman, M.L. (2002). Bias optimality. In E. Feinberg & A. Shwartz (eds.), Handbook of Markov decision processes: Methods and applications. Boston: Kluwer Academic.
- (2002) Handbook of Markov Decision Processes: Methods and Applications
- Lewis, M.E.¹ Puterman, M.L.²

14
- 0005193926
- Exact sampling with coupled Markov chains and applications to statistical mechanics
- Propp, J.G. & Wilson, D.B. (1996). Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms 9: 223-252.
- (1996) Random Structures and Algorithms , vol.9 , pp. 223-252
- Propp, J.G.¹ Wilson, D.B.²

15
- 0000162904
- How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph
- Propp, J.G. & Wilson, D.B. (1998). How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph. Journal of Algorithms 27: 170-217.
- (1998) Journal of Algorithms , vol.27 , pp. 170-217
- Propp, J.G.¹ Wilson, D.B.²

16
- 85102627959
- New York: Wiley
- Puterman, M.L. (1994). Markov decision processes: Discrete stochastic dynamic programming. New York: Wiley.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

17
- 0004167552
- Boston: Birkhäuser
- Resnick, S. (1992). Adventures in stochastic processes. Boston: Birkhäuser.
- (1992) Adventures in Stochastic Processes
- Resnick, S.¹

18
- 0003778293
- New York: Wiley
- Ross, S.M. (1996). Stochastic processes, 2nd ed. New York: Wiley.
- (1996) Stochastic Processes, 2nd Ed.
- Ross, S.M.¹

19
- 0004176142
- New York: Springer-Verlag
- Thorisson, H. (2000). Coupling, stationarity, and regeneration. New York: Springer-Verlag.
- (2000) Coupling, Stationarity, and Regeneration
- Thorisson, H.¹

20
- 0028497630
- Asynchronous stochastic approximation and Q-learning
- Tsitsiklis, J.N. (1994). Asynchronous stochastic approximation and Q-learning. Machine Learning 16: 195-202.
- (1994) Machine Learning , vol.16 , pp. 195-202
- Tsitsiklis, J.N.¹

21
- 34249833101
- Q-learning
- Watkins, C.J.C.H. & Dayan, P. (1992). Q-learning. Machine Learning 8: 279-292.
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

22
- 0000617461
- Layered multishift coupling for use in perfect sampling algorithms (with a primer on CFTP)
- Providence, RI: American Mathematical Society
- Wilson, D.B. (2000). Layered multishift coupling for use in perfect sampling algorithms (with a primer on CFTP). Fields Institute Communications, Vol. 26. Providence, RI: American Mathematical Society.
- (2000) Fields Institute Communications , vol.26
- Wilson, D.B.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.