SCOPUS Information Search Platform

Volume 3, 2002, Pages 3367-3371

Gradient-based policy iteration: An example

Author keywords

Discrete event dynamic systems; Markov decision processes; Perturbation analysis; Poisson equations; Potentials; Q learning; W factors

Indexed keywords

GRADIENT METHODS; ITERATIVE METHODS; MARKOV PROCESSES; OPTIMIZATION; PERTURBATION TECHNIQUES; POISSON EQUATION; SENSITIVITY ANALYSIS

DISCRETE EVENT DYNAMIC SYSTEMS; REINFORCEMENT LEARNING

DISCRETE TIME CONTROL SYSTEMS

EID: 0036992818; PISSN: 01912216; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited: 11

References (12)

1
- 0003487482
- Athena Scientific, Belmont, Massachusetts
- D. P. Bertsekas, and J. N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, Belmont, Massachusetts, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

2
- 0032027940
- The relations among potentials, perturbation analysis, and Markov decision processes
- X.R. Cao, The relations among potentials, perturbation analysis, and Markov decision processes, Discrete Event Dynamic Systems: Theory and Applications, Vol. 8, 71-87, 1998.
- (1998) Discrete Event Dynamic Systems: Theory and Applications , vol.8 , pp. 71-87
- Cao, X.R.¹

3
- 0033884215
- A unified approach to Markov decision problems and performance sensitivity analysis
- X.R. Cao, A unified approach to Markov decision problems and performance sensitivity analysis, Automatica, 36, 771-774, 2000.
- (2000) Automatica , vol.36 , pp. 771-774
- Cao, X.R.¹

5
- 0031258478
- Potentials, perturbation realization, and sensitivity analysis of Markov processes
- X. R. Cao and H. F. Chen, Potentials, perturbation realization, and sensitivity analysis of Markov processes, IEEE Transactions on Automat. Control, Vol. 42, 1382-1393, 1997.
- (1997) IEEE Transactions on Automat. Control , vol.42 , pp. 1382-1393
- Cao, X.R.¹ Chen, H.F.²

6
- 84971173381
- A use of complex probabilities in the theory of stochastic processes
- D.R. Cox, A use of complex probabilities in the theory of stochastic processes, Proceedings Cambridge Philosophical Society, 51, 313-319, 1955.
- (1955) Proceedings Cambridge Philosophical Society , vol.51 , pp. 313-319
- Cox, D.R.¹

7
- 0003864139
- Kluwer Academic Publishers, Boston
- C. Cassandras and S. Lafortune, Introduction to Discrete Event Dynamic Systems, Kluwer Academic Publishers, Boston, 1999.
- (1999) Introduction to Discrete Event Dynamic Systems
- Cassandras, C.¹ Lafortune, S.²

8
- 0003585978
- Kluwer Academic Publishers, Boston
- Y. C. Ho and X. R. Cao, Perturbation Analysis of Discrete-Event Dynamic Systems, Kluwer Academic Publishers, Boston, 1991.
- (1991) Perturbation Analysis of Discrete-Event Dynamic Systems
- Ho, Y.C.¹ Cao, X.R.²

9
- 0031344030
- The policy iteration algorithm for average reward Markov decision processes with general state space
- S.P. Meyn, The policy iteration algorithm for average reward Markov decision processes with general state space, IEEE Trans. Automat. Control, 42, 1663-1680, 1997.
- (1997) IEEE Trans. Automat. Control , vol.42 , pp. 1663-1680
- Meyn, S.P.¹

10
- 85102627959
- Wiley, New York
- M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, New York, 1994.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

11
- 0003737306
- Elsevier, Amsterdam
- D. Revuz, Markov Chains, Elsevier, Amsterdam, 1984.
- (1984) Markov Chains
- Revuz, D.¹

12
- 0004102479
- MIT Press, Cambridge, MA
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.