Volume 3, 2002, Pages 3367-3371

Gradient-based policy iteration: An example

Author keywords

Discrete event dynamic systems; Markov decision processes; Perturbation analysis; Poisson equations; Potentials; Q learning; W factors

Indexed keywords

GRADIENT METHODS; ITERATIVE METHODS; MARKOV PROCESSES; OPTIMIZATION; PERTURBATION TECHNIQUES; POISSON EQUATION; SENSITIVITY ANALYSIS

EID: 0036992818; PISSN: 01912216; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper
Times cited: 11

References (12)
  • 2
    • EID: 0032027940
    • X.R. Cao, The relations among potentials, perturbation analysis, and Markov decision processes, Discrete Event Dynamic Systems: Theory and Applications, Vol. 8, 71-87, 1998.
  • 3
    • EID: 0033884215
    • X.R. Cao, A unified approach to Markov decision problems and performance sensitivity analysis, Automatica, Vol. 36, 771-774, 2000.
  • 4
    • EID: 0242711423
    • X.R. Cao, From perturbation analysis to Markov decision processes and reinforcement learning, manuscript, 2002.
  • 5
    • EID: 0031258478
    • X.R. Cao and H.F. Chen, Potentials, perturbation realization, and sensitivity analysis of Markov processes, IEEE Transactions on Automat. Control, Vol. 42, 1382-1393, 1997.
  • 6
    • EID: 84971173381
    • D.R. Cox, A use of complex probabilities in the theory of stochastic processes, Proceedings Cambridge Philosophical Society, Vol. 51, 313-319, 1955.
  • 9
    • EID: 0031344030
    • S.P. Meyn, The policy iteration algorithm for average reward Markov decision processes with general state space, IEEE Trans. Automat. Control, Vol. 42, 1663-1680, 1997.
  • 11


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.