Volume 53, Issue 4, 2008, Pages 1076-1082

Event-based optimization of Markov systems

Author keywords

Markov decision processes (MDPs); Performance potentials; Perturbation analysis (PA); Policy gradients; Policy iteration

Indexed keywords

DECISION THEORY; OPTIMIZATION; PERTURBATION TECHNIQUES; SENSITIVITY ANALYSIS;

EID: 44849134414     PISSN: 00189286     EISSN: None     Source Type: Journal    
DOI: 10.1109/TAC.2008.919557     Document Type: Article
Times cited : (45)

References (9)
  • 1
    • A. Barto and S. Mahadevan, "Recent advances in hierarchical reinforcement learning, special issue on reinforcement learning," Discrete Event Dyn. Syst.: Theory Appl., vol. 13, pp. 41-77, 2003.
  • 2
    • J. Baxter and P. L. Bartlett, "Infinite-horizon policy-gradient estimation," J. Artif. Intell. Res., vol. 15, pp. 319-350, 2001.
  • 3
    • X. R. Cao, "Basic ideas for event-based optimization of Markov systems," Discrete Event Dyn. Syst.: Theory Appl., vol. 15, pp. 169-197, 2005.
  • 4
    • X. R. Cao and H. F. Chen, "Perturbation realization, potentials and sensitivity analysis of Markov processes," IEEE Trans. Autom. Control, vol. 42, no. 10, pp. 1382-1393, Oct. 1997.
  • 5
    • X. R. Cao and J. Y. Zhang, "The nth-order bias optimality for multi-chain Markov decision processes," IEEE Trans. Autom. Control, vol. 53, no. 2, pp. 496-508, Mar. 2008.
  • 7
    • W. L. Cooper, S. G. Henderson, and M. E. Lewis, "Convergence of simulation-based policy iteration," Probab. Eng. Inf. Sci., vol. 17, pp. 213-234, 2003.
  • 8
    • H. T. Fang and X. R. Cao, "Potential-based on-line policy iteration algorithms for Markov decision processes," IEEE Trans. Autom. Control, vol. 49, no. 4, pp. 493-505, Apr. 2004.
  • 9
    • P. Marbach and J. N. Tsitsiklis, "Simulation-based optimization of Markov reward processes," IEEE Trans. Autom. Control, vol. 46, no. 2, pp. 191-209, Feb. 2001.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.