Volume 48, Issue 5, 2003, Pages 758-769

Semi-Markov decision problems and performance sensitivity analysis

Author keywords

Discounted Poisson equations; Discrete event dynamic systems (DEDS); Lyapunov equations; Markov decision processes (MDPs); Perturbation analysis (PA); Perturbation realization; Poisson equations; Policy iteration; Potentials; Reinforcement learning (RL)

Indexed keywords

ALGORITHMS; DECISION MAKING; ITERATIVE METHODS; LYAPUNOV METHODS; MATHEMATICAL MODELS; PERTURBATION TECHNIQUES; POISSON EQUATION; SENSITIVITY ANALYSIS

EID: 0038631988     PISSN: 00189286     EISSN: None     Source Type: Journal    
DOI: 10.1109/TAC.2003.811252     Document Type: Article
Times cited: 90

References (23)
  • 2
    • 0016036113
    • Nonnegative matrices in the mathematical sciences
    • A. Berman and R. J. Plemmons, "Nonnegative matrices in the mathematical sciences," SIAM J. Numer. Anal., vol. 11, pp. 145-154, 1974.
    • (1974) SIAM J. Numer. Anal. , vol.11 , pp. 145-154
    • Berman, A.1    Plemmons, R.J.2
  • 5
    • 0009843739
    • The Maclaurin series for performance functions of Markov chains
    • _, "The Maclaurin series for performance functions of Markov chains," Adv. Appl. Probab., vol. 30, pp. 676-692, 1998.
    • (1998) Adv. Appl. Probab. , vol.30 , pp. 676-692
  • 6
    • 0038258780
    • The relation among potentials, perturbation analysis, Markov decision processes, and other topics
    • _, "The relation among potentials, perturbation analysis, Markov decision processes, and other topics," J. Discrete Event Dyna. Syst., vol. 8, pp. 71-87, 1998.
    • (1998) J. Discrete Event Dyna. Syst. , vol.8 , pp. 71-87
  • 7
    • 0033884215
    • A unified approach to Markov decision problems and performance sensitivity analysis
    • _, "A unified approach to Markov decision problems and performance sensitivity analysis," Automatica, vol. 36, pp. 771-774, 2000.
    • (2000) Automatica , vol.36 , pp. 771-774
  • 8
    • 0037289322
    • From perturbation analysis to Markov decision processes and reinforcement learning
    • _, "From perturbation analysis to Markov decision processes and reinforcement learning," J. Discrete Event Dyna. Syst., vol. 13, pp. 9-39, 2003.
    • (2003) J. Discrete Event Dyna. Syst. , vol.13 , pp. 9-39
  • 9
    • 0031258478
    • Potentials, perturbation realization, and sensitivity analysis of Markov processes
    • X.-R. Cao and H. F. Chen, "Potentials, perturbation realization, and sensitivity analysis of Markov processes," IEEE Trans. Automat. Contr., vol. 42, pp. 1382-1393, Oct. 1997.
    • (1997) IEEE Trans. Automat. Contr. , vol.42 , pp. 1382-1393
    • Cao, X.-R.1    Chen, H.F.2
  • 10
    • 0036604532
    • A time aggregation approach to Markov decision processes
    • X.-R. Cao, Z. Y. Ren, S. Bhatnagar, M. Fu, and S. Marcus, "A time aggregation approach to Markov decision processes," Automatica, vol. 38, pp. 929-943, 2002.
    • (2002) Automatica , vol.38 , pp. 929-943
    • Cao, X.-R.1    Ren, Z.Y.2    Bhatnagar, S.3    Fu, M.4    Marcus, S.5
  • 13
    • 0038597488
    • Single sample path based recursive algorithms for Markov decision processes
    • H.-T. Fang and X.-R. Cao, "Single sample path based recursive algorithms for Markov decision processes," IEEE Trans. Automat. Contr., 2003, to be published.
    • (2003) IEEE Trans. Automat. Contr.
    • Fang, H.-T.1    Cao, X.-R.2
  • 14
    • 0030522182
    • A Lyapunov bound for solutions of Poisson's equation
    • P. W. Glynn and S. P. Meyn, "A Lyapunov bound for solutions of Poisson's equation," Ann. Probab., vol. 24, pp. 916-931, 1996.
    • (1996) Ann. Probab. , vol.24 , pp. 916-931
    • Glynn, P.W.1    Meyn, S.P.2
  • 17
    • 0004210802
    • Queueing Systems, Volume 1: Theory
    • L. Kleinrock, Queueing Systems, Volume 1: Theory. New York: Wiley, 1975.
    • (1975) Queueing Systems , vol.1
    • Kleinrock, L.1
  • 18
    • 0035249254
    • Simulation-based optimization of Markov reward processes
    • P. Marbach and J. N. Tsitsiklis, "Simulation-based optimization of Markov reward processes," IEEE Trans. Automat. Contr., vol. 46, pp. 191-209, Feb. 2001.
    • (2001) IEEE Trans. Automat. Contr. , vol.46 , pp. 191-209
    • Marbach, P.1    Tsitsiklis, J.N.2
  • 22
    • 0033170372
    • Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • R. S. Sutton, D. Precup, and S. Singh, "Between MDPs and Semi-MDPs: a framework for temporal abstraction in reinforcement learning," Artif. Intell., vol. 112, pp. 181-211, 1999.
    • (1999) Artif. Intell. , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.