Volume 17, Issue 2, 2003, Pages 213-234

Convergence of simulation-based policy iteration

Author keywords

[No Author keywords available]

Indexed keywords

MARKOV PROCESSES

EID: 0038380746     PISSN: 02699648     EISSN: None     Source Type: Journal    
DOI: 10.1017/S0269964803172051     Document Type: Article
Times cited: 32

References (22)
  • 5
    • 0032027940
    • The relations among potentials, perturbation analysis, and Markov decision processes
    • Cao, X.-R. (1998). The relations among potentials, perturbation analysis, and Markov decision processes. Discrete-Event Dynamic Systems: Theory and Applications 8: 71-87.
    • (1998) Discrete-Event Dynamic Systems: Theory and Applications , vol.8 , pp. 71-87
    • Cao, X.-R.1
  • 6
    • 0033247533
    • Single sample path-based optimization of Markov chains
    • Cao, X.-R. (1999). Single sample path-based optimization of Markov chains. Journal of Optimization Theory and Applications 100: 527-548.
    • (1999) Journal of Optimization Theory and Applications , vol.100 , pp. 527-548
    • Cao, X.-R.1
  • 7
    • 0033884215
    • A unified approach to Markov decision problems and performance sensitivity analysis
    • Cao, X.-R. (2000). A unified approach to Markov decision problems and performance sensitivity analysis. Automatica 36: 771-774.
    • (2000) Automatica , vol.36 , pp. 771-774
    • Cao, X.-R.1
  • 8
    • 0031258478
    • Perturbation realization, potentials, and sensitivity analysis of Markov processes
    • Cao, X.-R. & Chen, H.-F. (1997). Perturbation realization, potentials, and sensitivity analysis of Markov processes. IEEE Transactions on Automatic Control 42(10): 1382-1393.
    • (1997) IEEE Transactions on Automatic Control , vol.42 , Issue.10 , pp. 1382-1393
    • Cao, X.-R.1    Chen, H.-F.2
  • 11
    • 0343893613
    • Actor-critic-type learning algorithms for Markov decision processes
    • Konda, V.R. & Borkar, V.S. (1999). Actor-critic-type learning algorithms for Markov decision processes. SIAM Journal on Control and Optimization 38: 94-123.
    • (1999) SIAM Journal on Control and Optimization , vol.38 , pp. 94-123
    • Konda, V.R.1    Borkar, V.S.2
  • 12
    • 0035112214
    • A probabilistic analysis of bias optimality in unichain Markov decision processes
    • Lewis, M.E. & Puterman, M.L. (2001). A probabilistic analysis of bias optimality in unichain Markov decision processes. IEEE Transactions on Automatic Control 46(1): 96-100.
    • (2001) IEEE Transactions on Automatic Control , vol.46 , Issue.1 , pp. 96-100
    • Lewis, M.E.1    Puterman, M.L.2
  • 14
    • 0005193926
    • Exact sampling with coupled Markov chains and applications to statistical mechanics
    • Propp, J.G. & Wilson, D.B. (1996). Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms 9: 223-252.
    • (1996) Random Structures and Algorithms , vol.9 , pp. 223-252
    • Propp, J.G.1    Wilson, D.B.2
  • 15
    • 0000162904
    • How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph
    • Propp, J.G. & Wilson, D.B. (1998). How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph. Journal of Algorithms 27: 170-217.
    • (1998) Journal of Algorithms , vol.27 , pp. 170-217
    • Propp, J.G.1    Wilson, D.B.2
  • 20
    • 0028497630
    • Asynchronous stochastic approximation and Q-learning
    • Tsitsiklis, J.N. (1994). Asynchronous stochastic approximation and Q-learning. Machine Learning 16: 185-202.
    • (1994) Machine Learning , vol.16 , pp. 185-202
    • Tsitsiklis, J.N.1
  • 22
    • 0000617461
    • Layered multishift coupling for use in perfect sampling algorithms (with a primer on CFTP)
    • Providence, RI: American Mathematical Society
    • Wilson, D.B. (2000). Layered multishift coupling for use in perfect sampling algorithms (with a primer on CFTP). Fields Institute Communications, Vol. 26. Providence, RI: American Mathematical Society.
    • (2000) Fields Institute Communications , vol.26
    • Wilson, D.B.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.