Volume 30, Issue 3, 2005, Pages 545-561

On the empirical state-action frequencies in Markov decision processes under general policies

Author keywords

Empirical measure; Large deviations; Markov decision processes; State-action frequencies

Indexed keywords

EMPIRICAL MEASURE; LARGE DEVIATIONS; MARKOV DECISION PROCESSES; STATE-ACTION FREQUENCIES

EID: 31144437617     PISSN: 0364765X     EISSN: 15265471     Source Type: Journal    
DOI: 10.1287/moor.1050.0148     Document Type: Article
Times cited: 28

References (17)
  • 1
    • 0026188613
    • Markov decision problems and state-action frequencies
    • Altman, E., A. Shwartz. 1991. Markov decision problems and state-action frequencies. SIAM J. Control Optim. 29(4) 786-809.
    • (1991) SIAM J. Control Optim. , vol.29 , Issue.4 , pp. 786-809
    • Altman, E.1    Shwartz, A.2
  • 2
    • 1942469570
    • Rate of convergence of empirical measures and costs in controlled Markov chains and transient optimality
    • Altman, E., O. Zeitouni. 1994. Rate of convergence of empirical measures and costs in controlled Markov chains and transient optimality. Math. Oper. Res. 19(4) 955-974.
    • (1994) Math. Oper. Res. , vol.19 , Issue.4 , pp. 955-974
    • Altman, E.1    Zeitouni, O.2
  • 3
    • 0002801896
    • Multiplicative ergodicity and large deviations for an irreducible Markov chain
    • Balaji, S., S. P. Meyn. 2000. Multiplicative ergodicity and large deviations for an irreducible Markov chain. Stochastic Processes Their Appl. 90 123-144.
    • (2000) Stochastic Processes Their Appl. , vol.90 , pp. 123-144
    • Balaji, S.1    Meyn, S.P.2
  • 9
    • 0037079674
    • Hoeffding's inequality for uniformly ergodic Markov chains
    • Glynn, P. W., D. Ormoneit. 2002. Hoeffding's inequality for uniformly ergodic Markov chains. Statist. Probab. Lett. 56 143-146.
    • (2002) Statist. Probab. Lett. , vol.56 , pp. 143-146
    • Glynn, P.W.1    Ormoneit, D.2
  • 10
    • 0001108744
    • Hitting and occupation time bounds implied by drift analysis and applications
    • Hajek, B. 1982. Hitting and occupation time bounds implied by drift analysis and applications. Adv. Appl. Probab. 14 502-525.
    • (1982) Adv. Appl. Probab. , vol.14 , pp. 502-525
    • Hajek, B.1
  • 11
    • 84980072516
    • A strong law for the maximum cumulative sum of independent random variables
    • Hirsch, W. M. 1965. A strong law for the maximum cumulative sum of independent random variables. Comm. Pure Appl. Math. 18 109-127.
    • (1965) Comm. Pure Appl. Math. , vol.18 , pp. 109-127
    • Hirsch, W.M.1
  • 12
    • 0003759935
    • Linear programming and finite Markovian control problems
    • Mathematics Centrum Tract, Amsterdam, The Netherlands
    • Kallenberg, L. C. M. 1983. Linear programming and finite Markovian control problems. Technical Report 148, Mathematics Centrum Tract, Amsterdam, The Netherlands.
    • (1983) Technical Report , vol.148
    • Kallenberg, L.C.M.1
  • 14
    • 0000413365
    • A convexity property in the theory of random variables defined by a finite Markov chain
    • Miller, H. D. 1961. A convexity property in the theory of random variables defined by a finite Markov chain. Ann. Math. Statist. 32(4) 1260-1270.
    • (1961) Ann. Math. Statist. , vol.32 , Issue.4 , pp. 1260-1270
    • Miller, H.D.1
  • 16
    • 9444280555
    • Extremal large deviations in controlled I.I.D. processes with applications to hypothesis testing
    • Shimkin, N. 1993. Extremal large deviations in controlled I.I.D. processes with applications to hypothesis testing. Adv. Appl. Probab. 25 875-894.
    • (1993) Adv. Appl. Probab. , vol.25 , pp. 875-894
    • Shimkin, N.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.