Volume 42, Issue 12, 1997, Pages 1663-1680

The policy iteration algorithm for average reward Markov decision processes with general state space

Author keywords

Howard's algorithm; Markov decision processes; Multiclass queueing networks; Poisson equation; Policy iteration algorithm

Indexed keywords

ALGORITHMS; COMPUTATIONAL METHODS; COMPUTER NETWORKS; CONVERGENCE OF NUMERICAL METHODS; ITERATIVE METHODS; MARKOV PROCESSES; PROBLEM SOLVING; STATE SPACE METHODS; SYSTEM STABILITY;

EID: 0031344030     PISSN: 00189286     EISSN: None     Source Type: Journal    
DOI: 10.1109/9.650016     Document Type: Article
Times cited: 99

References (49)
  • 2
    • 0003448964
    • Topics in controlled Markov chains
    • UK: Longman Scientific & Technical
    • V. S. Borkar, "Topics in controlled Markov chains," in Pitman Res. Notes in Math. Series # 240. UK: Longman Scientific & Technical, 1991.
    • (1991) Pitman Res. Notes in Math. Series # 240
    • Borkar, V.S.1
  • 4
    • 0040898816
    • Value iteration in a class of average controlled Markov chains with unbounded costs: Necessary and sufficient conditions for pointwise convergence
    • R. Cavazos-Cadena and E. Fernandez-Gaucherand, "Value iteration in a class of average controlled Markov chains with unbounded costs: Necessary and sufficient conditions for pointwise convergence," J. Appl. Probability, vol. 33, pp. 986-1002, 1996.
    • (1996) J. Appl. Probability , vol.33 , pp. 986-1002
    • Cavazos-Cadena, R.1    Fernandez-Gaucherand, E.2
  • 7
    • 0001875921
    • Fluid network models: Linear programs for control and performance bounds
    • J. Cruz, J. Gertler, and M. Peshkin, Eds., San Francisco, CA
    • J. Humphrey, D. Eng, and S. P. Meyn, "Fluid network models: Linear programs for control and performance bounds," in Proc. 13th IFAC World Congr., J. Cruz, J. Gertler, and M. Peshkin, Eds., San Francisco, CA, 1996, vol. B, pp. 19-24.
    • (1996) Proc. 13th IFAC World Congr. , vol.B , pp. 19-24
    • Humphrey, J.1    Eng, D.2    Meyn, S.P.3
  • 8
    • 0030086281
    • Stability and instability of fluid models for certain re-entrant lines
    • Feb.
    • J. Dai and G. Weiss, "Stability and instability of fluid models for certain re-entrant lines," Math. Ops. Res., vol. 21, no. 1, pp. 115-134, Feb. 1996.
    • (1996) Math. Ops. Res. , vol.21 , Issue.1 , pp. 115-134
    • Dai, J.1    Weiss, G.2
  • 9
    • 0003077340
    • On the positive Harris recurrence for multiclass queueing networks: A unified approach via fluid limit models
    • J. G. Dai, "On the positive Harris recurrence for multiclass queueing networks: A unified approach via fluid limit models," Ann. Appl. Probab., vol. 5, pp. 49-77, 1995.
    • (1995) Ann. Appl. Probab. , vol.5 , pp. 49-77
    • Dai, J.G.1
  • 10
    • 0029404157
    • Stability and convergence of moments for multiclass queueing networks via fluid limit models
    • Nov.
    • J. G. Dai and S. P. Meyn, "Stability and convergence of moments for multiclass queueing networks via fluid limit models," IEEE Trans. Automat. Contr., vol. 40, pp. 1889-1904, Nov. 1995.
    • (1995) IEEE Trans. Automat. Contr. , vol.40 , pp. 1889-1904
    • Dai, J.G.1    Meyn, S.P.2
  • 11
    • 2342525669
    • Counterexamples for compact action Markov decision chains with average reward criteria
    • R. Dekker, "Counterexamples for compact action Markov decision chains with average reward criteria," Comm. Statist.-Stoch. Models, vol. 3, pp. 357-368, 1987.
    • (1987) Comm. Statist.-Stoch. Models , vol.3 , pp. 357-368
    • Dekker, R.1
  • 12
    • 0343812266
    • Denumerable state MDPs
    • C. Derman, "Denumerable state MDPs," Ann. Math. Statist., vol. 37, pp. 1545-1554, 1966.
    • (1966) Ann. Math. Statist. , vol.37 , pp. 1545-1554
    • Derman, C.1
  • 13
    • 21344460828
    • Geometric and uniform ergodicity of Markov processes
    • D. Down, S. P. Meyn, and R. L. Tweedie, "Geometric and uniform ergodicity of Markov processes," Ann. Probab., vol. 23, no. 4, pp. 1671-1691, 1996.
    • (1996) Ann. Probab. , vol.23 , Issue.4 , pp. 1671-1691
    • Down, D.1    Meyn, S.P.2    Tweedie, R.L.3
  • 17
    • 0010107106
    • Ergodicity of queueing networks
    • S. Foss, "Ergodicity of queueing networks," Siberian Math. J., vol. 32, pp. 183-202, 1991.
    • (1991) Siberian Math. J. , vol.32 , pp. 183-202
    • Foss, S.1
  • 18
    • 0030522182
    • A Lyapunov bound for solutions of Poisson's equation
    • Apr.
    • P. W. Glynn and S. P. Meyn, "A Lyapunov bound for solutions of Poisson's equation," Ann. Probab., vol. 24, Apr. 1996.
    • (1996) Ann. Probab. , vol.24
    • Glynn, P.W.1    Meyn, S.P.2
  • 21
    • 0006238280
    • Recurrence conditions for Markov decision processes with Borel state space: A survey
    • O. Hernández-Lerma, R. Montes-de-Oca, and R. Cavazos-Cadena, "Recurrence conditions for Markov decision processes with Borel state space: A survey," Ann. Operations Res., vol. 28, pp. 29-46, 1991.
    • (1991) Ann. Operations Res. , vol.28 , pp. 29-46
    • Hernández-Lerma, O.1    Montes-de-Oca, R.2    Cavazos-Cadena, R.3
  • 23
    • 0023295768
    • On the convergence of policy iteration
    • A. Hordijk and M. L. Puterman, "On the convergence of policy iteration," Math. Ops. Res., vol. 12, pp. 163-176, 1987.
    • (1987) Math. Ops. Res. , vol.12 , pp. 163-176
    • Hordijk, A.1    Puterman, M.L.2
  • 27
    • 0029732538
    • Duality and linear programs for stability and performance analysis of queueing networks and scheduling policies
    • Jan.
    • P. R. Kumar and S. P. Meyn, "Duality and linear programs for stability and performance analysis of queueing networks and scheduling policies," IEEE Trans. Automat. Contr., vol. 41, pp. 4-17, Jan. 1996.
    • (1996) IEEE Trans. Automat. Contr. , vol.41 , pp. 4-17
    • Kumar, P.R.1    Meyn, S.P.2
  • 28
    • 0028749105
    • Fluctuation smoothing policies are stable for stochastic re-entrant lines
    • Dec.
    • S. Kumar and P. R. Kumar, "Fluctuation smoothing policies are stable for stochastic re-entrant lines," in Proc. 33rd IEEE Conf. Decision Contr., Dec. 1994.
    • (1994) Proc. 33rd IEEE Conf. Decision Contr.
    • Kumar, S.1    Kumar, P.R.2
  • 30
    • 21344474295
    • Transience of multiclass queueing networks via fluid limit models
    • S. P. Meyn, "Transience of multiclass queueing networks via fluid limit models," Ann. Appl. Probab., vol. 5, pp. 946-957, 1995.
    • (1995) Ann. Appl. Probab. , vol.5 , pp. 946-957
    • Meyn, S.P.1
  • 31
    • 0001695753
    • Stability of generalized Jackson networks
    • S. P. Meyn and D. Down, "Stability of generalized Jackson networks," Ann. Appl. Probab., vol. 4, pp. 124-148, 1994.
    • (1994) Ann. Appl. Probab. , vol.4 , pp. 124-148
    • Meyn, S.P.1    Down, D.2
  • 32
    • 0009089457
    • Generalized resolvents and Harris recurrence of Markov processes
    • S. P. Meyn and R. L. Tweedie, "Generalized resolvents and Harris recurrence of Markov processes," Contemporary Math., vol. 149, pp. 227-250, 1993.
    • (1993) Contemporary Math. , vol.149 , pp. 227-250
    • Meyn, S.P.1    Tweedie, R.L.2
  • 34
    • 0001340188
    • Stability of Markovian processes III: Foster-Lyapunov criteria for continuous time processes
    • _, "Stability of Markovian processes III: Foster-Lyapunov criteria for continuous time processes," Adv. Appl. Probab., vol. 25, pp. 518-548, 1993.
    • (1993) Adv. Appl. Probab. , vol.25 , pp. 518-548
  • 37
    • 0039367797
    • On the Poisson equation in the potential theory of a single kernel
    • _, "On the Poisson equation in the potential theory of a single kernel," Math. Scand., vol. 68, pp. 59-82, 1991.
    • (1991) Math. Scand. , vol.68 , pp. 59-82
  • 38
    • 0013467006
    • Ph.D. thesis, Univ. Illinois, Urbana, IL, Sept. Tech. Rep. UILU-ENG-93-2237 (DC-155)
    • J. Perkins, "Control of push and pull manufacturing systems," Ph.D. thesis, Univ. Illinois, Urbana, IL, Sept. 1993; Tech. Rep. UILU-ENG-93-2237 (DC-155).
    • (1993) Control of Push and Pull Manufacturing Systems
    • Perkins, J.1
  • 40
    • 12144262499
    • Optimal stationary policies in general state space Markov decision chains with finite action set
    • Nov.
    • R. K. Ritt and L. I. Sennott, "Optimal stationary policies in general state space Markov decision chains with finite action set," Math. Ops. Res., vol. 17, no. 4, pp. 901-909, Nov. 1993.
    • (1993) Math. Ops. Res. , vol.17 , Issue.4 , pp. 901-909
    • Ritt, R.K.1    Sennott, L.I.2
  • 41
    • 33747116089
    • Applied probability models with optimization applications
    • republication of the work first published by Holden-Day, 1970
    • S. M. Ross, "Applied probability models with optimization applications," in Dover Books on Advanced Mathematics, 1992; republication of the work first published by Holden-Day, 1970.
    • (1992) Dover Books on Advanced Mathematics
    • Ross, S.M.1
  • 42
    • 0022737669
    • A new condition for the existence of optimal stationary policies in average cost Markov decision processes
    • L. I. Sennott, "A new condition for the existence of optimal stationary policies in average cost Markov decision processes," Ops. Res. Lett., vol. 5, pp. 17-23, 1986.
    • (1986) Ops. Res. Lett. , vol.5 , pp. 17-23
    • Sennott, L.I.1
  • 43
    • 0024702152
    • Average cost optimal stationary policies in infinite state Markov decision processes with unbounded cost
    • _, "Average cost optimal stationary policies in infinite state Markov decision processes with unbounded cost," Ops. Res., vol. 37, pp. 626-633, 1989.
    • (1989) Ops. Res. , vol.37 , pp. 626-633
  • 44
    • 0009245727
    • The convergence of value iteration in average cost Markov decision chains
    • _, "The convergence of value iteration in average cost Markov decision chains," Ops. Res. Lett., vol. 19, pp. 11-16, 1996.
    • (1996) Ops. Res. Lett. , vol.19 , pp. 11-16
  • 46
    • 0001153782
    • Subgeometric rates of convergence of f-ergodic Markov chains
    • P. Tuominen and R. L. Tweedie, "Subgeometric rates of convergence of f-ergodic Markov chains," Adv. Appl. Probab., vol. 26, pp. 775-798, 1994.
    • (1994) Adv. Appl. Probab. , vol.26 , pp. 775-798
    • Tuominen, P.1    Tweedie, R.L.2
  • 47
    • 0000048705
    • Optimal control of service rates in networks of queues
    • R. Weber and S. Stidham, "Optimal control of service rates in networks of queues," Adv. Appl. Probab., vol. 19, pp. 202-218, 1987.
    • (1987) Adv. Appl. Probab. , vol.19 , pp. 202-218
    • Weber, R.1    Stidham, S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.