SCOPUS Information Retrieval Platform

Volume 42, Issue 12, 1997, Pages 1663-1680

The policy iteration algorithm for average reward Markov decision processes with general state space

University of Illinois at Urbana-Champaign (United States)

Author keywords

Howard's algorithm; Markov decision processes; Multiclass queueing networks; Poisson equation; Policy iteration algorithm

Indexed keywords

ALGORITHMS; COMPUTATIONAL METHODS; COMPUTER NETWORKS; CONVERGENCE OF NUMERICAL METHODS; ITERATIVE METHODS; MARKOV PROCESSES; PROBLEM SOLVING; STATE SPACE METHODS; SYSTEM STABILITY;

HOWARD'S ALGORITHM; LINEAR QUADRATIC GAUSSIAN PROBLEM; MARKOV DECISION PROCESSES; MULTICLASS QUEUEING NETWORKS; POISSON EQUATION; POLICY ITERATION ALGORITHM;

OPTIMAL CONTROL SYSTEMS;

EID: 0031344030 PISSN: 00189286 EISSN: None Source Type: Journal
DOI: 10.1109/9.650016 Document Type: Article

Times cited : (100)

References (49)

1
- 0027557742
- Discrete-time controlled Markov processes with average cost criterion: A survey
- A. Arapostathis, V. S. Borkar, E. Fernandez-Gaucherand, M. K. Ghosh, and S. I. Marcus, "Discrete-time controlled Markov processes with average cost criterion: A survey," SIAM J. Contr. Optim., vol. 31, pp. 282-344, 1993.
- (1993) SIAM J. Contr. Optim. , vol.31 , pp. 282-344
- Arapostathis, A.¹ Borkar, V.S.² Fernandez-Gaucherand, E.³ Ghosh, M.K.⁴ Marcus, S.I.⁵

2
- 0003448964
- Topics in controlled Markov chains
- UK: Longman Scientific & Technical
- V. S. Borkar, "Topics in controlled Markov chains," in Pitman Res. Notes in Math. Series # 240. UK: Longman Scientific & Technical, 1991.
- (1991) Pitman Res. Notes in Math. Series # 240
- Borkar, V.S.¹

3
- 0009090320
- Univ. Autónoma Agraria Antonio Narro, Tech. Rep.
- R. Cavazos-Cadena, "Value iteration in a class of communicating Markov decision chains with the average cost criterion," Univ. Autónoma Agraria Antonio Narro, Tech. Rep., 1996.
- (1996) Value Iteration in a Class of Communicating Markov Decision Chains with the Average Cost Criterion
- Cavazos-Cadena, R.¹

4
- 0040898816
- Value iteration in a class of average controlled Markov chains with unbounded costs: Necessary and sufficient conditions for pointwise convergence
- R. Cavazos-Cadena and E. Fernandez-Gaucherand, "Value iteration in a class of average controlled Markov chains with unbounded costs: Necessary and sufficient conditions for pointwise convergence," J. Appl. Probability, vol. 33, pp. 986-1002, 1996.
- (1996) J. Appl. Probability , vol.33 , pp. 986-1002
- Cavazos-Cadena, R.¹ Fernandez-Gaucherand, E.²

5
- 84866210504
- Univ. de Rouen UFR des Sciences, Tech. Rep.
- F. Charlot and A. Nafidi, "Irréducibilité, petits ensembles, et stabilité des réseaux de Jackson généralisés," Univ. de Rouen UFR des Sciences, Tech. Rep., 1996.
- (1996) Irréducibilité, Petits Ensembles, et Stabilité des Réseaux de Jackson Généralisés
- Charlot, F.¹ Nafidi, A.²

6
- 0041855289
- to be published
- R.-R. Chen and S. P. Meyn, "Value iteration and optimization of multiclass queueing networks," to be published.
- Value Iteration and Optimization of Multiclass Queueing Networks
- Chen, R.-R.¹ Meyn, S.P.²

7
- 0001875921
- Fluid network models: Linear programs for control and performance bounds
- J. Cruz, J. Gertler, and M. Peshkin, Eds., San Francisco, CA
- J. Humphrey, D. Eng, and S. P. Meyn, "Fluid network models: Linear programs for control and performance bounds," in Proc. 13th IFAC World Congr., J. Cruz, J. Gertler, and M. Peshkin, Eds., San Francisco, CA, 1996, vol. B, pp. 19-24.
- (1996) Proc. 13th IFAC World Congr. , vol.B , pp. 19-24
- Humphrey, J.¹ Eng, D.² Meyn, S.P.³

8
- 0030086281
- Stability and instability of fluid models for certain re-entrant lines
- Feb.
- J. Dai and G. Weiss, "Stability and instability of fluid models for certain re-entrant lines," Math. Ops. Res., vol. 21, no. 1, pp. 115-134, Feb. 1996.
- (1996) Math. Ops. Res. , vol.21 , Issue.1 , pp. 115-134
- Dai, J.¹ Weiss, G.²

9
- 0003077340
- On the positive Harris recurrence for multiclass queueing networks: A unified approach via fluid limit models
- J. G. Dai, "On the positive Harris recurrence for multiclass queueing networks: A unified approach via fluid limit models," Ann. Appl. Probab., vol. 5, pp. 49-77, 1995.
- (1995) Ann. Appl. Probab. , vol.5 , pp. 49-77
- Dai, J.G.¹

10
- 0029404157
- Stability and convergence of moments for multiclass queueing networks via fluid limit models
- Nov.
- J. G. Dai and S. P. Meyn, "Stability and convergence of moments for multiclass queueing networks via fluid limit models," IEEE Trans. Automat. Contr., vol. 40, pp. 1889-1904, Nov. 1995.
- (1995) IEEE Trans. Automat. Contr. , vol.40 , pp. 1889-1904
- Dai, J.G.¹ Meyn, S.P.²

11
- 2342525669
- Counterexamples for compact action Markov decision chains with average reward criteria
- R. Dekker, "Counterexamples for compact action Markov decision chains with average reward criteria," Comm. Statist.-Stoch. Models, vol. 3, pp. 357-368, 1987.
- (1987) Comm. Statist.-Stoch. Models , vol.3 , pp. 357-368
- Dekker, R.¹

12
- 0343812266
- Denumerable state MDPs
- C. Derman, "Denumerable state MDPs," Ann. Math. Statist., vol. 37, pp. 1545-1554, 1966.
- (1966) Ann. Math. Statist. , vol.37 , pp. 1545-1554
- Derman, C.¹

13
- 21344460828
- Geometric and uniform ergodicity of Markov processes
- D. Down, S. P. Meyn, and R. L. Tweedie, "Geometric and uniform ergodicity of Markov processes," Ann. Probab., vol. 23, no. 4, pp. 1671-1691, 1996.
- (1996) Ann. Probab. , vol.23 , Issue.4 , pp. 1671-1691
- Down, D.¹ Meyn, S.P.² Tweedie, R.L.³

14
- 9944244112
- Masson
- M. Duflo, Méthodes Récursives Aléatoires. Masson, 1990.
- (1990) Méthodes Récursives Aléatoires
- Duflo, M.¹

15
- 0003634432
- Controlled Markov processes
- New York: Springer-Verlag
- E. B. Dynkin and A. A. Yushkevich, "Controlled Markov processes," in volume Grundlehren der mathematischen Wissenschaften 235 of A Series of Comprehensive Studies in Mathematics. New York: Springer-Verlag, 1979.
- (1979) Volume Grundlehren der Mathematischen Wissenschaften 235 of a Series of Comprehensive Studies in Mathematics
- Dynkin, E.B.¹ Yushkevich, A.A.²

16
- 0000215112
- Massachusetts Inst. Technol., Tech. Rep.
- D. Bertsimas, F. Avram, and M. Ricard, "Fluid models of sequencing problems in open queueing networks: An optimal control approach," Massachusetts Inst. Technol., Tech. Rep., 1995.
- (1995) Fluid Models of Sequencing Problems in Open Queueing Networks: An Optimal Control Approach
- Bertsimas, D.¹ Avram, F.² Ricard, M.³

17
- 0010107106
- Ergodicity of queueing networks
- S. Foss, "Ergodicity of queueing networks," Siberian Math. J., vol. 32, pp. 183-202, 1991.
- (1991) Siberian Math. J. , vol.32 , pp. 183-202
- Foss, S.¹

18
- 0030522182
- A Lyapunov bound for solutions of Poisson's equation
- Apr.
- P. W. Glynn and S. P. Meyn, "A Lyapunov bound for solutions of Poisson's equation," Ann. Probab., vol. 24, Apr. 1996.
- (1996) Ann. Probab. , vol.24
- Glynn, P.W.¹ Meyn, S.P.²

19
- 84866209609
- IPN, Departamento de Matematicas, Mexico, and LAAS-CNRS, France, Tech. Rep.
- O. Hernández-Lerma and J. B. Lasserre, "Policy iteration for average cost Markov control processes on Borel spaces," IPN, Departamento de Matematicas, Mexico, and LAAS-CNRS, France, Tech. Rep., 1995; Acta Applicandae Mathematicae, to be published.
- (1995) Policy Iteration for Average Cost Markov Control Processes on Borel Spaces
- Hernández-Lerma, O.¹ Lasserre, J.B.²

20
- 85075781529
- New York: Springer-Verlag
- _, Discrete Time Markov Control Processes I. New York: Springer-Verlag, 1996.
- (1996) Discrete Time Markov Control Processes I

21
- 0006238280
- Recurrence conditions for Markov decision processes with Borel state space: A survey
- O. Hernández-Lerma, R. Montes-de-Oca, and R. Cavazos-Cadena, "Recurrence conditions for Markov decision processes with Borel state space: A survey," Ann. Operations Res., vol. 28, pp. 29-46, 1991.
- (1991) Ann. Operations Res. , vol.28 , pp. 29-46
- Hernández-Lerma, O.¹ Montes-de-Oca, R.² Cavazos-Cadena, R.³

22
- 0004211484
- A. Hordijk, Dynamic Programming and Markov Potential Theory, 1977.
- (1977) Dynamic Programming and Markov Potential Theory
- Hordijk, A.¹

23
- 0023295768
- On the convergence of policy iteration
- A. Hordijk and M. L. Puterman, "On the convergence of policy iteration," Math. Ops. Res., vol. 12, pp. 163-176, 1987.
- (1987) Math. Ops. Res. , vol.12 , pp. 163-176
- Hordijk, A.¹ Puterman, M.L.²

24
- 0346169030
- Leiden Univ. and Colorado State Univ., Tech. Rep.
- A. Hordijk, F. M. Spieksma, and R. L. Tweedie, "Uniform stability conditions for general space Markov decision processes," Leiden Univ. and Colorado State Univ., Tech. Rep., 1995.
- (1995) Uniform stability conditions for general space Markov decision processes
- Hordijk, A.¹ Spieksma, F.M.² Tweedie, R.L.³

25
- 0003644124
- New York: Wiley
- R. A. Howard, Dynamic Programming and Markov Processes. New York: Wiley, 1960.
- (1960) Dynamic Programming and Markov Processes
- Howard, R.A.¹

26
- 33747144865
- Indian Inst. Sci., Bangalore, Tech. Rep.
- V. R. Konda and V. S. Borkar, "Learning algorithms for Markov decision processes," Indian Inst. Sci., Bangalore, Tech. Rep., 1996.
- (1996) Learning Algorithms for Markov Decision Processes
- Konda, V.R.¹ Borkar, V.S.²

27
- 0029732538
- Duality and linear programs for stability and performance analysis of queueing networks and scheduling policies
- Jan.
- P. R. Kumar and S. P. Meyn, "Duality and linear programs for stability and performance analysis of queueing networks and scheduling policies," IEEE Trans. Automat. Contr., vol. 41, pp. 4-17, Jan. 1996.
- (1996) IEEE Trans. Automat. Contr. , vol.41 , pp. 4-17
- Kumar, P.R.¹ Meyn, S.P.²

28
- 0028749105
- Fluctuation smoothing policies are stable for stochastic re-entrant lines
- Dec.
- S. Kumar and P. R. Kumar, "Fluctuation smoothing policies are stable for stochastic re-entrant lines," in Proc. 33rd IEEE Conf. Decision Contr., Dec. 1994.
- (1994) Proc. 33rd IEEE Conf. Decision Contr.
- Kumar, S.¹ Kumar, P.R.²

29
- 0003670043
- New York: Wiley-Intersci.
- H. Kwakernaak and R. Sivan, Linear Optimal Control Systems. New York: Wiley-Intersci., 1972.
- (1972) Linear Optimal Control Systems
- Kwakernaak, H.¹ Sivan, R.²

30
- 21344474295
- Transience of multiclass queueing networks via fluid limit models
- S. P. Meyn, "Transience of multiclass queueing networks via fluid limit models," Ann. Appl. Probab., vol. 5, pp. 946-957, 1995.
- (1995) Ann. Appl. Probab. , vol.5 , pp. 946-957
- Meyn, S.P.¹

31
- 0001695753
- Stability of generalized Jackson networks
- S. P. Meyn and D. Down, "Stability of generalized Jackson networks," Ann. Appl. Probab., vol. 4, pp. 124-148, 1994.
- (1994) Ann. Appl. Probab. , vol.4 , pp. 124-148
- Meyn, S.P.¹ Down, D.²

32
- 0009089457
- Generalized resolvents and Harris recurrence of Markov processes
- S. P. Meyn and R. L. Tweedie, "Generalized resolvents and Harris recurrence of Markov processes," Contemporary Math., vol. 149, pp. 227-250, 1993.
- (1993) Contemporary Math. , vol.149 , pp. 227-250
- Meyn, S.P.¹ Tweedie, R.L.²

33
- 0003637131
- London: Springer-Verlag
- _, Markov Chains and Stochastic Stability. London: Springer-Verlag, 1993.
- (1993) Markov Chains and Stochastic Stability

34
- 0001340188
- Stability of Markovian processes III: Foster-Lyapunov criteria for continuous time processes
- _, "Stability of Markovian processes III: Foster-Lyapunov criteria for continuous time processes," Adv. Appl. Probab., vol. 25, pp. 518-548, 1993.
- (1993) Adv. Appl. Probab. , vol.25 , pp. 518-548

35
- 0001633213
- Stability and optimization of multiclass queueing networks and their fluid models
- S. P. Meyn, "Stability and optimization of multiclass queueing networks and their fluid models," in Proc. Summer Seminar on "The Math. Stochastic Manufacturing Syst.," Amer. Math. Soc., 1997.
- (1997) Proc. Summer Seminar on "the Math. Stochastic Manufacturing Syst.," Amer. Math. Soc.
- Meyn, S.P.¹

36
- 0003499179
- Cambridge: Cambridge Univ. Press
- E. Nummelin, General Irreducible Markov Chains and Non-Negative Operators. Cambridge: Cambridge Univ. Press, 1984.
- (1984) General Irreducible Markov Chains and Non-Negative Operators
- Nummelin, E.¹

37
- 0039367797
- On the Poisson equation in the potential theory of a single kernel
- _, "On the Poisson equation in the potential theory of a single kernel," Math. Scand., vol. 68, pp. 59-82, 1991.
- (1991) Math. Scand. , vol.68 , pp. 59-82

38
- 0013467006
- Ph.D. thesis, Univ. Illinois, Urbana, IL, Sept. Tech. Rep. UILU-ENG-93-2237 (DC-155)
- J. Perkins, "Control of push and pull manufacturing systems," Ph.D. thesis, Univ. Illinois, Urbana, IL, Sept. 1993; Tech. Rep. UILU-ENG-93-2237 (DC-155).
- (1993) Control of Push and Pull Manufacturing Systems
- Perkins, J.¹

39
- 0003998452
- New York: Wiley
- M. L. Puterman, Markov Decision Processes. New York: Wiley, 1994.
- (1994) Markov Decision Processes
- Puterman, M.L.¹

40
- 12144262499
- Optimal stationary policies in general state space Markov decision chains with finite action set
- Nov.
- R. K. Ritt and L. I. Sennott, "Optimal stationary policies in general state space Markov decision chains with finite action set," Math. Ops. Res., vol. 17, no. 4, pp. 901-909, Nov. 1993.
- (1993) Math. Ops. Res. , vol.17 , Issue.4 , pp. 901-909
- Ritt, R.K.¹ Sennott, L.I.²

41
- 33747116089
- Applied probability models with optimization applications
- republication of the work first published by Holden-Day, 1970
- S. M. Ross, "Applied probability models with optimization applications," in Dover Books on Advanced Mathematics, 1992; republication of the work first published by Holden-Day, 1970.
- (1992) Dover Books on Advanced Mathematics
- Ross, S.M.¹

42
- 0022737669
- A new condition for the existence of optimal stationary policies in average cost Markov decision processes
- L. I. Sennott, "A new condition for the existence of optimal stationary policies in average cost Markov decision processes," Ops. Res. Lett., vol. 5, pp. 17-23, 1986.
- (1986) Ops. Res. Lett. , vol.5 , pp. 17-23
- Sennott, L.I.¹

43
- 0024702152
- Average cost optimal stationary policies in infinite state Markov decision processes with unbounded cost
- _, "Average cost optimal stationary policies in infinite state Markov decision processes with unbounded cost," Ops. Res., vol. 37, pp. 626-633, 1989.
- (1989) Ops. Res. , vol.37 , pp. 626-633

44
- 0009245727
- The convergence of value iteration in average cost Markov decision chains
- _, "The convergence of value iteration in average cost Markov decision chains," Ops. Res. Lett., vol. 19, pp. 11-16, 1996.
- (1996) Ops. Res. Lett. , vol.19 , pp. 11-16

45
- 0008813539
- Massachusetts Inst. Technol., Cambridge, MA, Tech. Rep. LIDS-P-2322, Mar. also IEEE Trans. Automat. Contr.
- J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," Massachusetts Inst. Technol., Cambridge, MA, Tech. Rep. LIDS-P-2322, Mar. 1996; also IEEE Trans. Automat. Contr.
- (1996) An Analysis of Temporal-difference Learning with Function Approximation
- Tsitsiklis, J.N.¹ Van Roy, B.²

46
- 0001153782
- Subgeometric rates of convergence of f-ergodic Markov chains
- P. Tuominen and R. L. Tweedie, "Subgeometric rates of convergence of f-ergodic Markov chains," Adv. Appl. Probab., vol. 26, pp. 775-798, 1994.
- (1994) Adv. Appl. Probab. , vol.26 , pp. 775-798
- Tuominen, P.¹ Tweedie, R.L.²

47
- 0000048705
- Optimal control of service rates in networks of queues
- R. Weber and S. Stidham, "Optimal control of service rates in networks of queues," Adv. Appl. Probab., vol. 19, pp. 202-218, 1987.
- (1987) Adv. Appl. Probab. , vol.19 , pp. 202-218
- Weber, R.¹ Stidham, S.²

48
- 33747089935
- Georgia Inst. Technol. Technion, Tech. Rep.
- G. Weiss, "On the optimal draining of re-entrant fluid lines," Georgia Inst. Technol. Technion, Tech. Rep., 1994.
- (1994) On the Optimal Draining of Re-entrant Fluid Lines
- Weiss, G.¹

49
- 0003502787
- Chichester, NY: Wiley
- P. Whittle, Risk-Sensitive Optimal Control. Chichester, NY: Wiley, 1990.
- (1990) Risk-Sensitive Optimal Control
- Whittle, P.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.