SCOPUS Information Search Platform

Mathematics of Operations Research

Volume 31, Issue 3, 2006, Pages 597-620

A cost-shaping linear program for average-cost approximate dynamic programming with performance guarantees

(2) De Farias, Daniela Pucci a; Van Roy, Benjamin b

a Massachusetts Institute of Technology Cambridge (United States)

b Stanford University (United States)

Author keywords

Approximate dynamic programming; Average cost; Linear programming

Indexed keywords

APPROXIMATE DYNAMIC PROGRAMMING; AVERAGE COSTS; DIFFERENTIAL COSTS; MARKOV DECISION PROCESSES (MDP);

ALGORITHMS; DYNAMIC PROGRAMMING; FUNCTIONS; LYAPUNOV METHODS; MARKOV PROCESSES; OPTIMIZATION; PROBLEM SOLVING; QUEUEING THEORY;

LINEAR PROGRAMMING;

EID: 33748414214 PISSN: 0364765X EISSN: 15265471 Source Type: Journal
DOI: 10.1287/moor.1060.0208 Document Type: Article

Times cited : (39)

References (36)

1
- 4544373774
- A price-directed approach to stochastic inventory/routing
- Adelman, D. 2004. A price-directed approach to stochastic inventory/routing. Oper. Res. 52(4) 499-514.
- (2004) Oper. Res. , vol.52 , Issue.4 , pp. 499-514
- Adelman, D.¹

2
- 0003565783
- Athena Scientific, Belmont, MA
- Bertsekas, D. P. 2001. Dynamic Programming and Optimal Control, 2nd ed. Athena Scientific, Belmont, MA.
- (2001) Dynamic Programming and Optimal Control, 2nd Ed.
- Bertsekas, D.P.¹

3
- 0003487482
- Athena Scientific, Belmont, MA
- Bertsekas, D. P., J. N. Tsitsiklis. 1996. Neuro-Dynamic Programming. Athena Scientific, Belmont, MA.
- (1996) Neuro-dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

4
- 13244262450
- Convex analytic methods in Markov decision processes
- E. Feinberg, A. Shwartz, eds. Kluwer, Boston, MA
- Borkar, V. 2001. Convex analytic methods in Markov decision processes. E. Feinberg, A. Shwartz, eds. Handbook of Markov Decision Processes: Methods and Applications. Kluwer, Boston, MA.
- (2001) Handbook of Markov Decision Processes: Methods and Applications
- Borkar, V.¹

5
- 0001133021
- Generalization in reinforcement learning: Safely approximating the value function
- MIT Press, Cambridge, MA
- Boyan, J. A., A. W. Moore. 1995. Generalization in reinforcement learning: Safely approximating the value function. Advances in Neural Information Processing Systems, Vol. 7. MIT Press, Cambridge, MA.
- (1995) Advances in Neural Information Processing Systems , vol.7
- Boyan, J.A.¹ Moore, A.W.²

6
- 0033245832
- Value iteration and optimization of multiclass queueing networks
- Chen, R.-R., S. Meyn. 1999. Value iteration and optimization of multiclass queueing networks. Queueing Systems 32 65-97.
- (1999) Queueing Systems , vol.32 , pp. 65-97
- Chen, R.-R.¹ Meyn, S.²

7
- 5544258192
- On constraint sampling in the linear programming approach to approximate dynamic programming
- de Farias, D. P., B. Van Roy. 2004. On constraint sampling in the linear programming approach to approximate dynamic programming. Math. Oper. Res. 29(3) 462-478.
- (2004) Math. Oper. Res. , vol.29 , Issue.3 , pp. 462-478
- De Farias, D.P.¹ Van Roy, B.²

8
- 84898987009
- Approximate linear programming for average-cost dynamic programming
- MIT Press, Cambridge, MA
- de Farias, D. P., B. Van Roy. 2003. Approximate linear programming for average-cost dynamic programming. Advances in Neural Information Processing Systems, Vol. 15. MIT Press, Cambridge, MA.
- (2003) Advances in Neural Information Processing Systems , vol.15
- De Farias, D.P.¹ Van Roy, B.²

9
- 0348090400
- The linear programming approach to approximate dynamic programming
- de Farias, D. P., B. Van Roy. 2003. The linear programming approach to approximate dynamic programming. Oper. Res. 51(6) 850-865.
- (2003) Oper. Res. , vol.51 , Issue.6 , pp. 850-865
- De Farias, D.P.¹ Van Roy, B.²

10
- 0006464452
- A probabilistic production and inventory problem
- D'Epenoux, P. 1963. A probabilistic production and inventory problem. Management Sci. 10(1) 98-108.
- (1963) Management Sci. , vol.10 , Issue.1 , pp. 98-108
- D'Epenoux, P.¹

11
- 33748427607
- Tetris: A study of randomized constraint sampling
- G. Calafiore, F. Dabbene, eds. Springer-Verlag, London, UK
- Farias, V. F., B. Van Roy. 2006. Tetris: A study of randomized constraint sampling. G. Calafiore, F. Dabbene, eds. Probabilistic and Randomized Methods for Design Under Uncertainty. Springer-Verlag, London, UK.
- (2006) Probabilistic and Randomized Methods for Design under Uncertainty
- Farias, V.F.¹ Van Roy, B.²

12
- 0038595393
- Stable function approximation in dynamic programming
- Carnegie Mellon University, Pittsburgh, PA
- Gordon, G. J. 1995. Stable function approximation in dynamic programming. Technical Report CMU-CS-95-103, Carnegie Mellon University, Pittsburgh, PA.
- (1995) Technical Report , vol.CMU-CS-95-103
- Gordon, G.J.¹

13
- 84880694195
- Stable function approximation in dynamic programming
- San Francisco, CA
- Gordon, G. J. 1995. Stable function approximation in dynamic programming. Machine Learning: Proc. Twelfth Internat. Conf. (ICML), San Francisco, CA.
- (1995) Machine Learning: Proc. Twelfth Internat. Conf. (ICML)
- Gordon, G.J.¹

14
- 0003989207
- Ph.D. thesis, Carnegie Mellon University, Pittsburgh, PA
- Gordon, G. J. 1999. Approximate solutions to Markov decision processes. Ph.D. thesis, Carnegie Mellon University, Pittsburgh, PA.
- (1999) Approximate Solutions to Markov Decision Processes
- Gordon, G.J.¹

15
- 14344256227
- Ph.D. thesis, Stanford University, Stanford, CA
- Guestrin, C. 2003. Planning under uncertainty in complex structured environments. Ph.D. thesis, Stanford University, Stanford, CA.
- (2003) Planning under Uncertainty in Complex Structured Environments
- Guestrin, C.¹

16
- 29344475738
- Solving factored MDPs with continuous and discrete variables
- Banff, Alberta, Canada
- Guestrin, C., M. Hauskrecht, B. Kveton. 2004. Solving factored MDPs with continuous and discrete variables. Twentieth Conf. Uncertainty in Artificial Intelligence, Banff, Alberta, Canada.
- (2004) Twentieth Conf. Uncertainty in Artificial Intelligence
- Guestrin, C.¹ Hauskrecht, M.² Kveton, B.³

17
- 4544318426
- Efficient solution algorithms for factored MDPs
- Guestrin, C., D. Koller, R. Parr. 2003. Efficient solution algorithms for factored MDPs. J. Artificial Intelligence Res. 19 399-468.
- (2003) J. Artificial Intelligence Res. , vol.19 , pp. 399-468
- Guestrin, C.¹ Koller, D.² Parr, R.³

18
- 0008815437
- Advantage updating applied to a differential game
- MIT Press, Cambridge, MA
- Harmon, M. E., L. C. Baird, A. H. Klopf. 1995. Advantage updating applied to a differential game. Advances in Neural Information Processing Systems, Vol. 7. MIT Press, Cambridge, MA.
- (1995) Advances in Neural Information Processing Systems , vol.7
- Harmon, M.E.¹ Baird, L.C.² Klopf, A.H.³

19
- 14844352327
- Linear program approximations to factored continuous-state Markov decision processes
- MIT Press, Cambridge, MA
- Hauskrecht, M., B. Kveton. 2003. Linear program approximations to factored continuous-state Markov decision processes. Advances in Neural Information Processing Systems, Vol. 17. MIT Press, Cambridge, MA.
- (2003) Advances in Neural Information Processing Systems , vol.17
- Hauskrecht, M.¹ Kveton, B.²

20
- 0037289503
- Performance evaluation and policy selection in multiclass networks
- (Special issue on learning, optimization and decision making (invited))
- Henderson, S. G., S. P. Meyn, V. B. Tadić. 2003. Performance evaluation and policy selection in multiclass networks. Discrete Event Dynam. Systems: Theory Appl. 13(Special issue on learning, optimization and decision making (invited)) 149-189.
- (2003) Discrete Event Dynam. Systems: Theory Appl. , vol.13 , pp. 149-189
- Henderson, S.G.¹ Meyn, S.P.² Tadić, V.B.³

21
- 55549087665
- The linear programming approach
- E. Feinberg, A. Shwartz, eds. Kluwer, Boston, MA
- Hernández-Lerma, O., J. B. Lasserre. 2001. The linear programming approach. E. Feinberg, A. Shwartz, eds. Handbook of Markov Decision Processes: Methods and Applications. Kluwer, Boston, MA.
- (2001) Handbook of Markov Decision Processes: Methods and Applications
- Hernández-Lerma, O.¹ Lasserre, J.B.²

22
- 0038876647
- Criteria for uniform ergodicity and strong stability of Markov chains with a common phase space
- Kartashov, N. V. 1985. Criteria for uniform ergodicity and strong stability of Markov chains with a common phase space. Theory Probab. Appl. 30 71-89.
- (1985) Theory Probab. Appl. , vol.30 , pp. 71-89
- Kartashov, N.V.¹

23
- 0040061152
- Inequalities in theorems of ergodicity and stability for Markov chains with a common phase space
- Kartashov, N. V. 1985. Inequalities in theorems of ergodicity and stability for Markov chains with a common phase space. Theory Probab. Appl. 30 247-259.
- (1985) Theory Probab. Appl. , vol.30 , pp. 247-259
- Kartashov, N.V.¹

24
- 0036832954
- Near-optimal reinforcement learning in polynomial time
- Kearns, M., S. Singh. 2002. Near-optimal reinforcement learning in polynomial time. Machine Learning 49(2) 209-232.
- (2002) Machine Learning , vol.49 , Issue.2 , pp. 209-232
- Kearns, M.¹ Singh, S.²

25
- 0001257766
- Linear programming and sequential decisions
- Manne, A. S. Linear programming and sequential decisions. Management Sci. 6(3) 259-267.
- Management Sci. , vol.6 , Issue.3 , pp. 259-267
- Manne, A.S.¹

26
- 0031344030
- The policy iteration algorithm for average reward Markov decision processes with general state space
- Meyn, S. P. 1997. The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Trans. Automatic Control 42(12) 1663-1680.
- (1997) IEEE Trans. Automatic Control , vol.42 , Issue.12 , pp. 1663-1680
- Meyn, S.P.¹

27
- 23944498849
- Workload models for stochastic networks: Value functions and performance evaluation
- Meyn, S. P. 2005. Workload models for stochastic networks: Value functions and performance evaluation. IEEE Trans. Automatic Control 50(8) 1106-1122.
- (2005) IEEE Trans. Automatic Control , vol.50 , Issue.8 , pp. 1106-1122
- Meyn, S.P.¹

28
- 0003637131
- Springer-Verlag
- Meyn, S. P., R. Tweedie. 1993. Markov Chains and Stochastic Stability. Springer-Verlag.
- (1993) Markov Chains and Stochastic Stability
- Meyn, S.P.¹ Tweedie, R.²

29
- 0033247532
- New linear program performance bounds for queueing networks
- Morrison, J. R., P. R. Kumar. 1999. New linear program performance bounds for queueing networks. J. Optim. Theory Appl. 100(3) 575-597.
- (1999) J. Optim. Theory Appl. , vol.100 , Issue.3 , pp. 575-597
- Morrison, J.R.¹ Kumar, P.R.²

30
- 1942516880
- Error bounds for approximate policy iteration
- AAAI Press, Menlo Park, CA
- Munos, R. 2003. Error bounds for approximate policy iteration. Machine Learning: Proc. Twentieth Internat. Conf. (ICML). AAAI Press, Menlo Park, CA.
- (2003) Machine Learning: Proc. Twentieth Internat. Conf. (ICML)
- Munos, R.¹

31
- 0003998452
- John Wiley & Sons, New York
- Puterman, M. L. 1994. Markov Decision Processes. John Wiley & Sons, New York.
- (1994) Markov Decision Processes
- Puterman, M.L.¹

32
- 1542342765
- Direct value-approximation for factored MDPs
- MIT Press, Cambridge, MA
- Schuurmans, D., R. Patrascu. 2001. Direct value-approximation for factored MDPs. Advances in Neural Information Processing Systems, Vol. 14. MIT Press, Cambridge, MA.
- (2001) Advances in Neural Information Processing Systems , vol.14
- Schuurmans, D.¹ Patrascu, R.²

33
- 0000273218
- Generalized polynomial approximation in Markov decision processes
- Schweitzer, P. J., A. Seidmann. 1985. Generalized polynomial approximation in Markov decision processes. J. Math. Anal. Appl. 110 568-582.
- (1985) J. Math. Anal. Appl. , vol.110 , pp. 568-582
- Schweitzer, P.J.¹ Seidmann, A.²

34
- 0031350985
- Spline approximations to value functions: A linear programming approach
- Trick, M., S. Zin. 1997. Spline approximations to value functions: A linear programming approach. Macroeconomic Dynam. 1(1) 255-277.
- (1997) Macroeconomic Dynam. , vol.1 , Issue.1 , pp. 255-277
- Trick, M.¹ Zin, S.²

35
- 0029752470
- Feature-based methods for large-scale dynamic programming
- Tsitsiklis, J. N., B. Van Roy. 1996. Feature-based methods for large-scale dynamic programming. Machine Learning 22 59-94.
- (1996) Machine Learning , vol.22 , pp. 59-94
- Tsitsiklis, J.N.¹ Van Roy, B.²

36
- 26244463334
- Submitted for publication
- Veatch, M. H. 2005. Approximate dynamic programming for networks: Fluid models and constraint reduction. Submitted for publication.
- (2005) Approximate Dynamic Programming for Networks: Fluid Models and Constraint Reduction
- Veatch, M.H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.