Volume 31, Issue 3, 2006, Pages 597-620

A cost-shaping linear program for average-cost approximate dynamic programming with performance guarantees

Author keywords

Approximate dynamic programming; Average cost; Linear programming

Indexed keywords

APPROXIMATE DYNAMIC PROGRAMMING; AVERAGE COSTS; DIFFERENTIAL COSTS; MARKOV DECISION PROCESSES (MDP);

EID: 33748414214     PISSN: 0364765X     EISSN: 15265471     Source Type: Journal    
DOI: 10.1287/moor.1060.0208     Document Type: Article
Times cited: 39

References (36)
  • 1
    • 4544373774
    • A price-directed approach to stochastic inventory/routing
    • Adelman, D. 2004. A price-directed approach to stochastic inventory/routing. Oper. Res. 52(4) 499-514.
    • (2004) Oper. Res. , vol.52 , Issue.4 , pp. 499-514
    • Adelman, D.1
  • 4
    • 13244262450
    • Convex analytic methods in Markov decision processes
    • E. Feinberg, A. Shwartz, eds. Kluwer, Boston, MA
    • Borkar, V. 2001. Convex analytic methods in Markov decision processes. E. Feinberg, A. Shwartz, eds. Handbook of Markov Decision Processes: Methods and Applications. Kluwer, Boston, MA.
    • (2001) Handbook of Markov Decision Processes: Methods and Applications
    • Borkar, V.1
  • 5
    • 0001133021
    • Generalization in reinforcement learning: Safely approximating the value function
    • MIT Press, Cambridge, MA
    • Boyan, J. A., A. W. Moore. 1995. Generalization in reinforcement learning: Safely approximating the value function. Advances in Neural Information Processing Systems, Vol. 7. MIT Press, Cambridge, MA.
    • (1995) Advances in Neural Information Processing Systems , vol.7
    • Boyan, J.A.1    Moore, A.W.2
  • 6
    • 0033245832
    • Value iteration and optimization of multiclass queueing networks
    • Chen, R.-R., S. Meyn. 1999. Value iteration and optimization of multiclass queueing networks. Queueing Systems 32 65-97.
    • (1999) Queueing Systems , vol.32 , pp. 65-97
    • Chen, R.-R.1    Meyn, S.2
  • 7
    • 5544258192
    • On constraint sampling in the linear programming approach to approximate dynamic programming
    • de Farias, D. P., B. Van Roy. 2004. On constraint sampling in the linear programming approach to approximate dynamic programming. Math. Oper. Res. 29(3) 462-478.
    • (2004) Math. Oper. Res. , vol.29 , Issue.3 , pp. 462-478
    • De Farias, D.P.1    Van Roy, B.2
  • 8
    • 84898987009
    • Approximate linear programming for average-cost dynamic programming
    • MIT Press, Cambridge, MA
    • de Farias, D. P., B. Van Roy. 2003. Approximate linear programming for average-cost dynamic programming. Advances in Neural Information Processing Systems, Vol. 15. MIT Press, Cambridge, MA.
    • (2003) Advances in Neural Information Processing Systems , vol.15
    • De Farias, D.P.1    Van Roy, B.2
  • 9
    • 0348090400
    • The linear programming approach to approximate dynamic programming
    • de Farias, D. P., B. Van Roy. 2003. The linear programming approach to approximate dynamic programming. Oper. Res. 51(6) 850-865.
    • (2003) Oper. Res. , vol.51 , Issue.6 , pp. 850-865
    • De Farias, D.P.1    Van Roy, B.2
  • 10
    • 0006464452
    • A probabilistic production and inventory problem
    • D'Epenoux, P. 1963. A probabilistic production and inventory problem. Management Sci. 10(1) 98-108.
    • (1963) Management Sci. , vol.10 , Issue.1 , pp. 98-108
    • D'Epenoux, P.1
  • 12
    • 0038595393
    • Stable function approximation in dynamic programming
    • Carnegie Mellon University, Pittsburgh, PA
    • Gordon, G. J. 1995. Stable function approximation in dynamic programming. Technical Report CMU-CS-95-103, Carnegie Mellon University, Pittsburgh, PA.
    • (1995) Technical Report , vol.CMU-CS-95-103
    • Gordon, G.J.1
  • 19
    • 14844352327
    • Linear program approximations to factored continuous-state Markov decision processes
    • MIT Press, Cambridge, MA
    • Hauskrecht, M., B. Kveton. 2003. Linear program approximations to factored continuous-state Markov decision processes. Advances in Neural Information Processing Systems, Vol. 17. MIT Press, Cambridge, MA.
    • (2003) Advances in Neural Information Processing Systems , vol.17
    • Hauskrecht, M.1    Kveton, B.2
  • 20
    • 0037289503
    • Performance evaluation and policy selection in multiclass networks
    • (Special issue on learning, optimization and decision making (invited))
    • Henderson, S. G., S. P. Meyn, V. B. Tadić. 2003. Performance evaluation and policy selection in multiclass networks. Discrete Event Dynam. Systems: Theory Appl. 13(Special issue on learning, optimization and decision making (invited)) 149-189.
    • (2003) Discrete Event Dynam. Systems: Theory Appl. , vol.13 , pp. 149-189
    • Henderson, S.G.1    Meyn, S.P.2    Tadić, V.B.3
  • 22
    • 0038876647
    • Criteria for uniform ergodicity and strong stability of Markov chains with a common phase space
    • Kartashov, N. V. 1985. Criteria for uniform ergodicity and strong stability of Markov chains with a common phase space. Theory Probab. Appl. 30 71-89.
    • (1985) Theory Probab. Appl. , vol.30 , pp. 71-89
    • Kartashov, N.V.1
  • 23
    • 0040061152
    • Inequalities in theorems of ergodicity and stability for Markov chains with a common phase space
    • Kartashov, N. V. 1985. Inequalities in theorems of ergodicity and stability for Markov chains with a common phase space. Theory Probab. Appl. 30 247-259.
    • (1985) Theory Probab. Appl. , vol.30 , pp. 247-259
    • Kartashov, N.V.1
  • 24
    • 0036832954
    • Near-optimal reinforcement learning in polynomial time
    • Kearns, M., S. Singh. 2002. Near-optimal reinforcement learning in polynomial time. Machine Learning 49(2) 209-232.
    • (2002) Machine Learning , vol.49 , Issue.2 , pp. 209-232
    • Kearns, M.1    Singh, S.2
  • 25
    • 0001257766
    • Linear programming and sequential decisions
    • Manne, A. S. Linear programming and sequential decisions. Management Sci. 6(3) 259-267.
    • Management Sci. , vol.6 , Issue.3 , pp. 259-267
    • Manne, A.S.1
  • 26
    • 0031344030
    • The policy iteration algorithm for average reward Markov decision processes with general state space
    • Meyn, S. P. 1997. The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Trans. Automatic Control 42(12) 1663-1680.
    • (1997) IEEE Trans. Automatic Control , vol.42 , Issue.12 , pp. 1663-1680
    • Meyn, S.P.1
  • 27
    • 23944498849
    • Workload models for stochastic networks: Value functions and performance evaluation
    • Meyn, S. P. 2005. Workload models for stochastic networks: Value functions and performance evaluation. IEEE Trans. Automatic Control 50(8) 1106-1122.
    • (2005) IEEE Trans. Automatic Control , vol.50 , Issue.8 , pp. 1106-1122
    • Meyn, S.P.1
  • 29
    • 0033247532
    • New linear program performance bounds for queueing networks
    • Morrison, J. R., P. R. Kumar. 1999. New linear program performance bounds for queueing networks. J. Optim. Theory Appl. 100(3) 575-597.
    • (1999) J. Optim. Theory Appl. , vol.100 , Issue.3 , pp. 575-597
    • Morrison, J.R.1    Kumar, P.R.2
  • 33
    • 0000273218
    • Generalized polynomial approximation in Markov decision processes
    • Schweitzer, P. J., A. Seidman. 1985. Generalized polynomial approximation in Markov decision processes. J. Math. Anal. Appl. 110 568-582.
    • (1985) J. Math. Anal. Appl. , vol.110 , pp. 568-582
    • Schweitzer, P.J.1    Seidman, A.2
  • 34
    • 0031350985
    • Spline approximations to value functions: A linear programming approach
    • Trick, M., S. Zin. 1997. Spline approximations to value functions: A linear programming approach. Macroeconomic Dynam. 1(1) 255-277.
    • (1997) Macroeconomic Dynam. , vol.1 , Issue.1 , pp. 255-277
    • Trick, M.1    Zin, S.2
  • 35
    • 0029752470
    • Feature-based methods for large-scale dynamic programming
    • Tsitsiklis, J. N., B. Van Roy. 1996. Feature-based methods for large-scale dynamic programming. Machine Learning 22 59-94.
    • (1996) Machine Learning , vol.22 , pp. 59-94
    • Tsitsiklis, J.N.1    Van Roy, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.