Volume 51, Issue 6, 2003, Pages 850-865

The linear programming approach to approximate dynamic programming

Author keywords

Dynamic programming/optimal control: approximations/large-scale problems; Queues, algorithms: control of queueing networks

Indexed keywords

LARGE-SCALE PROBLEMS; QUEUES; STOCHASTIC MODELS

EID: 0348090400     PISSN: 0030364X     EISSN: None     Source Type: Journal    
DOI: 10.1287/opre.51.6.850.24925     Document Type: Article
Times cited: 602

References (37)
  • 3
    • 0035497817
    • Performance of multiclass Markovian queueing networks via piecewise linear Lyapunov functions
    • _, D. Gamarnik, J. Tsitsiklis. 2001. Performance of multiclass Markovian queueing networks via piecewise linear Lyapunov functions. Ann. Appl. Probab. 11(4) 1384-1428.
    • (2001) Ann. Appl. Probab. , vol.11 , Issue.4 , pp. 1384-1428
    • Gamarnik, D.1    Tsitsiklis, J.2
  • 5
    • 0343709784
    • A convex analytic approach to Markov decision processes
    • Borkar, V. 1988. A convex analytic approach to Markov decision processes. Probab. Theory Related Fields 78 583-602.
    • (1988) Probab. Theory Related Fields , vol.78 , pp. 583-602
    • Borkar, V.1
  • 6
    • 0033245832
    • Value iteration and optimization of multiclass queueing networks
    • Chen, R.-R., S. Meyn. 1999. Value iteration and optimization of multiclass queueing networks. Queueing Systems 32 65-97.
    • (1999) Queueing Systems , vol.32 , pp. 65-97
    • Chen, R.-R.1    Meyn, S.2
  • 7
    • 0001820934
    • Applying experimental design and regression splines to high-dimensional continuous-state stochastic dynamic programming
    • Chen, V. C. P., D. Ruppert, C. A. Shoemaker. 1999. Applying experimental design and regression splines to high-dimensional continuous-state stochastic dynamic programming. Oper. Res. 47(1) 38-53.
    • (1999) Oper. Res. , vol.47 , Issue.1 , pp. 38-53
    • Chen, V.C.P.1    Ruppert, D.2    Shoemaker, C.A.3
  • 8
    • 85156187730
    • Improving elevator performance using reinforcement learning
    • MIT Press, Cambridge, MA
    • Crites, R. H., A. G. Barto. 1996. Improving elevator performance using reinforcement learning. Advances in Neural Information Processing Systems, Vol. 8. MIT Press, Cambridge, MA, 1017-1023.
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1017-1023
    • Crites, R.H.1    Barto, A.G.2
  • 9
    • 0000430514
    • The convergence of TD(λ) for general λ
    • Dayan, P. 1992. The convergence of TD(λ) for general λ. Machine Learning 8 341-362.
    • (1992) Machine Learning , vol.8 , pp. 341-362
    • Dayan, P.1
  • 10
    • 0034342516
    • On the existence of fixed points for approximate value iteration and temporal-difference learning
    • de Farias, D. P., B. Van Roy. 2000. On the existence of fixed points for approximate value iteration and temporal-difference learning. J. Optim. Theory Appl. 105(3) 589-608.
    • (2000) J. Optim. Theory Appl. , vol.105 , Issue.3 , pp. 589-608
    • De Farias, D.P.1    Van Roy, B.2
  • 11
    • 4344702636
    • On constraint sampling in the linear programming approach to approximate dynamic programming
    • Conditionally accepted to
    • _, _. 2001. On constraint sampling in the linear programming approach to approximate dynamic programming. Conditionally accepted to Math. Oper. Res.
    • (2001) Math. Oper. Res.
  • 13
    • 0001554538
    • On linear programming in a Markov decision problem
    • Denardo, E. V. 1970. On linear programming in a Markov decision problem. Management Sci. 16(5) 282-288.
    • (1970) Management Sci. , vol.16 , Issue.5 , pp. 282-288
    • Denardo, E.V.1
  • 14
    • 0006464452
    • A probabilistic production and inventory problem
    • D'Epenoux, F. 1963. A probabilistic production and inventory problem. Management Sci. 10(1) 98-108.
    • (1963) Management Sci. , vol.10 , Issue.1 , pp. 98-108
    • D'Epenoux, F.1
  • 16
    • 33947289916
    • Solution of large-scale symmetric travelling salesman problems
    • Grötschel, M., O. Holland. 1991. Solution of large-scale symmetric travelling salesman problems. Math. Programming 51 141-202.
    • (1991) Math. Programming , vol.51 , pp. 141-202
    • Grötschel, M.1    Holland, O.2
  • 19
    • 0018455841
    • Linear programming and Markov decision chains
    • Hordijk, A., L. C. M. Kallenberg. 1979. Linear programming and Markov decision chains. Management Sci. 25 352-362.
    • (1979) Management Sci. , vol.25 , pp. 352-362
    • Hordijk, A.1    Kallenberg, L.C.M.2
  • 20
    • 0025401166
    • Dynamic instabilities and stabilization methods in distributed real-time scheduling of manufacturing systems
    • Kumar, P. R., T. I. Seidman. 1990. Dynamic instabilities and stabilization methods in distributed real-time scheduling of manufacturing systems. IEEE Trans. Automatic Control 35(3) 289-298.
    • (1990) IEEE Trans. Automatic Control , vol.35 , Issue.3 , pp. 289-298
    • Kumar, P.R.1    Seidman, T.I.2
  • 21
    • 0035578679
    • Valuing American options by simulation: A simple least squares approach
    • Longstaff, F., E. S. Schwartz. 2001. Valuing American options by simulation: A simple least squares approach. Rev. Financial Stud. 14 113-147.
    • (2001) Rev. Financial Stud. , vol.14 , pp. 113-147
    • Longstaff, F.1    Schwartz, E.S.2
  • 22
    • 0001257766
    • Linear programming and sequential decisions
    • Manne, A. S. 1960. Linear programming and sequential decisions. Management Sci. 6(3) 259-267.
    • (1960) Management Sci. , vol.6 , Issue.3 , pp. 259-267
    • Manne, A.S.1
  • 23
    • 0033247532
    • New linear program performance bounds for queueing networks
    • Morrison, J. R., P. R. Kumar. 1999. New linear program performance bounds for queueing networks. J. Optim. Theory Appl. 100(3) 575-597.
    • (1999) J. Optim. Theory Appl. , vol.100 , Issue.3 , pp. 575-597
    • Morrison, J.R.1    Kumar, P.R.2
  • 25
    • 0026889533
    • On the ergodicity of stochastic processes describing the operation of open queueing networks
    • Rybko, A. N., A. L. Stolyar. 1992. On the ergodicity of stochastic processes describing the operation of open queueing networks. Problemy Peredachi Informatsii 28 3-26.
    • (1992) Problemy Peredachi Informatsii , vol.28 , pp. 3-26
    • Rybko, A.N.1    Stolyar, A.L.2
  • 27
    • 0000273218
    • Generalized polynomial approximations in Markovian decision processes
    • Schweitzer, P., A. Seidmann. 1985. Generalized polynomial approximations in Markovian decision processes. J. Math. Anal. Appl. 110 568-582.
    • (1985) J. Math. Anal. Appl. , vol.110 , pp. 568-582
    • Schweitzer, P.1    Seidmann, A.2
  • 28
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R. S. 1988. Learning to predict by the methods of temporal differences. Machine Learning 3 9-44.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 30
    • 0029276036
    • Temporal difference learning and TD-gammon
    • Tesauro, G. J. 1995. Temporal difference learning and TD-gammon. Comm. ACM 38 58-68.
    • (1995) Comm. ACM , vol.38 , pp. 58-68
    • Tesauro, G.J.1
  • 32
    • 0031350985
    • Spline approximations to value functions: A linear programming approach
    • _, _. 1997. Spline approximations to value functions: A linear programming approach. Macroeconomic Dynamics 1 255-277.
    • (1997) Macroeconomic Dynamics , vol.1 , pp. 255-277
  • 33
    • 0031143730
    • An analysis of temporal-difference learning with function approximation
    • Tsitsiklis, J. N., B. Van Roy. 1997. An analysis of temporal-difference learning with function approximation. IEEE Trans. Auto. Control 42(5) 674-690.
    • (1997) IEEE Trans. Auto. Control , vol.42 , Issue.5 , pp. 674-690
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 34
    • 0035391083
    • Regression methods for pricing complex American-style options
    • _, _. 2001. Regression methods for pricing complex American-style options. IEEE Trans. Neural Networks 12(4) 694-703.
    • (2001) IEEE Trans. Neural Networks , vol.12 , Issue.4 , pp. 694-703
  • 36
    • 4344559261
    • Neuro-dynamic programming: Overview and recent trends
    • E. Feinberg, A. Schwartz, eds. Kluwer, Norwell, MA
    • _. 2000. Neuro-dynamic programming: Overview and recent trends. E. Feinberg, A. Schwartz, eds. Markov Decision Processes: Models, Methods, Directions, and Open Problems. Kluwer, Norwell, MA.
    • (2000) Markov Decision Processes: Models, Methods, Directions, and Open Problems
  • 37
    • 85156225449
    • High-performance job-shop scheduling with a time-delay TD(λ) network
    • MIT Press, Cambridge, MA
    • Zhang, W., T. G. Dietterich. 1996. High-performance job-shop scheduling with a time-delay TD(λ) network. Advances in Neural Information Processing Systems, Vol. 8. MIT Press, Cambridge, MA, 1024-1030.
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1024-1030
    • Zhang, W.1    Dietterich, T.G.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.