



Volume 27, Issue 4, 2006, Pages 433-452

Approximate policy optimization and adaptive control in regression models

Author keywords

Dynamic programming; Learning by doing; Monte Carlo; Policy iteration; Rollout

Indexed keywords


EID: 33745450286     PISSN: 09277099     EISSN: 15729974     Source Type: Journal    
DOI: 10.1007/s10614-005-9007-1     Document Type: Article
Times cited: 6

References (27)
  • 2
    • 0006231196
    • Some experimental results on the statistical properties of least squares estimates in control problems
    • Anderson, T.W. and Taylor, J. (1976). Some experimental results on the statistical properties of least squares estimates in control problems. Econometrica 44, 1289-1302.
    • (1976) Econometrica , vol.44 , pp. 1289-1302
    • Anderson, T.W.1    Taylor, J.2
  • 3
    • 0020815940
    • Theory and applications of adaptive control - A survey
    • Åström, K.J. (1983). Theory and applications of adaptive control - a survey. Automatica 19, 471-486.
    • (1983) Automatica , vol.19 , pp. 471-486
    • Åström, K.J.1
  • 4
    • 85012688561
    • Princeton, NJ: Princeton University Press
    • Bellman, R. (1957). Dynamic Programming. Princeton, NJ: Princeton University Press.
    • (1957) Dynamic Programming
    • Bellman, R.1
  • 8
    • 0002930340
    • Rational expectations equilibrium: An alternative approach
    • Blume, L. and Easley, D. (1984). Rational expectations equilibrium: An alternative approach. Journal of Economic Theory 34, 116-129.
    • (1984) Journal of Economic Theory , vol.34 , pp. 116-129
    • Blume, L.1    Easley, D.2
  • 12
    • 37849017798
    • A value function arising in the economics of information
    • Kiefer, N.M. (1989). A value function arising in the economics of information. Journal of Economic Dynamics and Control 13, 201-223.
    • (1989) Journal of Economic Dynamics and Control , vol.13 , pp. 201-223
    • Kiefer, N.M.1
  • 13
    • 0000878355
    • Adaptive design and stochastic approximation
    • Lai, T.L. and Robbins, H. (1979). Adaptive design and stochastic approximation. Annals of Statistics 7, 1196-1221.
    • (1979) Annals of Statistics , vol.7 , pp. 1196-1221
    • Lai, T.L.1    Robbins, H.2
  • 14
  • 15
  • 16
    • 0035578679
    • Valuing American options by simulation: A simple least-squares approach
    • Longstaff, F.A. and Schwartz, E.S. (2001). Valuing American options by simulation: A simple least-squares approach. Review of Financial Studies 14, 113-147.
    • (2001) Review of Financial Studies , vol.14 , pp. 113-147
    • Longstaff, F.A.1    Schwartz, E.S.2
  • 17
    • 0036832961
    • Building a basic block instruction scheduler using reinforcement learning and rollouts
    • McGovern, A., Moss, E. and Barto, A. (2002). Building a basic block instruction scheduler using reinforcement learning and rollouts. Machine Learning 49, 141-160.
    • (2002) Machine Learning , vol.49 , pp. 141-160
    • McGovern, A.1    Moss, E.2    Barto, A.3
  • 18
    • 0000464775
    • The multi-period control problem under uncertainty
    • Prescott, E. (1972). The multi-period control problem under uncertainty. Econometrica 40, 1043-1058.
    • (1972) Econometrica , vol.40 , pp. 1043-1058
    • Prescott, E.1
  • 19
    • 0001509947
    • Using randomization to break the curse of dimensionality
    • Rust, J. (1997). Using randomization to break the curse of dimensionality. Econometrica 65, 487-516.
    • (1997) Econometrica , vol.65 , pp. 487-516
    • Rust, J.1
  • 20
    • 0035463693
    • A rollout policy for the vehicle routing problem with stochastic demands
    • Secomandi, N. (2001). A rollout policy for the vehicle routing problem with stochastic demands. Operations Research 49, 796-802.
    • (2001) Operations Research , vol.49 , pp. 796-802
    • Secomandi, N.1
  • 21
    • 0142157191
    • Analysis of a rollout approach to sequencing problems with stochastic routing applications
    • Secomandi, N. (2003). Analysis of a rollout approach to sequencing problems with stochastic routing applications. Journal of Heuristics 9, 321-352.
    • (2003) Journal of Heuristics , vol.9 , pp. 321-352
    • Secomandi, N.1
  • 23
    • 84898992015
    • On-line policy improvement using Monte-Carlo search
    • Cambridge, MA: MIT Press
    • Tesauro, G. and Galperin, G. (1996). On-line policy improvement using Monte-Carlo search. In Advances in Neural Information Processing Systems 9, 1068-1074. Cambridge, MA: MIT Press.
    • (1996) Advances in Neural Information Processing Systems , vol.9 , pp. 1068-1074
    • Tesauro, G.1    Galperin, G.2
  • 24
    • 0035391083
    • Regression methods for pricing complex American-style options
    • Tsitsiklis, J.N. and Van Roy, B. (2001). Regression methods for pricing complex American-style options. IEEE Transactions on Neural Networks 12, 694-703.
    • (2001) IEEE Transactions on Neural Networks , vol.12 , pp. 694-703
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 25
    • 0042933390
    • Optimal control with unknown parameters - A study of optimal learning strategies with an application to monetary policy
    • Ph.D. Thesis, Stanford University
    • Wieland, V. (1995). Optimal control with unknown parameters - a study of optimal learning strategies with an application to monetary policy. Ph.D. Thesis, Stanford University.
    • (1995)
    • Wieland, V.1
  • 26
    • 0006250951
    • Learning by doing and the value of optimal experimentation
    • Wieland, V. (2000). Learning by doing and the value of optimal experimentation. Journal of Economic Dynamics and Control 24, 501-534.
    • (2000) Journal of Economic Dynamics and Control , vol.24 , pp. 501-534
    • Wieland, V.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.