Volume , Issue , 2002, Pages 285-291

Greedy linear value-approximation for factored Markov decision processes

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION THEORY; AUTOMATION; DECISION THEORY; ITERATIVE METHODS; LINEAR PROGRAMMING;

EID: 0036927202     PISSN: None     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 23

References (19)
  • 4
    • Boutilier, C.; Dean, T.; and Hanks, S. 1999. Decision-theoretic planning: Structural assumptions and computational leverage. JAIR 11:1-94.
  • 7
    • Dietterich, T. 2000. Hierarchical reinforcement learning with the MAXQ value function decomposition. JAIR 13:227-303.
  • 10
    • Koller, D., and Parr, R. 1999. Computing factored value functions for policies in structured MDPs. In Proceedings IJCAI.
  • 11
    • Koller, D., and Parr, R. 2000. Policy iteration for factored MDPs. In Proceedings UAI.
  • 12
    • Lusena, C.; Goldsmith, J.; and Mundhenk, M. 2001. Non-approximability results for partially observable Markov decision processes. JAIR 14:83-103.
  • 13
    • Mundhenk, M.; Goldsmith, J.; Lusena, C.; and Allender, E. 2000. Complexity of finite-horizon Markov decision processes. JACM 47(4):681-720.
  • 15
    • Sallans, B., and Hinton, G. 2000. Using free energies to represent Q-values in a multiagent reinforcement learning task. In Proceedings NIPS.
  • 16
  • 17
    • Schweitzer, P., and Seidman, A. 1985. Generalized polynomial approximations in Markovian decision problems. J. Math. Anal. and Appl. 110:568-582.
  • 18
    • Trick, M. A., and Zin, S. E. 1997. Spline approximations to value functions: A linear programming approach. Macroeconomic Dynamics 1:255-277.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.