2002, Pages 292-299

Piecewise linear value function approximation for factored MDPs

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION THEORY; DECISION THEORY; OPTIMAL CONTROL SYSTEMS; PIECEWISE LINEAR TECHNIQUES; TREES (MATHEMATICS);

EID: 0036923210     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 27

References (18)
  • 2
    • 84880685295
    • Prioritized goal decomposition of Markov decision processes: Toward a synthesis of classical and decision theoretic planning
    • Nagoya
    • C. Boutilier, R. I. Brafman, and C. Geib. Prioritized goal decomposition of Markov decision processes: Toward a synthesis of classical and decision theoretic planning. In Proc. Fifteenth International Joint Conf. on AI, pp.1156-1162, Nagoya, 1997.
    • (1997) Proc. Fifteenth International Joint Conf. on AI , pp. 1156-1162
    • Boutilier, C.1    Brafman, R.I.2    Geib, C.3
  • 4
    • 0002192119
    • Input generalization in delayed reinforcement learning: An algorithm and performance comparisons
    • Sydney
    • D. Chapman and L. P. Kaelbling. Input generalization in delayed reinforcement learning: An algorithm and performance comparisons. In Proc. Twelfth International Joint Conf. on AI, pp.726-731, Sydney, 1991.
    • (1991) Proc. Twelfth International Joint Conf. on AI , pp. 726-731
    • Chapman, D.1    Kaelbling, L.P.2
  • 5
    • 0000746330
    • Model reduction techniques for computing approximately optimal solutions for Markov decision processes
    • Providence, RI
    • T. Dean, R. Givan, and S. Leach. Model reduction techniques for computing approximately optimal solutions for Markov decision processes. In Proc. Thirteenth Conf. on Uncertainty in AI, pp.124-131, Providence, RI, 1997.
    • (1997) Proc. Thirteenth Conf. on Uncertainty in AI , pp. 124-131
    • Dean, T.1    Givan, R.2    Leach, S.3
  • 6
    • 84990553353
    • A model for reasoning about persistence and causation
    • T. Dean and K. Kanazawa. A model for reasoning about persistence and causation. Comput. Intell., 5(3):142-150, 1989.
    • (1989) Comput. Intell. , vol.5 , Issue.3 , pp. 142-150
    • Dean, T.1    Kanazawa, K.2
  • 7
    • 0030697013
    • Abstraction and approximate decision theoretic planning
    • R. Dearden and C. Boutilier. Abstraction and approximate decision theoretic planning. Artif. Intell., 89:219-283, 1997.
    • (1997) Artif. Intell. , vol.89 , pp. 219-283
    • Dearden, R.1    Boutilier, C.2
  • 13
    • 0029514510
    • The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces
    • A. W. Moore and C. G. Atkeson. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces. Mach. Learn., 21:199-234, 1995.
    • (1995) Mach. Learn. , vol.21 , pp. 199-234
    • Moore, A.W.1    Atkeson, C.G.2
  • 17
    • 84899022377
    • How to dynamically merge Markov decision processes
    • MIT Press, Cambridge
    • S. P. Singh and D. Cohn. How to dynamically merge Markov decision processes. In Advances in Neural Info. Processing Sys. 10, pp.1057-1063. MIT Press, Cambridge, 1998.
    • (1998) Advances in Neural Info. Processing Sys. , vol.10 , pp. 1057-1063
    • Singh, S.P.1    Cohn, D.2
  • 18
    • 0029752470
    • Feature-based methods for large scale dynamic programming
    • J. Tsitsiklis and B. Van Roy. Feature-based methods for large scale dynamic programming. Mach. Learn., 22:59-94, 1996.
    • (1996) Mach. Learn. , vol.22 , pp. 59-94
    • Tsitsiklis, J.1    Van Roy, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.