Volume , Issue , 2002, Pages 285-291

Greedy linear value-approximation for factored Markov decision processes

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION THEORY; AUTOMATION; DECISION THEORY; ITERATIVE METHODS; LINEAR PROGRAMMING;

EID: 0036927202     PISSN: None     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 23

References (19)
  • 4
    • Boutilier, C.; Dean, T.; and Hanks, S. 1999. Decision-theoretic planning: Structural assumptions and computational leverage. JAIR 11:1-94.
  • 7
    • Dietterich, T. 2000. Hierarchical reinforcement learning with the MAXQ value function decomposition. JAIR 13:227-303.
  • 10
    • Koller, D., and Parr, R. 1999. Computing factored value functions for policies in structured MDPs. In Proceedings IJCAI.
  • 11
    • Koller, D., and Parr, R. 2000. Policy iteration for factored MDPs. In Proceedings UAI.
  • 12
    • Lusena, C.; Goldsmith, J.; and Mundhenk, M. 2001. Non-approximability results for partially observable Markov decision processes. JAIR 14:83-103.
  • 13
    • Mundhenk, M.; Goldsmith, J.; Lusena, C.; and Allender, E. 2000. Complexity of finite-horizon Markov decision processes. JACM 47(4):681-720.
  • 15
    • Sallans, B., and Hinton, G. 2000. Using free energies to represent Q-values in a multiagent reinforcement learning task. In Proceedings NIPS.
  • 16
  • 17
    • Schweitzer, P., and Seidman, A. 1985. Generalized polynomial approximations in Markovian decision problems. J. Math. Anal. and Appl. 110:568-582.
  • 18
    • Trick, M. A., and Zin, S. E. 1997. Spline approximations to value functions: A linear programming approach. Macroeconomic Dynamics 1:255-277.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.