Volume 227, 2007, Pages 751-758

Tracking value function dynamics to improve reinforcement learning with piecewise linear function approximation

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; APPROXIMATION THEORY; ERROR CORRECTION; FUNCTION EVALUATION; KALMAN FILTERS; LINEAR SYSTEMS;

EID: 34547974097     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1273496.1273591     Document Type: Conference Paper
Times cited: 9

References (18)
  • 1. Baird, L. (1995). Residual algorithms: Reinforcement learning with function approximation. Twelfth International Conference on Machine Learning (pp. 30-37). San Francisco: Morgan Kaufmann Publishers.
  • 2. Boyan, J. (2002). Technical update: Least-squares temporal difference learning. Machine Learning, 49, 233-246.
  • 3. Bradtke, S. J., & Barto, A. G. (1996). Linear least-squares algorithms for temporal difference learning. Machine Learning, 22, 33-57.
  • 4. Choi, D., & Van Roy, B. (2006). A generalized Kalman filter for fixed point approximation and efficient temporal difference learning. Discrete Event Dynamic Systems, 16, 207-239.
  • 5. Dietterich, T. G. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13, 227-303.
  • 10. Munos, R., & Moore, A. (2002). Variable resolution discretization in optimal control. Machine Learning, 49, 291-323.
  • 11. Perkins, T. J., & Precup, D. (2002). A convergent form of approximate policy iteration. Neural Information Processing Systems (pp. 1595-1602). Vancouver, British Columbia, Canada: MIT Press.
  • 12. Potts, D., & Sammut, C. (2005). Incremental learning of linear model trees. Machine Learning, 61, 5-48.
  • 14. Soderstrom, T., & Stoica, P. (2002). Instrumental variable methods for system identification. Circuits Systems Signal Processing, 21, 1-9.
  • 16. Tsitsiklis, J. N., & Van Roy, B. (1997). An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 42, 674-690.
  • 18. Xu, X., He, H.-g., & Hu, D. (2002). Efficient reinforcement learning using recursive least squares methods. Journal of Artificial Intelligence Research, 16, 259-292.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.