SCOPUS Information Retrieval Platform

ACM International Conference Proceeding Series

Volume 148, 2006, Pages 449-456

Automatic basis function construction for approximate dynamic programming and reinforcement learning

(3) Keller, Philipp W. (a); Mannor, Shie (a); Precup, Doina (a)

(a) McGill University (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

INVENTORY CONTROL; MARKOV PROCESSES; REINFORCEMENT LEARNING;

BELLMAN ERROR; BERTSEKAS AND CASTAÑON; DIMENSIONALITY REDUCTION TECHNIQUE; LINEAR APPROXIMATION; NEIGHBORHOOD COMPONENT ANALYSIS; TEMPORAL DIFFERENCE; VALUE ITERATION;

DYNAMIC PROGRAMMING;

EID: 34250706852 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/1143844.1143901 Document Type: Conference Paper

Times cited : (56)

References (17)

1
- 0003565783
- Dynamic programming and optimal control
- Bertsekas, D. (2005). Dynamic programming and optimal control, vol. 1, third edition. Athena Scientific.
- (2005) Dynamic programming and optimal control , vol.1
- Bertsekas, D.¹

2
- 0024680419
- Adaptive aggregation methods for infinite horizon dynamic programming
- Bertsekas, D., & Castañon, D. (1989). Adaptive aggregation methods for infinite horizon dynamic programming. IEEE Transactions on Automatic Control, 34, 589-598.
- (1989) IEEE Transactions on Automatic Control , vol.34 , pp. 589-598
- Bertsekas, D.¹ Castañon, D.²

3
- 0036832950
- Technical update: Least-squares temporal difference learning
- Boyan, J. (2002). Technical update: Least-squares temporal difference learning. Machine Learning.
- (2002) Machine Learning
- Boyan, J.¹

4
- 0001771345
- Linear least-squares algorithms for temporal difference learning
- Bradtke, S., & Barto, A. (1996). Linear least-squares algorithms for temporal difference learning. Machine Learning, 22, 33-57.
- (1996) Machine Learning , vol.22 , pp. 33-57
- Bradtke, S.¹ Barto, A.²

5
- 63249106662
- Experiments with random projections for machine learning
- Fradkin, D., & Madigan, D. (2003). Experiments with random projections for machine learning. In Proc. of KDD.
- (2003) Proc. of KDD
- Fradkin, D.¹ Madigan, D.²

6
- 84898993653
- Neighbourhood components analysis
- Goldberger, J., Roweis, S., Hinton, G., & Salakhutdinov, R. (2005). Neighbourhood components analysis. In NIPS 17, 513-520.
- (2005) NIPS 17 , pp. 513-520
- Goldberger, J.¹ Roweis, S.² Hinton, G.³ Salakhutdinov, R.⁴

7
- 0024137490
- Increased rates of convergence through learning rate adaptation
- Jacobs, R. A. (1988). Increased rates of convergence through learning rate adaptation. Neural Networks, 1, 295-307.
- (1988) Neural Networks , vol.1 , pp. 295-307
- Jacobs, R.A.¹

8
- 29344433509
- Samuel meets Amarel: Automating value function approximation using global state space analysis
- Mahadevan, S. (2005). Samuel meets Amarel: Automating value function approximation using global state space analysis. In Proceedings of AAAI.
- (2005) Proceedings of AAAI
- Mahadevan, S.¹

9
- 17444414191
- Basis function adaptation in temporal difference reinforcement learning
- Mannor, S., Menache, I., & Shimkin, N. (2005). Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research, 134, 215-238.
- (2005) Annals of Operations Research , vol.134 , pp. 215-238
- Mannor, S.¹ Menache, I.² Shimkin, N.³

10
- 0036832953
- Variable resolution discretization in optimal control
- Munos, R., & Moore, A. (2002). Variable resolution discretization in optimal control. Machine Learning, 49, 291-323.
- (2002) Machine Learning , vol.49 , pp. 291-323
- Munos, R.¹ Moore, A.²

11
- 26944478343
- Sparse distributed memories for on-line value-based reinforcement learning
- Ratitch, B., & Precup, D. (2004). Sparse distributed memories for on-line value-based reinforcement learning. In Proceedings of ECML.
- (2004) Proceedings of ECML
- Ratitch, B.¹ Precup, D.²

12
- 34547971837
- Explicit manifold representations for value-functions in reinforcement learning
- Smart, W. (2004). Explicit manifold representations for value-functions in reinforcement learning. In Proceedings of the 8th Int. Symp. on AI and Mathematics.
- (2004) Proceedings of the 8th Int. Symp. on AI and Mathematics
- Smart, W.¹

13
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton, R. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.¹

14
- 0004102479
- Reinforcement learning: An introduction
- Sutton, R., & Barto, A. (1998). Reinforcement learning: An introduction. MIT Press.
- (1998) Reinforcement learning: An introduction
- Sutton, R.¹ Barto, A.²

15
- 0035283402
- On the convergence of temporal-difference learning with linear function approximation
- Tadic, V. (2001). On the convergence of temporal-difference learning with linear function approximation. Machine Learning, 42, 241-267.
- (2001) Machine Learning , vol.42 , pp. 241-267
- Tadic, V.¹

16
- 0031143730
- An analysis of temporal-difference learning with function approximation
- Tsitsiklis, J., & Van Roy, B. (1997). An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 42, 674-690.
- (1997) IEEE Transactions on Automatic Control , vol.42 , pp. 674-690
- Tsitsiklis, J.¹ Van Roy, B.²

17
- 84891543200
- Multigrid algorithms for temporal difference reinforcement learning
- Ziv, O., & Shimkin, N. (2005). Multigrid algorithms for temporal difference reinforcement learning. In Proc. ICML Workshop on Rich Representations for RL.
- (2005) Proc. ICML Workshop on Rich Representations for RL
- Ziv, O.¹ Shimkin, N.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.