Volume 148, 2006, Pages 449-456

Automatic basis function construction for approximate dynamic programming and reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

INVENTORY CONTROL; MARKOV PROCESSES; REINFORCEMENT LEARNING

EID: 34250706852     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1143844.1143901     Document Type: Conference Paper
Times cited: 56

References (17)
  • 2
    • 0024680419
    • Adaptive aggregation methods for infinite horizon dynamic programming
    • Bertsekas, D., & Castañon, D. (1989). Adaptive aggregation methods for infinite horizon dynamic programming. IEEE Transactions on Automatic Control, 34, 589-598.
    • (1989) IEEE Transactions on Automatic Control , vol.34 , pp. 589-598
    • Bertsekas, D.1    Castañon, D.2
  • 3
    • 0036832950
    • Technical update: Least-squares temporal difference learning
    • Boyan, J. (2002). Technical update: Least-squares temporal difference learning. Machine Learning.
    • (2002) Machine Learning
    • Boyan, J.1
  • 4
    • 0001771345
    • Linear least-squares algorithms for temporal difference learning
    • Bradtke, S., & Barto, A. (1996). Linear least-squares algorithms for temporal difference learning. Machine Learning, 22, 33-57.
    • (1996) Machine Learning , vol.22 , pp. 33-57
    • Bradtke, S.1    Barto, A.2
  • 5
    • 63249106662
    • Experiments with random projections for machine learning
    • Fradkin, D., & Madigan, D. (2003). Experiments with random projections for machine learning. In Proc. of KDD.
    • (2003) Proc. of KDD
    • Fradkin, D.1    Madigan, D.2
  • 7
    • 0024137490
    • Increased rates of convergence through learning rate adaptation
    • Jacobs, R. A. (1988). Increased rates of convergence through learning rate adaptation. Neural Networks, 1, 295-307.
    • (1988) Neural Networks , vol.1 , pp. 295-307
    • Jacobs, R.A.1
  • 8
    • 29344433509
    • Samuel meets Amarel: Automating value function approximation using global state space analysis
    • Mahadevan, S. (2005). Samuel meets Amarel: Automating value function approximation using global state space analysis. In Proceedings of AAAI.
    • (2005) Proceedings of AAAI
    • Mahadevan, S.1
  • 9
    • 17444414191
    • Basis function adaptation in temporal difference reinforcement learning
    • Mannor, S., Menache, I., & Shimkin, N. (2005). Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research, 134, 215-238.
    • (2005) Annals of Operations Research , vol.134 , pp. 215-238
    • Mannor, S.1    Menache, I.2    Shimkin, N.3
  • 10
    • 0036832953
    • Variable resolution discretization in optimal control
    • Munos, R., & Moore, A. (2002). Variable resolution discretization in optimal control. Machine Learning, 49, 291-323.
    • (2002) Machine Learning , vol.49 , pp. 291-323
    • Munos, R.1    Moore, A.2
  • 11
    • 26944478343
    • Sparse distributed memories for on-line value-based reinforcement learning
    • Ratitch, B., & Precup, D. (2004). Sparse distributed memories for on-line value-based reinforcement learning. In Proceedings of ECML.
    • (2004) Proceedings of ECML
    • Ratitch, B.1    Precup, D.2
  • 12
  • 13
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.1
  • 15
    • 0035283402
    • On the convergence of temporal-difference learning with linear function approximation
    • Tadic, V. (2001). On the convergence of temporal-difference learning with linear function approximation. Machine Learning, 42, 241-267.
    • (2001) Machine Learning , vol.42 , pp. 241-267
    • Tadic, V.1
  • 16
    • 0031143730
    • An analysis of temporal-difference learning with function approximation
    • Tsitsiklis, J., & Van Roy, B. (1997). An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 42, 674-690.
    • (1997) IEEE Transactions on Automatic Control , vol.42 , pp. 674-690
    • Tsitsiklis, J.1    Van Roy, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.