2008, Pages 752-759

An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

MACHINE LEARNING; REINFORCEMENT LEARNING; FUNCTIONS; LEARNING SYSTEMS; REINFORCEMENT; ROBOT LEARNING;

EID: 56449092660     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1390156.1390251     Document Type: Conference Paper
Times cited : (189)

References (15)
  • 1
    • 0038595396
    • Least-squares temporal difference learning
    • Boyan, J. A. (1999). Least-squares temporal difference learning. ICML-99.
    • (1999) ICML , vol.99
    • Boyan, J.A.1
  • 2
    • 0001771345
    • Linear least-squares algorithms for temporal difference learning
    • Bradtke, S., & Barto, A. (1996). Linear least-squares algorithms for temporal difference learning. Machine Learning, 22.
    • (1996) Machine Learning , vol.22
    • Bradtke, S.1    Barto, A.2
  • 3
    • 1942515215
    • Model minimization in Markov decision processes
    • Dean, T., & Givan, R. (1997). Model minimization in Markov decision processes. AAAI-97.
    • (1997) AAAI , vol.97
    • Dean, T.1    Givan, R.2
  • 4
    • 33749263205
    • Automatic basis function construction for approximate dynamic programming and reinforcement learning
    • Keller, P., Mannor, S., & Precup, D. (2006). Automatic basis function construction for approximate dynamic programming and reinforcement learning. ICML 2006.
    • (2006) ICML 2006
    • Keller, P.1    Mannor, S.2    Precup, D.3
  • 5
    • 84880688552
    • Computing factored value functions for policies in structured MDPs
    • Koller, D., & Parr, R. (1999). Computing factored value functions for policies in structured MDPs. IJCAI-99.
    • (1999) IJCAI , vol.99
    • Koller, D.1    Parr, R.2
  • 6
    • 4644323293
    • Least squares policy iteration
    • Lagoudakis, M., & Parr, R. (2003). Least squares policy iteration. JMLR, 4.
    • (2003) JMLR , vol.4
    • Lagoudakis, M.1    Parr, R.2
  • 7
    • 35748957806
    • Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes
    • Mahadevan, S., & Maggioni, M. (2007). Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes. JMLR, 8.
    • (2007) JMLR , vol.8
    • Mahadevan, S.1    Maggioni, M.2
  • 9
    • 51849168812
    • Analyzing feature generation for value-function approximation
    • Parr, R., Painter-Wakefield, C., Li, L., & Littman, M. (2007). Analyzing feature generation for value-function approximation. ICML-07.
    • (2007) ICML , vol.7
    • Parr, R.1    Painter-Wakefield, C.2    Li, L.3    Littman, M.4
  • 10
    • 84880899807
    • An analysis of Laplacian methods for value function approximation in MDPs
    • Petrik, M. (2007). An analysis of Laplacian methods for value function approximation in MDPs. IJCAI-07.
    • (2007) IJCAI , vol.7
    • Petrik, M.1
  • 11
    • 72949112166
    • Approximate linear programming for first-order MDPs
    • Sanner, S., & Boutilier, C. (2005). Approximate linear programming for first-order MDPs. UAI-05.
    • (2005) UAI , vol.5
    • Sanner, S.1    Boutilier, C.2
  • 12
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3.
    • (1988) Machine Learning , vol.3
    • Sutton, R.S.1
  • 14
    • 56449129819
    • Feature-discovering approximate value iteration methods
    • TR-ECE-04-06, Purdue University
    • Wu, J.-H., & Givan, R. (2004). Feature-discovering approximate value iteration methods (Technical Report TR-ECE-04-06). Purdue University.
    • (2004) Technical Report
    • Wu, J.-H.1    Givan, R.2
  • 15
    • 34547991475
    • Convergence results for some temporal difference methods based on least squares
    • LIDS-2697, MIT
    • Yu, H., & Bertsekas, D. (2006). Convergence results for some temporal difference methods based on least squares (Technical Report LIDS-2697). MIT.
    • (2006) Technical Report
    • Yu, H.1    Bertsekas, D.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.