Volume , Issue , 2006, Pages 428-433

Kernel rewards regression: An information efficient batch policy iteration approach

Author keywords

Intelligent control; Kernel based learning; Learning theory; Machine learning; Policy iteration; Reinforcement learning

Indexed keywords

CONTINUOUS TIME SYSTEMS; ITERATIVE METHODS; LEARNING ALGORITHMS; REINFORCEMENT LEARNING; SUPPORT VECTOR MACHINES;

EID: 38049121469     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (11)

References (27)
  • 2
    • 84947807317
    • Open theoretical questions in reinforcement learning
    • Richard S. Sutton. Open theoretical questions in reinforcement learning. In EuroCOLT, pages 11-17, 1999.
    • (1999) EuroCOLT , pp. 11-17
    • Sutton, R.S.1
  • 5
    • 0036832956
    • Kernel-based reinforcement learning
    • D. Ormoneit and S. Sen. Kernel-based reinforcement learning. Machine Learning, 49(2-3):161-178, 2002.
    • (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 161-178
    • Ormoneit, D.1    Sen, S.2
  • 6
    • 1942514242
    • Batch value function approximation via support vectors
    • T. Dietterich and X. Wang. Batch value function approximation via support vectors. In NIPS, pages 1491-1498, 2001.
    • (2001) NIPS , pp. 1491-1498
    • Dietterich, T.1    Wang, X.2
  • 8
    • 35048819671
    • Least-squares methods in reinforcement learning for control
    • Michail G. Lagoudakis, Ronald Parr, and Michael L. Littman. Least-squares methods in reinforcement learning for control. In SETN, pages 249-260, 2002.
    • (2002) SETN , pp. 249-260
    • Lagoudakis, M.G.1    Parr, R.2    Littman, M.L.3
  • 9
    • 0038595396
    • Least-squares temporal difference learning
    • I. Bratko and S. Dzeroski, editors, Morgan Kaufmann, San Francisco, CA
    • Justin A. Boyan. Least-squares temporal difference learning. In I. Bratko and S. Dzeroski, editors, Machine Learning: Proceedings of the Sixteenth International Conference, volume 14, pages 49-56. Morgan Kaufmann, San Francisco, CA, 1999.
    • (1999) Machine Learning: Proceedings of the Sixteenth International Conference , vol.14 , pp. 49-56
    • Boyan, J.A.1
  • 10
    • 84898955987
    • Incorporating invariances in non-linear support vector machines
    • T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, MIT Press, Cambridge, MA
    • O. Chapelle and B. Schoelkopf. Incorporating invariances in non-linear support vector machines. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems, volume 14, pages 609-616. MIT Press, Cambridge, MA, 2002.
    • (2002) Advances in Neural Information Processing Systems , vol.14 , pp. 609-616
    • Chapelle, O.1    Schoelkopf, B.2
  • 11
    • 33751577170
    • Tangent distance kernels for support vector machines
    • B. Haasdonk and D. Keysers. Tangent distance kernels for support vector machines. In Proceedings of the 16th ICPR, pages 864-868, 2002.
    • (2002) Proceedings of the 16th ICPR , pp. 864-868
    • Haasdonk, B.1    Keysers, D.2
  • 13
    • 84942484786
    • Ridge regression: Biased estimation for nonorthogonal problems
    • AE Hoerl and RW Kennard. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, pages 55-68, 1970.
    • (1970) Technometrics , pp. 55-68
    • Hoerl, A.E.1    Kennard, R.W.2
  • 19
    • 4544388567
    • Mixtures of Gaussian processes
    • Volker Tresp. Mixtures of Gaussian processes. In NIPS, volume 13, 2000.
    • (2000) NIPS , vol.13
    • Tresp, V.1
  • 23
    • 0003798637
    • Computer Sciences Department, University of Wisconsin, Madison, Wisconsin, Technical Report 99-03
    • O. L. Mangasarian and David R. Musicant. Data discrimination via nonlinear generalized support vector machines. Computer Sciences Department, University of Wisconsin, Madison, Wisconsin, Technical Report 99-03, 1999.
    • (1999)
    • Mangasarian, O.L.1    Musicant, D.R.2
  • 24
    • 0141596576
    • Policy invariance under reward transformations: Theory and application to reward shaping
    • Andrew Y. Ng, Daishi Harada, and Stuart Russell. Policy invariance under reward transformations: Theory and application to reward shaping. In Proc. 16th Intl. Conf. on Machine Learning, pages 278-287, 1999.
    • (1999) Proc. 16th Intl. Conf. on Machine Learning , pp. 278-287
    • Ng, A.Y.1    Harada, D.2    Russell, S.3
  • 25
    • 38349009432
    • Siemens AG, CTIC 4, Technical Report
    • Volker Tresp. The wet game of chicken. Siemens AG, CTIC 4, Technical Report, 1994.
    • (1994)
    • Tresp, V.1
  • 26
    • 0003120218
    • Fast training of support vector machines using sequential minimal optimization
    • MIT Press, Cambridge, MA, USA
    • John C. Platt. Fast training of support vector machines using sequential minimal optimization. In Advances in kernel methods: support vector learning, pages 185-208. MIT Press, Cambridge, MA, USA, 1999.
    • (1999) Advances in kernel methods: Support vector learning , pp. 185-208
    • Platt, J.C.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.