Volume, Issue, 2009, Pages 153-160

Policy search with cross-entropy optimization of basis functions

Author keywords

[No Author keywords available]

Indexed keywords

BASIS FUNCTIONS; CLOSED-LOOP; COMPUTATIONAL COSTS; CROSS ENTROPY; CROSS-ENTROPY METHOD; INITIAL STATE; LARGE CLASS; MARKOV DECISION PROCESSES; NOVEL ALGORITHM; PARAMETERIZATIONS; POLICY SEARCH; SIMULATION EXPERIMENTS

EID: 67650502101    PISSN: None    EISSN: None    Source Type: Conference Proceeding
DOI: 10.1109/ADPRL.2009.4927539    Document Type: Conference Paper
Times cited: 12

References (17)
  • 3. R. Munos and A. Moore, "Variable-resolution discretization in optimal control," Machine Learning, vol. 49, no. 2-3, pp. 291-323, 2002.
  • 6. S. Mahadevan and M. Maggioni, "Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes," Journal of Machine Learning Research, vol. 8, pp. 2169-2231, 2007.
  • 7. R. S. Sutton, D. A. McAllester, S. P. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," in Advances in Neural Information Processing Systems 12, S. A. Solla, T. K. Leen, and K.-R. Müller, Eds. MIT Press, 2000, pp. 1057-1063.
  • 8. P. Marbach and J. N. Tsitsiklis, "Approximate gradient methods in policy-space optimization of Markov reward processes," Discrete Event Dynamic Systems: Theory and Applications, vol. 13, pp. 111-148, 2003.
  • 9. R. Munos, "Policy gradient in continuous time," Journal of Machine Learning Research, vol. 7, pp. 771-791, 2006.
  • 12. S. Whiteson and P. Stone, "Evolutionary function approximation for reinforcement learning," Journal of Machine Learning Research, vol. 7, pp. 877-917, 2006.
  • 15. A. Costa, O. D. Jones, and D. Kroese, "Convergence properties of the cross-entropy method for discrete optimization," Operations Research Letters, vol. 35, no. 5, pp. 573-580, 2007. DOI: 10.1016/j.orl.2006.11.005.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.