Volume , Issue , 2000, Pages 1022-1028

Policy search via density estimation

Author keywords

[No Author keywords available]

Indexed keywords

MARKOV PROCESSES

EID: 84898967780     PISSN: 10495258     EISSN: None     Source Type: Conference Proceeding
DOI: None     Document Type: Conference Paper
Times cited: 23

References (9)
  • 1
    • L. Baird and A.W. Moore. Gradient descent for general reinforcement learning. In NIPS 11, 1999.
  • 3
    • X. Boyen and D. Koller. Tractable inference for complex stochastic processes. In Proc. UAI, pages 33-42, 1998.
  • 5
    • D. Koller and R. Fratkina. Using learning for approximation in stochastic processes. In Proc. ICML, pages 287-295, 1998.
  • 6
    • N. Meuleau, L. Peshkin, K.-E. Kim, and L.P. Kaelbling. Learning finite-state controllers for partially observable environments. In Proc. UAI 15, 1999.
  • 7
    • J. Randløv and P. Alstrøm. Learning to drive a bicycle using reinforcement learning and shaping. In Proc. ICML, 1998.
  • 8
    • J.K. Williams and S. Singh. Experiments with an algorithm which learns stochastic memoryless policies for POMDPs. In NIPS 11, 1999.
  • 9
    • R.J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229-256, 1992.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.