Volume , Issue , 2007, Pages 185-192

Randomly sampling actions in dynamic programming

Author keywords

[No Author keywords available]

Indexed keywords

DISCRETE TIME CONTROL SYSTEMS; OPTIMIZATION; RANDOM PROCESSES; SAMPLING;

EID: 34548784023     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ADPRL.2007.368187     Document Type: Conference Paper
Times cited: 13

References (22)
  • 3
    • 84921399937
    • J. Si, A. Barto, W. B. Powell, and D. Wunsch II, Handbook of Learning and Approximate Dynamic Programming. IEEE, 2004.
  • 10
    • 41849132401
    • An evolutionary random policy search algorithm for solving Markov decision processes
    • J. Hu, M. C. Fu, V. R. Ramezani, and S. I. Marcus, "An evolutionary random policy search algorithm for solving Markov decision processes," INFORMS Journal on Computing, to appear, 2007.
    • (2007) INFORMS Journal on Computing, to appear
    • Hu, J.; Fu, M.C.; Ramezani, V.R.; Marcus, S.I.
  • 11
    • 14644444172
    • An adaptive sampling algorithm for solving Markov decision processes
    • H. S. Chang, M. C. Fu, J. Hu, and S. I. Marcus, "An adaptive sampling algorithm for solving Markov decision processes," Operations Research, vol. 53, pp. 126-139, 2005.
    • (2005) Operations Research, vol. 53, pp. 126-139
    • Chang, H.S.; Fu, M.C.; Hu, J.; Marcus, S.I.
  • 15
    • 0001509947
    • Using randomization to break the curse of dimensionality
    • J. Rust, "Using randomization to break the curse of dimensionality," Econometrica, vol. 65, no. 3, pp. 487-516, 1997. [Online]. Available: citeseer.ist.psu.edu/rust96using.html
    • (1997) Econometrica, vol. 65, no. 3, pp. 487-516
    • Rust, J.
  • 18
    • 34548751619
    • G. Gordon, "Approximate solutions to Markov decision processes," Ph.D. dissertation, Carnegie Mellon University, 1999. [Online]. Available: citeseer.ist.psu.edu/gordon99approximate.html
  • 19
    • 34548750791
    • R. J. Williams and L. C. Baird, III, "Analysis of some incremental variants of policy iteration: First steps toward understanding actor-critic learning systems," Northeastern University, Tech. Rep. NU-CCS-93-11, 1993. [Online]. Available: citeseer.ist.psu.edu/williams93analysis.html
  • 20
    • 34249833101
    • Q-learning
    • C. Watkins and P. Dayan, "Q-learning," Machine Learning, vol. 8, no. 3, pp. 279-292, 1992.
    • (1992) Machine Learning, vol. 8, no. 3, pp. 279-292
    • Watkins, C.; Dayan, P.
  • 21
    • 33750502578
    • From dynamic programming to RRTs: Algorithmic design of feasible trajectories
    • S. M. LaValle, "From dynamic programming to RRTs: Algorithmic design of feasible trajectories," in Control Problems in Robotics. Springer-Verlag, 2002, pp. 19-37.
    • (2002) Control Problems in Robotics, pp. 19-37
    • LaValle, S.M.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.