Volume , Issue , 2006, Pages 3178-3183

Towards direct policy search reinforcement learning for robot control

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; CONTROL SYSTEM ANALYSIS; INTELLIGENT CONTROL; OPTIMIZATION; ROBOTS; STATE ESTIMATION;

EID: 34250644253     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IROS.2006.282342     Document Type: Conference Paper
Times cited : (8)

References (30)
  • 7
    • 34250624603
    • C. Anderson, "Approximating a policy can be easier than approximating a value function," Colorado State University, Computer Science Technical Report, 2000.
  • 8
    • 14344253499
    • D. A. Aberdeen, "Policy-gradient algorithms for partially observable Markov decision processes," Ph.D. dissertation, Australian National University, April 2003.
  • 10
    • 34250683611
    • N. Meuleau, K. E. Kim, L. P. Kaelbling, and A. R. Cassandra, "Solving POMDPs by searching the space of finite policies," in 15th Conference on Uncertainty in Artificial Intelligence, M. Kaufmann, Ed., Computer Science Dept., Brown University, July 1999, pp. 127-136.
  • 14
    • 34250638040
    • P. Marbach and J. N. Tsitsiklis, "Gradient-based optimization of Markov reward processes: Practical variants," Center for Communications Systems Research, University of Cambridge, Tech. Rep., March 2000.
  • 16
    • 34250649024
    • N. Meuleau, L. Peshkin, and K. Kim, "Exploration in gradient based reinforcement learning," Massachusetts Institute of Technology, AI Memo 2001-003, Tech. Rep., April 2001.
  • 17
    • 0000337576
    • R. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Machine Learning, vol. 8, pp. 229-256, 1992.
  • 19
    • 85153938292
    • T. Jaakkola, S. Singh, and M. Jordan, "Reinforcement learning algorithms for partially observable Markov decision problems," in Advances in Neural Information Processing Systems, vol. 7, Morgan Kaufmann, 1995, pp. 345-352.
  • 20
    • 34250642878 scopus 로고    scopus 로고
    • J. Baxter and P. Bartlett, Direct gradient-based reinforcement learning: I. gradient estimation algorithms, Australian National University, Tech. Rep., 1999.
    • J. Baxter and P. Bartlett, "Direct gradient-based reinforcement learning: I. gradient estimation algorithms," Australian National University, Tech. Rep., 1999.
  • 21
    • 0009011171
    • P. Marbach and J. N. Tsitsiklis, "Simulation-based optimization of Markov reward processes," Technical Report LIDS-P-2411, Massachusetts Institute of Technology, 1998.
  • 22
    • 0009011171
    • P. Marbach, "Simulation-based methods for Markov decision processes," Ph.D. thesis, Laboratory for Information and Decision Systems, MIT, 1998.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.