Volume, Issue, 2009, Pages 153-160

Policy search with cross-entropy optimization of basis functions

Author keywords

[No Author keywords available]

Indexed keywords

BASIS FUNCTIONS; CLOSED-LOOP; COMPUTATIONAL COSTS; CROSS ENTROPY; CROSS-ENTROPY METHOD; INITIAL STATE; LARGE CLASS; MARKOV DECISION PROCESSES; NOVEL ALGORITHM; PARAMETERIZATIONS; POLICY SEARCH; SIMULATION EXPERIMENTS

EID: 67650502101    PISSN: None    EISSN: None    Source Type: Conference Proceeding
DOI: 10.1109/ADPRL.2009.4927539    Document Type: Conference Paper
Times cited: 12

References (17)
  • 3. R. Munos and A. Moore, "Variable-resolution discretization in optimal control," Machine Learning, vol. 49, no. 2-3, pp. 291-323, 2002.
  • 6. S. Mahadevan and M. Maggioni, "Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes," Journal of Machine Learning Research, vol. 8, pp. 2169-2231, 2007.
  • 7. R. S. Sutton, D. A. McAllester, S. P. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," in Advances in Neural Information Processing Systems 12, S. A. Solla, T. K. Leen, and K.-R. Müller, Eds. MIT Press, 2000, pp. 1057-1063.
  • 8. P. Marbach and J. N. Tsitsiklis, "Approximate gradient methods in policy-space optimization of Markov reward processes," Discrete Event Dynamic Systems: Theory and Applications, vol. 13, pp. 111-148, 2003.
  • 9. R. Munos, "Policy gradient in continuous time," Journal of Machine Learning Research, vol. 7, pp. 771-791, 2006.
  • 12. S. Whiteson and P. Stone, "Evolutionary function approximation for reinforcement learning," Journal of Machine Learning Research, vol. 7, pp. 877-917, 2006.
  • 15. A. Costa, O. D. Jones, and D. Kroese, "Convergence properties of the cross-entropy method for discrete optimization," Operations Research Letters, vol. 35, no. 5, pp. 573-580, 2007. DOI: 10.1016/j.orl.2006.11.005.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.