SCOPUS Information Search Platform

Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011

Volume , Issue , 2011, Pages

Action-gap phenomenon in reinforcement learning

(1) Farahmand, Amir Massoud a

a MCGILL UNIVERSITY (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

ITERATIVE METHODS;

APPROXIMATE VALUE ITERATION ALGORITHM; GAP PHENOMENON; GREEDY POLICY; LEARNING PROBLEM; OPTIMAL PERFORMANCE; PERFORMANCE; PERFORMANCE LOSS; REINFORCEMENT LEARNINGS; VALUE FUNCTIONS;

REINFORCEMENT LEARNING;

EID: 85162479771 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (57)

References (25)

1
- 4644323293
- Least-squares policy iteration
- Michail G. Lagoudakis and Ronald Parr. Least-squares policy iteration. Journal of Machine Learning Research, 4:1107-1149, 2003.
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1107-1149
- Lagoudakis, M.G.¹ Parr, R.²

2
- 0003487482
- Athena Scientific
- Dimitri P. Bertsekas and John N. Tsitsiklis. Neuro-Dynamic Programming (Optimization and Neural Computation Series, 3). Athena Scientific, 1996.
- (1996) Neuro-Dynamic Programming (Optimization and Neural Computation Series 3)
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

3
- 0033234630
- Smooth discrimination analysis
- Enno Mammen and Alexander B. Tsybakov. Smooth discrimination analysis. The Annals of Statistics, 27(6):1808-1829, 1999.
- (1999) The Annals of Statistics , vol.27 , Issue.6 , pp. 1808-1829
- Mammen, E.¹ Tsybakov, A.B.²

4
- 3142725508
- Optimal aggregation of classifiers in statistical learning
- Alexander B. Tsybakov. Optimal aggregation of classifiers in statistical learning. The Annals of Statistics, 32 (1):135-166, 2004.
- (2004) The Annals of Statistics , vol.32 , Issue.1 , pp. 135-166
- Tsybakov, A.B.¹

5
- 34547706430
- Fast learning rates for plug-in classifiers
- Jean-Yves Audibert and Alexander B. Tsybakov. Fast learning rates for plug-in classifiers. The Annals of Statistics, 35(2):608-633, 2007.
- (2007) The Annals of Statistics , vol.35 , Issue.2 , pp. 608-633
- Audibert, J.-Y.¹ Tsybakov, A.B.²

6
- 77957604813
- Generalized density clustering
- Alessandro Rinaldo and Larry Wasserman. Generalized density clustering. The Annals of Statistics, 38(5):2678-2722, 2010.
- (2010) The Annals of Statistics , vol.38 , Issue.5 , pp. 2678-2722
- Rinaldo, A.¹ Wasserman, L.²

7
- 1942420814
- Reinforcement learning as classification: Leveraging modern classifiers
- Michail G. Lagoudakis and Ronald Parr. Reinforcement learning as classification: Leveraging modern classifiers. In ICML '03: Proceedings of the 20th international conference on Machine learning, pages 424-431, 2003.
- (2003) ICML '03: Proceedings of the 20th International Conference on Machine Learning , pp. 424-431
- Lagoudakis, M.G.¹ Parr, R.²

8
- 77956523230
- Analysis of a classification-based policy iteration algorithm
- Omnipress
- Alessandro Lazaric, Mohammad Ghavamzadeh, and Rémi Munos. Analysis of a classification-based policy iteration algorithm. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 607-614. Omnipress, 2010.
- (2010) Proceedings of the 27th International Conference on Machine Learning (ICML-10) , pp. 607-614
- Lazaric, A.¹ Ghavamzadeh, M.² Munos, R.³

9
- 85162059109
- A reduction from apprenticeship learning to classification
- J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors
- Omar Syed and Robert E. Schapire. A reduction from apprenticeship learning to classification. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems (NIPS - 23), pages 2253-2261, 2010.
- (2010) Advances in Neural Information Processing Systems (NIPS - 23) , pp. 2253-2261
- Syed, O.¹ Schapire, R.E.²

10
- 40849145988
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- András Antos, Csaba Szepesvári, and Rémi Munos. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 71:89-129, 2008.
- (2008) Machine Learning , vol.71 , pp. 89-129
- Antos, A.¹ Szepesvári, C.² Munos, R.³

11
- 44649189852
- Finite-time bounds for fitted value iteration
- Rémi Munos and Csaba Szepesvári. Finite-time bounds for fitted value iteration. Journal of Machine Learning Research, 9:815-857, 2008.
- (2008) Journal of Machine Learning Research , vol.9 , pp. 815-857
- Munos, R.¹ Szepesvári, C.²

12
- 70449644892
- Regularized fitted Q-iteration for planning in continuous-space markovian decision problems
- June
- Amir-massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor. Regularized fitted Q-iteration for planning in continuous-space Markovian Decision Problems. In Proceedings of American Control Conference (ACC), pages 725-730, June 2009.
- (2009) Proceedings of American Control Conference (ACC) , pp. 725-730
- Farahmand, A.-M.¹ Ghavamzadeh, M.² Szepesvári, C.³ Mannor, S.⁴

13
- 70049096468
- Regularized policy iteration
- D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors. MIT Press
- Amir-massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, and Shie Mannor. Regularized policy iteration. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems (NIPS - 21), pages 441-448. MIT Press, 2009.
- (2009) Advances in Neural Information Processing Systems (NIPS - 21) , pp. 441-448
- Farahmand, A.-M.¹ Ghavamzadeh, M.² Szepesvári, C.³ Mannor, S.⁴

14
- 84860641013
- Finite-sample analysis of Bellman residual minimization
- Odalric Maillard, Rémi Munos, Alessandro Lazaric, and Mohammad Ghavamzadeh. Finite-sample analysis of Bellman residual minimization. In Proceedings of the Second Asian Conference on Machine Learning (ACML), 2010.
- (2010) Proceedings of the Second Asian Conference on Machine Learning (ACML)
- Maillard, O.¹ Munos, R.² Lazaric, A.³ Ghavamzadeh, M.⁴

15
- 0004102479
- The MIT Press
- Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning). The MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning)
- Sutton, R.S.¹ Barto, A.G.²

16
- 79955859296
- Morgan Claypool Publishers
- Csaba Szepesvári. Algorithms for Reinforcement Learning. Morgan Claypool Publishers, 2010.
- (2010) Algorithms for Reinforcement Learning
- Szepesvári, C.¹

17
- 77956541799
- Toward off-policy learning control with function approximation
- Johannes Fürnkranz and Thorsten Joachims, editors, Haifa, Israel, June. Omnipress
- Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, and Richard S. Sutton. Toward off-policy learning control with function approximation. In Johannes Fürnkranz and Thorsten Joachims, editors, Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 719-726, Haifa, Israel, June 2010. Omnipress.
- (2010) Proceedings of the 27th International Conference on Machine Learning (ICML-10) , pp. 719-726
- Maei, H.R.¹ Szepesvári, C.² Bhatnagar, S.³ Sutton, R.S.⁴

18
- 71149121683
- Regularization and feature selection in least-squares temporal difference learning
- ACM
- J. Zico Kolter and Andrew Y. Ng. Regularization and feature selection in least-squares temporal difference learning. In ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, pages 521-528. ACM, 2009.
- (2009) ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning , pp. 521-528
- Kolter, J.Z.¹ Ng, A.Y.²

19
- 33646398129
- Neural fitted Q iteration - First experiences with a data efficient neural reinforcement learning method
- Martin Riedmiller. Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method. In 16th European Conference on Machine Learning, pages 317-328, 2005.
- (2005) 16th European Conference on Machine Learning , pp. 317-328
- Riedmiller, M.¹

20
- 21844465127
- Tree-based batch mode reinforcement learning
- Damien Ernst, Pierre Geurts, and Louis Wehenkel. Tree-based batch mode reinforcement learning. Journal of Machine Learning Research, 6:503-556, 2005.
- (2005) Journal of Machine Learning Research , vol.6 , pp. 503-556
- Ernst, D.¹ Geurts, P.² Wehenkel, L.³

21
- 0348090400
- The linear programming approach to approximate dynamic programming
- Daniela Pucci de Farias and Benjamin Van Roy. The linear programming approach to approximate dynamic programming. Operations Research, 51(6):850-865, 2003.
- (2003) Operations Research , vol.51 , Issue.6 , pp. 850-865
- De Farias, D.P.¹ Van Roy, B.²

22
- 71149105671
- Constraint relaxation in approximate linear programs
- New York, NY, USA. ACM
- Marek Petrik and Shlomo Zilberstein. Constraint relaxation in approximate linear programs. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pages 809-816, New York, NY, USA, 2009. ACM.
- (2009) Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09 , pp. 809-816
- Petrik, M.¹ Zilberstein, S.²

23
- 1942516880
- Error bounds for approximate policy iteration
- Rémi Munos. Error bounds for approximate policy iteration. In ICML 2003: Proceedings of the 20th Annual International Conference on Machine Learning, pages 560-567, 2003.
- (2003) ICML 2003: Proceedings of the 20th Annual International Conference on Machine Learning , pp. 560-567
- Munos, R.¹

24
- 40949107944
- Performance bounds in Lp norm for approximate value iteration
- Rémi Munos. Performance bounds in Lp norm for approximate value iteration. SIAM Journal on Control and Optimization, pages 541-561, 2007.
- (2007) SIAM Journal on Control and Optimization , pp. 541-561
- Munos, R.¹

25
- 85162063395
- Error propagation for approximate policy and value iteration
- J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors
- Amir-massoud Farahmand, Rémi Munos, and Csaba Szepesvári. Error propagation for approximate policy and value iteration. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R.S. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems (NIPS - 23), pages 568-576. 2010.
- (2010) Advances in Neural Information Processing Systems (NIPS - 23) , pp. 568-576
- Farahmand, A.-M.¹ Munos, R.² Szepesvári, C.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.