SCOPUS Information Search Platform

Volume , Issue , 2014, Pages 780-789

Model regularization for stable sample rollouts

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE;

EFFECTIVE PLANNING; IMPERFECT MODELING; PLANNING ALGORITHMS; TRAINING DATA;

ERRORS;

EID: 84923276855 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (103)

References (15)

1
- 84898957872
- Improving the accuracy and speed of support vector machines
- Burges, C. J. & Schölkopf, B. (1997), Improving the accuracy and speed of support vector machines, in 'Advances in Neural Information Processing Systems (NIPS)', pp. 375-381.
- (1997) Advances in Neural Information Processing Systems (NIPS) , pp. 375-381
- Burges, C.J.¹ Schölkopf, B.²

2
- 84887290718
- Reinforcement learning with misspecified model classes
- Joseph, J., Geramifard, A., Roberts, J. W., How, J. P. & Roy, N. (2013), Reinforcement learning with misspecified model classes, in '2013 IEEE International Conference on Robotics and Automation (ICRA)', pp. 939-946.
- (2013) 2013 IEEE International Conference on Robotics and Automation (ICRA) , pp. 939-946
- Joseph, J.¹ Geramifard, A.² Roberts, J.W.³ How, J.P.⁴ Roy, N.⁵

3
- 0036832951
- A sparse sampling algorithm for near-optimal planning in large Markov decision processes
- Kearns, M., Mansour, Y. & Ng, A. Y. (2002), 'A sparse sampling algorithm for near-optimal planning in large Markov decision processes', Machine Learning 49(2-3), 193-208.
- (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 193-208
- Kearns, M.¹ Mansour, Y.² Ng, A.Y.³

4
- 33750293964
- Bandit based Monte-Carlo planning
- Kocsis, L. & Szepesvári, C. (2006), Bandit based Monte-Carlo planning, in 'Proceedings of the 17th European Conference on Machine Learning (ECML)', pp. 282-293.
- (2006) Proceedings of the 17th European Conference on Machine Learning (ECML) , pp. 282-293
- Kocsis, L.¹ Szepesvári, C.²

5
- 0000123778
- Self-improving reactive agents based on reinforcement learning, planning and teaching
- Lin, L.-J. (1992), 'Self-improving reactive agents based on reinforcement learning, planning and teaching', Machine Learning 8(3-4), 293-321.
- (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 293-321
- Lin, L.-J.¹

6
- 84896535000
- Learning with marginalized corrupted features
- Maaten, L., Chen, M., Tyree, S. & Weinberger, K. Q. (2013), Learning with marginalized corrupted features, in 'Proceedings of the 30th International Conference on Machine Learning (ICML-13)', pp. 410-418.
- (2013) Proceedings of the 30th International Conference on Machine Learning (ICML-13) , pp. 410-418
- Maaten, L.¹ Chen, M.² Tyree, S.³ Weinberger, K.Q.⁴

7
- 0026858102
- Noise injection into inputs in backpropagation learning
- Matsuoka, K. (1992), 'Noise injection into inputs in backpropagation learning', IEEE Transactions on Systems, Man and Cybernetics 22(3), 436-440.
- (1992) IEEE Transactions on Systems, Man and Cybernetics , vol.22 , Issue.3 , pp. 436-440
- Matsuoka, K.¹

8
- 84867115891
- Agnostic system identification for model-based reinforcement learning
- Ross, S. & Bagnell, D. (2012), Agnostic system identification for model-based reinforcement learning, in 'Proceedings of the 29th International Conference on Machine Learning (ICML-12)', pp. 1703-1710.
- (2012) Proceedings of the 29th International Conference on Machine Learning (ICML-12) , pp. 1703-1710
- Ross, S.¹ Bagnell, D.²

9
- 0022471098
- Learning representations by back-propagating errors
- Rumelhart, D. E., Hinton, G. E. & Williams, R. J. (1986), 'Learning representations by back-propagating errors', Nature 323(6088), 533-536.
- (1986) Nature , vol.323 , Issue.6088 , pp. 533-536
- Rumelhart, D.E.¹ Hinton, G.E.² Williams, R.J.³

10
- 84991580149
- A constraint generation approach to learning stable linear dynamical systems
- Siddiqi, S. M., Boots, B. & Gordon, G. J. (2007), A constraint generation approach to learning stable linear dynamical systems, in 'Advances in Neural Information Processing Systems (NIPS)', pp. 1329-1336.
- (2007) Advances in Neural Information Processing Systems (NIPS) , pp. 1329-1336
- Siddiqi, S.M.¹ Boots, B.² Gordon, G.J.³

11
- 85161963598
- Monte-Carlo planning in large POMDPs
- Silver, D. & Veness, J. (2010), Monte-Carlo planning in large POMDPs, in 'Advances in Neural Information Processing Systems (NIPS)', pp. 2164-2172.
- (2010) Advances in Neural Information Processing Systems (NIPS) , pp. 2164-2172
- Silver, D.¹ Veness, J.²

12
- 85161967377
- Reward design via online gradient ascent
- Sorg, J., Lewis, R. L. & Singh, S. (2010), Reward design via online gradient ascent, in 'Advances in Neural Information Processing Systems (NIPS)', pp. 2190-2198.
- (2010) Advances in Neural Information Processing Systems (NIPS) , pp. 2190-2198
- Sorg, J.¹ Lewis, R.L.² Singh, S.³

13
- 77956525933
- Internal rewards mitigate agent boundedness
- Sorg, J., Singh, S. & Lewis, R. L. (2010), Internal rewards mitigate agent boundedness, in 'Proceedings of the 27th International Conference on Machine Learning (ICML-10)', pp. 1007-1014.
- (2010) Proceedings of the 27th International Conference on Machine Learning (ICML-10) , pp. 1007-1014
- Sorg, J.¹ Singh, S.² Lewis, R.L.³

14
- 79956344726
- A Monte-Carlo AIXI approximation
- Veness, J., Ng, K. S., Hutter, M., Uther, W. T. B. & Silver, D. (2011), 'A Monte-Carlo AIXI Approximation', Journal of Artificial Intelligence Research 40, 95-142.
- (2011) Journal of Artificial Intelligence Research , vol.40 , pp. 95-142
- Veness, J.¹ Ng, K.S.² Hutter, M.³ Uther, W.T.B.⁴ Silver, D.⁵

15
- 0029307102
- The context tree weighting method: Basic properties
- Willems, F. M., Shtarkov, Y. M. & Tjalkens, T. J. (1995), 'The context tree weighting method: Basic properties', IEEE Transactions on Information Theory 41, 653-664.
- (1995) IEEE Transactions on Information Theory , vol.41 , pp. 653-664
- Willems, F.M.¹ Shtarkov, Y.M.² Tjalkens, T.J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.