SCOPUS Information Search Platform

Advances in Neural Information Processing Systems

Volume , Issue , 2016, Pages 1479-1487

Unifying count-based exploration and intrinsic motivation

Authors (6): Bellemare, Marc G.ᵃ; Srinivasan, Sriramᵃ; Ostrovski, Georgᵃ; Schaul, Tomᵃ; Saxton, Davidᵃ; Munos, Rémiᵃ

ᵃ DeepMind (United Kingdom)

Author keywords

[No Author keywords available]

Indexed keywords

REINFORCEMENT LEARNING;

DENSITY MODELING; DENSITY MODELS; EXPLORATION ALGORITHMS; INTRINSIC MOTIVATION; NOVEL ALGORITHM;

MOTIVATION;

EID: 85018935382 PISSN: 10495258 EISSN: None Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (1495)

References (24)

1
- 84929046579
- Intrinsic motivation and reinforcement learning
- Springer
- Barto, A. G. (2013). Intrinsic motivation and reinforcement learning. In Intrinsically Motivated Learning in Natural and Artificial Systems, pages 17-47. Springer.
- (2013) Intrinsically Motivated Learning in Natural and Artificial Systems , pp. 17-47
- Barto, A.G.¹

2
- 84919784622
- Skip context tree switching
- Bellemare, M., Veness, J., and Talvitie, E. (2014). Skip context tree switching. In Proceedings of the 31st International Conference on Machine Learning, pages 1458-1466.
- (2014) Proceedings of the 31st International Conference on Machine Learning , pp. 1458-1466
- Bellemare, M.¹ Veness, J.² Talvitie, E.³

3
- 84879976780
- The arcade learning environment: An evaluation platform for general agents
- Bellemare, M. G., Naddaf, Y., Veness, J., and Bowling, M. (2013). The Arcade Learning Environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47:253-279.
- (2013) Journal of Artificial Intelligence Research , vol.47 , pp. 253-279
- Bellemare, M.G.¹ Naddaf, Y.² Veness, J.³ Bowling, M.⁴

4
- 0003487482
- Athena Scientific
- Bertsekas, D. P. and Tsitsiklis, J. N. (1996). Neuro-Dynamic Programming. Athena Scientific.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

5
- 84889281816
- John Wiley & Sons
- Cover, T. M. and Thomas, J. A. (1991). Elements of information theory. John Wiley & Sons.
- (1991) Elements of Information Theory
- Cover, T.M.¹ Thomas, J.A.²

6
- 85018869330
- Houthooft, R., Chen, X., Duan, Y., Schulman, J., De Turck, F., and Abbeel, P. (2016). Variational information maximizing exploration.
- (2016) Variational Information Maximizing Exploration
- Houthooft, R.¹ Chen, X.² Duan, Y.³ Schulman, J.⁴ De Turck, F.⁵ Abbeel, P.⁶

7
- 84919808508
- Sparse adaptive Dirichlet-multinomial-like processes
- Hutter, M. (2013). Sparse adaptive Dirichlet-multinomial-like processes. In Proceedings of the Conference on Online Learning Theory.
- (2013) Proceedings of the Conference on Online Learning Theory
- Hutter, M.¹

8
- 71149109483
- Near-Bayesian exploration in polynomial time
- Kolter, Z. J. and Ng, A. Y. (2009). Near-Bayesian exploration in polynomial time. In Proceedings of the 26th International Conference on Machine Learning.
- (2009) Proceedings of the 26th International Conference on Machine Learning
- Kolter, Z.J.¹ Ng, A.Y.²

9
- 85002497864
- Thompson sampling is asymptotically optimal in general environments
- Leike, J., Lattimore, T., Orseau, L., and Hutter, M. (2016). Thompson sampling is asymptotically optimal in general environments. In Proceedings of the Conference on Uncertainty in Artificial Intelligence.
- (2016) Proceedings of the Conference on Uncertainty in Artificial Intelligence
- Leike, J.¹ Lattimore, T.² Orseau, L.³ Hutter, M.⁴

10
- 84877724875
- Exploration in model-based reinforcement learning by empirically estimating learning progress
- Lopes, M., Lang, T., Toussaint, M., and Oudeyer, P.-Y. (2012). Exploration in model-based reinforcement learning by empirically estimating learning progress. In Advances in Neural Information Processing Systems 25.
- (2012) Advances in Neural Information Processing Systems , vol.25
- Lopes, M.¹ Lang, T.² Toussaint, M.³ Oudeyer, P.-Y.⁴

11
- 84964652743
- Domain-independent optimistic initialization for reinforcement learning
- Machado, M. C., Srinivasan, S., and Bowling, M. (2015). Domain-independent optimistic initialization for reinforcement learning. AAAI Workshop on Learning for General Competency in Video Games.
- (2015) AAAI Workshop on Learning for General Competency in Video Games
- Machado, M.C.¹ Srinivasan, S.² Bowling, M.³

12
- 84999036937
- Asynchronous methods for deep reinforcement learning
- Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T. P., Harley, T., Silver, D., and Kavukcuoglu, K. (2016). Asynchronous methods for deep reinforcement learning. In Proceedings of the International Conference on Machine Learning.
- (2016) Proceedings of the International Conference on Machine Learning
- Mnih, V.¹ Badia, A.P.² Mirza, M.³ Graves, A.⁴ Lillicrap, T.P.⁵ Harley, T.⁶ Silver, D.⁷ Kavukcuoglu, K.⁸

13
- 84924051598
- Human-level control through deep reinforcement learning
- Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540):529-533.
- (2015) Nature , vol.518 , Issue.7540 , pp. 529-533
- Mnih, V.¹ Kavukcuoglu, K.² Silver, D.³ Rusu, A.A.⁴ Veness, J.⁵ Bellemare, M.G.⁶ Graves, A.⁷ Riedmiller, M.⁸ Fidjeland, A.K.⁹ Ostrovski, G.¹⁰

14
- 84965128263
- Variational information maximisation for intrinsically motivated reinforcement learning
- Mohamed, S. and Rezende, D. J. (2015). Variational information maximisation for intrinsically motivated reinforcement learning. In Advances in Neural Information Processing Systems 28.
- (2015) Advances in Neural Information Processing Systems , vol.28
- Mohamed, S.¹ Rezende, D.J.²

15
- 85018893133
- arXiv preprint arXiv:1503.04304
- Ollivier, Y. (2015). Laplace's rule of succession in information geometry. arXiv preprint arXiv:1503.04304.
- (2015) Laplace's Rule of Succession in Information Geometry
- Ollivier, Y.¹

16
- 85018875593
- Universal knowledge-seeking agents for stochastic environments
- Orseau, L., Lattimore, T., and Hutter, M. (2013). Universal knowledge-seeking agents for stochastic environments. In Proceedings of the Conference on Algorithmic Learning Theory.
- (2013) Proceedings of the Conference on Algorithmic Learning Theory
- Orseau, L.¹ Lattimore, T.² Hutter, M.³

17
- 34047267520
- Intrinsic motivation systems for autonomous mental development
- Oudeyer, P., Kaplan, F., and Hafner, V. (2007). Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation, 11(2):265-286.
- (2007) IEEE Transactions on Evolutionary Computation , vol.11 , Issue.2 , pp. 265-286
- Oudeyer, P.¹ Kaplan, F.² Hafner, V.³

18
- 85007188953
- Efficient PAC-optimal exploration in concurrent, continuous state MDPs with delayed updates
- Pazis, J. and Parr, R. (2016). Efficient PAC-optimal exploration in concurrent, continuous state MDPs with delayed updates. In Proceedings of the 30th AAAI Conference on Artificial Intelligence.
- (2016) Proceedings of the 30th AAAI Conference on Artificial Intelligence
- Pazis, J.¹ Parr, R.²

19
- 2442467081
- A possibility for implementing curiosity and boredom in model-building neural controllers
- Schmidhuber, J. (1991). A possibility for implementing curiosity and boredom in model-building neural controllers. In From animals to animats: proceedings of the first international conference on simulation of adaptive behavior.
- (1991) From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior
- Schmidhuber, J.¹

20
- 33745609140
- Intrinsically motivated reinforcement learning
- Singh, S., Barto, A. G., and Chentanez, N. (2004). Intrinsically motivated reinforcement learning. In Advances in Neural Information Processing Systems 16.
- (2004) Advances in Neural Information Processing Systems , vol.16
- Singh, S.¹ Barto, A.G.² Chentanez, N.³

21
- 84959023524
- arXiv preprint arXiv:1507.00814
- Stadie, B. C., Levine, S., and Abbeel, P. (2015). Incentivizing exploration in reinforcement learning with deep predictive models. arXiv preprint arXiv:1507.00814.
- (2015) Incentivizing Exploration in Reinforcement Learning with Deep Predictive Models
- Stadie, B.C.¹ Levine, S.² Abbeel, P.³

22
- 55549110436
- An analysis of model-based interval estimation for Markov decision processes
- Strehl, A. L. and Littman, M. L. (2008). An analysis of model-based interval estimation for Markov decision processes. Journal of Computer and System Sciences, 74(8):1309-1331.
- (2008) Journal of Computer and System Sciences , vol.74 , Issue.8 , pp. 1309-1331
- Strehl, A.L.¹ Littman, M.L.²

23
- 84999048365
- Pixel recurrent neural networks
- Van den Oord, A., Kalchbrenner, N., and Kavukcuoglu, K. (2016). Pixel recurrent neural networks. In Proceedings of the 33rd International Conference on Machine Learning.
- (2016) Proceedings of the 33rd International Conference on Machine Learning
- Van Den Oord, A.¹ Kalchbrenner, N.² Kavukcuoglu, K.³

24
- 85007210890
- Deep reinforcement learning with double Q-learning
- van Hasselt, H., Guez, A., and Silver, D. (2016). Deep reinforcement learning with double Q-learning. In Proceedings of the 30th AAAI Conference on Artificial Intelligence.
- (2016) Proceedings of the 30th AAAI Conference on Artificial Intelligence
- Van Hasselt, H.¹ Guez, A.² Silver, D.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.