SCOPUS Information Retrieval Platform

4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings

Volume , Issue , 2016, Pages

Prioritized experience replay

(4) Schaul, Tomᵃ Quan, Johnᵃ Antonoglou, Ioannisᵃ Silver, Davidᵃ

ᵃ DEEPMIND (United Kingdom)

Author keywords

[No Author keywords available]

Indexed keywords

INTELLIGENT AGENTS; LEARNING ALGORITHMS; MACHINE LEARNING;

HUMAN-LEVEL PERFORMANCE; REINFORCEMENT LEARNING AGENT; STATE OF THE ART;

REINFORCEMENT LEARNING;

EID: 85083953310; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited : (1478)

References (37)

1
- 21844480297
- Generalized prioritized sweeping
- Citeseer
- Andre, David, Friedman, Nir, and Parr, Ronald. Generalized prioritized sweeping. In Advances in Neural Information Processing Systems. Citeseer, 1998.
- (1998) Advances in Neural Information Processing Systems
- Andre, D.¹ Friedman, N.² Parr, R.³

2
- 84940795121
- Memory trace replay: The shaping of memory consolidation by neuromodulation
- Atherton, Laura A, Dupret, David, and Mellor, Jack R. Memory trace replay: the shaping of memory consolidation by neuromodulation. Trends in neurosciences, 38(9):560–570, 2015.
- (2015) Trends in Neurosciences , vol.38 , Issue.9 , pp. 560-570
- Atherton, L.A.¹ Dupret, D.² Mellor, J.R.³

3
- 84879678310
- arXiv preprint
- Bellemare, Marc G, Naddaf, Yavar, Veness, Joel, and Bowling, Michael. The arcade learning environment: An evaluation platform for general agents. arXiv preprint arXiv:1207.4708, 2012.
- (2012) The Arcade Learning Environment: An Evaluation Platform for General Agents
- Bellemare, M.G.¹ Naddaf, Y.² Veness, J.³ Bowling, M.⁴

4
- 85007236718
- Increasing the action gap: New operators for reinforcement learning
- Bellemare, Marc G., Ostrovski, Georg, Guez, Arthur, Thomas, Philip S., and Munos, Rémi. Increasing the action gap: New operators for reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 2016. URL http://arxiv.org/abs/1512.04860.
- (2016) Proceedings of the AAAI Conference on Artificial Intelligence
- Bellemare, M.G.¹ Ostrovski, G.² Guez, A.³ Thomas, P.S.⁴ Munos, R.⁵

5
- 84888340666
- Torch7: A matlab-like environment for machine learning
- number EPFL-CONF-192376
- Collobert, Ronan, Kavukcuoglu, Koray, and Farabet, Clément. Torch7: A matlab-like environment for machine learning. In BigLearn, NIPS Workshop, number EPFL-CONF-192376, 2011.
- (2011) BigLearn, NIPS Workshop
- Collobert, R.¹ Kavukcuoglu, K.² Farabet, C.³

6
- 21844491206
- Zebras and the Anna Karenina principle
- Diamond, Jared. Zebras and the Anna Karenina principle. Natural History, 103:4–4, 1994.
- (1994) Natural History , vol.103 , pp. 4
- Diamond, J.¹

7
- 51949101231
- A discriminatively trained, multi-scale, deformable part model
- Felzenszwalb, Pedro, McAllester, David, and Ramanan, Deva. A discriminatively trained, multi-scale, deformable part model. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pp. 1–8. IEEE, 2008.
- (2008) Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on , pp. 1-8
- Felzenszwalb, P.¹ McAllester, D.² Ramanan, D.³

8
- 33645458694
- Reverse replay of behavioural sequences in hippocampal place cells during the awake state
- Foster, David J and Wilson, Matthew A. Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature, 440(7084):680–683, 2006.
- (2006) Nature , vol.440 , Issue.7084 , pp. 680-683
- Foster, D.J.¹ Wilson, M.A.²

9
- 84862515469
- A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches
- Galar, Mikel, Fernandez, Alberto, Barrenechea, Edurne, Bustince, Humberto, and Herrera, Francisco. A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 42(4):463–484, 2012.
- (2012) Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on , vol.42 , Issue.4 , pp. 463-484
- Galar, M.¹ Fernandez, A.² Barrenechea, E.³ Bustince, H.⁴ Herrera, F.⁵

10
- 80053456360
- Online discovery of feature dependencies
- Geramifard, Alborz, Doshi, Finale, Redding, Joshua, Roy, Nicholas, and How, Jonathan. Online discovery of feature dependencies. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 881–888, 2011.
- (2011) Proceedings of the 28th International Conference on Machine Learning (ICML-11) , pp. 881-888
- Geramifard, A.¹ Doshi, F.² Redding, J.³ Roy, N.⁴ How, J.⁵

11
- 84937779024
- Deep learning for real-time atari game play using offline Monte-carlo tree search planning
- Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., and Weinberger, K.Q. (eds.), Curran Associates, Inc.
- Guo, Xiaoxiao, Singh, Satinder, Lee, Honglak, Lewis, Richard L, and Wang, Xiaoshi. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning. In Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., and Weinberger, K.Q. (eds.), Advances in Neural Information Processing Systems 27, pp. 3338–3346. Curran Associates, Inc., 2014.
- (2014) Advances in Neural Information Processing Systems , vol.27 , pp. 3338-3346
- Guo, X.¹ Singh, S.² Lee, H.³ Lewis, R.L.⁴ Wang, X.⁵

12
- 34848816179
- To recognize shapes, first learn to generate images
- Hinton, Geoffrey E. To recognize shapes, first learn to generate images. Progress in brain research, 165:535–547, 2007.
- (2007) Progress in Brain Research , vol.165 , pp. 535-547
- Hinton, G.E.¹

13
- 85083951076
- Adam: A method for stochastic optimization
- Kingma, Diederik P. and Ba, Jimmy. Adam: A method for stochastic optimization. CoRR, abs/1412.6980, 2014.
- (2014) CoRR
- Kingma, D.P.¹ Ba, J.²

14
- 0032203257
- Gradient-based learning applied to document recognition
- Nov
- Lecun, Y., Bottou, L., Bengio, Y., and Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, Nov 1998. ISSN 0018-9219. doi: 10.1109/5.726791.
- (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2278-2324
- Lecun, Y.¹ Bottou, L.² Bengio, Y.³ Haffner, P.⁴

15
- 6344235947
- LeCun, Yann, Cortes, Corinna, and Burges, Christopher JC. The MNIST database of handwritten digits, 1998.
- (1998) The MNIST Database of Handwritten Digits
- LeCun, Y.¹ Cortes, C.² Burges, C.J.C.³

16
- 0000123778
- Self-improving reactive agents based on reinforcement learning, planning and teaching
- Lin, Long-Ji. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine learning, 8(3-4):293–321, 1992.
- (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 293-321
- Lin, L.-J.¹

17
- 84937883130
- Weighted importance sampling for off-policy learning with linear function approximation
- Mahmood, A Rupam, van Hasselt, Hado P, and Sutton, Richard S. Weighted importance sampling for off-policy learning with linear function approximation. In Advances in Neural Information Processing Systems, pp. 3014–3022, 2014.
- (2014) Advances in Neural Information Processing Systems , pp. 3014-3022
- Mahmood, A.R.¹ Van Hasselt, H.P.² Sutton, R.S.³

18
- 84947899563
- Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence
- McNamara, Colin G, Tejero-Cantero, Álvaro, Trouche, Stéphanie, Campo-Urriza, Natalia, and Dupret, David. Dopaminergic neurons promote hippocampal reactivation and spatial memory persistence. Nature neuroscience, 2014.
- (2014) Nature Neuroscience
- McNamara, C.G.¹ Tejero-Cantero, Á.² Trouche, S.³ Campo-Urriza, N.⁴ Dupret, D.⁵

19
- 84904867557
- arXiv preprint
- Mnih, Volodymyr, Kavukcuoglu, Koray, Silver, David, Graves, Alex, Antonoglou, Ioannis, Wierstra, Daan, and Riedmiller, Martin. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013.
- (2013) Playing Atari with Deep Reinforcement Learning
- Mnih, V.¹ Kavukcuoglu, K.² Silver, D.³ Graves, A.⁴ Antonoglou, I.⁵ Wierstra, D.⁶ Riedmiller, M.⁷

20
- 84924051598
- Human-level control through deep reinforcement learning
- Mnih, Volodymyr, Kavukcuoglu, Koray, Silver, David, Rusu, Andrei A, Veness, Joel, Bellemare, Marc G, Graves, Alex, Riedmiller, Martin, Fidjeland, Andreas K, Ostrovski, Georg, Petersen, Stig, Beattie, Charles, Sadik, Amir, Antonoglou, Ioannis, King, Helen, Kumaran, Dharshan, Wierstra, Daan, Legg, Shane, and Hassabis, Demis. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
- (2015) Nature , vol.518 , Issue.7540 , pp. 529-533
- Mnih, V.¹ Kavukcuoglu, K.² Silver, D.³ Rusu, A.A.⁴ Veness, J.⁵ Bellemare, M.G.⁶ Graves, A.⁷ Riedmiller, M.⁸ Fidjeland, A.K.⁹ Ostrovski, G.¹⁰ Petersen, S.¹¹ Beattie, C.¹² Sadik, A.¹³ Antonoglou, I.¹⁴ King, H.¹⁵ Kumaran, D.¹⁶ Wierstra, D.¹⁷ Legg, S.¹⁸ Hassabis, D.¹⁹

21
- 0027684215
- Prioritized sweeping: Reinforcement learning with less data and less time
- Moore, Andrew W and Atkeson, Christopher G. Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning, 13(1):103–130, 1993.
- (1993) Machine Learning , vol.13 , Issue.1 , pp. 103-130
- Moore, A.W.¹ Atkeson, C.G.²

22
- 84980007683
- arXiv preprint
- Nair, Arun, Srinivasan, Praveen, Blackwell, Sam, Alcicek, Cagdas, Fearon, Rory, Maria, Alessandro De, Panneershelvam, Vedavyas, Suleyman, Mustafa, Beattie, Charles, Petersen, Stig, Legg, Shane, Mnih, Volodymyr, Kavukcuoglu, Koray, and Silver, David. Massively parallel methods for deep reinforcement learning. arXiv preprint arXiv:1507.04296, 2015.
- (2015) Massively Parallel Methods for Deep Reinforcement Learning
- Nair, A.¹ Srinivasan, P.² Blackwell, S.³ Alcicek, C.⁴ Fearon, R.⁵ De Maria, A.⁶ Panneershelvam, V.⁷ Suleyman, M.⁸ Beattie, C.⁹ Petersen, S.¹⁰ Legg, S.¹¹ Mnih, V.¹² Kavukcuoglu, K.¹³ Silver, D.¹⁴

23
- 84959861546
- Language understanding for text-based games using deep reinforcement learning
- Narasimhan, Karthik, Kulkarni, Tejas, and Barzilay, Regina. Language understanding for text-based games using deep reinforcement learning. In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015.
- (2015) Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Narasimhan, K.¹ Kulkarni, T.² Barzilay, R.³

24
- 84937060789
- Hippocampal place cells construct reward related sequences through unexplored space
- Ólafsdóttir, H Freyja, Barry, Caswell, Saleem, Aman B, Hassabis, Demis, and Spiers, Hugo J. Hippocampal place cells construct reward related sequences through unexplored space. Elife, 4: e06063, 2015.
- (2015) Elife , vol.4
- Ólafsdóttir, H.F.¹ Barry, C.² Saleem, A.B.³ Hassabis, D.⁴ Spiers, H.J.⁵

25
- 0004109478
- Riedmiller, Martin. Rprop-description and implementation details. 1994.
- (1994) Rprop-Description and Implementation Details
- Riedmiller, M.¹

26
- 0031082536
- New methods for competitive coevolution
- Rosin, Christopher D and Belew, Richard K. New methods for competitive coevolution. Evolutionary Computation, 5(1):1–29, 1997.
- (1997) Evolutionary Computation , vol.5 , Issue.1 , pp. 1-29
- Rosin, C.D.¹ Belew, R.K.²

27
- 84897487847
- No more pesky learning rates
- Schaul, Tom, Zhang, Sixin, and Lecun, Yann. No more pesky learning rates. In Proceedings of the 30th International Conference on Machine Learning (ICML-13), pp. 343–351, 2013.
- (2013) Proceedings of the 30th International Conference on Machine Learning (ICML-13) , pp. 343-351
- Schaul, T.¹ Zhang, S.² Lecun, Y.³

28
- 0026306990
- Curious model-building control systems
- Schmidhuber, Jürgen. Curious model-building control systems. In Neural Networks, 1991. 1991 IEEE International Joint Conference on, pp. 1458–1463. IEEE, 1991.
- (1991) Neural Networks, 1991. 1991 IEEE International Joint Conference on , pp. 1458-1463
- Schmidhuber, J.¹

29
- 72149101860
- Rewarded outcomes enhance reactivation of experience in the hippocampus
- Singer, Annabelle C and Frank, Loren M. Rewarded outcomes enhance reactivation of experience in the hippocampus. Neuron, 64(6):910–921, 2009.
- (2009) Neuron , vol.64 , Issue.6 , pp. 910-921
- Singer, A.C.¹ Frank, L.M.²

30
- 84959023524
- arXiv preprint
- Stadie, Bradly C, Levine, Sergey, and Abbeel, Pieter. Incentivizing exploration in reinforcement learning with deep predictive models. arXiv preprint arXiv:1507.00814, 2015.
- (2015) Incentivizing Exploration in Reinforcement Learning with Deep Predictive Models
- Stadie, B.C.¹ Levine, S.² Abbeel, P.³

31
- 80053457849
- Incremental basis construction from temporal difference error
- Sun, Yi, Ring, Mark, Schmidhuber, Jürgen, and Gomez, Faustino J. Incremental basis construction from temporal difference error. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 481–488, 2011.
- (2011) Proceedings of the 28th International Conference on Machine Learning (ICML-11) , pp. 481-488
- Sun, Y.¹ Ring, M.² Schmidhuber, J.³ Gomez, F.J.⁴

32
- 85161998941
- Double Q-learning
- van Hasselt, Hado. Double Q-learning. In Advances in Neural Information Processing Systems, pp. 2613–2621, 2010.
- (2010) Advances in Neural Information Processing Systems , pp. 2613-2621
- Van Hasselt, H.¹

33
- 85007210890
- Deep Reinforcement Learning with Double Q-learning
- van Hasselt, Hado, Guez, Arthur, and Silver, David. Deep Reinforcement Learning with Double Q-learning. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016. URL http://arxiv.org/abs/1509.06461.
- (2016) Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence
- Van Hasselt, H.¹ Guez, A.² Silver, D.³

34
- 85062336054
- Planning by prioritized sweeping with small backups
- van Seijen, Harm and Sutton, Richard. Planning by prioritized sweeping with small backups. In Proceedings of The 30th International Conference on Machine Learning, pp. 361–369, 2013.
- (2013) Proceedings of the 30th International Conference on Machine Learning , pp. 361-369
- Van Seijen, H.¹ Sutton, R.²

35
- 84998595997
- Technical report
- Wang, Z., de Freitas, N., and Lanctot, M. Dueling network architectures for deep reinforcement learning. Technical report, 2015. URL http://arxiv.org/abs/1511.06581.
- (2015) Dueling Network Architectures for Deep Reinforcement Learning
- Wang, Z.¹ De Freitas, N.² Lanctot, M.³

36
- 34249833101
- Q-learning
- Watkins, Christopher JCH and Dayan, Peter. Q-learning. Machine learning, 8(3-4):279–292, 1992.
- (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

37
- 84975055760
- Surprise and curiosity for big data robotics
- White, Adam, Modayil, Joseph, and Sutton, Richard S. Surprise and curiosity for big data robotics. In Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014.
- (2014) Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence
- White, A.¹ Modayil, J.² Sutton, R.S.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.