Volume 4, 2016, Pages 2850-2869

Asynchronous methods for deep reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL INTELLIGENCE; LEARNING ALGORITHMS; REINFORCEMENT LEARNING;

EID: 84999036937     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 2539

References (30)
  • 5
    • EID: 84869424969
    • Degris, Thomas, Pilarski, Patrick M., and Sutton, Richard S. Model-free reinforcement learning with continuous action in practice. In American Control Conference (ACC), 2012, pp. 2177-2182. IEEE, 2012.
  • 9
    • EID: 84861705660
    • Li, Yuxi and Schuurmans, Dale. MapReduce for parallel reinforcement learning. In Recent Advances in Reinforcement Learning - 9th European Workshop, EWRL 2011, Athens, Greece, September 9-11, 2011, Revised Selected Papers, pp. 309-320, 2011.
  • 13
    • EID: 0000955979
    • Peng, Jing and Williams, Ronald J. Incremental multi-step Q-learning. Machine Learning, 22(1-3):283-290, 1996.
  • 14
    • EID: 85162467517
    • Recht, Benjamin, Re, Christopher, Wright, Stephen, and Niu, Feng. Hogwild!: A lock-free approach to parallelizing stochastic gradient descent. In Advances in Neural Information Processing Systems, pp. 693-701, 2011.
  • 15
    • EID: 33646398129
    • Riedmiller, Martin. Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method. In Machine Learning: ECML 2005, pp. 317-328. Springer Berlin Heidelberg, 2005.
  • 21
    • EID: 84893343292
    • Tieleman, Tijmen and Hinton, Geoffrey. Lecture 6.5 - rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4, 2012. (A minimal sketch of this update rule appears after the reference list.)
  • 28
    • EID: 0000337576
    • Williams, R.J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3):229-256, 1992.
  • 29
    • EID: 0041154467
    • Williams, Ronald J. and Peng, Jing. Function optimization using connectionist reinforcement learning algorithms. Connection Science, 3(3):241-268, 1991.
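
The title of reference 21 states the RMSProp rule in words: divide the gradient by a running average of its recent magnitude. Below is a minimal Python sketch of that rule as commonly understood; the function name and hyperparameter values (lr, decay, eps) are illustrative assumptions, not taken from the cited lecture or from the indexed paper.

    import numpy as np

    def rmsprop_step(param, grad, mean_sq, lr=0.001, decay=0.9, eps=1e-8):
        """One RMSProp update; mean_sq is the running average of grad**2."""
        # Update the running average of the squared gradient.
        mean_sq = decay * mean_sq + (1.0 - decay) * grad ** 2
        # Scale the step by the gradient's recent magnitude.
        param = param - lr * grad / (np.sqrt(mean_sq) + eps)
        return param, mean_sq

    # Usage: minimize f(x) = x^2 starting from x = 5.0 (gradient is 2x).
    x, ms = 5.0, 0.0
    for _ in range(2000):
        x, ms = rmsprop_step(x, 2.0 * x, ms, lr=0.01)
    print(round(x, 4))  # ends close to 0 (within roughly lr of the minimum)

Because the step size is normalized by the recent gradient magnitude, updates stay roughly lr in size regardless of the raw gradient scale, which is why the final oscillation around the minimum is on the order of lr.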


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.