Volume 518, Issue 7540, 2015, Pages 529-533

Human-level control through deep reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHM; ARTIFICIAL INTELLIGENCE; BEHAVIORAL ECOLOGY; GAME THEORY; HARMONIC ANALYSIS; HIERARCHICAL SYSTEM; NEUROLOGY; PARAMETERIZATION; SENSORY SYSTEM;

EID: 84924051598     PISSN: 00280836     EISSN: 14764687     Source Type: Journal    
DOI: 10.1038/nature14236     Document Type: Article
Times cited : (27782)

References (33)
  • 3
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz, W., Dayan, P. & Montague, P. R. A neural substrate of prediction and reward. Science 275, 1593-1599 (1997).
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 5
    • 0019152630
    • Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position
    • Fukushima, K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36, 193-202 (1980).
    • (1980) Biol. Cybern. , vol.36 , pp. 193-202
    • Fukushima, K.1
  • 6
    • 0029276036
    • Temporal difference learning and TD-Gammon
    • Tesauro, G. Temporal difference learning and TD-Gammon. Commun. ACM 38, 58-68 (1995).
    • (1995) Commun. ACM , vol.38 , pp. 58-68
    • Tesauro, G.1
  • 8
    • 56449093331
    • An object-oriented representation for efficient reinforcement learning
    • Diuk, C., Cohen, A. & Littman, M. L. An object-oriented representation for efficient reinforcement learning. Proc. Int. Conf. Mach. Learn. 240-247 (2008).
    • (2008) Proc. Int. Conf. Mach. Learn. , pp. 240-247
    • Diuk, C.1    Cohen, A.2    Littman, M.L.3
  • 10
  • 11
    • 33746600649
    • Reducing the dimensionality of data with neural networks
    • Hinton, G. E. & Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 504-507 (2006).
    • (2006) Science , vol.313 , pp. 504-507
    • Hinton, G.E.1    Salakhutdinov, R.R.2
  • 12
    • 84879976780
    • The arcade learning environment: An evaluation platform for general agents
    • Bellemare, M. G., Naddaf, Y., Veness, J. & Bowling, M. The arcade learning environment: An evaluation platform for general agents. J. Artif. Intell. Res. 47, 253-279 (2013).
    • (2013) J. Artif. Intell. Res. , vol.47 , pp. 253-279
    • Bellemare, M.G.1    Naddaf, Y.2    Veness, J.3    Bowling, M.4
  • 13
    • 36749026673
    • Universal Intelligence: A definition of machine intelligence
    • Legg, S. & Hutter, M. Universal Intelligence: a definition of machine intelligence. Minds Mach. 17, 391-444 (2007).
    • (2007) Minds Mach. , vol.17 , pp. 391-444
    • Legg, S.1    Hutter, M.2
  • 14
    • 21344445698
    • General game playing: Overview of the AAAI competition
    • Genesereth, M., Love, N. & Pell, B. General game playing: overview of the AAAI competition. AI Mag. 26, 62-72 (2005).
    • (2005) AI Mag. , vol.26 , pp. 62-72
    • Genesereth, M.1    Love, N.2    Pell, B.3
  • 17
    • 0032203257
    • Gradient-based learning applied to document recognition
    • LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278-2324 (1998).
    • (1998) Proc. IEEE , vol.86 , pp. 2278-2324
    • Lecun, Y.1    Bottou, L.2    Bengio, Y.3    Haffner, P.4
  • 18
    • 0345285364
    • Shape and arrangement of columns in cat's striate cortex
    • Hubel, D. H. & Wiesel, T. N. Shape and arrangement of columns in cat's striate cortex. J. Physiol. 165, 559-568 (1963).
    • (1963) J. Physiol. , vol.165 , pp. 559-568
    • Hubel, D.H.1    Wiesel, T.N.2
  • 20
    • 0031143730
    • An analysis of temporal-difference learning with function approximation
    • Tsitsiklis, J. & Roy, B. V. An analysis of temporal-difference learning with function approximation. IEEE Trans. Automat. Contr. 42, 674-690 (1997).
    • (1997) IEEE Trans. Automat. Contr. , vol.42 , pp. 674-690
    • Tsitsiklis, J.1    Roy, B.V.2
  • 21
    • 0029340352
    • Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory
    • McClelland, J. L., McNaughton, B. L. & O'Reilly, R. C. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419-457 (1995).
    • (1995) Psychol. Rev. , vol.102 , pp. 419-457
    • McClelland, J.L.1    McNaughton, B.L.2    O'Reilly, R.C.3
  • 24
    • 33646398129
    • Neural fitted Q iteration - First experiences with a data efficient neural reinforcement learning method
    • Springer
    • Riedmiller, M. Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method. Mach. Learn.: ECML, 3720, 317-328 (Springer, 2005).
    • (2005) Mach. Learn.: ECML , vol.3720 , pp. 317-328
    • Riedmiller, M.1
  • 25
    • 57249084011
    • Visualizing high-dimensional data using t-SNE
    • Van der Maaten, L. J. P. & Hinton, G. E. Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res. 9, 2579-2605 (2008).
    • (2008) J. Mach. Learn. Res. , vol.9 , pp. 2579-2605
    • Van Der Maaten, L.J.P.1    Hinton, G.E.2
  • 27
    • 67349117811
    • Reinforcement learning can account for associative and perceptual learning on a visual decision task
    • Law, C.-T. & Gold, J. I. Reinforcement learning can account for associative and perceptual learning on a visual decision task. Nature Neurosci. 12, 655 (2009).
    • (2009) Nature Neurosci. , vol.12 , pp. 655
    • Law, C.-T.1    Gold, J.I.2
  • 28
    • 0037122807
    • Visual categorization shapes feature selectivity in the primate temporal cortex
    • Sigala, N. & Logothetis, N. K. Visual categorization shapes feature selectivity in the primate temporal cortex. Nature 415, 318-320 (2002).
    • (2002) Nature , vol.415 , pp. 318-320
    • Sigala, N.1    Logothetis, N.K.2
  • 29
    • 84866690401
    • Biasing the content of hippocampal replay during sleep
    • Bendor, D. & Wilson, M. A. Biasing the content of hippocampal replay during sleep. Nature Neurosci. 15, 1439-1444 (2012).
    • (2012) Nature Neurosci. , vol.15 , pp. 1439-1444
    • Bendor, D.1    Wilson, M.A.2
  • 30
    • 0027684215
    • Prioritized sweeping: Reinforcement learning with less data and less real time
    • Moore, A. & Atkeson, C. Prioritized sweeping: reinforcement learning with less data and less real time. Mach. Learn. 13, 103-130 (1993).
    • (1993) Mach. Learn. , vol.13 , pp. 103-130
    • Moore, A.1    Atkeson, C.2
  • 32
    • 77956509090
    • Rectified linear units improve restricted Boltzmann machines
    • Nair, V. & Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. Proc. Int. Conf. Mach. Learn. 807-814 (2010).
    • (2010) Proc. Int. Conf. Mach. Learn. , pp. 807-814
    • Nair, V.1    Hinton, G.E.2
  • 33


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.