[2] J. Kober, J. A. Bagnell, and J. Peters, "Reinforcement learning in robotics: A survey," Int. J. Robot. Res., vol. 32, no. 11, pp. 1238-1274, 2013.
[3] S. Mahadevan and J. Connell, "Automatic programming of behavior-based robots using reinforcement learning," Artif. Intell., vol. 55, nos. 2-3, pp. 311-365, Jun. 1992.
[5] J.-L. Lin, K.-S. Hwang, W.-C. Jiang, and Y.-J. Chen, "Gait balance and acceleration of a biped robot based on Q-learning," IEEE Access, vol. 4, pp. 2439-2449, 2016.
[6] K. Mülling, J. Kober, O. Kroemer, and J. Peters, "Learning to select and generalize striking movements in robot table tennis," Int. J. Robot. Res., vol. 32, no. 3, pp. 263-279, 2013.
[7] M. Riedmiller, T. Gabel, R. Hafner, and S. Lange, "Reinforcement learning for robot soccer," Auton. Robots, vol. 27, no. 1, pp. 55-73, 2009.
[8] E. L. Thorndike, "Animal intelligence: An experimental study of the associative processes in animals," Amer. Psychol., vol. 53, no. 10, pp. 1125-1127, 1998.
[9] W. Schultz, P. Dayan, and P. R. Montague, "A neural substrate of prediction and reward," Science, vol. 275, no. 5306, pp. 1593-1599, 1997.
[10] R. Bellman, Dynamic Programming. Princeton, NJ, USA: Princeton Univ. Press, 2010.
[11] L. Deng and D. Yu, "Deep learning: Methods and applications," Found. Trends Signal Process., vol. 7, nos. 3-4, pp. 197-387, Jun. 2014.
[12] L. Fei-Fei, J. Deng, and K. Li, "ImageNet: Constructing a large-scale image database," J. Vis., vol. 9, no. 8, p. 1037, 2009.
[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Adv. Neural Inf. Process. Syst., Dec. 2012, pp. 1097-1105.
[15] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei, "Large-scale video classification with convolutional neural networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, pp. 1725-1732.
[16] O. Abdel-Hamid, A. Mohamed, H. Jiang, and G. Penn, "Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition," in Proc. IEEE Conf. Acoust. Speech Signal Process., Mar. 2012, pp. 4277-4280.
[17] M.-T. Luong, M. Kayser, and C. D. Manning, "Deep neural language models for machine translation," in Proc. Conf. Comput. Nat. Lang. Learn., Jul. 2015, pp. 305-309.
[18] H. A. Pierson and M. S. Gashler, "Deep learning in robotics: A review of recent research," Adv. Robot., vol. 31, no. 16, pp. 821-835, 2017.
[19] C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, nos. 3-4, pp. 279-292, 1992.
[20] V. Mnih et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529-533, 2015.
[21] M. G. Bellemare, Y. Naddaf, J. Veness, and M. Bowling, "The arcade learning environment: An evaluation platform for general agents," J. Artif. Intell. Res., vol. 47, pp. 253-279, May 2013.
[22] D. Silver et al., "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, no. 7578, pp. 484-489, Jan. 2016.
[24] J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," in Proc. Adv. Neural Inf. Process. Syst., 1997, pp. 1075-1081.
[25] M. G. Bellemare, J. Veness, and M. Bowling, "Investigating contingency awareness using Atari 2600 games," in Proc. AAAI Conf. Artif. Intell., Jul. 2012, pp. 864-871.
[26] M. Campbell, A. J. Hoane, Jr., and F.-H. Hsu, "Deep Blue," Artif. Intell., vol. 134, nos. 1-2, pp. 57-83, Jan. 2002.
[27] R. Coulom, "Efficient selectivity and backup operators in Monte-Carlo tree search," in Proc. Int. Conf. Comput. Games, 2006, pp. 72-83.
[28] C. B. Browne et al., "A survey of Monte Carlo tree search methods," IEEE Trans. Comput. Intell. AI Games, vol. 4, no. 1, pp. 1-43, Mar. 2012.
[31] G. Tesauro, "Temporal difference learning and TD-Gammon," Commun. ACM, vol. 38, no. 3, pp. 58-68, Mar. 1995.
[32] R. M. French, "Catastrophic forgetting in connectionist networks," Trends Cognit. Sci., vol. 3, no. 4, pp. 128-135, 1999.
[33] D. Kumaran, D. Hassabis, and J. L. McClelland, "What learning systems do intelligent agents need? Complementary learning systems theory updated," Trends Cognit. Sci., vol. 20, no. 7, pp. 512-534, 2016.
[34] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929-1958, 2014.
[35] F. Girosi, M. Jones, and T. Poggio, "Regularization theory and neural networks architectures," Neural Comput., vol. 7, no. 2, pp. 219-269, Mar. 1995.
[39] A. A. Rusu et al., "Policy distillation," Nov. 2015. [Online]. Available: https://arxiv.org/abs/1511.06295
[41] H. Yin and S. J. Pan, "Knowledge transfer for deep reinforcement learning with hierarchical experience replay," in Proc. AAAI Conf. Artif. Intell., Jan. 2017, pp. 1640-1646.
[42] J. Kirkpatrick et al., "Overcoming catastrophic forgetting in neural networks," Proc. Nat. Acad. Sci. USA, vol. 114, no. 13, pp. 3521-3526, 2017.
[43] C. Clopath, "Synaptic consolidation: An approach to long-term learning," Cognit. Neurodyn., vol. 6, no. 3, pp. 251-257, Jun. 2012.
[44] H. J. Sussmann, "Uniqueness of the weights for minimal feedforward nets with a given input-output map," Neural Netw., vol. 5, no. 4, pp. 589-593, Jul./Aug. 1992.
[45] R. Olfati-Saber, "Flocking for multi-agent dynamic systems: Algorithms and theory," IEEE Trans. Autom. Control, vol. 51, no. 3, pp. 401-420, Mar. 2006.
[46] M. L. Littman, "Markov games as a framework for multi-agent reinforcement learning," in Proc. Int. Conf. Mach. Learn., 1994, pp. 157-163.
[47] A. Tampuu et al., "Multiagent cooperation and competition with deep reinforcement learning," PLoS ONE, vol. 12, no. 4, p. e0172395, Apr. 2017.
[48] L. Kraemer and B. Banerjee, "Multi-agent reinforcement learning as a rehearsal for decentralized planning," Neurocomputing, vol. 190, pp. 82-94, May 2016.
[49] J. Foerster, Y. A. M. Assael, N. de Freitas, and S. Whiteson, "Learning to communicate with deep multi-agent reinforcement learning," in Proc. Adv. Neural Inf. Process. Syst., 2016, pp. 2137-2145.
[50] S. Sukhbaatar, A. Szlam, and R. Fergus, "Learning multiagent communication with backpropagation," in Proc. Adv. Neural Inf. Process. Syst., 2016, pp. 2244-2252.
[51] H. He, J. Boyd-Graber, K. Kwok, and H. Daumé, III, "Opponent modeling in deep reinforcement learning," in Proc. Int. Conf. Mach. Learn., 2016, pp. 1804-1813.
[55] V. Mnih et al., "Asynchronous methods for deep reinforcement learning," in Proc. Int. Conf. Mach. Learn., 2016, pp. 1928-1937.
[56] M. Hausknecht and P. Stone, "Deep recurrent Q-learning for partially observable MDPs," in Proc. AAAI Symp. Seq. Decis. Mak. Intell. Agents, Nov. 2015. [Online]. Available: https://www.cs.utexas.edu/~pstone/Papers/bib2html/b2hd-SDMIA15-Hausknecht.html
[59] H. van Hasselt, A. Guez, and D. Silver, "Deep reinforcement learning with double Q-learning," in Proc. AAAI Conf. Artif. Intell., 2016, pp. 2094-2100.
[60] A. G. Barto and S. Mahadevan, "Recent advances in hierarchical reinforcement learning," Discrete Event Dyn. Syst., vol. 13, no. 4, pp. 341-379, Oct. 2003.
[61] R. S. Sutton, D. Precup, and S. Singh, "Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning," Artif. Intell., vol. 112, nos. 1-2, pp. 181-211, Aug. 1999.
[62] C. Guestrin, D. Koller, R. Parr, and S. Venkataraman, "Efficient solution algorithms for factored MDPs," J. Artif. Intell. Res., vol. 19, pp. 399-468, Jul./Dec. 2003. [Online]. Available: http://www.jair.org/contents.html
[63] T. G. Dietterich, "Hierarchical reinforcement learning with the MAXQ value function decomposition," J. Artif. Intell. Res., vol. 13, pp. 227-303, Nov. 2000.
[64] N. Chentanez, A. G. Barto, and S. P. Singh, "Intrinsically motivated reinforcement learning," in Proc. Adv. Neural Inf. Process. Syst., 2005, pp. 1281-1288.
[65] T. D. Kulkarni, K. R. Narasimhan, A. Saeedi, and J. B. Tenenbaum, "Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation," in Proc. Adv. Neural Inf. Process. Syst., 2016, pp. 3675-3683.
[66] G. Brockman et al., "OpenAI Gym," 2016. [Online]. Available: https://arxiv.org/abs/1606.01540
[68] J. Bergstra et al., "Theano: Deep learning on GPUs with Python," in Proc. BigLearn Workshop NIPS, 2011.
[69] F. Chollet, "Keras," 2015. [Online]. Available: http://keras.io
[70] F. Pedregosa et al., "Scikit-learn: Machine learning in Python," J. Mach. Learn. Res., vol. 12, pp. 2825-2830, Oct. 2011.
[71] O. Klimov and J. Schulman, "Roboschool," 2017. [Online]. Available: https://blog.openai.com/roboschool/
[73] B. Tanner and A. White, "RL-Glue: Language-independent software for reinforcement-learning experiments," J. Mach. Learn. Res., vol. 10, pp. 2133-2136, Sep. 2009.
[74] A. Makmal, A. A. Melnikov, V. Dunjko, and H. J. Briegel, "Meta-learning within projective simulation," IEEE Access, vol. 4, pp. 2110-2122, 2016.
[75] P. Abbeel and A. Y. Ng, "Inverse reinforcement learning," in Encyclopedia of Machine Learning. New York, NY, USA: Springer, 2011, pp. 554-558.