[1] M. Abu-Khalaf and F. L. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, 2005.
[2] S. Amari, "Natural gradient works efficiently in learning," Neural Comput., vol. 10, no. 2, pp. 251-276, Feb. 1998.
[3] C. W. Anderson, "Strategy learning with multilayer connectionist representations," in Proc. 4th Int. Workshop Mach. Learn., 1987, pp. 103-114.
[4] B. Ans and S. Rousset, "Avoiding catastrophic forgetting by coupling two reverberating neural networks," C. R. Acad. Sci., Sér. III, Sci. Vie, vol. 320, no. 12, pp. 989-997, 1997.
[5] L. C. Baird, "Advantage updating," Wright Lab., Wright-Patterson Air Force Base, Dayton, OH, Tech. Rep. WL-TR-93-1146, 1993.
[6] A. R. Barron, "Universal approximation bounds for superpositions of a sigmoidal function," IEEE Trans. Inf. Theory, vol. 39, no. 3, pp. 930-945, May 1993.
[7] R. Bellman, I. Glicksberg, and O. Gross, "On the bang-bang control problem," Q. Appl. Math., vol. 14, no. 1, pp. 11-18, 1956.
[11] R. Coulom, "Reinforcement learning using neural networks, with applications to motor control," Ph.D. dissertation, Inst. Nat. Polytech. Grenoble, Grenoble, France, 2002.
[12] R. Coulom, "High-accuracy value-function approximation with neural networks applied to the acrobot," in Proc. Eur. Symp. Artif. Neural Netw., M. Verleysen, Ed., Bruges, Belgium, 2004.
[13] K. Doya, "Temporal difference learning in continuous time and space," in Advances in Neural Information Processing Systems, vol. 8, D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, Eds. Cambridge, MA: MIT Press, 1996, pp. 1073-1079.
[14] K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol. 12, no. 1, pp. 219-245, Jan. 2000.
[15] R. M. French, "Pseudo-recurrent connectionist networks: An approach to the 'sensitivity-stability' dilemma," Connect. Sci., vol. 9, no. 4, pp. 353-380, Dec. 1997.
[16] G. J. Gordon, "Stable function approximation in dynamic programming," in Proc. 12th Int. Conf. Mach. Learn., A. Prieditis and S. Russell, Eds., San Francisco, CA, 1995.
[17] L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey," J. Artif. Intell. Res., vol. 4, pp. 237-285, 1996.
[18] K. Levenberg, "A method for the solution of certain non-linear problems in least squares," Q. Appl. Math., vol. 2, no. 2, pp. 164-168, 1944.
[19] D. W. Marquardt, "An algorithm for least-squares estimation of nonlinear parameters," J. Soc. Ind. Appl. Math., vol. 11, no. 2, pp. 431-441, Jun. 1963.
[20] M. F. Moller, "A scaled conjugate gradient algorithm for fast supervised learning," Neural Netw., vol. 6, no. 4, pp. 525-533, 1993.
[21] R. Munos, L. C. Baird, and A. W. Moore, "Gradient descent approaches to neural-net-based solutions of the Hamilton-Jacobi-Bellman equation," in Proc. Int. Joint Conf. Neural Netw., 1999, pp. 2152-2157.
[22] R. Neuneier and H.-G. Zimmermann, "How to train neural networks," in Neural Networks: Tricks of the Trade, G. B. Orr and K.-R. Müller, Eds. New York: Springer-Verlag, 1998.
[23] J. K. Peterson, "On-line estimation of the optimal value function: HJB-estimates," in Advances in Neural Information Processing Systems, vol. 5, C. L. Giles, S. J. Hanson, and J. D. Cowan, Eds. San Mateo, CA: Morgan Kaufmann, 1993, pp. 319-326.
[24] M. Riedmiller and H. Braun, "A direct adaptive method for faster backpropagation learning: The RPROP algorithm," in Proc. IEEE Int. Conf. Neural Netw., 1993, pp. 586-591.
[25] A. Robins, "Catastrophic forgetting, rehearsal, and pseudorehearsal," Connect. Sci., vol. 7, no. 2, pp. 123-146, Jun. 1995.
[26] W. Schultz, P. Dayan, and P. R. Montague, "A neural substrate of prediction and reward," Science, vol. 275, no. 5306, pp. 1593-1599, Mar. 1997.
[27] R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, no. 1, pp. 9-44, Aug. 1988.
[29] R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," in Proc. Advances Neural Inf. Process. Syst., 2000, vol. 12, pp. 1057-1063.
[30] G. Tesauro, "Temporal difference learning and TD-Gammon," Commun. ACM, vol. 38, no. 3, pp. 58-68, Mar. 1995.
[31] E. Todorov and M. Jordan, "Optimal feedback control as a theory of motor coordination," Nat. Neurosci., vol. 5, no. 11, pp. 1226-1235, Nov. 2002.