-
1
-
-
84967758647
-
Viscosity solutions of Hamilton-Jacobi equations
-
M. Crandall and P. Lions, "Viscosity solutions of Hamilton-Jacobi equations," Trans. Amer. Math. Soc., vol. 277, 1983.
-
(1983)
Trans. Amer. Math. Soc
, vol.277
-
-
Crandall, M.1
Lions, P.2
-
2
-
-
85153940465
-
Generalization in reinforcement learning: Safely approximating the value function
-
G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds. Cambridge, MA: MIT Press
-
J. A. Boyan and A. W. Moore, "Generalization in reinforcement learning: Safely approximating the value function," in Advances in Neural Information Processing Systems 7, G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds. Cambridge, MA: MIT Press, 1995, pp. 369-376.
-
(1995)
Advances in Neural Information Processing Systems 7
, pp. 369-376
-
-
Boyan, J.A.1
Moore, A.W.2
-
3
-
-
0024866495
-
On the approximate realization of continuous mappings by neural networks
-
K.-I. Funahashi, "On the approximate realization of continuous mappings by neural networks," Neural Netw., vol. 2, pp. 183-192, 1989.
-
(1989)
Neural Netw
, vol.2
, pp. 183-192
-
-
Funahashi, K.-I.1
-
4
-
-
14844340822
-
Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
-
M. Abu-Khalaf and F. L. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, 2005.
-
(2005)
Automatica
, vol.41
, Issue.5
, pp. 779-791
-
-
Abu-Khalaf, M.1
Lewis, F.L.2
-
5
-
-
10944228202
-
Reinforcement learning using neural networks, with applications to motor control
-
Ph.D. dissertation, Institut National Polytechnique de Grenoble, Grenoble, France
-
R. Coulom, "Reinforcement learning using neural networks, with applications to motor control," Ph.D. dissertation, Institut National Polytechnique de Grenoble, Grenoble, France, 2002.
-
(2002)
-
-
Coulom, R.1
-
8
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, pp. 9-44, 1988.
-
(1988)
Mach. Learn
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
9
-
-
0030896968
-
A neural substrate of prediction and reward
-
W. Schultz, P. Dayan, and P. R. Montague, "A neural substrate of prediction and reward," Science, vol. 275, pp. 1593-1599, 1997.
-
(1997)
Science
, vol.275
, pp. 1593-1599
-
-
Schultz, W.1
Dayan, P.2
Montague, P.R.3
-
10
-
-
35048819671
-
Least-squares methods in reinforcement learning for control
-
M. G. Lagoudakis, R. E. Parr, and M. L. Littman, "Least-squares methods in reinforcement learning for control," in Proc. 2nd Hellenic Conf. Artif. Intell., 2002, vol. 2308, pp. 249-260.
-
(2002)
Proc. 2nd Hellenic Conf. Artif. Intell
, vol.2308
, pp. 249-260
-
-
Lagoudakis, M.G.1
Parr, R.E.2
Littman, M.L.3
-
11
-
-
84880680664
-
Variable resolution discretization for high-accuracy solutions of optimal control problems
-
R. Munos and A. W. Moore, "Variable resolution discretization for high-accuracy solutions of optimal control problems," in Proc. Int. Joint Conf. Artif. Intell., 1999, pp. 1348-1355.
-
(1999)
Proc. Int. Joint Conf. Artif. Intell
, pp. 1348-1355
-
-
Munos, R.1
Moore, A.W.2
-
12
-
-
0004671869
-
Temporal difference learning in continuous time and space
-
D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, Eds. Cambridge, MA: MIT Press
-
K. Doya, "Temporal difference learning in continuous time and space," in Advances in Neural Information Processing Systems, D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, Eds. Cambridge, MA: MIT Press, 1996, vol. 8.
-
(1996)
Advances in Neural Information Processing Systems
, vol.8
-
-
Doya, K.1
-
13
-
-
2542485629
-
Practical issues in temporal difference learning
-
J. E. Moody, S. J. Hanson, and R. P. Lippmann, Eds. San Mateo, CA: Morgan Kaufmann
-
G. Tesauro, "Practical issues in temporal difference learning," in Advances in Neural Information Processing Systems, J. E. Moody, S. J. Hanson, and R. P. Lippmann, Eds. San Mateo, CA: Morgan Kaufmann, 1992, vol. 4, pp. 259-266.
-
(1992)
Advances in Neural Information Processing Systems
, vol.4
, pp. 259-266
-
-
Tesauro, G.1
-
14
-
-
0033308517
-
Gradient descent approaches to neural net-based solutions of the Hamilton-Jacobi-Bellman equation
-
R. Munos, L. Baird, and A. Moore, "Gradient descent approaches to neural net-based solutions of the Hamilton-Jacobi-Bellman equation," in Proc. Int. Joint Conf. Neural Netw., 1999, pp. 1316-1323.
-
(1999)
Proc. Int. Joint Conf. Neural Netw
, pp. 1316-1323
-
-
Munos, R.1
Baird, L.2
Moore, A.3
-
15
-
-
0003270924
-
Issues in using function approximation for reinforcement learning
-
M. Mozer, P. Smolensky, D. Touretzky, J. Elman, and A. Weigend, Eds
-
S. Thrun and A. Schwartz, "Issues in using function approximation for reinforcement learning," in Proc. 1993 Connectionist Models Summer School, M. Mozer, P. Smolensky, D. Touretzky, J. Elman, and A. Weigend, Eds., 1993, pp. 255-263.
-
(1993)
Proc. 1993 Connectionist Models Summer School
, pp. 255-263
-
-
Thrun, S.1
Schwartz, A.2
-
16
-
-
0032202335
-
Successive Galerkin approximation algorithms for nonlinear optimal and robust control
-
R. Beard and T. McLain, "Successive Galerkin approximation algorithms for nonlinear optimal and robust control," Proc. Int. J. Control: Special Issue Breakthroughs Control Nonlinear Syst., vol. 71, no. 5, pp. 717-743, 1998.
-
(1998)
Proc. Int. J. Control: Special Issue Breakthroughs Control Nonlinear Syst
, vol.71
, Issue.5
, pp. 717-743
-
-
Beard, R.1
McLain, T.2
-
18
-
-
0025399567
-
Identification and control of dynamical systems using neural networks
-
Mar
-
K. S. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks," IEEE Trans. Neural Netw., vol. 1, no. 1, pp. 4-27, Mar. 1990.
-
(1990)
IEEE Trans. Neural Netw
, vol.1
, Issue.1
, pp. 4-27
-
-
Narendra, K.S.1
Parthasarathy, K.2
-
19
-
-
0027594098
-
On the nonlinear optimal regulator problem
-
C. J. Goh, "On the nonlinear optimal regulator problem," Automatica, vol. 29, no. 3, pp. 751-756, 1993.
-
(1993)
Automatica
, vol.29
, Issue.3
, pp. 751-756
-
-
Goh, C.J.1
-
20
-
-
0001440803
-
Tangent prop - A formalism for specifying selected invariances in an adaptive network
-
J. E. Moody, S. J. Hanson, and R. P. Lippmann, Eds. San Mateo, CA: Morgan Kaufmann
-
P. Simard, B. Victorri, Y. LeCun, and J. Denker, "Tangent prop - A formalism for specifying selected invariances in an adaptive network," in Neural Information Processing Systems, J. E. Moody, S. J. Hanson, and R. P. Lippmann, Eds. San Mateo, CA: Morgan Kaufmann, 1992, vol. 4.
-
(1992)
Neural Information Processing Systems
, vol.4
-
-
Simard, P.1
Victorri, B.2
LeCun, Y.3
Denker, J.4
-
21
-
-
0039224634
-
Hybrid learning of mapping and its Jacobian in multilayer neural networks
-
J. W. Lee and J. H. Oh, "Hybrid learning of mapping and its Jacobian in multilayer neural networks," Neural Comput., vol. 9, pp. 937-958, 1997.
-
(1997)
Neural Comput
, vol.9
, pp. 937-958
-
-
Lee, J.W.1
Oh, J.H.2
-
22
-
-
0033699871
-
Neural networks learning differential data
-
R. Masuoka, "Neural networks learning differential data," IEICE Trans. Inf. Syst., vol. E83-D, no. 6, pp. 1291-1300, 2000.
-
(2000)
IEICE Trans. Inf. Syst
, vol.E83-D
, Issue.6
, pp. 1291-1300
-
-
Masuoka, R.1
-
24
-
-
0018441647
-
An approximation theory of optimal control for trainable manipulators
-
Mar
-
G. Saridis and C. S. Lee, "An approximation theory of optimal control for trainable manipulators," IEEE Trans. Syst., Man, Cybern., vol. SMC-9, no. 3, pp. 152-159, Mar. 1979.
-
(1979)
IEEE Trans. Syst., Man, Cybern
, vol.SMC-9
, Issue.3
, pp. 152-159
-
-
Saridis, G.1
Lee, C.S.2
-
25
-
-
0000442791
-
Generalization of back-propagation to recurrent neural networks
-
F. Pineda, "Generalization of back-propagation to recurrent neural networks," Phys. Rev. Lett., vol. 59, no. 19, pp. 2229-2232, 1987.
-
(1987)
Phys. Rev. Lett
, vol.59
, Issue.19
, pp. 2229-2232
-
-
Pineda, F.1
-
27
-
-
0025536870
-
Improving the learning speed of 2-layer neural network by choosing initial values of the adaptive weights
-
D. H. Nguyen and B. Widrow, "Improving the learning speed of 2-layer neural network by choosing initial values of the adaptive weights," in Proc. 1st IEEE Int. Joint Conf. Neural Netw., 1990, vol. 3, pp. 21-26.
-
(1990)
Proc. 1st IEEE Int. Joint Conf. Neural Netw
, vol.3
, pp. 21-26
-
-
Nguyen, D.H.1
Widrow, B.2
-
28
-
-
0002020770
-
On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals
-
J. H. Halton, "On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals," Numerische Mathematik, vol. 2, pp. 84-90, 1960.
-
(1960)
Numerische Mathematik
, vol.2
, pp. 84-90
-
-
Halton, J.H.1
-
29
-
-
0033629916
-
Reinforcement learning in continuous time and space
-
K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol. 12, no. 1, pp. 219-245, 2000.
-
(2000)
Neural Comput
, vol.12
, Issue.1
, pp. 219-245
-
-
Doya, K.1
-
30
-
-
0029200844
-
Control system analysis and design upon the Lyapunov method
-
Jun
-
S. E. Lyshevski and A. U. Meyer, "Control system analysis and design upon the Lyapunov method," in Proc. Amer. Control Conf., Jun. 1995, pp. 3219-3223.
-
(1995)
Proc. Amer. Control Conf
, pp. 3219-3223
-
-
Lyshevski, S.E.1
Meyer, A.U.2
-
31
-
-
84914965022
-
On an iterative technique for Riccati equation computations
-
Feb
-
D. Kleinman, "On an iterative technique for Riccati equation computations," IEEE Trans. Autom. Control, vol. 13, no. 1, pp. 114-115, Feb. 1968.
-
(1968)
IEEE Trans. Autom. Control
, vol.13
, Issue.1
, pp. 114-115
-
-
Kleinman, D.1
-
32
-
-
0029514510
-
The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
-
A. Moore and C. Atkeson, "The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces," Mach. Learn., vol. 21, pp. 1-36, 1995.
-
(1995)
Mach. Learn
, vol.21
, pp. 1-36
-
-
Moore, A.1
Atkeson, C.2
-
33
-
-
0011766779
-
Local gain adaptation in stochastic gradient descent
-
IDSIA, Lugano, Switzerland, Tech. Rep. IDSIA-09-99
-
N. N. Schraudolph, "Local gain adaptation in stochastic gradient descent," IDSIA, Lugano, Switzerland, Tech. Rep. IDSIA-09-99, 1999, p. 8.
-
(1999)
, p. 8
-
-
Schraudolph, N.N.1
-
34
-
-
27844606351
-
Support vector regression for the simultaneous learning of a multivariate function and its derivatives
-
M. Lazaro, I. Santamaria, F. Perez-Cruz, and A. Artes-Rodriguez, "Support vector regression for the simultaneous learning of a multivariate function and its derivatives," Neurocomput., vol. 69, pp. 42-61, 2005.
-
(2005)
Neurocomput
, vol.69
, pp. 42-61
-
-
Lazaro, M.1
Santamaria, I.2
Perez-Cruz, F.3
Artes-Rodriguez, A.4
-
35
-
-
0000255539
-
Fast exact multiplication by the Hessian
-
B. A. Pearlmutter, "Fast exact multiplication by the Hessian," Neural Comput., vol. 6, no. 1, pp. 147-160, 1994.
-
(1994)
Neural Comput
, vol.6
, Issue.1
, pp. 147-160
-
-
Pearlmutter, B.A.1