-
4
-
-
0013495368
-
Experiments with infinite-horizon, policy-gradient estimation
-
Baxter, J., Bartlett, P., Weaver, L.: Experiments with infinite-horizon, policy-gradient estimation. Journal of Artificial Intelligence Research 15, 351-381 (2001)
-
(2001)
Journal of Artificial Intelligence Research
, vol.15
, pp. 351-381
-
-
Baxter, J.1
Bartlett, P.2
Weaver, L.3
-
5
-
-
34250635407
-
Policy gradient methods for robotics
-
Peters, J., Schaal, S.: Policy gradient methods for robotics. In: IROS. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, pp. 2219-2225 (2006)
-
(2006)
IROS. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems
, pp. 2219-2225
-
-
Peters, J.1
Schaal, S.2
-
6
-
-
0000337576
-
Simple statistical gradient-following algorithms for connectionist reinforcement learning
-
Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 229-256 (1992)
-
(1992)
Machine Learning
, vol.8
, pp. 229-256
-
-
Williams, R.J.1
-
7
-
-
0025600638
-
A stochastic reinforcement learning algorithm for learning real-valued functions
-
Gullapalli, V.: A stochastic reinforcement learning algorithm for learning real-valued functions. Neural Networks 3(6), 671-692 (1990)
-
(1990)
Neural Networks
, vol.3
, Issue.6
, pp. 671-692
-
-
Gullapalli, V.1
-
8
-
-
33745327217
-
Fast online policy gradient learning with SMD gain vector adaptation
-
Schraudolph, N., Yu, J., Aberdeen, D.: Fast online policy gradient learning with SMD gain vector adaptation. In: Weiss, Y., Schölkopf, B., Platt, J. (eds.) Advances in Neural Information Processing Systems, vol. 18. MIT Press, Cambridge, MA (2006)
-
(2006)
Advances in Neural Information Processing Systems
, vol.18
-
-
Schraudolph, N.1
Yu, J.2
Aberdeen, D.3
-
9
-
-
33646413135
-
-
Peters, J., Vijayakumar, S., Schaal, S.: Natural actor-critic. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds.) ECML 2005. LNCS (LNAI), vol. 3720, pp. 280-291. Springer, Heidelberg (2005)
-
-
-
-
10
-
-
33750244274
-
-
Sutton, R., McAllester, D., Singh, S., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation (2001)
-
(2001)
Policy gradient methods for reinforcement learning with function approximation
-
-
Sutton, R.1
McAllester, D.2
Singh, S.3
Mansour, Y.4
-
12
-
-
0025503558
-
Back propagation through time: What it does and how to do it
-
Werbos, P.: Back propagation through time: What it does and how to do it. Proceedings of the IEEE 78, 1550-1560 (1990)
-
(1990)
Proceedings of the IEEE
, vol.78
, pp. 1550-1560
-
-
Werbos, P.1
-
13
-
-
2142812536
-
Learning without state-estimation in partially observable Markovian decision processes
-
Singh, S.P., Jaakkola, T., Jordan, M.I.: Learning without state-estimation in partially observable Markovian decision processes. In: International Conference on Machine Learning, pp. 284-292 (1994)
-
(1994)
International Conference on Machine Learning
, pp. 284-292
-
-
Singh, S.P.1
Jaakkola, T.2
Jordan, M.I.3
-
15
-
-
0002103968
-
Learning finite-state controllers for partially observable environments
-
Meuleau, N., Peshkin, L., Kim, K.-E., Kaelbling, L.P.: Learning finite-state controllers for partially observable environments. In: UAI '99. Proc. Fifteenth Conference on Uncertainty in Artificial Intelligence, pp. 427-436. Morgan Kaufmann, San Francisco (1999)
-
(1999)
UAI '99. Proc. Fifteenth Conference on Uncertainty in Artificial Intelligence
, pp. 427-436
-
-
Meuleau, N.1
Peshkin, L.2
Kim, K.-E.3
Kaelbling, L.P.4
-
19
-
-
0041914606
-
Gradient flow in recurrent nets: The difficulty of learning long-term dependencies
-
Hochreiter, S., Bengio, Y., Frasconi, P., Schmidhuber, J.: Gradient flow in recurrent nets: The difficulty of learning long-term dependencies. In: Kremer, S.C., Kolen, J.F. (eds.) A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press, NJ, New York (2001)
-
(2001)
A Field Guide to Dynamical Recurrent Neural Networks
-
-
Hochreiter, S.1
Bengio, Y.2
Frasconi, P.3
Schmidhuber, J.4
-
21
-
-
0026626840
-
Evolving neural network controllers for unstable systems
-
Wieland, A.: Evolving neural network controllers for unstable systems. In: Proceedings of the International Joint Conference on Neural Networks, Seattle, WA, pp. 667-673. IEEE Service Center, Piscataway, NJ (1991)
-
(1991)
Proceedings of the International Joint Conference on Neural Networks
, pp. 667-673
-
-
Wieland, A.1