-
1
-
-
0001130234
-
A trust region method based on interior point techniques for nonlinear programming
-
Byrd, R.H., Gilbert, J.C., Nocedal, J.: A trust region method based on interior point techniques for nonlinear programming. Mathematical Programming 89, 149-185 (2000)
-
(2000)
Mathematical Programming
, vol.89
, pp. 149-185
-
-
Byrd, R.H.1
Gilbert, J.C.2
Nocedal, J.3
-
2
-
-
84861697773
-
Reinforcement Learning with a Bilinear Q Function
-
Sanner, S., Hutter, M. (eds.) EWRL 2011. Springer, Heidelberg
-
Elkan, C.: Reinforcement Learning with a Bilinear Q Function. In: Sanner, S., Hutter, M. (eds.) EWRL 2011. LNCS, vol. 7188, pp. 78-88. Springer, Heidelberg (2012)
-
(2012)
LNCS
, vol.7188
, pp. 78-88
-
-
Elkan, C.1
-
3
-
-
21844465127
-
Tree-based batch mode reinforcement learning
-
Ernst, D., Geurts, P., Wehenkel, L.: Tree-based batch mode reinforcement learning. Journal of Machine Learning Research 6(1), 503-556 (2005)
-
(2005)
Journal of Machine Learning Research
, vol.6
, Issue.1
, pp. 503-556
-
-
Ernst, D.1
Geurts, P.2
Wehenkel, L.3
-
6
-
-
0001455103
-
Comments on the origin and application of Markov decision processes
-
Howard, R.A.: Comments on the origin and application of Markov decision processes. Management Science 14(7), 503-507 (1968)
-
(1968)
Management Science
, vol.14
, Issue.7
, pp. 503-507
-
-
Howard, R.A.1
-
8
-
-
4644323293
-
Least-squares policy iteration
-
Lagoudakis, M.G., Parr, R.: Least-squares policy iteration. Journal of Machine Learning Research 4, 1107-1149 (2003)
-
(2003)
Journal of Machine Learning Research
, vol.4
, pp. 1107-1149
-
-
Lagoudakis, M.G.1
Parr, R.2
-
9
-
-
35748957806
-
Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes
-
Mahadevan, S., Maggioni, M.: Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes. Journal of Machine Learning Research, 2169-2231 (2007)
-
(2007)
Journal of Machine Learning Research
, pp. 2169-2231
-
-
Mahadevan, S.1
Maggioni, M.2
-
10
-
-
56049095326
-
Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs
-
Daelemans, W., Goethals, B., Morik, K. (eds.) ECML PKDD 2008, Part II. Springer, Heidelberg
-
Melo, F.S., Lopes, M.: Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs. In: Daelemans, W., Goethals, B., Morik, K. (eds.) ECML PKDD 2008, Part II. LNCS (LNAI), vol. 5212, pp. 66-81. Springer, Heidelberg (2008)
-
(2008)
LNCS (LNAI)
, vol.5212
, pp. 66-81
-
-
Melo, F.S.1
Lopes, M.2
-
11
-
-
17444414191
-
Basis function adaptation in temporal difference reinforcement learning
-
Menache, I., Mannor, S., Shimkin, N.: Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research 134(1), 215-238 (2005)
-
(2005)
Annals of Operations Research
, vol.134
, Issue.1
, pp. 215-238
-
-
Menache, I.1
Mannor, S.2
Shimkin, N.3
-
12
-
-
56449092660
-
An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning
-
Parr, R., Li, L., Taylor, G., Painter-Wakefield, C., Littman, M.: An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning. In: Proceedings of the 25th International Conference on Machine Learning (ICML), pp. 752-759 (2008)
-
(2008)
Proceedings of the 25th International Conference on Machine Learning (ICML)
, pp. 752-759
-
-
Parr, R.1
Li, L.2
Taylor, G.3
Painter-Wakefield, C.4
Littman, M.5
-
13
-
-
77949361320
-
Merging AI and OR to solve high-dimensional stochastic optimization problems using approximate dynamic programming
-
Powell, W.B.: Merging AI and OR to solve high-dimensional stochastic optimization problems using approximate dynamic programming. INFORMS Journal on Computing 22(1), 2-17 (2010)
-
(2010)
INFORMS Journal on Computing
, vol.22
, Issue.1
, pp. 2-17
-
-
Powell, W.B.1
-
17
-
-
77949360515
-
Commentary - Perspectives on stochastic optimization over time
-
Tsitsiklis, J.N.: Commentary - perspectives on stochastic optimization over time. INFORMS Journal on Computing 22(1), 18-19 (2010)
-
(2010)
INFORMS Journal on Computing
, vol.22
, Issue.1
, pp. 18-19
-
-
Tsitsiklis, J.N.1
-
19
-
-
0030082891
-
An approach to fuzzy control of nonlinear systems: Stability and design issues
-
Wang, H.O., Tanaka, K., Griffin, M.F.: An approach to fuzzy control of nonlinear systems: stability and design issues. IEEE Transactions on Fuzzy Systems 4(1), 14-23 (1996)
-
(1996)
IEEE Transactions on Fuzzy Systems
, vol.4
, Issue.1
, pp. 14-23
-
-
Wang, H.O.1
Tanaka, K.2
Griffin, M.F.3