-
2
-
-
0036832951
-
A sparse sampling algorithm for near-optimal planning in large Markov decision processes
-
DOI 10.1023/A:1017932429737
-
M. J. Kearns, Y. Mansour, and A. Y. Ng, "A sparse sampling algorithm for near-optimal planning in large Markov decision processes," Machine Learning, vol. 49, no. 2-3, pp. 193-208, 2002. (Pubitemid 34325686)
-
(2002)
Machine Learning
, vol.49
, Issue.2-3
, pp. 193-208
-
-
Kearns, M.1
Mansour, Y.2
Ng, A.Y.3
-
3
-
-
80052243319
-
Online resolution techniques
-
O. Sigaud and O. Buffet, Eds. Wiley, ch. 6
-
L. Péret and F. Garcia, "Online resolution techniques," in Markov Decision Processes in Artificial Intelligence, O. Sigaud and O. Buffet, Eds. Wiley, 2010, ch. 6, pp. 153-183.
-
(2010)
Markov Decision Processes in Artificial Intelligence
, pp. 153-183
-
-
Péret, L.1
Garcia, F.2
-
4
-
-
58449098161
-
Lazy planning under uncertainties by optimizing decisions on an ensemble of incomplete disturbance trees
-
S. Girgin, M. Loth, R. Munos, P. Preux, and D. Ryabko, Eds. Springer
-
B. Defourny, D. Ernst, and L. Wehenkel, "Lazy planning under uncertainties by optimizing decisions on an ensemble of incomplete disturbance trees," in Recent Advances in Reinforcement Learning, ser. Lecture Notes in Computer Science, S. Girgin, M. Loth, R. Munos, P. Preux, and D. Ryabko, Eds. Springer, 2008, vol. 5323, pp. 1-14.
-
(2008)
Recent Advances in Reinforcement Learning, Ser. Lecture Notes in Computer Science
, vol.5323
, pp. 1-14
-
-
Defourny, B.1
Ernst, D.2
Wehenkel, L.3
-
7
-
-
0036568025
-
Finite-time analysis of the multiarmed bandit problem
-
DOI 10.1023/A:1013689704352, Computational Learning Theory
-
P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite-time analysis of the multiarmed bandit problem," Machine Learning, vol. 47, no. 2-3, pp. 235-256, 2002. (Pubitemid 34126111)
-
(2002)
Machine Learning
, vol.47
, Issue.2-3
, pp. 235-256
-
-
Auer, P.1
Cesa-Bianchi, N.2
Fischer, P.3
-
8
-
-
70349275222
-
Bandit algorithms for tree search
-
Vancouver, Canada 19-22 July
-
P.-A. Coquelin and R. Munos, "Bandit algorithms for tree search," in Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence (UAI-07), Vancouver, Canada, 19-22 July 2007, pp. 67-74.
-
(2007)
Proceedings of the 23rd Conference on Uncertainty in Artificial Intelligence (UAI-07)
, pp. 67-74
-
-
Coquelin, P.-A.1
Munos, R.2
-
9
-
-
77952027689
-
Online optimization in X-armed bandits
-
D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, Eds. MIT Press
-
S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári, "Online optimization in X-armed bandits," in Advances in Neural Information Processing Systems 21, D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, Eds. MIT Press, 2009, pp. 201-208.
-
(2009)
Advances in Neural Information Processing Systems
, vol.21
, pp. 201-208
-
-
Bubeck, S.1
Munos, R.2
Stoltz, G.3
Szepesvári, C.4
-
12
-
-
58449106591
-
Optimistic planning of deterministic systems
-
Villeneuve d'Ascq, France, 30 June-3 July
-
J.-F. Hren and R. Munos, "Optimistic planning of deterministic systems," in Proceedings 8th European Workshop on Reinforcement Learning (EWRL-08), Villeneuve d'Ascq, France, 30 June-3 July 2008, pp. 151-164.
-
(2008)
Proceedings 8th European Workshop on Reinforcement Learning (EWRL-08)
, pp. 151-164
-
-
Hren, J.-F.1
Munos, R.2
-
13
-
-
67650469377
-
Planning under uncertainty, ensembles of disturbance trees and kernelized discrete action spaces
-
Nashville, US, 30 March-2 April 2009
-
B. Defourny, D. Ernst, and L. Wehenkel, "Planning under uncertainty, ensembles of disturbance trees and kernelized discrete action spaces," in Proceedings 2009 IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-09), Nashville, US, 30 March-2 April 2009, pp. 145-152.
-
Proceedings 2009 IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-09)
, pp. 145-152
-
-
Defourny, B.1
Ernst, D.2
Wehenkel, L.3
-
14
-
-
84888141227
-
Open loop optimistic planning
-
Haifa, Israel 27-29 June
-
S. Bubeck and R. Munos, "Open loop optimistic planning," in Proceedings 23rd Annual Conference on Learning Theory (COLT-10), Haifa, Israel, 27-29 June 2010, pp. 477-489.
-
(2010)
Proceedings 23rd Annual Conference on Learning Theory (COLT-10)
, pp. 477-489
-
-
Bubeck, S.1
Munos, R.2
-
19
-
-
77955814101
-
Reinforcement learning and dynamic programming using function approximators
-
Taylor & Francis CRC Press
-
L. Buşoniu, R. Babuška, B. De Schutter, and D. Ernst, Reinforcement Learning and Dynamic Programming Using Function Approximators, ser. Automation and Control Engineering. Taylor & Francis CRC Press, 2010.
-
(2010)
Automation and Control Engineering
-
-
Buşoniu, L.1
Babuška, R.2
De Schutter, B.3
Ernst, D.4
-
20
-
-
77950867376
-
Approximate dynamic programming with a fuzzy parameterization
-
L. Buşoniu, D. Ernst, B. De Schutter, and R. Babuška, "Approximate dynamic programming with a fuzzy parameterization," Automatica, vol. 46, no. 5, pp. 804-814, 2010.
-
(2010)
Automatica
, vol.46
, Issue.5
, pp. 804-814
-
-
Buşoniu, L.1
Ernst, D.2
De Schutter, B.3
Babuška, R.4
-
21
-
-
28544448294
-
Dynamic multidrug therapies for HIV: Optimal and STI control approaches
-
B. Adams, H. Banks, H.-D. Kwon, and H. Tran, "Dynamic multidrug therapies for HIV: Optimal and STI control approaches," Mathematical Biosciences and Engineering, vol. 1, no. 2, pp. 223-241, 2004.
-
(2004)
Mathematical Biosciences and Engineering
, vol.1
, Issue.2
, pp. 223-241
-
-
Adams, B.1
Banks, H.2
Kwon, H.-D.3
Tran, H.4
-
22
-
-
0033609174
-
Control of HIV despite the discontinuation of antiretroviral therapy
-
DOI 10.1056/NEJM199905273402114
-
J. Lisziewicz, E. Rosenberg, and J. Lieberman, "Control of HIV despite the discontinuation of antiretroviral therapy," New England Journal of Medicine, vol. 340, no. 21, pp. 1683-1684, 1999. (Pubitemid 29249442)
-
(1999)
New England Journal of Medicine
, vol.340
, Issue.21
, pp. 1683-1684
-
-
Lisziewicz, J.1
Rosenberg, E.2
Lieberman, J.3
Jessen, H.4
Lopalco, L.5
Siliciano, R.6
Walker, B.7
Lori, F.8
-
23
-
-
39649096058
-
Clinical data based optimal STI strategies for HIV: A reinforcement learning approach
-
4177178, Proceedings of the 45th IEEE Conference on Decision and Control 2006, CDC
-
D. Ernst, G.-B. Stan, J. Gonçalves, and L. Wehenkel, "Clinical data based optimal STI strategies for HIV: A reinforcement learning approach," in Proceedings 45th IEEE Conference on Decision & Control, San Diego, US, 13-15 December 2006, pp. 667-672. (Pubitemid 351283311)
-
(2006)
Proceedings of the IEEE Conference on Decision and Control
, pp. 667-672
-
-
Ernst, D.1
Stan, G.-B.2
Gonçalves, J.3
Wehenkel, L.4
-
24
-
-
79551686776
-
Cross-entropy optimization of control policies with adaptive basis functions
-
accepted for publication, available online
-
L. Buşoniu, D. Ernst, B. De Schutter, and R. Babuška, "Cross-entropy optimization of control policies with adaptive basis functions," IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, vol. 41, no. 1, 2011, accepted for publication, available online.
-
(2011)
IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics
, vol.41
, Issue.1
-
-
Buşoniu, L.1
Ernst, D.2
De Schutter, B.3
Babuška, R.4