-
2
-
-
32044460264
-
Understanding the fine structure of electricity prices
-
H. Geman and A. Roncoroni, "Understanding the Fine Structure of Electricity Prices," The Journal of Business, vol. 79, no. 3, 2006.
-
(2006)
The Journal of Business
, vol.79
, Issue.3
-
-
Geman, H.1
Roncoroni, A.2
-
5
-
-
84921399937
-
-
New York: IEEE Press
-
J. Si, A. G. Barto, W. B. Powell, and D. Wunsch, Eds., Handbook of Learning and Approximate Dynamic Programming. New York: IEEE Press, 2004.
-
(2004)
Handbook of Learning and Approximate Dynamic Programming
-
-
Si, J.1
Barto, A.G.2
Powell, W.B.3
Wunsch, D.4
-
7
-
-
0031388983
-
A neuro-dynamic programming approach to retailer inventory management
-
B. Van Roy, D. Bertsekas, Y. Lee, and J. Tsitsiklis, "A neuro-dynamic programming approach to retailer inventory management," in Proceedings of the 36th IEEE Conference on Decision and Control, vol. 4, 1997, pp. 4052-4057.
-
(1997)
Proceedings of the 36th IEEE Conference on Decision and Control
, vol.4
, pp. 4052-4057
-
-
Van Roy, B.1
Bertsekas, D.2
Lee, Y.3
Tsitsiklis, J.4
-
8
-
-
0001046225
-
Practical issues in temporal difference learning
-
G. Tesauro, "Practical Issues in Temporal Difference Learning," Machine Learning, vol. 8, no. 3-4, pp. 257-277, 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 257-277
-
-
Tesauro, G.1
-
9
-
-
4544257178
-
Approximating Q-values with basis function representations
-
Hillsdale, NJ, M. Mozer, D. Touretzky, and P. Smolensky, Eds.
-
P. Sabes, "Approximating Q-values with basis function representations," in Proceedings of the Fourth Connectionist Models Summer School, Hillsdale, NJ, M. Mozer, D. Touretzky, and P. Smolensky, Eds., 1993, pp. 264-271.
-
(1993)
Proceedings of the Fourth Connectionist Models Summer School
, pp. 264-271
-
-
Sabes, P.1
-
10
-
-
17444414191
-
Basis function adaptation in temporal difference reinforcement learning
-
DOI 10.1007/s10479-005-5732-z
-
I. Menache, S. Mannor, and N. Shimkin, "Basis function adaptation in temporal-difference reinforcement learning," Annals of Operations Research, vol. 134, no. 1, pp. 215-238, 2005. (Pubitemid 40550047)
-
(2005)
Annals of Operations Research
, vol.134
, Issue.1
, pp. 215-238
-
-
Menache, I.1
Mannor, S.2
Shimkin, N.3
-
11
-
-
71149099079
-
Fast gradient-descent methods for temporal-difference learning with linear function approximation
-
R. Sutton, H. Maei, D. Precup, S. Bhatnagar, D. Silver, C. Szepesvári, and E. Wiewiora, "Fast gradient-descent methods for temporal-difference learning with linear function approximation," in Proceedings of the 26th International Conference on Machine Learning, 2009, pp. 993-1000.
-
(2009)
Proceedings of the 26th International Conference on Machine Learning
, pp. 993-1000
-
-
Sutton, R.1
Maei, H.2
Precup, D.3
Bhatnagar, S.4
Silver, D.5
Szepesvári, C.6
Wiewiora, E.7
-
13
-
-
3843131884
-
A new criterion using information gain for action selection strategy in reinforcement learning
-
K. Iwata, K. Ikeda, and H. Sakai, "A new criterion using information gain for action selection strategy in reinforcement learning," IEEE Transactions on Neural Networks, vol. 15, no. 4, pp. 792-799, 2004.
-
(2004)
IEEE Transactions on Neural Networks
, vol.15
, Issue.4
, pp. 792-799
-
-
Iwata, K.1
Ikeda, K.2
Sakai, H.3
-
17
-
-
1142281527
-
Model-based Bayesian exploration
-
R. Dearden, N. Friedman, and D. Andre, "Model-based Bayesian Exploration," in Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence, 1999, pp. 150-159.
-
(1999)
Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence
, pp. 150-159
-
-
Dearden, R.1
Friedman, N.2
Andre, D.3
-
20
-
-
33749251297
-
An analytic solution to discrete Bayesian reinforcement learning
-
P. Poupart, N. Vlassis, J. Hoey, and K. Regan, "An analytic solution to discrete Bayesian reinforcement learning," in Proceedings of the 23rd International Conference on Machine Learning, 2006, pp. 697-704.
-
(2006)
Proceedings of the 23rd International Conference on Machine Learning
, pp. 697-704
-
-
Poupart, P.1
Vlassis, N.2
Hoey, J.3
Regan, K.4
-
21
-
-
0031619316
-
Bayesian Q-learning
-
R. Dearden, N. Friedman, and S. Russell, "Bayesian Q-learning," in Proceedings of the 15th National Conference on Artificial Intelligence, 1998, pp. 761-768.
-
(1998)
Proceedings of the 15th National Conference on Artificial Intelligence
, pp. 761-768
-
-
Dearden, R.1
Friedman, N.2
Russell, S.3
-
22
-
-
1942421151
-
Bayes meets Bellman: The Gaussian process approach to temporal difference learning
-
Y. Engel, S. Mannor, and R. Meir, "Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning," in Proceedings of the 20th International Conference on Machine Learning, 2003, pp. 154-161.
-
(2003)
Proceedings of the 20th International Conference on Machine Learning
, pp. 154-161
-
-
Engel, Y.1
Mannor, S.2
Meir, R.3
-
24
-
-
0030590294
-
Bayesian look ahead one-stage sampling allocations for selection of the best population
-
DOI 10.1016/0378-3758(95)00169-7
-
S. Gupta and K. Miescke, "Bayesian look ahead one-stage sampling allocations for selection of the best population," Journal of Statistical Planning and Inference, vol. 54, no. 2, pp. 229-244, 1996. (Pubitemid 126161097)
-
(1996)
Journal of Statistical Planning and Inference
, vol.54
, Issue.2
, pp. 229-244
-
-
Gupta, S.S.1
Miescke, K.J.2
-
25
-
-
55549135706
-
A knowledge gradient policy for sequential information collection
-
P. I. Frazier, W. B. Powell, and S. Dayanik, "A knowledge gradient policy for sequential information collection," SIAM Journal on Control and Optimization, vol. 47, no. 5, pp. 2410-2439, 2008.
-
(2008)
SIAM Journal on Control and Optimization
, vol.47
, Issue.5
, pp. 2410-2439
-
-
Frazier, P.I.1
Powell, W.B.2
Dayanik, S.3
-
26
-
-
67650505320
-
The knowledge gradient algorithm for online subset selection
-
Nashville, TN
-
I. O. Ryzhov and W. B. Powell, "The knowledge gradient algorithm for online subset selection," in Proceedings of the 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Nashville, TN, 2009, pp. 137-144.
-
(2009)
Proceedings of the 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
, pp. 137-144
-
-
Ryzhov, I.O.1
Powell, W.B.2
-
28
-
-
79951586758
-
Optimal learning of transition probabilities in the two-agent newsvendor problem
-
B. Johansson, S. Jain, J. Montoya-Torres, J. Hugan, and E. Yücesan, Eds.
-
I. O. Ryzhov, M. R. Valdez-Vivas, and W. B. Powell, "Optimal Learning of Transition Probabilities in the Two-Agent Newsvendor Problem," in Proceedings of the 2010 Winter Simulation Conference, B. Johansson, S. Jain, J. Montoya-Torres, J. Hugan, and E. Yücesan, Eds., 2010.
-
(2010)
Proceedings of the 2010 Winter Simulation Conference
-
-
Ryzhov, I.O.1
Valdez-Vivas, M.R.2
Powell, W.B.3
-
30
-
-
79961092747
-
The knowledge-gradient algorithm for sequencing experiments in drug discovery
-
to appear
-
D. Negoescu, P. Frazier, and W. Powell, "The Knowledge-Gradient Algorithm for Sequencing Experiments in Drug Discovery," INFORMS J. on Computing (to appear), 2010.
-
(2010)
INFORMS J. on Computing
-
-
Negoescu, D.1
Frazier, P.2
Powell, W.3
-
31
-
-
33748998787
-
Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
-
DOI 10.1007/s10994-006-8365-9
-
A. George and W. B. Powell, "Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming," Machine Learning, vol. 65, no. 1, pp. 167-198, 2006. (Pubitemid 44451197)
-
(2006)
Machine Learning
, vol.65
, Issue.1
, pp. 167-198
-
-
George, A.P.1
Powell, W.B.2
-
33
-
-
70449498873
-
The knowledge-gradient policy for correlated normal rewards
-
P. I. Frazier, W. B. Powell, and S. Dayanik, "The knowledge-gradient policy for correlated normal rewards," INFORMS J. on Computing, vol. 21, no. 4, pp. 599-613, 2009.
-
(2009)
INFORMS J. on Computing
, vol.21
, Issue.4
, pp. 599-613
-
-
Frazier, P.I.1
Powell, W.B.2
Dayanik, S.3
-
34
-
-
0000792991
-
The stochastic behavior of commodity prices: Implications for valuation and hedging
-
E. Schwartz, "The stochastic behavior of commodity prices: Implications for valuation and hedging," Journal of Finance, vol. 52, no. 3, pp. 923-973, 1997. (Pubitemid 127344954)
-
(1997)
Journal of Finance
, vol.52
, Issue.3
, pp. 923-973
-
-
Schwartz, E.S.1
-
35
-
-
77956513316
-
A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
-
R. Sutton, C. Szepesvári, and H. Maei, "A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation," Advances in Neural Information Processing Systems, vol. 21, pp. 1609-1616, 2008.
-
(2008)
Advances in Neural Information Processing Systems
, vol.21
, pp. 1609-1616
-
-
Sutton, R.1
Szepesvári, C.2
Maei, H.3