-
1
-
-
84862009362
-
Computing near-optimal policies in generalized joint replenishment
-
D. Adelman and D. Klabjan, "Computing near-optimal policies in generalized joint replenishment," INFORMS J. Comp., vol. 24, no. 1, pp. 148-164, 2012.
-
(2012)
INFORMS J. Comp.
, vol.24
, Issue.1
, pp. 148-164
-
-
Adelman, D.1
Klabjan, D.2
-
2
-
-
77952074893
-
Approximate dynamic programming for ambulance redeployment
-
M. S. Maxwell, M. Restrepo, S. G. Henderson, and H. Topaloglu, "Approximate dynamic programming for ambulance redeployment," INFORMS J. Comp., vol. 22, no. 2, pp. 266-281, 2010.
-
(2010)
INFORMS J. Comp.
, vol.22
, Issue.2
, pp. 266-281
-
-
Maxwell, M.S.1
Restrepo, M.2
Henderson, S.G.3
Topaloglu, H.4
-
3
-
-
84862297978
-
Approximate dynamic programming algorithms for optimal dosage decisions in controlled ovarian hyperstimulation
-
M. He, L. Zhao, and W. B. Powell, "Approximate dynamic programming algorithms for optimal dosage decisions in controlled ovarian hyperstimulation," Eur. J. Oper. Res., vol. 222, no. 2, pp. 328-340, 2012.
-
(2012)
Eur. J. Oper. Res.
, vol.222
, Issue.2
, pp. 328-340
-
-
He, M.1
Zhao, L.2
Powell, W.B.3
-
4
-
-
77953561812
-
An approximate dynamic programming approach to benchmark practice-based heuristics for natural gas storage valuation
-
G. Lai, F. Margot, and N. Secomandi, "An approximate dynamic programming approach to benchmark practice-based heuristics for natural gas storage valuation," Oper. Res., vol. 58, no. 3, pp. 564-582, 2010.
-
(2010)
Oper. Res.
, vol.58
, Issue.3
, pp. 564-582
-
-
Lai, G.1
Margot, F.2
Secomandi, N.3
-
5
-
-
79952360158
-
Optimal day-ahead trading and storage of renewable energies - an approximate dynamic programming approach
-
N. Löhndorf and S. Minner, "Optimal day-ahead trading and storage of renewable energies - an approximate dynamic programming approach," Energy Syst., vol. 1, no. 1, pp. 61-77, 2010.
-
(2010)
Energy Syst.
, vol.1
, Issue.1
, pp. 61-77
-
-
Löhndorf, N.1
Minner, S.2
-
6
-
-
77949559841
-
Optimal commodity trading with a capacitated storage asset
-
N. Secomandi, "Optimal commodity trading with a capacitated storage asset," Manag. Sci., vol. 56, no. 3, pp. 449-467, 2010.
-
(2010)
Manag. Sci.
, vol.56
, Issue.3
, pp. 449-467
-
-
Secomandi, N.1
-
7
-
-
70449631674
-
An approximate dynamic programming approach to network revenue management with customer choice
-
D. Zhang and D. Adelman, "An approximate dynamic programming approach to network revenue management with customer choice," Transportation Sci., vol. 43, no. 3, pp. 381-394, 2009.
-
(2009)
Transportation Sci.
, vol.43
, Issue.3
, pp. 381-394
-
-
Zhang, D.1
Adelman, D.2
-
9
-
-
63449141834
-
An approximate dynamic programming algorithm for large-scale fleet management: A case application
-
H. P. Simão, J. Day, A. P. George, T. Gifford, J. Nienow, and W. B. Powell, "An approximate dynamic programming algorithm for large-scale fleet management: A case application," Transportation Sci., vol. 43, no. 2, pp. 178-197, 2009.
-
(2009)
Transportation Sci.
, vol.43
, Issue.2
, pp. 178-197
-
-
Simão, H.P.1
Day, J.2
George, A.P.3
Gifford, T.4
Nienow, J.5
Powell, W.B.6
-
10
-
-
84873152819
-
SMART: A stochastic multiscale model for the analysis of energy resources, technology and policy
-
W. B. Powell, A. George, A. Lamont, J. Stewart, and W. R. Scott, "SMART: A stochastic multiscale model for the analysis of energy resources, technology and policy," INFORMS J. Comp., vol. 24, no. 4, pp. 665-682, 2012.
-
(2012)
INFORMS J. Comp.
, vol.24
, Issue.4
, pp. 665-682
-
-
Powell, W.B.1
George, A.2
Lamont, A.3
Stewart, J.4
Scott, W.R.5
-
13
-
-
84968519017
-
Functional approximations and dynamic programming
-
R. Bellman and S. Dreyfus, "Functional approximations and dynamic programming," Math. Tables Aids Comp., vol. 13, pp. 247-251, 1959.
-
(1959)
Math. Tables Aids Comp.
, vol.13
, pp. 247-251
-
-
Bellman, R.1
Dreyfus, S.2
-
16
-
-
84921399937
-
-
New York, NY, USA: IEEE Press
-
J. Si, A. G. Barto, W. B. Powell, and D. Wunsch, Eds., Handbook of Learning and Approximate Dynamic Programming. New York, NY, USA: IEEE Press, 2004.
-
(2004)
Handbook of Learning and Approximate Dynamic Programming
-
-
Si, J.1
Barto, A.G.2
Powell, W.B.3
Wunsch, D.4
-
19
-
-
34249833101
-
Q-learning
-
C. Watkins and P. Dayan, "Q-learning," Machine Learning, vol. 8, no. 3, pp. 279-292, 1992.
-
(1992)
Machine Learning
, vol.8
, Issue.3
, pp. 279-292
-
-
Watkins, C.1
Dayan, P.2
-
20
-
-
33645566756
-
Dynamic programming approximations for stochastic, time-staged integer multicommodity flow problems
-
H. Topaloglu and W. B. Powell, "Dynamic programming approximations for stochastic, time-staged integer multicommodity flow problems," INFORMS J. Comp., vol. 18, no. 1, pp. 31-42, 2006.
-
(2006)
INFORMS J. Comp.
, vol.18
, Issue.1
, pp. 31-42
-
-
Topaloglu, H.1
Powell, W.B.2
-
21
-
-
0004093909
-
-
Cambridge, U.K.: Cambridge Univ. Press
-
M. Wasan, Stochastic Approximation. Cambridge, U.K.: Cambridge Univ. Press, 1969.
-
(1969)
Stochastic Approximation
-
-
Wasan, M.1
-
23
-
-
0028497630
-
Asynchronous stochastic approximation and Q-learning
-
J. Tsitsiklis, "Asynchronous stochastic approximation and Q-learning," Machine Learning, vol. 16, pp. 185-202, 1994.
-
(1994)
Machine Learning
, vol.16
, pp. 185-202
-
-
Tsitsiklis, J.1
-
24
-
-
4243385070
-
Convergence of Stochastic Iterative Dynamic Programming algorithms
-
J. Cowan, G. Tesauro, and J. Alspector, Eds. San Francisco, CA, USA: Morgan Kaufmann Publishers
-
T. Jaakkola, M. Jordan, and S. Singh, "Convergence of stochastic iterative dynamic programming algorithms," in Advances in Neural Information Processing Systems, vol. 6, J. Cowan, G. Tesauro, and J. Alspector, Eds. San Francisco, CA, USA: Morgan Kaufmann Publishers, 1994, pp. 703-710.
-
(1994)
Advances in Neural Information Processing Systems
, vol.6
, pp. 703-710
-
-
Jaakkola, T.1
Jordan, M.2
Singh, S.3
-
25
-
-
85162416897
-
Speedy Q-learning
-
M. G. Azar, R. Munos, M. Ghavamzadeh, and H. J. Kappen, "Speedy Q-learning," Adv. Neural Inform. Processing Syst., vol. 24, pp. 2411-2419, 2011.
-
(2011)
Adv. Neural Inform. Processing Syst.
, vol.24
, pp. 2411-2419
-
-
Azar, M.G.1
Munos, R.2
Ghavamzadeh, M.3
Kappen, H.J.4
-
26
-
-
84898998140
-
The Asymptotic Convergence-rate of Q-learning
-
M. Jordan, M. Kearns, and S. Solla, Eds. Cambridge, MA, USA: MIT Press
-
C. Szepesvári, "The asymptotic convergence-rate of Q-learning," in Advances in Neural Information Processing Systems, vol. 10, M. Jordan, M. Kearns, and S. Solla, Eds. Cambridge, MA, USA: MIT Press, 1997, pp. 1064-1070.
-
(1997)
Advances in Neural Information Processing Systems
, vol.10
, pp. 1064-1070
-
-
Szepesvári, C.1
-
28
-
-
60749124483
-
On Step Sizes, Stochastic Shortest Paths, Survival Probabilities in Reinforcement learning
-
S. J. Mason, R. R. Hill, L. Mönch, O. Rose, T. Jefferson, and J. W. Fowler, Eds.
-
A. Gosavi, "On step sizes, stochastic shortest paths, survival probabilities in reinforcement learning," in Proc. Winter Simul. Conf., S. J. Mason, R. R. Hill, L. Mönch, O. Rose, T. Jefferson, and J. W. Fowler, Eds., 2008, pp. 525-531.
-
(2008)
Proc. Winter Simul. Conf.
, pp. 525-531
-
-
Gosavi, A.1
-
29
-
-
84993077818
-
Approximate dynamic programming for management of high-value spare parts
-
H. P. Simão and W. B. Powell, "Approximate dynamic programming for management of high-value spare parts," J. Manufact. Technol. Manag., vol. 20, no. 2, pp. 147-160, 2009.
-
(2009)
J. Manufact. Technol. Manag.
, vol.20
, Issue.2
, pp. 147-160
-
-
Simão, H.P.1
Powell, W.B.2
-
30
-
-
0003778897
-
-
New York, NY, USA: Springer-Verlag
-
A. Benveniste, M. Metivier, and P. Priouret, Adaptive Algorithms and Stochastic Approximations. New York, NY, USA: Springer-Verlag, 1990.
-
(1990)
Adaptive Algorithms and Stochastic Approximations
-
-
Benveniste, A.1
Metivier, M.2
Priouret, P.3
-
31
-
-
80052250414
-
Adaptive subgradient methods for online learning and stochastic optimization
-
J. Duchi, E. Hazan, and Y. Singer, "Adaptive subgradient methods for online learning and stochastic optimization," J. Machine Learning Res., vol. 12, pp. 2121-2159, 2011.
-
(2011)
J. Machine Learning Res.
, vol.12
, pp. 2121-2159
-
-
Duchi, J.1
Hazan, E.2
Singer, Y.3
-
34
-
-
84867615954
-
Tuning-free stepsize adaptation
-
A. R. Mahmood, R. S. Sutton, T. Degris, and P. M. Pilarski, "Tuning-free stepsize adaptation," in Proc. IEEE Int. Conf. Acous., Speech, Signal Processing, 2012, pp. 2121-2124.
-
(2012)
Proc. IEEE Int. Conf. Acous., Speech, Signal Processing
, pp. 2121-2124
-
-
Mahmood, A.R.1
Sutton, R.S.2
Degris, T.3
Pilarski, P.M.4
-
36
-
-
33646435300
-
A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
-
D. P. Choi and B. Van Roy, "A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning," Discrete Event Dyn. Syst., vol. 16, pp. 207-239, 2006.
-
(2006)
Discrete Event Dyn. Syst.
, vol.16
, pp. 207-239
-
-
Choi, D.P.1
Van Roy, B.2
-
37
-
-
33748998787
-
Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
-
A. George and W. B. Powell, "Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming," Machine Learning, vol. 65, no. 1, pp. 167-198, 2006.
-
(2006)
Machine Learning
, vol.65
, Issue.1
, pp. 167-198
-
-
George, A.1
Powell, W.B.2
-
38
-
-
85053849310
-
Temporal Difference Updating without a Learning rate
-
J. C. Platt, D. Koller, Y. Singer, and S. Roweis, Eds. Cambridge, MA, USA: MIT Press
-
M. Hutter and S. Legg, "Temporal difference updating without a learning rate," in Advances in Neural Information Processing Systems, vol. 20, J. C. Platt, D. Koller, Y. Singer, and S. Roweis, Eds. Cambridge, MA, USA: MIT Press, 2007, pp. 705-712.
-
(2007)
Advances in Neural Information Processing Systems
, vol.20
, pp. 705-712
-
-
Hutter, M.1
Legg, S.2
-
39
-
-
77956513316
-
A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
-
R. Sutton, C. Szepesvári, and H. Maei, "A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation," Adv. Neural Inform. Processing Syst., vol. 21, pp. 1609-1616, 2008.
-
(2008)
Adv. Neural Inform. Processing Syst.
, vol.21
, pp. 1609-1616
-
-
Sutton, R.1
Szepesvári, C.2
Maei, H.3
-
42
-
-
0003684449
-
-
New York, NY, USA: Springer ser. Statistics
-
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. New York, NY, USA: Springer, 2001, ser. Statistics.
-
(2001)
The Elements of Statistical Learning
-
-
Hastie, T.1
Tibshirani, R.2
Friedman, J.3
-
43
-
-
81455141800
-
General bounds and finite-time improvement for the Kiefer-Wolfowitz stochastic approximation algorithm
-
M. Broadie, D. Cicek, and A. Zeevi, "General bounds and finite-time improvement for the Kiefer-Wolfowitz stochastic approximation algorithm," Oper. Res., vol. 59, no. 5, pp. 1211-1224, 2011.
-
(2011)
Oper. Res.
, vol.59
, Issue.5
, pp. 1211-1224
-
-
Broadie, M.1
Cicek, D.2
Zeevi, A.3
-
45
-
-
56349109509
-
Value function approximation using multiple aggregation for multiattribute resource management
-
A. George, W. B. Powell, and S. R. Kulkarni, "Value function approximation using multiple aggregation for multiattribute resource management," J. Machine Learning Res., vol. 9, pp. 2079-2111, 2008.
-
(2008)
J. Machine Learning Res.
, vol.9
, pp. 2079-2111
-
-
George, A.1
Powell, W.B.2
Kulkarni, S.R.3