-
1
-
-
0002493843
-
Network flows
-
R. K. Ahuja, J.B. Orlin, and D. Sharma, Eds. Englewood Cliffs, NJ: Prentice-Hall
-
R. K. Ahuja, T. L. Magnanti, and J. Orlin, "Network flows," in Theory, Algorithms, and Applications, R. K. Ahuja, J.B. Orlin, and D. Sharma, Eds. Englewood Cliffs, NJ: Prentice-Hall, 1993, vol. 91, pp. 71-97.
-
(1993)
Theory, Algorithms, and Applications
, vol.91
, pp. 71-97
-
-
Ahuja, R.K.1
Magnanti, T.L.2
Orlin, J.3
-
2
-
-
34548752490
-
Value-iteration based fitted policy iteration: Learning with a single trajectory
-
DOI 10.1109/ADPRL.2007.368207, 4220852, Proceedings of the 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007
-
A. Antos, C. Szepesvári, and R. Munos, "Value-iteration based fitted policy iteration: Learning with a single trajectory," in Proc. 2007 IEEE Int. Symp. Approximate Dynamic Programming and Reinforcement Learning, 2007, pp. 330-337. (Pubitemid 47431404)
-
(2007)
Proceedings of the 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007
, pp. 330-337
-
-
Antos, A.1
Szepesvári, C.2
Munos, R.3
-
3
-
-
40849145988
-
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
-
A. Antos, C. Szepesvári, and R. Munos, "Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path," Machine Learn., vol. 71, no. 1, pp. 89-129, 2008.
-
(2008)
Machine Learn.
, vol.71
, Issue.1
, pp. 89-129
-
-
Antos, A.1
Szepesvári, C.2
Munos, R.3
-
4
-
-
0029210635
-
Learning to act using real-time dynamic programming
-
A. G. Barto, S. J. Bradtke, and S. P. Singh, "Learning to act using real-time dynamic programming," Artificial Intelligence, Special Volume on Computational Research on Interaction and Agency, vol. 72, pp. 81-138, 1995.
-
(1995)
Artificial Intelligence, Special Volume on Computational Research on Interaction and Agency
, vol.72
, pp. 81-138
-
-
Barto, A.G.1
Bradtke, S.J.2
Singh, S.P.3
-
7
-
-
79960439729
-
Approximate policy iteration: A survey and some new methods
-
D. P. Bertsekas, "Approximate policy iteration: A survey and some new methods," J. Control Theory Appl., vol. 9, no. 3, pp. 310-335, 2011.
-
(2011)
J. Control Theory Appl.
, vol.9
, Issue.3
, pp. 310-335
-
-
Bertsekas, D.P.1
-
8
-
-
84875759098
-
-
4th ed. Belmont, MA: Athena Scientific, Dynamic Programming and Optimal Control, ch. 6
-
D. P. Bertsekas, Approximate Dynamic Programming, 4th ed. Belmont, MA: Athena Scientific, 2012, vol. II, Dynamic Programming and Optimal Control, ch. 6.
-
(2012)
Approximate Dynamic Programming
, vol.II
-
-
Bertsekas, D.P.1
-
9
-
-
0037225359
-
Stochastic approximation for nonexpansive maps: Application to Q-learning algorithms
-
D. P. Bertsekas, J. Abounadi, and V. Borkar, "Stochastic approximation for nonexpansive maps: Application to Q-learning algorithms," SIAM J. Control Optim., vol. 41, no. 1, pp. 1-22, 2003.
-
(2003)
SIAM J. Control Optim.
, vol.41
, Issue.1
, pp. 1-22
-
-
Bertsekas, D.P.1
Abounadi, J.2
Borkar, V.3
-
11
-
-
0033876515
-
The O.D.E. method for convergence of stochastic approximation and reinforcement learning
-
V. S. Borkar and S. P. Meyn, "The O.D.E. method for convergence of stochastic approximation and reinforcement learning," SIAM J. Control Optim., vol. 38, no. 2, pp. 447-469, 2000.
-
(2000)
SIAM J. Control Optim.
, vol.38
, Issue.2
, pp. 447-469
-
-
Borkar, V.S.1
Meyn, S.P.2
-
12
-
-
0033245908
-
A convergent cutting-plane and partial-sampling algorithm for multistage stochastic linear programs with recourse
-
Z.-L. Chen and W. B. Powell, "A convergent cutting-plane and partial-sampling algorithm for multistage stochastic linear programs with recourse," J. Optimization Theory and Applications, vol. 102, no. 3, pp. 497-524, 1999.
-
(1999)
J. Optimization Theory and Applications
, vol.102
, Issue.3
, pp. 497-524
-
-
Chen, Z.-L.1
Powell, W.B.2
-
13
-
-
79952361855
-
Robust policies for the transformer acquisition and allocation problem
-
J. Enders, W. B. Powell, and D. M. Egan, "Robust policies for the transformer acquisition and allocation problem," Energy Syst., vol. 1, no. 3, pp. 245-272, 2010.
-
(2010)
Energy Syst.
, vol.1
, Issue.3
, pp. 245-272
-
-
Enders, J.1
Powell, W.B.2
Egan, D.M.3
-
14
-
-
56349109509
-
Value function approximation using multiple aggregation for multiattribute resource management
-
A. George, W. B. Powell, and S. Kulkarni, "Value function approximation using multiple aggregation for multiattribute resource management," J. Mach. Learn. Res., vol. 9, pp. 2079-2111, 2008.
-
(2008)
J. Mach. Learn. Res.
, vol.9
, pp. 2079-2111
-
-
George, A.1
Powell, W.B.2
Kulkarni, S.3
-
15
-
-
0035435908
-
An adaptive, distribution-free algorithm for the newsvendor problem with censored demands, with applications to inventory and distribution
-
G. Godfrey and W. B. Powell, "An adaptive, distribution-free algorithm for the newsvendor problem with censored demands, with applications to inventory and distribution," Manage. Sci., vol. 47, no. 8, pp. 1101-1112, 2001. (Pubitemid 34192808)
-
(2001)
Management Science
, vol.47
, Issue.8
, pp. 1101-1112
-
-
Godfrey, G.A.1
Powell, W.B.2
-
16
-
-
77954033753
-
Optimal control of dosage decisions in controlled ovarian hyperstimulation
-
M. He, L. Zhao, and W. B. Powell, "Optimal control of dosage decisions in controlled ovarian hyperstimulation," Ann. Oper. Res., vol. 178, pp. 223-245, 2010.
-
(2010)
Ann. Oper. Res.
, vol.178
, pp. 223-245
-
-
He, M.1
Zhao, L.2
Powell, W.B.3
-
17
-
-
0000800872
-
Stochastic decomposition: An algorithm for two stage linear programs with recourse
-
J. Higle and S. Sen, "Stochastic decomposition: An algorithm for two stage linear programs with recourse," Math. Oper. Res., vol. 16, no. 3, pp. 650-669, 1991.
-
(1991)
Math. Oper. Res.
, vol.16
, Issue.3
, pp. 650-669
-
-
Higle, J.1
Sen, S.2
-
18
-
-
0000439891
-
On the convergence of stochastic iterative dynamic programming algorithms
-
T. Jaakkola, M. Jordan, and S. P. Singh, "On the convergence of stochastic iterative dynamic programming algorithms," Neural Comput., vol. 6, no. 6, pp. 1185-1201, 1994.
-
(1994)
Neural Comput.
, vol.6
, Issue.6
, pp. 1185-1201
-
-
Jaakkola, T.1
Jordan, M.2
Singh, S.P.3
-
19
-
-
44649189852
-
Finite-time bounds for fitted value iteration
-
R. Munos and C. Szepesvári, "Finite-time bounds for fitted value iteration," J. Mach. Learn. Res., vol. 9, pp. 815-857, 2008.
-
(2008)
J. Mach. Learn. Res.
, vol.9
, pp. 815-857
-
-
Munos, R.1
Szepesvári, C.2
-
20
-
-
67649963262
-
An optimal approximate dynamic programming algorithm for the lagged asset acquisition problem
-
J. M. Nascimento and W. B. Powell, "An optimal approximate dynamic programming algorithm for the lagged asset acquisition problem," Math. Oper. Res., vol. 34, no. 1, pp. 210-237, 2009.
-
(2009)
Math. Oper. Res.
, vol.34
, Issue.1
, pp. 210-237
-
-
Nascimento, J.M.1
Powell, W.B.2
-
22
-
-
84873152819
-
SMART: A stochastic multiscale model for the analysis of energy resources, technology and policy
-
Fall
-
W. B. Powell, A. George, A. Lamont, and J. Stewart, "SMART: A stochastic multiscale model for the analysis of energy resources, technology and policy," INFORMS J. Comput., vol. 24, no. 4, pp. 665-682, Fall 2011.
-
(2011)
INFORMS J. Comput.
, vol.24
, Issue.4
, pp. 665-682
-
-
Powell, W.B.1
George, A.2
Lamont, A.3
Stewart, J.4
-
23
-
-
11244261270
-
Learning algorithms for separable approximations of discrete stochastic optimization problems
-
DOI 10.1287/moor.1040.0107
-
W. B. Powell, A. Ruszczyński, and H. Topaloglu, "Learning algorithms for separable approximations of discrete stochastic optimization problems," Math. Oper. Res., vol. 29, no. 4, pp. 814-836, 2004. (Pubitemid 40055836)
-
(2004)
Mathematics of Operations Research
, vol.29
, Issue.4
, pp. 814-836
-
-
Powell, W.1
Ruszczyński, A.2
Topaloglu, H.3
-
24
-
-
77950512601
-
Monte Carlo sampling methods
-
A. Ruszczyński and A. Shapiro, Eds. Amsterdam, The Netherlands: Elsevier
-
A. Shapiro, "Monte Carlo sampling methods," in Handbooks in Operations Research and Management Science: Stochastic Programming, A. Ruszczyński and A. Shapiro, Eds. Amsterdam, The Netherlands: Elsevier, 2003, vol. 10, pp. 353-425.
-
(2003)
Handbooks in Operations Research and Management Science: Stochastic Programming
, vol.10
, pp. 353-425
-
-
Shapiro, A.1
-
26
-
-
67649991691
-
Probability theory
-
New York: Springer-Verlag
-
A. Shiryaev, "Probability theory," in Graduate Texts in Mathematics. New York: Springer-Verlag, 1996, vol. 95.
-
(1996)
Graduate Texts in Mathematics.
, vol.95
-
-
Shiryaev, A.1
-
28
-
-
77955790905
-
Algorithms for reinforcement learning
-
C. Szepesvári, "Algorithms for reinforcement learning," Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 4, no. 1, pp. 1-103, 2010.
-
(2010)
Synthesis Lectures on Artificial Intelligence and Machine Learning
, vol.4
, Issue.1
, pp. 1-103
-
-
Szepesvári, C.1
-
29
-
-
33645566756
-
Dynamic-programming approximations for stochastic time-staged integer multicommodity-flow problems
-
DOI 10.1287/ijoc.1040.0079
-
H. Topaloglu and W. B. Powell, "Dynamic programming approximations for stochastic, time-staged integer multicommodity flow problems," INFORMS J. Comput., vol. 18, no. 1, pp. 31-42, 2006. (Pubitemid 43515944)
-
(2006)
INFORMS Journal on Computing
, vol.18
, Issue.1
, pp. 31-42
-
-
Topaloglu, H.1
Powell, W.B.2
-
30
-
-
0028497630
-
Asynchronous stochastic approximation and Q-learning
-
J. N. Tsitsiklis, "Asynchronous stochastic approximation and Q-learning," Machine Learn., vol. 16, pp. 185-202, 1994.
-
(1994)
Machine Learn.
, vol.16
, pp. 185-202
-
-
Tsitsiklis, J.N.1
-
31
-
-
0031143730
-
An analysis of temporal-difference learning with function approximation
-
PII S0018928697034375
-
J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Trans. Autom. Control, vol. 42, no. 5, pp. 674-690, May 1997. (Pubitemid 127760263)
-
(1997)
IEEE Transactions on Automatic Control
, vol.42
, Issue.5
, pp. 674-690
-
-
Tsitsiklis, J.N.1
Van Roy, B.2
-
32
-
-
0014534894
-
L-shaped linear programs with applications to optimal control and stochastic programming
-
R. Van Slyke and R. Wets, "L-shaped linear programs with applications to optimal control and stochastic programming," SIAM J. Appl. Math., vol. 17, no. 4, pp. 638-663, 1969.
-
(1969)
SIAM J. Appl. Math.
, vol.17
, Issue.4
, pp. 638-663
-
-
Van Slyke, R.1
Wets, R.2
-
34
-
-
0024874096
-
Backpropagation and neurocontrol: A review and prospectus
-
P. J. Werbos, "Backpropagation and neurocontrol: A review and prospectus," Neural Networks, pp. 209-216, 1989.
-
(1989)
Neural Networks
, pp. 209-216
-
-
Werbos, P.J.1