1. Abounadi, J., Bertsekas, D., and Borkar, V. S., 1998, Learning algorithms for Markov decision processes with average cost. LIDS-P-2434, Laboratory for Information and Decision Systems (Cambridge, MA: MIT).
2. Anupindi, R., Bassok, Y., and Zemel, E., 2001, A general framework for the study of decentralized distribution systems. Journal of Manufacturing and Service Operations Management, 3, 349-368.
3. Bellman, R. E., 1957, Dynamic Programming (Princeton, NJ: Princeton University Press).
4. Crites, R., and Barto, A., 1996, Improving elevator performance using reinforcement learning. In D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (eds.), Advances in Neural Information Processing Systems 8 (Cambridge, MA: MIT), pp. 1017-1023.
5. Darken, C., Chang, J., and Moody, J., 1992, Learning rate schedules for faster stochastic gradient search. In D. A. White and D. A. Sofge (eds.), Neural Networks for Signal Processing 2: Proceedings of the 1992 IEEE Workshop (Piscataway, NJ: IEEE Press).
6. Das, T. K., Gosavi, A., Mahadevan, S., and Marchalleck, N., 1999, Solving semi-Markov decision problems using average reward reinforcement learning. Management Science, 45, 560-574.
7. Erev, I., and Roth, A. E., 1998, Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria. The American Economic Review, 88, 848-881.
10. Gosavi, A., Bandla, N., and Das, T. K., 2002, A reinforcement learning approach to airline seat allocation for multiple fare classes with overbooking. IIE Transactions on Operations Engineering (Special Issue on Advances on Large Scale Optimization for Logistics, Production, and Manufacturing Systems), 34, 729-742.
11. Gosavi, A., Das, T. K., and Sarkar, S., in press, A simulation-based learning automata framework for solving semi-Markov decision problems under long-run average cost. IIE Transactions on Operations Engineering.
14. Paternina, C. D., and Das, T. K., 2000, Intelligent dynamic control policies for serial production lines. IIE Transactions, 33, 65-77.
17. Robbins, H., and Monro, S., 1951, A stochastic approximation method. The Annals of Mathematical Statistics, 22, 400-407.
18. Singh, S., and Bertsekas, D., 1996, Reinforcement learning for dynamic channel allocation in cellular telephone systems. Neural Information Processing Systems (Cambridge, MA: MIT Press).