-
1
-
-
0003874616
-
-
LIDS-P-2434, Laboratory for Information and Decision Systems, MIT, Cambridge, MA
-
Abounadi, J., Bertsekas, D. and Borkar, V.S. (1998) Learning algorithms for Markov decision processes with average cost. Report LIDS-P-2434, Laboratory for Information and Decision Systems, MIT, Cambridge, MA.
-
(1998)
Learning Algorithms for Markov Decision Processes with Average Cost
-
-
Abounadi, J.1
Bertsekas, D.2
Borkar, V.S.3
-
2
-
-
0013155747
-
A general framework for the study of decentralized distribution systems
-
Anupindi, R., Bassok, Y. and Zemel, E. (2001) A general framework for the study of decentralized distribution systems. Journal of Manufacturing and Service Operations Management, 3(4).
-
(2001)
Journal of Manufacturing and Service Operations Management
, vol.3
, Issue.4
-
-
Anupindi, R.1
Bassok, Y.2
Zemel, E.3
-
3
-
-
0003787146
-
-
Princeton University Press, Princeton, NJ
-
Bellman, R.E. (1957) Dynamic Programming, Princeton University Press, Princeton, NJ.
-
(1957)
Dynamic Programming
-
-
Bellman, R.E.1
-
5
-
-
84996565038
-
Learning rate schedules for faster stochastic gradient search
-
White, D.A. and Sofge, D.A. (eds.), IEEE Press, Piscataway, NJ
-
Darken, C., Chang, J. and Moody, J. (1992) Learning rate schedules for faster stochastic gradient search, in Neural Networks for Signal Processing 2 - Proceedings of the 1992 IEEE Workshop, White, D.A. and Sofge, D.A. (eds.), IEEE Press, Piscataway, NJ.
-
(1992)
Neural Networks for Signal Processing 2 - Proceedings of the 1992 IEEE Workshop
-
-
Darken, C.1
Chang, J.2
Moody, J.3
-
6
-
-
0032643313
-
Solving semi-Markov decision problems using average reward reinforcement learning
-
Das, T.K., Gosavi, A., Mahadevan, S. and Marchalleck, N. (1999) Solving semi-Markov decision problems using average reward reinforcement learning. Management Science, 45(4), 560-574.
-
(1999)
Management Science
, vol.45
, Issue.4
, pp. 560-574
-
-
Das, T.K.1
Gosavi, A.2
Mahadevan, S.3
Marchalleck, N.4
-
7
-
-
0038829878
-
Predicting how people play games: Reinforcement learning in experimental games with unique, mixed strategy equilibria
-
Erev, I. and Roth, A.E. (1998) Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibria. The American Economic Review, 88(4), 848-881.
-
(1998)
The American Economic Review
, vol.88
, Issue.4
, pp. 848-881
-
-
Erev, I.1
Roth, A.E.2
-
10
-
-
0036722536
-
A reinforcement learning approach to airline seat allocation for multiple fare classes with over-booking
-
Gosavi, A., Bandla, N. and Das, T.K. (2002) A reinforcement learning approach to airline seat allocation for multiple fare classes with over-booking. IIE Transactions, 34(9), 729-742.
-
(2002)
IIE Transactions
, vol.34
, Issue.9
, pp. 729-742
-
-
Gosavi, A.1
Bandla, N.2
Das, T.K.3
-
12
-
-
1642351771
-
Learning Nash equilibrium for average reward irreducible stochastic games
-
Department of Industrial and Management Systems Engineering, University of South Florida, Tampa, FL 33620
-
Li, J. and Das, T.K. (2003) Learning Nash equilibrium for average reward irreducible stochastic games. Working paper, Department of Industrial and Management Systems Engineering, University of South Florida, Tampa, FL 33620.
-
(2003)
Working Paper
-
-
Li, J.1
Das, T.K.2
-
14
-
-
0001730497
-
Non-cooperative games
-
Nash, J.F. (1951) Non-cooperative games. Annals of Mathematics, 54, 286-295.
-
(1951)
Annals of Mathematics
, vol.54
, pp. 286-295
-
-
Nash, J.F.1
-
15
-
-
0016594972
-
On the core of linear production games
-
Owen, G. (1975) On the core of linear production games. Mathematical Programming, 9, 358-370.
-
(1975)
Mathematical Programming
, vol.9
, pp. 358-370
-
-
Owen, G.1
-
16
-
-
0035124331
-
Intelligent dynamic control policies for serial production lines
-
Paternina, C.D. and Das, T.K. (2000) Intelligent dynamic control policies for serial production lines. IIE Transactions, 33(1), 65-77.
-
(2000)
IIE Transactions
, vol.33
, Issue.1
, pp. 65-77
-
-
Paternina, C.D.1
Das, T.K.2
-
20
-
-
0346523383
-
Competitive outcomes in the core of market games
-
The Rand Corporation
-
Shapley, L. and Shubik, M. (1975) Competitive outcomes in the core of market games. Technical report R-1692-NSF, The Rand Corporation.
-
(1975)
Technical Report
, vol.R-1692-NSF
-
-
Shapley, L.1
Shubik, M.2
-
22
-
-
0001081294
-
Simplicial variable dimension algorithms for solving the nonlinear complementarity problem on a product of unit simplices using a general labeling
-
Van der Laan, G., Talman, A.J.J. and Van der Heyden, L. (1987) Simplicial variable dimension algorithms for solving the nonlinear complementarity problem on a product of unit simplices using a general labeling. Mathematics of Operations Research, 377-397.
-
(1987)
Mathematics of Operations Research
, pp. 377-397
-
-
Van der Laan, G.1
Talman, A.J.J.2
Van der Heyden, L.3
-
23
-
-
0003787427
-
-
Ph.D. thesis, Laboratory for Information and Decision Systems, MIT, Cambridge, MA
-
Van Roy, B. (1998) Learning and value function approximation in complex decision processes. Ph.D. thesis, Laboratory for Information and Decision Systems, MIT, Cambridge, MA.
-
(1998)
Learning and Value Function Approximation in Complex Decision Processes
-
-
Van Roy, B.1