Volume 57, Issue 9, 2012, Pages 2266-2280

Mean field for Markov decision processes: From discrete to continuous optimization

Author keywords

Epidemic model; Hamilton Jacobi Bellman (HJB); Markov decision processes; mean field; optimal control

Indexed keywords

EPIDEMIC MODELS; HAMILTON-JACOBI-BELLMAN; MARKOV DECISION PROCESSES; MEAN FIELD; OPTIMAL CONTROLS

EID: 84865675087     PISSN: 00189286     EISSN: None     Source Type: Journal    
DOI: 10.1109/TAC.2012.2186176     Document Type: Article
Times cited : (63)

References (25)
  • 1
    • 0001793657
    • Dynamics of stochastic approximation algorithms
    • Lecture Notes in Math
    • M. Benaïm, "Dynamics of stochastic approximation algorithms," Séminaire de Probabilités XXXIII. Lecture Notes in Math, vol. 1709, pp. 1-68, 1999.
    • (1999) Séminaire de Probabilités XXXIII , vol.1709 , pp. 1-68
    • Benaïm, M.1
  • 2
    • 53649095038
    • A class of mean field interaction models for computer and communication systems
    • M. Benaim and J.-Y. Le Boudec, "A class of mean field interaction models for computer and communication systems," Perform. Eval., vol. 65, no. 11-12, pp. 823-838, 2008.
    • (2008) Perform. Eval. , vol.65 , Issue.11-12 , pp. 823-838
    • Benaim, M.1    Le Boudec, J.-Y.2
  • 3
    • 84865699023
    • Deterministic approximation of stochastic evolution in games: A generalization
    • M. Benaim and J. Weibull, "Deterministic Approximation of Stochastic Evolution in Games: A Generalization," Tech. Rep., mimeo, 2003.
    • (2003) Tech. Rep., mimeo
    • Benaim, M.1    Weibull, J.2
  • 5
    • 37149025078
    • Grid brokering for batch allocation using indexes
    • LNCS. New York: Springer
    • V. Berten and B. Gaujal, "Grid brokering for batch allocation using indexes," in Network Control and Optimization, volume 4465 of LNCS. New York: Springer, 2007.
    • (2007) Network Control and Optimization , vol.4465
    • Berten, V.1    Gaujal, B.2
  • 7
    • 34547189876
    • Initial studies on worm propagation in MANETs for future army combat systems
    • Pentagon Reports
    • R. Cole, "Initial studies on worm propagation in MANETs for future army combat systems," Tech. Rep., Pentagon Reports, 2004.
    • (2004) Tech. Rep.
    • Cole, R.1
  • 8
    • 0348090400
    • The linear programming approach to approximate dynamic programming
    • D. P. De Farias and B. Van Roy, "The linear programming approach to approximate dynamic programming," Operat. Res., vol. 51, no. 6, pp. 850-865, 2003.
    • (2003) Operat. Res. , vol.51 , Issue.6 , pp. 850-865
    • De Farias, D.P.1    Van Roy, B.2
  • 9
    • 84861082568
    • Mean field limit of non-smooth systems and differential inclusions
    • N. Gast and B. Gaujal, "Mean field limit of non-smooth systems and differential inclusions," ACM SIGMETRICS Perform. Eval. Rev., vol. 38, no. 2, pp. 30-32, 2010.
    • (2010) ACM SIGMETRICS Perform. Eval. Rev. , vol.38 , Issue.2 , pp. 30-32
    • Gast, N.1    Gaujal, B.2
  • 10
    • 79951558493
    • A mean field approach for optimization in discrete time
    • N. Gast and B. Gaujal, "A mean field approach for optimization in discrete time," Discrete Event Dynam. Syst., vol. 21, pp. 63-101, 2011.
    • (2011) Discrete Event Dynam. Syst. , vol.21 , pp. 63-101
    • Gast, N.1    Gaujal, B.2
  • 11
    • 64749113126
    • Deterministic approximation of best-response dynamics for the matching pennies game
    • Z. Gorodeisky, "Deterministic approximation of best-response dynamics for the matching pennies game," Games Econ. Behav., vol. 66, no. 1, pp. 191-201, 2009.
    • (2009) Games Econ. Behav. , vol.66 , Issue.1 , pp. 191-201
    • Gorodeisky, Z.1
  • 12
    • 39549087376
    • Large population stochastic dynamic games: Closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle
    • M. Huang, P. E. Caines, and R. P. Malhame, "Large population stochastic dynamic games: Closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle," Commun. Inf. Syst., vol. 6, pp. 221-252, 2006.
    • (2006) Commun. Inf. Syst. , vol.6 , pp. 221-252
    • Huang, M.1    Caines, P.E.2    Malhame, R.P.3
  • 13
    • 34249105008
    • Nash certainty equivalence in large population stochastic dynamic games: Connections with the physics of interacting particle systems
    • 4177530, Proceedings of the 45th IEEE Conference on Decision and Control 2006, CDC
    • M. Huang, P. E. Caines, and R. P. Malhame, "Nash certainty equivalence in large population stochastic dynamic games: Connections with the physics of interacting particle systems," in Proc. 45th IEEE Conf. Decision Control, San Diego, 2006, pp. 4921-4926. (Pubitemid 351283806)
    • (2006) Proceedings of the IEEE Conference on Decision and Control , pp. 4921-4926
    • Huang, M.1    Malhame, R.P.2    Caines, P.E.3
  • 14
    • 34648831837
    • Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized ε-Nash equilibria
    • DOI 10.1109/TAC.2007.904450
    • M. Huang, P. E. Caines, and R. P. Malhame, "Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized ε-Nash equilibria," IEEE Trans. Autom. Control, vol. 52, no. 9, pp. 1560-1571, Sep. 2007. (Pubitemid 47456068)
    • (2007) IEEE Transactions on Automatic Control , vol.52 , Issue.9 , pp. 1560-1571
    • Huang, M.1    Caines, P.E.2    Malhame, R.P.3
  • 15
    • 77953320145
    • Maximum damage malware attack in mobile wireless networks
    • San Diego, CA
    • M. H. R. Khouzani, S. Sarkar, and E. Altman, "Maximum damage malware attack in mobile wireless networks," in Proc. IEEE Infocom, San Diego, CA, 2010, pp. 1-9.
    • (2010) Proc. IEEE Infocom , pp. 1-9
    • Khouzani, M.H.R.1    Sarkar, S.2    Altman, E.3
  • 17
    • 0002232633
    • Solutions of ordinary differential equations as limits of pure jump Markov processes
    • T. Kurtz, "Solutions of ordinary differential equations as limits of pure jump Markov processes," J. Appl. Probab., vol. 7, pp. 49-58, 1970.
    • (1970) J. Appl. Probab. , vol.7 , pp. 49-58
    • Kurtz, T.1
  • 19
    • 47949103963
    • A generic mean field convergence result for systems of interacting objects
    • J. Y. Le Boudec, D. McDonald, and J. Mundinger, "A generic mean field convergence result for systems of interacting objects," Proc. QEST '07, pp. 3-18, 2007.
    • (2007) Proc. QEST '07 , pp. 3-18
    • Le Boudec, J.Y.1    McDonald, D.2    Mundinger, J.3
  • 20
    • 0032628612
    • The complexity of optimal queuing network control
    • C. H. Papadimitriou and J. N. Tsitsiklis, "The complexity of optimal queuing network control," Math. Oper. Res., vol. 24, no. 2, pp. 292-305, 1999.
    • (1999) Math. Oper. Res. , vol.24 , Issue.2 , pp. 292-305
    • Papadimitriou, C.H.1    Tsitsiklis, J.N.2
  • 22
    • 80053647353
    • Vaccine: War of the worms in wired and wireless networks
    • S. Tanachaiwiwat and A. Helmy, "Vaccine: War of the worms in wired and wireless networks," in Proc. IEEE INFOCOM, 2006.
    • (2006) Proc. IEEE INFOCOM
    • Tanachaiwiwat, S.1    Helmy, A.2
  • 23
    • 70349977430
    • Mean field asymptotic of Markov decision evolutionary games and teams
    • H. Tembine, J.-Y. Le Boudec, R. El-Azouzi, and E. Altman, "Mean field asymptotic of Markov decision evolutionary games and teams," Gamenets, 2009.
    • (2009) Gamenets
    • Tembine, H.1    Le Boudec, J.-Y.2    El-Azouzi, R.3    Altman, E.4
  • 25
    • 0031143730
    • An analysis of temporal-difference learning with function approximation
    • PII S0018928697034375
    • J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Trans. Autom. Control, vol. 42, no. 5, pp. 674-690, May 1997. (Pubitemid 127760263)
    • (1997) IEEE Transactions on Automatic Control , vol.42 , Issue.5 , pp. 674-690
    • Tsitsiklis, J.N.1    Van Roy, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.