Volume , Issue , 2011, Pages 48-55

Optimistic planning for sparsely stochastic systems

Author keywords

Markov decision processes; model predictive control; online planning; optimistic planning; stochastic systems

Indexed keywords

HIV INFECTION; MARKOV DECISION PROCESSES; NOVEL ALGORITHM; NUMERICAL RESULTS; ON-LINE CONTROLS; ON-LINE PLANNING; OPTIMISTIC PLANNING; RANDOM STATE; SELECTION METHODS

EID: 80052220117     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ADPRL.2011.5967375     Document Type: Conference Paper
Times cited: 15

References (24)
  • 2
    • 0036832951
    • A sparse sampling algorithm for near-optimal planning in large Markov decision processes
    • DOI 10.1023/A:1017932429737
    • M. J. Kearns, Y. Mansour, and A. Y. Ng, "A sparse sampling algorithm for near-optimal planning in large Markov decision processes," Machine Learning, vol. 49, no. 2-3, pp. 193-208, 2002. (Pubitemid 34325686)
    • (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 193-208
    • Kearns, M.; Mansour, Y.; Ng, A.Y.
  • 4
    • 58449098161
    • Lazy planning under uncertainties by optimizing decisions on an ensemble of incomplete disturbance trees
    • S. Girgin, M. Loth, R. Munos, P. Preux, and D. Ryabko, Eds. Springer
    • B. Defourny, D. Ernst, and L. Wehenkel, "Lazy planning under uncertainties by optimizing decisions on an ensemble of incomplete disturbance trees," in Recent Advances in Reinforcement Learning, ser. Lecture Notes in Computer Science, S. Girgin, M. Loth, R. Munos, P. Preux, and D. Ryabko, Eds. Springer, 2008, vol. 5323, pp. 1-14.
    • (2008) Recent Advances in Reinforcement Learning, Ser. Lecture Notes in Computer Science , vol.5323 , pp. 1-14
    • Defourny, B.; Ernst, D.; Wehenkel, L.
  • 7
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • DOI 10.1023/A:1013689704352, Computational Learning Theory
    • P. Auer, N. Cesa-Bianchi, and P. Fischer, "Finite-time analysis of the multiarmed bandit problem," Machine Learning, vol. 47, no. 2-3, pp. 235-256, 2002. (Pubitemid 34126111)
    • (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.; Cesa-Bianchi, N.; Fischer, P.
  • 9
    • 77952027689
    • Online optimization in X-armed bandits
    • D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, Eds. MIT Press
    • S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári, "Online optimization in X-armed bandits," in Advances in Neural Information Processing Systems 21, D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, Eds. MIT Press, 2009, pp. 201-208.
    • (2009) Advances in Neural Information Processing Systems , vol.21 , pp. 201-208
    • Bubeck, S.; Munos, R.; Stoltz, G.; Szepesvári, C.
  • 19
    • 77955814101
    • Reinforcement learning and dynamic programming using function approximators
    • Taylor & Francis CRC Press
    • L. Buşoniu, R. Babuška, B. De Schutter, and D. Ernst, Reinforcement Learning and Dynamic Programming Using Function Approximators, ser. Automation and Control Engineering. Taylor & Francis CRC Press, 2010.
    • (2010) Automation and Control Engineering
    • Buşoniu, L.; Babuška, R.; De Schutter, B.; Ernst, D.
  • 20
    • 77950867376
    • Approximate dynamic programming with a fuzzy parameterization
    • L. Buşoniu, D. Ernst, B. De Schutter, and R. Babuška, "Approximate dynamic programming with a fuzzy parameterization," Automatica, vol. 46, no. 5, pp. 804-814, 2010.
    • (2010) Automatica , vol.46 , Issue.5 , pp. 804-814
    • Buşoniu, L.; Ernst, D.; De Schutter, B.; Babuška, R.
  • 21
    • 28544448294
    • Dynamic multidrug therapies for HIV: Optimal and STI control approaches
    • B. Adams, H. Banks, H.-D. Kwon, and H. Tran, "Dynamic multidrug therapies for HIV: Optimal and STI control approaches," Mathematical Biosciences and Engineering, vol. 1, no. 2, pp. 223-241, 2004.
    • (2004) Mathematical Biosciences and Engineering , vol.1 , Issue.2 , pp. 223-241
    • Adams, B.; Banks, H.; Kwon, H.-D.; Tran, H.
  • 23
    • 39649096058
    • Clinical data based optimal STI strategies for HIV: A reinforcement learning approach
    • 4177178, Proceedings of the 45th IEEE Conference on Decision and Control 2006, CDC
    • D. Ernst, G.-B. Stan, J. Gonçalves, and L. Wehenkel, "Clinical data based optimal STI strategies for HIV: A reinforcement learning approach," in Proceedings 45th IEEE Conference on Decision & Control, San Diego, US, 13-15 December 2006, pp. 667-672. (Pubitemid 351283311)
    • (2006) Proceedings of the IEEE Conference on Decision and Control , pp. 667-672
    • Ernst, D.; Stan, G.-B.; Gonçalves, J.; Wehenkel, L.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.