Volume 227, 2007, Pages 225-232

Percentile optimization in uncertain Markov decision processes with application to efficient exploration

Author keywords

[No Author keywords available]

Indexed keywords

DECISION MAKING; MARKOV PROCESSES; MATHEMATICAL MODELS; OPTIMIZATION; PARAMETER ESTIMATION; UNCERTAIN SYSTEMS;

EID: 34547985785     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1273496.1273525     Document Type: Conference Paper
Times cited: 31

References (19)
  • 2
    • 0041965975
    • R-max - a general polynomial time algorithm for near-optimal reinforcement learning
    • Brafman, R., & Tennenholtz, M. (2003). R-max - a general polynomial time algorithm for near-optimal reinforcement learning. J. of Machine Learning Research, 3, 213-231.
    • (2003) J. of Machine Learning Research , vol.3 , pp. 213-231
    • Brafman, R.1    Tennenholtz, M.2
  • 3
    • 33845679809
    • On distributionally robust chance-constrained linear programs
    • Calafiore, G., & El Ghaoui, L. (2006). On distributionally robust chance-constrained linear programs. Optimization Theory and Applications, 130, 1-22.
    • (2006) Optimization Theory and Applications , vol.130 , pp. 1-22
    • Calafiore, G.1    El Ghaoui, L.2
  • 4
    • 0002395681
    • Chance constrained programming
    • Charnes, A., & Cooper, W. (1959). Chance constrained programming. Management Science, 6, 73-79.
    • (1959) Management Science , vol.6 , pp. 73-79
    • Charnes, A.1    Cooper, W.2
  • 6
    • 0029219995
    • Percentile performance criteria for limiting average Markov control problems
    • Filar, J., Krass, D., & Ross, K. (1995). Percentile performance criteria for limiting average Markov control problems. IEEE Trans. on Automatic Control, 40, 2-10.
    • (1995) IEEE Trans. on Automatic Control , vol.40 , pp. 2-10
    • Filar, J.1    Krass, D.2    Ross, K.3
  • 8
    • 0034272032
    • Bounded-parameter Markov decision processes
    • Givan, R., Leach, S., & Dean, T. (2000). Bounded-parameter Markov decision processes. Artificial Intelligence, 122, 71-109.
    • (2000) Artificial Intelligence , vol.122 , pp. 71-109
    • Givan, R.1    Leach, S.2    Dean, T.3
  • 11
    • 0012257655
    • Near-optimal reinforcement learning in polynomial time
    • Kearns, M., & Singh, S. (1998). Near-optimal reinforcement learning in polynomial time. Proc. ICML (pp. 260-268).
    • (1998) Proc. ICML , pp. 260-268
    • Kearns, M.1    Singh, S.2
  • 14
    • 36248992411
    • Convex approximations of chance constrained programs
    • Nemirovski, A., & Shapiro, A. (2006). Convex approximations of chance constrained programs. SIAM Journal on Optimization, 17, 969-996.
    • (2006) SIAM Journal on Optimization , vol.17 , pp. 969-996
    • Nemirovski, A.1    Shapiro, A.2
  • 15
    • 14344250395
    • Robust Markov decision processes with uncertain transition matrices
    • Nilim, A., & El Ghaoui, L. Robust Markov decision processes with uncertain transition matrices. Operations Research, 53, 780-798.
    • Operations Research , vol.53 , pp. 780-798
    • Nilim, A.1    El Ghaoui, L.2
  • 18
    • 34547984629
    • Markovian decision processes with uncertain transition probabilities or rewards
    • Silver, E. (1963). Markovian decision processes with uncertain transition probabilities or rewards (Technical Report 1). Operations Research Center, MIT.
    • (1963) Technical Report
    • Silver, E.1
  • 19
    • 31844432138
    • A theoretical analysis of model-based interval estimation
    • Strehl, A., & Littman, M. (2005). A theoretical analysis of model-based interval estimation. Proc. ICML (pp. 857-864).
    • (2005) Proc. ICML , pp. 857-864
    • Strehl, A.1    Littman, M.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.