



Volume 178, Issue 3, 2007, Pages 808-818

A policy gradient method for semi-Markov decision processes with application to call admission control

Author keywords

Call admission control; Policy gradient; Semi Markov decision process; Stochastic processes; Two time scale

Indexed keywords

APPROXIMATION THEORY; DECISION MAKING; MARKOV PROCESSES; OPTIMIZATION;

EID: 33751077547     PISSN: 03772217     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.ejor.2006.02.023     Document Type: Article
Times cited: 33

References (21)
  • 1
    • 0345802362
    • Price-directed replenishment of subsets: Methodology and its application to inventory routing
    • Adelman D. Price-directed replenishment of subsets: Methodology and its application to inventory routing. Manufacturing and Service Operations Management 5 4 (2003) 348-371
    • (2003) Manufacturing and Service Operations Management , vol.5 , Issue.4 , pp. 348-371
    • Adelman, D.1
  • 4
    • 0034389611
    • Gradient convergence in gradient methods with errors
    • Bertsekas D.P., and Tsitsiklis J.N. Gradient convergence in gradient methods with errors. SIAM Journal on Optimization 10 3 (2000) 627-642
    • (2000) SIAM Journal on Optimization , vol.10 , Issue.3 , pp. 627-642
    • Bertsekas, D.P.1    Tsitsiklis, J.N.2
  • 7
    • 0038631988
    • Semi-Markov decision problems and performance sensitivity analysis
    • Cao X.-R. Semi-Markov decision problems and performance sensitivity analysis. IEEE Transactions on Automatic Control 48 5 (2003) 758-769
    • (2003) IEEE Transactions on Automatic Control , vol.48 , Issue.5 , pp. 758-769
    • Cao, X.-R.1
  • 10
    • 0742319170
    • Reinforcement learning for long-run average cost
    • Gosavi A. Reinforcement learning for long-run average cost. European Journal of Operational Research 155 (2004) 654-674
    • (2004) European Journal of Operational Research , vol.155 , pp. 654-674
    • Gosavi, A.1
  • 11
    • 0033876565
    • Call admission control and routing in integrated services networks using neuro-dynamic programming
    • Marbach P., Mihatsch O., and Tsitsiklis J.N. Call admission control and routing in integrated services networks using neuro-dynamic programming. IEEE JSAC 18 2 (2000) 197-208
    • (2000) IEEE JSAC , vol.18 , Issue.2 , pp. 197-208
    • Marbach, P.1    Mihatsch, O.2    Tsitsiklis, J.N.3
  • 14
    • 0036611716
    • Integrated voice/data call admission control for wireless DS-CDMA systems with fading
    • Singh S., Krishnamurthy V., and Poor H.V. Integrated voice/data call admission control for wireless DS-CDMA systems with fading. IEEE Transactions on Signal Processing 50 6 (2002) 1483-1495
    • (2002) IEEE Transactions on Signal Processing , vol.50 , Issue.6 , pp. 1483-1495
    • Singh, S.1    Krishnamurthy, V.2    Poor, H.V.3
  • 16
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton R.S., Precup D., and Singh S.P. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112 (1999) 181-211
    • (1999) Artificial Intelligence , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.P.3
  • 17
    • 0035283402
    • On the convergence of temporal-difference learning with linear function approximation
    • Tadić V. On the convergence of temporal-difference learning with linear function approximation. Machine Learning 42 3 (2001) 241-267
    • (2001) Machine Learning , vol.42 , Issue.3 , pp. 241-267
    • Tadić, V.1
  • 19
    • 0142199953
    • V. Tadić, A. Doucet, Two time-scale stochastic approximation for constrained stochastic optimization and constrained Markov decision problems, in: Proceedings of ACC, 2003.
  • 20
    • 0142231039
    • V. Tadić, S.P. Meyn, Asymptotic properties of two time-scale stochastic approximation algorithms with constant step sizes, in: Proceedings of ACC, 2003.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.