SCOPUS Information Search Platform - View Paper

European Journal of Operational Research

Volume 178, Issue 3, 2007, Pages 808-818

A policy gradient method for semi-Markov decision processes with application to call admission control

(3) Singh, Sumeetpal S. [a]; Tadić, Vladislav B. [b]; Doucet, Arnaud [c]

a UNIVERSITY OF CAMBRIDGE (United Kingdom)

b UNIVERSITY OF SHEFFIELD (United Kingdom)

c UNIVERSITY OF BRITISH COLUMBIA (Canada)

Author keywords

Call admission control; Policy gradient; Semi Markov decision process; Stochastic processes; Two time scale

Indexed keywords

APPROXIMATION THEORY; DECISION MAKING; MARKOV PROCESSES; OPTIMIZATION;

CALL ADMISSION CONTROL; SEMI-MARKOV DECISION PROCESS; TWO TIME-SCALE;

OPERATIONS RESEARCH;

EID: 33751077547 PISSN: 03772217 EISSN: None Source Type: Journal
DOI: 10.1016/j.ejor.2006.02.023 Document Type: Article

Times cited: (33)

References (21)

1
- 0345802362
- Price-directed replenishment of subsets: Methodology and its application to inventory routing
- Adelman D. Price-directed replenishment of subsets: Methodology and its application to inventory routing. Manufacturing and Service Operations Management 5 4 (2003) 348-371
- (2003) Manufacturing and Service Operations Management , vol.5 , Issue.4 , pp. 348-371
- Adelman, D.¹

2
- 0013535965
- Infinite-horizon policy-gradient estimation
- Baxter J., and Bartlett P.L. Infinite-horizon policy-gradient estimation. Journal of Artificial Intelligence Research 15 (2001) 319-350
- (2001) Journal of Artificial Intelligence Research , vol.15 , pp. 319-350
- Baxter, J.¹ Bartlett, P.L.²

3
- 0003565783
- Athena Scientific, Belmont
- Bertsekas D.P. Dynamic Programming and Optimal Control vol. 2 (1995), Athena Scientific, Belmont
- (1995) Dynamic Programming and Optimal Control , vol.2
- Bertsekas, D.P.¹

4
- 0034389611
- Gradient convergence in gradient methods with errors
- Bertsekas D.P., and Tsitsiklis J.N. Gradient convergence in gradient methods with errors. SIAM Journal on Optimization 10 3 (2000) 627-642
- (2000) SIAM Journal on Optimization , vol.10 , Issue.3 , pp. 627-642
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

5
- 0003487482
- Athena Scientific, Belmont
- Bertsekas D.P., and Tsitsiklis J.N. Neuro-dynamic Programming (1996), Athena Scientific, Belmont
- (1996) Neuro-dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

6
- 0000409272
- Reinforcement learning methods for continuous-time Markov decision problems
- Bradtke S.J., and Duff M.O. Reinforcement learning methods for continuous-time Markov decision problems. Advances in Neural Information Processing Systems 7 (1995)
- (1995) Advances in Neural Information Processing Systems , vol.7
- Bradtke, S.J.¹ Duff, M.O.²

7
- 0038631988
- Semi-Markov decision problems and performance sensitivity analysis
- Cao X.-R. Semi-Markov decision problems and performance sensitivity analysis. IEEE Transactions on Automatic Control 48 5 (2003) 758-769
- (2003) IEEE Transactions on Automatic Control , vol.48 , Issue.5 , pp. 758-769
- Cao, X.-R.¹

8
- 0003745958
- Prentice-Hall, Englewood Cliffs, NJ
- Cinlar E. Introduction to Stochastic Processes (1974), Prentice-Hall, Englewood Cliffs, NJ
- (1974) Introduction to Stochastic Processes
- Cinlar, E.¹

9
- 33751087288
- Springer, New York
- Comaniciu C., Mandayam N.B., and Poor H.V. Wireless Networks: Multiuser Detection in Cross-Layer Design (2005), Springer, New York
- (2005) Wireless Networks: Multiuser Detection in Cross-Layer Design
- Comaniciu, C.¹ Mandayam, N.B.² Poor, H.V.³

10
- 0742319170
- Reinforcement learning for long-run average cost
- Gosavi A. Reinforcement learning for long-run average cost. European Journal of Operational Research 155 (2004) 654-674
- (2004) European Journal of Operational Research , vol.155 , pp. 654-674
- Gosavi, A.¹

11
- 0033876565
- Call admission control and routing in integrated services networks using neuro-dynamic programming
- Marbach P., Mihatsch O., and Tsitsiklis J.N. Call admission control and routing in integrated services networks using neuro-dynamic programming. IEEE JSAC 18 2 (2000) 197-208
- (2000) IEEE JSAC , vol.18 , Issue.2 , pp. 197-208
- Marbach, P.¹ Mihatsch, O.² Tsitsiklis, J.N.³

12
- 0003637131
- Springer-Verlag, London
- Meyn S.P., and Tweedie R.L. Markov Chains and Stochastic Stability (1993), Springer-Verlag, London
- (1993) Markov Chains and Stochastic Stability
- Meyn, S.P.¹ Tweedie, R.L.²

13
- 0003407086
- Springer-Verlag, London
- Ross K.W. Multiservice Loss Models for Broadband Telecommunication Networks (1995), Springer-Verlag, London
- (1995) Multiservice Loss Models for Broadband Telecommunication Networks
- Ross, K.W.¹

14
- 0036611716
- Integrated voice/data call admission control for wireless DS-CDMA systems with fading
- Singh S., Krishnamurthy V., and Poor H.V. Integrated voice/data call admission control for wireless DS-CDMA systems with fading. IEEE Transactions on Signal Processing 50 6 (2002) 1483-1495
- (2002) IEEE Transactions on Signal Processing , vol.50 , Issue.6 , pp. 1483-1495
- Singh, S.¹ Krishnamurthy, V.² Poor, H.V.³

15
- 84898939480
- Policy gradient methods for reinforcement learning with function approximation
- Sutton R.S., McAllester D., Singh S., and Mansour Y. Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems 12 (2000) 1057-1063
- (2000) Advances in Neural Information Processing Systems , vol.12 , pp. 1057-1063
- Sutton, R.S.¹ McAllester, D.² Singh, S.³ Mansour, Y.⁴

16
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton R.S., Precup D., and Singh S.P. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112 (1999) 181-211
- (1999) Artificial Intelligence , vol.112 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.P.³

17
- 0035283402
- On the convergence of temporal-difference learning with linear function approximation
- Tadić V. On the convergence of temporal-difference learning with linear function approximation. Machine Learning 42 3 (2001) 241-267
- (2001) Machine Learning , vol.42 , Issue.3 , pp. 241-267
- Tadić, V.¹

18
- 0003636741
- John Wiley & Sons, Chichester
- Tijms H.C. Stochastic Models: An Algorithmic Approach (1994), John Wiley & Sons, Chichester
- (1994) Stochastic Models: An Algorithmic Approach
- Tijms, H.C.¹

19
- 0142199953
- V. Tadić, A. Doucet, Two time-scale stochastic approximation for constrained stochastic optimization and constrained Markov decision problems, in: Proceedings of ACC, 2003.

20
- 0142231039
- V. Tadić, S.P. Meyn, Asymptotic properties of two time-scale stochastic approximation algorithms with constant step sizes, in: Proceedings of ACC, 2003.

21
- 33751101419
- Intelligent Techniques in High Speed Networks. IEEE Journal on Selected Areas in Communications 18 2 (2000). (special issue). Available from:
- (2000) IEEE Journal on Selected Areas in Communications , vol.18 , Issue.2

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.