SCOPUS Information Search Platform

Volume 54, Issue 3, 2005, Pages 207-213

An actor-critic algorithm for constrained Markov decision processes

Author keywords

Actor critic algorithms; Constrained Markov decision processes; Envelope theorem; Reinforcement learning; Stochastic approximation

Indexed keywords

ALGORITHMS; APPROXIMATION THEORY; DECISION THEORY; DYNAMIC PROGRAMMING; LEARNING SYSTEMS; THEOREM PROVING;

ACTOR-CRITIC ALGORITHMS; CONSTRAINED MARKOV DECISION PROCESSES; ENVELOPE THEOREM; REINFORCEMENT LEARNING; STOCHASTIC APPROXIMATION;

MARKOV PROCESSES;

EID: 13244278201 PISSN: 01676911 EISSN: None Source Type: Journal
DOI: 10.1016/j.sysconle.2004.08.007 Document Type: Article

Times cited: 211

References (17)

1
- 0003989208
- Chapman & Hall/CRC Press Boca Raton, FL
- E. Altman Constrained Markov Decision Processes 1999 Chapman & Hall/CRC Press Boca Raton, FL
- (1999) Constrained Markov Decision Processes
- Altman, E.¹

3
- 0003809134
- Birkhauser Boston
- M. Bardi, and I. Capuzzo-Dolcetta Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations 1997 Birkhauser Boston
- (1997) Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations
- Bardi, M.¹ Capuzzo-Dolcetta, I.²

4
- 0003778897
- Springer Berlin
- A. Benveniste, M. Metivier, and P. Priouret Adaptive Algorithms and Stochastic Approximations 1990 Springer Berlin
- (1990) Adaptive Algorithms and Stochastic Approximations
- Benveniste, A.¹ Metivier, M.² Priouret, P.³

5
- 0003487482
- Athena Scientific Belmont, MA
- D.P. Bertsekas, and J.N. Tsitsiklis Neuro-Dynamic Programming 1996 Athena Scientific Belmont, MA
- (1996) Neuro-dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

6
- 0031076413
- Stochastic approximation with two time scales
- V.S. Borkar Stochastic approximation with two time scales Systems Control Lett. 29 1997 291 294
- (1997) Systems Control Lett. , vol.29 , pp. 291-294
- Borkar, V.S.¹

8
- 0343893613
- Actor-critic-type learning algorithms for Markov decision processes
- V.R. Konda, and V.S. Borkar Actor-critic-type learning algorithms for Markov decision processes SIAM J. Control Optim. 38 1999 94 123
- (1999) SIAM J. Control Optim. , vol.38 , pp. 94-123
- Konda, V.R.¹ Borkar, V.S.²

9
- 4043069840
- On actor-critic algorithms
- V.R. Konda, and J.N. Tsitsiklis On actor-critic algorithms SIAM J. Control Optim. 42 2003 1143 1166
- (2003) SIAM J. Control Optim. , vol.42 , pp. 1143-1166
- Konda, V.R.¹ Tsitsiklis, J.N.²

10
- 79960013704
- A geometric approach to multi-criterion reinforcement learning
- S. Mannor, and N. Shimkin A geometric approach to multi-criterion reinforcement learning J. Mach. Learn. Res. 5 2004 325 360
- (2004) J. Mach. Learn. Res. , vol.5 , pp. 325-360
- Mannor, S.¹ Shimkin, N.²

11
- 0004235785
- Oxford University Press New York
- A. Mas-Colell, M.D. Whinston, and J.R. Green Microeconomic Theory 1995 Oxford University Press New York
- (1995) Microeconomic Theory
- Mas-Colell, A.¹ Whinston, M.D.² Green, J.R.³

12
- 0036212678
- Envelope theorems for arbitrary choice sets
- P. Milgrom, and I. Segal Envelope theorems for arbitrary choice sets Econometrica 70 2002 583 601
- (2002) Econometrica , vol.70 , pp. 583-601
- Milgrom, P.¹ Segal, I.²

13
- 0003503424
- Kluwer Academic Publishers Dordrecht
- A.B. Piunovskiy Optimal Control of Random Sequences in Problems with Constraints 1997 Kluwer Academic Publishers Dordrecht
- (1997) Optimal Control of Random Sequences in Problems with Constraints
- Piunovskiy, A.B.¹

14
- 0003998452
- Wiley New York
- M. Puterman Markov Decision Processes 1994 Wiley New York
- (1994) Markov Decision Processes
- Puterman, M.¹

15
- 0031143730
- An analysis of temporal-difference learning with function approximation
- J.N. Tsitsiklis, and B. Van Roy An analysis of temporal-difference learning with function approximation IEEE Trans. Automat. Control 42 1997 674 690
- (1997) IEEE Trans. Automat. Control , vol.42 , pp. 674-690
- Tsitsiklis, J.N.¹ Van Roy, B.²

17
- 13244262451
- Self learning control of constrained Markov decision processes - A gradient approach
- F.J. Vazquez Abad, V. Krishnamurthy, Self learning control of constrained Markov decision processes - a gradient approach, Les Cahiers du GERAD, 2003.
- (2003) Les Cahiers Du GERAD
- Vazquez Abad, F.J.¹ Krishnamurthy, V.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.