SCOPUS Information Retrieval Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 6911 LNAI, Issue PART 1, 2011, Pages 487-502

Lagrange dual decomposition for finite horizon Markov decision processes

Furmston, Thomas (a); Barber, David (a)

(a) University College London (United Kingdom)

Author keywords

Lagrange Duality; Markov Decision Processes; Planning

Indexed keywords

CONVERGENT ALGORITHMS; DUAL DECOMPOSITION; EMPIRICAL PERFORMANCE; EXPECTATION-MAXIMISATION; FINITE-HORIZON MARKOV DECISION PROCESS; HARD PROBLEMS; LAGRANGE DUAL; LAGRANGE DUALITY; MARKOV DECISION PROCESSES; NONSTATIONARY; PLANNING ALGORITHMS; POLICY GRADIENT; STATIONARY POLICY; SUB-PROBLEMS; LAGRANGE DUAL DECOMPOSITIONS;

LAGRANGE MULTIPLIERS; LEARNING SYSTEMS; MARKOV PROCESSES; DATA MINING; MACHINE LEARNING; PLANNING;

LEARNING ALGORITHMS; BEHAVIORAL RESEARCH;

EID: 80052418186 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-642-23780-5_41 Document Type: Conference Paper

Times cited: 5

References (20)

1
- 0004102479
- MIT Press, Cambridge
- Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

2
- 52949118902
- A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence
- Vlassis, N.: A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence. Synthesis Lectures on Artificial Intelligence and Machine Learning 1(1), 1-71 (2007)
- (2007) Synthesis Lectures on Artificial Intelligence and Machine Learning , vol.1 , Issue.1 , pp. 1-71
- Vlassis, N.¹

3
- 0003565783
- 2nd edn. Athena Scientific, Belmont
- Bertsekas, D.P.: Dynamic Programming and Optimal Control, 2nd edn. Athena Scientific, Belmont (2000)
- (2000) Dynamic Programming and Optimal Control
- Bertsekas, D.P.¹

4
- 0024038570
- Probabilistic Inference and Influence Diagrams
- Shachter, R.D.: Probabilistic Inference and Influence Diagrams. Operations Research 36, 589-604 (1988)
- (1988) Operations Research , vol.36 , pp. 589-604
- Shachter, R.D.¹

5
- 0000337576
- Simple Statistical Gradient Following Algorithms for Connectionist Reinforcement Learning
- Williams, R.: Simple Statistical Gradient Following Algorithms for Connectionist Reinforcement Learning. Machine Learning 8, 229-256 (1992)
- (1992) Machine Learning , vol.8 , pp. 229-256
- Williams, R.¹

6
- 80052420186
- Bayesian Time Series Models
- Cambridge University, Cambridge in press userpage.fu-berlin.de/~mtoussai
- Toussaint, M., Storkey, A., Harmeling, S.: Bayesian Time Series Models. In: Expectation-Maximization Methods for Solving (PO)MDPs and Optimal Control Problems, Cambridge University, Cambridge (in press 2011), userpage.fu-berlin.de/~mtoussai
- (2011) Expectation-Maximization Methods for Solving (PO)MDPs and Optimal Control Problems
- Toussaint, M.¹ Storkey, A.² Harmeling, S.³

7
- 80053139999
- Efficient Inference in Markov Control Problems
- North-Holland, Amsterdam
- Furmston, T., Barber, D.: Efficient Inference in Markov Control Problems. In: Uncertainty in Artificial Intelligence. North-Holland, Amsterdam (2011)
- (2011) Uncertainty in Artificial Intelligence
- Furmston, T.¹ Barber, D.²

8
- 80052423737
- Research Report RN/11/13, Centre for Computational Statistics and Machine Learning, University College London
- Furmston, T., Barber, D.: An analysis of the Expectation Maximisation algorithm for Markov Decision Processes. Research Report RN/11/13, Centre for Computational Statistics and Machine Learning, University College London (2011)
- (2011) An Analysis of the Expectation Maximisation Algorithm for Markov Decision Processes
- Furmston, T.¹ Barber, D.²

9
- 0003713964
- 2nd edn. Athena Scientific, Belmont
- Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)
- (1999) Nonlinear Programming
- Bertsekas, D.P.¹

10
- 79957829592
- Introduction to Dual Decomposition for Inference
- Sra, S., Nowozin, S., Wright, S. (eds.) MIT Press, Cambridge
- Sontag, D., Globerson, A., Jaakkola, T.: Introduction to Dual Decomposition for Inference. In: Sra, S., Nowozin, S., Wright, S. (eds.) Optimisation for Machine Learning, MIT Press, Cambridge (2011)
- (2011) Optimisation for Machine Learning
- Sontag, D.¹ Globerson, A.² Jaakkola, T.³

11
- 84862273812
- Variational Methods for Reinforcement Learning
- Furmston, T., Barber, D.: Variational Methods for Reinforcement Learning. AISTATS 9(13), 241-248 (2010)
- (2010) AISTATS , vol.9 , Issue.13 , pp. 241-248
- Furmston, T.¹ Barber, D.²

12
- 0004055894
- Cambridge University Press, Cambridge
- Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
- (2004) Convex Optimization
- Boyd, S.¹ Vandenberghe, L.²

13
- 50649087920
- MRF Optimization via Dual Decomposition: Message-Passing Revisited
- Komodakis, N., Paragios, N., Tziritas, G.: MRF Optimization via Dual Decomposition: Message-Passing Revisited. In: IEEE 11th International Conference on Computer Vision, ICCV, pp. 1-8 (2007)
- (2007) IEEE 11th International Conference on Computer Vision, ICCV , pp. 1-8
- Komodakis, N.¹ Paragios, N.² Tziritas, G.³

14
- 0031619316
- Bayesian Q learning
- Dearden, R., Friedman, N., Russell, S.: Bayesian Q learning. AAAI 15, 761-768 (1998)
- (1998) AAAI , vol.15 , pp. 761-768
- Dearden, R.¹ Friedman, N.² Russell, S.³

15
- 85156221438
- Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
- Sutton, R.: Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding. NIPS (8), 1038-1044 (1996)
- (1996) NIPS , Issue.8 , pp. 1038-1044
- Sutton, R.¹

16
- 70350090880
- Bayesian Policy Learning with Trans-Dimensional MCMC
- Hoffman, M., Doucet, A., De Freitas, N., Jasra, A.: Bayesian Policy Learning with Trans-Dimensional MCMC. NIPS (20), 665-672 (2008)
- (2008) NIPS , Issue.20 , pp. 665-672
- Hoffman, M.¹ Doucet, A.² De Freitas, N.³ Jasra, A.⁴

17
- 84862277035
- An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards
- Hoffman, M., De Freitas, N., Doucet, A., Peters, J.: An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards. AISTATS 5(12), 232-239 (2009)
- (2009) AISTATS , vol.5 , Issue.12 , pp. 232-239
- Hoffman, M.¹ De Freitas, N.² Doucet, A.³ Peters, J.⁴

18
- 1942420675
- Optimization with EM and Expectation-Conjugate-Gradient
- Salakhutdinov, R., Roweis, S., Ghahramani, Z.: Optimization with EM and Expectation-Conjugate-Gradient. ICML (20), 672-679 (2003)
- (2003) ICML , Issue.20 , pp. 672-679
- Salakhutdinov, R.¹ Roweis, S.² Ghahramani, Z.³

19
- 80052422112
- Research Report EDI-INF-RR-0934, University of Washington
- Fraley, C.: On Computing the Largest Fraction of Missing Information for the EM Algorithm and the Worst Linear Function for Data Augmentation. Research Report EDI-INF-RR-0934, University of Washington (1999)
- (1999) On Computing the Largest Fraction of Missing Information for the EM Algorithm and the Worst Linear Function for Data Augmentation
- Fraley, C.¹

20
- 77956228214
- Cambridge University Press, Cambridge
- Barber, D.: Bayesian Reasoning and Machine Learning. Cambridge University Press, Cambridge (2011)
- (2011) Bayesian Reasoning and Machine Learning
- Barber, D.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.