Volume 6911 LNAI, Issue PART 1, 2011, Pages 487-502

Lagrange dual decomposition for finite horizon Markov decision processes

Author keywords

Lagrange Duality; Markov Decision Processes; Planning

Indexed keywords

CONVERGENT ALGORITHMS; DUAL DECOMPOSITION; EMPIRICAL PERFORMANCE; EXPECTATION-MAXIMISATION; FINITE-HORIZON MARKOV DECISION PROCESS; HARD PROBLEMS; LAGRANGE DUAL; LAGRANGE DUALITY; MARKOV DECISION PROCESSES; NONSTATIONARY; PLANNING ALGORITHMS; POLICY GRADIENT; STATIONARY POLICY; SUB-PROBLEMS; LAGRANGE DUAL DECOMPOSITIONS;

EID: 80052418186     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-642-23780-5_41     Document Type: Conference Paper
Times cited: 5

References (20)
  • 2
    • 52949118902
    • A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence
    • Vlassis, N.: A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence. Synthesis Lectures on Artificial Intelligence and Machine Learning 1(1), 1-71 (2007)
    • (2007) Synthesis Lectures on Artificial Intelligence and Machine Learning , vol.1 , Issue.1 , pp. 1-71
    • Vlassis, N.1
  • 4
    • 0024038570
    • Probabilistic Inference and Influence Diagrams
    • Shachter, R.D.: Probabilistic Inference and Influence Diagrams. Operations Research 36, 589-604 (1988)
    • (1988) Operations Research , vol.36 , pp. 589-604
    • Shachter, R.D.1
  • 5
    • 0000337576
    • Simple Statistical Gradient Following Algorithms for Connectionist Reinforcement Learning
    • Williams, R.: Simple Statistical Gradient Following Algorithms for Connectionist Reinforcement Learning. Machine Learning 8, 229-256 (1992)
    • (1992) Machine Learning , vol.8 , pp. 229-256
    • Williams, R.1
  • 7
  • 10
    • 79957829592
    • Introduction to Dual Decomposition for Inference
    • Sra, S., Nowozin, S., Wright, S. (eds.) MIT Press, Cambridge
    • Sontag, D., Globerson, A., Jaakkola, T.: Introduction to Dual Decomposition for Inference. In: Sra, S., Nowozin, S., Wright, S. (eds.) Optimization for Machine Learning, MIT Press, Cambridge (2011)
    • (2011) Optimization for Machine Learning
    • Sontag, D.1    Globerson, A.2    Jaakkola, T.3
  • 11
    • 84862273812
    • Variational Methods for Reinforcement Learning
    • Furmston, T., Barber, D.: Variational Methods for Reinforcement Learning. AISTATS 9(13), 241-248 (2010)
    • (2010) AISTATS , vol.9 , Issue.13 , pp. 241-248
    • Furmston, T.1    Barber, D.2
  • 15
    • 85156221438
    • Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
    • Sutton, R.: Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding. NIPS (8), 1038-1044 (1996)
    • (1996) NIPS , Issue.8 , pp. 1038-1044
    • Sutton, R.1
  • 16
    • 70350090880
    • Bayesian Policy Learning with Trans-Dimensional MCMC
    • Hoffman, M., Doucet, A., De Freitas, N., Jasra, A.: Bayesian Policy Learning with Trans-Dimensional MCMC. NIPS (20), 665-672 (2008)
    • (2008) NIPS , Issue.20 , pp. 665-672
    • Hoffman, M.1    Doucet, A.2    De Freitas, N.3    Jasra, A.4
  • 17
    • 84862277035
    • An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards
    • Hoffman, M., De Freitas, N., Doucet, A., Peters, J.: An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Rewards. AISTATS 5(12), 232-239 (2009)
    • (2009) AISTATS , vol.5 , Issue.12 , pp. 232-239
    • Hoffman, M.1    De Freitas, N.2    Doucet, A.3    Peters, J.4
  • 18
    • 1942420675
    • Optimization with EM and Expectation-Conjugate-Gradient
    • Salakhutdinov, R., Roweis, S., Ghahramani, Z.: Optimization with EM and Expectation-Conjugate-Gradient. ICML (20), 672-679 (2003)
    • (2003) ICML , Issue.20 , pp. 672-679
    • Salakhutdinov, R.1    Roweis, S.2    Ghahramani, Z.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.