SCOPUS Information Retrieval Platform

Proceedings of the 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007

2007, Pages 103-110

Model-based reinforcement learning in factored-state MDPs

(1) Strehl, Alexander L. (a)

(a) Rutgers University (United States)

Author keywords

[No Author keywords available]

Indexed keywords

DATA STRUCTURES; LEARNING ALGORITHMS; OPTIMAL CONTROL SYSTEMS; POLYNOMIAL APPROXIMATION; PROBLEM SOLVING;

FACTORED RMAX; INTERVAL ESTIMATION;

REINFORCEMENT LEARNING;

EID: 34548763246
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/ADPRL.2007.368176
Document Type: Conference Paper

Times cited: 25

References (21)

1
- 0036832954
- Near-optimal reinforcement learning in polynomial time
- M. J. Kearns and S. P. Singh, "Near-optimal reinforcement learning in polynomial time," Machine Learning, vol. 49, no. 2-3, pp. 209-232, 2002.
- (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 209-232
- Kearns, M.J.¹ Singh, S.P.²

2
- 0041965975
- R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning
- R. I. Brafman and M. Tennenholtz, "R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning," Journal of Machine Learning Research, vol. 3, pp. 213-231, 2002.
- (2002) Journal of Machine Learning Research , vol.3 , pp. 213-231
- Brafman, R.I.¹ Tennenholtz, M.²

3
- 23244466805
- Ph.D. dissertation, Gatsby Computational Neuroscience Unit, University College London
- S. M. Kakade, "On the sample complexity of reinforcement learning," Ph.D. dissertation, Gatsby Computational Neuroscience Unit, University College London, 2003.
- (2003) On the sample complexity of reinforcement learning
- Kakade, S.M.¹

4
- 0004280606
- Cambridge, MA: The MIT Press
- L. P. Kaelbling, Learning in Embedded Systems. Cambridge, MA: The MIT Press, 1993.
- (1993) Learning in Embedded Systems
- Kaelbling, L.P.¹

5
- 0345161973
- Efficient model-based exploration
- M. Wiering and J. Schmidhuber, "Efficient model-based exploration," in Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior (SAB'98), 1998, pp. 223-228.
- (1998) Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior (SAB'98) , pp. 223-228
- Wiering, M.¹ Schmidhuber, J.²

6
- 31844432138
- A theoretical analysis of model-based interval estimation
- A. L. Strehl and M. L. Littman, "A theoretical analysis of model-based interval estimation," in Proceedings of the Twenty-second International Conference on Machine Learning (ICML-05), 2005, pp. 857-864.
- (2005) Proceedings of the Twenty-second International Conference on Machine Learning (ICML-05) , pp. 857-864
- Strehl, A.L.¹ Littman, M.L.²

7
- 16244391087
- An empirical evaluation of interval estimation for Markov decision processes
- A. L. Strehl and M. L. Littman, "An empirical evaluation of interval estimation for Markov decision processes," in The 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI-2004), 2004, pp. 128-135.
- (2004) The 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI-2004) , pp. 128-135
- Strehl, A.L.¹ Littman, M.L.²

8
- 79961182095
- An analysis of model-based interval estimation for Markov decision processes
- in press
- A. L. Strehl and M. L. Littman, "An analysis of model-based interval estimation for Markov decision processes," Journal of Computer and System Sciences, in press.
- Journal of Computer and System Sciences
- Strehl, A.L.¹ Littman, M.L.²

9
- 84880688552
- Computing factored value functions for policies in structured MDPs
- The AAAI Press/The MIT Press
- D. Koller and R. Parr, "Computing factored value functions for policies in structured MDPs," in Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence. The AAAI Press/The MIT Press, 1999, pp. 1332-1339.
- (1999) Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence , pp. 1332-1339
- Koller, D.¹ Parr, R.²

10
- 0346942368
- Decision-theoretic planning: Structural assumptions and computational leverage
- C. Boutilier, T. Dean, and S. Hanks, "Decision-theoretic planning: Structural assumptions and computational leverage," Journal of Artificial Intelligence Research, vol. 11, pp. 1-94, 1999.
- (1999) Journal of Artificial Intelligence Research , vol.11 , pp. 1-94
- Boutilier, C.¹ Dean, T.² Hanks, S.³

11
- 84880677563
- Efficient reinforcement learning in factored MDPs
- M. J. Kearns and D. Koller, "Efficient reinforcement learning in factored MDPs," in Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI), 1999, pp. 740-747.
- (1999) Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI) , pp. 740-747
- Kearns, M.J.¹ Koller, D.²

12
- 33749245414
- Algorithm-directed exploration for model-based reinforcement learning in factored MDPs
- C. Guestrin, R. Patrascu, and D. Schuurmans, "Algorithm-directed exploration for model-based reinforcement learning in factored MDPs," in Proceedings of the International Conference on Machine Learning, 2002, pp. 235-242.
- (2002) Proceedings of the International Conference on Machine Learning , pp. 235-242
- Guestrin, C.¹ Patrascu, R.² Schuurmans, D.³

13
- 0004102479
- The MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. The MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

14
- 0000675721
- Context-specific independence in Bayesian networks
- Portland, OR
- C. Boutilier, N. Friedman, M. Goldszmidt, and D. Koller, "Context-specific independence in Bayesian networks," in Proceedings of the Twelfth Annual Conference on Uncertainty in Artificial Intelligence (UAI 96), Portland, OR, 1996, pp. 115-123.
- (1996) Proceedings of the Twelfth Annual Conference on Uncertainty in Artificial Intelligence (UAI 96) , pp. 115-123
- Boutilier, C.¹ Friedman, N.² Goldszmidt, M.³ Koller, D.⁴

15
- 0021518106
- A theory of the learnable
- November
- L. G. Valiant, "A theory of the learnable," Communications of the ACM, vol. 27, no. 11, pp. 1134-1142, November 1984.
- (1984) Communications of the ACM , vol.27 , Issue.11 , pp. 1134-1142
- Valiant, L.G.¹

16
- 34548745051
- Incremental model-based learners with formal learning-time guarantees
- A. L. Strehl, L. Li, and M. L. Littman, "Incremental model-based learners with formal learning-time guarantees," in UAI-06: Proceedings of the 22nd conference on Uncertainty in Artificial Intelligence, 2006, pp. 485-493.
- (2006) UAI-06: Proceedings of the 22nd conference on Uncertainty in Artificial Intelligence , pp. 485-493
- Strehl, A.L.¹ Li, L.² Littman, M.L.³

17
- 16244368573
- Hewlett-Packard Labs, Tech. Rep. HPL-2003-97R1
- T. Weissman, E. Ordentlich, G. Seroussi, S. Verdu, and M. J. Weinberger, "Inequalities for the L1 deviation of the empirical distribution," Hewlett-Packard Labs, Tech. Rep. HPL-2003-97R1, 2003.
- (2003) Inequalities for the L1 deviation of the empirical distribution
- Weissman, T.¹ Ordentlich, E.² Seroussi, G.³ Verdu, S.⁴ Weinberger, M.J.⁵

18
- 0031369472
- Probabilistic propositional planning: Representations and complexity
- AAAI Press/The MIT Press
- M. L. Littman, "Probabilistic propositional planning: Representations and complexity," in Proceedings of the Fourteenth National Conference on Artificial Intelligence. AAAI Press/The MIT Press, 1997, pp. 748-754. [Online]. Available: http://www.cs.rutgers.edu/~mlittman/papers/aaai97-planning.ps
- (1997) Proceedings of the Fourteenth National Conference on Artificial Intelligence , pp. 748-754
- Littman, M.L.¹

19
- 11544375673
- The computational complexity of probabilistic planning
- M. L. Littman, J. Goldsmith, and M. Mundhenk, "The computational complexity of probabilistic planning," Journal of Artificial Intelligence Research, vol. 9, pp. 1-36, 1998.
- (1998) Journal of Artificial Intelligence Research , vol.9 , pp. 1-36
- Littman, M.L.¹ Goldsmith, J.² Mundhenk, M.³

20
- 85081806239
- A note on the representational incompatibility of function approximation and factored dynamics
- E. Allender, S. Arora, M. Kearns, C. Moore, and A. Russell, "A note on the representational incompatibility of function approximation and factored dynamics," in Advances in Neural Information Processing Systems (NIPS-03), 2002.
- (2002) Advances in Neural Information Processing Systems (NIPS-03)
- Allender, E.¹ Arora, S.² Kearns, M.³ Moore, C.⁴ Russell, A.⁵

21
- 33749242809
- Learning the structure of factored Markov decision processes in reinforcement learning problems
- T. Degris, O. Sigaud, and P.-H. Wuillemin, "Learning the structure of factored Markov decision processes in reinforcement learning problems," in ICML-06: Proceedings of the 23rd international conference on Machine learning, 2006, pp. 257-264.
- (2006) ICML-06: Proceedings of the 23rd international conference on Machine learning , pp. 257-264
- Degris, T.¹ Sigaud, O.² Wuillemin, P.-H.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.