SCOPUS Information Search Platform

Volume 1, 2005, Pages 81-88

Efficient exploration with latent structure

Leffler, Bethany R.; Littman, Michael L.; Strehl, Alexander L.; Walsh, Thomas J.

Author keywords

[No Author keywords available]

Indexed keywords

EID: 73549099066 PISSN: 23307668 EISSN: 2330765X Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (4)

References (22)

1
- 0141988716
- Recent advances in hierarchical reinforcement learning
- Special Issue on Reinforcement Learning
- [Barto and Mahadevan, 2003] Andrew Barto and Sridhar Mahadevan. Recent advances in hierarchical reinforcement learning. Discrete Event Systems, 13 (Special Issue on Reinforcement Learning):41-77, 2003.
- (2003) Discrete Event Systems , vol.13 , pp. 41-77
- Barto, A.¹ Mahadevan, S.²

2
- 0004181906
- Chapman and Hall, London, UK
- [Berry and Fristedt, 1985] Donald A. Berry and Bert Fristedt. Bandit Problems: Sequential Allocation of Experiments. Chapman and Hall, London, UK, 1985.
- (1985) Bandit Problems: Sequential Allocation of Experiments
- Berry, D.A.¹ Fristedt, B.²

3
- 0041965975
- R-MAX - A general polynomial time algorithm for near-optimal reinforcement learning
- [Brafman and Tennenholtz, 2002] Ronen I. Brafman and Moshe Tennenholtz. R-MAX - A general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3:213-231, 2002.
- (2002) Journal of Machine Learning Research , vol.3 , pp. 213-231
- Brafman, R.I.¹ Tennenholtz, M.²

4
- 0001909869
- Incremental Pruning: A simple, fast, exact method for partially observable Markov decision processes
- San Francisco, CA Morgan Kaufmann Publishers
- [Cassandra et al., 1997] Anthony Cassandra, Michael L. Littman, and Nevin L. Zhang. Incremental Pruning: A simple, fast, exact method for partially observable Markov decision processes. In Proceedings of the Thirteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-97), pages 54-61, San Francisco, CA, 1997. Morgan Kaufmann Publishers.
- (1997) Proceedings of the Thirteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-97) , pp. 54-61
- Cassandra, A.¹ Littman, M.L.² Zhang, N.L.³

5
- 1142281527
- Model-based Bayesian exploration
- [Dearden et al., 1999] Richard Dearden, Nir Friedman, and David Andre. Model-based Bayesian exploration. In Proceedings of the 15th Annual Conference on Uncertainty in Artificial Intelligence (UAI-99), pages 150-159, 1999.
- (1999) Proceedings of the 15th Annual Conference on Uncertainty in Artificial Intelligence (UAI-99) , pp. 150-159
- Dearden, R.¹ Friedman, N.² Andre, D.³

6
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- [Dempster et al., 1977] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1-38, 1977.
- (1977) Journal of the Royal Statistical Society , vol.39 , Issue.1 , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

7
- 84937398609
- PAC bounds for multi-armed bandit and Markov decision processes
- [Even-Dar et al., 2002] Eyal Even-Dar, Shie Mannor, and Yishay Mansour. PAC bounds for multi-armed bandit and Markov decision processes. In 15th Annual Conference on Computational Learning Theory (COLT), pages 255-270, 2002.
- (2002) 15th Annual Conference on Computational Learning Theory (COLT) , pp. 255-270
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

8
- 1942421149
- Action elimination and stopping conditions for reinforcement learning
- [Even-Dar et al., 2003] Eyal Even-Dar, Shie Mannor, and Yishay Mansour. Action elimination and stopping conditions for reinforcement learning. In The Twentieth International Conference on Machine Learning (ICML 2003), pages 162-169, 2003.
- (2003) The Twentieth International Conference on Machine Learning (ICML 2003) , pp. 162-169
- Even-Dar, E.¹ Mannor, S.² Mansour, Y.³

9
- 78649499440
- Efficient reinforcement learning
- Association of Computing Machinery
- [Fiechter, 1994] Claude-Nicolas Fiechter. Efficient reinforcement learning. In Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory, pages 88-97. Association of Computing Machinery, 1994.
- (1994) Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory , pp. 88-97
- Fiechter, C.¹

10
- 78650606637
- A quantitative study of hypothesis selection
- [Fong, 1995] Philip W. L. Fong. A quantitative study of hypothesis selection. In Proceedings of the Twelfth International Conference on Machine Learning (ICML-95), pages 226-234, 1995.
- (1995) Proceedings of the Twelfth International Conference on Machine Learning (ICML-95) , pp. 226-234
- Fong, P.W.L.¹

11
- 84891584370
- Wiley-Interscience series in systems and optimization. Wiley, Chichester, NY
- [Gittins, 1989] J. C. Gittins. Multi-Armed Bandit Allocation Indices. Wiley-Interscience series in systems and optimization. Wiley, Chichester, NY, 1989.
- (1989) Multi-Armed Bandit Allocation Indices
- Gittins, J.C.¹

12
- 0004280606
- The MIT Press, Cambridge, MA
- [Kaelbling, 1993] Leslie Pack Kaelbling. Learning in Embedded Systems. The MIT Press, Cambridge, MA, 1993.
- (1993) Learning in Embedded Systems
- Kaelbling, L.P.¹

13
- 84880677563
- Efficient reinforcement learning in factored MDPs
- [Kearns and Koller, 1999] Michael J. Kearns and Daphne Koller. Efficient reinforcement learning in factored MDPs. In Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI), pages 740-747, 1999.
- (1999) Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI) , pp. 740-747
- Kearns, M.J.¹ Koller, D.²

14
- 0036832954
- Near-optimal reinforcement learning in polynomial time
- [Kearns and Singh, 2002] Michael J. Kearns and Satinder P. Singh. Near-optimal reinforcement learning in polynomial time. Machine Learning, 49(2-3):209-232, 2002.
- (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 209-232
- Kearns, M.J.¹ Singh, S.P.²

15
- 30044441333
- The sample complexity of exploration in the multi-armed bandit problem
- [Mannor and Tsitsiklis, 2004] Shie Mannor and John N. Tsitsiklis. The sample complexity of exploration in the multi-armed bandit problem. Journal of Artificial Intelligence Research, 5:623-648, 2004.
- (2004) Journal of Artificial Intelligence Research , vol.5 , pp. 623-648
- Mannor, S.¹ Tsitsiklis, J.N.²

16
- 0003998452
- John Wiley & Sons, Inc., New York, NY
- [Puterman, 1994] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York, NY, 1994.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

17
- 29344469131
- Improving action selection in MDP's via knowledge transfer
- July
- [Sherstov and Stone, 2005] Alexander A. Sherstov and Peter Stone. Improving action selection in MDP's via knowledge transfer. In Proceedings of the Twentieth National Conference on Artificial Intelligence, July 2005.
- (2005) Proceedings of the Twentieth National Conference on Artificial Intelligence
- Sherstov, A.A.¹ Stone, P.²

18
- 16244391087
- An empirical evaluation of interval estimation for Markov decision processes
- [Strehl and Littman, 2004] Alexander L. Strehl and Michael L. Littman. An empirical evaluation of interval estimation for Markov decision processes. In The 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI-2004), pages 128-135, 2004.
- (2004) The 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI-2004) , pp. 128-135
- Strehl, A.L.¹ Littman, M.L.²

19
- 0004102479
- The MIT Press
- [Sutton and Barto, 1998] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. The MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

20
- 0002210775
- The role of exploration in learning control
- In David A. White and Donald A. Sofge, editors Van Nostrand Reinhold, New York, NY
- [Thrun, 1992] Sebastian B. Thrun. The role of exploration in learning control. In David A. White and Donald A. Sofge, editors, Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, pages 527-559. Van Nostrand Reinhold, New York, NY, 1992.
- (1992) Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches , pp. 527-559
- Thrun, S.B.¹

21
- 16244368573
- Technical Report HPL-2003-97R1, Hewlett-Packard Labs
- [Weissman et al., 2003] Tsachy Weissman, Erik Ordentlich, Gadiel Seroussi, Sergio Verdu, and Marcelo J. Weinberger. Inequalities for the L1 deviation of the empirical distribution. Technical Report HPL-2003-97R1, Hewlett-Packard Labs, 2003.
- (2003) Inequalities for the L1 Deviation of the Empirical Distribution
- Weissman, T.¹ Ordentlich, E.² Seroussi, G.³ Verdu, S.⁴ Weinberger, M.J.⁵

22
- 0345161973
- Efficient model-based exploration
- [Wiering and Schmidhuber, 1998] Marco Wiering and Jürgen Schmidhuber. Efficient model-based exploration. In Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior (SAB'98), pages 223-228, 1998.
- (1998) Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior (SAB'98) , pp. 223-228
- Wiering, M.¹ Schmidhuber, J.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.