



Volume , Issue , 2008, Pages 432-439

Hierarchical model-based reinforcement learning: R-MAX + MAXQ

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; EDUCATION; HIERARCHICAL SYSTEMS; LEARNING SYSTEMS; REINFORCEMENT; REINFORCEMENT LEARNING; ROBOT LEARNING; CONVERGENCE OF NUMERICAL METHODS;

EID: 56449090073     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 43

References (14)
  • 1
    • Barto, A. G., & Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete-Event Systems, 13, 41-77. Special Issue on Reinforcement Learning.
  • 3
    • Brafman, R. I., & Tennenholtz, M. (2002). R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
  • 4
    • Dietterich, T. G. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13, 227-303.
  • 9
    • Moore, A. W., & Atkeson, C. G. (1993). Prioritized sweeping: Reinforcement learning with less data and less real time. Machine Learning, 13, 103-130.
  • 14
    • Sutton, R. S., Precup, D., & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112, 181-211.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.