SCOPUS Information Search Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 5163 LNCS, Issue PART 1, 2008, Pages 357-366

Multigrid reinforcement learning with reward shaping

Authors (2): Grześ, Marek (a); Kudenko, Daniel (a)

(a) UNIVERSITY OF YORK (United Kingdom)

Author keywords

[No Author keywords available]

Indexed keywords

BACKGROUND KNOWLEDGE; CONVERGENCE RATES; DISCRETISATION; LEARNING AGENTS; LOWER RESOLUTION; MULTI-GRID; NOVEL ALGORITHM; POTENTIAL FUNCTION; Q-FUNCTIONS; REINFORCEMENT LEARNING AGENT; STATE SPACE; TEMPORAL DIFFERENCE LEARNING;

ALGORITHMS; BACKPROPAGATION; EDUCATION; INTELLIGENT AGENTS; NEURAL NETWORKS; REINFORCEMENT LEARNING;

REINFORCEMENT;

EID: 58849111871; PISSN: 03029743; EISSN: 16113349; Source Type: Book Series
DOI: 10.1007/978-3-540-87536-9_37; Document Type: Conference Paper

Times cited: 28

References (18)

1
- 0141596576
- Policy invariance under reward transformations: Theory and application to reward shaping
- Ng, A.Y., Harada, D., Russell, S.J.: Policy invariance under reward transformations: Theory and application to reward shaping. In: Proceedings of the 16th International Conference on Machine Learning, pp. 278-287 (1999)
- (1999) Proceedings of the 16th International Conference on Machine Learning, pp. 278-287
- Ng, A.Y.¹ Harada, D.² Russell, S.J.³

2
- 1642401055
- Learning to drive a bicycle using reinforcement learning and shaping
- Randlov, J., Alstrom, P.: Learning to drive a bicycle using reinforcement learning and shaping. In: Proceedings of the 15th International Conference on Machine Learning, pp. 463-471 (1998)
- (1998) Proceedings of the 15th International Conference on Machine Learning, pp. 463-471
- Randlov, J.¹ Alstrom, P.²

3
- 34547964974
- Automatic shaping and decomposition of reward functions
- Marthi, B.: Automatic shaping and decomposition of reward functions. In: Proceedings of the 24th International Conference on Machine Learning, pp. 601-608 (2007)
- (2007) Proceedings of the 24th International Conference on Machine Learning, pp. 601-608
- Marthi, B.¹

4
- 0004102479
- MIT Press, Cambridge
- Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

5
- 0346942368
- Decision-theoretic planning: Structural assumptions and computational leverage
- Boutilier, C., Dean, T., Hanks, S.: Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research 11, 1-94 (1999)
- (1999) Journal of Artificial Intelligence Research, vol. 11, pp. 1-94
- Boutilier, C.¹ Dean, T.² Hanks, S.³

6
- 0026206780
- An optimal one-way multigrid algorithm for discrete-time stochastic control
- Chow, C.S., Tsitsiklis, J.N.: An optimal one-way multigrid algorithm for discrete-time stochastic control. IEEE Transactions on Automatic Control 36(8), 898-914 (1991)
- (1991) IEEE Transactions on Automatic Control, vol. 36, issue 8, pp. 898-914
- Chow, C.S.¹ Tsitsiklis, J.N.²

7
- 0042353224
- Multigrid Q-learning
- Technical Report CS-94-121, Colorado State University
- Anderson, C., Crawford-Hines, S.: Multigrid Q-learning. Technical Report CS-94-121, Colorado State University (1994)
- (1994)
- Anderson, C.¹ Crawford-Hines, S.²

8
- 0033170372
- Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton, R.S., Precup, D., Singh, S.P.: Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112(1-2), 181-211 (1999)
- (1999) Artificial Intelligence, vol. 112, issue 1-2, pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.P.³

9
- 21844451909
- Prioritization methods for accelerating MDP solvers
- Wingate, D., Seppi, K.D.: Prioritization methods for accelerating MDP solvers. Journal of Machine Learning Research 6, 851-881 (2005)
- (2005) Journal of Machine Learning Research, vol. 6, pp. 851-881
- Wingate, D.¹ Seppi, K.D.²

10
- 34250717446
- Qualitative reinforcement learning
- Epshteyn, A., DeJong, G.: Qualitative reinforcement learning. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 305-312 (2006)

11
- 0036832953
- Variable resolution discretization in optimal control
- Munos, R., Moore, A.: Variable resolution discretization in optimal control. Machine Learning 49(2-3), 291-323 (2002)
- (2002) Machine Learning, vol. 49, issue 2-3, pp. 291-323
- Munos, R.¹ Moore, A.²

12
- 0141642669
- Layered learning
- Stone, P., Veloso, M.: Layered learning. In: Proceedings of the 11th European Conference on Machine Learning (2000)
- (2000) Proceedings of the 11th European Conference on Machine Learning
- Stone, P.¹ Veloso, M.²

13
- 85143168613
- Hierarchical learning in stochastic domains: Preliminary results
- Kaelbling, L.P.: Hierarchical learning in stochastic domains: Preliminary results. In: Proceedings of International Conference on Machine Learning, pp. 167-173 (1993)
- (1993) Proceedings of International Conference on Machine Learning, pp. 167-173
- Kaelbling, L.P.¹

14
- 84880688141
- Multi-value-functions: Efficient automatic action hierarchies for multiple goal MDPs
- Moore, A., Baird, L., Kaelbling, L.P.: Multi-value-functions: Efficient automatic action hierarchies for multiple goal MDPs. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 1316-1323 (1999)
- (1999) Proceedings of the International Joint Conference on Artificial Intelligence, pp. 1316-1323
- Moore, A.¹ Baird, L.² Kaelbling, L.P.³

15
- 0001234682
- Feudal reinforcement learning
- Dayan, P., Hinton, G.E.: Feudal reinforcement learning. In: Proceedings of Advances in Neural Information Processing Systems (1993)
- (1993) Proceedings of Advances in Neural Information Processing Systems
- Dayan, P.¹ Hinton, G.E.²

16
- 0001070375
- Reinforcement learning with hierarchies of machines
- Parr, R., Russell, S.: Reinforcement learning with hierarchies of machines. In: Proceedings of Advances in Neural Information Processing Systems, vol. 10 (1997)
- (1997) Proceedings of Advances in Neural Information Processing Systems, vol. 10
- Parr, R.¹ Russell, S.²

17
- 0002278788
- Hierarchical reinforcement learning with the MAXQ value function decomposition
- Dietterich, T.G.: Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research 13, 227-303 (2000)
- (2000) Journal of Artificial Intelligence Research, vol. 13, pp. 227-303
- Dietterich, T.G.¹

18
- 33644807975
- Behavior transfer for value-function-based reinforcement learning
- Taylor, M.E., Stone, P.: Behavior transfer for value-function-based reinforcement learning. In: Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 53-59 (2005)
- (2005) Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 53-59
- Taylor, M.E.¹ Stone, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.