SCOPUS Information Search Platform - View Paper

Proceedings - ISDA 2006: Sixth International Conference on Intelligent Systems Design and Applications

Volume 3, 2006, Pages 177-182

Reinforcement learning with hierarchical decision-making

Cohen, Shahar (a); Maimon, Oded (a); Khmelnitsky, Evgeni (a)

(a) Tel Aviv University (Israel)

Author keywords

[No Author keywords available]

Indexed keywords

DECISION MAKING; HIERARCHICAL SYSTEMS; LEARNING ALGORITHMS; MARKOV PROCESSES;

HIERARCHICAL REINFORCEMENT LEARNING ALGORITHM; MARKOV DECISION PROCESSES; OPTIMAL POLICY;

LEARNING SYSTEMS;

EID: 34547539260 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ISDA.2006.37 Document Type: Conference Paper

Times cited: 4

References (17)

1
- 0037288370
- A.G. Barto, and S. Mahadevan, "Recent Advances in Hierarchical Reinforcement Learning", Discrete Event Dynamic Systems: Theory and Applications 13, 2003, pp. 341-379.

2
- 0029210635
- Learning to Act Using Real-Time Dynamic Programming
- A.G. Barto, S.J. Bradtke, and S.P. Singh, "Learning to Act Using Real-Time Dynamic Programming", Artificial Intelligence 72, 1995, pp. 81-138.
- (1995) Artificial Intelligence , vol.72 , pp. 81-138
- Barto, A.G.¹ Bradtke, S.J.² Singh, S.P.³

3
- 85012688561
- Princeton University Press
- R. Bellman, Dynamic Programming, Princeton University Press, 1957.
- (1957) Dynamic Programming
- Bellman, R.¹

4
- 0003487482
- Athena Scientific, Belmont, Massachusetts
- D.P. Bertsekas, and J.N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, Belmont, Massachusetts, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

5
- 0001234682
- Feudal Reinforcement Learning
- P. Dayan, and G.E. Hinton, "Feudal Reinforcement Learning", Proceedings of Advances in Neural Information Processing Systems 5, 1993, pp. 271-278.
- (1993) Proceedings of Advances in Neural Information Processing Systems , vol.5 , pp. 271-278
- Dayan, P.¹ Hinton, G.E.²

6
- 0002278788
- Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition
- T.D. Dietterich, "Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition", Journal of Artificial Intelligence Research 13, 2000, pp. 227-303.
- (2000) Journal of Artificial Intelligence Research , vol.13 , pp. 227-303
- Dietterich, T.D.¹

7
- 0002357911
- Convergence of Indirect Adaptive Asynchronous Value Iteration Algorithms
- V. Gullapalli, and A.G. Barto, "Convergence of Indirect Adaptive Asynchronous Value Iteration Algorithms", Proceedings of Advances in Neural Information Processing Systems 6, 1994, pp. 695-702.
- (1994) Proceedings of Advances in Neural Information Processing Systems , vol.6 , pp. 695-702
- Gullapalli, V.¹ Barto, A.G.²

8
- 0003644124
- MIT Press
- R.A. Howard, Dynamic Programming and Markov Processes, MIT Press, 1960.
- (1960) Dynamic Programming and Markov Processes
- Howard, R.A.¹

9
- 0029679044
- Reinforcement Learning: A Survey
- L.P. Kaelbling, M.L. Littman, and A.W. Moore, "Reinforcement Learning: A Survey", Journal of Artificial Intelligence Research 4, 1996, pp. 237-285.
- (1996) Journal of Artificial Intelligence Research , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

10
- 84898956770
- Reinforcement Learning with Hierarchies of Machines
- R. Parr, and S. Russell, "Reinforcement Learning with Hierarchies of Machines", Proceedings of Advances in Neural Information Processing Systems 10, 1997, pp. 1043-1049.
- (1997) Proceedings of Advances in Neural Information Processing Systems , vol.10 , pp. 1043-1049
- Parr, R.¹ Russell, S.²

11
- 85102627959
- John Wiley & Sons
- M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley & Sons, 1994.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

12
- 0033901602
- Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms
- S.P. Singh, T. Jaakkola, M.L. Littman, and C. Szepesvári, "Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms", Machine Learning 38, 2000, pp. 287-308.
- (2000) Machine Learning , vol.38 , pp. 287-308
- Singh, S.P.¹ Jaakkola, T.² Littman, M.L.³ Szepesvári, C.⁴

13
- 85132026293
- Integrated Architectures for Learning, Planning and Reacting Based on Approximating Dynamic Programming
- R.S. Sutton, "Integrated Architectures for Learning, Planning and Reacting Based on Approximating Dynamic Programming", Proceedings of the 7th International Conference on Machine Learning, 1990, pp. 216-224.
- (1990) Proceedings of the 7th International Conference on Machine Learning , pp. 216-224
- Sutton, R.S.¹

14
- 0004102479
- MIT Press
- R.S. Sutton, and A.G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

15
- 0033170372
- Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning
- R.S. Sutton, D. Precup, and S. Singh, "Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning", Artificial Intelligence 112, 1999, pp. 181-211.
- (1999) Artificial Intelligence , vol.112 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.³

16
- 34547533392
- C.J.C.H. Watkins, Learning from Delayed Rewards, Ph.D. thesis, Cambridge University, Cambridge, England.

17
- 34249833101
- Technical note: Q-Learning
- C.J.C.H. Watkins, and P. Dayan, "Technical note: Q-Learning", Machine Learning 8, 1992, pp. 279-292.
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.