SCOPUS Information Retrieval Platform

Volume 2371, 2002, Pages 196-211

Model minimization in hierarchical reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

ABSTRACTING; MACHINE LEARNING; MARKOV PROCESSES; REDUNDANCY;

EQUIVALENT MODEL; FRAME OF REFERENCE; HIERARCHICAL REINFORCEMENT LEARNING; MARKOV DECISION PROCESSES; REAL-WORLD PROBLEM;

REINFORCEMENT LEARNING;

EID: 84956854078 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/3-540-45622-8_15 Document Type: Conference Paper

Times cited : (64)

References (19)

2
- 85166207010
- Exploiting structure in policy construction
- C. Boutilier, R. Dearden, and M. Goldszmidt. Exploiting structure in policy construction. In Proceedings of International Joint Conference on Artificial Intelligence 14, pages 1104-1111, 1995.
- (1995) Proceedings of International Joint Conference on Artificial Intelligence 14 , pp. 1104-1111
- Boutilier, C.¹ Dearden, R.² Goldszmidt, M.³

5
- 0030215699
- Symmetry and model checking
- E. A. Emerson and A. P. Sistla. Symmetry and model checking. Formal Methods in System Design, 9(1/2):105-131, 1996.
- (1996) Formal Methods in System Design , vol.9 , Issue.1-2 , pp. 105-131
- Emerson, E.A.¹ Sistla, A.P.²

6
- 84956854241
- Equivalence notions and model minimization in Markov decision processes
- Robert Givan, Thomas Dean, and Matthew Greig. Equivalence notions and model minimization in Markov decision processes. Submitted to Artificial Intelligence, 2001.
- (2001) Submitted to Artificial Intelligence
- Givan, R.¹ Dean, T.² Greig, M.³

7
- 0034272032
- Bounded-parameter Markov decision processes
- Robert Givan, Sonia Leach, and Thomas Dean. Bounded-parameter Markov decision processes. Artificial Intelligence, 122:71-109, 2000.
- (2000) Artificial Intelligence , vol.122 , pp. 71-109
- Givan, R.¹ Leach, S.² Dean, T.³

8
- 0141763163
- Symmetry groups and translation invariant representations of Markov processes
- J. Glover. Symmetry groups and translation invariant representations of Markov processes. The Annals of Probability, 19(2):562-586, 1991.
- (1991) The Annals of Probability , vol.19 , Issue.2 , pp. 562-586
- Glover, J.¹

9
- 0003881270
- Algebraic Structure Theory of Sequential Machines
- J. Hartmanis and R. E. Stearns. Algebraic Structure Theory of Sequential Machines. Prentice-Hall, Englewood Cliffs, NJ, 1966.
- (1966) Algebraic Structure Theory of Sequential Machines
- Hartmanis, J.¹ Stearns, R.E.²

10
- 0000148778
- A heuristic approach to the discovery of macro-operators
- Glenn A. Iba. A heuristic approach to the discovery of macro-operators. Machine Learning, 3:285-317, 1989.
- (1989) Machine Learning , vol.3 , pp. 285-317
- Iba, G.A.¹

11
- 0014604028
- A note on the iterative decomposition of finite automata
- J. R. Jump. A note on the iterative decomposition of finite automata. Information and Control, 15:424-435, 1969.
- (1969) Information and Control , vol.15 , pp. 424-435
- Jump, J.R.¹

12
- 0026222347
- Bisimulation through probabilistic testing
- K. G. Larsen and A. Skou. Bisimulation through probabilistic testing. Information and Computation, 94(1):1-28, 1991.
- (1991) Information and Computation , vol.94 , Issue.1 , pp. 1-28
- Larsen, K.G.¹ Skou, A.²

14
- 0003392384
- Temporal Abstraction in Reinforcement Learning
- Doina Precup. Temporal Abstraction in Reinforcement Learning. PhD thesis, University of Massachusetts, Amherst, May 2000.
- (2000) Temporal Abstraction in Reinforcement Learning
- Precup, D.¹

15
- 57749202646
- Symmetries and model minimization of Markov decision processes
- B. Ravindran and A. G. Barto. Symmetries and model minimization of Markov decision processes. Technical Report 01-43, University of Massachusetts, Amherst, 2001.
- (2001) Symmetries and Model Minimization of Markov Decision Processes
- Ravindran, B.¹ Barto, A.G.²

17
- 0033170372
- Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Richard S. Sutton, Doina Precup, and Satinder Singh. Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112:181-211, 1999.
- (1999) Artificial Intelligence , vol.112 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.³

18
- 0004049893
- Learning from delayed rewards
- C. J. C. H. Watkins. Learning from delayed rewards. PhD thesis, Cambridge University, Cambridge, England, 1989.
- (1989) Learning from Delayed Rewards
- Watkins, C.J.C.H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.