SCOPUS Information Search Platform

Journal of Machine Learning Research

Volume 7, 2006, Pages 2259-2301

Causal graph based decomposition of factored MDPs

(2) Jonsson, Anders (a); Barto, Andrew (b)

a UNIVERSITAT POMPEU FABRA (Spain)

b The Manning College of Information and Computer Sciences (United States)

Author keywords

Hierarchical decomposition; Markov decision processes; State abstraction

Indexed keywords

ABSTRACTING; ALGORITHMS; DECISION MAKING; GRAPH THEORY; HIERARCHICAL SYSTEMS; MATHEMATICAL MODELS;

BAYESIAN NETWORK MODEL; HIERARCHICAL DECOMPOSITION; MARKOV DECISION PROCESSES; STATE ABSTRACTION;

MARKOV PROCESSES;

EID: 33750705246 PISSN: 15337928 EISSN: 15337928 Source Type: Journal
DOI: None Document Type: Article

Times cited: (75)

References (40)

1
- 0029210635
- Learning to act using real-time dynamic programming
- A. Barto, S. Bradtke, and S. Singh. Learning to act using real-time dynamic programming. Artificial Intelligence, Special Volume on Computational Research on Interaction and Agency, 72(1):81-138, 1995.
- (1995) Artificial Intelligence, Special Volume on Computational Research on Interaction and Agency , vol.72 , Issue.1 , pp. 81-138
- Barto, A.¹ Bradtke, S.² Singh, S.³

2
- 0001700171
- A Markov decision process
- R. Bellman. A Markov decision process. Journal of Mathematical Mechanics, 6:679-684, 1957.
- (1957) Journal of Mathematical Mechanics , vol.6 , pp. 679-684
- Bellman, R.¹

3
- 85166207010
- Exploiting structure in policy construction
- C. Boutilier, R. Dearden, and M. Goldszmidt. Exploiting structure in policy construction. Proceedings of the International Joint Conference on Artificial Intelligence, 14:1104-1113, 1995.
- (1995) Proceedings of the International Joint Conference on Artificial Intelligence , vol.14 , pp. 1104-1113
- Boutilier, C.¹ Dearden, R.² Goldszmidt, M.³

4
- 85150714688
- Reinforcement learning methods for continuous-time Markov decision problems
- S. Bradtke and M. Duff. Reinforcement learning methods for continuous-time Markov decision problems. Advances in Neural Information Processing Systems, 7:393-400, 1995.
- (1995) Advances in Neural Information Processing Systems , vol.7 , pp. 393-400
- Bradtke, S.¹ Duff, M.²

5
- 14344261491
- Using relative novelty to identify useful temporal abstractions in reinforcement learning
- Ö. Şimşek and A. Barto. Using relative novelty to identify useful temporal abstractions in reinforcement learning. Proceedings of the International Conference on Machine Learning, 21:751-758, 2004.
- (2004) Proceedings of the International Conference on Machine Learning , vol.21 , pp. 751-758
- Şimşek, Ö.¹ Barto, A.²

6
- 31844447221
- Identifying useful subgoals in reinforcement learning by local graph partitioning
- Ö. Şimşek, A. Wolfe, and A. Barto. Identifying useful subgoals in reinforcement learning by local graph partitioning. Proceedings of the International Conference on Machine Learning, 22, 2005.
- (2005) Proceedings of the International Conference on Machine Learning , vol.22
- Şimşek, Ö.¹ Wolfe, A.² Barto, A.³

7
- 0031370386
- Model minimization in Markov decision processes
- T. Dean and R. Givan. Model minimization in Markov decision processes. Proceedings of the National Conference on Artificial Intelligence, 14:106-111, 1997.
- (1997) Proceedings of the National Conference on Artificial Intelligence , vol.14 , pp. 106-111
- Dean, T.¹ Givan, R.²

8
- 84990553353
- A model for reasoning about persistence and causation
- T. Dean and K. Kanazawa. A model for reasoning about persistence and causation. Computational Intelligence, 5(3):142-150, 1989.
- (1989) Computational Intelligence , vol.5 , Issue.3 , pp. 142-150
- Dean, T.¹ Kanazawa, K.²

9
- 85168151397
- Decomposition techniques for planning in stochastic domains
- T. Dean and S. Lin. Decomposition techniques for planning in stochastic domains. Proceedings of the International Joint Conference on Artificial Intelligence, 14:1121-1129, 1995.
- (1995) Proceedings of the International Joint Conference on Artificial Intelligence , vol.14 , pp. 1121-1129
- Dean, T.¹ Lin, S.²

10
- 0002278788
- Hierarchical reinforcement learning with the MAXQ value function decomposition
- T. Dietterich. Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13:227-303, 2000a.
- (2000) Journal of Artificial Intelligence Research , vol.13 , pp. 227-303
- Dietterich, T.¹

11
- 0003506152
- State abstraction in MAXQ hierarchical reinforcement learning
- T. Dietterich. State Abstraction in MAXQ Hierarchical Reinforcement Learning. Advances in Neural Information Processing Systems, 12:994-1000, 2000b.
- (2000) Advances in Neural Information Processing Systems , vol.12 , pp. 994-1000
- Dietterich, T.¹

12
- 0007907759
- Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments
- B. Digney. Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments. From Animals to Animats, 4:363-372, 1996.
- (1996) From Animals to Animats , vol.4 , pp. 363-372
- Digney, B.¹

13
- 0036927416
- Symbolic heuristic search for factored Markov decision processes
- Z. Feng and E. Hansen. Symbolic Heuristic Search for Factored Markov Decision Processes. Proceedings of the National Conference on Artificial Intelligence, 18:455-460, 2002.
- (2002) Proceedings of the National Conference on Artificial Intelligence , vol.18 , pp. 455-460
- Feng, Z.¹ Hansen, E.²

14
- 33744500784
- Symbolic generalization for on-line planning
- Z. Feng, E. Hansen, and S. Zilberstein. Symbolic generalization for on-line planning. Proceedings of Uncertainty in Artificial Intelligence, 19:209-216, 2003.
- (2003) Proceedings of Uncertainty in Artificial Intelligence , vol.19 , pp. 209-216
- Feng, Z.¹ Hansen, E.² Zilberstein, S.³

15
- 2842560201
- Strips: A new approach to the application of theorem proving to problem solving
- R. Fikes and N. Nilsson. Strips: A New Approach to the Application of Theorem Proving to Problem Solving. Artificial Intelligence, 2:189-208, 1971.
- (1971) Artificial Intelligence , vol.2 , pp. 189-208
- Fikes, R.¹ Nilsson, N.²

16
- 0013528312
- Continuous-time hierarchical reinforcement learning
- M. Ghavamzadeh and S. Mahadevan. Continuous-Time Hierarchical Reinforcement Learning. Proceedings of the International Conference on Machine Learning, 18:186-193, 2001.
- (2001) Proceedings of the International Conference on Machine Learning , vol.18 , pp. 186-193
- Ghavamzadeh, M.¹ Mahadevan, S.²

17
- 84880898477
- Max-norm projections for factored MDPs
- C. Guestrin, D. Koller, and R. Parr. Max-norm Projections for Factored MDPs. International Joint Conference on Artificial Intelligence, 17:673-680, 2001.
- (2001) International Joint Conference on Artificial Intelligence , vol.17 , pp. 673-680
- Guestrin, C.¹ Koller, D.² Parr, R.³

18
- 0023365727
- Statecharts: A visual formalism for complex systems
- D. Harel. Statecharts: A visual formalism for complex systems. Science of Computer Programming, 8:231-274, 1987.
- (1987) Science of Computer Programming , vol.8 , pp. 231-274
- Harel, D.¹

19
- 0006419533
- Hierarchical solution of Markov decision processes using macro-actions
- M. Hauskrecht, N. Meuleau, L. Kaelbling, T. Dean, and C. Boutilier. Hierarchical Solution of Markov Decision Processes using Macro-actions. Uncertainty in Artificial Intelligence, 14:220-229, 1998.
- (1998) Uncertainty in Artificial Intelligence , vol.14 , pp. 220-229
- Hauskrecht, M.¹ Meuleau, N.² Kaelbling, L.³ Dean, T.⁴ Boutilier, C.⁵

20
- 13444260042
- A planning heuristic based on causal graph analysis
- M. Helmert. A planning heuristic based on causal graph analysis. Proceedings of the International Conference on Automated Planning and Scheduling, 14:161-170, 2004.
- (2004) Proceedings of the International Conference on Automated Planning and Scheduling , vol.14 , pp. 161-170
- Helmert, M.¹

21
- 0013465036
- Discovering hierarchy in reinforcement learning with HEXQ
- B. Hengst. Discovering Hierarchy in Reinforcement Learning with HEXQ. Proceedings of the International Conference on Machine Learning, 19:243-250, 2002.
- (2002) Proceedings of the International Conference on Machine Learning , vol.19 , pp. 243-250
- Hengst, B.¹

22
- 0002956570
- Spudd: Stochastic planning using decision diagrams
- J. Hoey, R. St-Aubin, A. Hu, and C. Boutilier. Spudd: Stochastic Planning using Decision Diagrams. Proceedings of Uncertainty in Artificial Intelligence, 15:279-288, 1999.
- (1999) Proceedings of Uncertainty in Artificial Intelligence , vol.15 , pp. 279-288
- Hoey, J.¹ St-Aubin, R.² Hu, A.³ Boutilier, C.⁴

23
- 84898927961
- Automated state abstractions for options using the U-tree algorithm
- A. Jonsson and A. Barto. Automated State Abstractions for Options Using the U-Tree Algorithm. Advances in Neural Information Processing Systems, 13:1054-1060, 2001.
- (2001) Advances in Neural Information Processing Systems , vol.13 , pp. 1054-1060
- Jonsson, A.¹ Barto, A.²

24
- 31844455449
- A causal approach to hierarchical decomposition of factored MDPs
- A. Jonsson and A. Barto. A Causal Approach to Hierarchical Decomposition of Factored MDPs. Proceedings of the International Conference on Machine Learning, 22:401-408, 2005.
- (2005) Proceedings of the International Conference on Machine Learning , vol.22 , pp. 401-408
- Jonsson, A.¹ Barto, A.²

25
- 84880677563
- Efficient reinforcement learning in factored MDPs
- M. Kearns and D. Koller. Efficient Reinforcement Learning in Factored MDPs. Proceedings of the International Joint Conference on Artificial Intelligence, 16:740-747, 1999.
- (1999) Proceedings of the International Joint Conference on Artificial Intelligence , vol.16 , pp. 740-747
- Kearns, M.¹ Koller, D.²

26
- 14344250635
- Dynamic abstraction in reinforcement learning via clustering
- S. Mannor, I. Menache, A. Hoze, and U. Klein. Dynamic abstraction in reinforcement learning via clustering. Proceedings of the International Conference on Machine Learning, 21:560-567, 2004.
- (2004) Proceedings of the International Conference on Machine Learning , vol.21 , pp. 560-567
- Mannor, S.¹ Menache, I.² Hoze, A.³ Klein, U.⁴

27
- 0013465187
- Automatic discovery of subgoals in reinforcement learning using diverse density
- A. McGovern and A. Barto. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density. Proceedings of the International Conference on Machine Learning, 18:361-368, 2001.
- (2001) Proceedings of the International Conference on Machine Learning , vol.18 , pp. 361-368
- McGovern, A.¹ Barto, A.²

28
- 84945250000
- Q-cut - Dynamic discovery of sub-goals in reinforcement learning
- I. Menache, S. Mannor, and N. Shimkin. Q-Cut - Dynamic Discovery of Sub-Goals in Reinforcement Learning. Proceedings of the European Conference on Machine Learning, 13:295-306, 2002.
- (2002) Proceedings of the European Conference on Machine Learning , vol.13 , pp. 295-306
- Menache, I.¹ Mannor, S.² Shimkin, N.³

29
- 0037767534
- Technical Report, University of California, Berkeley, USA
- K. Murphy. Active Learning of Causal Bayes Net Structure. Technical Report, University of California, Berkeley, USA, 2001.
- (2001) Active Learning of Causal Bayes Net Structure
- Murphy, K.¹

30
- 0003989214
- Ph.D. Thesis, University of California at Berkeley
- R. Parr. Hierarchical Control and Learning for Markov Decision Processes. Ph.D. Thesis, University of California at Berkeley, 1998.
- (1998) Hierarchical Control and Learning for Markov Decision Processes
- Parr, R.¹

31
- 84898956770
- Reinforcement learning with hierarchies of machines
- R. Parr and S. Russell. Reinforcement Learning with Hierarchies of Machines. Advances in Neural Information Processing Systems, 10:1043-1049, 1998.
- (1998) Advances in Neural Information Processing Systems , vol.10 , pp. 1043-1049
- Parr, R.¹ Russell, S.²

32
- 14344250461
- Policyblocks: An algorithm for creating useful macro-actions in reinforcement learning
- M. Pickett and A. Barto. Policyblocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning. Proceedings of the International Conference on Machine Learning, 19:506-513, 2002.
- (2002) Proceedings of the International Conference on Machine Learning , vol.19 , pp. 506-513
- Pickett, M.¹ Barto, A.²

33
- 0003998452
- Wiley Interscience, New York, USA
- M. Puterman. Markov Decision Processes. Wiley Interscience, New York, USA, 1994.
- (1994) Markov Decision Processes
- Puterman, M.¹

34
- 32844454706
- Ph.D. Thesis, Department of Computer Science, University of Massachusetts, Amherst, USA
- B. Ravindran. An Algebraic Approach to Abstraction in Reinforcement Learning. Ph.D. Thesis, Department of Computer Science, University of Massachusetts, Amherst, USA, 2004.
- (2004) An Algebraic Approach to Abstraction in Reinforcement Learning
- Ravindran, B.¹

35
- 84899031920
- Intrinsically motivated reinforcement learning
- S. Singh, A. Barto, and N. Chentanez. Intrinsically Motivated Reinforcement Learning. Advances in Neural Information Processing Systems, 18:1281-1288, 2005.
- (2005) Advances in Neural Information Processing Systems , vol.18 , pp. 1281-1288
- Singh, S.¹ Barto, A.² Chentanez, N.³

36
- 31844452167
- Unsupervised active learning in large domains
- H. Steck and T. Jaakkola. Unsupervised Active Learning in Large Domains. Proceedings of Uncertainty in Artificial Intelligence, 18:469-476, 2002.
- (2002) Proceedings of Uncertainty in Artificial Intelligence , vol.18 , pp. 469-476
- Steck, H.¹ Jaakkola, T.²

37
- 0004102479
- MIT Press, Cambridge, USA
- R. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, USA, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.²

38
- 0033170372
- Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning
- R. Sutton, D. Precup, and S. Singh. Between MDPs and Semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112:181-211, 1999.
- (1999) Artificial Intelligence , vol.112 , pp. 181-211
- Sutton, R.¹ Precup, D.² Singh, S.³

39
- 33749882712
- Finding structure in reinforcement learning
- S. Thrun and A. Schwartz. Finding structure in reinforcement learning. Advances in Neural Information Processing Systems, 8:385-392, 1996.
- (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 385-392
- Thrun, S.¹ Schwartz, A.²

40
- 84898945418
- Active learning for parameter estimation in Bayesian networks
- S. Tong and D. Koller. Active learning for parameter estimation in Bayesian networks. Advances in Neural Information Processing Systems, 13:647-653, 2001.
- (2001) Advances in Neural Information Processing Systems , vol.13 , pp. 647-653
- Tong, S.¹ Koller, D.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.