



Volume 19, 2003, Pages 399-468

Efficient solution algorithms for factored MDPs

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; APPROXIMATION THEORY; COMPUTATIONAL COMPLEXITY; ERROR ANALYSIS; FUNCTIONS; LINEAR PROGRAMMING; POLYNOMIALS; PROBLEM SOLVING

EID: 4544318426     PISSN: 10769757     EISSN: 10769757     Source Type: Journal    
DOI: 10.1613/jair.1000     Document Type: Article
Times cited: 392

References (50)
  • 2
    • 0035121292
    • A sufficiently fast algorithm for finding close to optimal clique trees
    • Becker, A., & Geiger, D. (2001). A sufficiently fast algorithm for finding close to optimal clique trees. Artificial Intelligence, 125(1-2), 3-17.
    • (2001) Artificial Intelligence , vol.125 , Issue.1-2 , pp. 3-17
    • Becker, A.1    Geiger, D.2
  • 3
    • 84968468700
    • Polynomial approximation - A new computational technique in dynamic programming
    • Bellman, R., Kalaba, R., & Kotkin, B. (1963). Polynomial approximation - a new computational technique in dynamic programming. Math. Comp., 17(8), 155-161.
    • (1963) Math. Comp. , vol.17 , Issue.8 , pp. 155-161
    • Bellman, R.1    Kalaba, R.2    Kotkin, B.3
  • 4
    • 0003787146
    • Princeton University Press, Princeton, New Jersey
    • Bellman, R. E. (1957). Dynamic Programming. Princeton University Press, Princeton, New Jersey.
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 7
    • 0346942368
    • Decision theoretic planning: Structural assumptions and computational leverage
    • Boutilier, C., Dean, T., & Hanks, S. (1999). Decision theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11, 1-94.
    • (1999) Journal of Artificial Intelligence Research , vol.11 , pp. 1-94
    • Boutilier, C.1    Dean, T.2    Hanks, S.3
  • 8
    • 0012352653
    • Approximating value trees in structured dynamic programming
    • Boutilier, C., & Dearden, R. (1996). Approximating value trees in structured dynamic programming. In Proc. ICML, pp. 54-62.
    • (1996) Proc. ICML , pp. 54-62
    • Boutilier, C.1    Dearden, R.2
  • 9
    • 85166207010
    • Exploiting structure in policy construction
    • Boutilier, C., Dearden, R., & Goldszmidt, M. (1995). Exploiting structure in policy construction. In Proc. IJCAI, pp. 1104-1111.
    • (1995) Proc. IJCAI , pp. 1104-1111
    • Boutilier, C.1    Dearden, R.2    Goldszmidt, M.3
  • 10
    • 0034248853
    • Stochastic dynamic programming with factored representations
    • Boutilier, C., Dearden, R., & Goldszmidt, M. (2000). Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1-2), 49-107.
    • (2000) Artificial Intelligence , vol.121 , Issue.1-2 , pp. 49-107
    • Boutilier, C.1    Dearden, R.2    Goldszmidt, M.3
  • 12
    • 1542312633
    • The linear programming approach to approximate dynamic programming
    • Submitted to
    • de Farias, D., & Van Roy, B. (2001a). The linear programming approach to approximate dynamic programming. Submitted to Operations Research.
    • (2001) Operations Research
    • De Farias, D.1    Van Roy, B.2
  • 13
    • 13444255536
    • On constraint sampling for the linear programming approach to approximate dynamic programming
    • To appear
    • de Farias, D., & Van Roy, B. (2001b). On constraint sampling for the linear programming approach to approximate dynamic programming. To appear in Mathematics of Operations Research.
    • (2001) Mathematics of Operations Research
    • De Farias, D.1    Van Roy, B.2
  • 15
    • 84990553353
    • A model for reasoning about persistence and causation
    • Dean, T., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5(3), 142-150.
    • (1989) Computational Intelligence , vol.5 , Issue.3 , pp. 142-150
    • Dean, T.1    Kanazawa, K.2
  • 17
    • 0030697013
    • Abstraction and approximate decision theoretic planning
    • Dearden, R., & Boutilier, C. (1997). Abstraction and approximate decision theoretic planning. Artificial Intelligence, 89(1), 219-283.
    • (1997) Artificial Intelligence , vol.89 , Issue.1 , pp. 219-283
    • Dearden, R.1    Boutilier, C.2
  • 18
    • 0033188982
    • Bucket elimination: A unifying framework for reasoning
    • Dechter, R. (1999). Bucket elimination: A unifying framework for reasoning. Artificial Intelligence, 113(1-2), 41-85.
    • (1999) Artificial Intelligence , vol.113 , Issue.1-2 , pp. 41-85
    • Dechter, R.1
  • 26
    • 0000086731
    • Influence diagrams
    • Howard, R. A., & Matheson, J. E. (Eds.), Strategic Decisions Group, Menlo Park, California
    • Howard, R. A., & Matheson, J. E. (1984). Influence diagrams. In Howard, R. A., & Matheson, J. E. (Eds.), Readings on the Principles and Applications of Decision Analysis, pp. 721-762. Strategic Decisions Group, Menlo Park, California.
    • (1984) Readings on the Principles and Applications of Decision Analysis , pp. 721-762
    • Howard, R.A.1    Matheson, J.E.2
  • 29
    • 0008091392
    • Triangulation of graphs - Algorithms giving small total state space
    • Department of Mathematics and Computer Science, Strandvejen, Aalborg, Denmark
    • Kjaerulff, U. (1990). Triangulation of graphs - algorithms giving small total state space. Tech. rep. TR R 90-09, Department of Mathematics and Computer Science, Strandvejen, Aalborg, Denmark.
    • (1990) Tech. Rep. , vol.TR R 90-09
    • Kjaerulff, U.1
  • 35
    • 0026979104
    • Finding approximate separators and computing tree-width quickly
    • ACM
    • Reed, B. (1992). Finding approximate separators and computing tree-width quickly. In 24th Annual Symposium on Theory of Computing, pp. 221-228. ACM.
    • (1992) 24th Annual Symposium on Theory of Computing , pp. 221-228
    • Reed, B.1
  • 39
    • 84899022377
    • How to dynamically merge Markov decision processes
    • Jordan, M. I., Kearns, M. J., & Solla, S. A. (Eds.), The MIT Press
    • Singh, S., & Cohn, D. (1998). How to dynamically merge Markov decision processes. In Jordan, M. I., Kearns, M. J., & Solla, S. A. (Eds.), Advances in Neural Information Processing Systems, Vol. 10. The MIT Press.
    • (1998) Advances in Neural Information Processing Systems , vol.10
    • Singh, S.1    Cohn, D.2
  • 41
    • 33746816918
    • Note on Jordan elimination, linear programming and Tchebycheff approximation
    • Stiefel, E. (1960). Note on Jordan elimination, linear programming and Tchebycheff approximation. Numerische Mathematik, 2, 1-17.
    • (1960) Numerische Mathematik , vol.2 , pp. 1-17
    • Stiefel, E.1
  • 42
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 43
    • 0002313852
    • Scaling up average reward reinforcement learning by approximating the domain models and the value function
    • Bari, Italy. Morgan Kaufmann
    • Tadepalli, P., & Ok, D. (1996). Scaling up average reward reinforcement learning by approximating the domain models and the value function. In Proceedings of the Thirteenth International Conference on Machine Learning, Bari, Italy. Morgan Kaufmann.
    • (1996) Proceedings of the Thirteenth International Conference on Machine Learning
    • Tadepalli, P.1    Ok, D.2
  • 45
    • 0029752470
    • Feature-based methods for large scale dynamic programming
    • Tsitsiklis, J. N., & Van Roy, B. (1996a). Feature-based methods for large scale dynamic programming. Machine Learning, 22, 59-94.
    • (1996) Machine Learning , vol.22 , pp. 59-94
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 46
    • 0008813539
    • An analysis of temporal-difference learning with function approximation
    • Laboratory for Information and Decision Systems, Massachusetts Institute of Technology
    • Tsitsiklis, J. N., & Van Roy, B. (1996b). An analysis of temporal-difference learning with function approximation. Technical report LIDS-P-2322, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology.
    • (1996) Technical Report , vol.LIDS-P-2322
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 48
    • 0012252296
    • Tight performance bounds on greedy policies based on imperfect value functions
    • College of Computer Science, Northeastern University, Boston, Massachusetts
    • Williams, R. J., & Baird, L. C. I. (1993). Tight performance bounds on greedy policies based on imperfect value functions. Tech. rep., College of Computer Science, Northeastern University, Boston, Massachusetts.
    • (1993) Tech. Rep.
    • Williams, R.J.1    Baird, L.C.I.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.