Volume 25, 2006, Pages 75-118

Approximate policy iteration with a policy language bias: Solving relational Markov decision processes

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION THEORY; FUNCTIONS; LEARNING SYSTEMS; MARKOV PROCESSES; PLANNING; STOCHASTIC CONTROL SYSTEMS;

EID: 33744466799     PISSN: 10769757     EISSN: 10769757     Source Type: Journal    
DOI: 10.1613/jair.1700     Document Type: Article
Times cited: 113

References (51)
  • 1
    • 0036784224
    • Using genetic programming to learn and improve control knowledge
    • Aler, R., Borrajo, D., & Isasi, P. (2002). Using genetic programming to learn and improve control knowledge. Artificial Intelligence, 141(1-2), 29-56.
    • (2002) Artificial Intelligence , vol.141 , Issue.1-2 , pp. 29-56
    • Aler, R.1    Borrajo, D.2    Isasi, P.3
  • 3
    • 0035442648
    • The AIPS '00 planning competition
    • Bacchus, F. (2001). The AIPS '00 planning competition. AI Magazine, 22(3), 57-62.
    • (2001) AI Magazine , vol.22 , Issue.3 , pp. 57-62
    • Bacchus, F.1
  • 4
    • 0033897011
    • Using temporal logics to express search control knowledge for planning
    • Bacchus, F., & Kabanza, F. (2000). Using temporal logics to express search control knowledge for planning. Artificial Intelligence, 116, 123-191.
    • (2000) Artificial Intelligence , vol.116 , pp. 123-191
    • Bacchus, F.1    Kabanza, F.2
  • 9
    • 0034248853
    • Stochastic dynamic programming with factored representations
    • Boutilier, C., Dearden, R., & Goldszmidt, M. (2000). Stochastic dynamic programming with factored representations. Artificial Intelligence, 121(1-2), 49-107.
    • (2000) Artificial Intelligence , vol.121 , Issue.1-2 , pp. 49-107
    • Boutilier, C.1    Dearden, R.2    Goldszmidt, M.3
  • 12
    • 0000746330
    • Model reduction techniques for computing approximately optimal solutions for Markov decision processes
    • Dean, T., Givan, R., & Leach, S. (1997). Model reduction techniques for computing approximately optimal solutions for Markov decision processes. In Conference on Uncertainty in Artificial Intelligence, pp. 124-131.
    • (1997) Conference on Uncertainty in Artificial Intelligence , pp. 124-131
    • Dean, T.1    Givan, R.2    Leach, S.3
  • 17
    • 2542504100
    • A selective macro-learning algorithm and its application to the N×N sliding-tile puzzle
    • Finkelstein, L., & Markovitch, S. (1998). A selective macro-learning algorithm and its application to the N×N sliding-tile puzzle. Journal of Artificial Intelligence Research, 8, 223-263.
    • (1998) Journal of Artificial Intelligence Research , vol.8 , pp. 223-263
    • Finkelstein, L.1    Markovitch, S.2
  • 18
    • 0038517214
    • Equivalence notions and model minimization in Markov decision processes
    • Givan, R., Dean, T., & Greig, M. (2003). Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence, 147(1-2), 163-223.
    • (2003) Artificial Intelligence , vol.147 , Issue.1-2 , pp. 163-223
    • Givan, R.1    Dean, T.2    Greig, M.3
  • 23
    • 0036377352
    • The FF planning system: Fast plan generation through heuristic search
    • Hoffmann, J., & Nebel, B. (2001). The FF planning system: Fast plan generation through heuristic search. Journal of Artificial Intelligence Research, 14, 253-302.
    • (2001) Journal of Artificial Intelligence Research , vol.14 , pp. 253-302
    • Hoffmann, J.1    Nebel, B.2
  • 26
    • 0036832951
    • A sparse sampling algorithm for near-optimal planning in large Markov decision processes
    • Kearns, M. J., Mansour, Y., & Ng, A. Y. (2002). A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Machine Learning, 49(2-3), 193-208.
    • (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 193-208
    • Kearns, M.J.1    Mansour, Y.2    Ng, A.Y.3
  • 28
    • 0033189384
    • Learning action strategies for planning domains
    • Khardon, R. (1999a). Learning action strategies for planning domains. Artificial Intelligence, 113(1-2), 125-148.
    • (1999) Artificial Intelligence , vol.113 , Issue.1-2 , pp. 125-148
    • Khardon, R.1
  • 29
    • 0032649290
    • Learning to take actions
    • Khardon, R. (1999b). Learning to take actions. Machine Learning, 35(1), 57-90.
    • (1999) Machine Learning , vol.35 , Issue.1 , pp. 57-90
    • Khardon, R.1
  • 34
    • 0027574520
    • Taxonomic syntax for first order inference
    • McAllester, D., & Givan, R. (1993). Taxonomic syntax for first order inference. Journal of the ACM, 40(2), 246-283.
    • (1993) Journal of the ACM , vol.40 , Issue.2 , pp. 246-283
    • McAllester, D.1    Givan, R.2
  • 36
    • 0036832961
    • Building a basic block instruction scheduler using reinforcement learning and rollouts
    • McGovern, A., Moss, E., & Barto, A. (2002). Building a basic block instruction scheduler using reinforcement learning and rollouts. Machine Learning, 49(2-3), 141-160.
    • (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 141-160
    • McGovern, A.1    Moss, E.2    Barto, A.3
  • 37
    • 84990622495
    • Quantitative results concerning the utility of explanation-based learning
    • Minton, S. (1988). Quantitative results concerning the utility of explanation-based learning. In National Conference on Artificial Intelligence.
    • (1988) National Conference on Artificial Intelligence
    • Minton, S.1
  • 41
  • 42
    • 1442267080
    • Learning decision lists
    • Rivest, R. (1987). Learning decision lists. Machine Learning, 2(3), 229-246.
    • (1987) Machine Learning , vol.2 , Issue.3 , pp. 229-246
    • Rivest, R.1
  • 43
    • 0001046225
    • Practical issues in temporal difference learning
    • Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8, 257-277.
    • (1992) Machine Learning , vol.8 , pp. 257-277
    • Tesauro, G.1
  • 45
    • 0029752470
    • Feature-based methods for large scale DP
    • Tsitsiklis, J., & Van Roy, B. (1996). Feature-based methods for large scale DP. Machine Learning, 22, 59-94.
    • (1996) Machine Learning , vol.22 , pp. 59-94
    • Tsitsiklis, J.1    Van Roy, B.2
  • 47
    • 0034997316
    • Congestion control via online sampling
    • Wu, G., Chong, E., & Givan, R. (2001). Congestion control via online sampling. In Infocom.
    • (2001) Infocom
    • Wu, G.1    Chong, E.2    Givan, R.3
  • 51
    • 0038200710
    • Learning-assisted automated planning: Looking back, taking stock, going forward
    • Zimmerman, T., & Kambhampati, S. (2003). Learning-assisted automated planning: Looking back, taking stock, going forward. AI Magazine, 24(2), 73-96.
    • (2003) AI Magazine , vol.24 , Issue.2 , pp. 73-96
    • Zimmerman, T.1    Kambhampati, S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.