Volume 53, 2015, Pages 375-438

Approximate value iteration with temporally extended actions

Author keywords

[No Author keywords available]

Indexed keywords

NAVIGATION; REINFORCEMENT LEARNING

EID: 84938498958     PISSN: 10769757     EISSN: None     Source Type: Journal    
DOI: 10.1613/jair.4676     Document Type: Article
Times cited: 47

References (51)
  • 7
    • 34147120474
    • A note on two problems in connexion with graphs
    • Dijkstra, E. (1959). A note on two problems in connexion with graphs. Numerische Mathematik, 1 (1), 269-271.
    • (1959) Numerische Mathematik , vol.1 , Issue.1 , pp. 269-271
    • Dijkstra, E.1
  • 14
    • 0000148778
    • A heuristic approach to the discovery of macro-operators
    • Iba, G. A. (1989). A heuristic approach to the discovery of macro-operators. Machine Learning, 3, 285-317.
    • (1989) Machine Learning , vol.3 , pp. 285-317
    • Iba, G.A.1
  • 16
    • 0036832951
    • A sparse sampling algorithm for near-optimal planning in large Markov decision processes
    • Kearns, M., Mansour, Y., & Ng, A. Y. (2002). A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Machine Learning, 49 (2-3), 193-208.
    • (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 193-208
    • Kearns, M.1    Mansour, Y.2    Ng, A.Y.3
  • 18
    • 80055032021
    • Skill discovery in continuous reinforcement learning domains using skill chaining
    • Konidaris, G., & Barto, A. (2009). Skill discovery in continuous reinforcement learning domains using skill chaining. In Advances in Neural Information Processing Systems 22, pp. 1015-1023.
    • (2009) Advances in Neural Information Processing Systems , vol.22 , pp. 1015-1023
    • Konidaris, G.1    Barto, A.2
  • 25
    • 84938531572
    • Cyclic Inventory Management (CIM)
    • Mann, T. A. (2014). Cyclic Inventory Management (CIM). https://code.google.com/p/rddlsim/source/browse/trunk/files/rddl2/examples/cim.rddl2. Accessed: 2015-06-29.
    • (2014) Cyclic Inventory Management (CIM)
    • Mann, T.A.1
  • 33
    • 44949241322
    • Reinforcement learning of motor skills with policy gradients
    • Peters, J., & Schaal, S. (2008). Reinforcement learning of motor skills with policy gradients. Neural Networks, 21, 682-691.
    • (2008) Neural Networks , vol.21 , pp. 682-691
    • Peters, J.1    Schaal, S.2
  • 35
    • 84957069070
    • Theoretical results on reinforcement learning with temporally abstract options
    • Precup, D., Sutton, R. S., & Singh, S. (1998). Theoretical results on reinforcement learning with temporally abstract options. In Machine Learning: ECML-1998, pp. 382-393. Springer.
    • (1998) Machine Learning: ECML-1998 , pp. 382-393
    • Precup, D.1    Sutton, R.S.2    Singh, S.3
  • 37
    • 33646398129
    • Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method
    • Riedmiller, M. (2005). Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method. In Machine Learning: ECML-2005, pp. 317-328. Springer.
    • (2005) Machine Learning: ECML-2005 , pp. 317-328
    • Riedmiller, M.1
  • 38
    • 27144482716
    • Highway hierarchies hasten exact shortest path queries
    • Sanders, P., & Schultes, D. (2005). Highway hierarchies hasten exact shortest path queries. In Brodal, G., & Leonardi, S. (Eds.), Algorithms: ESA-2005, Vol. 3669 of Lecture Notes in Computer Science, pp. 568-579. Springer Berlin Heidelberg.
    • (2005) Lecture Notes in Computer Science , vol.3669 , pp. 568-579
    • Sanders, P.1    Schultes, D.2
  • 41
    • 0031277069
    • Optimality of (s,S) policies in inventory models with Markovian demand
    • Sethi, S. P., & Cheng, F. (1997). Optimality of (s,S) policies in inventory models with Markovian demand. Operations Research, 45 (6), 931-939.
    • (1997) Operations Research , vol.45 , Issue.6 , pp. 931-939
    • Sethi, S.P.1    Cheng, F.2
  • 47
    • 27544506565
    • Reinforcement learning for RoboCup soccer keepaway
    • Stone, P., Sutton, R. S., & Kuhlmann, G. (2005). Reinforcement learning for RoboCup soccer keepaway. Adaptive Behavior, 13 (3), 165-188.
    • (2005) Adaptive Behavior , vol.13 , Issue.3 , pp. 165-188
    • Stone, P.1    Sutton, R.S.2    Kuhlmann, G.3
  • 48
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • Sutton, R. S., Precup, D., & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112 (1), 181-211.
    • (1999) Artificial Intelligence , vol.112 , Issue.1 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.