Auer, P.; Cesa-Bianchi, N.; and Fischer, P. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2):235-256.
Balla, R., and Fern, A. 2009. UCT for tactical assault planning in real-time strategy games. In Proc. IJCAI-09, 40-45.
Barto, A.; Bradtke, S.; and Singh, S. 1995. Learning to act using real-time dynamic programming. Artificial Intelligence 72:81-138.
Bertsekas, D.; Tsitsiklis, J.; and Wu, C. 1997. Rollout algorithms for combinatorial optimization. J. of Heuristics 3(3):245-262.
Bonet, B., and Geffner, H. 2003. Labeled RTDP: Improving the convergence of real-time dynamic programming. In Proc. ICAPS, 12-31.
Chaslot, G.; Winands, M.; Herik, H.; Uiterwijk, J.; and Bouzy, B. 2008. Progressive strategies for Monte-Carlo tree search. New Math. and Natural Comp. 4(3):343.
Eyerich, P.; Keller, T.; and Helmert, M. 2010. High-quality policies for the Canadian traveler's problem. In Proc. AAAI.
Finnsson, H., and Björnsson, Y. 2008. Simulation-based approach to general game playing. In Proc. AAAI, 259-264.
Gelly, S., and Silver, D. 2007. Combining online and offline knowledge in UCT. In Proc. ICML, 273-280.
Kearns, M.; Mansour, Y.; and Ng, A. 1999. A sparse sampling algorithm for near-optimal planning in large MDPs. In Proc. IJCAI-99, 1324-1331.
Kocsis, L., and Szepesvári, C. 2006. Bandit based Monte-Carlo planning. In Proc. ECML-2006, 282-293.
Koenig, S., and Sun, X. 2009. Comparing real-time and incremental heuristic search for real-time situated agents. Autonomous Agents and Multi-Agent Systems 18(3):313-341.
Likhachev, M.; Gordon, G.; and Thrun, S. 2003. ARA*: Anytime A* with provable bounds on sub-optimality. In Proc. NIPS.
Munos, R., and Coquelin, P. 2007. Bandit algorithms for tree search. In Proc. UAI, 67-74.
Ramanujan, R.; Sabharwal, A.; and Selman, B. 2010. On adversarial search spaces and sampling-based planning. In Proc. ICAPS, 242-245.
Silver, D., and Veness, J. 2010. Monte-Carlo planning in large POMDPs. In Proc. NIPS, 2164-2172.
Thayer, J., and Ruml, W. 2010. Anytime heuristic search: Frameworks and algorithms. In Proc. SOCS.
Walsh, T.; Goschin, S.; and Littman, M. 2010. Integrating sample-based planning and model-based reinforcement learning. In Proc. AAAI.