Volume 3, 2012, Pages 1749-1755

Action selection for MDPs: Anytime AO* versus UCT

Author keywords

[No Author keywords available]

Indexed keywords

ACTION SELECTION; AND/OR GRAPHS; DYNAMIC PROGRAMMING ALGORITHM; EXPLICIT GRAPHS; HEURISTIC SEARCH; INFINITE HORIZONS; OPTIMAL ALGORITHM; OPTIMAL POLICIES; OPTIMAL VARIANTS;

EID: 84868269234     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 28

References (24)
  • 1
    • Auer, P.; Cesa-Bianchi, N.; and Fischer, P. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2):235-256.
  • 2
    • Balla, R., and Fern, A. 2009. UCT for tactical assault planning in real-time strategy games. In Proc. IJCAI-09, 40-45.
  • 3
    • Barto, A.; Bradtke, S.; and Singh, S. 1995. Learning to act using real-time dynamic programming. Artificial Intelligence 72:81-138.
  • 4
    • Bertsekas, D.; Tsitsiklis, J.; and Wu, C. 1997. Rollout algorithms for combinatorial optimization. J. of Heuristics 3(3):245-262.
  • 5
    • Bonet, B., and Geffner, H. 2003. Labeled RTDP: Improving the convergence of real-time dynamic programming. In Proc. ICAPS, 12-31.
  • 7
    • Eyerich, P.; Keller, T.; and Helmert, M. 2010. High-quality policies for the Canadian traveler's problem. In Proc. AAAI.
  • 8
    • Finnsson, H., and Björnsson, Y. 2008. Simulation-based approach to general game playing. In Proc. AAAI, 259-264.
  • 9
    • Gelly, S., and Silver, D. 2007. Combining online and offline knowledge in UCT. In Proc. ICML, 273-280.
  • 12
    • Kearns, M.; Mansour, Y.; and Ng, A. 1999. A sparse sampling algorithm for near-optimal planning in large MDPs. In Proc. IJCAI-99, 1324-1331.
  • 13
    • Kocsis, L., and Szepesvári, C. 2006. Bandit based Monte-Carlo planning. In Proc. ECML-2006, 282-293.
  • 14
    • Koenig, S., and Sun, X. 2009. Comparing real-time and incremental heuristic search for real-time situated agents. Autonomous Agents and Multi-Agent Systems 18(3):313-341.
  • 15
    • Likhachev, M.; Gordon, G.; and Thrun, S. 2003. ARA*: Anytime A* with provable bounds on sub-optimality. In Proc. NIPS.
  • 16
    • Munos, R., and Coquelin, P. 2007. Bandit algorithms for tree search. In Proc. UAI, 67-74.
  • 19
    • Ramanujan, R.; Sabharwal, A.; and Selman, B. 2010. On adversarial search spaces and sampling-based planning. In Proc. ICAPS, 242-245.
  • 20
    • Silver, D., and Veness, J. 2010. Monte-Carlo planning in large POMDPs. In Proc. NIPS, 2164-2172.
  • 22
    • Thayer, J., and Ruml, W. 2010. Anytime heuristic search: Frameworks and algorithms. In Proc. SOCS.
  • 23
    • Walsh, T.; Goschin, S.; and Littman, M. 2010. Integrating sample-based planning and model-based reinforcement learning. In Proc. AAAI.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.