[1] Asmuth, J., Li, L., Littman, M. L., Nouri, A., and Wingate, D. A Bayesian sampling approach to exploration in reinforcement learning. In UAI, 2009.

[3] Castro, P. S. and Precup, D. Using linear programming for Bayesian exploration in Markov Decision Processes. In IJCAI, 2007.

[4] Dearden, R., Friedman, N., and Russell, S. Bayesian Q-learning. In AAAI, pp. 761-768, 1998.

[7] Kaelbling, L. P., Littman, M. L., and Cassandra, A. R. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101:99-134, 1998.

[8] Kraines, D. and Kraines, V. Evolution of learning among Pavlov strategies in a competitive environment with noise. Journal of Conflict Resolution, 39:439-466, 1995.

[9] Kurniawati, H., Hsu, D., and Lee, W. S. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In RSS, 2008.

[10] Leonard, J., How, J., and Teller, S. A perception-driven autonomous urban vehicle. Journal of Field Robotics, 25(10):727-774, 2008.

[12] Littman, M. L. and Stone, P. Leading best-response strategies in repeated games. In IJCAI Workshop on Economic Agents, Models, and Mechanisms, 2001.

[13] Liu, Y. and Ozguner, U. Human driver model and driver decision making for intersection driving. In IEEE Intelligent Vehicles Symposium, pp. 642-647, 2007.

[14] Ng, A. and Jordan, M. PEGASUS: A policy search method for large MDPs and POMDPs. In UAI, pp. 406-415, 2000.

[15] Nowak, M. and Sigmund, K. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoner's dilemma game. Nature, 364, 1993.

[16] Ong, S. C. W., Png, S. W., Hsu, D., and Lee, W. S. Planning under uncertainty for robotic tasks with mixed observability. IJRR, 29(8):1053-1068, 2010.

[17] Pineau, J., Gordon, G., and Thrun, S. Point-based value iteration: An anytime algorithm for POMDPs. In IJCAI, pp. 1025-1032, 2003.

[19] Poupart, P. and Vlassis, N. Model-based Bayesian reinforcement learning in partially observable domains. In ISAIM, 2008.

[20] Poupart, P., Vlassis, N., Hoey, J., and Regan, K. An analytic solution to discrete Bayesian reinforcement learning. In ICML, pp. 697-704, 2006.

[21] Ross, S. and Pineau, J. Model-based Bayesian reinforcement learning in large structured domains. In UAI, 2008.

[23] Slany, W. and Kienreich, W. On some winning strategies for the Iterated Prisoner's Dilemma or Mr. Nice Guy and the Cosa Nostra. In The Iterated Prisoners' Dilemma: 20 Years On, 2007.

[24] Smith, T. and Simmons, R. G. Point-based POMDP algorithms: Improved analysis and implementation. In UAI, pp. 542-547, 2005.

[25] Wang, T., Lizotte, D., Bowling, M., and Schuurmans, D. Bayesian sparse sampling for on-line reward optimization. In ICML, 2005.