-
1
-
-
20444403165
-
Research on reinforcement learning technology: A review
-
January
-
Yang G, Chen S F, "Research on Reinforcement Learning Technology: A Review", Acta Automatica Sinica, Vol.30, No.1, pp. 89-100, January 2004.
-
(2004)
Acta Automatica Sinica
, vol.30
, Issue.1
, pp. 89-100
-
-
Yang, G.1
Chen, S.F.2
-
2
-
-
0036619222
-
Research on Q-learning algorithm based on Metropolis criterion
-
June
-
Guo Maozu, Wang Yadong, Sun Huamei and Liu Yang. "Research on Q-learning Algorithm Based on Metropolis Criterion", Journal of Computer Research and Development, Vol.39, No.6, pp. 684-688, June 2002.
-
(2002)
Journal of Computer Research and Development
, vol.39
, Issue.6
, pp. 684-688
-
-
Maozu, G.1
Yadong, W.2
Huamei, S.3
Yang, L.4
-
3
-
-
0033170372
-
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
-
Sutton R S, Precup D, Singh S. "Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning", Artificial Intelligence, Vol.112, No. 1/2, pp. 181-211, 1999.
-
(1999)
Artificial Intelligence
, vol.112
, Issue.1-2
, pp. 181-211
-
-
Sutton, R.S.1
Precup, D.2
Singh, S.3
-
4
-
-
4844229682
-
A summary on reinforcement learning
-
March
-
Guo Maozu, Chen Bin, Wang Xiaolong et al, "A summary on reinforcement learning", Computer Science, Vol.25, No.3, pp. 13-15, March 1998.
-
(1998)
Computer Science
, vol.25
, Issue.3
, pp. 13-15
-
-
Maozu, G.1
Bin, C.2
Xiaolong, W.3
-
5
-
-
0029679044
-
Reinforcement learning: A survey
-
February
-
Kaelbling L P, Littman M L, Moore A W, "Reinforcement learning: A survey", Journal of Artificial Intelligence Research, Vol.4, No.2, pp. 237-285, February 1996.
-
(1996)
Journal of Artificial Intelligence Research
, vol.4
, Issue.2
, pp. 237-285
-
-
Kaelbling, L.P.1
Littman, M.L.2
Moore, A.W.3
-
6
-
-
0035456396
-
A learning agent based on reinforcement learning
-
September
-
Li Ning, Gao Yang, Lu Xin, Chen Shi-Fu, "A learning agent based on reinforcement learning", Computer Research and Development, Vol.38, No.9, pp. 1051-1056, September 2001.
-
(2001)
Computer Research and Development
, vol.38
, Issue.9
, pp. 1051-1056
-
-
Ning, L.1
Yang, G.2
Xin, L.3
Shi-Fu, C.4
-
7
-
-
0004049893
-
-
[PhD dissertation]. Psychology Department, Cambridge University, England
-
Watkins, "Learning from delayed rewards" [PhD dissertation]. Psychology Department, Cambridge University, England, 1989.
-
(1989)
Learning from Delayed Rewards
-
-
Watkins1
-
9
-
-
0028497630
-
Asynchronous stochastic approximation and Q-learning
-
March
-
Tsitsiklis, John N, "Asynchronous stochastic approximation and Q-learning", Machine Learning, Vol.16, No.3, pp. 185-202, March 1994.
-
(1994)
Machine Learning
, vol.16
, Issue.3
, pp. 185-202
-
-
Tsitsiklis, J.N.1
-
10
-
-
6344247324
-
An agent team based reinforcement learning model and its application
-
September
-
Cai Qingsheng, Zhang Bo, "An Agent Team Based Reinforcement Learning Model and Its Application", Journal of Computer Research and Development, Vol.37, No.9, pp. 1087-1093, September 2000.
-
(2000)
Journal of Computer Research and Development
, vol.37
, Issue.9
, pp. 1087-1093
-
-
Qingsheng, C.1
Bo, Z.2
-
11
-
-
0035978635
-
Modular Q-learning based multi-agent cooperation for robot soccer
-
February
-
Park Kui-Hong, Kim Yong-Jae, Kim Jong-Hwan, "Modular Q-learning based multi-agent cooperation for robot soccer", Robotics and Autonomous Systems, Vol.35, No. 2, pp. 109-122, February 2001.
-
(2001)
Robotics and Autonomous Systems
, vol.35
, Issue.2
, pp. 109-122
-
-
Kui-Hong, P.1
Yong-Jae, K.2
Jong-Hwan, K.3
-
12
-
-
0003529066
-
On optimal cooperation of knowledge sources
-
Boeing AI Center, Boeing Computer Services, Bellevue, WA, August
-
M. Benda, V. Jagannathan, R. Dodhiawala. On optimal cooperation of knowledge sources. Technical Report BCS-G2010-28, Boeing AI Center, Boeing Computer Services, Bellevue, WA, August 1985.
-
(1985)
Technical Report
, vol.BCS-G2010-28
-
-
Benda, M.1
Jagannathan, V.2
Dodhiawala, R.3
-
13
-
-
84949940071
-
Evolving behavioral strategies in predators and prey
-
Gerhard Weiß and Sandip Sen, editors. Springer Verlag, Berlin
-
Thomas Haynes, Sandip Sen. Evolving behavioral strategies in predators and prey. In Gerhard Weiß and Sandip Sen, editors, Adaptation and Learning in Multiagent Systems, pages 113-126. Springer Verlag, Berlin, 1996.
-
(1996)
Adaptation and Learning in Multiagent Systems
, pp. 113-126
-
-
Haynes, T.1
Sen, S.2