-
1
-
-
0003477315
-
Reinforcement learning with high-dimensional continuous actions
-
Wright Laboratory, Wright-Patterson Air Force Base
-
Baird, L., and Klopf, H. 1993. Reinforcement Learning with High-Dimensional Continuous Actions, Technical Report WL-TR-93-1147, Wright Laboratory, Wright-Patterson Air Force Base.
-
(1993)
Technical Report
, vol.WL-TR-93-1147
-
-
Baird, L.1
Klopf, H.2
-
2
-
-
0003587784
-
Residual Q-learning applied to visual attention
-
San Francisco, Calif.: Morgan Kaufmann
-
Bandera, C.; Vico, F.; Bravo, J.; Harmon, M.; and Baird, L. 1996. Residual Q-Learning Applied to Visual Attention. In Proceedings of the Thirteenth International Conference on Machine Learning, 20-27. San Francisco, Calif.: Morgan Kaufmann.
-
(1996)
Proceedings of the Thirteenth International Conference on Machine Learning
, pp. 20-27
-
-
Bandera, C.1
Vico, F.2
Bravo, J.3
Harmon, M.4
Baird, L.5
-
3
-
-
0029210635
-
Learning to act using real-time dynamic programming
-
Barto, A.; Bradtke, S.; and Singh, S. 1995. Learning to Act Using Real-Time Dynamic Programming. Artificial Intelligence 72:81-138.
-
(1995)
Artificial Intelligence
, vol.72
, pp. 81-138
-
-
Barto, A.1
Bradtke, S.2
Singh, S.3
-
4
-
-
0020970738
-
Neuronlike adaptive elements that can solve difficult learning control problems
-
Barto, A.; Sutton, R.; and Anderson, C. 1983. Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems. IEEE Transactions on Systems, Man, and Cybernetics 13(5): 834-846.
-
(1983)
IEEE Transactions on Systems, Man, and Cybernetics
, vol.13
, Issue.5
, pp. 834-846
-
-
Barto, A.1
Sutton, R.2
Anderson, C.3
-
7
-
-
0029752592
-
Average reward reinforcement learning: Foundations, algorithms, and empirical results
-
Mahadevan, S. 1996b. Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results. Machine Learning 22:159-196.
-
(1996)
Machine Learning
, vol.22
, pp. 159-196
-
-
Mahadevan, S.1
-
8
-
-
17144430819
-
Sensitive-discount optimality: Unifying average-reward and discounted reinforcement learning
-
San Francisco, Calif.: Morgan Kaufmann
-
Mahadevan, S. 1996c. Sensitive-Discount Optimality: Unifying Average-Reward and Discounted Reinforcement Learning. In Proceedings of the Thirteenth International Conference on Machine Learning, 328-336. San Francisco, Calif.: Morgan Kaufmann.
-
(1996)
Proceedings of the Thirteenth International Conference on Machine Learning
, pp. 328-336
-
-
Mahadevan, S.1
-
9
-
-
0026880130
-
Automatic programming of behavior-based robots using reinforcement learning
-
Mahadevan, S., and Connell, J. 1992. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning. Artificial Intelligence 55:311-365.
-
(1992)
Artificial Intelligence
, vol.55
, pp. 311-365
-
-
Mahadevan, S.1
Connell, J.2