-
1
-
-
0029210635
-
Learning to act using real-time dynamic programming
-
A. G. Barto, S. J. Bradtke, and S. P. Singh, Learning to act using real-time dynamic programming, Artificial Intelligence, vol. 72, pp. 81-138, 1995.
-
(1995)
Artificial Intelligence
, vol.72
, pp. 81-138
-
-
Barto, A.G.1
Bradtke, S.J.2
Singh, S.P.3
-
3
-
-
0013495368
-
Infinite-horizon gradient-based policy search: II. Gradient ascent algorithms and experiments
-
J. Baxter, P. L. Bartlett, and L. Weaver, Infinite-horizon gradient-based policy search: II. Gradient ascent algorithms and experiments, Journal of Artificial Intelligence Research, vol. 15, pp. 351-381, 2001.
-
(2001)
Journal of Artificial Intelligence Research
, vol.15
, pp. 351-381
-
-
Baxter, J.1
Bartlett, P.L.2
Weaver, L.3
-
8
-
-
0001771345
-
Linear least-squares algorithms for temporal difference learning
-
S. J. Bradtke and A. G. Barto, Linear least-squares algorithms for temporal difference learning, Machine Learning, vol. 22, pp. 33-57, 1996.
-
(1996)
Machine Learning
, vol.22
, pp. 33-57
-
-
Bradtke, S.J.1
Barto, A.G.2
-
9
-
-
0003802343
-
-
Wadsworth and Brooks, Monterey, CA
-
L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Wadsworth and Brooks, Monterey, CA, 1984.
-
(1984)
Classification and Regression Trees
-
-
Breiman, L.1
Friedman, J.H.2
Olshen, R.A.3
Stone, C.J.4
-
10
-
-
84899017487
-
Motivated reinforcement learning
-
T. G. Dietterich, S. Becker, and Z. Ghahramani (eds.), MIT Press, Cambridge, MA
-
P. Dayan, Motivated reinforcement learning, in T. G. Dietterich, S. Becker, and Z. Ghahramani (eds.), Advances in Neural Information Processing Systems 14, Proc. of the 2002 Conference, pp. 11-18, MIT Press, Cambridge, MA, 2003.
-
(2003)
Advances in Neural Information Processing Systems 14, Proc. of the 2002 Conference
, pp. 11-18
-
-
Dayan, P.1
-
11
-
-
84899029004
-
Batch value function approximation via support vectors
-
T. G. Dietterich, S. Becker, and Z. Ghahramani (eds.), MIT Press, Cambridge, MA
-
T. G. Dietterich and X. Wang, Batch value function approximation via support vectors, in T. G. Dietterich, S. Becker, and Z. Ghahramani (eds.), Advances in Neural Information Processing Systems 14, Proc. of the 2002 Conference, pp. 1491-1498, MIT Press, Cambridge, MA, 2003.
-
(2003)
Advances in Neural Information Processing Systems 14, Proc. of the 2002 Conference
, pp. 1491-1498
-
-
Dietterich, T.G.1
Wang, X.2
-
13
-
-
0003922190
-
-
Wiley, New York
-
R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, Second Edition, Wiley, New York, 2001.
-
(2001)
Pattern Classification, Second Edition
-
-
Duda, R.O.1
Hart, P.E.2
Stork, D.G.3
-
17
-
-
1942420814
-
Reinforcement learning as classification: Leveraging modern classifiers
-
T. G. Fawcett, N. Mishra (eds.), AAAI Press, Menlo Park, CA
-
M. G. Lagoudakis and R. Parr, Reinforcement learning as classification: leveraging modern classifiers, in T. G. Fawcett, N. Mishra (eds.), Proc. 20th International Conference on Machine Learning, pp. 424-431, AAAI Press, Menlo Park, CA, 2003.
-
(2003)
Proc. 20th International Conference on Machine Learning
, pp. 424-431
-
-
Lagoudakis, M.G.1
Parr, R.2
-
19
-
-
77956759998
-
Reinforcement learning control and pattern recognition systems
-
J. M. Mendel and K. S. Fu (eds.), Academic Press, New York
-
J. M. Mendel and R. W. McLaren, Reinforcement learning control and pattern recognition systems, in J. M. Mendel and K. S. Fu (eds.), Adaptive Learning and Pattern Recognition Systems: Theory and Applications, pp. 287-318, Academic Press, New York, 1970.
-
(1970)
Adaptive Learning and Pattern Recognition Systems: Theory and Applications
, pp. 287-318
-
-
Mendel, J.M.1
McLaren, R.W.2
-
20
-
-
0347592013
-
Behavioural clones and cognitive skill models
-
K. Furukawa, D. Michie, and S. Muggleton (eds.), Oxford University Press, New York
-
D. Michie and C. Sammut, Behavioural clones and cognitive skill models, in K. Furukawa, D. Michie, and S. Muggleton (eds.), Machine Intelligence 14: Applied Machine Intelligence, pp. 387-395, Oxford University Press, New York, 1996.
-
(1996)
Machine Intelligence 14: Applied Machine Intelligence
, pp. 387-395
-
-
Michie, D.1
Sammut, C.2
-
22
-
-
84937350040
-
Steps toward artificial intelligence
-
E. A. Feigenbaum and J. Feldman (eds.), Computers and Thought, pp. 406-450, McGraw-Hill, New York
-
M. L. Minsky, Steps toward artificial intelligence, Proc. of the Institute of Radio Engineers, vol. 49, pp. 8-30, 1961. Reprinted in E. A. Feigenbaum and J. Feldman (eds.), Computers and Thought, pp. 406-450, McGraw-Hill, New York, 1963.
-
(1961)
Proc. of the Institute of Radio Engineers
, vol.49
, pp. 8-30
-
-
Minsky, M.L.1
-
24
-
-
0027684215
-
Prioritized sweeping: Reinforcement learning with less data and less real time
-
A. W. Moore and C. G. Atkeson, Prioritized sweeping: reinforcement learning with less data and less real time, Machine Learning, vol. 13, pp. 103-130, 1993.
-
(1993)
Machine Learning
, vol.13
, pp. 103-130
-
-
Moore, A.W.1
Atkeson, C.G.2
-
25
-
-
0003212629
-
Efficient training of artificial neural networks for autonomous navigation
-
D. A. Pomerleau, Efficient training of artificial neural networks for autonomous navigation, Neural Computation, vol. 3, pp. 88-97, 1991.
-
(1991)
Neural Computation
, vol.3
, pp. 88-97
-
-
Pomerleau, D.A.1
-
27
-
-
0001201756
-
Some studies in machine learning using the game of checkers
-
Reprinted in E. A. Feigenbaum and J. Feldman (eds.), Computers and Thought, pp. 71-105, McGraw-Hill, New York
-
A. L. Samuel, Some studies in machine learning using the game of checkers, IBM Journal on Research and Development, vol. 3, pp. 211-229, 1959. Reprinted in E. A. Feigenbaum and J. Feldman (eds.), Computers and Thought, pp. 71-105, McGraw-Hill, New York, 1963.
-
(1959)
IBM Journal on Research and Development
, vol.3
, pp. 211-229
-
-
Samuel, A.L.1
-
30
-
-
33847202724
-
Learning to predict by the method of temporal differences
-
R. S. Sutton, Learning to predict by the method of temporal differences, Machine Learning, vol. 3, pp. 9-44, 1988.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
31
-
-
85156221438
-
Generalization in reinforcement learning: Successful examples using coarse coding
-
D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (eds.), MIT Press, Cambridge, MA
-
R. S. Sutton, Generalization in reinforcement learning: successful examples using coarse coding, in D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo (eds.), Advances in Neural Information Processing Systems, Proc. of the 1995 Conference, pp. 1038-1044, MIT Press, Cambridge, MA, 1996.
-
(1996)
Advances in Neural Information Processing Systems, Proc. of the 1995 Conference
, pp. 1038-1044
-
-
Sutton, R.S.1
-
32
-
-
0001046225
-
Practical issues in temporal difference learning
-
G. J. Tesauro, Practical issues in temporal difference learning, Machine Learning, vol. 8, pp. 217-257, 1992.
-
(1992)
Machine Learning
, vol.8
, pp. 217-257
-
-
Tesauro, G.J.1
-
33
-
-
0000985504
-
TD-Gammon, a self-teaching backgammon program, achieves master-level play
-
G. J. Tesauro, TD-Gammon, a self-teaching backgammon program, achieves master-level play, Neural Computation, vol. 6, pp. 215-219, 1994.
-
(1994)
Neural Computation
, vol.6
, pp. 215-219
-
-
Tesauro, G.J.1
-
34
-
-
0029276036
-
Temporal Difference Learning and TD-Gammon
-
G. Tesauro, Temporal Difference Learning and TD-Gammon, Communications of the ACM, vol. 38, pp. 58-68, 1995.
-
(1995)
Communications of the ACM
, vol.38
, pp. 58-68
-
-
Tesauro, G.1
-
36
-
-
0002988210
-
Computing machinery and intelligence
-
Reprinted in E. A. Feigenbaum and J. Feldman (eds.), Computers and Thought, pp. 11-15, McGraw-Hill, New York, 1963
-
A. M. Turing, Computing machinery and intelligence, Mind, vol. 59, pp. 433-460, 1950. Reprinted in E. A. Feigenbaum and J. Feldman (eds.), Computers and Thought, pp. 11-15, McGraw-Hill, New York, 1963.
-
(1950)
Mind
, vol.59
, pp. 433-460
-
-
Turing, A.M.1
-
37
-
-
1942451973
-
Model-based policy gradient reinforcement learning
-
T. G. Fawcett, N. Mishra (eds.), AAAI Press, Menlo Park, CA
-
X. Wang and T. G. Dietterich, Model-based policy gradient reinforcement learning, in T. G. Fawcett, N. Mishra (eds.), Proc. 20th International Conference on Machine Learning, pp. 776-783, AAAI Press, Menlo Park, CA, 2003.
-
(2003)
Proc. 20th International Conference on Machine Learning
, pp. 776-783
-
-
Wang, X.1
Dietterich, T.G.2