-
1
-
-
0001700171
-
A markovian decision process
-
R. Bellman, A markovian decision process, Journal of Mathematics and Mechanics, vol.6, no.3, pp.679-693, 1957.
-
(1957)
Journal of Mathematics and Mechanics
, vol.6
, Issue.3
, pp. 679-693
-
-
Bellman, R.1
-
2
-
-
84972263711
-
Intelligent agents: Theory and practice
-
M. Wooldridge and N. Jennings, Intelligent agents: Theory and practice, Knowledge Engineering Review, vol.10, no.2, pp.115-152, 1995.
-
(1995)
Knowledge Engineering Review
, vol.10
, Issue.2
, pp. 115-152
-
-
Wooldridge, M.1
Jennings, N.2
-
3
-
-
0003787146
-
-
Princeton University Press, Princeton, NJ
-
R. Bellman, Dynamic Programming, Princeton University Press, Princeton, NJ, 1957.
-
(1957)
Dynamic Programming
-
-
Bellman, R.1
-
4
-
-
77956317471
-
Artificial Intelligence
-
Prentice-Hall, Englewood Cliffs, NJ
-
S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Prentice-Hall, Englewood Cliffs, NJ, 1995.
-
(1995)
Artificial Intelligence: A Modern Approach
-
-
Russell, S.1
Norvig, P.2
-
5
-
-
33847202724
-
Learning to predict by the method of temporal differences
-
R. S. Sutton, Learning to predict by the method of temporal differences, Machine Learning, vol.3, no.1, pp.9-44, 1988.
-
(1988)
Machine Learning
, vol.3
, Issue.1
, pp. 9-44
-
-
Sutton, R.S.1
-
6
-
-
26944466214
-
Function approximation via tile coding: Automating parameter choice
-
Proc. of the SARA 2005, Berlin
-
A. A. Sherstov and P. Stone, Function approximation via tile coding: Automating parameter choice, Proc. of the SARA 2005, Berlin, LNCS, no.3607, pp.194-205, 2005.
-
(2005)
LNCS
, vol.3607
, pp. 194-205
-
-
Sherstov, A.A.1
Stone, P.2
-
7
-
-
84988783053
-
Convergence of reinforcement learning algorithms and acceleration of learning
-
A. Potapov and M. K. Ali, Convergence of reinforcement learning algorithms and acceleration of learning, Physical Review E, vol.67, no.2, 2003.
-
(2003)
Physical Review E
, vol.67
, Issue.2
-
-
Potapov, A.1
Ali, M.K.2
-
8
-
-
64349089159
-
-
Teambots, http://www.cs.cmu.edu/~trb/Teambots/Domains/SoccerBots, 2000.
-
-
-
-
9
-
-
49649148257
-
A theory of cerebellar function
-
J. S. Albus, A theory of cerebellar function, Mathematical Biosciences, vol.10, pp.25-61, 1971.
-
(1971)
Mathematical Biosciences
, vol.10
, pp. 25-61
-
-
Albus, J.S.1
-
10
-
-
64349111008
-
-
Ph.D. Thesis, Cambridge University, Cambridge, England
-
C. J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, Cambridge University, Cambridge, England, 1989.
-
(1989)
Learning from Delayed Rewards
-
-
Watkins, C.J.C.H.1
-
12
-
-
0028388685
-
TD(λ) converges with probability 1
-
P. Dayan and T. J. Sejnowski, TD(λ) converges with probability 1, Machine Learning, vol.14, no.3, pp.295-301, 1994.
-
(1994)
Machine Learning
, vol.14
, Issue.3
, pp. 295-301
-
-
Dayan, P.1
Sejnowski, T.J.2
-
13
-
-
0028497630
-
Asynchronous stochastic approximation and Q-learning
-
J. N. Tsitsiklis, Asynchronous stochastic approximation and Q-learning, Machine Learning, vol.16, no.3, pp.185-202, 1994.
-
(1994)
Machine Learning
, vol.16
, Issue.3
, pp. 185-202
-
-
Tsitsiklis, J.N.1
-
14
-
-
0003786198
-
Incremental Learning of Evaluation Functions for Absorbing Markov Chains: New Methods and Theorems
-
Preprint
-
L. Gurvits, L. J. Lin and S. J. Hanson, Incremental Learning of Evaluation Functions for Absorbing Markov Chains: New Methods and Theorems, Preprint, 1994.
-
(1994)
-
-
Gurvits, L.1
Lin, L.J.2
Hanson, S.J.3
-
15
-
-
0031143730
-
An analysis of temporal-difference learning with function approximation
-
J. N. Tsitsiklis and B. Van Roy, An analysis of temporal-difference learning with function approximation, IEEE Transactions on Automatic Control, vol.42, no.5, pp.674-690, 1997.
-
(1997)
IEEE Transactions on Automatic Control
, vol.42
, Issue.5
, pp. 674-690
-
-
Tsitsiklis, J.N.1
Van Roy, B.2
-
16
-
-
38049144717
-
Reinforcement learning of competitive skills with soccer agents
-
Proc. of the 11th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems
-
J. Leng, C. Fyfe and L. Jain, Reinforcement learning of competitive skills with soccer agents, Proc. of the 11th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, LNAI 4692, pp.572-579, 2007.
-
(2007)
LNAI
, vol.4692
, pp. 572-579
-
-
Leng, J.1
Fyfe, C.2
Jain, L.3
-
17
-
-
0010495476
-
On step-size and bias in temporal-difference learning
-
New Haven, CT
-
R. S. Sutton and S. P. Singh, On step-size and bias in temporal-difference learning, Proc. of the Eighth Yale Workshop on Adaptive and Learning Systems, New Haven, CT, pp.91-96, 1994.
-
(1994)
Proc. of the Eighth Yale Workshop on Adaptive and Learning Systems
, pp. 91-96
-
-
Sutton, R.S.1
Singh, S.P.2
-
19
-
-
63649137867
-
-
X. Cai, Z. Cui, J. Zeng and Y. Tan, Performance-dependent adaptive particle swarm optimization, International Journal of Innovative Computing, Information and Control, vol.3, no.6(B), pp.1697-1706, 2007.
-
-
-
-
20
-
-
48249095357
-
-
S.-C. Chu and P.-W. Tsai, Computational intelligence based on the behavior of cats, International Journal of Innovative Computing, Information and Control, vol.3, no.1, pp.163-173, 2007.
-
-
-
-
21
-
-
5744249209
-
Equation of state calculations by fast computing machines
-
N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller and E. Teller, Equation of state calculations by fast computing machines, Journal of Chemical Physics, vol.21, pp.1087-1091, 1953.
-
(1953)
Journal of Chemical Physics
, vol.21
, pp. 1087-1091
-
-
Metropolis, N.1
Rosenbluth, A.W.2
Rosenbluth, M.N.3
Teller, A.H.4
Teller, E.5
-
22
-
-
0024900644
-
Very fast simulated re-annealing
-
L. Ingber, Very fast simulated re-annealing, Mathematical and Computer Modelling, vol.12, no.8, pp.967-973, 1989.
-
(1989)
Mathematical and Computer Modelling
, vol.12
, Issue.8
, pp. 967-973
-
-
Ingber, L.1
-
23
-
-
38049168425
-
A reinforcement learning method based on adaptive simulated annealing
-
A. F. Atiya, A. G. Parlos and L. Ingber, A reinforcement learning method based on adaptive simulated annealing, Proc. of the 46th IEEE International Midwest Symposium on, vol.1, pp.121-124, 2003.
-
(2003)
Proc. of the 46th IEEE International Midwest Symposium on
, vol.1
, pp. 121-124
-
-
Atiya, A.F.1
Parlos, A.G.2
Ingber, L.3
-
25
-
-
0002363078
-
On the experimental attainment of optimum conditions (with discussion)
-
G. E. P. Box and K. B. Wilson, On the experimental attainment of optimum conditions (with discussion), Journal of the Royal Statistical Society Series B, vol.13, no.1, pp.1-45, 1951.
-
(1951)
Journal of the Royal Statistical Society Series B
, vol.13
, Issue.1
, pp. 1-45
-
-
Box, G.E.P.1
Wilson, K.B.2
-
27
-
-
84869265512
-
-
The MathWorks
-
The MathWorks. http://www.mathworks.com.
-
-
-