-
1
-
-
0016555419
-
Data storage in the cerebellar model articulation controller (CMAC)
-
Albus, J.S.: Data storage in the cerebellar model articulation controller (CMAC). J. Dyn. Syst. Meas. Control 97, 228-233 (1975)
-
(1975)
J. Dyn. Syst. Meas. Control
, vol.97
, pp. 228-233
-
-
Albus, J.S.1
-
5
-
-
0034612523
-
Inspiration for optimization from social insect behaviour
-
Bonabeau, E., Dorigo, M., Theraulaz, G.: Inspiration for optimization from social insect behaviour. Nature 406 (2000)
-
(2000)
Nature
, vol.406
-
-
Bonabeau, E.1
Dorigo, M.2
Theraulaz, G.3
-
7
-
-
0043247546
-
Accelerating reinforcement learning by composing solutions of automatically identified subtasks
-
Drummond, C.: Accelerating reinforcement learning by composing solutions of automatically identified subtasks. J. Artif. Intell. Res. 16, 59-104 (2002)
-
(2002)
J. Artif. Intell. Res.
, vol.16
, pp. 59-104
-
-
Drummond, C.1
-
8
-
-
0024684020
-
Using occupancy grids for mobile robot perception and navigation
-
Elfes, A.: Using occupancy grids for mobile robot perception and navigation. Computer 22, 46-57 (1989)
-
(1989)
Computer
, vol.22
, pp. 46-57
-
-
Elfes, A.1
-
9
-
-
0036832959
-
Structure in the space of value functions
-
Foster, D., Dayan, P.: Structure in the space of value functions. Mach. Learn. 49, 325-346 (2002)
-
(2002)
Mach. Learn.
, vol.49
, pp. 325-346
-
-
Foster, D.1
Dayan, P.2
-
10
-
-
0000016031
-
Markov localization for mobile robots in dynamic environments
-
Fox, D., Burgard, W., Thrun, S.: Markov localization for mobile robots in dynamic environments. J. Artif. Intell. Res. 11, 391-427 (1999)
-
(1999)
J. Artif. Intell. Res.
, vol.11
, pp. 391-427
-
-
Fox, D.1
Burgard, W.2
Thrun, S.3
-
11
-
-
84899829959
-
A formal basis for the heuristic determination of minimum cost paths
-
Hart, P.E., Nilsson, N.J., Raphael, B.: A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 4, 100-107 (1968)
-
(1968)
IEEE Trans. Syst. Sci. Cybern.
, vol.4
, pp. 100-107
-
-
Hart, P.E.1
Nilsson, N.J.2
Raphael, B.3
-
12
-
-
0029679044
-
Reinforcement learning: A survey
-
Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: a survey. J. Artif. Intell. Res. 4, 237-285 (1996)
-
(1996)
J. Artif. Intell. Res.
, vol.4
, pp. 237-285
-
-
Kaelbling, L.P.1
Littman, M.L.2
Moore, A.W.3
-
14
-
-
0036832960
-
Continuous-action Q-learning
-
Millan, J.R., Posenato, D., Dedieu, E.: Continuous-action Q-learning. Mach. Learn. 49, 247-266 (2002)
-
(2002)
Mach. Learn.
, vol.49
, pp. 247-266
-
-
Millan, J.R.1
Posenato, D.2
Dedieu, E.3
-
16
-
-
0027684215
-
Prioritized sweeping: Reinforcement learning with less data and less time
-
Moore, A.W., Atkeson, C.G.: Prioritized sweeping: reinforcement learning with less data and less time. Mach. Learn. 13, 103-130 (1993)
-
(1993)
Mach. Learn.
, vol.13
, pp. 103-130
-
-
Moore, A.W.1
Atkeson, C.G.2
-
17
-
-
0036832953
-
Variable resolution discretization in optimal control
-
Munos, R., Moore, A.W.: Variable resolution discretization in optimal control. Mach. Learn. 49, 291-323 (2002)
-
(2002)
Mach. Learn.
, vol.49
, pp. 291-323
-
-
Munos, R.1
Moore, A.W.2
-
18
-
-
84977063352
-
Efficient learning and planning within the dyna framework
-
Peng, J., Williams, R.J.: Efficient learning and planning within the dyna framework. Adapt. Behav. 1, 437-454 (1993)
-
(1993)
Adapt. Behav.
, vol.1
, pp. 437-454
-
-
Peng, J.1
Williams, R.J.2
-
20
-
-
0003636089
-
On-line Q-learning using connectionist systems
-
Cambridge University Engineering Department
-
Rummery, G., Niranjan, M.: On-line Q-learning using connectionist systems. Technical report CUED/F-INFENG/TR 166, Cambridge University Engineering Department (1994)
-
(1994)
Technical Report CUED/F-INFENG/TR 166
-
-
Rummery, G.1
Niranjan, M.2
-
23
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
Sutton, R.S.: Learning to predict by the methods of temporal differences. Mach. Learn. 3, 9-44 (1988)
-
(1988)
Mach. Learn.
, vol.3
, pp. 9-44
-
-
Sutton, R.S.1
-
24
-
-
85132026293
-
Integrated architectures for learning, planning and reacting based on approximating dynamic programming
-
Morgan Kaufmann Austin
-
Sutton, R.S.: Integrated architectures for learning, planning and reacting based on approximating dynamic programming. In: Proceedings of the 7th International Conference on Machine Learning. Morgan Kaufmann, Austin (1990)
-
(1990)
Proceedings of the 7th International Conference on Machine Learning
-
-
Sutton, R.S.1
-
25
-
-
85156221438
-
Generalization in reinforcement learning: Successful examples using sparse coarse coding
-
Sutton, R.S.: Generalization in reinforcement learning: successful examples using sparse coarse coding. Adv. Neural. Inf. Process. Syst. 8, 1038-1044 (1996)
-
(1996)
Adv. Neural. Inf. Process. Syst.
, vol.8
, pp. 1038-1044
-
-
Sutton, R.S.1
-
27
-
-
0003629453
-
-
CS-96-11, Brown University, Department of Computer Science, Providence, RI
-
Szepesvári, C., Littman, M.: Generalized Markov decision processes: dynamic-programming and reinforcement-learning algorithms. CS-96-11, Brown University, Department of Computer Science, Providence, RI (1996)
-
(1996)
Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms
-
-
Szepesvári, C.1
Littman, M.2
-
28
-
-
0035336711
-
Robust Monte Carlo localization for mobile robots
-
Thrun, S., Fox, D., Burgard, W., Dellaert, F.: Robust Monte Carlo localization for mobile robots. Artif. Intell. 128, 99-141 (2001)
-
(2001)
Artif. Intell.
, vol.128
, pp. 99-141
-
-
Thrun, S.1
Fox, D.2
Burgard, W.3
Dellaert, F.4