1. Bach FR, Jordan MI (2002) Kernel independent component analysis. J Mach Learn Res 3: 1-48.
2. Barto AG, Sutton RS, Anderson CW (1983) Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans Syst Man Cybern 13(5): 835-846.
3. Baxter J, Bartlett PL (2001) Infinite-horizon policy-gradient estimation. J Artif Intell Res 15: 319-350.
5. Boyan J (2002) Technical update: least-squares temporal difference learning. Mach Learn 49(2-3): 233-246.
6. Crites RH, Barto AG (1998) Elevator group control using multiple reinforcement learning agents. Mach Learn 33(2-3): 235-262.
7. Dayan P (1992) The convergence of TD(λ) for general λ. Mach Learn 8: 341-362.
8. Dayan P, Sejnowski TJ (1994) TD(λ) converges with probability 1. Mach Learn 14: 295-301.
9. Engel Y, Mannor S, Meir R (2004) The kernel recursive least-squares algorithm. IEEE Trans Signal Process 52(8): 2275-2285.
14. Mahadevan S, Maggioni M (2007) Proto-value functions: a Laplacian framework for learning representation and control in Markov decision processes. J Mach Learn Res 8: 2169-2231.
17. Rasmussen CE, Kuss M (2004) Gaussian processes in reinforcement learning. In: Thrun S, Saul LK, Schölkopf B (eds) Advances in neural information processing systems, vol 16. MIT Press, Cambridge, pp 751-759.
19. Singh SP, Jaakkola T, Littman ML, Szepesvari C (2000) Convergence results for single-step on-policy reinforcement-learning algorithms. Mach Learn 38: 287-308.
20. Sutton R (1988) Learning to predict by the method of temporal differences. Mach Learn 3(1): 9-44.
21. Sutton R (1996) Generalization in reinforcement learning: successful examples using sparse coarse coding. In: Advances in neural information processing systems, vol 8. MIT Press, Cambridge, pp 1038-1044.
23. Tesauro G (1994) TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Comput 6: 215-219.
24. Tsitsiklis JN (1994) Asynchronous stochastic approximation and Q-learning. Mach Learn 16: 185-202.
25. Tsitsiklis JN, Van Roy B (1997) An analysis of temporal difference learning with function approximation. IEEE Trans Autom Control 42(5): 674-690.
27. Whiteson S, Stone P (2006) Evolutionary function approximation for reinforcement learning. J Mach Learn Res 7: 877-917.
28. Williams RJ (1992) Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach Learn 8: 229-256.
29. Xu X, Hu DW, Lu XC (2007) Kernel-based least-squares policy iteration for reinforcement learning. IEEE Trans Neural Netw 18(4): 973-997.