[1] A. G. Barto and S. Mahadevan, "Recent advances in hierarchical reinforcement learning," Special Issue Reinforcement Learn., Discrete Event Syst. J., vol. 13, pp. 41-77, 2003.
[2] M. Kearns and S. Singh, "Near-optimal reinforcement learning in polynomial time," in Proc. Int. Conf. Mach. Learn., 1999, pp. 260-268.
[3] M. Littman and C. Szepesvari, "A generalized reinforcement learning model: Convergence and applications," in Proc. 13th Int. Conf. Mach. Learn., 1996, pp. 310-318.
[4] R. E. Parr, "Hierarchical control and learning for Markov decision processes," Ph.D. dissertation, Univ. California Berkeley, Berkeley, CA, 1998.
[5] G. Rummery and M. Niranjan, "On-line Q-learning using connectionist systems," Cambridge Univ., Eng. Dept., Tech. Rep. CUED/F-INFENG/TR 166, 1994.
[6] I. Johnson and M. D. Plumbley, "On-line connectionist Q-learning produces unreliable performance with a synonym finding task," in Proc. Int. Joint Conf. Neural Netw., Jul. 2000, pp. 24-27.
[7] R. Jaksa, P. Majernik, and P. Sincak, "Reinforcement learning based on back propagation for mobile robot navigation," Computational Intelligence Group, Dept. Cybern. Artif. Intell., Technical Univ., Kosice, Slovakia, 2000.
[10] S. Mikami and Y. Kakazu, "Genetic reinforcement learning for cooperative traffic signal control," in Proc. 1st IEEE Conf. Evol. Comput., 1994, vol. 1, pp. 223-228.
[11] W. Wei and Y. Zhang, "FL-FN based traffic signal control," in Proc. 2002 IEEE Int. Conf. Fuzzy Syst., May 2002, vol. 1, no. 12-17, pp. 296-300.
[12] J. C. Spall and D. C. Chin, "Traffic-responsive signal timing for system-wide traffic control," Transportation Res. Part C, Emerging Technol., vol. 5, no. 3/4, pp. 153-163, 1997.
[13] E. Bingham, "Reinforcement learning in neural fuzzy traffic signal control," Eur. J. Oper. Res., vol. 131, no. 2, pp. 232-241, 2001.
[15] M. Baglietto, T. Parisini, and R. Zoppoli, "Distributed-information neural control: The case of dynamic routing in traffic networks," IEEE Trans. Neural Netw., vol. 12, no. 3, pp. 485-502, May 2001.
[16] T. Parisini, M. Sanguineti, and R. Zoppoli, "Nonlinear stabilization by receding-horizon neural regulators," Int. J. Control, vol. 70, no. 3, pp. 341-362, 1998.
[18] A. V. Wouwer, C. Renotte, and M. Remy, "On the use of simultaneous perturbation stochastic approximation for neural network training," in Proc. Amer. Control Conf., 1999, pp. 388-392.
[19] E. Gomez-Ramirez, P. L. Najim, and E. Ikonen, "Stochastic learning control for nonlinear systems," in Proc. Int. Joint Conf. Neural Netw. (IJCNN'02), May 2002, vol. 1, pp. 171-176.
[20] T. Jaakkola, M. I. Jordan, and S. P. Singh, "On the convergence of stochastic iterative dynamic programming algorithms," Neural Comput., vol. 6, no. 6, pp. 1185-1201, 1994.
[21] M. Littman and C. Szepesvari, "A generalized reinforcement learning model: Convergence and applications," in Proc. 13th Int. Conf. Mach. Learn., 1996, pp. 310-318.
[22] J. C. Spall, "Multivariate stochastic approximation using a simultaneous perturbation gradient approximation," IEEE Trans. Autom. Control, vol. 37, no. 3, pp. 332-341, Mar. 1992.
[23] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Statist., vol. 22, pp. 400-407, 1951.
[24] J. Kiefer and J. Wolfowitz, "Stochastic estimation of the maximum of a regression function," Ann. Math. Statist., vol. 23, pp. 462-466, 1952.
[25] R. Jaksa, P. Majernik, and P. Sincak, "Reinforcement learning based on back propagation for mobile robot navigation," in Proc. Comput. Intell. Modeling, Control, Autom. (CIMCA), Vienna, Austria, 1999.
[27] R. A. Jacobs, "Increased rates of convergence through learning rate adaptation," Neural Netw., vol. 1, pp. 295-307, 1988.
[28] Z. Luo, "On the convergence of the LMS algorithm with adaptive learning rate for linear feedforward networks," Neural Comput., vol. 3, pp. 226-245, 1991.
[29] T. Kohonen, Self-Organizing Maps, 2nd ed. Berlin, Germany: Springer-Verlag, 1997.
[30] G. Yan, F. Yang, T. Hickey, and M. Goldstein, "Coordination of exploration and exploitation in a dynamic environment," in Proc. Int. Joint Conf. Neural Netw. (IJCNN), 2001, pp. 1014-1018.
[32] P. B. Wolshon and W. C. Taylor, "Analysis of intersection delay under real-time adaptive signal control," Transportation Res. Part C, Emerging Technol., vol. 7C, no. 1, pp. 53-72, Feb. 1999.