3. Tan AH, Ong YS, Tapanuj A (2011) A hybrid agent architecture integrating desire, intention and reinforcement learning. Expert Syst Appl 38(7):8477–8487
4. Tang L, Liu Y-J, Tong S (2014) Adaptive neural control using reinforcement learning for a class of robot manipulator. Neural Comput Appl 25(1):135–141
5. Wang D, Liu D, Zhao D, Huang Y, Zhang D (2013) A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints. Neural Comput Appl 22(2):219–227
6. Wei Q, Liu D (2014) Stable iterative adaptive dynamic programming algorithm with approximation errors for discrete-time nonlinear systems. Neural Comput Appl 24(6):1355–1367
7. Wang B, Zhao D, Alippi C, Liu D (2014) Dual heuristic dynamic programming for nonlinear discrete-time uncertain systems with state delay. Neurocomputing 134:222–229
8. Watkins C (1989) Learning from delayed rewards. PhD thesis, Cambridge University, Cambridge
11. Liu D, Wang D, Zhao D, Wei Q, Jin N (2012) Neural-network-based optimal control for a class of unknown discrete-time nonlinear systems using globalized dual heuristic programming. IEEE Trans Autom Sci Eng 9(3):628–634
12. Thrun SB (1992) The role of exploration in learning control. In: White D, Sofge D (eds) Handbook for intelligent control: neural, fuzzy and adaptive approaches. Van Nostrand Reinhold, Florence, Kentucky 41022
13. Zhao D, Hu Z, Xia Z, Alippi C, Wang D (2014) Full range adaptive cruise control based on supervised adaptive dynamic programming. Neurocomputing 125:57–67
14. Zhao D, Wang B, Liu D (2013) A supervised actor-critic approach for adaptive cruise control. Soft Comput 17(11):2089–2099
15. Zhao D, Bai X, Wang F, Xu J, Yu W (2011) DHP for coordinated freeway ramp metering. IEEE Trans Intell Transp Syst 12(4):990–999
16. Bai X, Zhao D, Yi J (2009) The application of ADHDP(λ) method to coordinated multiple ramps metering. Int J Innov Comput 5(10(B)):3471–3481
17. Kearns M, Singh S (2002) Near-optimal reinforcement learning in polynomial time. Mach Learn 49(2–3):209–232
18. Brafman RI, Tennenholtz M (2003) R-max—a general polynomial time algorithm for near-optimal reinforcement learning. J Mach Learn Res 3:213–231
19. Strehl AL, Littman ML (2005) A theoretical analysis of model-based interval estimation. In: Proceedings of 22nd international conference on machine learning (ICML’05), pp 856–863
20. Strehl AL, Li L, Wiewiora E, Langford J, Littman ML (2006) PAC model-free reinforcement learning. In: Proceedings of 23rd international conference on machine learning (ICML’06), pp 881–888
21. Kakade S, Kearns MJ, Langford J (2003) Exploration in metric state spaces. In: Proceedings of 20th international conference on machine learning (ICML’03), pp 306–312
23. Bernstein A, Shimkin N (2010) Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains. Mach Learn 81(3):359–397
24. Munos R, Moore A (2002) Variable resolution discretization in optimal control. Mach Learn 49(2–3):291–323
27. Li H, Liu D (2012) Optimal control for discrete-time affine nonlinear systems using general value iteration. IET Control Theory Appl 6(18):2725–2736
28. Al-Tamimi A, Lewis FL, Abu-Khalaf M (2008) Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. Trans Syst Man Cyber Part B 38(4):943–949
29. Liu D, Yang X, Li H (2013) Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics. Neural Comput Appl 23(7–8):1843–1850
30. Zuo L, Xu X, Liu C, Huang Z (2013) A hierarchical reinforcement learning approach for optimal path tracking of wheeled mobile robots. Neural Comput Appl 23(7–8):1873–1883
31. Schoknecht R, Riedmiller M (2003) Reinforcement learning on explicitly specified time scales. Neural Comput Appl 12(2):61–80