SCOPUS Information Retrieval Platform

IEEE Transactions on Neural Networks and Learning Systems

Volume 25, Issue 10, 2014, Pages 1909-1920

Clipping in neurocontrol by adaptive dynamic programming

Authors (3): Fairbank, Michael (a); Prokhorov, Danil (b); Alonso, Eduardo (a)

a CITY UNIVERSITY (United Kingdom)

b Toyota Motor North America Inc (United States)

Author keywords

Backpropagation through time (BPTT); clipping; dual heuristic programming (DHP); neurocontrol; value gradient learning

Indexed keywords

BACKPROPAGATION THROUGH TIME; CLIPPING; DUAL HEURISTIC PROGRAMMING; NEURO CONTROL; VALUE-GRADIENT LEARNING;

ALGORITHM; ARTIFICIAL INTELLIGENCE; ARTIFICIAL NEURAL NETWORK; COMPUTER SIMULATION; HUMAN; NONLINEAR SYSTEM; REINFORCEMENT;

ALGORITHMS; ARTIFICIAL INTELLIGENCE; COMPUTER SIMULATION; HUMANS; NEURAL NETWORKS (COMPUTER); NONLINEAR DYNAMICS; REINFORCEMENT (PSYCHOLOGY);

EID: 84907821545 PISSN: 2162237X EISSN: 21622388 Source Type: Journal
DOI: 10.1109/TNNLS.2014.2297991 Document Type: Article

Times cited : (12)

References (35)

1
- 66449130966
- Adaptive dynamic programming: An introduction
- May
- F.-Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39-47, May 2009.
- (2009) IEEE Comput. Intell. Mag , vol.4 , Issue.2 , pp. 39-47
- Wang, F.-Y.¹ Zhang, H.² Liu, D.³

2
- 84907812505
- Neurocontrol: Where it is going and why it is crucial
- North Holland, Amsterdam, I. Aleksander, Ed. New York, NY, USA: Taylor
- P. J. Werbos, "Neurocontrol: Where it is going and why it is crucial," in Artificial Neural Networks II (ICANN-2), North Holland, Amsterdam, I. Aleksander, Ed. New York, NY, USA: Taylor, 1992, pp. 61-68.
- (1992) Artificial Neural Networks II (ICANN-2) , pp. 61-68
- Werbos, P.J.¹

3
- 0033284518
- Neural networks for control
- M. Hagan and H. Demuth, "Neural networks for control," in Proc. Amer. Control Conf., vol. 3. 1999, pp. 1642-1656.
- (1999) Proc. Amer. Control Conf , vol.3 , pp. 1642-1656
- Hagan, M.¹ Demuth, H.²

4
- 0004102479
- Cambridge MA USA: MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA, USA: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

5
- 84883537695
- Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers
- Dec
- F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, "Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers," IEEE Control Syst., vol. 32, no. 6, pp. 76-105, Dec. 2012.
- (2012) IEEE Control Syst , vol.32 , Issue.6 , pp. 76-105
- Lewis, F.L.¹ Vrabie, D.² Vamvoudakis, K.G.³

6
- 0002031779
- Approximating dynamic programming for real-time control and neural modeling
- D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand, ch. 13
- P. J. Werbos, "Approximating dynamic programming for real-time control and neural modeling." in Handbook of Intelligent Control, D. A. White and D. A. Sofge, Eds. New York, NY, USA: Van Nostrand, 1992, pp. 493-525, ch. 13.
- (1992) Handbook of Intelligent Control , pp. 493-525
- Werbos, P.J.¹

7
- 0031236002
- Adaptive critic designs
- Sep
- D. Prokhorov and D. Wunsch, "Adaptive critic designs," IEEE Trans. Neural Netw., vol. 8, no. 5, pp. 997-1007, Sep. 1997.
- (1997) IEEE Trans. Neural Netw , vol.8 , Issue.5 , pp. 997-1007
- Prokhorov, D.¹ Wunsch, D.²

8
- 0025503558
- Backpropagation through time: What it does and how to do it
- Oct
- P. J. Werbos, "Backpropagation through time: What it does and how to do it," Proc. IEEE, vol. 78, no. 10, pp. 1550-1560, Oct. 1990.
- (1990) Proc. IEEE , vol.78 , Issue.10 , pp. 1550-1560
- Werbos, P.J.¹

9
- 84865069763
- Value-gradient learning
- Jun
- M. Fairbank and E. Alonso, "Value-gradient learning," in Proc. IEEE IJCNN, Jun. 2012, pp. 3062-3069.
- (2012) Proc. IEEE IJCNN , pp. 3062-3069
- Fairbank, M.¹ Alonso, E.²

10
- 84886350301
- Approximating optimal control with value gradient learning
- F. Lewis and D. Liu, Eds. New York, NY, USA: Wiley
- M. Fairbank, D. Prokhorov, and E. Alonso, "Approximating optimal control with value gradient learning," in Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, F. Lewis and D. Liu, Eds. New York, NY, USA: Wiley, 2012, pp. 142-161.
- (2012) Reinforcement Learning and Approximate Dynamic Programming for Feedback Control , pp. 142-161
- Fairbank, M.¹ Prokhorov, D.² Alonso, E.³

11
- 84865080650
- M. Fairbank. (2008). Reinforcement Learning by Value Gradients [Online]. Available: http://arxiv.org/abs/0803.3539
- (2008) Reinforcement Learning by Value Gradients [Online]
- Fairbank, M.¹

12
- 84887996993
- An equivalence between adaptive dynamic programming with a critic and backpropagation through time
- Dec
- M. Fairbank, E. Alonso, and D. Prokhorov, "An equivalence between adaptive dynamic programming with a critic and backpropagation through time," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 12, pp. 2088-2100, Dec. 2013.
- (2013) IEEE Trans. Neural Netw. Learn. Syst , vol.24 , Issue.12 , pp. 2088-2100
- Fairbank, M.¹ Alonso, E.² Prokhorov, D.³

13
- 34548772284
- Backpropagation through time and derivative adaptive critics: A common framework for comparison
- J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley
- D. Prokhorov, "Backpropagation through time and derivative adaptive critics: A common framework for comparison," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley, 2004.
- (2004) Handbook of Learning and Approximate Dynamic Programming
- Prokhorov, D.¹

14
- 84865070696
- M. Fairbank and E. Alonso. (2011). The Local Optimality of Reinforcement Learning by Value Gradients, and its Relationship to Policy Gradient Learning [Online]. Available: http://arxiv.org/abs/1101.0428
- (2011) The Local Optimality of Reinforcement Learning by Value Gradients, and Its Relationship to Policy Gradient Learning [Online]
- Fairbank, M.¹ Alonso, E.²

15
- 33646384929
- Policy gradient in continuous time
- Jan
- R. Munos, "Policy gradient in continuous time," J. Mach. Learn. Res., vol. 7, pp. 413-427, Jan. 2006.
- (2006) J. Mach. Learn. Res , vol.7 , pp. 413-427
- Munos, R.¹

16
- 84876158475
- Simple and fast calculation of the second-order gradients for globalized dual heuristic dynamic programming in neural networks
- Oct
- M. Fairbank, E. Alonso, and D. Prokhorov, "Simple and fast calculation of the second-order gradients for globalized dual heuristic dynamic programming in neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 10, pp. 1671-1678, Oct. 2012.
- (2012) IEEE Trans. Neural Netw. Learn. Syst , vol.23 , Issue.10 , pp. 1671-1678
- Fairbank, M.¹ Alonso, E.² Prokhorov, D.³

17
- 14844340822
- Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach
- M. Abu-Khalaf and F. L. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach," Automatica, vol. 41, no. 5, pp. 779-791, 2005.
- (2005) Automatica , vol.41 , Issue.5 , pp. 779-791
- Abu-Khalaf, M.¹ Lewis, F.L.²

18
- 33847648898
- Adaptive critic designs for discrete-time zero-sum games with application to H∞ control
- Feb
- A. Al-Tamimi, M. Abu-Khalaf, and F. L. Lewis, "Adaptive critic designs for discrete-time zero-sum games with application to H∞ control," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 37, no. 1, pp. 240-247, Feb. 2007.
- (2007) IEEE Trans. Syst., Man, Cybern., B, Cybern , vol.37 , Issue.1 , pp. 240-247
- Al-Tamimi, A.¹ Abu-Khalaf, M.² Lewis, F.L.³

19
- 49049089962
- Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
- Aug
- A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, "Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 38, no. 4, pp. 943-949, Aug. 2008.
- (2008) IEEE Trans. Syst., Man, Cybern., B, Cybern , vol.38 , Issue.4 , pp. 943-949
- Al-Tamimi, A.¹ Lewis, F.L.² Abu-Khalaf, M.³

20
- 70349253929
- Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints
- Sep
- H. Zhang, Y. Luo, and D. Liu, "Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints," IEEE Trans. Neural Netw., vol. 20, no. 9, pp. 1490-1503, Sep. 2009.
- (2009) IEEE Trans. Neural Netw , vol.20 , Issue.9 , pp. 1490-1503
- Zhang, H.¹ Luo, Y.² Liu, D.³

21
- 49049119493
- A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm
- Aug
- H. Zhang, Q. Wei, and Y. Luo, "A novel infinite-time optimal tracking control scheme for a class of discrete-time nonlinear systems via the greedy HDP iteration algorithm," IEEE Trans. Syst., Man, Cybern., B, Cybern., vol. 38, no. 4, pp. 937-942, Aug. 2008.
- (2008) IEEE Trans. Syst., Man, Cybern., B, Cybern , vol.38 , Issue.4 , pp. 937-942
- Zhang, H.¹ Wei, Q.² Luo, Y.³

22
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- R. J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Mach. Learn., vol. 8, nos. 3-4, pp. 229-256, 1992.
- (1992) Mach. Learn , vol.8 , Issue.3-4 , pp. 229-256
- Williams, R.J.¹

23
- 34250635407
- Policy gradient methods for robotics
- Oct
- J. Peters and S. Schaal, "Policy gradient methods for robotics," in Proc. IEEE/RSJ Int. Conf. IROS, Oct. 2006, pp. 2219-2225.
- (2006) Proc. IEEE/RSJ Int. Conf. IROS , pp. 2219-2225
- Peters, J.¹ Schaal, S.²

24
- 0004049893
- Ph.D. dissertation, Dept. Comput. Sci., Cambridge Univ., Cambridge, U.K.
- C. J. C. H. Watkins, "Learning from delayed rewards," Ph.D. dissertation, Dept. Comput. Sci., Cambridge Univ., Cambridge, U.K., 1989.
- (1989) Learning from Delayed Rewards
- Watkins, C.J.C.H.¹

25
- 85032189594
- Model-based adaptive critic designs
- J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley
- S. Ferrari and R. F. Stengel, "Model-based adaptive critic designs," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. Barto, W. Powell, and D. Wunsch, Eds. New York, NY, USA: Wiley, 2004, pp. 65-96.
- (2004) Handbook of Learning and Approximate Dynamic Programming , pp. 65-96
- Ferrari, S.¹ Stengel, R.F.²

26
- 33847202724
- Learning to predict by the methods of temporal differences
- R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, no. 1, pp. 9-44, 1988.
- (1988) Mach. Learn , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.S.¹

27
- 0033629916
- Reinforcement learning in continuous time and space
- K. Doya, "Reinforcement learning in continuous time and space," Neural Comput., vol. 12, no. 1, pp. 219-245, 2000.
- (2000) Neural Comput , vol.12 , Issue.1 , pp. 219-245
- Doya, K.¹

28
- 0003529238
- Ph.D. dissertation, Dept. Comput. Sci., Harvard Univ., Cambridge, MA, USA
- P. J. Werbos, "Beyond regression: New tools for prediction and analysis in the behavioral sciences," Ph.D. dissertation, Dept. Comput. Sci., Harvard Univ., Cambridge, MA, USA, 1974.
- (1974) Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences
- Werbos, P.J.¹

29
- 0022471098
- Learning representations by back-propagating errors
- Oct
- D. Rumelhart, G. Hinton, and R. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533-536, Oct. 1986.
- (1986) Nature , vol.323 , Issue.6088 , pp. 533-536
- Rumelhart, D.¹ Hinton, G.² Williams, R.³

30
- 0027554566
- Temporal-difference methods and Markov models
- Mar./Apr
- E. Barnard, "Temporal-difference methods and Markov models," IEEE Trans. Syst., Man, Cybern., vol. 23, no. 2, pp. 357-365, Mar./Apr. 1993.
- (1993) IEEE Trans. Syst., Man, Cybern , vol.23 , Issue.2 , pp. 357-365
- Barnard, E.¹

31
- 0003487601
- Oxford, U.K.: Oxford Univ. Press
- C. M. Bishop, Neural Networks for Pattern Recognition. Oxford, U.K.: Oxford Univ. Press, 1995.
- (1995) Neural Networks for Pattern Recognition
- Bishop, C.M.¹

32
- 84865077338
- A comparison of learning speed and ability to cope without exploration between DHP and TD(0)
- Jun
- M. Fairbank and E. Alonso, "A comparison of learning speed and ability to cope without exploration between DHP and TD(0)," in Proc. IEEE Int. Joint Conf. Neural Netw., Jun. 2012, pp. 1478-1485.
- (2012) Proc. IEEE Int. Joint Conf. Neural Netw , pp. 1478-1485
- Fairbank, M.¹ Alonso, E.²

33
- 0020970738
- Neuronlike adaptive elements that can solve difficult learning control problems
- Sep./Oct
- A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. 13, no. 5, pp. 834-846, Sep./Oct. 1983.
- (1983) IEEE Trans. Syst., Man, Cybern , vol.13 , Issue.5 , pp. 834-846
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

34
- 0030702730
- Training strategies for critic and action neural networks in dual heuristic programming method
- Jun
- G. G. Lendaris and C. Paintz, "Training strategies for critic and action neural networks in dual heuristic programming method," in Proc. Int. Conf. Neural Netw., Jun. 1997, pp. 712-717.
- (1997) Proc. Int. Conf. Neural Netw , pp. 712-717
- Lendaris, G.G.¹ Paintz, C.²

35
- 76649091717
- Correct equations for the dynamics of the cart-pole system
- Univ. Bucharest, Cluj-Napoca, Romania, Tech. Rep
- R. V. Florian, "Correct equations for the dynamics of the cart-pole system," Center for Cognitive and Neural Studies (Coneural), Univ. Bucharest, Cluj-Napoca, Romania, Tech. Rep., 2007.
- (2007) Center for Cognitive and Neural Studies (Coneural)
- Florian, R.V.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.