SCOPUS Information Search Platform

Proceedings of the International Joint Conference on Neural Networks

Volume , Issue , 2012, Pages

A comparison of learning speed and ability to cope without exploration between DHP and TD(0)

(2) Fairbank, Michael (a); Alonso, Eduardo (a)

(a) City University (United Kingdom)

Author keywords

Adaptive Dynamic Programming; DHP; Dual Heuristic Dynamic Programming; Reinforcement Learning

Indexed keywords

ADAPTIVE DYNAMIC PROGRAMMING; CONTINUOUS STATE SPACE; DHP; DIFFERENTIABILITY; DUAL HEURISTIC DYNAMIC PROGRAMMING; DYNAMIC PROGRAMMING; LEARNING METHODS; LEARNING SPEED; MODEL FUNCTIONS; NEURAL NETWORKS; REINFORCEMENT LEARNING

EID: 84865077338 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/IJCNN.2012.6252569 Document Type: Conference Paper

Times cited : (7)

References (21)

1
- 66449130966
- Adaptive dynamic programming: An introduction
- F.-Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Computational Intelligence Magazine, pp. 39-47, 2009.
- (2009) IEEE Computational Intelligence Magazine, pp. 39-47
- Wang, F.-Y.; Zhang, H.; Liu, D.

2
- 0004102479
- Cambridge, Massachusetts, USA: The MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, Massachusetts, USA: The MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.; Barto, A.G.

3
- 85012688561
- Princeton, NJ, USA: Princeton University Press
- R. E. Bellman, Dynamic Programming. Princeton, NJ, USA: Princeton University Press, 1957.
- (1957) Dynamic Programming
- Bellman, R.E.

4
- 33847202724
- Learning to predict by the methods of temporal differences
- R. S. Sutton, "Learning to predict by the methods of temporal differences," Machine Learning, vol. 3, pp. 9-44, 1988.
- (1988) Machine Learning, vol. 3, pp. 9-44
- Sutton, R.S.

5
- 0003636089
- On-line Q-learning using connectionist systems
- Cambridge University Engineering Department
- G. Rummery and M. Niranjan, "On-line Q-learning using connectionist systems," Tech. Rep. CUED/F-INFENG/TR 166, Cambridge University Engineering Department, 1994.
- (1994) Tech. Rep. CUED/F-INFENG/TR 166
- Rummery, G.; Niranjan, M.

6
- 0004049893
- Ph.D. dissertation, Cambridge University
- C. J. C. H. Watkins, "Learning from delayed rewards," Ph.D. dissertation, Cambridge University, 1989.
- (1989) Learning from Delayed Rewards
- Watkins, C.J.C.H.

7
- 0002031779
- Approximating dynamic programming for real-time control and neural modeling
- editors White and Sofge, Chapter 13
- P. J. Werbos, "Approximating dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control, White and Sofge, Eds., ch. 13, pp. 493-525, 1992.
- (1992) Handbook of Intelligent Control, pp. 493-525
- Werbos, P.J.

8
- 0031236002
- Adaptive critic designs
- September
- D. Prokhorov and D. Wunsch, "Adaptive critic designs," IEEE Transactions on Neural Networks, vol. 8, no. 5, pp. 997-1007, Sep. 1997.
- (1997) IEEE Transactions on Neural Networks, pp. 997-1007
- Prokhorov, D.; Wunsch, D.

9
- 85032189594
- Model-based adaptive critic designs
- editors Jennie Si et al.
- S. Ferrari and R. F. Stengel, "Model-based adaptive critic designs," in Handbook of Learning and Approximate Dynamic Programming, Jennie Si et al., Eds., pp. 65-96, 2004.
- (2004) Handbook of Learning and Approximate Dynamic Programming, pp. 65-96
- Ferrari, S.; Stengel, R.F.

10
- 84865070696
- eprint arXiv:1101.0428
- M. Fairbank and E. Alonso, "The local optimality of reinforcement learning by value gradients and its relationship to policy gradient learning," eprint arXiv:1101.0428, 2011.
- (2011) The Local Optimality of Reinforcement Learning by Value Gradients and Its Relationship to Policy Gradient Learning
- Fairbank, M.; Alonso, E.

11
- 84865069763
- Value-gradient learning
- IEEE Press
- M. Fairbank and E. Alonso, "Value-gradient learning," in Proceedings of the IEEE International Joint Conference on Neural Networks 2012 (IJCNN'12). IEEE Press, 2012.
- (2012) Proceedings of the IEEE International Joint Conference on Neural Networks 2012 (IJCNN'12)
- Fairbank, M.; Alonso, E.

12
- 0000255539
- Fast exact multiplication by the Hessian
- B. A. Pearlmutter, "Fast exact multiplication by the Hessian," Neural Computation, vol. 6, no. 1, pp. 147-160, 1994.
- (1994) Neural Computation, vol. 6, no. 1, pp. 147-160
- Pearlmutter, B.A.

13
- 0008011457
- Neural networks, system identification, and control in the chemical process industries
- Chapter 10
- P. J. Werbos, "Neural networks, system identification, and control in the chemical process industries," in Handbook of Intelligent Control, White and Sofge, Eds., ch. 10, pp. 283-356, 1992.
- (1992) Handbook of Intelligent Control, White and Sofge, Eds., pp. 283-356
- Werbos, P.J.

14
- 31844443291
- Inverted autonomous helicopter flight via reinforcement learning
- MIT Press
- A. Y. Ng, H. J. Kim, M. I. Jordan, and S. Sastry, "Inverted autonomous helicopter flight via reinforcement learning," in International Symposium on Experimental Robotics. MIT Press, 2004.
- (2004) International Symposium on Experimental Robotics
- Ng, A.Y.; Kim, H.J.; Jordan, M.I.; Sastry, S.

15
- 33646384929
- Policy gradient in continuous time
- R. Munos, "Policy gradient in continuous time," Journal of Machine Learning Research, vol. 7, pp. 413-427, 2006.
- (2006) Journal of Machine Learning Research, vol. 7, pp. 413-427
- Munos, R.

16
- 0033629916
- Reinforcement learning in continuous time and space
- K. Doya, "Reinforcement learning in continuous time and space," Neural Computation, vol. 12, no. 1, pp. 219-245, 2000.
- (2000) Neural Computation, vol. 12, no. 1, pp. 219-245
- Doya, K.

17
- 84865064406
- 3rd ed. Van Nostrand Reinhold Company, ch. 3.2.2
- I. N. Bronshtein and K. A. Semendyayev, Handbook of Mathematics, 3rd ed. Van Nostrand Reinhold Company, 1985, ch. 3.2.2, pp. 372-382.
- (1985) Handbook of Mathematics, pp. 372-382
- Bronshtein, I.N.; Semendyayev, K.A.

18
- 0003487601
- Oxford University Press
- C. M. Bishop, Neural Networks for Pattern Recognition. Oxford University Press, 1995.
- (1995) Neural Networks for Pattern Recognition
- Bishop, C.M.

19
- 0037561866
- Dual heuristic programming excitation neurocontrol for generators in a multimachine power system
- G. K. Venayagamoorthy and D. C. Wunsch, "Dual heuristic programming excitation neurocontrol for generators in a multimachine power system," IEEE Transactions on Industry Applications, vol. 39, pp. 382-394, 2003.
- (2003) IEEE Transactions on Industry Applications, vol. 39, pp. 382-394
- Venayagamoorthy, G.K.; Wunsch, D.C.

20
- 0030702730
- Training strategies for critic and action neural networks in dual heuristic programming method
- Houston
- G. G. Lendaris and C. Paintz, "Training strategies for critic and action neural networks in dual heuristic programming method," in Proceedings of International Conference on Neural Networks, Houston, 1997.
- (1997) Proceedings of International Conference on Neural Networks
- Lendaris, G.G.; Paintz, C.

21
- 80053055883
- Guidance in the use of adaptive critics for control
- editors Jennie Si et al.
- G. G. Lendaris and J. C. Neidhoefer, "Guidance in the use of adaptive critics for control," in Handbook of Learning and Approximate Dynamic Programming, Jennie Si et al., Eds., pp. 97-124, 2004.
- (2004) Handbook of Learning and Approximate Dynamic Programming, pp. 97-124
- Lendaris, G.G.; Neidhoefer, J.C.

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.