



Volume 12, Issue 2, 2003, Pages 81-88

Neural Q-learning

Author keywords

Feed forward network; Learning from real systems; Nonlinear systems; Optimal control; Reinforcement learning

Indexed keywords

GENERAL FUNCTION APPROXIMATORS; REINFORCEMENT LEARNING (RL);

EID: 0345393286     PISSN: 09410643     EISSN: None     Source Type: Journal    
DOI: 10.1007/s00521-003-0369-9     Document Type: Review
Times cited : (28)

References (21)
  • 1
    • 0025399567
    • Identification and control for dynamic systems using neural networks
    • Narendra K, Parthasarathy K (1990) Identification and control for dynamic systems using neural networks. IEEE Transactions on Neural Networks 1(1): 447-457
    • (1990) IEEE Transactions on Neural Networks , vol.1 , Issue.1 , pp. 447-457
    • Narendra, K.1    Parthasarathy, K.2
  • 2
    • 0033731028
    • Nonlinear adaptive control using networks of piecewise linear approximators
    • Choi J, Farrell J (2000) Nonlinear adaptive control using networks of piecewise linear approximators. IEEE Transactions on Neural Networks 11: 390-401
    • (2000) IEEE Transactions on Neural Networks , vol.11 , pp. 390-401
    • Choi, J.1    Farrell, J.2
  • 3
    • 0000922214
    • Stable neural controller design for unknown nonlinear systems using backstepping
    • Zhang Y, Peng P, Jiang Z (2000) Stable neural controller design for unknown nonlinear systems using backstepping. IEEE Transactions on Neural Networks 11: 1347-1360
    • (2000) IEEE Transactions on Neural Networks , vol.11 , pp. 1347-1360
    • Zhang, Y.1    Peng, P.2    Jiang, Z.3
  • 4
    • 0002031779
    • Approximate dynamic programming for real-time control and neural modeling
    • White DA, Sofge DAW (eds). Van Nostrand Reinhold
    • Werbos P (1992) Approximate dynamic programming for real-time control and neural modeling. In: White DA, Sofge DAW (eds) Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches. Van Nostrand Reinhold
    • (1992) Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches
    • Werbos, P.1
  • 8
    • 0031259122
    • Synthesis of reinforcement learning, neural networks, and PI control applied to a simulated heating coil
    • Anderson C, Hittle D, Katz A, Kretchmar R (1997) Synthesis of reinforcement learning, neural networks, and PI control applied to a simulated heating coil. J Artific Intell Eng 11(4): 421-429
    • (1997) J Artific Intell Eng , vol.11 , Issue.4 , pp. 421-429
    • Anderson, C.1    Hittle, D.2    Katz, A.3    Kretchmar, R.4
  • 11
    • 0345494056
    • Temporal Difference Learning: A Chemical Process Control Application
    • Miller S, Williams R (eds). Kluwer
    • Miller S, Williams R (1995) Temporal Difference Learning: A Chemical Process Control Application. In: Miller S, Williams R (eds) Applications of Artificial Neural Networks. Kluwer
    • (1995) Applications of Artificial Neural Networks
    • Miller, S.1    Williams, R.2
  • 12
    • 0033233953
    • Concepts and facilities of a neural reinforcement learning control architecture for technical process control
    • Springer Verlag London
    • Riedmiller M (1999) Concepts and facilities of a neural reinforcement learning control architecture for technical process control. Neural Computation and Application Journal 8: 323-338, Springer Verlag London
    • (1999) Neural Computation and Application Journal , vol.8 , pp. 323-338
    • Riedmiller, M.1
  • 14
    • 0033750123
    • Neurocontroller alternatives for 'fuzzy' ball-and-beam systems with nonuniform nonlinear friction
    • Eaton P, Prokhorov D, Wunsch II D (2000) Neurocontroller alternatives for 'fuzzy' ball-and-beam systems with nonuniform nonlinear friction. IEEE Transactions on Neural Networks
    • (2000) IEEE Transactions on Neural Networks
    • Eaton, P.1    Prokhorov, D.2    Wunsch II, D.3
  • 15
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Kluwer
    • Sutton R (1988) Learning to predict by the methods of temporal differences. Machine Learning, Kluwer, 3(1): 9-44
    • (1988) Machine Learning , vol.3 , Issue.1 , pp. 9-44
    • Sutton, R.1
  • 17
    • 34249833101
    • Technical note: Q learning
    • Kluwer
    • Watkins C, Dayan P (1992) Technical note: Q learning. Machine Learning, Kluwer 18(3-4): 217-292
    • (1992) Machine Learning , vol.18 , Issue.3-4 , pp. 217-292
    • Watkins, C.1    Dayan, P.2
  • 19
    • 0345062525
    • Adaptive linear quadratic control using policy iteration
    • University of Massachusetts
    • Bradtke S, Ydstie B, Barto A (1994) Adaptive linear quadratic control using policy iteration. Technical report, University of Massachusetts
    • (1994) Technical Report
    • Bradtke, S.1    Ydstie, B.2    Barto, A.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.