



Volume 84, Issue 1-2, 2011, Pages 137-169

Reinforcement learning in feedback control: Challenges and benchmarks from technical process control

Author keywords

Benchmarks; Feedback control; Nonlinear control; Reinforcement learning

Indexed keywords

BENCH-MARK PROBLEMS; BENCHMARKS; CLASSICAL CONTROL; CLASSICAL CONTROLLERS; CONTROL QUALITY; HIGH QUALITY; LEARNING APPROACH; LEARNING BEHAVIOR; LEARNING CONTROLLERS; LEARNING EFFORTS; LEARNING SCHEMES; LONG TERM DYNAMICS; NON LINEAR CONTROL; PERFORMANCE MEASURE; REINFORCEMENT LEARNING METHOD; SETPOINTS; TECHNICAL PROCESS;

EID: 79958779459     PISSN: 08856125     EISSN: 15730565     Source Type: Journal    
DOI: 10.1007/s10994-011-5235-x     Document Type: Article
Times cited: 200

References (49)
  • 3
    • 0003787146
    • Princeton Univ Press Princeton 0077.13605
    • Bellman, R. (1957). Dynamic programming. Princeton: Princeton Univ Press.
    • (1957) Dynamic Programming
    • Bellman, R.1
  • 4
    • 0000719863
    • Packet routing in dynamically changing networks-a reinforcement learning approach
    • J. Cowan G. Tesauro J. Alspector (eds)
    • Boyan, J., & Littman, M. (1994). Packet routing in dynamically changing networks-a reinforcement learning approach. In J. Cowan, G. Tesauro, & J. Alspector (Eds.), Advances in neural information processing systems 6.
    • (1994) Advances in Neural Information Processing Systems 6
    • Boyan, J.1    Littman, M.2
  • 6
    • 84876929592
    • CTM University of Michigan
    • CTM (1996). Digital Control Tutorial. University of Michigan, www.engin.umich.edu/group/ctm (online).
    • (1996) Digital Control Tutorial
  • 7
    • 61849173491
    • Gaussian process dynamic programming
    • 10.1016/j.neucom.2008.12.019
    • Deisenroth, M., Rasmussen, C., & Peters, J. (2009). Gaussian process dynamic programming. Neurocomputing, 72(7-9), 1508-1524. doi:10.1016/j.neucom.2008.12.019
    • (2009) Neurocomputing , vol.72 , Issue.7-9 , pp. 1508-1524
    • Deisenroth, M.1    Rasmussen, C.2    Peters, J.3
  • 15
    • 0000676676
    • Learning to control an unstable system with forward modeling
    • D. Touretzky (eds). Morgan Kaufmann San Mateo
    • Jordan, M. I., & Jacobs, R. A. (1990). Learning to control an unstable system with forward modeling. In D. Touretzky (Ed.), Advances in neural information processing systems (NIPS) 2 (pp. 324-331). San Mateo: Morgan Kaufmann.
    • (1990) Advances in Neural Information Processing Systems (NIPS) 2 , pp. 324-331
    • Jordan, M.I.1    Jacobs, R.A.2
  • 16
    • 0031271863
    • Nonlinear autopilot control design for a 2-DOF helicopter model
    • Kaloust, J., Ham, C., & Qu, Z. (1997). Nonlinear autopilot control design for a 2-DOF helicopter model. IEE Proceedings: Control Theory and Applications, 144(6), 612-616. 0900.93226. doi:10.1049/ip-cta:19971638 (Pubitemid 127754346)
    • (1997) IEE Proceedings: Control Theory and Applications , vol.144 , Issue.6 , pp. 612-616
    • Kaloust, J.1    Ham, C.2    Qu, Z.3
  • 19
    • 33749960180
    • Robust neural network control of rigid link flexible-joint robots
    • 10.1111/j.1934-6093.1999.tb00019.x
    • Kwan, C., Lewis, F., & Kim, Y. (1999). Robust neural network control of rigid link flexible-joint robots. Asian Journal of Control, 1(3), 188-197. doi:10.1111/j.1934-6093.1999.tb00019.x
    • (1999) Asian Journal of Control , vol.1 , Issue.3 , pp. 188-197
    • Kwan, C.1    Lewis, F.2    Kim, Y.3
  • 22
    • 67349250130
    • Modeling and robust control of blu-ray disc servo-mechanisms
    • 10.1016/j.mechatronics.2009.02.006
    • Martinez, J. J., Sename, O., & Voda, A. (2009). Modeling and robust control of blu-ray disc servo-mechanisms. Mechatronics, 19(5), 715-725. doi:10.1016/j.mechatronics.2009.02.006
    • (2009) Mechatronics , vol.19 , Issue.5 , pp. 715-725
    • Martinez, J.J.1    Sename, O.2    Voda, A.3
  • 27
    • 33646687423
    • Neural fitted q iteration-first experiences with a data efficient neural reinforcement learning method
    • Porto, Portugal
    • Riedmiller, M. (2005). Neural fitted q iteration-first experiences with a data efficient neural reinforcement learning method. In Proc. of the European conference on machine learning, ECML 2005, Porto, Portugal.
    • (2005) Proc. of the European Conference on Machine Learning, ECML 2005
    • Riedmiller, M.1
  • 28
    • 84943274699
    • A direct adaptive method for faster backpropagation learning: The RPROP algorithm
    • San Francisco H. Ruspini (eds). 10.1109/ICNN.1993.298623
    • Riedmiller, M., & Braun, H. (1993). A direct adaptive method for faster backpropagation learning: The RPROP algorithm. In H. Ruspini (Ed.), Proceedings of the IEEE international conference on neural networks (ICNN), San Francisco (pp. 586-591).
    • (1993) Proceedings of the IEEE International Conference on Neural Networks (ICNN) , pp. 586-591
    • Riedmiller, M.1    Braun, H.2
  • 32
    • 67650996818
    • Reinforcement learning for robot soccer
    • 10.1007/s10514-009-9120-4
    • Riedmiller, M., Gabel, T., Hafner, R., & Lange, S. (2009). Reinforcement learning for robot soccer. Autonomous Robots, 27(1), 55-74. doi:10.1007/s10514-009-9120-4
    • (2009) Autonomous Robots , vol.27 , Issue.1 , pp. 55-74
    • Riedmiller, M.1    Gabel, T.2    Hafner, R.3    Lange, S.4
  • 33
    • 0002395261
    • Comparison of optimized backpropagation algorithms
    • Brussels
    • Schiffmann, W., Joost, M., & Werner, R. (1993). Comparison of optimized backpropagation algorithms. In Proc. of ESANN'93, Brussels (pp. 97-104).
    • (1993) Proc. of ESANN'93 , pp. 97-104
    • Schiffmann, W.1    Joost, M.2    Werner, R.3
  • 38
    • 70449370276
    • RL-Glue: Language-independent software for reinforcement-learning experiments
    • Tanner, B., & White, A. (2009). RL-Glue: Language-independent software for reinforcement-learning experiments. Journal of Machine Learning Research, 10, 2133-2136.
    • (2009) Journal of Machine Learning Research , vol.10 , pp. 2133-2136
    • Tanner, B.1    White, A.2
  • 39
    • 0001046225
    • Practical issues in temporal difference learning
    • 0772.68075
    • Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8, 257-277. 0772.68075
    • (1992) Machine Learning , vol.8 , pp. 257-277
    • Tesauro, G.1
  • 41
    • 0025789531
    • Dynamic nonlinear modeling of a hot-water-to-air heat exchanger for control applications
    • Underwood, D. M., & Crawford, R. R. (1991). Dynamic nonlinear modeling of a hot-water-to-air heat exchanger for control applications. ASHRAE Transactions, 97(1), 149-155. (Pubitemid 21725993)
    • (1991) ASHRAE Transactions , vol.97 , Issue.PART 1 , pp. 149-155
    • Underwood, D.M.1    Crawford, R.R.2
  • 42
    • 0035273403
    • On-line learning control by association and reinforcement
    • DOI 10.1109/72.914523, PII S1045922701014047
    • Wang, Y., & Si, J. (2001). On-line learning control by association and reinforcement. IEEE Transactions on Neural Networks, 12(2), 264-276. 1859321. doi:10.1109/72.914523 (Pubitemid 32371483)
    • (2001) IEEE Transactions on Neural Networks , vol.12 , Issue.2 , pp. 264-276
    • Si, J.1    Wang, Y.-T.2
  • 45
    • 79951878534
    • The reinforcement learning competitions
    • Whiteson, S., Tanner, B., & White, A. (2010). The reinforcement learning competitions. The AI Magazine, 31(2), 81-94.
    • (2010) The AI Magazine , vol.31 , Issue.2 , pp. 81-94
    • Whiteson, S.1    Tanner, B.2    White, A.3
  • 47
    • 0035400418
    • Adaptive robust nonlinear control of a magnetic levitation system
    • DOI 10.1016/S0005-1098(01)00063-2, PII S0005109801000632
    • Yang, Z.-J., & Tateishi, M. (2001). Adaptive robust nonlinear control of a magnetic levitation system. Automatica, 37(7), 1125-1131. 0979.93076. doi:10.1016/S0005-1098(01)00063-2 (Pubitemid 32498998)
    • (2001) Automatica , vol.37 , Issue.7 , pp. 1125-1131
    • Yang, Z.-J.1    Tateishi, M.2
  • 48
    • 72349085802
    • Robust nonlinear control of a voltage-controlled magnetic levitation system using disturbance observer
    • Yang, Z.-J., Tsubakihara, H., Kanae, S., & Wada, K. (2007). Robust nonlinear control of a voltage-controlled magnetic levitation system using disturbance observer. Transactions of IEE of Japan, 127-C(12), 2118-2125.
    • (2007) Transactions of IEE of Japan , vol.127-C , Issue.12 , pp. 2118-2125
    • Yang, Z.-J.1    Tsubakihara, H.2    Kanae, S.3    Wada, K.4
  • 49
    • 38349191029
    • Adaptive robust output feedback control of a magnetic levitation system by k-filter approach
    • 10.1109/TIE.2007.896488
    • Yang, Z.-J., Kunitoshi, K., Kanae, S., & Wada, K. (2008). Adaptive robust output feedback control of a magnetic levitation system by k-filter approach. IEEE Transactions on Industrial Electronics, 55(1), 390-399. doi:10.1109/TIE.2007.896488
    • (2008) IEEE Transactions on Industrial Electronics , vol.55 , Issue.1 , pp. 390-399
    • Yang, Z.-J.1    Kunitoshi, K.2    Kanae, S.3    Wada, K.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.