



Volume 19, 2011, Pages 1-26

Regret bounds for the adaptive control of Linear Quadratic systems

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS;

EID: 84867882483     PISSN: 15324435     EISSN: 15337928     Source Type: Journal    
DOI: None     Document Type: Conference Paper
Times cited : (422)

References (33)
  • 3
    • 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • ISSN 1533-7928
    • P. Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2003. ISSN 1533-7928.
    • (2003) Journal of Machine Learning Research , vol.3 , pp. 397-422
    • Auer, P.1
  • 4
    • 0036568025
    • Finite-time analysis of the multiarmed bandit problem
    • DOI 10.1023/A:1013689704352, Computational Learning Theory
    • P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002. (Pubitemid 34126111)
    • (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
    • Auer, P.1    Cesa-Bianchi, N.2    Fischer, P.3
  • 8
    • 0022044787
    • Adaptive control with the stochastic approximation algorithm: Geometry and convergence
    • A. Becker, P. R. Kumar, and C. Z. Wei. Adaptive control with the stochastic approximation algorithm: Geometry and convergence. IEEE Trans. on Automatic Control, 30(4):330-338, 1985.
    • (1985) IEEE Trans. on Automatic Control , vol.30 , Issue.4 , pp. 330-338
    • Becker, A.1    Kumar, P.R.2    Wei, C.Z.3
  • 11
    • 84877734570
    • Adaptive control of linear time invariant systems: The "bet on the best" principle
    • S. Bittanti and M. C. Campi. Adaptive control of linear time invariant systems: the "bet on the best" principle. Communications in Information and Systems, 6(4):299-320, 2006.
    • (2006) Communications in Information and Systems , vol.6 , Issue.4 , pp. 299-320
    • Bittanti, S.1    Campi, M.C.2
  • 12
    • 0041965975
    • R-MAX - A general polynomial time algorithm for near-optimal reinforcement learning
    • R. I. Brafman and M. Tennenholtz. R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3:213-231, 2002.
    • (2002) Journal of Machine Learning Research , vol.3 , pp. 213-231
    • Brafman, R.I.1    Tennenholtz, M.2
  • 13
    • 0032203343
    • Adaptive linear quadratic Gaussian control: The cost-biased approach revisited
    • M. C. Campi and P. R. Kumar. Adaptive linear quadratic Gaussian control: the cost-biased approach revisited. SIAM Journal on Control and Optimization, 36(6):1890-1907, 1998.
    • (1998) SIAM Journal on Control and Optimization , vol.36 , Issue.6 , pp. 1890-1907
    • Campi, M.C.1    Kumar, P.R.2
  • 14
    • 0023383665
    • Optimal adaptive control and consistent parameter estimates for ARMAX model with quadratic cost
    • H. Chen and L. Guo. Optimal adaptive control and consistent parameter estimates for ARMAX model with quadratic cost. SIAM Journal on Control and Optimization, 25(4):845-867, 1987. (Pubitemid 17599082)
    • (1987) SIAM Journal on Control and Optimization , vol.25 , Issue.4 , pp. 845-867
    • Chen, H.-F.1    Guo, L.2
  • 15
    • 0025470399
    • Identification and adaptive control for systems with unknown orders, delay, and coefficients
    • DOI 10.1109/9.58496
    • H. Chen and J. Zhang. Identification and adaptive control for systems with unknown orders, delay, and coefficients. Automatic Control, IEEE Transactions on, 35(8):866-877, August 1990. (Pubitemid 20738736)
    • (1990) IEEE Transactions on Automatic Control , vol.35 , Issue.8 , pp. 866-877
    • Chen, H.-F.1    Zhang, J.-F.2
  • 16
    • 33244456637
    • Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
    • V. Dani and T. P. Hayes. Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary. In 16th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 937-943, 2006.
    • (2006) 16th Annual ACM-SIAM Symposium on Discrete Algorithms , pp. 937-943
    • Dani, V.1    Hayes, T.P.2
  • 17
    • 84898072179
    • Stochastic linear optimization under bandit feedback
    • V. Dani, T. P. Hayes, and S. M. Kakade. Stochastic linear optimization under bandit feedback. COLT-2008, pages 355-366, 2008.
    • (2008) COLT-2008 , pp. 355-366
    • Dani, V.1    Hayes, T.P.2    Kakade, S.M.3
  • 20
    • 1942452450
    • Exploration in metric state spaces
    • T. Fawcett and N. Mishra, editors, AAAI Press
    • S. M. Kakade, M. J. Kearns, and J. Langford. Exploration in metric state spaces. In T. Fawcett and N. Mishra, editors, ICML 2003, pages 306-312. AAAI Press, 2003.
    • (2003) ICML 2003 , pp. 306-312
    • Kakade, S.M.1    Kearns, M.J.2    Langford, J.3
  • 21
    • 0012257655
    • Near-optimal performance for reinforcement learning in polynomial time
    • J. W. Shavlik, editor, Morgan Kaufmann
    • M. Kearns and S. P. Singh. Near-optimal performance for reinforcement learning in polynomial time. In J. W. Shavlik, editor, ICML 1998, pages 260-268. Morgan Kaufmann, 1998.
    • (1998) ICML 1998 , pp. 260-268
    • Kearns, M.1    Singh, S.P.2
  • 24
    • 0002899547
    • Asymptotically efficient adaptive allocation rules
    • T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22, 1985.
    • (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
    • Lai, T.L.1    Robbins, H.2
  • 25
    • 0000258837
    • Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems
    • T. L. Lai and C. Z. Wei. Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. The Annals of Statistics, 10(1):154-166, 1982a.
    • (1982) The Annals of Statistics , vol.10 , Issue.1 , pp. 154-166
    • Lai, T.L.1    Wei, C.Z.2
  • 26
    • 0000258837
    • Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems
    • T. L. Lai and C. Z. Wei. Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. The Annals of Statistics, 10(1):154-166, 1982b.
    • (1982) The Annals of Statistics , vol.10 , Issue.1 , pp. 154-166
    • Lai, T.L.1    Wei, C.Z.2
  • 27
  • 28
    • 33750972856
    • Efficient recursive estimation and adaptive control in stochastic regression and ARMAX models
    • T. L. Lai and Z. Ying. Efficient recursive estimation and adaptive control in stochastic regression and ARMAX models. Statistica Sinica, 16:741-772, 2006. (Pubitemid 44744348)
    • (2006) Statistica Sinica , vol.16 , Issue.3 , pp. 741-772
    • Lai, T.L.1    Ying, Z.2
  • 30
    • 0001787217
    • Dynamic programming under uncertainty with a quadratic criterion function
    • H. A. Simon. Dynamic programming under uncertainty with a quadratic criterion function. Econometrica, 24(1):74-81, 1956.
    • (1956) Econometrica , vol.24 , Issue.1 , pp. 74-81
    • Simon, H.A.1
  • 31
    • 85162058047
    • Online linear regression and its application to model-based reinforcement learning
    • A. L. Strehl and M. L. Littman. Online linear regression and its application to model-based reinforcement learning. In NIPS, pages 1417-1424, 2008.
    • (2008) NIPS , pp. 1417-1424
    • Strehl, A.L.1    Littman, M.L.2
  • 33
    • 77956520676
    • Model-based reinforcement learning with nearly tight exploration complexity bounds
    • I. Szita and Cs. Szepesvári. Model-based reinforcement learning with nearly tight exploration complexity bounds. In ICML 2010, pages 1031-1038, 2010.
    • (2010) ICML 2010 , pp. 1031-1038
    • Szita, I.1    Szepesvári, Cs.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.