SCOPUS Information Retrieval Platform

Journal of Machine Learning Research

Volume 19, Issue , 2011, Pages 1-26

Regret bounds for the adaptive control of Linear Quadratic systems

(2) Abbasi Yadkori, Yasin a Szepesvári, Csaba a

a UNIVERSITY OF ALBERTA (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS;

ADAPTIVE CONTROL; CONTROL COMMUNITY; LEAST-SQUARES ESTIMATION; LINEAR QUADRATIC; LINEAR QUADRATIC CONTROL PROBLEM; LQ CONTROL PROBLEM; MODEL PARAMETERS; REGRET BOUNDS;

ADAPTIVE CONTROL SYSTEMS;

EID: 84867882483 PISSN: 15324435 EISSN: 15337928 Source Type: Journal
DOI: None Document Type: Conference Paper

Times cited : (422)

References (33)

1
- 84860633337
- Y. Abbasi-Yadkori, D. Pal, and Cs. Szepesvári. Online least squares estimation with self-normalized processes: An application to bandit problems. arXiv preprint http://arxiv.org/abs/1102.2670, 2011.
- (2011) Online Least Squares Estimation with Self-normalized Processes: An Application to Bandit Problems
- Abbasi-Yadkori, Y.¹ Pal, D.² Szepesvári, Cs.³

2
- 0003956358
- Prentice-Hall
- B. D. O. Anderson and J. B. Moore. Linear Optimal Control. Prentice-Hall, 1971.
- (1971) Linear Optimal Control
- Anderson, B.D.O.¹ Moore, J.B.²

3
- 0041966002
- Using confidence bounds for exploitation-exploration trade-offs
- ISSN 1533-7928
- P. Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3:397-422, 2003. ISSN 1533-7928.
- (2003) Journal of Machine Learning Research , vol.3 , pp. 397-422
- Auer, P.¹

4
- 0036568025
- Finite-time analysis of the multiarmed bandit problem
- DOI 10.1023/A:1013689704352, Computational Learning Theory
- P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002. (Pubitemid 34126111)
- (2002) Machine Learning , vol.47 , Issue.2-3 , pp. 235-256
- Auer, P.¹ Cesa-Bianchi, N.² Fischer, P.³

5
- 38049040954
- Improved rates for the stochastic continuum-armed bandit problem
- P. Auer, R. Ortner, and Cs. Szepesvári. Improved rates for the stochastic continuum-armed bandit problem. In Proceedings of the 20th Annual Conference on Learning Theory (COLT-07), pages 454-468, 2007.
- (2007) Proceedings of the 20th Annual Conference on Learning Theory (COLT-07) , pp. 454-468
- Auer, P.¹ Ortner, R.² Szepesvári, Cs.³

6
- 77951952841
- Near-optimal regret bounds for reinforcement learning
- P. Auer, T. Jaksch, and R. Ortner. Near-optimal regret bounds for reinforcement learning. Journal of Machine Learning Research, 11:1563-1600, 2010.
- (2010) Journal of Machine Learning Research , vol.11 , pp. 1563-1600
- Auer, P.¹ Jaksch, T.² Ortner, R.³

7
- 80053161827
- REGAL: A regularization based algorithm for reinforcement learning in weakly communicating MDPs
- P. L. Bartlett and A. Tewari. REGAL: A regularization based algorithm for reinforcement learning in weakly communicating MDPs. In Proceedings of the 25th Annual Conference on Uncertainty in Artificial Intelligence, 2009.
- (2009) Proceedings of the 25th Annual Conference on Uncertainty in Artificial Intelligence
- Bartlett, P.L.¹ Tewari, A.²

8
- 0022044787
- Adaptive control with the stochastic approximation algorithm: Geometry and convergence
- A. Becker, P. R. Kumar, and C. Z. Wei. Adaptive control with the stochastic approximation algorithm: Geometry and convergence. IEEE Trans. on Automatic Control, 30(4):330-338, 1985.
- (1985) IEEE Trans. on Automatic Control , vol.30 , Issue.4 , pp. 330-338
- Becker, A.¹ Kumar, P.R.² Wei, C.Z.³

9
- 0004295484
- Prentice-Hall
- D. Bertsekas. Dynamic Programming. Prentice-Hall, 1987.
- (1987) Dynamic Programming
- Bertsekas, D.¹

10
- 0003565783
- Athena Scientific, 2nd edition
- D. P. Bertsekas. Dynamic Programming and Optimal Control. Athena Scientific, 2nd edition, 2001.
- (2001) Dynamic Programming and Optimal Control
- Bertsekas, D.P.¹

11
- 84877734570
- Adaptive control of linear time invariant systems: The "bet on the best" principle
- S. Bittanti and M. C. Campi. Adaptive control of linear time invariant systems: the "bet on the best" principle. Communications in Information and Systems, 6(4):299-320, 2006.
- (2006) Communications in Information and Systems , vol.6 , Issue.4 , pp. 299-320
- Bittanti, S.¹ Campi, M.C.²

12
- 0041965975
- R-MAX - A general polynomial time algorithm for near-optimal reinforcement learning
- R. I. Brafman and M. Tennenholtz. R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3:213-231, 2002.
- (2002) Journal of Machine Learning Research , vol.3 , pp. 213-231
- Brafman, R.I.¹ Tennenholtz, M.²

13
- 0032203343
- Adaptive linear quadratic Gaussian control: The cost-biased approach revisited
- M. C. Campi and P. R. Kumar. Adaptive linear quadratic Gaussian control: the cost-biased approach revisited. SIAM Journal on Control and Optimization, 36(6):1890-1907, 1998.
- (1998) SIAM Journal on Control and Optimization , vol.36 , Issue.6 , pp. 1890-1907
- Campi, M.C.¹ Kumar, P.R.²

14
- 0023383665
- Optimal adaptive control and consistent parameter estimates for armax model with quadratic cost
- H. Chen and L. Guo. Optimal adaptive control and consistent parameter estimates for armax model with quadratic cost. SIAM Journal on Control and Optimization, 25(4): 845-867, 1987. (Pubitemid 17599082)
- (1987) SIAM Journal on Control and Optimization , vol.25 , Issue.4 , pp. 845-867
- Chen, H.-F.¹ Guo, L.²

15
- 0025470399
- Identification and adaptive control for systems with unknown orders, delay, and coefficients
- DOI 10.1109/9.58496
- H. Chen and J. Zhang. Identification and adaptive control for systems with unknown orders, delay, and coefficients. Automatic Control, IEEE Transactions on, 35(8):866-877, August 1990. (Pubitemid 20738736)
- (1990) IEEE Transactions on Automatic Control , vol.35 , Issue.8 , pp. 866-877
- Chen, H.-F.¹ Zhang, J.-F.²

16
- 33244456637
- Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary
- V. Dani and T. P. Hayes. Robbing the bandit: Less regret in online geometric optimization against an adaptive adversary. In 16th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 937-943, 2006.
- (2006) 16th Annual ACM-SIAM Symposium on Discrete Algorithms , pp. 937-943
- Dani, V.¹ Hayes, T.P.²

17
- 84898072179
- Stochastic linear optimization under bandit feedback
- V. Dani, T. P. Hayes, and S. M. Kakade. Stochastic linear optimization under bandit feedback. COLT-2008, pages 355-366, 2008.
- (2008) COLT-2008 , pp. 355-366
- Dani, V.¹ Hayes, T.P.² Kakade, S.M.³

18
- 0030714490
- PAC adaptive control of linear systems
- ACM Press
- C. Fiechter. PAC adaptive control of linear systems. In Proceedings of the 10th Annual Conference on Computational Learning Theory, pages 72-80. ACM Press, 1997.
- (1997) Proceedings of the 10th Annual Conference on Computational Learning Theory, ACM , pp. 72-80
- Fiechter, C.¹

19
- 23244466805
- PhD thesis, Gatsby Computational Neuroscience Unit, University College London
- S. M. Kakade. On the sample complexity of reinforcement learning. PhD thesis, Gatsby Computational Neuroscience Unit, University College London, 2003.
- (2003) On the Sample Complexity of Reinforcement Learning
- Kakade, S.M.¹

20
- 1942452450
- Exploration in metric state spaces
- T. Fawcett and N. Mishra, editors, AAAI Press
- S. M. Kakade, M. J. Kearns, and J. Langford. Exploration in metric state spaces. In T. Fawcett and N. Mishra, editors, ICML 2003, pages 306-312. AAAI Press, 2003.
- (2003) ICML 2003 , pp. 306-312
- Kakade, S.M.¹ Kearns, M.J.² Langford, J.³

21
- 0012257655
- Near-optimal performance for reinforcement learning in polynomial time
- J. W. Shavlik, editor, Morgan Kaufmann
- M. Kearns and S. P. Singh. Near-optimal performance for reinforcement learning in polynomial time. In J. W. Shavlik, editor, ICML 1998, pages 260-268. Morgan Kaufmann, 1998.
- (1998) ICML 1998 , pp. 260-268
- Kearns, M.¹ Singh, S.P.²

22
- 84898981061
- Nearly tight bounds for the continuum-armed bandit problem
- R. Kleinberg. Nearly tight bounds for the continuum-armed bandit problem. In Advances in Neural Information Processing Systems, pages 697-704, 2005.
- (2005) Advances in Neural Information Processing Systems , pp. 697-704
- Kleinberg, R.¹

23
- 57049185311
- Multi-armed bandits in metric spaces
- R. Kleinberg, A. Slivkins, and E. Upfal. Multi-armed bandits in metric spaces. In Proceedings of the 40th annual ACM symposium on Theory of computing, pages 681-690, 2008.
- (2008) Proceedings of the 40th Annual ACM Symposium on Theory of Computing , pp. 681-690
- Kleinberg, R.¹ Slivkins, A.² Upfal, E.³

24
- 0002899547
- Asymptotically efficient adaptive allocation rules
- T. L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4-22, 1985.
- (1985) Advances in Applied Mathematics , vol.6 , pp. 4-22
- Lai, T.L.¹ Robbins, H.²

25
- 0000258837
- Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems
- T. L. Lai and C. Z. Wei. Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. The Annals of Statistics, 10(1):154-166, 1982a.
- (1982) The Annals of Statistics , vol.10 , Issue.1 , pp. 154-166
- Lai, T.L.¹ Wei, C.Z.²

26
- 0000258837
- Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems
- T. L. Lai and C. Z. Wei. Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. The Annals of Statistics, 10(1):154-166, 1982b.
- (1982) The Annals of Statistics , vol.10 , Issue.1 , pp. 154-166
- Lai, T.L.¹ Wei, C.Z.²

27
- 0023306354
- Asymptotically efficient self-tuning regulators
- March
- T. L. Lai and C. Z. Wei. Asymptotically efficient self-tuning regulators. SIAM Journal on Control and Optimization, 25:466-481, March 1987.
- (1987) SIAM Journal on Control and Optimization , vol.25 , pp. 466-481
- Lai, T.L.¹ Wei, C.Z.²

28
- 33750972856
- Efficient recursive estimation and adaptive control in stochastic regression and armax models
- T. L. Lai and Z. Ying. Efficient recursive estimation and adaptive control in stochastic regression and armax models. Statistica Sinica, 16:741-772, 2006. (Pubitemid 44744348)
- (2006) Statistica Sinica , vol.16 , Issue.3 , pp. 741-772
- Lai, T.L.¹ Ying, Z.²

29
- 77953111834
- Linearly parameterized bandits
- P. Rusmevichientong and J. N. Tsitsiklis. Linearly parameterized bandits. Mathematics of Operations Research, 35(2):395-411, 2010.
- (2010) Mathematics of Operations Research , vol.35 , Issue.2 , pp. 395-411
- Rusmevichientong, P.¹ Tsitsiklis, J.N.²

30
- 0001787217
- Dynamic programming under uncertainty with a quadratic criterion function
- H. A. Simon. Dynamic programming under uncertainty with a quadratic criterion function. Econometrica, 24(1):741, 1956.
- (1956) Econometrica , vol.24 , Issue.1 , pp. 741
- Simon, H.A.¹

31
- 85162058047
- Online linear regression and its application to model-based reinforcement learning
- A. L. Strehl and M. L. Littman. Online linear regression and its application to model-based reinforcement learning. In NIPS, pages 1417-1424, 2008.
- (2008) NIPS , pp. 1417-1424
- Strehl, A.L.¹ Littman, M.L.²

32
- 34250700033
- PAC model-free reinforcement learning
- A. L. Strehl, L. Li, E. Wiewiora, J. Langford, and M. L. Littman. PAC model-free reinforcement learning. In ICML, pages 881-888, 2006.
- (2006) ICML , pp. 881-888
- Strehl, A.L.¹ Li, L.² Wiewiora, E.³ Langford, J.⁴ Littman, M.L.⁵

33
- 77956520676
- Model-based reinforcement learning with nearly tight exploration complexity bounds
- I. Szita and Cs. Szepesvári. Model-based reinforcement learning with nearly tight exploration complexity bounds. In ICML 2010, pages 1031-1038, 2010.
- (2010) ICML 2010 , pp. 1031-1038
- Szita, I.¹ Szepesvári, Cs.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.