SCOPUS Information Search Platform

Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010

Volume , Issue , 2010, Pages

Predictive State Temporal Difference learning

(2) Boots, Byron ᵃ; Gordon, Geoffrey J. ᵃ

ᵃ Carnegie Mellon University (United States)

Author keywords

[No Author keywords available]

Indexed keywords

HIGH-DIMENSIONAL; HIGHER-DIMENSIONAL; NEW APPROACHES; REINFORCEMENT LEARNINGS; SETS OF FEATURES; SUBSPACE IDENTIFICATION; TEMPORAL DIFFERENCE LEARNING; TEMPORAL DIFFERENCE REINFORCEMENT LEARNING; TEMPORAL DIFFERENCES; VALUE FUNCTION APPROXIMATION;

ARTIFICIAL INTELLIGENCE;

EID: 85162041278
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (26)

References (25)

1
- 33847202724
- Learning to predict by the methods of temporal differences
- R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3(1):9-44, 1988.
- (1988) Machine Learning , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.S.¹

2
- 0038595396
- Least-squares temporal difference learning
- Morgan Kaufmann, San Francisco, CA
- Justin A. Boyan. Least-squares temporal difference learning. In Proc. Intl. Conf. Machine Learning, pages 49-56. Morgan Kaufmann, San Francisco, CA, 1999.
- (1999) Proc. Intl. Conf. Machine Learning , pp. 49-56
- Boyan, J.A.¹

3
- 0001771345
- Linear least-squares algorithms for temporal difference learning
- Steven J. Bradtke and Andrew G. Barto. Linear least-squares algorithms for temporal difference learning. Machine Learning, 22:33-57, 1996.
- (1996) Machine Learning , vol.22 , pp. 33-57
- Bradtke, S.J.¹ Barto, A.G.²

4
- 4644323293
- Least-squares policy iteration
- Michail G. Lagoudakis and Ronald Parr. Least-squares policy iteration. J. Mach. Learn. Res., 4:1107-1149, 2003.
- (2003) J. Mach. Learn. Res. , vol.4 , pp. 1107-1149
- Lagoudakis, M.G.¹ Parr, R.²

5
- 56449092660
- An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning
- New York, NY, USA. ACM
- Ronald Parr, Lihong Li, Gavin Taylor, Christopher Painter-Wakefield, and Michael L. Littman. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning. In ICML '08: Proceedings of the 25th international conference on Machine learning, pages 752-759, New York, NY, USA, 2008. ACM.
- (2008) ICML '08: Proceedings of the 25th International Conference on Machine Learning , pp. 752-759
- Parr, R.¹ Li, L.² Taylor, G.³ Painter-Wakefield, C.⁴ Littman, M.L.⁵

6
- 14344256568
- Learning low dimensional predictive representations
- Matthew Rosencrantz, Geoffrey J. Gordon, and Sebastian Thrun. Learning low dimensional predictive representations. In Proc. ICML, 2004.
- (2004) Proc. ICML
- Rosencrantz, M.¹ Gordon, G.J.² Thrun, S.³

7
- 79956360448
- Closing the learning-planning loop with predictive state representations
- Byron Boots, Sajid M. Siddiqi, and Geoffrey J. Gordon. Closing the learning-planning loop with predictive state representations. In Proceedings of Robotics: Science and Systems VI, 2010.
- (2010) Proceedings of Robotics: Science and Systems , vol.6
- Boots, B.¹ Siddiqi, S.M.² Gordon, G.J.³

8
- 85156266716
- Value-directed compression of POMDPs
- Pascal Poupart and Craig Boutilier. Value-directed compression of POMDPs. In NIPS, pages 1547-1554, 2002.
- (2002) NIPS , pp. 1547-1554
- Poupart, P.¹ Boutilier, C.²

9
- 71149121683
- Regularization and feature selection in least-squares temporal difference learning
- New York, NY, USA. ACM
- J. Zico Kolter and Andrew Y. Ng. Regularization and feature selection in least-squares temporal difference learning. In ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, pages 521-528, New York, NY, USA, 2009. ACM.
- (2009) ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning , pp. 521-528
- Kolter, J.Z.¹ Ng, A.Y.²

10
- 0003479606
- Springer
- Gregory C. Reinsel and Rajabather Palani Velu. Multivariate Reduced-rank Regression: Theory and Applications. Springer, 1998.
- (1998) Multivariate Reduced-rank Regression: Theory and Applications
- Reinsel, G.C.¹ Velu, R.P.²

11
- 0004236492
- The Johns Hopkins University Press
- Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, 1996.
- (1996) Matrix Computations
- Golub, G.H.¹ Van Loan, C.F.²

12
- 85162032428
- Predictive state temporal difference learning
- Byron Boots and Geoffrey J. Gordon. Predictive state temporal difference learning. Technical report, arXiv.org.
- Technical Report, ArXiv.org
- Boots, B.¹ Gordon, G.J.²

13
- 0001523230
- The most predictable criterion
- Harold Hotelling. The most predictable criterion. Journal of Educational Psychology, 26:139-142, 1935.
- (1935) Journal of Educational Psychology , vol.26 , pp. 139-142
- Hotelling, H.¹

14
- 4243922811
- Dynamic data factorization
- S. Soatto and A. Chiuso. Dynamic data factorization. Technical report, UCLA, 2001.
- (2001) Technical Report UCLA
- Soatto, S.¹ Chiuso, A.²

15
- 84898982129
- Predictive representations of state
- Michael Littman, Richard Sutton, and Satinder Singh. Predictive representations of state. In Advances in Neural Information Processing Systems (NIPS), 2002.
- (2002) Advances in Neural Information Processing Systems (NIPS)
- Littman, M.¹ Sutton, R.² Singh, S.³

16
- 0003398906
- Cambridge University Press
- Judea Pearl. Causality: models, reasoning, and inference. Cambridge University Press, 2000.
- (2000) Causality: Models, Reasoning, and Inference
- Pearl, J.¹

17
- 0034198996
- Observable operator models for discrete stochastic time series
- Herbert Jaeger. Observable operator models for discrete stochastic time series. Neural Computation, 12:1371-1398, 2000.
- (2000) Neural Computation , vol.12 , pp. 1371-1398
- Jaeger, H.¹

18
- 31844457132
- Predictive state representations: A new theory for modeling dynamical systems
- Satinder Singh, Michael James, and Matthew Rudary. Predictive state representations: A new theory for modeling dynamical systems. In Proc. UAI, 2004.
- (2004) Proc. UAI
- Singh, S.¹ James, M.² Rudary, M.³

19
- 0003426684
- Kluwer
- P. Van Overschee and B. De Moor. Subspace Identification for Linear Systems: Theory, Implementation, Applications. Kluwer, 1996.
- (1996) Subspace Identification for Linear Systems: Theory, Implementation, Applications
- Van Overschee, P.¹ De Moor, B.²

20
- 33645159758
- Springer-Verlag
- Tohru Katayama. Subspace Methods for System Identification. Springer-Verlag, 2005.
- (2005) Subspace Methods for System Identification
- Katayama, T.¹

21
- 84898066687
- A spectral algorithm for learning hidden Markov models
- Daniel Hsu, Sham Kakade, and Tong Zhang. A spectral algorithm for learning hidden Markov models. In COLT, 2009.
- (2009) COLT
- Hsu, D.¹ Kakade, S.² Zhang, T.³

22
- 84860648072
- Improving approximate value iteration using memories and predictive state representations
- Michael R. James, Ton Wessling, and Nikos A. Vlassis. Improving approximate value iteration using memories and predictive state representations. In AAAI, 2006.
- (2006) AAAI
- James, M.R.¹ Wessling, T.² Vlassis, N.A.³

23
- 84860608661
- Reduced-rank hidden Markov models
- Sajid Siddiqi, Byron Boots, and Geoffrey J. Gordon. Reduced-rank hidden Markov models. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS-2010), 2010.
- (2010) Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS-2010)
- Siddiqi, S.¹ Boots, B.² Gordon, G.J.³

24
- 0033351917
- Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives
- John N. Tsitsiklis and Benjamin Van Roy. Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives. IEEE Transactions on Automatic Control, 44:1840-1851, 1999.
- (1999) IEEE Transactions on Automatic Control , vol.44 , pp. 1840-1851
- Tsitsiklis, J.N.¹ Van Roy, B.²

25
- 33646435300
- A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
- David Choi and Benjamin Van Roy. A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning. Discrete Event Dynamic Systems, 16(2):207-239, 2006.
- (2006) Discrete Event Dynamic Systems , vol.16 , Issue.2 , pp. 207-239
- Choi, D.¹ Van Roy, B.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.