SCOPUS Information Search Platform

Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010

Volume , Issue , 2010, Pages

Basis construction from power series expansions of value functions

(2) Mahadevan, Sridhar ᵃ; Liu, Bo ᵃ

ᵃ University of Massachusetts Amherst (United States)

Author keywords

[No Author keywords available]

Indexed keywords

INVERSE PROBLEMS; MATRIX ALGEBRA;

AVERAGE REWARD; BELLMAN ERROR; CONSTRUCTION METHOD; DECISION POWER; DISCOUNT FACTORS; MARKOV DECISION PROCESSES; NEUMANN SERIES EXPANSION; POWER SERIES EXPANSIONS; PROPERTY; VALUE FUNCTIONS;

MARKOV PROCESSES;

EID: 85161990353; PISSN: None; EISSN: None; Source Type: Conference Proceeding
DOI: None; Document Type: Conference Paper

Times cited : (20)

References (15)

1
- 0024680419
- Adaptive aggregation methods for infinite horizon dynamic programming
- D. Bertsekas and D. Castañon. Adaptive aggregation methods for infinite horizon dynamic programming. IEEE Transactions on Automatic Control, 34:589-598, 1989.
- (1989) IEEE Transactions on Automatic Control , vol.34 , pp. 589-598
- Bertsekas, D.¹ Castañon, D.²

2
- 0003598718
- Pitman
- S. Campbell and C. Meyer. Generalized Inverses of Linear Transformations. Pitman, 1979.
- (1979) Generalized Inverses of Linear Transformations
- Campbell, S.¹ Meyer, C.²

3
- 4644323293
- Least-squares policy iteration
- M. Lagoudakis and R. Parr. Least-squares policy iteration. Journal of Machine Learning Research, 4:1107-1149, 2003.
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1107-1149
- Lagoudakis, M.¹ Parr, R.²

4
- 85162034137
- An investigation of basis construction from power series expansions of value functions
- Amherst
- B. Liu and S. Mahadevan. An investigation of basis construction from power series expansions of value functions. Technical report, University of Massachusetts, Amherst, 2010.
- (2010) Technical Report, University of Massachusetts
- Liu, B.¹ Mahadevan, S.²

5
- 17144430819
- Sensitive-discount optimality: Unifying discounted and average reward reinforcement learning
- S. Mahadevan. Sensitive-discount optimality: Unifying discounted and average reward reinforcement learning. In Proceedings of the International Conference on Machine Learning, 1996.
- (1996) Proceedings of the International Conference on Machine Learning
- Mahadevan, S.¹

6
- 70349322784
- Learning representation and control in Markov Decision Processes: New frontiers
- S. Mahadevan. Learning representation and control in Markov Decision Processes: New frontiers. Foundations and Trends in Machine Learning, 1(4):403-565, 2009.
- (2009) Foundations and Trends in Machine Learning , vol.1 , Issue.4 , pp. 403-565
- Mahadevan, S.¹

7
- 35748957806
- Proto-value functions: A Laplacian framework for learning representation and control in Markov Decision Processes
- S. Mahadevan and M. Maggioni. Proto-value functions: A Laplacian framework for learning representation and control in Markov Decision Processes. Journal of Machine Learning Research, 8:2169-2231, 2007.
- (2007) Journal of Machine Learning Research , vol.8 , pp. 2169-2231
- Mahadevan, S.¹ Maggioni, M.²

8
- 56449092660
- An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning
- R. Parr, L. Li, G. Taylor, C. Painter-Wakefield, and M. Littman. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning. In Proceedings of the International Conference on Machine Learning (ICML), 2008.
- (2008) Proceedings of the International Conference on Machine Learning (ICML)
- Parr, R.¹ Li, L.² Taylor, G.³ Painter-Wakefield, C.⁴ Littman, M.⁵

9
- 34547982545
- Analyzing feature generation for value function approximation
- R. Parr, C. Painter-Wakefield, L. Li, and M. Littman. Analyzing feature generation for value function approximation. In Proceedings of the International Conference on Machine Learning (ICML), pages 737-744, 2007.
- (2007) Proceedings of the International Conference on Machine Learning (ICML) , pp. 737-744
- Parr, R.¹ Painter-Wakefield, C.² Li, L.³ Littman, M.⁴

10
- 84880899807
- An analysis of Laplacian methods for value function approximation in MDPs
- M. Petrik. An analysis of Laplacian methods for value function approximation in MDPs. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pages 2574-2579, 2007.
- (2007) Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) , pp. 2574-2579
- Petrik, M.¹

11
- 0003998452
- Wiley Interscience, New York, USA
- M. L. Puterman. Markov Decision Processes. Wiley Interscience, New York, USA, 1994.
- (1994) Markov Decision Processes
- Puterman, M.L.¹

12
- 1842829625
- SIAM Press
- Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM Press, 2003.
- (2003) Iterative Methods for Sparse Linear Systems
- Saad, Y.¹

13
- 85152626183
- A reinforcement learning method for maximizing undiscounted rewards
- Morgan Kaufmann, San Francisco, CA
- A. Schwartz. A reinforcement learning method for maximizing undiscounted rewards. In Proc. 10th International Conf. on Machine Learning. Morgan Kaufmann, San Francisco, CA, 1993.
- (1993) Proc. 10th International Conf. on Machine Learning
- Schwartz, A.¹

14
- 0346922977
- Numerical methods for computing stationary distributions of finite irreducible Markov chains
- Kluwer Academic Publishers
- William J. Stewart. Numerical methods for computing stationary distributions of finite irreducible Markov chains. In Advances in Computational Probability. Kluwer Academic Publishers, 1997.
- (1997) Advances in Computational Probability
- Stewart, W.J.¹

15
- 0012841228
- Successive matrix squaring algorithm for computing the Drazin inverse
- Y. Wei. Successive matrix squaring algorithm for computing the Drazin inverse. Applied Mathematics and Computation, 108:67-75, 2000.
- (2000) Applied Mathematics and Computation , vol.108 , pp. 67-75
- Wei, Y.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.