SCOPUS Information Retrieval Platform

Volume 405, Issue 3, 2008, Pages 274-284

On the possibility of learning in reactive environments with arbitrary dependence

b INRIA (France)

Author keywords

(non) Markov decision processes; Asymptotic average value; Reinforcement learning; Self optimizing policies

Indexed keywords

MARKOV PROCESSES; STOCHASTIC SYSTEMS;

AVERAGE VALUES; MARKOV DECISION PROCESSES; MIXING CONDITIONS; PROBABILISTIC ASSUMPTIONS; SELF-OPTIMIZING; STOCHASTIC DEPENDENCE;

REINFORCEMENT LEARNING;

EID: 77949509398 PISSN: 03043975 EISSN: None Source Type: Journal
DOI: 10.1016/j.tcs.2008.06.039 Document Type: Article

Times cited: (19)

References (18)

1
- 0003471079
- Springer
- D. Bosq, Nonparametric Statistics for Stochastic Processes, Springer, 1996.
- (1996) Nonparametric Statistics for Stochastic Processes
- Bosq, D.¹

2
- 84880854156
- A general polynomial time algorithm for near-optimal reinforcement learning
- R.I. Brafman, M. Tennenholtz, A general polynomial time algorithm for near-optimal reinforcement learning, in: Proc. 17th International Joint Conference on Artificial Intelligence, IJCAI-01, 1999, pp. 734-739.
- (1999) Proc. 17th International Joint Conference on Artificial Intelligence, IJCAI-01 , pp. 734-739
- Brafman, R.I.¹ Tennenholtz, M.²

3
- 84926078662
- Cambridge University Press, New York
- N. Cesa-Bianchi, G. Lugosi, Prediction, Learning, and Games, Cambridge University Press, New York, 2006.
- (2006) Prediction, Learning, and Games
- Cesa-Bianchi, N.¹ Lugosi, G.²

4
- 33748099812
- Notes on information theory and statistics
- I. Csiszar, P.C. Shields, Notes on information theory and statistics, in: Foundations and Trends in Communications and Information Theory, 2004.
- (2004) Foundations and Trends in Communications and Information Theory
- Csiszar, I.¹ Shields, P.C.²

6
- 0003954462
- John Wiley & Sons, New York
- J.L. Doob, Stochastic Processes, John Wiley & Sons, New York, 1953.
- (1953) Stochastic Processes
- Doob, J.L.¹

7
- 84880715629
- Reinforcement learning in POMDPs without resets
- E. Even-Dar, S.M. Kakade, Y. Mansour, Reinforcement learning in POMDPs without resets, in: IJCAI, 2005, pp. 690-695.
- (2005) IJCAI , pp. 690-695
- Even-Dar, E.¹ Kakade, S.M.² Mansour, Y.³

10
- 4644374039
- Optimality of universal Bayesian prediction for general loss and alphabet
- M. Hutter, Optimality of universal Bayesian prediction for general loss and alphabet, Journal of Machine Learning Research 4 (2003) 971-1000.
- (2003) Journal of Machine Learning Research , vol.4 , pp. 971-1000
- Hutter, M.¹

11
- 21844479189
- Springer, Berlin
- M. Hutter, Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability, Springer, Berlin, 2005, 300 pages http://www.hutter1.net/ai/uaibook.htm.
- (2005) Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability , pp. 300
- Hutter, M.¹

13
- 0003691637
- Prentice Hall, Englewood Cliffs, NJ
- P.R. Kumar, P.P. Varaiya, Stochastic Systems: Estimation, Identification, and Adaptive Control, Prentice Hall, Englewood Cliffs, NJ, 1986.
- (1986) Stochastic Systems: Estimation, Identification, and Adaptive Control
- Kumar, P.R.¹ Varaiya, P.P.²

14
- 33646515747
- LNAI, vol. 3734, Springer, Singapore, Berlin
- J. Poland, M. Hutter, Defensive universal learning with experts, in: Proc. 16th International Conf. on Algorithmic Learning Theory, ALT'05, in: LNAI, vol. 3734, Springer, Singapore, Berlin, 2005, pp. 356-370.
- (2005) Defensive Universal Learning with Experts, In: Proc. 16th International Conf. on Algorithmic Learning Theory, ALT'05 , pp. 356-370
- Poland, J.¹ Hutter, M.²

17
- 0003584577
- Prentice-Hall, Englewood Cliffs
- S.J. Russell, P. Norvig, Artificial Intelligence. A Modern Approach, Prentice-Hall, Englewood Cliffs, 1995.
- (1995) Artificial Intelligence. A Modern Approach
- Russell, S.J.¹ Norvig, P.²

18
- 0004102479
- MIT Press, Cambridge, MA
- R. Sutton, A. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.