Volume 405, Issue 3, 2008, Pages 274-284

On the possibility of learning in reactive environments with arbitrary dependence

Author keywords

(non) Markov decision processes; Asymptotic average value; Reinforcement learning; Self optimizing policies

Indexed keywords

MARKOV PROCESSES; STOCHASTIC SYSTEMS;

EID: 77949509398     PISSN: 03043975     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.tcs.2008.06.039     Document Type: Article
Times cited : (19)

References (18)
  • 5
    • 33845304828
    • How to combine expert (and novice) advice when actions impact the environment?
    • Sebastian Thrun, Lawrence Saul, Bernhard Schölkopf (Eds.), MIT Press, Cambridge, MA
    • D. Pucci de Farias, N. Megiddo, How to combine expert (and novice) advice when actions impact the environment? in: Sebastian Thrun, Lawrence Saul, Bernhard Schölkopf (Eds.), Advances in Neural Information Processing Systems, vol. 16, MIT Press, Cambridge, MA, 2004.
    • (2004) Advances in Neural Information Processing Systems , vol.16
    • De Farias, D.P.1    Megiddo, N.2
  • 7
    • 84880715629
    • Reinforcement learning in POMDPs without resets
    • E. Even-Dar, S.M. Kakade, Y. Mansour, Reinforcement learning in POMDPs without resets, in: IJCAI, 2005, pp. 690-695.
    • (2005) IJCAI , pp. 690-695
    • Even-Dar, E.1    Kakade, S.M.2    Mansour, Y.3
  • 8
    • 21844436185
    • Prediction with expert advice by following the perturbed leader for general weights
    • Algorithmic Learning Theory - 15th International Conference, ALT 2004
    • M. Hutter, J. Poland, Prediction with expert advice by following the perturbed leader for general weights, in: Proc. 15th International Conf. on Algorithmic Learning Theory, ALT'04, in: LNAI, vol. 3244, Springer, Padova, Berlin, 2004, pp. 279-293. (Pubitemid 41050298)
    • (2004) Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) , vol.3244 , pp. 279-293
    • Hutter, M.1    Poland, J.2
  • 9
    • 84937417436
    • Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures
    • Lecture Notes in Artificial Intelligence, Springer, Sydney, Australia, July
    • M. Hutter, Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures, in: Proc. 15th Annual Conference on Computational Learning Theory, COLT 2002, in: Lecture Notes in Artificial Intelligence, Springer, Sydney, Australia, July 2002, pp. 364-379.
    • (2002) Proc. 15th Annual Conference on Computational Learning Theory, COLT 2002 , pp. 364-379
    • Hutter, M.1
  • 10
    • 4644374039
    • Optimality of universal Bayesian prediction for general loss and alphabet
    • M. Hutter, Optimality of universal Bayesian prediction for general loss and alphabet, Journal of Machine Learning Research 4 (2003) 971-1000.
    • (2003) Journal of Machine Learning Research , vol.4 , pp. 971-1000
    • Hutter, M.1
  • 16
    • 41149139797
    • Predicting non-stationary processes
    • DOI 10.1016/j.aml.2007.04.004, PII S0893965907001899
    • D. Ryabko, M. Hutter, Predicting Non-Stationary Processes, Applied Mathematics Letters 21 (5) (2008) 477-482. (Pubitemid 351424908)
    • (2008) Applied Mathematics Letters , vol.21 , Issue.5 , pp. 477-482
    • Ryabko, D.1    Hutter, M.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.