|
Volume , Issue , 2010, Pages
|
Online Markov decision processes under bandit feedback
|
Author keywords
[No Author keywords available]
|
Indexed keywords
E-LEARNING;
MARKOV PROCESSES;
STOCHASTIC SYSTEMS;
BANDIT FEEDBACKS;
LEARNING AGENTS;
MARKOV DECISION PROCESSES;
MARKOVIAN ENVIRONMENT;
OBLIVIOUS ADVERSARIES;
ONLINE LEARNING;
REWARD FUNCTION;
STATIONARY POLICY;
STOCHASTICS;
TIME STEP;
LEARNING ALGORITHMS;
|
EID: 85162052729
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper
|
Times cited: 154
|
References (8)
|