|
Volume 2111, 2001, Pages 605-615
|
Optimizing average reward using discounted rewards
|
Author keywords
[No Author keywords available]
|
Indexed keywords
DYNAMIC PROGRAMMING;
MARKOV PROCESSES;
REINFORCEMENT LEARNING;
APPROXIMATE HESSIANS;
AVERAGE REWARD;
BELLMAN EQUATIONS;
BIASED ESTIMATES;
DISCOUNT FACTORS;
DISCOUNTED REWARD;
MIXING TIME;
POLICY GRADIENT;
COMPUTATION THEORY;
|
EID: 84943252297
PISSN: 03029743
EISSN: 16113349
Source Type: Book Series
DOI: 10.1007/3-540-44581-1_40
Document Type: Conference Paper
|
Times cited: 34
|
References (10)
|