SCOPUS Information Retrieval Platform

IJCAI International Joint Conference on Artificial Intelligence

Volume , Issue , 2011, Pages 2165-2171

Robust online optimization of reward-uncertain MDPs

(2) Regan, Kevin (a); Boutilier, Craig (a)

(a) UNIVERSITY OF TORONTO (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

ANY-TIME ALGORITHMS; APPROXIMATION SCHEME; COMPUTATIONAL TRACTABILITY; ERROR BOUND; MARKOV DECISION PROCESSES; MINIMAX REGRET; ONLINE OPTIMIZATION; REWARD FUNCTION;

ERROR ANALYSIS; MARKOV PROCESSES;

ARTIFICIAL INTELLIGENCE;

EID: 84881084517 PISSN: 10450823 EISSN: None Source Type: Conference Proceeding
DOI: 10.5591/978-1-57735-516-8/IJCAI11-361 Document Type: Conference Paper

Times cited : (31)

References (19)

1
- 0010606787
- lrs: A revised implementation of the reverse search vertex enumeration algorithm
- Birkhauser-Verlag
- David Avis. lrs: A revised implementation of the reverse search vertex enumeration algorithm. In Polytopes-Combinatorics and Computation, pages 177-198. Birkhauser-Verlag, 2000.
- (2000) Polytopes-Combinatorics and Computation , pp. 177-198
- Avis, D.¹

2
- 1942450194
- Technical Report CMU-RI-TR-01-25, Carnegie Mellon University, Pittsburgh
- Andrew Bagnell, Andrew Ng, and Jeff Schneider. Solving uncertain Markov decision problems. Technical Report CMU-RI-TR-01-25, Carnegie Mellon University, Pittsburgh, 2003.
- (2003) Solving Uncertain Markov Decision Problems
- Bagnell, A.¹ Ng, A.² Schneider, J.³

3
- 27344432831
- Solving transition independent decentralized Markov decision processes
- Raphen Becker, Shlomo Zilberstein, Victor R. Lesser, and Claudia V. Goldman. Solving transition independent decentralized Markov decision processes. Journal of Artificial Intelligence Research, 22:423-455, 2004.
- (2004) Journal of Artificial Intelligence Research , vol.22 , pp. 423-455
- Becker, R.¹ Zilberstein, S.² Lesser, V.R.³ Goldman, C.V.⁴

4
- 33645712239
- A planning system based on Markov decision processes to guide people with dementia through activities of daily living
- Jennifer Boger, Pascal Poupart, Jesse Hoey, Craig Boutilier, Geoff Fernie, and Alex Mihailidis. A planning system based on Markov decision processes to guide people with dementia through activities of daily living. IEEE Transactions on Information Technology in Biomedicine, 10(2):323-333, 2006.
- (2006) IEEE Transactions on Information Technology in Biomedicine , vol.10 , Issue.2 , pp. 323-333
- Boger, J.¹ Poupart, P.² Hoey, J.³ Boutilier, C.⁴ Fernie, G.⁵ Mihailidis, A.⁶

5
- 33646096015
- Constraint-based optimization and utility elicitation using the minimax decision criterion
- Craig Boutilier, Relu Patrascu, Pascal Poupart, and Dale Schuurmans. Constraint-based optimization and utility elicitation using the minimax decision criterion. Artificial Intelligence, 170(8-9):686-713, 2006.
- (2006) Artificial Intelligence , vol.170 , Issue.8-9 , pp. 686-713
- Boutilier, C.¹ Patrascu, R.² Poupart, P.³ Schuurmans, D.⁴

6
- 0003818801
- PhD thesis, University of British Columbia, Vancouver
- Hsien-Te Cheng. Algorithms for Partially Observable Markov Decision Processes. PhD thesis, University of British Columbia, Vancouver, 1988.
- (1988) Algorithms for Partially Observable Markov Decision Processes
- Cheng, H.-T.¹

7
- 34547985785
- Percentile optimization in uncertain Markov decision processes with application to efficient exploration
- Corvallis, OR
- Erick Delage and Shie Mannor. Percentile optimization in uncertain Markov decision processes with application to efficient exploration. In Proceedings of the Twenty-fourth International Conference on Machine Learning (ICML-07), pages 225-232, Corvallis, OR, 2007.
- (2007) Proceedings of the Twenty-fourth International Conference on Machine Learning (ICML-07) , pp. 225-232
- Delage, E.¹ Mannor, S.²

8
- 0004232519
- Halsted, New York
- Simon French. Decision Theory. Halsted, New York, 1986.
- (1986) Decision Theory
- French, S.¹

9
- 25444493818
- Robust dynamic programming
- G. Iyengar. Robust dynamic programming. Mathematics of Operations Research, 30(2):1-21, 2005.
- (2005) Mathematics of Operations Research , vol.30 , Issue.2 , pp. 1-21
- Iyengar, G.¹

10
- 1942452324
- Planning in the presence of cost functions controlled by an adversary
- Washington, DC
- Brendan McMahan, Geoffrey Gordon, and Avrim Blum. Planning in the presence of cost functions controlled by an adversary. In Proceedings of the Twentieth International Conference on Machine Learning (ICML-03), pages 536-543, Washington, DC, 2003.
- (2003) Proceedings of the Twentieth International Conference on Machine Learning (ICML-03) , pp. 536-543
- McMahan, B.¹ Gordon, G.² Blum, A.³

11
- 0042547347
- Algorithms for inverse reinforcement learning
- Stanford, CA
- Andrew Ng and Stuart Russell. Algorithms for inverse reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML-00), pages 663-670, Stanford, CA, 2000.
- (2000) Proceedings of the Seventeenth International Conference on Machine Learning (ICML-00) , pp. 663-670
- Ng, A.¹ Russell, S.²

12
- 14344250395
- Robust control of Markov decision processes with uncertain transition matrices
- Arnab Nilim and Laurent El Ghaoui. Robust control of Markov decision processes with uncertain transition matrices. Operations Research, 53(1):780-798, 2005.
- (2005) Operations Research , vol.53 , Issue.1 , pp. 780-798
- Nilim, A.¹ El Ghaoui, L.²

13
- 68349086890
- A bilinear programming approach for multiagent planning
- Marek Petrik and Shlomo Zilberstein. A bilinear programming approach for multiagent planning. Journal of Artificial Intelligence Research, 35(1):235-274, 2009.
- (2009) Journal of Artificial Intelligence Research , vol.35 , Issue.1 , pp. 235-274
- Petrik, M.¹ Zilberstein, S.²

14
- 85102627959
- Wiley, New York
- Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York, 1994.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

15
- 80052425037
- Regret-based reward elicitation for Markov decision processes
- Montreal
- Kevin Regan and Craig Boutilier. Regret-based reward elicitation for Markov decision processes. In Proceedings of the Twenty-fifth Conference on Uncertainty in Artificial Intelligence (UAI-09), pages 444-451, Montreal, 2009.
- (2009) Proceedings of the Twenty-fifth Conference on Uncertainty in Artificial Intelligence (UAI-09) , pp. 444-451
- Regan, K.¹ Boutilier, C.²

16
- 77958520196
- Robust policy computation in reward-uncertain MDPs using nondominated policies
- Atlanta
- Kevin Regan and Craig Boutilier. Robust policy computation in reward-uncertain MDPs using nondominated policies. In Proceedings of the Twenty-fourth AAAI Conference on Artificial Intelligence (AAAI-10), pages 1127-1133, Atlanta, 2010.
- (2010) Proceedings of the Twenty-fourth AAAI Conference on Artificial Intelligence (AAAI-10) , pp. 1127-1133
- Regan, K.¹ Boutilier, C.²

17
- 84881054930
- Eliciting additive reward functions for Markov decision processes
- To appear
- Kevin Regan and Craig Boutilier. Eliciting additive reward functions for Markov decision processes. In Proceedings of the Twenty-second International Joint Conference on Artificial Intelligence (IJCAI-11), Barcelona, 2011. To appear.
- Proceedings of the Twenty-second International Joint Conference on Artificial Intelligence (IJCAI-11), Barcelona, 2011
- Regan, K.¹ Boutilier, C.²

18
- 0003984043
- Wiley, New York
- Leonard J. Savage. The Foundations of Statistics. Wiley, New York, 1954.
- (1954) The Foundations of Statistics
- Savage, L.J.¹

19
- 77950823530
- Parametric regret in uncertain Markov decision processes
- Shanghai
- Huan Xu and Shie Mannor. Parametric regret in uncertain Markov decision processes. In 48th IEEE Conference on Decision and Control, pages 3606-3613, Shanghai, 2009.
- (2009) 48th IEEE Conference on Decision and Control , pp. 3606-3613
- Xu, H.¹ Mannor, S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.