Volume , Issue , 2008, Pages

Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs

Author keywords

[No Author keywords available]

Indexed keywords

ACTIVE LEARNING; APPROXIMATE ALGORITHMS; BAYES RISK; DOMAIN KNOWLEDGE; HIDDEN STATE; MODEL PARAMETERS; PARTIALLY OBSERVABLE MARKOV DECISION PROCESS; PLANNING DOMAINS

EID: 84864606634     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (2)

References (15)
  • 3
    • EID: 84880715629
    • Even-Dar, E.; Kakade, S. M.; and Mansour, Y. 2005. Reinforcement learning in POMDPs without resets. In IJCAI, pp. 690-695.
  • 4
    • EID: 39649090194
    • Jaulmes, R.; Pineau, J.; and Precup, D. 2005. Learning in non-stationary partially observable Markov decision processes. In ECML Workshop.
  • 5
    • EID: 85138579181
    • Littman, M. L.; Cassandra, A. R.; and Kaelbling, L. P. 1995. Learning policies for partially observable environments: Scaling up. In ICML.
  • 6
    • EID: 0042547347
    • Ng, A., and Russell, S. 2000. Algorithms for inverse reinforcement learning. In Proceedings of ICML.
  • 7
    • EID: 84880772945
    • Pineau, J.; Gordon, G.; and Thrun, S. 2003. Point-based value iteration: An anytime algorithm for POMDPs. In IJCAI.
  • 8
    • EID: 34250730267
    • Poupart, P.; Vlassis, N.; Hoey, J.; and Regan, K. 2006. An analytic solution to discrete Bayesian reinforcement learning. In ICML, pp. 697-704. New York, NY, USA: ACM Press.
  • 9
    • EID: 0024610919
    • Rabiner, L. R. 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77(2):257-286.
  • 13
    • EID: 33847202724
    • Sutton, R. 1988. Learning to predict by the methods of temporal differences. Machine Learning 3.
  • 15
    • EID: 33846220727
    • Williams, J., and Young, S. 2005. Scaling up POMDPs for dialogue management: The "summary POMDP" method. In Proceedings of the IEEE ASRU Workshop.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.