SCOPUS Information Search Platform

Proceedings of the 25th International Conference on Machine Learning

Volume , Issue , 2008, Pages 256-263

Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs

(3) Doshi, Finale (a); Pineau, Joelle (b); Roy, Nicholas (a)

a MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

b MCGILL UNIVERSITY (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

EDUCATION; LEARNING SYSTEMS; PLANNING; REINFORCEMENT LEARNING; ROBOT LEARNING;

ACTIVE LEARNINGS; APPROXIMATION APPROACHES; BAYES RISKS; DOMAIN KNOWLEDGES; HIDDEN STATES; MODEL PARAMETERS; PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES; PLANNING DOMAINS; ACTIVE LEARNING; DOMAIN KNOWLEDGE; HIDDEN STATE; PARTIALLY OBSERVABLE MARKOV DECISION PROCESS;

REINFORCEMENT;

EID: 56449086386
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (49)

References (18)

1
- 1142281527
- Dearden, R., Friedman, N., & Andre, D. (1999). Model based Bayesian exploration.
- (1999) Model based Bayesian exploration
- Dearden, R.¹ Friedman, N.² Andre, D.³

2
- 56449088986
- Efficient model learning for dialog management
- Technical Report SS-07-07. AAAI Press
- Doshi, F., & Roy, N. (2007). Efficient model learning for dialog management. Technical Report SS-07-07. AAAI Press.
- (2007)
- Doshi, F.¹ Roy, N.²

3
- 84880715629
- Reinforcement learning in POMDPs without resets
- Even-Dar, E., Kakade, S. M., & Mansour, Y. (2005). Reinforcement learning in POMDPs without resets. IJCAI.
- (2005) IJCAI
- Even-Dar, E.¹ Kakade, S.M.² Mansour, Y.³

4
- 39649090194
- Learning in non-stationary partially observable Markov decision processes
- Jaulmes, R., Pineau, J., & Precup, D. (2005). Learning in non-stationary partially observable Markov decision processes. ECML Workshop on Reinforcement Learning in Non-Stationary Environments.
- (2005) ECML Workshop on Reinforcement Learning in Non-Stationary Environments
- Jaulmes, R.¹ Pineau, J.² Precup, D.³

5
- 85138579181
- Learning policies for partially observable environments: Scaling up
- Littman, M. L., Cassandra, A. R., & Kaelbling, L. P. (1995). Learning policies for partially observable environments: scaling up. ICML.
- (1995) ICML
- Littman, M.L.¹ Cassandra, A.R.² Kaelbling, L.P.³

6
- 56449107989
- The variational Bayesian EM algorithm for incomplete data: With application to scoring graphical model structures
- Millet, I. (1998). The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. Journal of Multi-Criteria Decision Analysis, 6.
- (1998) Journal of Multi-Criteria Decision Analysis , vol.6
- Millet, I.¹

7
- 56449111524
- Sequential Monte Carlo samplers
- Del Moral, P., Doucet, A., & Peters, G. (2002). Sequential Monte Carlo samplers.
- (2002)
- Del Moral, P.¹ Doucet, A.² Peters, G.³

8
- 0042547347
- Algorithms for inverse reinforcement learning
- Ng, A., & Russell, S. (2000). Algorithms for inverse reinforcement learning. ICML.
- (2000) ICML
- Ng, A.¹ Russell, S.²

9
- 84880772945
- Point-based value iteration: An anytime algorithm for POMDPs
- Pineau, J., Gordon, G., & Thrun, S. (2003). Point-based value iteration: an anytime algorithm for POMDPs. IJCAI.
- (2003) IJCAI
- Pineau, J.¹ Gordon, G.² Thrun, S.³

10
- 77950356463
- Model-based Bayesian reinforcement learning in partially observable domains
- Poupart, P., & Vlassis, N. (2008). Model-based Bayesian reinforcement learning in partially observable domains. ISAIM.
- (2008) ISAIM
- Poupart, P.¹ Vlassis, N.²

11
- 33749251297
- An analytic solution to discrete Bayesian reinforcement learning
- Poupart, P., Vlassis, N., Hoey, J., & Regan, K. (2006). An analytic solution to discrete Bayesian reinforcement learning. ICML.
- (2006) ICML
- Poupart, P.¹ Vlassis, N.² Hoey, J.³ Regan, K.⁴

12
- 84880707672
- Spoken dialogue management using probabilistic reasoning
- Hong Kong
- Roy, N., Pineau, J., & Thrun, S. (2000). Spoken dialogue management using probabilistic reasoning. ACL. Hong Kong.
- (2000) ACL
- Roy, N.¹ Pineau, J.² Thrun, S.³

13
- 4243328638
- Fast learning of on-line EM algorithm
- Sato, M. (1999). Fast learning of on-line EM algorithm. TR-H-281, ATR Human Information Processing Lab.
- (1999) TR-H-281, ATR Human Information Processing Lab
- Sato, M.¹

14
- 84880906197
- Forward search value iteration for POMDPs
- Shani, G., Brafman, R., & Shimony, S. (2007). Forward search value iteration for POMDPs. IJCAI.
- (2007) IJCAI
- Shani, G.¹ Brafman, R.² Shimony, S.³

15
- 34548745051
- Incremental model-based learners with formal learning-time guarantees
- Strehl, A. L., Li, L., & Littman, M. L. (2006). Incremental model-based learners with formal learning-time guarantees. UAI.
- (2006) UAI
- Strehl, A.L.¹ Li, L.² Littman, M.L.³

16
- 14344258433
- A Bayesian framework for reinforcement learning
- Strens, M. (2000). A Bayesian framework for reinforcement learning. ICML.
- (2000) ICML
- Strens, M.¹

17
- 0004049893
- Doctoral dissertation, Cambridge University
- Watkins, C. (1989). Learning from delayed rewards. Doctoral dissertation, Cambridge University.
- (1989) Learning from delayed rewards
- Watkins, C.¹

18
- 33846220727
- Scaling up POMDPs for dialogue management: The "summary POMDP" method
- Williams, J., & Young, S. (2005). Scaling up POMDPs for dialogue management: The "summary POMDP" method. Proceedings of the IEEE ASRU Workshop.
- (2005) Proceedings of the IEEE ASRU Workshop
- Williams, J.¹ Young, S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.