SCOPUS Information Search Platform

IJCAI International Joint Conference on Artificial Intelligence

Volume , Issue , 2005, Pages 690-695

Reinforcement learning in POMDPs without resets

(3) Even-Dar, Eyal a; Kakade, Sham M. b; Mansour, Yishay a

a TEL AVIV UNIVERSITY (Israel)

b UNIVERSITY OF PENNSYLVANIA (United States)

Author keywords

[No Author keywords available]

Indexed keywords

AVERAGE REWARD; BALANCING EXPLORATION AND EXPLOITATIONS; BUILDING BLOCKS; CONVERGENCE RATES; NEAR-OPTIMAL; OPTIMAL POLICIES; UNKNOWN ENVIRONMENTS; UNOBSERVABLE;

ARTIFICIAL INTELLIGENCE; OPTIMIZATION; REINFORCEMENT LEARNING;

ALGORITHMS;

EID: 84880715629
PISSN: 10450823
EISSN: None
Source Type: Conference Proceeding
DOI: None
Document Type: Conference Paper

Times cited : (33)

References (13)

1
- 0002436850
- Tractable inference for complex stochastic processes
- X. Boyen and D. Koller. Tractable inference for complex stochastic processes. In UAI, 1998.
- (1998) UAI
- Boyen, X.¹ Koller, D.²

2
- 0003989210
- PhD thesis, Brown University
- A. Cassandra. Exact and approximate algorithms for partially observable markov decision processes. PhD thesis, Brown University, 1998.
- (1998) Exact and Approximate Algorithms for Partially Observable Markov Decision Processes
- Cassandra, A.¹

3
- 57049103166
- A heuristic variable-grid solution method for pomdps
- M. Hauskrecht. A heuristic variable-grid solution method for pomdps. In AAAI-97, pages 734-739, 1997.
- (1997) AAAI-97 , pp. 734-739
- Hauskrecht, M.¹

4
- 84898967749
- Approximate planning in large pomdps via reusable trajectories
- M. Kearns, Y. Mansour, and A. Ng. Approximate planning in large pomdps via reusable trajectories. In NIPS 12, 1999.
- (1999) NIPS 12
- Kearns, M.¹ Mansour, Y.² Ng, A.³

5
- 0012257655
- Near-optimal reinforcement learning in polynomial time
- M. Kearns and S. Singh. Near-optimal reinforcement learning in polynomial time. Proceedings of ICML, 1998.
- Proceedings of ICML, 1998
- Kearns, M.¹ Singh, S.²

6
- 0000494894
- Computationally feasible bounds for partially observed markov decision processes
- W. S. Lovejoy. Computationally feasible bounds for partially observed markov decision processes. Operations Research, 39(1):162-175, 1991a.
- (1991) Operations Research , vol.39 , Issue.1 , pp. 162-175
- Lovejoy, W.S.¹

7
- 0002679852
- A survey of algorithmic methods for partially observed markov decision processes
- W. S. Lovejoy. A survey of algorithmic methods for partially observed markov decision processes. Annals of Operations Research, 28:47-66, 1991b.
- (1991) Annals of Operations Research , vol.28 , pp. 47-66
- Lovejoy, W.S.¹

8
- 0036374190
- Nonapproximability results for partially observable Markov decision processes
- C. Lusena, J. Goldsmith, and M. Mundhenk. Nonapproximability results for partially observable markov decision processes. Journal of Artificial Intelligence Research, 14: 83-103, 2001. (Pubitemid 33738060)
- (2001) Journal of Artificial Intelligence Research , vol.14 , pp. 83-103
- Lusena, C.¹ Goldsmith, J.² Mundhenk, M.³

9
- 0004835198
- Approximate planning for factored pomdps using belief state simplification
- D. McAllester and S. Singh. Approximate planning for factored pomdps using belief state simplification. In Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 409-416, 1999.
- (1999) Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI) , pp. 409-416
- McAllester, D.¹ Singh, S.²

10
- 85168129602
- Approximating optimal policies for partially observable stochastic domains
- R. Parr and S. Russell. Approximating optimal policies for partially observable stochastic domains. In Proceedings of the International Joint Conference on Artificial Intelligence, 1995.
- Proceedings of the International Joint Conference on Artificial Intelligence, 1995
- Parr, R.¹ Russell, S.²

11
- 0012646255
- Learning to cooperate via policy search
- L. Peshkin, K. Kim, N. Meuleau, and L.P. Kaelbling. Learning to cooperate via policy search. In 16th Proceedings of UAI, pages 307-314, 2000.
- (2000) 16th Proceedings of UAI , pp. 307-314
- Peshkin, L.¹ Kim, K.² Meuleau, N.³ Kaelbling, L.P.⁴

12
- 0001349185
- Inference of finite automata using homing sequences
- R. Rivest and R. Schapire. Inference of finite automata using homing sequences. Information and Computation, 103(2): 299-347, 1993.
- (1993) Information and Computation , vol.103 , Issue.2 , pp. 299-347
- Rivest, R.¹ Schapire, R.²

13
- 0003871607
- PhD thesis, Stanford University, Stanford, California
- E. Sondik. The optimal control of partially observable processes over a finite horizon. PhD thesis, Stanford University, Stanford, California, 1971.
- (1971) The Optimal Control of Partially Observable Processes over a Finite Horizon
- Sondik, E.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.