Volume , Issue , 2005, Pages 690-695

Reinforcement learning in POMDPs without resets

Author keywords

[No Author keywords available]

Indexed keywords

AVERAGE REWARD; BALANCING EXPLORATION AND EXPLOITATIONS; BUILDING BLOCKS; CONVERGENCE RATES; NEAR-OPTIMAL; OPTIMAL POLICIES; UNKNOWN ENVIRONMENTS; UNOBSERVABLE

EID: 84880715629     PISSN: 10450823     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 33

References (13)
  • 1
    • 0002436850
    • Tractable inference for complex stochastic processes
    • X. Boyen and D. Koller. Tractable inference for complex stochastic processes. In UAI, 1998.
    • (1998) UAI
    • Boyen, X.1    Koller, D.2
  • 3
    • 57049103166
    • A heuristic variable-grid solution method for POMDPs
    • M. Hauskrecht. A heuristic variable-grid solution method for POMDPs. In AAAI-97, pages 734-739, 1997.
    • (1997) AAAI-97 , pp. 734-739
    • Hauskrecht, M.1
  • 4
    • 84898967749
    • Approximate planning in large POMDPs via reusable trajectories
    • M. Kearns, Y. Mansour, and A. Ng. Approximate planning in large POMDPs via reusable trajectories. In NIPS 12, 1999.
    • (1999) NIPS 12
    • Kearns, M.1    Mansour, Y.2    Ng, A.3
  • 6
    • 0000494894
    • Computationally feasible bounds for partially observed Markov decision processes
    • W. S. Lovejoy. Computationally feasible bounds for partially observed Markov decision processes. Operations Research, 39(1):162-175, 1991a.
    • (1991) Operations Research , vol.39 , Issue.1 , pp. 162-175
    • Lovejoy, W.S.1
  • 7
    • 0002679852
    • A survey of algorithmic methods for partially observed Markov decision processes
    • W. S. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research, 28:47-66, 1991b.
    • (1991) Annals of Operations Research , vol.28 , pp. 47-66
    • Lovejoy, W.S.1
  • 8
    • 0036374190
    • Nonapproximability results for partially observable Markov decision processes
    • C. Lusena, J. Goldsmith, and M. Mundhenk. Nonapproximability results for partially observable Markov decision processes. Journal of Artificial Intelligence Research, 14:83-103, 2001. (Pubitemid 33738060)
    • (2001) Journal of Artificial Intelligence Research , vol.14 , pp. 83-103
    • Lusena, C.1    Goldsmith, J.2    Mundhenk, M.3
  • 12
    • 0001349185
    • Inference of finite automata using homing sequences
    • R. Rivest and R. Schapire. Inference of finite automata using homing sequences. Information and Computation, 103(2):299-347, 1993.
    • (1993) Information and Computation , vol.103 , Issue.2 , pp. 299-347
    • Rivest, R.1    Schapire, R.2


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.