



Volume 52, Issue 4, 2007, Pages 677-681

Partially observable Markov decision processes with reward information: Basic ideas and models

Author keywords

Partially observable Markov decision process (POMDP); Reward information policy

Indexed keywords

DECISION THEORY; MARKOV PROCESSES; MATHEMATICAL MODELS; PROBABILITY DISTRIBUTIONS; STATE ESTIMATION

EID: 34247229563     PISSN: 00189286     EISSN: None     Source Type: Journal    
DOI: 10.1109/TAC.2007.894520     Document Type: Article
Times cited: 14

References (15)
  • 1
    • 0023450663
    • Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays - Part II: Markovian reward
    • V. Anantharam, P. Varaiya, and J. Walrand, "Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays - Part II: Markovian reward," IEEE Trans. Autom. Control, vol. AC-32, no. 11, pp. 977-982, Nov. 1987.
    • (1987) IEEE Trans. Autom. Control, vol. AC-32, no. 11, pp. 977-982
    • Anantharam, V.; Varaiya, P.; Walrand, J.
  • 3
    • 0034437507
    • Average cost dynamic programming equations for controlled Markov chains with partial observations
    • V. S. Borkar, "Average cost dynamic programming equations for controlled Markov chains with partial observations," SIAM J. Control Optim., vol. 39, pp. 673-681, 2001.
    • (2001) SIAM J. Control Optim., vol. 39, pp. 673-681
    • Borkar, V.S.
  • 4
    • 3843150404
    • A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: Multichain cases
    • X.-R. Cao and X. P. Guo, "A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: Multichain cases," Automatica, vol. 40, pp. 1749-1759, 2004.
    • (2004) Automatica, vol. 40, pp. 1749-1759
    • Cao, X.-R.; Guo, X.P.
  • 5
    • 33244489385
    • Optimal control of ergodic continuous-time Markov chains with average sample-path rewards
    • X. P. Guo and X.-R. Cao, "Optimal control of ergodic continuous-time Markov chains with average sample-path rewards," SIAM J. Control Optim., vol. 44, pp. 29-48, 2005.
    • (2005) SIAM J. Control Optim., vol. 44, pp. 29-48
    • Guo, X.P.; Cao, X.-R.
  • 7
    • 0036112835
    • Limiting discounted-cost control of partially observable stochastic systems
    • O. Hernández-Lerma and R. Romera, "Limiting discounted-cost control of partially observable stochastic systems," SIAM J. Control Optim., vol. 40, pp. 348-369, 2001.
    • (2001) SIAM J. Control Optim., vol. 40, pp. 348-369
    • Hernández-Lerma, O.; Romera, R.
  • 8
    • 0034171759
    • Finite-time lower bounds for the two-armed bandit problem
    • S. R. Kulkarni and G. Lugosi, "Finite-time lower bounds for the two-armed bandit problem," IEEE Trans. Autom. Control, vol. 45, no. 4, pp. 711-714, Apr. 2000.
    • (2000) IEEE Trans. Autom. Control, vol. 45, no. 4, pp. 711-714
    • Kulkarni, S.R.; Lugosi, G.
  • 9
    • 0002679852
    • A survey of algorithmic results for partially observable Markov decision processes
    • W. S. Lovejoy, "A survey of algorithmic results for partially observable Markov decision processes," Ann. Oper. Res., vol. 35, pp. 47-66, 1991.
    • (1991) Ann. Oper. Res., vol. 35, pp. 47-66
    • Lovejoy, W.S.
  • 10
    • 0006034218
    • An optimal inspection and replacement policy under incomplete state information
    • M. Ohnishi, H. Kawai, and H. Mine, "An optimal inspection and replacement policy under incomplete state information," Eur. J. Oper. Res., vol. 27, pp. 117-128, 1986.
    • (1986) Eur. J. Oper. Res., vol. 27, pp. 117-128
    • Ohnishi, M.; Kawai, H.; Mine, H.
  • 12
    • 33646420905
    • Constrained ordinal optimization - A feasibility based approach
    • C. Song, X. H. Guan, and Y. C. Ho, "Constrained ordinal optimization - A feasibility based approach," Discrete Event Dyn. Syst.: Theory Appl., vol. 16, pp. 279-299, 2006.
    • (2006) Discrete Event Dyn. Syst.: Theory Appl., vol. 16, pp. 279-299
    • Song, C.; Guan, X.H.; Ho, Y.C.
  • 13
    • 14244259416
    • Bounds on optimal cost for a replacement problem with partial observation
    • C. C. White, "Bounds on optimal cost for a replacement problem with partial observation," Naval Res. Logist. Quart., vol. 26, pp. 415-422, 1979.
    • (1979) Naval Res. Logist. Quart., vol. 26, pp. 415-422
    • White, C.C.
  • 14
    • 0008632494
    • Discrete-time Markovian decision processes with incomplete state observation
    • S. Yoshikazu and Y. Tsuneo, "Discrete-time Markovian decision processes with incomplete state observation," Ann. Math. Statist., vol. 41, pp. 78-86, 1970.
    • (1970) Ann. Math. Statist., vol. 41, pp. 78-86
    • Yoshikazu, S.; Tsuneo, Y.
  • 15


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.