Volume 52, Issue 7, 2007, Pages 1349-1355

Recursive learning automata approach to Markov decision processes

Author keywords

Learning automata; Markov decision process (MDP); Sampling

Indexed keywords

FINITE ELEMENT METHOD; MARKOV PROCESSES; OPTIMIZATION; RANDOM PROCESSES; RECURSIVE FUNCTIONS

EID: 34547108579     PISSN: 00189286     EISSN: None     Source Type: Journal    
DOI: 10.1109/TAC.2007.900859     Document Type: Article
Times cited: 11

References (18)
  • 1
    • 0036682894
    • A reinforcement learning approach to automatic generation control
    • T. P. I. Ahamed, P. S. N. Rao, and P. S. Sastry, "A reinforcement learning approach to automatic generation control," Electric Power Syst. Res., vol. 63, pp. 9-26, 2002.
    • (2002) Electric Power Syst. Res , vol.63 , pp. 9-26
    • Ahamed, T.P.I.1    Rao, P.S.N.2    Sastry, P.S.3
  • 2
    • 0013535965
    • Infinite-horizon policy-gradient estimation
    • J. Baxter and P. L. Bartlett, "Infinite-horizon policy-gradient estimation," J. Artif. Intell. Res., vol. 15, pp. 319-350, 2001.
    • (2001) J. Artif. Intell. Res , vol.15 , pp. 319-350
    • Baxter, J.1    Bartlett, P.L.2
  • 4
    • 14644444172
    • An adaptive sampling algorithm for solving Markov decision processes
    • H. S. Chang, M. C. Fu, J. Hu, and S. I. Marcus, "An adaptive sampling algorithm for solving Markov decision processes," Operat. Res., vol. 53, no. 1, pp. 126-139, 2005.
    • (2005) Operat. Res , vol.53 , Issue.1 , pp. 126-139
    • Chang, H.S.1    Fu, M.C.2    Hu, J.3    Marcus, S.I.4
  • 6
    • 33847338786
    • An asymptotically efficient simulation-based algorithm for finite horizon stochastic dynamic programming
    • H. S. Chang, M. C. Fu, J. Hu, and S. I. Marcus, "An asymptotically efficient simulation-based algorithm for finite horizon stochastic dynamic programming," IEEE Trans. Autom. Control, vol. 52, no. 1, pp. 89-94, Jan. 2007.
    • (2007) IEEE Trans. Autom. Control , vol.52 , Issue.1 , pp. 89-94
    • Chang, H.S.1    Fu, M.C.2    Hu, J.3    Marcus, S.I.4
  • 8
    • 34547118450
    • Simulation-based uniform value function estimates of Markov decision processes
    • R. Jain and P. Varaiya, "Simulation-based uniform value function estimates of Markov decision processes," SIAM J. Control Optim., vol. 45, no. 5, pp. 1633-1656, 2006.
    • (2006) SIAM J. Control Optim , vol.45 , Issue.5 , pp. 1633-1656
    • Jain, R.1    Varaiya, P.2
  • 11
    • 0030212543
    • Finite time analysis of the pursuit algorithm for learning automata
    • K. Rajaraman and P. S. Sastry, "Finite time analysis of the pursuit algorithm for learning automata," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 26, no. 4, pp. 590-598, Aug. 1996.
    • (1996) IEEE Trans. Syst., Man, Cybern. B, Cybern , vol.26 , Issue.4 , pp. 590-598
    • Rajaraman, K.1    Sastry, P.S.2
  • 12
    • 0031235784
    • A reinforcement learning neural network for adaptive control of Markov chains
    • G. Santharam and P. S. Sastry, "A reinforcement learning neural network for adaptive control of Markov chains," IEEE Trans. Syst., Man, Cybern. A, Syst. Humans, vol. 27, no. 5, pp. 588-600, Sep. 1997.
    • (1997) IEEE Trans. Syst., Man, Cybern. A, Syst. Humans , vol.27 , Issue.5 , pp. 588-600
    • Santharam, G.1    Sastry, P.S.2
  • 13
    • 33748017961
    • Systems of learning automata-estimator algorithms and applications
    • P. S. Sastry, "Systems of learning automata-estimator algorithms and applications," Ph.D. dissertation, Dept. Electr. Eng., Indian Inst. Sci., Bangalore, India, Jun. 1985.
    • (1985)
    • Sastry, P.S.1
  • 15
    • 0004225404
    • A. N. Shiryaev, Probability, 2nd ed. New York: Springer-Verlag, 1995.
    • (1995) Probability
    • Shiryaev, A.N.1
  • 18
    • 0022738693
    • Decentralized learning in finite Markov chains
    • R. M. Wheeler Jr. and K. S. Narendra, "Decentralized learning in finite Markov chains," IEEE Trans. Autom. Control, vol. AC-31, no. 6, pp. 519-526, Jun. 1986.
    • (1986) IEEE Trans. Autom. Control , vol.AC-31 , Issue.6 , pp. 519-526
    • Wheeler Jr., R.M.1    Narendra, K.S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.