Volume 50, Issue 11, 2005, Pages 1804-1808

Evolutionary policy iteration for solving Markov decision processes

Author keywords

(Distributed) policy iteration; Evolutionary algorithm; Genetic algorithm; Markov decision process; Parallelization

Indexed keywords

CONVERGENCE OF NUMERICAL METHODS; EVOLUTIONARY ALGORITHMS; GENETIC ALGORITHMS; ITERATIVE METHODS; MARKOV PROCESSES; OPTIMIZATION; PROBABILITY

EID: 28644446278     PISSN: 00189286     EISSN: None     Source Type: Journal    
DOI: 10.1109/TAC.2005.858644     Document Type: Article
Times cited: 36

References (13)
  • 2
    • 3543128853
    • "Parallel rollout for on-line solution of partially observable Markov decision processes"
    • H. S. Chang, R. Givan, and E. K. P. Chong, "Parallel rollout for on-line solution of partially observable Markov decision processes," Discrete Event Dyna. Syst.: Theory Appl., vol. 15, no. 3, pp. 309-341, 2004.
    • (2004) Discrete Event Dyna. Syst.: Theory Appl. , vol.15 , Issue.3 , pp. 309-341
    • Chang, H.S.1    Givan, R.2    Chong, E.K.P.3
  • 3
    • 28644432777
    • "Genetic algorithm methods for solving the best stationary policy of finite Markov decision processes"
    • H. Chin and A. Jafari, "Genetic algorithm methods for solving the best stationary policy of finite Markov decision processes," in Proc. 30th Southeastern Symp. System Theory, 1998, pp. 538-543.
    • (1998) Proc. 30th Southeastern Symp. System Theory , pp. 538-543
    • Chin, H.1    Jafari, A.2
  • 4
    • 0003871635
    • "An analysis of the behavior of a class of genetic adaptive systems"
    • K. A. De Jong, "An analysis of the behavior of a class of genetic adaptive systems," Ph.D. dissertation, Univ. Michigan, Ann Arbor, MI, 1975.
    • (1975)
    • De Jong, K.A.1
  • 5
    • 0034272032
    • "Bounded Markov decision processes"
    • R. Givan, S. Leach, and T. Dean, "Bounded Markov decision processes," Artif. Intell., vol. 122, pp. 71-109, 2000.
    • (2000) Artif. Intell. , vol.122 , pp. 71-109
    • Givan, R.1    Leach, S.2    Dean, T.3
  • 7
    • 0026862612
    • "On distributed dynamic programming"
    • A. Jalali and M. J. Ferguson, "On distributed dynamic programming," IEEE Trans. Autom. Control, vol. 37, no. 5, pp. 685-689, May 1992.
    • (1992) IEEE Trans. Autom. Control , vol.37 , Issue.5 , pp. 685-689
    • Jalali, A.1    Ferguson, M.J.2
  • 8
    • 28644451559
    • "A hybrid genetic/optimization algorithm for finite horizon partially observed Markov decision processes"
    • A. Z.-Z. Lin, J. Bean, and C. White, III, "A hybrid genetic/optimization algorithm for finite horizon partially observed Markov decision processes," Dept. Ind. Oper. Eng., Univ. Michigan, Ann Arbor, MI, Tech. Rep. 98-25, 1998.
    • (1998) Dept. Ind. Oper. Eng.
    • Lin, A.Z.-Z.1    Bean, J.2    White III, C.3
  • 10
    • 0001017536
    • "Genetic algorithms for the operations researcher"
    • C. R. Reeves, "Genetic algorithms for the operations researcher," INFORMS J. Comput., vol. 9, no. 3, pp. 231-250, 1997.
    • (1997) INFORMS J. Comput. , vol.9 , Issue.3 , pp. 231-250
    • Reeves, C.R.1
  • 11
    • 0028446271
    • "Genetic algorithms: A survey"
    • M. Srinivas and L. M. Patnaik, "Genetic algorithms: A survey," IEEE Comput., vol. 27, no. 6, pp. 17-26, 1994.
    • (1994) IEEE Comput. , vol.27 , Issue.6 , pp. 17-26
    • Srinivas, M.1    Patnaik, L.M.2
  • 12
    • 0039967456
    • "Analysis of some incremental variants of policy iteration: First steps toward understanding actor-critic learning systems"
    • R. J. Williams and L. C. Baird, "Analysis of some incremental variants of policy iteration: First steps toward understanding actor-critic learning systems," Tech. Rep. NU-CCS-93-11, 1993.
    • (1993)
    • Williams, R.J.1    Baird, L.C.2
  • 13
    • 0342455390
    • "A mathematical analysis of actor-critic architectures for learning optimal controls through incremental dynamic programming"
    • R. J. Williams and L. C. Baird, "A mathematical analysis of actor-critic architectures for learning optimal controls through incremental dynamic programming," in Proc. 6th Yale Workshop on Adaptive and Learning Systems, New Haven, CT, Aug. 15-17, 1990.
    • (1990) Proc. 6th Yale Workshop on Adaptive and Learning Systems
    • Williams, R.J.1    Baird, L.C.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.