Volume 15, Issue 2, 2005, Pages 169-197

Basic ideas for event-based optimization of Markov systems

Author keywords

Aggregation; Markov decision processes (MDPs); Performance potentials; Perturbation analysis; Policy gradients; Policy iteration; POMDPs

Indexed keywords

AGGLOMERATION; MARKOV PROCESSES; MATRIX ALGEBRA; OPTIMIZATION; PERTURBATION TECHNIQUES; PROBABILITY; PROBLEM SOLVING

EID: 14644388113     PISSN: 09246703     EISSN: None     Source Type: Journal    
DOI: 10.1007/s10626-004-6211-4     Document Type: Article
Times cited: 91

References (29)
  • 1
    • 0037288370
    • Recent advances in hierarchical reinforcement learning, special issue on reinforcement learning
    • Barto, A., and Mahadevan, S. 2003. Recent advances in hierarchical reinforcement learning, special issue on reinforcement learning. Discret. Event Dyn. Syst. Theory Appl. 13: 41-77.
    • (2003) Discret. Event Dyn. Syst. Theory Appl. , vol.13 , pp. 41-77
    • Barto, A.1    Mahadevan, S.2
  • 2
    • 0013535965
    • Infinite-horizon policy-gradient estimation
    • Baxter, J., and Bartlett, P. L. 2001. Infinite-horizon policy-gradient estimation. J. Artif. Intell. Res. 15: 319-350.
    • (2001) J. Artif. Intell. Res. , vol.15 , pp. 319-350
    • Baxter, J.1    Bartlett, P.L.2
  • 3
    • 0013495368
    • Experiments with infinite-horizon policy-gradient estimation
    • Baxter, J., Bartlett, P. L., and Weaver, L. 2001. Experiments with infinite-horizon policy-gradient estimation. J. Artif. Intell. Res. 15: 351-381.
    • (2001) J. Artif. Intell. Res. , vol.15 , pp. 351-381
    • Baxter, J.1    Bartlett, P.L.2    Weaver, L.3
  • 6
    • 0032027940
    • The relation among potentials, perturbation analysis, Markov decision processes, and other topics
    • Cao, X. R. 1998. The relation among potentials, perturbation analysis, Markov decision processes, and other topics. J. Discret. Event Dyn. Syst. 8: 71-87.
    • (1998) J. Discret. Event Dyn. Syst. , vol.8 , pp. 71-87
    • Cao, X.R.1
  • 7
    • 0033247533
    • Single sample path based optimization of Markov chains
    • Cao, X. R. 1999. Single sample path based optimization of Markov chains. J. Optim. Theory Appl. 100(3): 527-548.
    • (1999) J. Optim. Theory Appl. , vol.100 , Issue.3 , pp. 527-548
    • Cao, X.R.1
  • 8
    • 0033884215
    • A unified approach to Markov decision problems and performance sensitivity analysis
    • Cao, X. R. 2000. A unified approach to Markov decision problems and performance sensitivity analysis. Automatica 36: 771-774.
    • (2000) Automatica , vol.36 , pp. 771-774
    • Cao, X.R.1
  • 9
    • 11044222936
    • The potential structure of sample paths and performance sensitivities of Markov systems
    • Cao, X. R. 2004a. The potential structure of sample paths and performance sensitivities of Markov systems. IEEE Trans. Automat. Contr. 49: 2129-2142.
    • (2004) IEEE Trans. Automat. Contr. , vol.49 , pp. 2129-2142
    • Cao, X.R.1
  • 10
    • 14644391675
    • A basic formula for on-line policy gradient algorithms
    • to appear
    • Cao, X. R. 2004b. A basic formula for on-line policy gradient algorithms. IEEE Trans. Automat. Contr. to appear.
    • (2004) IEEE Trans. Automat. Contr
    • Cao, X.R.1
  • 12
    • 0031258478
    • Perturbation realization, potentials and sensitivity analysis of Markov processes
    • Cao, X. R., and Chen, H. F. 1997. Perturbation realization, potentials and sensitivity analysis of Markov processes. IEEE Trans. Automat. Contr. 42: 1382-1393.
    • (1997) IEEE Trans. Automat. Contr. , vol.42 , pp. 1382-1393
    • Cao, X.R.1    Chen, H.F.2
  • 13
    • 3843150404
    • A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: Multichain cases
    • Cao, X. R., and Guo, X. 2004. A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: Multichain cases. Automatica 40: 1749-1759.
    • (2004) Automatica , vol.40 , pp. 1749-1759
    • Cao, X.R.1    Guo, X.2
  • 14
    • 0032122986
    • Algorithms for sensitivity analysis of Markov systems through potentials and perturbation realization
    • Cao, X. R., and Wan, Y. W. 1998. Algorithms for sensitivity analysis of Markov systems through potentials and perturbation realization. IEEE Trans. Control Syst. Technol. 6: 482-494.
    • (1998) IEEE Trans. Control Syst. Technol. , vol.6 , pp. 482-494
    • Cao, X.R.1    Wan, Y.W.2
  • 15
    • 0030409198
    • A single sample path-based performance sensitivity formula for Markov chains
    • Cao, X. R., Yuan, X. M., and Qiu, L. 1996. A single sample path-based performance sensitivity formula for Markov chains. IEEE Trans. Automat. Contr. 41: 1814-1817.
    • (1996) IEEE Trans. Automat. Contr. , vol.41 , pp. 1814-1817
    • Cao, X.R.1    Yuan, X.M.2    Qiu, L.3
  • 16
    • 0036604532
    • A time aggregation approach to Markov decision processes
    • Cao, X. R., Ren, Z. Y., Bhatnagar, S., Fu, M., and Marcus, S. 2002. A time aggregation approach to Markov decision processes. Automatica 38: 929-943.
    • (2002) Automatica , vol.38 , pp. 929-943
    • Cao, X.R.1    Ren, Z.Y.2    Bhatnagar, S.3    Fu, M.4    Marcus, S.5
  • 17
    • 0028466316
    • Stochastic optimization of regenerative systems using infinitesimal perturbation analysis
    • Chong, E. K. P., and Ramadge, P. J. 1994. Stochastic optimization of regenerative systems using infinitesimal perturbation analysis. IEEE Trans. Automat. Contr. 39: 1400-1410.
    • (1994) IEEE Trans. Automat. Contr. , vol.39 , pp. 1400-1410
    • Chong, E.K.P.1    Ramadge, P.J.2
  • 20
    • 2442614974
    • Potential-based on-line policy iteration algorithms for Markov decision processes
    • Fang, H. T., and Cao, X. R. 2004. Potential-based on-line policy iteration algorithms for Markov decision processes. IEEE Trans. Automat. Contr. 49: 493-505.
    • (2004) IEEE Trans. Automat. Contr. , vol.49 , pp. 493-505
    • Fang, H.T.1    Cao, X.R.2
  • 21
    • 0020802518
    • Perturbation analysis and optimization of queueing networks
    • Ho, Y. C., and Cao, X. R. 1983. Perturbation analysis and optimization of queueing networks. J. Optim. Theory Appl. 40(4): 559-582.
    • (1983) J. Optim. Theory Appl. , vol.40 , Issue.4 , pp. 559-582
    • Ho, Y.C.1    Cao, X.R.2
  • 23
    • 0037955677
    • The no free lunch theorem, complexity and computer security
    • Ho, Y. C., Zhao, Q. C., and Pepyne, D. L. 2003. The no free lunch theorem, complexity and computer security. IEEE Trans. Automat. Contr. 48: 783-793.
    • (2003) IEEE Trans. Automat. Contr. , vol.48 , pp. 783-793
    • Ho, Y.C.1    Zhao, Q.C.2    Pepyne, D.L.3
  • 24
    • 0035249254
    • Simulation-based optimization of Markov reward processes
    • Marbach, P., and Tsitsiklis, J. N. 2001. Simulation-based optimization of Markov reward processes. IEEE Trans. Automat. Contr. 46: 191-209.
    • (2001) IEEE Trans. Automat. Contr. , vol.46 , pp. 191-209
    • Marbach, P.1    Tsitsiklis, J.N.2
  • 27
    • 0024621270
    • Single run optimization of discrete event simulations - An empirical study using the M/M/1 queue
    • Suri, R., and Leung, Y. T. 1989. Single run optimization of discrete event simulations - An empirical study using the M/M/1 queue. IIE Trans. 21: 35-49.
    • (1989) IIE Trans. , vol.21 , pp. 35-49
    • Suri, R.1    Leung, Y.T.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.