Volume 9, Issue 3, 2011, Pages 421-430

Semi-Markov adaptive critic heuristics with application to airline revenue management

Author keywords

Actor critics; Adaptive critics; Approximate dynamic programming; Reinforcement learning; Semi Markov

Indexed keywords

ACTOR CRITIC; ADAPTIVE CRITIC; AIRLINE INDUSTRY; AIRLINE TICKETS; APPROXIMATE DYNAMIC PROGRAMMING; AVERAGE REWARD; AVERAGE REWARD CRITERIA; CLASSICAL METHODS; CONVERGENCE ANALYSIS; CURSE OF DIMENSIONALITY; DISCOUNTED REWARD; MAINTENANCE MANAGEMENT; MARKOV DECISION PROCESSES; NUMERICAL RESULTS; OPTIMAL SOLUTIONS; REAL-WORLD PROBLEM; REVENUE MANAGEMENT; SEMI-MARKOV; SEMI-MARKOV DECISION PROCESS; TIME SPENT

EID: 79960466561     PISSN: 16726340     EISSN: 10008152     Source Type: Journal    
DOI: 10.1007/s11768-011-0161-9     Document Type: Article
Times cited : (10)

References (37)
  • 3
    • 0000985504
    • TD-Gammon, a self-teaching backgammon program, achieves master-level play
    • G. Tesauro. TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 1994, 6(2): 215-219.
    • (1994) Neural Computation , vol.6 , Issue.2 , pp. 215-219
    • Tesauro, G.1
  • 4
    • 35349027192
    • Application of reinforcement learning to the game of Othello
    • N. J. V. Eck, M. W. Wezel. Application of reinforcement learning to the game of Othello. Computers and Operations Research, 2008, 35(6): 1999-2017.
    • (2008) Computers and Operations Research , vol.35 , Issue.6 , pp. 1999-2017
    • Eck, N.J.V.1    Wezel, M.W.2
  • 5
    • 0003787146
    • Princeton: Princeton University Press
    • R. E. Bellman. Dynamic Programming. Princeton: Princeton University Press, 1957.
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 13
    • 0023169119
    • Building and understanding adaptive systems: a statistical/numerical approach to factory automation and brain research
    • P. J. Werbos. Building and understanding adaptive systems: a statistical/numerical approach to factory automation and brain research. IEEE Transactions on Systems, Man, and Cybernetics, 1987, 17(1): 7-20.
    • (1987) IEEE Transactions on Systems, Man, and Cybernetics , vol.17 , Issue.1 , pp. 7-20
    • Werbos, P.J.1
  • 14
    • 0036565019
    • Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neuro-control of a turbogenerator
    • G. Venayagamoorthy, R. Harley, D. Wunsch. Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neuro-control of a turbogenerator. IEEE Transactions on Neural Networks, 2002, 13(3): 764-773.
    • (2002) IEEE Transactions on Neural Networks , vol.13 , Issue.3 , pp. 764-773
    • Venayagamoorthy, G.1    Harley, R.2    Wunsch, D.3
  • 15
    • 67650170605
    • Neuronlike adaptive elements that can solve difficult learning control problems
    • Piscataway: New York
    • A. G. Barto, R. S. Sutton, C. W. Anderson. Neuronlike adaptive elements that can solve difficult learning control problems. Artificial Neural Networks, Piscataway: New York, 1990: 81-93.
    • (1990) Artificial Neural Networks , pp. 81-93
    • Barto, A.G.1    Sutton, R.S.2    Anderson, C.W.3
  • 16
    • 0343893613
    • Actor-critic type learning algorithms for Markov decision processes
    • V. R. Konda, V. S. Borkar. Actor-critic type learning algorithms for Markov decision processes. SIAM Journal on Control and Optimization, 1999, 38(1): 94-123.
    • (1999) SIAM Journal on Control and Optimization , vol.38 , Issue.1 , pp. 94-123
    • Konda, V.R.1    Borkar, V.S.2
  • 19
    • 70349116541
    • Reinforcement learning and adaptive dynamic programming for feedback control
    • F. L. Lewis, D. Vrabie. Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, 2009, 9(3): 32-50.
    • (2009) IEEE Circuits and Systems Magazine , vol.9 , Issue.3 , pp. 32-50
    • Lewis, F.L.1    Vrabie, D.2
  • 22
    • 67649964731
    • Reinforcement learning: a tutorial survey and recent advances
    • A. Gosavi. Reinforcement learning: a tutorial survey and recent advances. INFORMS Journal on Computing, 2009, 21(2): 178-192.
    • (2009) INFORMS Journal on Computing , vol.21 , Issue.2 , pp. 178-192
    • Gosavi, A.1
  • 26
    • 33751077547
    • A policy-gradient method for semi-Markov decision processes with application to call admission control
    • S. Singh, V. Tadic, A. Doucet. A policy-gradient method for semi-Markov decision processes with application to call admission control. European Journal of Operational Research, 2007, 178(3): 808-818.
    • (2007) European Journal of Operational Research , vol.178 , Issue.3 , pp. 808-818
    • Singh, S.1    Tadic, V.2    Doucet, A.3
  • 28
    • 0031076413
    • Stochastic approximation with two-time scales
    • V. S. Borkar. Stochastic approximation with two-time scales. Systems & Control Letters, 1997, 29(5): 291-294.
    • (1997) Systems & Control Letters , vol.29 , Issue.5 , pp. 291-294
    • Borkar, V.S.1
  • 30
  • 31
    • 0024629453
    • Application of a probabilistic decision model to airline seat inventory control
    • P. P. Belobaba. Application of a probabilistic decision model to airline seat inventory control. Operations Research, 1989, 37(2): 183-197.
    • (1989) Operations Research , vol.37 , Issue.2 , pp. 183-197
    • Belobaba, P.P.1
  • 32
    • 0032642848
    • Revenue management: Research overview and prospects
    • J. I. McGill, G. J. van Ryzin. Revenue management: Research overview and prospects. Transportation Science, 1999, 33(2): 233-256.
    • (1999) Transportation Science , vol.33 , Issue.2 , pp. 233-256
    • McGill, J.I.1    van Ryzin, G.J.2
  • 33
    • 41549145624
    • An overview of research on revenue management: current issues and future research
    • W. C. Chiang, J. C. H. Chen, X. Xu. An overview of research on revenue management: current issues and future research. International Journal of Revenue Management, 2007, 1(1): 97-128.
    • (2007) International Journal of Revenue Management , vol.1 , Issue.1 , pp. 97-128
    • Chiang, W.C.1    Chen, J.C.H.2    Xu, X.3
  • 36
    • 0036722536
    • A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking
    • A. Gosavi, N. Bandla, T. K. Das. A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking. IIE Transactions, 2002, 34(9): 729-742.
    • (2002) IIE Transactions , vol.34 , Issue.9 , pp. 729-742
    • Gosavi, A.1    Bandla, N.2    Das, T.K.3
  • 37
    • 2342446663
    • A reinforcement learning algorithm based on policy iteration for average reward: empirical results with yield management and convergence analysis
    • A. Gosavi. A reinforcement learning algorithm based on policy iteration for average reward: empirical results with yield management and convergence analysis. Machine Learning, 2004, 55(1): 5-29.
    • (2004) Machine Learning , vol.55 , Issue.1 , pp. 5-29
    • Gosavi, A.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.