SCOPUS Information Search Platform

Journal of Control Theory and Applications

Volume 9, Issue 3, 2011, Pages 421-430

Semi-Markov adaptive critic heuristics with application to airline revenue management

(4) Kulkarni, Ketaki (a); Gosavi, Abhijit (a); Murray, Susan (a); Grantham, Katie (a)

(a) Architectural and Environmental Engineering (United States)

Author keywords

Actor critics; Adaptive critics; Approximate dynamic programming; Reinforcement learning; Semi Markov

Indexed keywords

ACTOR CRITIC; ADAPTIVE CRITIC; AIRLINE INDUSTRY; AIRLINE TICKETS; APPROXIMATE DYNAMIC PROGRAMMING; AVERAGE REWARD; AVERAGE REWARD CRITERIA; CLASSICAL METHODS; CONVERGENCE ANALYSIS; CURSE OF DIMENSIONALITY; DISCOUNTED REWARD; MAINTENANCE MANAGEMENT; MARKOV DECISION PROCESSES; NUMERICAL RESULTS; OPTIMAL SOLUTIONS; REAL-WORLD PROBLEM; REVENUE MANAGEMENT; SEMI-MARKOV; SEMI-MARKOV DECISION PROCESS; TIME SPENT;

AIR TRANSPORTATION; ALGORITHMS; COMMERCE; CONVERGENCE OF NUMERICAL METHODS; HEURISTIC PROGRAMMING; MAINTENANCE; MANAGEMENT; MARKOV PROCESSES; RANDOM VARIABLES; REAL VARIABLES; REINFORCEMENT LEARNING; SUPPLY CHAIN MANAGEMENT;

DYNAMIC PROGRAMMING;

EID: 79960466561; PISSN: 16726340; EISSN: 10008152; Source Type: Journal
DOI: 10.1007/s11768-011-0161-9; Document Type: Article

Times cited: (10)

References (37)

1
- 0008556523
- The theory of dynamic programming
- R. Bellman. The theory of dynamic programming. Proceedings of the National Academy of Sciences, 1952, 38(8): 716-719.
- (1952) Proceedings of the National Academy of Sciences , vol.38 , Issue.8 , pp. 716-719
- Bellman, R.¹

2
- 84864030941
- An application of reinforcement learning to aerobatic helicopter flight
- Cambridge: MIT Press
- P. Abbeel, A. Coates, M. Quigley, et al. An application of reinforcement learning to aerobatic helicopter flight. Advances in Neural Information Processing Systems 19, Cambridge: MIT Press, 2006: 1-8.
- (2006) Advances in Neural Information Processing Systems 19 , pp. 1-8
- Abbeel, P.¹ Coates, A.² Quigley, M.³

3
- 0000985504
- TD-Gammon, a self-teaching backgammon program, achieves master-level play
- G. Tesauro. TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 1994, 6(2): 215-219.
- (1994) Neural Computation , vol.6 , Issue.2 , pp. 215-219
- Tesauro, G.¹

4
- 35349027192
- Application of reinforcement learning to the game of Othello
- N. J. V. Eck, M. W. Wezel. Application of reinforcement learning to the game of Othello. Computers and Operations Research, 2008, 35(6): 1999-2017.
- (2008) Computers and Operations Research , vol.35 , Issue.6 , pp. 1999-2017
- Eck, N.J.V.¹ Wezel, M.W.²

5
- 0003787146
- Princeton: Princeton University Press
- R. E. Bellman. Dynamic Programming. Princeton: Princeton University Press, 1957.
- (1957) Dynamic Programming
- Bellman, R.E.¹

6
- 0003644124
- Cambridge: MIT Press
- R. Howard. Dynamic Programming and Markov Processes. Cambridge: MIT Press, 1960.
- (1960) Dynamic Programming and Markov Processes
- Howard, R.¹

7
- 0003487482
- Belmont: Athena Scientific
- D. P. Bertsekas, J. N. Tsitsiklis. Neuro-dynamic Programming. Belmont: Athena Scientific, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

8
- 0004102479
- Cambridge: MIT Press
- R. Sutton, A. G. Barto. Reinforcement Learning: An Introduction. Cambridge: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.¹ Barto, A.G.²

9
- 84888630832
- Boston: Kluwer Academic
- A. Gosavi. Simulation-based Optimization: Parametric Optimization Techniques and Reinforcement Learning. Boston: Kluwer Academic, 2003.
- (2003) Simulation-Based Optimization: Parametric Optimization Techniques and Reinforcement Learning
- Gosavi, A.¹

10
- 79951542055
- Adaptive critics for airline revenue management
- Dallas, TX
- A. Gosavi. Adaptive critics for airline revenue management. Proceedings of the Production and Operations Management Society, Dallas, TX, 2007: 7-58.
- (2007) Proceedings of the Production and Operations Management Society , pp. 7-58
- Gosavi, A.¹

11
- 0003644137
- New York: Dover Publications
- S. Ross. Applied Probability Models with Optimization Applications. New York: Dover Publications, 1992.
- (1992) Applied Probability Models with Optimization Applications
- Ross, S.¹

12
- 0003565783
- 2nd ed., Belmont: Athena Scientific
- D. P. Bertsekas. Dynamic Programming and Optimal Control. 2nd ed. Belmont: Athena Scientific, 2000.
- (2000) Dynamic Programming and Optimal Control
- Bertsekas, D.P.¹

13
- 0023169119
- Building and understanding adaptive systems: a statistical/numerical approach to factory automation and brain research
- P. J. Werbos. Building and understanding adaptive systems: a statistical/numerical approach to factory automation and brain research. IEEE Transactions on Systems, Man, and Cybernetics, 1987, 17(1): 7-20.
- (1987) IEEE Transactions on Systems, Man, and Cybernetics , vol.17 , Issue.1 , pp. 7-20
- Werbos, P.J.¹

14
- 0036565019
- Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neuro-control of a turbogenerator
- G. Venayagamoorthy, R. Harley, D. Wunsch. Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neuro-control of a turbogenerator. IEEE Transactions on Neural Networks, 2002, 13(3): 764-773.
- (2002) IEEE Transactions on Neural Networks , vol.13 , Issue.3 , pp. 764-773
- Venayagamoorthy, G.¹ Harley, R.² Wunsch, D.³

15
- 67650170605
- Neuronlike adaptive elements that can solve difficult learning control problems
- Piscataway: New York
- A. G. Barto, R. S. Sutton, C. W. Anderson. Neuronlike adaptive elements that can solve difficult learning control problems. Artificial Neural Networks, Piscataway: New York, 1990: 81-93.
- (1990) Artificial Neural Networks , pp. 81-93
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

16
- 0343893613
- Actor-critic type learning algorithms for Markov decision processes
- V. R. Konda, V. S. Borkar. Actor-critic type learning algorithms for Markov decision processes. SIAM Journal on Control and Optimization, 1999, 38(1): 94-123.
- (1999) SIAM Journal on Control and Optimization , vol.38 , Issue.1 , pp. 94-123
- Konda, V.R.¹ Borkar, V.S.²

17
- 0004049893
- Cambridge, U.K.: King's College
- C. J. Watkins. Learning from Delayed Rewards. Ph.D. thesis. Cambridge, U.K.: King's College, 1989.
- (1989) Learning from Delayed Rewards
- Watkins, C.J.¹

18
- 0003636089
- Cambridge: Engineering Department, Cambridge University
- G. A. Rummery, M. Niranjan. On-line Q-learning Using Connectionist Systems. Report CUED/F-INFENG/TR 166. Cambridge: Engineering Department, Cambridge University, 1994.
- (1994) On-Line Q-Learning Using Connectionist Systems
- Rummery, G.A.¹ Niranjan, M.²

19
- 70349116541
- Reinforcement learning and adaptive dynamic programming for feedback control
- F. L. Lewis, D. Vrabie. Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, 2009, 9(3): 32-50.
- (2009) IEEE Circuits and Systems Magazine , vol.9 , Issue.3 , pp. 32-50
- Lewis, F.L.¹ Vrabie, D.²

20
- 49049111594
- Issues on stability of ADP feedback controllers for dynamical systems
- S. N. Balakrishnan, J. Ding, F. L. Lewis. Issues on stability of ADP feedback controllers for dynamical systems. IEEE Transactions on Systems, Man and Cybernetics (Special Issue on ADP/RL), 2008, 38(4): 913-917.
- (2008) IEEE Transactions on Systems, Man and Cybernetics (Special Issue on ADP/RL) , vol.38 , Issue.4 , pp. 913-917
- Balakrishnan, S.N.¹ Ding, J.² Lewis, F.L.³

21
- 77955790905
- San Rafael, CA: Morgan & Claypool Publishers
- C. Szepesvari. Algorithms for Reinforcement Learning (Synthesis Lectures on Artificial Intelligence and Machine Learning). San Rafael, CA: Morgan & Claypool Publishers, 2010.
- (2010) Algorithms for Reinforcement Learning (Synthesis Lectures on Artificial Intelligence and Machine Learning)
- Szepesvari, C.¹

22
- 67649964731
- Reinforcement learning: a tutorial survey and recent advances
- A. Gosavi. Reinforcement learning: a tutorial survey and recent advances. INFORMS Journal on Computing, 2009, 21(2): 178-192.
- (2009) INFORMS Journal on Computing , vol.21 , Issue.2 , pp. 178-192
- Gosavi, A.¹

23
- 77957772128
- Optimal control of affine nonlinear discrete-time systems
- New York: IEEE
- T. Dierks, S. Jagannathan. Optimal control of affine nonlinear discrete-time systems. Proceedings of the IEEE Mediterranean Conference on Control and Automation, New York: IEEE, 2009: 1390-1395.
- (2009) Proceedings of the IEEE Mediterranean Conference on Control and Automation , pp. 1390-1395
- Dierks, T.¹ Jagannathan, S.²

24
- 0034863083
- Action-dependent adaptive critic designs
- New York: IEEE
- D. Liu, X. Xiong, Y. Zhang. Action-dependent adaptive critic designs. Proceedings of INNS-IEEE International Joint Conference Neural Networks, New York: IEEE, 2001: 990-995.
- (2001) Proceedings of INNS-IEEE International Joint Conference Neural Networks , pp. 990-995
- Liu, D.¹ Xiong, X.² Zhang, Y.³

25
- 0013535965
- Infinite-horizon policy-gradient estimation
- J. Baxter, P. Bartlett. Infinite-horizon policy-gradient estimation. Journal of Artificial Intelligence Research, 2001, 15(1): 319-350.
- (2001) Journal of Artificial Intelligence Research , vol.15 , Issue.1 , pp. 319-350
- Baxter, J.¹ Bartlett, P.²

26
- 33751077547
- A policy-gradient method for semi-Markov decision processes with application to call admission control
- S. Singh, V. Tadic, A. Doucet. A policy-gradient method for semi-Markov decision processes with application to call admission control. European Journal of Operational Research, 2007, 178(3): 808-818.
- (2007) European Journal of Operational Research , vol.178 , Issue.3 , pp. 808-818
- Singh, S.¹ Tadic, V.² Doucet, A.³

27
- 0003452601
- New York: Springer-Verlag
- H. J. Kushner, D. S. Clark. Stochastic Approximation Methods for Constrained and Unconstrained Systems. New York: Springer-Verlag, 1978.
- (1978) Stochastic Approximation Methods for Constrained and Unconstrained Systems
- Kushner, H.J.¹ Clark, D.S.²

28
- 0031076413
- Stochastic approximation with two-time scales
- V. S. Borkar. Stochastic approximation with two-time scales. Systems & Control Letters, 1997, 29(5): 291-294.
- (1997) Systems & Control Letters , vol.29 , Issue.5 , pp. 291-294
- Borkar, V.S.¹

29
- 0036287773
- Learning algorithms for Markov decision processes with average cost
- J. Abounadi, D. P. Bertsekas, V. Borkar. Learning algorithms for Markov decision processes with average cost. SIAM Journal on Control and Optimization, 2001, 40(3): 681-698.
- (2001) SIAM Journal on Control and Optimization , vol.40 , Issue.3 , pp. 681-698
- Abounadi, J.¹ Bertsekas, D.P.² Borkar, V.³

30
- 64049092199
- Forecasting and control of passenger bookings
- K. Littlewood. Forecasting and control of passenger bookings. Journal of Revenue and Pricing Management, 2005, 4(2): 111-123.
- (2005) Journal of Revenue and Pricing Management , vol.4 , Issue.2 , pp. 111-123
- Littlewood, K.¹

31
- 0024629453
- Application of a probabilistic decision model to airline seat inventory control
- P. P. Belobaba. Application of a probabilistic decision model to airline seat inventory control. Operations Research, 1989, 37(2): 183-197.
- (1989) Operations Research , vol.37 , Issue.2 , pp. 183-197
- Belobaba, P.P.¹

32
- 0032642848
- Revenue management: Research overview and prospects
- J. I. McGill, G. J. van Ryzin. Revenue management: Research overview and prospects. Transportation Science, 1999, 33(2): 233-256.
- (1999) Transportation Science , vol.33 , Issue.2 , pp. 233-256
- McGill, J.I.¹ van Ryzin, G.J.²

33
- 41549145624
- An overview of research on revenue management: current issues and future research
- W. C. Chiang, J. C. H. Chen, X. Xu. An overview of research on revenue management: current issues and future research. International Journal of Revenue Management, 2007, 1(1): 97-128.
- (2007) International Journal of Revenue Management , vol.1 , Issue.1 , pp. 97-128
- Chiang, W.C.¹ Chen, J.C.H.² Xu, X.³

34
- 0242622203
- Boston: Kluwer Academic
- K. Talluri, G. van Ryzin. The Theory and Practice of Revenue Management. Boston: Kluwer Academic, 2004.
- (2004) The Theory and Practice of Revenue Management
- Talluri, K.¹ van Ryzin, G.²

35
- 34247532774
- Stanford: Stanford Business Books
- R. Phillips. Pricing and Revenue Optimization. Stanford: Stanford Business Books, 2005.
- (2005) Pricing and Revenue Optimization
- Phillips, R.¹

36
- 0036722536
- A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking
- A. Gosavi, N. Bandla, T. K. Das. A reinforcement learning approach to a single leg airline revenue management problem with multiple fare classes and overbooking. IIE Transactions, 2002, 34(9): 729-742.
- (2002) IIE Transactions , vol.34 , Issue.9 , pp. 729-742
- Gosavi, A.¹ Bandla, N.² Das, T.K.³

37
- 2342446663
- A reinforcement learning algorithm based on policy iteration for average reward: empirical results with yield management and convergence analysis
- A. Gosavi. A reinforcement learning algorithm based on policy iteration for average reward: empirical results with yield management and convergence analysis. Machine Learning, 2004, 55(1): 5-29.
- (2004) Machine Learning , vol.55 , Issue.1 , pp. 5-29
- Gosavi, A.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.