SCOPUS Information Search Platform

Journal of the Operational Research Society

Volume 63, Issue 8, 2012, Pages 1165-1173

Comparing reinforcement learning approaches for solving game theoretic models: A dynamic airline pricing game example

Authors (2): Collins, A. (a), Thomas, L. (b)

a OLD DOMINION UNIVERSITY (United States)

b UNIVERSITY OF SOUTHAMPTON (United Kingdom)

Author keywords

air transport; artificial intelligence; game theory; reinforcement learning

Indexed keywords

ARTIFICIAL INTELLIGENCE; COSTS; DECISION MAKING; GAME THEORY; MACHINE LEARNING;

AIR TRANSPORT; AIRLINE PRICING; ARTIFICIAL INTELLIGENCE TECHNIQUES; DECISION MAKERS; GAME-THEORETIC MODEL; NASH EQUILIBRIA; PRICING GAMES; REINFORCEMENT LEARNING APPROACH;

REINFORCEMENT LEARNING;

EID: 84863425031 PISSN: 01605682 EISSN: 14769360 Source Type: Journal
DOI: 10.1057/jors.2011.94 Document Type: Article

Times cited : (16)

References (41)

1
- 40249113497
- An overview of the issues in the airline industry and the role of optimization models and algorithms
- DOI 10.1057/palgrave.jors.2602350, PII 2602350
- Ahmed AH and Poojari CA (2008). An overview of the issues in the airline industry and the role of optimization models and algorithms. J Opl Res Soc 59: 267-277. (Pubitemid 351334690)
- (2008) Journal of the Operational Research Society , vol.59 , Issue.3 , pp. 267-277
- Ahmed, A.H.¹ Poojari, C.A.²

2
- 0003586302
- Princeton University Press: Princeton
- Axelrod R (1997). The Complexity of Cooperation: Agent-based Models of Competition and Collaboration. Princeton University Press: Princeton.
- (1997) The Complexity of Cooperation: Agent-based Models of Competition and Collaboration
- Axelrod, R.¹

3
- 0008556523
- On the theory of dynamic programming
- Bellman R (1952). On the theory of dynamic programming. Proc Natl Acad Sci USA 38: 716-719.
- (1952) Proc Natl Acad Sci USA , vol.38 , pp. 716-719
- Bellman, R.¹

4
- 0003487482
- Athena Scientific: Belmont
- Bertsekas DP and Tsitsiklis JN (1996). Neuro-dynamic Programming. Athena Scientific: Belmont.
- (1996) Neuro-dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

5
- 0003476737
- Basil Blackwell: Oxford
- Binmore K (1990). Essays on the Foundations of Game Theory. Basil Blackwell: Oxford.
- (1990) Essays on the Foundations of Game Theory
- Binmore, K.¹

6
- 0004106775
- Houghton Mifflin: Lexington
- Binmore K (1992). Fun and Games: A Text on Game Theory. Houghton Mifflin: Lexington.
- (1992) Fun and Games: A Text on Game Theory
- Binmore, K.¹

7
- 84863443679
- Perspectives on the future of pricing
- Presented at Barcelona, Spain
- Boyd EA (2007). Perspectives on the future of pricing. Presented at 7th Annual INFORMS Pricing and Revenue Management Conference, Barcelona, Spain.
- (2007) 7th Annual INFORMS Pricing and Revenue Management Conference
- Boyd, E.A.¹

8
- 0001699291
- Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimation of parameters
- In: Touretzky DS (ed) Morgan Kaufmann: San Mateo, USA
- Bridle JS (1990). Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimation of parameters. In: Touretzky DS (ed). Advances in Neural Information Processing Systems: Proceedings of the 1989 Conference. Morgan Kaufmann: San Mateo, USA, pp 211-217.
- (1990) Advances in Neural Information Processing Systems: Proceedings of the 1989 Conference , pp. 211-217
- Bridle, J.S.¹

9
- 34247158133
- Drama theory: Dispelling the myths
- Bryant JW (2007). Drama theory: Dispelling the myths. J Opl Res Soc 58: 602-613.
- (2007) J Opl Res Soc , vol.58 , pp. 602-613
- Bryant, J.W.¹

10
- 33748701543
- Settling the complexity of 2-player Nash-Equilibrium
- (TR05), accessed 1 June 2011
- Chen X and Deng X (2005). Settling the complexity of 2-player Nash-Equilibrium. In Electronic Colloquium on Computational Complexity, 140(TR05), http://eccc.hpi-web.de/report/2005/140/, accessed 1 June 2011.
- (2005) Electronic Colloquium on Computational Complexity , vol.140
- Chen, X.¹ Deng, X.²

11
- 37149044142
- Agent-based modelling and simulation of urban evacuation: Relative effectiveness of simultaneous and staged evacuation strategies
- DOI 10.1057/palgrave.jors.2602321, PII 2602321
- Chen X and Zhan FB (2008). Agent-based modelling and simulation of urban evacuation: Relative effectiveness of simultaneous and staged evacuation strategies. J Opl Res Soc 59: 25-33. (Pubitemid 350261770)
- (2008) Journal of the Operational Research Society , vol.59 , Issue.1 , pp. 25-33
- Chen, X.¹ Zhan, F.B.²

12
- 84863492191
- Ph.D. thesis, University of Southampton
- Collins AJ (2009). Evaluating reinforcement learning for game theory application: Learning to price airline seats under competition. Ph.D. thesis, University of Southampton.
- (2009) Evaluating Reinforcement Learning for Game Theory Application: Learning to Price Airline Seats Under Competition
- Collins, A.J.¹

13
- 84863474694
- Dstl/CR07880, Defence Science and Technology Laboratories, Ministry of Defence, United Kingdom (UNCLASSIFIED)
- Collins AJ, Pullum F and Kenyon L (2003). Applications of game theory in defence project: Year one report. Dstl/CR07880, Defence Science and Technology Laboratories, Ministry of Defence, United Kingdom (UNCLASSIFIED).
- (2003) Applications of Game Theory in Defence Project: Year One Report
- Collins, A.J.¹ Pullum, F.² Kenyon, L.³

14
- 47049125492
- Dynamic pricing of airline tickets with competition
- Currie C, Cheng RCH and Smith HK (2008). Dynamic pricing of airline tickets with competition. J Opl Res Soc 59: 1026-1037.
- (2008) J Opl Res Soc , vol.59 , pp. 1026-1037
- Currie, C.¹ Cheng, R.C.H.² Smith, H.K.³

15
- 0342751709
- Macmillan Press: London
- Eatwell J, Milgate M and Newman P (eds) (1987). The New Palgrave: Game Theory. Macmillan Press: London.
- (1987) The New Palgrave: Game Theory
- Eatwell, J.¹ Milgate, M.² Newman, P.³

16
- 0004247096
- MIT Press: London
- Fudenberg D and Levine DK (1998). The Theory of Learning in Games. MIT Press: London.
- (1998) The Theory of Learning in Games
- Fudenberg, D.¹ Levine, D.K.²

17
- 84888630832
- Operations Research/Computer Science Interfaces Series. Kluwer Academic Publishers: London
- Gosavi A (2003). Simulation-based Optimization: Parametric Optimization Techniques and Reinforcement Learning. Operations Research/Computer Science Interfaces Series. Kluwer Academic Publishers: London.
- (2003) Simulation-based Optimization: Parametric Optimization Techniques and Reinforcement Learning
- Gosavi, A.¹

18
- 0003709994
- MIT Press: London
- Harsanyi JC and Selten R (1988). A General Theory of Equilibrium Selection in Games. MIT Press: London.
- (1988) A General Theory of Equilibrium Selection in Games
- Harsanyi, J.C.¹ Selten, R.²

19
- 34249076324
- Game theoretic analysis of the bargaining process over a long-term replenishment contract
- DOI 10.1057/palgrave.jors.2602183, PII 2602183
- Kim JS and Kwak TC (2007). Game theoretic analysis of the bargaining process over a long-term replenishment contract. J Opl Res Soc 58: 769-778. (Pubitemid 46782021)
- (2007) Journal of the Operational Research Society , vol.58 , Issue.6 , pp. 769-778
- Kim, J.S.¹ Kwak, T.C.²

20
- 0001869771
- Sulla determinazione empirica di una legge di distribuzione
- Kolmogorov AN (1933). Sulla determinazione empirica di una legge di distribuzione. Giornale dell'Istituto Italiano degli Attuari 4: 83-91.
- (1933) Giornale dell'Istituto Italiano degli Attuari , vol.4 , pp. 83-91
- Kolmogorov, A.N.¹

21
- 0000176346
- Equilibrium points of bimatrix games
- Lemke CE and Howson JJT (1964). Equilibrium points of bimatrix games. SIAM J Appl Math 12: 413-423.
- (1964) SIAM J Appl Math , vol.12 , pp. 413-423
- Lemke, C.E.¹ Howson, J.J.T.²

22
- 0035536032
- Learning: Association or computation? Introduction to a special section
- Leslie AM (2001). Learning: Association or computation? Introduction to a special section. Curr Dir Psychol Sci 10(4): 124-127. (Pubitemid 33388536)
- (2001) Current Directions in Psychological Science , vol.10 , Issue.4 , pp. 124-127
- Leslie, A.M.¹

23
- 33645029191
- Individual q-learning in normal form games
- Leslie DS and Collins EJ (2005). Individual q-learning in normal form games. SIAM J Control Optim 44: 495-514.
- (2005) SIAM J Control Optim , vol.44 , pp. 495-514
- Leslie, D.S.¹ Collins, E.J.²

24
- 0004005973
- Wiley: Oxford
- Luce RD (1959). Individual Choice Behaviour. Wiley: Oxford.
- (1959) Individual Choice Behaviour
- Luce, R.D.¹

25
- 34249048200
- Multi-agent learning for engineers
- DOI 10.1016/j.artint.2007.01.003, PII S0004370207000070, Foundations of Multi-Agent Learning
- Mannor S and Shamma J (2007). Multi-agent learning for engineers. Artificial Intelligence 171: 417-422. (Pubitemid 46802418)
- (2007) Artificial Intelligence , vol.171 , Issue.7 , pp. 417-422
- Mannor, S.¹ Shamma, J.S.²

26
- 0002974509
- The structure of random utility models
- Manski CF (1977). The structure of random utility models. Theory and Decision 8: 229-254.
- (1977) Theory and Decision , vol.8 , pp. 229-254
- Manski, C.F.¹

27
- 0742310651
- Version 0.2007.01.30, accessed 10 September 2008
- McKelvey RD, McLennan AM and Turocy TL (2007). Gambit: Software tools for Game Theory - Version 0.2007.01.30, http://econweb.tamu.edu/gambit, accessed 10 September 2008.
- (2007) Gambit: Software tools for Game Theory
- McKelvey, R.D.¹ McLennan, A.M.² Turocy, T.L.³

28
- 0000827179
- BOXES: An experiment in adaptive control
- In: Dale E and Michie D (eds) Oliver and Boyd: Edinburgh
- Michie D and Chambers RA (1968). BOXES: An experiment in adaptive control. In: Dale E and Michie D (eds). Machine Intelligence. Vol. 2, Oliver and Boyd: Edinburgh, pp 137-152.
- (1968) Machine Intelligence. , vol.2 , pp. 137-152
- Michie, D.¹ Chambers, R.A.²

29
- 0013500961
- Ph.D. dissertation, Princeton University
- Minsky ML (1954). Theory of neural-analog reinforcement systems and its application to the brain-model problem. Ph.D. dissertation, Princeton University.
- (1954) Theory of Neural-Analog Reinforcement Systems and its Application to the Brain-Model Problem
- Minsky, M.L.¹

30
- 0001730497
- Non-cooperative games
- Nash J (1951). Non-cooperative games. Ann Math 54: 286-295.
- (1951) Ann Math , vol.54 , pp. 286-295
- Nash, J.¹

31
- 84921065499
- Oxford University Press: Oxford
- North MJ and Macal CM (2007). Managing Business Complexity: Discovering Strategic Solutions with Agent-based Modeling and Simulation. Oxford University Press: Oxford.
- (2007) Managing Business Complexity: Discovering Strategic Solutions with Agent-based Modeling and Simulation
- North, M.J.¹ Macal, C.M.²

32
- 0004058626
- Wiley: Chichester
- Pidd M (1996). Tools for Thinking: Modelling in Management Science. Wiley: Chichester.
- (1996) Tools for Thinking: Modelling in Management Science
- Pidd, M.¹

33
- 0003677359
- Ph.D. thesis, Cambridge University
- Rummery GA (1995). Problem solving with reinforcement learning. Ph.D. thesis, Cambridge University.
- (1995) Problem Solving with Reinforcement Learning
- Rummery, G.A.¹

34
- 34247244139
- Multi-agent reinforcement learning: A critical survey
- Presented at the Washington DC, USA
- Shoham Y, Powers R and Grenager T (2004). Multi-agent reinforcement learning: A critical survey. Presented at the AAAI Fall Symposium on Artificial Multi-agent Learning. Washington DC, USA.
- (2004) AAAI Fall Symposium on Artificial Multi-agent Learning
- Shoham, Y.¹ Powers, R.² Grenager, T.³

35
- 0004102479
- MIT Press: London
- Sutton RS and Barto AG (1998). Reinforcement Learning: An Introduction. MIT Press: London.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

36
- 0242622203
- Springer-Verlag: New York
- Talluri KT and van Ryzin GJ (2004). Theory and Practice of Revenue Management. Springer-Verlag: New York.
- (2004) Theory and Practice of Revenue Management
- Talluri, K.T.¹ Van Ryzin, G.J.²

37
- 0003998491
- Hafner: Darien
- Thorndike EL (1911). Animal Intelligence. Hafner: Darien.
- (1911) Animal Intelligence
- Thorndike, E.L.¹

38
- 0004248022
- Julius Springer: Vienna
- von Stackelberg HF (1934). Marktform und Gleichgewicht. Julius Springer: Vienna.
- (1934) Marktform und Gleichgewicht
- Von Stackelberg, H.F.¹

39
- 67649370955
- Computing equilibria for two-person games
- In: Aumann RJ and Hart S (eds) Elsevier: Amsterdam
- von Stengel B (2002). Computing equilibria for two-person games. In: Aumann RJ and Hart S (eds). Handbook of Game Theory with Economic Applications. Vol. 3, Elsevier: Amsterdam, pp 1723-1759.
- (2002) Handbook of Game Theory with Economic Applications. , vol.3 , pp. 1723-1759
- Von Stengel, B.¹

40
- 0004049893
- Ph.D. thesis, Cambridge University
- Watkins CJCH (1989). Learning from delayed rewards. Ph.D. thesis, Cambridge University.
- (1989) Learning from Delayed Rewards
- Watkins, C.J.C.H.¹

41
- 78650408718
- Learning agents for the multi-mode project scheduling problem
- Wauters T, Verbeeck K, Vanden Berghe G and De Causmaecker P (2011). Learning agents for the multi-mode project scheduling problem. J Opl Res Soc 62: 281-290.
- (2011) J Opl Res Soc , vol.62 , pp. 281-290
- Wauters, T.¹ Verbeeck, K.² Vanden Berghe, G.³ De Causmaecker, P.⁴

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.