



Volume 41, Issue 2, 2009, Pages 158-167

A reinforcement learning algorithm for obtaining the Nash equilibrium of multi-player matrix games

Author keywords

Matrix games; Reinforcement learning; Restructured electricity markets; Nash equilibrium

Indexed keywords

ALGORITHMS; APPROXIMATION ALGORITHMS; COMMERCE; COMPETITION; ELECTRICITY; ELECTRONIC COMMERCE; GAME THEORY; LEARNING SYSTEMS; REINFORCEMENT; REINFORCEMENT LEARNING; SOLUTIONS; STOCHASTIC MODELS; TELECOMMUNICATION NETWORKS;

EID: 56749179073     PISSN: 0740817X     EISSN: 15458830     Source Type: Journal    
DOI: 10.1080/07408170802369417     Document Type: Article
Times cited: 18

References (37)
  • 1
    • 0033501158
    • Understanding how market power can arise in network competition: A game theoretic approach
    • Berry, C. A., Hobbs, B. F., Meroney, W. A., O'Neill, R. P. and Stewart Jr., W. R. (1999) Understanding how market power can arise in network competition: A game theoretic approach. Utilities Policy, 8, pp. 139-158.
    • (1999) Utilities Policy , vol.8 , pp. 139-158
    • Berry, C.A.1    Hobbs, B.F.2    Meroney, W.A.3    O'Neill, R.P.4    Stewart Jr., W.R.5
  • 3
    • 18744371204
    • Reinforcement learning in Markovian evolutionary games
    • Borkar, V. S. (2002) Reinforcement learning in Markovian evolutionary games. Advances in Complex Systems, 5:1, pp. 55-72.
    • (2002) Advances in Complex Systems , vol.5 , Issue.1 , pp. 55-72
    • Borkar, V.S.1
  • 4
    • 0031091534
    • Market power and strategic interaction in electricity networks
    • Cardell, J. B., Hitt, C. C. and Hogan, W. W. (1997) Market power and strategic interaction in electricity networks. Resource and Energy Economics, 19, pp. 109-137.
    • (1997) Resource and Energy Economics , vol.19 , pp. 109-137
    • Cardell, J.B.1    Hitt, C.C.2    Hogan, W.W.3
  • 5
    • 0036474461
    • An empirical study of applied game theory: Transmission constrained Cournot behavior
    • Cunningham, L. B., Baldick, R. and Baughman, M. L. (2002) An empirical study of applied game theory: Transmission constrained Cournot behavior. IEEE Transactions on Power Systems, 17:1, pp. 166-172.
    • (2002) IEEE Transactions on Power Systems , vol.17 , Issue.1 , pp. 166-172
    • Cunningham, L.B.1    Baldick, R.2    Baughman, M.L.3
  • 7
    • 0032643313
    • Solving semi-Markov decision problems using average reward reinforcement learning
    • Das, T. K., Gosavi, A., Mahadevan, S. and Marchalleck, N. (1999) Solving semi-Markov decision problems using average reward reinforcement learning. Management Science, 45:4, pp. 560-574.
    • (1999) Management Science , vol.45 , Issue.4 , pp. 560-574
    • Das, T.K.1    Gosavi, A.2    Mahadevan, S.3    Marchalleck, N.4
  • 8
    • 0031200040
    • Transaction analysis in deregulated power systems using game theory
    • Ferrero, R. W., Shahidehpour, M. and Ramesh, V. C. (1997) Transaction analysis in deregulated power systems using game theory. IEEE Transactions on Power Systems, 12:3, pp. 1340-1347.
    • (1997) IEEE Transactions on Power Systems , vol.12 , Issue.3 , pp. 1340-1347
    • Ferrero, R.W.1    Shahidehpour, M.2    Ramesh, V.C.3
  • 10
    • 0742319170
    • Reinforcement learning for long-run average cost
    • Gosavi, A. (2004) Reinforcement learning for long-run average cost. European Journal of Operational Research, 155, pp. 654-674.
    • (2004) European Journal of Operational Research , vol.155 , pp. 654-674
    • Gosavi, A.1
  • 11
    • 0036722536
    • A reinforcement learning approach to airline seat allocation for multiple fare classes with overbooking
    • Gosavi, A., Bandla, N. and Das, T. K. (2002) A reinforcement learning approach to airline seat allocation for multiple fare classes with overbooking. IIE Transactions, 34:9, pp. 729-742.
    • (2002) IIE Transactions , vol.34 , Issue.9 , pp. 729-742
    • Gosavi, A.1    Bandla, N.2    Das, T.K.3
  • 12
    • 0344704362
    • Finite dimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms and applications
    • Harker, P. T. and Pang, J. S. (1990) Finite dimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms and applications. Mathematical Programming, 48, pp. 161-220.
    • (1990) Mathematical Programming , vol.48 , pp. 161-220
    • Harker, P.T.1    Pang, J.S.2
  • 13
    • 0035340070
    • Linear complementarity models of Nash-Cournot competition in bilateral and POOLCO power markets
    • Hobbs, B. F. (2001) Linear complementarity models of Nash-Cournot competition in bilateral and POOLCO power markets. IEEE Transactions on Power Systems, 16:2, pp. 194-202.
    • (2001) IEEE Transactions on Power Systems , vol.16 , Issue.2 , pp. 194-202
    • Hobbs, B.F.1
  • 14
    • 0034186850
    • Strategic gaming analysis for electric power systems: An MPEC approach
    • Hobbs, B. F., Metzler, C. B. and Pang, J. S. (2000) Strategic gaming analysis for electric power systems: An MPEC approach. IEEE Transactions on Power Systems, 15:2, pp. 638-645.
    • (2000) IEEE Transactions on Power Systems , vol.15 , Issue.2 , pp. 638-645
    • Hobbs, B.F.1    Metzler, C.B.2    Pang, J.S.3
  • 15
    • 0000929496
    • Multiagent reinforcement learning: Theoretical framework and an algorithm
    • Presented at the Madison, WI
    • Hu, J. and Wellman, M. P. (1998) Multiagent reinforcement learning: theoretical framework and an algorithm. Presented at the 15th International Conference on Machine Learning Madison, WI
    • (1998) 15th International Conference on Machine Learning
    • Hu, J.1    Wellman, M.P.2
  • 16
    • 4644369748
    • Nash Q-learning for general-sum stochastic games
    • Hu, J. and Wellman, M. P. (2003) Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research, 4, pp. 1039-1069.
    • (2003) Journal of Machine Learning Research , vol.4 , pp. 1039-1069
    • Hu, J.1    Wellman, M.P.2
  • 18
    • 0344827154
    • Solving three-player games by the matrix approach with an application to an electric power market
    • Lee, K. H. and Baldick, R. (2003) Solving three-player games by the matrix approach with an application to an electric power market. IEEE Transactions on Power Systems, 18:4, pp. 1573-1580.
    • (2003) IEEE Transactions on Power Systems , vol.18 , Issue.4 , pp. 1573-1580
    • Lee, K.H.1    Baldick, R.2
  • 20
    • 56749086441
    • A reinforcement learning (Nash-r) algorithm for average reward irreducible stochastic games
    • Li, J., Ramachandran, K. and Das, T. K. (2007) A reinforcement learning (Nash-r) algorithm for average reward irreducible stochastic games. Journal of Machine Learning Research
    • (2007) Journal of Machine Learning Research
    • Li, J.1    Ramachandran, K.2    Das, T.K.3
  • 23
    • 0022149015
    • Computational experience in solving equilibrium models by a sequence of linear complementarity problems
    • Mathiesen, L. (1985) Computational experience in solving equilibrium models by a sequence of linear complementarity problems. Operations Research, 33:6, pp. 1225-1250.
    • (1985) Operations Research , vol.33 , Issue.6 , pp. 1225-1250
    • Mathiesen, L.1
  • 24
    • 70350102387
    • Computation of equilibria in finite games
    • Elsevier, The Netherlands
    • McKelvey, R. and McLennan, A. (1996) Computation of equilibria in finite games. Handbook of Computational Economics, pp. 87-142. Elsevier, The Netherlands
    • (1996) Handbook of Computational Economics , pp. 87-142
    • McKelvey, R.1    McLennan, A.2
  • 26
    • 0001730497
    • Non-cooperative games
    • Nash, J. (1951) Non-cooperative games. The Annals of Mathematics, 54:2, pp. 286-295.
    • (1951) The Annals of Mathematics , vol.54 , Issue.2 , pp. 286-295
    • Nash, J.1
  • 27
    • 0031231882
    • Using co-evolutionary programming to simulate strategic behavior in markets
    • Price, T. C. (1997) Using co-evolutionary programming to simulate strategic behavior in markets. Journal of Evolutionary Economics, 7, pp. 219-254.
    • (1997) Journal of Evolutionary Economics , vol.7 , pp. 219-254
    • Price, T.C.1
  • 30
    • 0002958728
    • On the generalization of the Lemke-Howson algorithm to noncooperative n-person games
    • Rosenmuller, J. (1971) On the generalization of the Lemke-Howson algorithm to noncooperative n-person games. SIAM Journal on Applied Mathematics, 21:1, pp. 73-79.
    • (1971) SIAM Journal on Applied Mathematics , vol.21 , Issue.1 , pp. 73-79
    • Rosenmuller, J.1
  • 31
    • 56749166432
    • Stochastic games
    • Princeton University Press, Princeton, NJ
    • Shapley, L. S. (1953) Stochastic games. Classics in Game Theory, Princeton University Press, Princeton, NJ
    • (1953) Classics in Game Theory
    • Shapley, L.S.1
  • 32
    • 0008056524
    • Using game theory to study market power in simple networks
    • IEEE, Piscataway, NJ
    • Stoft, S. (1999) Using game theory to study market power in simple networks. IEEE Tutorial on Game Theory in Electric Power Markets, pp. 33-40. IEEE, Piscataway, NJ
    • (1999) IEEE Tutorial on Game Theory in Electric Power Markets , pp. 33-40
    • Stoft, S.1
  • 34
    • 38549135253
    • Equilibrium problems with equilibrium constraints: Stationarities, algorithms, and applications
    • Ph.D. Dissertation. Department of Management Sciences, Stanford University, Stanford, CA
    • Su, C. L. (2005) Equilibrium problems with equilibrium constraints: stationarities, algorithms, and applications, Ph.D. Dissertation. Department of Management Sciences, Stanford University, Stanford, CA
    • (2005)
    • Su, C.L.1
  • 36
    • 0002958732
    • Computing equilibria of n-person games
    • Wilson, R. (1971) Computing equilibria of n-person games. SIAM Journal on Applied Mathematics, 21, pp. 80-87.
    • (1971) SIAM Journal on Applied Mathematics , vol.21 , pp. 80-87
    • Wilson, R.1
  • 37
    • 12344318209
    • Computing Cournot equilibria in two settlement electricity markets with transmission constraints
    • IEEE Computer Society, Los Alamitos, CA
    • Yao, J., Oren, S. and Adler, I. (2004) Computing Cournot equilibria in two settlement electricity markets with transmission constraints. Proceedings of the 37th Hawaii International Conference on System Sciences, vol. 2 IEEE Computer Society, p. 20051b. Los Alamitos, CA
    • (2004) Proceedings of the 37th Hawaii International Conference on System Sciences , vol.2
    • Yao, J.1    Oren, S.2    Adler, I.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.