Volume 196, Issue 2, 2008, Pages 913-922

Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems

Author keywords

Action selection; Decision making agents; Exploration exploitation; Genetic algorithms; Multi armed bandit; Reinforcement learning

Indexed keywords

AD HOC NETWORKS; COMPUTATIONAL COMPLEXITY; COMPUTER SIMULATION; EVOLUTIONARY ALGORITHMS; GAUSSIAN DISTRIBUTION; PROBLEM SOLVING

EID: 38649118249     PISSN: 00963003     EISSN: None     Source Type: Journal    
DOI: 10.1016/j.amc.2007.07.043     Document Type: Article
Times cited: 78

References (17)
  • 4
    • 0032271173
    • Multi-armed bandits in discrete and continuous time
    • Kaspi H., and Mandelbaum A. Multi-armed bandits in discrete and continuous time. Annals of Applied Probability 8 4 (1998) 1270-1290
    • (1998) Annals of Applied Probability , vol.8 , Issue.4 , pp. 1270-1290
    • Kaspi, H.1    Mandelbaum, A.2
  • 5
    • 0000456128
    • Switching costs and the Gittins index
    • Banks J.S., and Sundaram R.K. Switching costs and the Gittins index. Econometrica 62 (1994) 687-694
    • (1994) Econometrica , vol.62 , pp. 687-694
    • Banks, J.S.1    Sundaram, R.K.2
  • 6
    • 10944236938
    • A survey on the bandit problem with switching costs
    • Jun T. A survey on the bandit problem with switching costs. De Economist 152 (2004) 513-541
    • (2004) De Economist , vol.152 , pp. 513-541
    • Jun, T.1
  • 7
    • 0029513526
    • P. Auer, N. Cesa-Bianchi, Y. Freund, R.E. Schapire, Gambling in a rigged Casino: the adversarial multi-armed bandit problem, in: Proceedings of the 36th Annual Symposium on Foundations of Computer Science, 1995, pp. 322-331.
  • 8
    • 85131751581
    • Dynamic pricing on the Internet: theory and simulations
    • Leloup B., and Deveaux L. Dynamic pricing on the Internet: theory and simulations. Electronic Commerce Research 1 (2001) 265-276
    • (2001) Electronic Commerce Research , vol.1 , pp. 265-276
    • Leloup, B.1    Deveaux, L.2
  • 9
    • 0347020920
    • Job assignment and bandit problems
    • Valsecchi I. Job assignment and bandit problems. International Journal of Manpower 24 7 (2003) 844-866
    • (2003) International Journal of Manpower , vol.24 , Issue.7 , pp. 844-866
    • Valsecchi, I.1
  • 10
    • 4243096065
    • Exploitation vs. exploration: choosing a supplier in an environment of incomplete information
    • Azoulay-Schwartz R., Kraus S., and Wilkenfeld J. Exploitation vs. exploration: choosing a supplier in an environment of incomplete information. Decision Support Systems 38 (2004) 1-18
    • (2004) Decision Support Systems , vol.38 , pp. 1-18
    • Azoulay-Schwartz, R.1    Kraus, S.2    Wilkenfeld, J.3
  • 11
    • 38649109439
    • Systematic search, belated information and the Gittins' index
    • McCall B.P., and McCall J.J. Systematic search, belated information and the Gittins' index. Economics Letters 8 (1981) 327-333
    • (1981) Economics Letters , vol.8 , pp. 327-333
    • McCall, B.P.1    McCall, J.J.2
  • 12
    • 18144424368
    • Information technology project failures. Applying the bandit problem to evaluate managerial decision making
    • Chulkov D.V., and Desai M.S. Information technology project failures. Applying the bandit problem to evaluate managerial decision making. Information Management and Computer Security 13 2 (2005) 135-143
    • (2005) Information Management and Computer Security , vol.13 , Issue.2 , pp. 135-143
    • Chulkov, D.V.1    Desai, M.S.2
  • 15
    • 32444431946
    • D. Thierens, An adaptive pursuit strategy for allocating operator probabilities, in: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2005), 2005, pp. 1539-1546.
  • 16
    • 0034619428
    • Do evolutionary processes minimize expected losses?
    • Fogel D.B., and Beyer H.G. Do evolutionary processes minimize expected losses?. Journal of Theoretical Biology 207 (2000) 117-123
    • (2000) Journal of Theoretical Biology , vol.207 , pp. 117-123
    • Fogel, D.B.1    Beyer, H.G.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.