



Volume 1585, 1999, Pages 195-197

Reinforcement learning: Past, present and future?

Author keywords

[No Author keywords available]


EID: 72949089205     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/3-540-48873-1_26     Document Type: Conference Paper
Times cited: 26

References (13)
  • 3
    • 85156187730
    • Improving elevator performance using reinforcement learning
    • MIT Press, Cambridge, MA
    • Crites, R. H., and Barto, A. G. (1996). Improving elevator performance using reinforcement learning. In Advances in Neural Information Processing Systems 9, pp. 1017-1023. MIT Press, Cambridge, MA.
    • (1996) Advances in Neural Information Processing Systems, vol. 9, pp. 1017-1023
    • Crites, R.H.; Barto, A.G.
  • 5
    • 84956885505
    • CRL Report 334. Communications Research Laboratory, McMaster University, Hamilton, Ontario
    • Nie, J., and Haykin, S. (1996). A dynamic channel assignment policy through Q-learning. CRL Report 334. Communications Research Laboratory, McMaster University, Hamilton, Ontario.
    • (1996) A Dynamic Channel Assignment Policy through Q-Learning
    • Nie, J.; Haykin, S.
  • 7
    • 84898972974
    • Reinforcement learning for dynamic channel allocation in cellular telephone systems
    • MIT Press, Cambridge, MA
    • Singh, S. P., and Bertsekas, D. (1997). Reinforcement learning for dynamic channel allocation in cellular telephone systems. In Advances in Neural Information Processing Systems 10, pp. 974-980. MIT Press, Cambridge, MA.
    • (1997) Advances in Neural Information Processing Systems, vol. 10, pp. 974-980
    • Singh, S.P.; Bertsekas, D.
  • 8
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44.
    • (1988) Machine Learning, vol. 3, pp. 9-44
    • Sutton, R.S.
  • 11
    • 0029276036
    • Temporal difference learning and TD-Gammon
    • Tesauro, G. J. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38:58-68.
    • (1995) Communications of the ACM, vol. 38, pp. 58-68
    • Tesauro, G.J.
  • 13
    • 85156225449
    • High-performance job-shop scheduling with a time-delay TD(λ) network
    • MIT Press, Cambridge, MA
    • Zhang, W., and Dietterich, T. G. (1996). High-performance job-shop scheduling with a time-delay TD(λ) network. In Advances in Neural Information Processing Systems 9, pp. 1024-1030. MIT Press, Cambridge, MA.
    • (1996) Advances in Neural Information Processing Systems, vol. 9, pp. 1024-1030
    • Zhang, W.; Dietterich, T.G.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.