Volume , Issue , 1997, Pages 1068-1074

On-line policy improvement using Monte-Carlo search

Author keywords

[No Author keywords available]

Indexed keywords

ADAPTIVE CONTROL SYSTEMS; INTELLIGENT SYSTEMS; SUPERCOMPUTERS

EID: 84898992015     PISSN: 10495258     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 170

References (8)
  • 2
    • 85156187730
    • Improving elevator performance using reinforcement learning
    • D. Touretzky et al., eds., MIT Press
    • R. H. Crites and A. G. Barto, "Improving elevator performance using reinforcement learning." In: D. Touretzky et al., eds., Advances in Neural Information Processing Systems 8, 1017-1023, MIT Press (1996).
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1017-1023
    • Crites, R.H.1    Barto, A.G.2
  • 3
    • 0000218399
    • Programming a computer for playing chess
    • C. E. Shannon, "Programming a computer for playing chess." Philosophical Magazine 41, 265-275 (1950).
    • (1950) Philosophical Magazine , vol.41 , pp. 265-275
    • Shannon, C.E.1
  • 4
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • R. S. Sutton, "Learning to predict by the methods of temporal differences." Machine Learning 3, 9-44 (1988).
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 5
    • 0007993990
    • Connectionist learning of expert preferences by comparison training
    • D. Touretzky, ed., Morgan Kaufmann
    • G. Tesauro, "Connectionist learning of expert preferences by comparison training." In: D. Touretzky, ed., Advances in Neural Information Processing Systems 1, 99-106, Morgan Kaufmann (1989).
    • (1989) Advances in Neural Information Processing Systems , vol.1 , pp. 99-106
    • Tesauro, G.1
  • 6
    • 0001046225
    • Practical issues in temporal difference learning
    • G. Tesauro, "Practical issues in temporal difference learning." Machine Learning 8, 257-277 (1992).
    • (1992) Machine Learning , vol.8 , pp. 257-277
    • Tesauro, G.1
  • 7
    • 0029276036
    • Temporal difference learning and TD-Gammon
    • G. Tesauro, "Temporal difference learning and TD-Gammon." Comm. of the ACM, 38:3, 58-67 (1995).
    • (1995) Comm. of the ACM , vol.38 , Issue.3 , pp. 58-67
    • Tesauro, G.1
  • 8
    • 85156225449
    • High-performance job-shop scheduling with a time-delay TD(λ) network
    • D. Touretzky et al., eds., MIT Press
    • W. Zhang and T. G. Dietterich, "High-performance job-shop scheduling with a time-delay TD(λ) network." In: D. Touretzky et al., eds., Advances in Neural Information Processing Systems 8, 1024-1030, MIT Press (1996).
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1024-1030
    • Zhang, W.1    Dietterich, T.G.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.