Volume , Issue , 2007, Pages 1169-1176

Natural actor-critic for road traffic optimisation

Author keywords

[No Author keywords available]

Indexed keywords

ACTOR CRITIC; ACTOR-CRITIC ALGORITHM; CONTROL SIGNAL; INFINITE HORIZONS; OPTIMISATIONS; REINFORCEMENT LEARNING APPROACH; ROAD TRAFFIC; TRAFFIC SYSTEMS;

EID: 84864064043     PISSN: 10495258     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (85)

References (12)
  • 2. L. Bottou and Y. Le Cun. Large scale online learning. In Proc. NIPS'2003, volume 16, 2004.
  • 5. A. G. Sims and K. W. Dobinson. The Sydney coordinated adaptive traffic (SCAT) system philosophy and benefits. IEEE Transactions on Vehicular Technology, VT-29(2):130-137, 1980.
  • 6. M. Wiering. Multi-agent reinforcement learning for traffic light control. In Proc. ICML 2000, 2000.
  • 8. J. A. Bagnell and A. Y. Ng. On local rewards and scaling distributed reinforcement learning. In Proc. NIPS'2005, volume 18, 2006.
  • 9. R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour. Policy gradient methods for reinforcement learning with function approximation. In Proc. NIPS, volume 12. MIT Press, 2000.
  • 10. S. Kakade. A natural policy gradient. In Proc. NIPS'2001, volume 14, 2002.
  • 11. J. A. Boyan. Least-squares temporal difference learning. In Proc. ICML 16, pages 49-56, 1999.
  • 12. J. Baxter, P. Bartlett, and L. Weaver. Experiments with infinite-horizon, policy-gradient estimation. JAIR, 15:351-381, 2001.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.