Volume E88-D, Issue 5, 2005, Pages 1004-1011

CHQ: A multi-agent reinforcement learning scheme for partially observable Markov decision processes

Author keywords

Multi agent system; Partially observable MDP; Q learning; Reinforcement learning

Indexed keywords

DECISION SUPPORT SYSTEMS; LEARNING SYSTEMS; MARKOV PROCESSES; PROBABILITY

EID: 24144454723     PISSN: 09168532     EISSN: None     Source Type: Journal    
DOI: 10.1093/ietisy/e88-d.5.1004     Document Type: Conference Paper
Times cited: 5

References (12)
  • 2
    • 0003529066
    • On optimal cooperation of knowledge sources
    • BCS-G2010-28, Boeing AI Center
    • M. Benda, V. Jagannathan, and R. Dodhiawalla, "On optimal cooperation of knowledge sources," Technical Report, BCS-G2010-28, Boeing AI Center, 1985.
    • (1985) Technical Report
    • Benda, M.1    Jagannathan, V.2    Dodhiawalla, R.3
  • 3
    • 0032093057
    • Agent-mediated electronic commerce: A survey
    • R.H. Guttman, A.C. Moukas, and P. Maes, "Agent-mediated electronic commerce: A survey," Knowledge Engineering Review, vol.13, no.2, pp.147-159, 1998.
    • (1998) Knowledge Engineering Review , vol.13 , Issue.2 , pp. 147-159
    • Guttman, R.H.1    Moukas, A.C.2    Maes, P.3
  • 5
    • 0027684215
    • Prioritized sweeping: Reinforcement learning with less data and less time
    • A. Moore and C.G. Atkeson, "Prioritized sweeping: Reinforcement learning with less data and less time," Mach. Learn., vol.13, pp.103-130, 1993.
    • (1993) Mach. Learn. , vol.13 , pp. 103-130
    • Moore, A.1    Atkeson, C.G.2
  • 6
    • 33847202724
    • Learning to predict by the method of temporal differences
    • R.S. Sutton, "Learning to predict by the method of temporal differences," Mach. Learn., vol.3, pp.9-44, 1988.
    • (1988) Mach. Learn. , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 7
    • 0008321896
    • Reinforcement learning: An introduction
    • MIT Press
    • R.S. Sutton and A.G. Barto, "Reinforcement learning: An introduction," in A Bradford Book, MIT Press, 1998.
    • (1998) A Bradford Book
    • Sutton, R.S.1    Barto, A.G.2
  • 9
    • 34249833101
    • Technical note: Q-learning
    • C.J.C.H. Watkins and P. Dayan, "Technical note: Q-Learning," Mach. Learn., vol.8, pp.279-292, 1992.
    • (1992) Mach. Learn. , vol.8 , pp. 279-292
    • Watkins, C.J.C.H.1    Dayan, P.2
  • 11
    • 16244414493
    • Delayed reward-based genetic algorithms for partially observable Markov decision problems
    • Dec.
    • Y. Yamashiro, A. Ueno, and H. Takeda, "Delayed reward-based genetic algorithms for partially observable Markov decision problems," IEICE Trans. Inf. & Syst. (Japanese Edition), vol.J84-D-I, no.12, pp.1635-1647, Dec. 2001.
    • (2001) IEICE Trans. Inf. & Syst. (Japanese Edition) , vol.J84-D-I , Issue.12 , pp. 1635-1647
    • Yamashiro, Y.1    Ueno, A.2    Takeda, H.3
  • 12
    • 85027163402
    • URL: http://nssdc.gsfc.nasa.gov/planetary/mesur.html


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.