Volume 190, 2016, Pages 82-94

Multi-agent reinforcement learning as a rehearsal for decentralized planning

Author keywords

Decentralized planning; Multi agent reinforcement learning

Indexed keywords

BEHAVIORAL RESEARCH; BENCHMARKING; DECISION MAKING; FERTILIZERS; MARKOV PROCESSES; MULTI AGENT SYSTEMS; UNCERTAINTY ANALYSIS

EID: 84962082047; PISSN: 09252312; EISSN: 18728286; Source Type: Journal
DOI: 10.1016/j.neucom.2016.01.031; Document Type: Article
Times cited: 404

References (30)
  • 2
    • 77952736651
    • An investigation into mathematical programming for finite horizon decentralized POMDPs
    • Aras Raghav, Dutech Alain. An investigation into mathematical programming for finite horizon decentralized POMDPs. J. Artif. Intell. Res. 2010, 37:329-396.
    • (2010) J. Artif. Intell. Res., vol. 37, pp. 329-396
    • Aras, R.; Dutech, A.
  • 3
    • 0041966002
    • Using confidence bounds for exploitation-exploration trade-offs
    • Auer Peter. Using confidence bounds for exploitation-exploration trade-offs. J. Mach. Learn. Res. 2002, 3:397-422.
    • (2002) J. Mach. Learn. Res., vol. 3, pp. 397-422
    • Auer, P.
  • 6
    • 0036874366
    • The complexity of decentralized control of Markov decision processes
    • Bernstein Daniel S., Givan Robert, Immerman Neil, Zilberstein Shlomo. The complexity of decentralized control of Markov decision processes. Math. Oper. Res. 2002, 27:819-840.
    • (2002) Math. Oper. Res., vol. 27, pp. 819-840
    • Bernstein, D.S.; Givan, R.; Immerman, N.; Zilberstein, S.
  • 8
    • 84962119252
    • The MARL Toolbox version 1.3
    • Lucian Busoniu, The MARL Toolbox version 1.3, 2010. http://busoniu.net/repository.php.
    • (2010)
    • Busoniu, L.
  • 12
    • 84962119248
    • Mobile robotics: Kilobots
    • K-Team, Mobile robotics: Kilobots. http://www.k-team.com/mobile-robotics-products/kilobot.
  • 17
    • 0030647149
    • Reinforcement learning in the multi-robot domain
    • Mataric Maja J. Reinforcement learning in the multi-robot domain. Auton. Robots 1997, 4:73-83.
    • (1997) Auton. Robots, vol. 4, pp. 73-83
    • Mataric, M.J.
  • 18
    • 0141596576
    • Policy invariance under reward transformations: theory and application to reward shaping
    • Morgan Kaufmann, Bled, Slovenia
    • Andrew Y. Ng, Daishi Harada, Stuart Russell, Policy invariance under reward transformations: theory and application to reward shaping, in: Proceedings of 16th International Conference on Machine Learning, Morgan Kaufmann, Bled, Slovenia, 1999, pp. 278-287.
    • (1999) Proceedings of 16th International Conference on Machine Learning, pp. 278-287
    • Ng, A.Y.; Harada, D.; Russell, S.
  • 20
    • 52249098423
    • Optimal and approximate Q-value functions for decentralized POMDPs
    • Oliehoek Frans A., Spaan Matthijs T.J., Vlassis Nikos. Optimal and approximate Q-value functions for decentralized POMDPs. JAIR 2008, 32:289-353.
    • (2008) JAIR, vol. 32, pp. 289-353
    • Oliehoek, F.A.; Spaan, M.T.J.; Vlassis, N.
  • 23
    • 27344432348
    • Accelerating reinforcement learning through implicit imitation
    • Price Bob, Boutilier Craig. Accelerating reinforcement learning through implicit imitation. J. Artif. Intell. Res. 2003, 19:569-629.
    • (2003) J. Artif. Intell. Res., vol. 19, pp. 569-629
    • Price, B.; Boutilier, C.
  • 25
    • 84962095723
    • Dec-POMDP problem domains and format
    • Matthijs Spaan, Dec-POMDP problem domains and format. http://masplan.org/.
    • Spaan, M.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.