SCOPUS Information Search Platform

Autonomous Agents and Multi-Agent Systems

Volume 13, Issue 2, 2006, Pages 197-229

Hierarchical multi-agent reinforcement learning

Authors (3): Ghavamzadeh, Mohammad (a); Mahadevan, Sridhar (b); Makar, Rajbala (c)

a UNIVERSITY OF ALBERTA (Canada)

b Biologically Inspired Neural and Dynamical Systems Laboratory (United States)

c AGILENT TECHNOLOGIES (United States)

Author keywords

Communication; Cooperative multi-agent systems; Coordination; Hierarchical reinforcement learning

Indexed keywords

EID: 33846942607; PISSN: 13872532; EISSN: 15737454; Source Type: Journal
DOI: 10.1007/s10458-006-7035-4; Document Type: Article

Times cited: 118

References (45)

1
- 0004203547
- John Wiley and Sons
- Askin, R., & Standridge, C. (1993). Modeling and analysis of manufacturing systems. John Wiley and Sons.
- (1993) Modeling and analysis of manufacturing systems
- Askin, R.¹ Standridge, C.²

2
- 0032308533
- Behavior-based formation control for multi-robot teams
- Balch, T., & Arkin, R. (1998). Behavior-based formation control for multi-robot teams. IEEE Transactions on Robotics and Automation, 14, 1-15.
- (1998) IEEE Transactions on Robotics and Automation , vol.14 , pp. 1-15
- Balch, T.¹ Arkin, R.²

3
- 0037288370
- Recent advances in hierarchical reinforcement learning
- Barto, A., & Mahadevan, S. (2003) Recent advances in hierarchical reinforcement learning, Discrete Event Systems Special Issue on Reinforcement Learning, 13, 41-77.
- (2003) Discrete Event Systems Special Issue on Reinforcement Learning , vol.13 , pp. 41-77
- Barto, A.¹ Mahadevan, S.²

4
- 0141965747
- The complexity of decentralized control of Markov decision processes
- Bernstein, D., Zilberstein, S., & Immerman, N. (2000). The complexity of decentralized control of Markov decision processes. In Proceedings of the sixteenth international conference on uncertainty in artificial intelligence (pp. 32-37).
- (2000) Proceedings of the sixteenth international conference on uncertainty in artificial intelligence , pp. 32-37
- Bernstein, D.¹ Zilberstein, S.² Immerman, N.³

5
- 84880690163
- Sequential optimality & coordination in multi-agent systems
- Boutilier, C. (1999). Sequential optimality & coordination in multi-agent systems. In Proceedings of the sixteenth international joint conference on artificial intelligence (pp. 478-485).
- (1999) Proceedings of the sixteenth international joint conference on artificial intelligence , pp. 478-485
- Boutilier, C.¹

6
- 0036531878
- Multiagent learning using a variable learning rate
- Bowling, M., & Veloso, M. (2002). Multiagent learning using a variable learning rate. Artificial Intelligence, 136, 215-250.
- (2002) Artificial Intelligence , vol.136 , pp. 215-250
- Bowling, M.¹ Veloso, M.²

7
- 0042254114
- Policy recognition in the abstract hidden Markov model
- Bui, H., Venkatesh, S., & West, G. (2002). Policy recognition in the abstract hidden Markov model. Journal of Artificial Intelligence Research, 17, 451-499.
- (2002) Journal of Artificial Intelligence Research , vol.17 , pp. 451-499
- Bui, H.¹ Venkatesh, S.² West, G.³

8
- 0032208335
- Elevator group control using multiple reinforcement learning agents
- Crites, R., & Barto, A. (1998). Elevator group control using multiple reinforcement learning agents. Machine Learning, 33, 235-262.
- (1998) Machine Learning , vol.33 , pp. 235-262
- Crites, R.¹ Barto, A.²

9
- 0002278788
- Hierarchical reinforcement learning with the MAXQ value function decomposition
- Dietterich, T. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13, 227-303.
- (2000) Journal of Artificial Intelligence Research , vol.13 , pp. 227-303
- Dietterich, T.¹

10
- 0003989209
- Springer Verlag
- Filar, J., & Vrieze, K. (1997). Competitive Markov decision processes. Springer Verlag.
- (1997) Competitive Markov decision processes
- Filar, J.¹ Vrieze, K.²

11
- 1942452721
- Hierarchical policy gradient algorithms
- Ghavamzadeh, M., & Mahadevan, S. (2003). Hierarchical policy gradient algorithms. In Proceedings of the twentieth international conference on machine learning (pp. 226-233).
- (2003) Proceedings of the twentieth international conference on machine learning , pp. 226-233
- Ghavamzadeh, M.¹ Mahadevan, S.²

12
- 4544317342
- Learning to communicate and act using hierarchical reinforcement learning
- Ghavamzadeh, M., & Mahadevan, S. (2004). Learning to communicate and act using hierarchical reinforcement learning. In Proceedings of the third international joint conference on autonomous agents and multiagent systems (pp. 1114-1121).
- (2004) Proceedings of the third international joint conference on autonomous agents and multiagent systems , pp. 1114-1121
- Ghavamzadeh, M.¹ Mahadevan, S.²

13
- 4544236179
- Coordinated reinforcement learning
- Guestrin, C., Lagoudakis, M., & Parr, R. (2002). Coordinated reinforcement learning. In Proceedings of the nineteenth international conference on machine learning (pp. 227-234).
- (2002) Proceedings of the nineteenth international conference on machine learning , pp. 227-234
- Guestrin, C.¹ Lagoudakis, M.² Parr, R.³

14
- 0003871605
- John Wiley and Sons
- Howard, R. (1971). Dynamic probabilistic systems: Semi-Markov and decision processes. John Wiley and Sons.
- (1971) Dynamic probabilistic systems: Semi-Markov and decision processes
- Howard, R.¹

15
- 0000929496
- Multiagent reinforcement learning: Theoretical framework and an algorithm
- Hu, J., & Wellman, M. (1998). Multiagent reinforcement learning: Theoretical framework and an algorithm. In Proceedings of the fifteenth international conference on machine learning (pp. 242-250).
- (1998) Proceedings of the fifteenth international conference on machine learning , pp. 242-250
- Hu, J.¹ Wellman, M.²

16
- 0141591857
- Graphical models for game theory
- Kearns, M., Littman, M., & Singh, S. (2001). Graphical models for game theory. In Proceedings of the seventeenth international conference on uncertainty in artificial intelligence (pp. 253-260).
- (2001) Proceedings of the seventeenth international conference on uncertainty in artificial intelligence , pp. 253-260
- Kearns, M.¹ Littman, M.² Singh, S.³

17
- 0029762504
- AGV dispatching
- Klein, C., & Kim, J. (1996). AGV dispatching. International Journal of Production Research, 34, 95-110.
- (1996) International Journal of Production Research , vol.34 , pp. 95-110
- Klein, C.¹ Kim, J.²

18
- 84880906306
- Multiagent influence diagrams for representing and solving games
- Koller, D., & Milch, B. Multiagent influence diagrams for representing and solving games. In Proceedings of the seventeenth international joint conference on artificial intelligence (pp. 1027-1034).
- Proceedings of the seventeenth international joint conference on artificial intelligence , pp. 1027-1034
- Koller, D.¹ Milch, B.²

19
- 0242719540
- Game Networks
- La Mura, P. (2000). Game Networks. In Proceedings of the sixteenth international conference on uncertainty in artificial intelligence.
- (2000) Proceedings of the sixteenth international conference on uncertainty in artificial intelligence
- La Mura, P.¹

20
- 0030082467
- Composite dispatching rules for multiple-vehicle AGV systems
- Lee, J. (1996). Composite dispatching rules for multiple-vehicle AGV systems. Simulation, 66, 121-130.
- (1996) Simulation , vol.66 , pp. 121-130
- Lee, J.¹

21
- 85149834820
- Markov games as a framework for multi-agent reinforcement learning
- Littman, M. (1994). Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the eleventh international conference on machine learning (pp. 157-163).
- (1994) Proceedings of the eleventh international conference on machine learning , pp. 157-163
- Littman, M.¹

22
- 0242466944
- Friend-or-Foe Q-learning in general-sum games
- Littman, M. (2001). Friend-or-Foe Q-learning in general-sum games. In Proceedings of the eighteenth international conference on machine learning (pp. 322-328).
- (2001) Proceedings of the eighteenth international conference on machine learning , pp. 322-328
- Littman, M.¹

23
- 0242635258
- An efficient exact algorithm for singly connected graphical games
- Littman, M., Kearns, M., & Singh, S. (2001). An efficient exact algorithm for singly connected graphical games. In Proceedings of neural information processing systems (pp. 817-824).
- (2001) Proceedings of neural information processing systems , pp. 817-824
- Littman, M.¹ Kearns, M.² Singh, S.³

24
- 0034819292
- Hierarchical multi-agent reinforcement learning
- Makar, R., Mahadevan, S., & Ghavamzadeh, M. (2001). Hierarchical multi-agent reinforcement learning. In Proceedings of the fifth international conference on autonomous agents (pp. 246-253).
- (2001) Proceedings of the fifth international conference on autonomous agents , pp. 246-253
- Makar, R.¹ Mahadevan, S.² Ghavamzadeh, M.³

25
- 0030647149
- Reinforcement learning in the multi-robot domain
- Mataric, M. (1997). Reinforcement learning in the multi-robot domain. Autonomous Robots, 4, 73-83.
- (1997) Autonomous Robots , vol.4 , pp. 73-83
- Mataric, M.¹

26
- 80055033101
- Nash propagation for loopy graphical games
- Ortiz, L., & Kearns, M. (2002). Nash propagation for loopy graphical games. In Proceedings of neural information processing systems.
- (2002) Proceedings of neural information processing systems
- Ortiz, L.¹ Kearns, M.²

27
- 0004260006
- Academic Press
- Owen, G. (1995). Game theory. Academic Press.
- (1995) Game theory
- Owen, G.¹

28
- 0003989214
- PhD thesis, University of California, Berkeley
- Parr, R. (1998). Hierarchical control and learning for Markov decision processes, PhD thesis, University of California, Berkeley.
- (1998) Hierarchical control and learning for Markov decision processes
- Parr, R.¹

29
- 0012646255
- Learning to cooperate via policy search
- Peshkin, L., Kim, K., Meuleau, N., & Kaelbling, L. (2000). Learning to cooperate via policy search. In Proceedings of the sixteenth international conference on uncertainty in artificial intelligence (pp. 489-496).
- (2000) Proceedings of the sixteenth international conference on uncertainty in artificial intelligence , pp. 489-496
- Peshkin, L.¹ Kim, K.² Meuleau, N.³ Kaelbling, L.⁴

30
- 0003998452
- Wiley Interscience
- Puterman, M. (1994). Markov decision processes. Wiley Interscience.
- (1994) Markov decision processes
- Puterman, M.¹

31
- 1142292938
- The communicative multiagent team decision problem: Analyzing teamwork theories and models
- Pynadath, D., & Tambe, M. (2002). The communicative multiagent team decision problem: Analyzing teamwork theories and models. Journal of Artificial Intelligence Research, 16, 389-426.
- (2002) Journal of Artificial Intelligence Research , vol.16 , pp. 389-426
- Pynadath, D.¹ Tambe, M.²

32
- 33745588421
- Learning to take concurrent actions
- Rohanimanesh, K., & Mahadevan, S. Learning to take concurrent actions. In Proceedings of the sixteenth annual conference on neural information processing systems.
- Proceedings of the sixteenth annual conference on neural information processing systems
- Rohanimanesh, K.¹ Mahadevan, S.²

33
- 13444261465
- Probabilistic plan recognition in multiagent systems
- Saria, S., & Mahadevan, S. (2004). Probabilistic plan recognition in multiagent systems. In Proceedings of the fourteenth international conference on automated planning and scheduling (pp. 12-22).
- (2004) Proceedings of the fourteenth international conference on automated planning and scheduling , pp. 12-22
- Saria, S.¹ Mahadevan, S.²

34
- 0001395498
- Distributed value functions
- Schneider, J., Wong, W., Moore, A., & Riedmiller, M. Distributed value functions. In Proceedings of the sixteenth international conference on machine learning (pp. 371-378).
- Proceedings of the sixteenth international conference on machine learning , pp. 371-378
- Schneider, J.¹ Wong, W.² Moore, A.³ Riedmiller, M.⁴

35
- 0001644761
- Nash convergence of gradient dynamics in general-sum games
- Singh, S., Kearns, M., & Mansour, Y. (2000). Nash convergence of gradient dynamics in general-sum games. In Proceedings of the sixteenth international conference on uncertainty in artificial intelligence (pp. 541-548).
- (2000) Proceedings of the sixteenth international conference on uncertainty in artificial intelligence , pp. 541-548
- Singh, S.¹ Kearns, M.² Mansour, Y.³

36
- 0032645144
- Team-partitioned, opaque-transition reinforcement learning
- Stone, P., & Veloso, M. (1999). Team-partitioned, opaque-transition reinforcement learning. In Proceedings of the third international conference on autonomous agents (pp. 206-212).
- (1999) Proceedings of the third international conference on autonomous agents , pp. 206-212
- Stone, P.¹ Veloso, M.²

37
- 0032208403
- Learning to improve coordinated actions in cooperative distributed problem-solving environments
- Sugawara, T., & Lesser, V. Learning to improve coordinated actions in cooperative distributed problem-solving environments. Machine Learning, 33, 129-154.
- Machine Learning , vol.33 , pp. 129-154
- Sugawara, T.¹ Lesser, V.²

38
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton, R., Precup, D., & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112, 181-211.
- (1999) Artificial Intelligence , vol.112 , pp. 181-211
- Sutton, R.¹ Precup, D.² Singh, S.³

39
- 0002313852
- Scaling up average reward reinforcement learning by approximating the domain models and the value function
- Tadepalli, P., & Ok, D. Scaling up average reward reinforcement learning by approximating the domain models and the value function. In Proceedings of the thirteenth international conference on machine learning (pp. 471-479).
- Proceedings of the thirteenth international conference on machine learning , pp. 471-479
- Tadepalli, P.¹ Ok, D.²

40
- 85152198941
- Multi-agent reinforcement learning: Independent vs. cooperative agents
- Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings of the tenth international conference on machine learning (pp. 330-337).
- (1993) Proceedings of the tenth international conference on machine learning , pp. 330-337
- Tan, M.¹

41
- 0036930301
- Multiagent algorithms for solving graphical games
- Vickrey, D., & Koller, D. (2002). Multiagent algorithms for solving graphical games. In Proceedings of the national conference on artificial intelligence (pp. 345-351).
- (2002) Proceedings of the national conference on artificial intelligence , pp. 345-351
- Vickrey, D.¹ Koller, D.²

42
- 0004049893
- PhD thesis, King's College, Cambridge, England
- Watkins, C. (1989). Learning from delayed rewards, PhD thesis, King's College, Cambridge, England.
- (1989) Learning from delayed rewards
- Watkins, C.¹

43
- 0003744207
- MIT Press
- Weiss, G. (1999). Multi-agent systems: A modern approach to distributed artificial intelligence. MIT Press.
- (1999) Multi-agent systems: A modern approach to distributed artificial intelligence
- Weiss, G.¹

44
- 0034827257
- Communication decisions in multi-agent cooperation: Model and experiments
- Xuan, P., Lesser, V., & Zilberstein, S. (2001). Communication decisions in multi-agent cooperation: Model and experiments. In Proceedings of the fifth international conference on autonomous agents (pp. 616-623).
- (2001) Proceedings of the fifth international conference on autonomous agents , pp. 616-623
- Xuan, P.¹ Lesser, V.² Zilberstein, S.³

45
- 0036367583
- Multiagent policies: From centralized ones to decentralized ones
- Xuan, P., & Lesser, V. (2002). Multiagent policies: From centralized ones to decentralized ones. In Proceedings of the first international joint conference on autonomous agents and multiagent systems (pp. 1098-1105).
- (2002) Proceedings of the first international joint conference on autonomous agents and multiagent systems , pp. 1098-1105
- Xuan, P.¹ Lesser, V.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.