SCOPUS Information Search Platform

Autonomous Agents and Multi-Agent Systems

Volume 2, Issue 2, 1999, Pages 141-172

Exploration Strategies for Model-based Learning in Multi-agent Systems

(2) Carmel, David a,b Markovitch, Shaul a

a TECHNION ISRAEL INSTITUTE OF TECHNOLOGY (Israel)

b IBM HAIFA RESEARCH LAB (Israel)

Author keywords

Exploration; Model based learning; Multi agent systems

Indexed keywords

EID: 0033423368 PISSN: 13872532 EISSN: None Source Type: Journal
DOI: 10.1023/A:1010007108196 Document Type: Article

Times cited : (67)

References (42)

1
- 0023453626
- Learning regular sets from queries and counterexamples
- D. Angluin. "Learning regular sets from queries and counterexamples." Information and Computation, vol. 75 pp. 87-106, 1987.
- (1987) Information and Computation , vol.75 , pp. 87-106
- Angluin, D.¹

2
- 84936824515
- Basic Books: New York, NY
- R. Axelrod. The Evolution of Cooperation, Basic Books: New York, NY, 1984.
- (1984) The Evolution of Cooperation
- Axelrod, R.¹

3
- 0011471586
- The complexity of computing a best response automaton in repeated games with mixed strategies
- E. Ben-Porath. "The complexity of computing a best response automaton in repeated games with mixed strategies." Games and Economic Behavior, vol. 2 pp. 1-12, 1990.
- (1990) Games and Economic Behavior , vol.2 , pp. 1-12
- Ben-Porath, E.¹

4
- 0004181906
- Chapman and Hall
- D. A. Berry and B. Fristedt. Bandit Problems, Sequential Allocation and Experiments, Chapman and Hall: 1985.
- (1985) Bandit Problems, Sequential Allocation and Experiments
- Berry, D.A.¹ Fristedt, B.²

5
- 0030365402
- Learning models of intelligent agents
- Portland, Oregon, August
- D. Carmel and S. Markovitch. "Learning models of intelligent agents," in Proceedings of Thirteenth National Conference on Artificial Intelligence (AAAI 96), Portland, Oregon, pp. 62-67, August 1996.
- (1996) Proceedings of Thirteenth National Conference on Artificial Intelligence (AAAI 96) , pp. 62-67
- Carmel, D.¹ Markovitch, S.²

6
- 0042413243
- Exploration and adaptation in multi-agent systems: A model-based approach
- Nagoya, Japan, August
- D. Carmel and S. Markovitch. "Exploration and adaptation in multi-agent systems: A model-based approach," in Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97), Nagoya, Japan, pp. 606-611, August 1997.
- (1997) Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97) , pp. 606-611
- Carmel, D.¹ Markovitch, S.²

7
- 0032344579
- Model-based learning of interaction strategies in multi-agent systems
- June
- D. Carmel and S. Markovitch. "Model-based learning of interaction strategies in multi-agent systems." Journal of Experimental and Theoretical Artificial Intelligence (JETAI), vol. 10 pp. 309-332, June 1998.
- (1998) Journal of Experimental and Theoretical Artificial Intelligence (JETAI) , vol.10 , pp. 309-332
- Carmel, D.¹ Markovitch, S.²

8
- 0003328374
- Neural network exploration using optimal experimental design
- J. D. Cowan, G. Tesauro, and J. Alspector, (Eds.), Morgan Kaufmann
- D. A. Cohn. "Neural network exploration using optimal experimental design," in J. D. Cowan, G. Tesauro, and J. Alspector, (Eds.), Advances in Neural Information Processing Systems 6, Morgan Kaufmann: pp. 679-686, 1994.
- (1994) Advances in Neural Information Processing Systems 6 , pp. 679-686
- Cohn, D.A.¹

9
- 0030260201
- Exploration bonuses and dual control
- P. Dayan and T. J. Sejnowski. "Exploration bonuses and dual control." Machine Learning, vol. 25(1) pp. 5-22, 1996.
- (1996) Machine Learning , vol.25 , Issue.1 , pp. 5-22
- Dayan, P.¹ Sejnowski, T.J.²

10
- 0004264698
- Academic Press: New York
- V. Fedorov. Theory of Optimal Experiments. Academic Press: New York, 1972.
- (1972) Theory of Optimal Experiments
- Fedorov, V.¹

11
- 0028062304
- Optimality and domination in repeated games with bounded players
- L. Fortnow and D. Whang. "Optimality and domination in repeated games with bounded players," in Proceedings of the 25th Annual ACM Symposium on Theory of Computing, pp. 741-749, 1994.
- (1994) Proceedings of the 25th Annual ACM Symposium on Theory of Computing , pp. 741-749
- Fortnow, L.¹ Whang, D.²

12
- 0027307379
- Efficient learning of typical finite automata from random walks
- Y. Freund, M. Kearns, D. Ron, R. Rubinfeld, R. E. Schapire, and L. Sellie. "Efficient learning of typical finite automata from random walks," in Proceedings of the 25th Annual ACM Symposium on Theory of Computing, pp. 315-324, 1993.
- (1993) Proceedings of the 25th Annual ACM Symposium on Theory of Computing , pp. 315-324
- Freund, Y.¹ Kearns, M.² Ron, D.³ Rubinfeld, R.⁴ Schapire, R.E.⁵ Sellie, L.⁶

13
- 0029547692
- Efficient algorithms for learning to play repeated games against computationally bounded adversaries
- Y. Freund, M. Kearns, Y. Mansour, D. Ron, R. Rubinfeld, and R. E. Schapire. "Efficient algorithms for learning to play repeated games against computationally bounded adversaries," in Proceedings of the Annual Symposium on the Foundations of Computer Science, pp. 332-341, 1995.
- (1995) Proceedings of the Annual Symposium on the Foundations of Computer Science , pp. 332-341
- Freund, Y.¹ Kearns, M.² Mansour, Y.³ Ron, D.⁴ Rubinfeld, R.⁵ Schapire, R.E.⁶

14
- 0001536620
- Steady state learning and Nash equilibrium
- D. Fudenberg and D. Levine. "Steady state learning and Nash equilibrium." Econometrica, vol. 61 pp. 547-574, 1993.
- (1993) Econometrica , vol.61 , pp. 547-574
- Fudenberg, D.¹ Levine, D.²

15
- 38249006045
- Bounded versus unbounded rationality: The tyranny of the weak
- I. Gilboa and D. Samet. "Bounded versus unbounded rationality: The tyranny of the weak." Games and Economic Behavior, vol. 1 pp. 213-221, 1989.
- (1989) Games and Economic Behavior , vol.1 , pp. 213-221
- Gilboa, I.¹ Samet, D.²

16
- 38249029225
- The complexity of computing best response automata in repeated games
- I. Gilboa. "The complexity of computing best response automata in repeated games." Journal of Economic Theory, vol. 45 pp. 342-352, 1988.
- (1988) Journal of Economic Theory , vol.45 , pp. 342-352
- Gilboa, I.¹

17
- 84891584370
- Wiley and Sons: New York
- J. C. Gittins. Multi-armed Bandit Allocation Indices. Wiley and Sons: New York, 1989.
- (1989) Multi-armed Bandit Allocation Indices
- Gittins, J.C.¹

18
- 0003620778
- Addison-Wesley: MA
- J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley: MA, 1979.
- (1979) Introduction to Automata Theory, Languages and Computation
- Hopcroft, J.E.¹ Ullman, J.D.²

19
- 0002298153
- Bayesian learning in normal form games
- J. S. Jordan. "Bayesian learning in normal form games." Games and Economic Behavior, vol. 3 pp. 60-81, 1991.
- (1991) Games and Economic Behavior , vol.3 , pp. 60-81
- Jordan, J.S.¹

20
- 38249015887
- The exponential convergence of Bayesian learning in normal form games
- J. S. Jordan. "The exponential convergence of Bayesian learning in normal form games." Games and Economic Behavior, vol. 4 pp. 202-217, 1991.
- (1991) Games and Economic Behavior , vol.4 , pp. 202-217
- Jordan, J.S.¹

21
- 0029679044
- Reinforcement learning: A survey
- L. P. Kaelbling, M. L. Littman, and A. W. Moore. "Reinforcement learning: A survey." Journal of Artificial Intelligence Research, vol. 4 pp. 237-285, 1996.
- (1996) Journal of Artificial Intelligence Research , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

22
- 0004280606
- MIT Press: Cambridge, MA
- L. P. Kaelbling. Learning in Embedded Systems. MIT Press: Cambridge, MA, 1993.
- (1993) Learning in Embedded Systems
- Kaelbling, L.P.¹

23
- 0000221289
- Rational learning leads to Nash equilibrium
- September
- E. Kalai and E. Lehrer. "Rational learning leads to Nash equilibrium." Econometrica, vol. 61(5) pp. 1019-1045, September 1993.
- (1993) Econometrica , vol.61 , Issue.5 , pp. 1019-1045
- Kalai, E.¹ Lehrer, E.²

24
- 0011473030
- Bounded rationality and strategic complexity in repeated games
- T. Ichiishi, A. Neyman, and Y. Tauman, (Eds.), Academic Press: San Diego
- E. Kalai. "Bounded rationality and strategic complexity in repeated games," in T. Ichiishi, A. Neyman, and Y. Tauman, (Eds.), Game Theory and Applications, Academic Press: San Diego, pp. 131-157, 1990.
- (1990) Game Theory and Applications , pp. 131-157
- Kalai, E.¹

25
- 0041912178
- Probabilistic exploration in planning while learning
- July
- G. I. Karakoulas. "Probabilistic exploration in planning while learning," in Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI 95), pp. 352-361, July 1995.
- (1995) Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI 95) , pp. 352-361
- Karakoulas, G.I.¹

26
- 85149834820
- Markov games as a framework for multi-agent reinforcement learning
- July
- M. L. Littman. "Markov games as a framework for multi-agent reinforcement learning," in Proceedings of the Eleventh International Conference on Machine Learning, pp. 157-163, July 1994.
- (1994) Proceedings of the Eleventh International Conference on Machine Learning , pp. 157-163
- Littman, M.L.¹

27
- 0003616750
- John Wiley & Sons
- R. D. Luce and H. Raiffa. Games and Decisions, Introduction and Critical Survey. John Wiley & Sons: 1957.
- (1957) Games and Decisions, Introduction and Critical Survey
- Luce, R.D.¹ Raiffa, H.²

28
- 0027684215
- Prioritized sweeping: Reinforcement learning with less data and less time
- A. W. Moore and C. G. Atkeson. "Prioritized sweeping: Reinforcement learning with less data and less time." Machine Learning, vol. 13(1), 1993.
- (1993) Machine Learning , vol.13 , Issue.1
- Moore, A.W.¹ Atkeson, C.G.²

29
- 84949966497
- Learn your opponent's strategy (in polynomial time)
- G. Weiß and S. Sen, (Eds.), Springer-Verlag
- Y. Mor, C. V. Goldman, and J. S. Rosenschein. "Learn your opponent's strategy (in polynomial time)," in G. Weiß and S. Sen, (Eds.), Adaptation and Learning in Multi-agent Systems, Lecture Notes in AI. Springer-Verlag: 1996.
- (1996) Adaptation and Learning in Multi-agent Systems, Lecture Notes in AI
- Mor, Y.¹ Goldman, C.V.² Rosenschein, J.S.³

30
- 0042914184
- Optimization and rational learning in games
- J. H. Nachbar. "Optimization and rational learning in games." Econometrica vol. 65(2), 1997.
- (1997) Econometrica , vol.65 , Issue.2
- Nachbar, J.H.¹

31
- 0000977910
- The complexity of Markov decision processes
- C. H. Papadimitriou and J. N. Tsitsiklis. "The complexity of Markov decision processes." Mathematics of Operations Research, vol. 12(3) pp. 441-450, 1987.
- (1987) Mathematics of Operations Research , vol.12 , Issue.3 , pp. 441-450
- Papadimitriou, C.H.¹ Tsitsiklis, J.N.²

32
- 0000948830
- On players with a bounded number of states
- C. H. Papadimitriou. "On players with a bounded number of states." Games and Economic Behavior, vol. 4 pp. 122-131, 1992.
- (1992) Games and Economic Behavior , vol.4 , pp. 122-131
- Papadimitriou, C.H.¹

33
- 0041410936
- Exactly learning automata with small cover time
- D. Ron and R. Rubinfeld. "Exactly learning automata with small cover time," in Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory, pp. 427-436, 1995.
- (1995) Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory , pp. 427-436
- Ron, D.¹ Rubinfeld, R.²

34
- 46149134052
- Finite automata play the repeated Prisoner's Dilemma
- A. Rubinstein. "Finite automata play the repeated Prisoner's Dilemma." Journal of Economic Theory, vol. 39 pp. 83-96, 1986.
- (1986) Journal of Economic Theory , vol.39 , pp. 83-96
- Rubinstein, A.¹

35
- 0030050933
- Multiagent reinforcement learning and the iterated Prisoner's Dilemma
- T. W. Sandholm and R. H. Crites. "Multiagent reinforcement learning and the iterated Prisoner's Dilemma." Biosystems Journal, vol. 37 pp. 147-166, 1995.
- (1995) Biosystems Journal , vol.37 , pp. 147-166
- Sandholm, T.W.¹ Crites, R.H.²

36
- 0024079557
- Learning control of finite Markov chains with an explicit trade-off between estimation and control
- September
- M. Sato, K. Abe, and H. Takeda. "Learning control of finite Markov chains with an explicit trade-off between estimation and control," in IEEE Transactions on Systems, Man and Cybernetics, vol. 18(5), September 1991.
- (1991) IEEE Transactions on Systems, Man and Cybernetics , vol.18 , Issue.5
- Sato, M.¹ Abe, K.² Takeda, H.³

37
- 0028555752
- Learning to coordinate without sharing information
- Seattle, Washington
- S. Sen, M. Sekaran, and J. Hale. "Learning to coordinate without sharing information," in Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94), Seattle, Washington, pp. 426-431, 1994.
- (1994) Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI-94) , pp. 426-431
- Sen, S.¹ Sekaran, M.² Hale, J.³

38
- 0041410934
- Convergence results for single-step on-policy reinforcement-learning algorithms
- to appear
- S. Singh, T. Jaakkola, M. L. Littman, and C. Szepesvári. "Convergence results for single-step on-policy reinforcement-learning algorithms." Machine Learning Journal (to appear), 1998.
- (1998) Machine Learning Journal
- Singh, S.¹ Jaakkola, T.² Littman, M.L.³ Szepesvári, C.⁴

39
- 85132026293
- Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
- Morgan Kaufmann: San Mateo, CA
- R. S. Sutton. "Integrated architectures for learning, planning, and reacting based on approximating dynamic programming," in Proceedings of the Seventh International Conference on Machine Learning, Morgan Kaufmann: San Mateo, CA, pp. 216-224, 1990.
- (1990) Proceedings of the Seventh International Conference on Machine Learning , pp. 216-224
- Sutton, R.S.¹

40
- 0002210775
- The role of exploration in learning control
- David A. White and Donald Sofge, (Eds.), Multiscience Press Inc.
- S. B. Thrun. "The role of exploration in learning control," in David A. White and Donald Sofge, (Eds.), Handbook for Intelligent Control. Multiscience Press Inc.: 1992.
- (1992) Handbook for Intelligent Control
- Thrun, S.B.¹

41
- 34249833101
- Technical notes: Q-learning
- C. J. C. H. Watkins and P. Dayan. "Technical notes: Q-learning." Machine Learning, vol. 8 pp. 279-292, 1992.
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

42
- 0003782395
- Springer-Verlag
- G. Weiß and S. Sen. Adaptation and Learning in Multi-agent Systems, Lecture Notes in AI 1042. Springer-Verlag: 1996.
- (1996) Adaptation and Learning in Multi-agent Systems, Lecture Notes in AI 1042
- Weiß, G.¹ Sen, S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.