-
1
-
-
0020970738
-
Neuronlike adaptive elements that can solve difficult learning control problems
-
Barto, A., Sutton, R., & Anderson, C. (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13, 834-846.
-
(1983)
IEEE Transactions on Systems, Man, and Cybernetics
, vol.SMC-13
, pp. 834-846
-
-
Barto, A.1
Sutton, R.2
Anderson, C.3
-
3
-
-
0028447220
-
Decision-theoretic deliberation scheduling for problem solving in time-constrained environments
-
Boddy, M., & Dean, T. (1994). Decision-theoretic deliberation scheduling for problem solving in time-constrained environments, Artificial Intelligence, 67(2), 245-286.
-
(1994)
Artificial Intelligence
, vol.67
, Issue.2
, pp. 245-286
-
-
Boddy, M.1
Dean, T.2
-
5
-
-
0008458860
-
-
Improving elevator performance using reinforcement learning
-
Crites, R., & Barto, A. (1996). Improving elevator performance using reinforcement learning. In Advances in Neural Information Processing Systems 8, pp. 1017-1023.
-
(1996)
Advances in Neural Information Processing Systems
, vol.8
, pp. 1017-1023
-
-
Crites, R.1
Barto, A.2
-
6
-
-
84880655104
-
An analysis of time-dependent planning
-
Saint Paul, Minnesota, USA: AAAI Press/MIT Press
-
Dean, T., & Boddy, M. (1988). An analysis of time-dependent planning. In Proceedings of the seventh national conference on artificial intelligence (AAAI-88) (pp. 49-54). Saint Paul, Minnesota, USA: AAAI Press/MIT Press.
-
(1988)
Proceedings of the seventh national conference on artificial intelligence (AAAI-88)
, pp. 49-54
-
-
Dean, T.1
Boddy, M.2
-
7
-
-
0001700825
-
Taems: A framework for environment centered analysis and design of coordination mechanisms
-
G. O'Hare & N. Jennings, Eds, Wiley Inter-Science
-
Decker, K. (1996). Taems: a framework for environment centered analysis and design of coordination mechanisms. In G. O'Hare & N. Jennings, (Eds.), Foundations of Distributed Artificial Intelligence, Chapter 16 (pp. 429-448). Wiley Inter-Science.
-
(1996)
Foundations of Distributed Artificial Intelligence
, pp. 429-448
-
-
Decker, K.1
-
8
-
-
0002278788
-
Hierarchical reinforcement learning with the MAXQ value function decomposition
-
Dietterich, T. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13, 227-303.
-
(2000)
Journal of Artificial Intelligence Research
, vol.13
, pp. 227-303
-
-
Dietterich, T.1
-
9
-
-
9144222344
-
What is rational psychology? toward a modern mental philosophy
-
Doyle, J. (1983). What is rational psychology? toward a modern mental philosophy. AI Magazine, 4(3), 50-53.
-
(1983)
AI Magazine
, vol.4
, Issue.3
, pp. 50-53
-
-
Doyle, J.1
-
10
-
-
0027701502
-
Design-to-time real-time scheduling
-
Garvey, A. & Lesser, V. (1993). Design-to-time real-time scheduling. IEEE Transactions on Systems, Man, and Cybernetics, 23(6), 1491-1502.
-
(1993)
IEEE Transactions on Systems, Man, and Cybernetics
, vol.23
, Issue.6
, pp. 1491-1502
-
-
Garvey, A.1
Lesser, V.2
-
12
-
-
33747182896
-
Managing online self-adaptation in real-time environments
-
Goldman, R., Musliner, D. & Krebsbach, K. (2003). Managing online self-adaptation in real-time environments. In LNCS, vol. 2614, Springer-Verlag, pp. 6-23.
-
(2003)
LNCS
, vol.2614
, pp. 6-23
-
-
Goldman, R.1
Musliner, D.2
Krebsbach, K.3
-
13
-
-
0003106875
-
Twenty-seven principles of rationality
-
V. P. Godambe & D. A. Sprott, Eds, Toronto: Holt Rinehart Wilson
-
Good, I. J. (1971). Twenty-seven principles of rationality. In V. P. Godambe & D. A. Sprott, (Eds.), Foundations of statistical inference (pp. 108-141). Toronto: Holt Rinehart Wilson.
-
(1971)
Foundations of statistical inference
, pp. 108-141
-
-
Good, I.J.1
-
14
-
-
0004867057
-
Monitoring anytime algorithms
-
Hansen, E. & Zilberstein, S. (1996). Monitoring anytime algorithms. SIGART Bulletin, 7(2), 28-33.
-
(1996)
SIGART Bulletin
, vol.7
, Issue.2
, pp. 28-33
-
-
Hansen, E.1
Zilberstein, S.2
-
16
-
-
0027694654
-
-
Opportunistic control of action in intelligent agents
-
Hayes-Roth, B. (1993). Opportunistic control of action in intelligent agents. IEEE Transactions on Systems, Man, and Cybernetics, 23(6), 1575-1587.
-
(1993)
IEEE Transactions on Systems, Man, and Cybernetics
, vol.23
, Issue.6
, pp. 1575-1587
-
-
Hayes-Roth, B.1
-
17
-
-
0028579426
-
Guardian: A prototype intelligent agent for intensive-care monitoring
-
Hayes-Roth, B., Uckun, S., Larsson, J. E., Gaba, D., Barr, J. & Chien, J. (1994). Guardian: A prototype intelligent agent for intensive-care monitoring. In Proceedings of the national conference on artificial intelligence, pp. 1503-1511.
-
(1994)
Proceedings of the national conference on artificial intelligence
, pp. 1503-1511
-
-
Hayes-Roth, B.1
Uckun, S.2
Larsson, J.E.3
Gaba, D.4
Barr, J.5
Chien, J.6
-
18
-
-
14744296052
-
Multi-agent system simulation framework
-
Switzerland: EPFL, Lausanne
-
Horling, B., Lesser, V. & Vincent, R. (2000). Multi-agent system simulation framework. In sixteenth IMACS World Congress 2000 on scientific computation, applied mathematics and simulation. Switzerland: EPFL, Lausanne.
-
(2000)
sixteenth IMACS World Congress 2000 on scientific computation, applied mathematics and simulation
-
-
Horling, B.1
Lesser, V.2
Vincent, R.3
-
19
-
-
31344442254
-
The soft real-time agent control architecture
-
Horling, B., Lesser, V., Vincent, R. & Wagner, T. (2006). The soft real-time agent control architecture. Autonomous Agents and Multi-Agent Systems, 12(1), 35-92.
-
(2006)
Autonomous Agents and Multi-Agent Systems
, vol.12
, Issue.1
, pp. 35-92
-
-
Horling, B.1
Lesser, V.2
Vincent, R.3
Wagner, T.4
-
22
-
-
0032114497
-
Learning communication strategies in multiagent systems
-
Kinney, M. & Tsatsoulis, C. (1998). Learning communication strategies in multiagent systems. Applied intelligence, 9(1), 71-91.
-
(1998)
Applied intelligence
, vol.9
, Issue.1
, pp. 71-91
-
-
Kinney, M.1
Tsatsoulis, C.2
-
25
-
-
0343048727
-
A distributed reinforcement learning scheme for network routing
-
Technical Report CS-93-165
-
Littman, M. & Boyan, J. (1993). A distributed reinforcement learning scheme for network routing. Technical Report CS-93-165.
-
(1993)
-
-
Littman, M.1
Boyan, J.2
-
27
-
-
0030647149
-
Reinforcement learning in the multi-robot domain
-
Mataric, M. (1997). Reinforcement learning in the multi-robot domain. Autonomous Robots, 4(1), 73-83.
-
(1997)
Autonomous Robots
, vol.4
, Issue.1
, pp. 73-83
-
-
Mataric, M.1
-
28
-
-
0029220270
-
The challenges of real-time AI
-
Musliner, D. J., Hendler, J. A., Agrawala, A. K., Durfee, E. H., Strosnider, J. K. & Paul, C. J. (1995). The challenges of real-time AI. IEEE Computer, 28(1), 58-66.
-
(1995)
IEEE Computer
, vol.28
, Issue.1
, pp. 58-66
-
-
Musliner, D.J.1
Hendler, J.A.2
Agrawala, A.K.3
Durfee, E.H.4
Strosnider, J.K.5
Paul, C.J.6
-
31
-
-
0001070375
-
Reinforcement learning with hierarchies of machines
-
M. I. Jordan, M. J. Kearns, & S. A. Solla Eds, The MIT Press
-
Parr, R. & Russell, S. (1997). Reinforcement learning with hierarchies of machines. In M. I. Jordan, M. J. Kearns, & S. A. Solla (Eds.), Advances in neural information processing systems, vol. 10, The MIT Press.
-
(1997)
Advances in neural information processing systems
, vol.10
-
-
Parr, R.1
Russell, S.2
-
33
-
-
3543071954
-
-
PhD thesis, University of Massachusetts at Amherst, Amherst, Massachusetts
-
Raja, A. (2003). Meta-level control in multi-agent systems. PhD thesis, University of Massachusetts at Amherst, Amherst, Massachusetts.
-
(2003)
Meta-level control in multi-agent systems
-
-
Raja, A.1
-
35
-
-
0033700750
-
Toward Robust Agent Control in Open Environments
-
Barcelona, Catalonia, Spain: ACM Press
-
Raja, A., Lesser, V., & Wagner, T. (2000). Toward Robust Agent Control in Open Environments. In Proceedings of the fourth international conference on autonomous agents (pp. 84-91). Barcelona, Catalonia, Spain: ACM Press.
-
(2000)
Proceedings of the fourth international conference on autonomous agents
, pp. 84-91
-
-
Raja, A.1
Lesser, V.2
Wagner, T.3
-
40
-
-
0030050933
-
Multiagent reinforcement learning in the iterated prisoner's dilemma
-
Sandholm, T. & Crites, R. (1995). Multiagent reinforcement learning in the iterated prisoner's dilemma. Biosystems Journal, 37, 147-166.
-
(1995)
Biosystems Journal
, vol.37
, pp. 147-166
-
-
Sandholm, T.1
Crites, R.2
-
41
-
-
0010271689
-
The control of reasoning in resource-bounded agents
-
Schut, M. & Wooldridge, M. (2001). The control of reasoning in resource-bounded agents. Knowledge Engineering Review, 16(3), 215-240.
-
(2001)
Knowledge Engineering Review
, vol.16
, Issue.3
, pp. 215-240
-
-
Schut, M.1
Wooldridge, M.2
-
42
-
-
0028555752
-
Learning to coordinate without sharing information
-
Seattle, WA
-
Sen, S., Sekaran, M. & Hale, J. (1994). Learning to coordinate without sharing information. In Proceedings of the twelfth national conference on artificial intelligence, (pp. 426-431), Seattle, WA.
-
(1994)
Proceedings of the twelfth national conference on artificial intelligence
, pp. 426-431
-
-
Sen, S.1
Sekaran, M.2
Hale, J.3
-
43
-
-
0002298346
-
From substantive to procedural rationality
-
S. J. Latsis, Ed, Cambridge University Press
-
Simon, H. (1976). From substantive to procedural rationality. In S. J. Latsis (Ed.), Method and Appraisal in Economics (pp. 129-148). Cambridge University Press.
-
(1976)
Method and Appraisal in Economics
, pp. 129-148
-
-
Simon, H.1
-
44
-
-
0016556911
-
Optimal problem solving search: All-or-none solutions
-
Simon, H. & Kadane, J. (1974). Optimal problem solving search: All-or-none solutions. Artificial Intelligence, 6, 235-247.
-
(1974)
Artificial Intelligence
, vol.6
, pp. 235-247
-
-
Simon, H.1
Kadane, J.2
-
46
-
-
85158142417
-
Empirical evaluation of a reinforcement learning spoken dialogue system
-
Singh, S., Kearns, M., Litman, D. & Walker, M. (2000). Empirical evaluation of a reinforcement learning spoken dialogue system. In Proceedings of the seventeenth national conference on artificial intelligence, pp. 645-651.
-
(2000)
Proceedings of the seventeenth national conference on artificial intelligence
, pp. 645-651
-
-
Singh, S.1
Kearns, M.2
Litman, D.3
Walker, M.4
-
50
-
-
33847202724
-
Learning to predict by the method of temporal differences
-
Sutton, R. (1988). Learning to predict by the method of temporal differences. Machine Learning, 3(1), 9-44.
-
(1988)
Machine Learning
, vol.3
, Issue.1
, pp. 9-44
-
-
Sutton, R.1
-
51
-
-
0033170372
-
Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
-
Sutton, R., Precup, D. & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1-2), 181-211.
-
(1999)
Artificial Intelligence
, vol.112
, Issue.1-2
, pp. 181-211
-
-
Sutton, R.1
Precup, D.2
Singh, S.3
-
53
-
-
84937406535
-
An agent infrastructure to build and evaluate multi-agent systems: The java agent framework and multi-agent system simulator
-
Wagner and Rana Eds, Springer
-
Vincent, R., Horling, B. & Lesser, V. (2001). An agent infrastructure to build and evaluate multi-agent systems: The java agent framework and multi-agent system simulator. In Wagner and Rana (Eds.), Lecture notes in artificial intelligence: infrastructure for agents, multi-agent systems, and scalable multi-agent systems, vol. 1887. Springer.
-
(2001)
Lecture notes in artificial intelligence: Infrastructure for agents, multi-agent systems, and scalable multi-agent systems
, vol.1887
-
-
Vincent, R.1
Horling, B.2
Lesser, V.3
-
54
-
-
0032116206
-
Criteria-directed heuristic task scheduling
-
A version also available as UMASS CS TR-97-59
-
Wagner, T., Garvey, A. & Lesser, V. (1998). Criteria-directed heuristic task scheduling. International Journal of Approximate Reasoning, Special Issue on Scheduling, 19(1-2), 91-118. A version also available as UMASS CS TR-97-59.
-
(1998)
International Journal of Approximate Reasoning, Special Issue on Scheduling
, vol.19
, Issue.1-2
, pp. 91-118
-
-
Wagner, T.1
Garvey, A.2
Lesser, V.3
-
56
-
-
0002557085
-
Learning to perceive and act by trial and error
-
Whitehead, S. D. & Ballard, D. H. (1991). Learning to perceive and act by trial and error. Machine Learning, 7(1), 45-83.
-
(1991)
Machine Learning
, vol.7
, Issue.1
, pp. 45-83
-
-
Whitehead, S.D.1
Ballard, D.H.2
-
59
-
-
0026986776
-
-
Efficient resource-bounded reasoning in AT-RALPH
-
Zilberstein, S. & Russell, S. J. (1992). Efficient resource-bounded reasoning in AT-RALPH. In James Hendler (Ed.), Proceedings of the first international conference on artificial intelligence planning systems (AIPS 92) (pp. 260-268). College Park, Maryland, USA: Morgan Kaufmann.
-
(1992)
Proceedings of the first international conference on artificial intelligence planning systems (AIPS 92)
, pp. 260-268
-
-
Zilberstein, S.1
Russell, S.J.2
-
60
-
-
0030122886
-
Optimal composition of real-time systems
-
Zilberstein, S. & Russell, S. J. (1996). Optimal composition of real-time systems. Artificial Intelligence, 82(1-2), 181-213.
-
(1996)
Artificial Intelligence
, vol.82
, Issue.1-2
, pp. 181-213
-
-
Zilberstein, S.1
Russell, S.J.2
-
61
-
-
1942484421
-
Online convex programming and generalized infinitesimal gradient ascent
-
Zinkevich, M. (2003). Online convex programming and generalized infinitesimal gradient ascent. International Conference on Machine Learning, pp. 929-936.
-
(2003)
International Conference on Machine Learning
, pp. 929-936
-
-
Zinkevich, M.1