-
2
-
-
0030149709
-
Purposive behavior acquisition for a real robot by vision-based reinforcement learning
-
M. Asada, S. Noda, S. Tawaratsumida, K. Hosoda, Purposive behavior acquisition for a real robot by vision-based reinforcement learning, Machine Learning 23 (1996) 279-303.
-
(1996)
Machine Learning
, vol.23
, pp. 279-303
-
-
Asada, M.1
Noda, S.2
Tawaratsumida, S.3
Hosoda, K.4
-
4
-
-
84880685295
-
Prioritized goal decomposition of markov decision processes: Toward a synthesis of classical and decision theoretic planning
-
Nagoya, Japan
-
C. Boutilier, R.I. Brafman, C. Geib, Prioritized goal decomposition of markov decision processes: Toward a synthesis of classical and decision theoretic planning, in: Proc. IJCAI-97, Nagoya, Japan, 1997, pp. 1162-1165.
-
(1997)
Proc. IJCAI-97
, pp. 1162-1165
-
-
Boutilier, C.1
Brafman, R.I.2
Geib, C.3
-
5
-
-
85150714688
-
Reinforcement learning methods for continuous-time markov decision problems
-
MIT Press, Cambridge, MA
-
S.J. Bradtke, M.O. Duff, Reinforcement learning methods for continuous-time markov decision problems, in: Advances in Neural Information Processing Systems 7, MIT Press, Cambridge, MA, 1995, pp. 393-400.
-
(1995)
Advances in Neural Information Processing Systems
, vol.7
, pp. 393-400
-
-
Bradtke, S.J.1
Duff, M.O.2
-
6
-
-
0031185898
-
Modeling agents as qualitative decision makers
-
R.I. Brafman, M. Tennenholtz, Modeling agents as qualitative decision makers, Artificial Intelligence 94 (1) (1997) 217-268.
-
(1997)
Artificial Intelligence
, vol.94
, Issue.1
, pp. 217-268
-
-
Brafman, R.I.1
Tennenholtz, M.2
-
7
-
-
0003106852
-
Hybrid models for motion control systems
-
Birkhäuser, Boston, MA
-
R.W. Brockett, Hybrid models for motion control systems, in: Essays in Control: Perspectives in the Theory and its Applications, Birkhäuser, Boston, MA, 1993, pp. 29-53.
-
(1993)
Essays in Control: Perspectives in the Theory and its Applications
, pp. 29-53
-
-
Brockett, R.W.1
-
8
-
-
0006493602
-
Reasoning about probabilistic actions at multiple levels of granularity
-
Stanford University
-
L. Chrisman, Reasoning about probabilistic actions at multiple levels of granularity, in: Proc. AAAI Spring Symposium: Decision-Theoretic Planning, Stanford University, 1994.
-
(1994)
Proc. AAAI Spring Symposium: Decision-Theoretic Planning
-
-
Chrisman, L.1
-
9
-
-
0030167564
-
Behavior analysis and training: A methodology for behavior engineering
-
M. Colombetti, M. Dorigo, G. Borghi, Behavior analysis and training: A methodology for behavior engineering, IEEE Trans. Systems Man Cybernet. Part B 26 (3) (1996) 365-380.
-
(1996)
IEEE Trans. Systems Man Cybernet. Part B
, vol.26
, Issue.3
, pp. 365-380
-
-
Colombetti, M.1
Dorigo, M.2
Borghi, G.3
-
10
-
-
85156187730
-
Improving elevator performance using reinforcement learning
-
MIT Press, Cambridge, MA
-
R.H. Crites, A.G. Barto, Improving elevator performance using reinforcement learning, in: Advances in Neural Information Processing Systems 8, MIT Press, Cambridge, MA, 1996, pp. 1017-1023.
-
(1996)
Advances in Neural Information Processing Systems
, vol.8
, pp. 1017-1023
-
-
Crites, R.H.1
Barto, A.G.2
-
11
-
-
0001158047
-
Improving generalization for temporal difference learning: The successor representation
-
P. Dayan, Improving generalization for temporal difference learning: The successor representation, Neural Computation 5 (1993) 613-624.
-
(1993)
Neural Computation
, vol.5
, pp. 613-624
-
-
Dayan, P.1
-
12
-
-
0001234682
-
Feudal reinforcement learning
-
Morgan Kaufmann, San Mateo, CA
-
P. Dayan, G.E. Hinton, Feudal reinforcement learning, in: Advances in Neural Information Processing Systems 5, Morgan Kaufmann, San Mateo, CA, 1993, pp. 271-278.
-
(1993)
Advances in Neural Information Processing Systems
, vol.5
, pp. 271-278
-
-
Dayan, P.1
Hinton, G.E.2
-
13
-
-
0029332887
-
Planning under time constraints in stochastic domains
-
T. Dean, L.P. Kaelbling, J. Kirman, A. Nicholson, Planning under time constraints in stochastic domains, Artificial Intelligence 76 (1-2) (1995) 35-74.
-
(1995)
Artificial Intelligence
, vol.76
, Issue.1-2
, pp. 35-74
-
-
Dean, T.1
Kaelbling, L.P.2
Kirman, J.3
Nicholson, A.4
-
14
-
-
85168151397
-
Decomposition techniques for planning in stochastic domains
-
Montreal, Quebec, Morgan Kaufmann, San Mateo, CA, See also Technical Report CS-95-10, Brown University, Department of Computer Science, 1995
-
T. Dean, S.-H. Lin, Decomposition techniques for planning in stochastic domains, in: Proc. IJCAI-95, Montreal, Quebec, Morgan Kaufmann, San Mateo, CA, 1995, pp. 1121-1127. See also Technical Report CS-95-10, Brown University, Department of Computer Science, 1995.
-
(1995)
Proc. IJCAI-95
, pp. 1121-1127
-
-
Dean, T.1
Lin, S.-H.2
-
15
-
-
0028317777
-
Learning to plan in continuous domains
-
G.F. DeJong, Learning to plan in continuous domains, Artificial Intelligence 65 (1994) 71-141.
-
(1994)
Artificial Intelligence
, vol.65
, pp. 71-141
-
-
DeJong, G.F.1
-
16
-
-
0001806701
-
The MAXQ method for hierarchical reinforcement learning
-
Morgan Kaufmann, San Mateo, CA
-
T.G. Dietterich, The MAXQ method for hierarchical reinforcement learning, in: Machine Learning: Proc. 15th International Conference, Morgan Kaufmann, San Mateo, CA, 1998, pp. 118-126.
-
(1998)
Machine Learning: Proc. 15th International Conference
, pp. 118-126
-
-
Dietterich, T.G.1
-
17
-
-
0028739953
-
Robot shaping: Developing autonomous agents through learning
-
M. Dorigo, M. Colombetti, Robot shaping: Developing autonomous agents through learning, Artificial Intelligence 71 (1994) 321-370.
-
(1994)
Artificial Intelligence
, vol.71
, pp. 321-370
-
-
Dorigo, M.1
Colombetti, M.2
-
19
-
-
26844577989
-
Composing functions to speed up reinforcement learning in a changing world
-
Springer, Berlin
-
C. Drummond, Composing functions to speed up reinforcement learning in a changing world, in: Proc. 10th European Conference on Machine Learning, Springer, Berlin, 1998.
-
(1998)
Proc. 10th European Conference on Machine Learning
-
-
Drummond, C.1
-
20
-
-
85158051593
-
Why PRODIGY/EBL works
-
Boston, MA, MIT Press, Cambridge, MA
-
O. Etzioni, Why PRODIGY/EBL works, in: Proc. AAAI-90, Boston, MA, MIT Press, Cambridge, MA, 1990, pp. 916-922.
-
(1990)
Proc. AAAI-90
, pp. 916-922
-
-
Etzioni, O.1
-
23
-
-
0030389008
-
A statistical approach to adaptive problem solving
-
J. Gratch, G. DeJong, A statistical approach to adaptive problem solving, Artificial Intelligence 88 (1-2) (1996) 101-161.
-
(1996)
Artificial Intelligence
, vol.88
, Issue.1-2
, pp. 101-161
-
-
Gratch, J.1
DeJong, G.2
-
24
-
-
0026961480
-
A statistical approach to solving the EBL utility problem
-
San Jose, CA
-
R. Greiner, I. Jurisica, A statistical approach to solving the EBL utility problem, in: Proc. AAAI-92, San Jose, CA, 1992, pp. 241-248.
-
(1992)
Proc. AAAI-92
, pp. 241-248
-
-
Greiner, R.1
Jurisica, I.2
-
25
-
-
0004242478
-
-
Springer, New York
-
R.L. Grossman, A. Nerode, A.P. Ravn, H. Rischel, Hybrid Systems, Springer, New York, 1993.
-
(1993)
Hybrid Systems
-
-
Grossman, R.L.1
Nerode, A.2
Ravn, A.P.3
Rischel, H.4
-
26
-
-
0006419533
-
Hierarchical solution of Markov decision processes using macro-actions
-
M. Hauskrecht, N. Meuleau, C. Boutilier, L.P. Kaelbling, T. Dean, Hierarchical solution of Markov decision processes using macro-actions, in: Uncertainty in Artificial Intelligence: Proc. 14th Conference, 1998, pp. 220-229.
-
(1998)
Uncertainty in Artificial Intelligence: Proc. 14th Conference
, pp. 220-229
-
-
Hauskrecht, M.1
Meuleau, N.2
Boutilier, C.3
Kaelbling, L.P.4
Dean, T.5
-
28
-
-
0031343489
-
A feedback control structure for on-line learning tasks
-
M. Huber, R.A. Grupen, A feedback control structure for on-line learning tasks, Robotics and Autonomous Systems 22 (3-4) (1997) 303-315.
-
(1997)
Robotics and Autonomous Systems
, vol.22
, Issue.3-4
, pp. 303-315
-
-
Huber, M.1
Grupen, R.A.2
-
29
-
-
0000148778
-
A heuristic approach to the discovery of macro-operators
-
G.A. Iba, A heuristic approach to the discovery of macro-operators, Machine Learning 3 (1989) 285-317.
-
(1989)
Machine Learning
, vol.3
, pp. 285-317
-
-
Iba, G.A.1
-
30
-
-
0000439891
-
On the convergence of stochastic iterative dynamic programming algorithms
-
T. Jaakkola, M.I. Jordan, S. Singh, On the convergence of stochastic iterative dynamic programming algorithms, Neural Computation 6 (6) (1994) 1185-1201.
-
(1994)
Neural Computation
, vol.6
, Issue.6
, pp. 1185-1201
-
-
Jaakkola, T.1
Jordan, M.I.2
Singh, S.3
-
31
-
-
85143168613
-
Hierarchical learning in stochastic domains: Preliminary results
-
Morgan Kaufmann, San Mateo, CA
-
L.P. Kaelbling, Hierarchical learning in stochastic domains: Preliminary results, in: Proc. 10th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993, pp. 167-173.
-
(1993)
Proc. 10th International Conference on Machine Learning
, pp. 167-173
-
-
Kaelbling, L.P.1
-
32
-
-
0032045145
-
Module based reinforcement learning: Experiments with a real robot
-
and Autonomous Robots 5 (1998) 273-295 (special joint issue)
-
Zs. Kalmár, Cs. Szepesvári, A. Lörincz, Module based reinforcement learning: Experiments with a real robot, Machine Learning 31 (1998) 55-85 and Autonomous Robots 5 (1998) 273-295 (special joint issue).
-
(1998)
Machine Learning
, vol.31
, pp. 55-85
-
-
Kalmár, Zs.1
Szepesvári, Cs.2
Lörincz, A.3
-
33
-
-
0021577685
-
A qualitative physics based on confluences
-
J. de Kleer, J.S. Brown, A qualitative physics based on confluences, Artificial Intelligence 24 (1-3) (1984) 7-83.
-
(1984)
Artificial Intelligence
, vol.24
, Issue.1-3
, pp. 7-83
-
-
De Kleer, J.1
Brown, J.S.2
-
35
-
-
0026961481
-
Automatic programming of robots using genetic programming
-
San Jose, CA
-
J.R. Koza, J.P. Rice, Automatic programming of robots using genetic programming, in: Proc. AAAI-92, San Jose, CA, 1992, pp. 194-201.
-
(1992)
Proc. AAAI-92
, pp. 194-201
-
-
Koza, J.R.1
Rice, J.P.2
-
36
-
-
0006502449
-
Commonsense knowledge of space: Learning from experience
-
Tokyo, Japan
-
B.J. Kuipers, Commonsense knowledge of space: Learning from experience, in: Proc. IJCAI-79, Tokyo, Japan, 1979, pp. 499-501.
-
(1979)
Proc. IJCAI-79
, pp. 499-501
-
-
Kuipers, B.J.1
-
37
-
-
0002982589
-
Chunking in SOAR: The anatomy of a general learning mechanism
-
J.E. Laird, P.S. Rosenbloom, A. Newell, Chunking in SOAR: The anatomy of a general learning mechanism, Machine Learning 1 (1986) 11-46.
-
(1986)
Machine Learning
, vol.1
, pp. 11-46
-
-
Laird, J.E.1
Rosenbloom, P.S.2
Newell, A.3
-
38
-
-
0003673017
-
Reinforcement learning for robots using neural networks
-
Ph.D. Thesis, Carnegie Mellon University
-
L.-J. Lin, Reinforcement learning for robots using neural networks, Ph.D. Thesis, Carnegie Mellon University, Technical Report CMU-CS-93-103, 1993.
-
(1993)
Technical Report CMU-CS-93-103
-
-
Lin, L.-J.1
-
39
-
-
84976813028
-
Learning to coordinate behaviors
-
Boston, MA
-
P. Maes, R. Brooks, Learning to coordinate behaviors, in: Proc. AAAI-90, Boston, MA, 1990, pp. 796-802.
-
(1990)
Proc. AAAI-90
, pp. 796-802
-
-
Maes, P.1
Brooks, R.2
-
40
-
-
0026880130
-
Automatic programming of behavior-based robots using reinforcement learning
-
S. Mahadevan, J. Connell, Automatic programming of behavior-based robots using reinforcement learning, Artificial Intelligence 55 (2-3) (1992) 311-365.
-
(1992)
Artificial Intelligence
, vol.55
, Issue.2-3
, pp. 311-365
-
-
Mahadevan, S.1
Connell, J.2
-
41
-
-
0001963197
-
Self-improving factory simulation using continuous-time average-reward reinforcement learning
-
S. Mahadevan, N. Marchalleck, T. Das, A. Gosavi, Self-improving factory simulation using continuous-time average-reward reinforcement learning, in: Proc. 14th International Conference on Machine Learning, 1997, pp. 202-210.
-
(1997)
Proc. 14th International Conference on Machine Learning
, pp. 202-210
-
-
Mahadevan, S.1
Marchalleck, N.2
Das, T.3
Gosavi, A.4
-
42
-
-
84898959706
-
Reinforcement learning for call admission control and routing in integrated service networks
-
Morgan Kaufmann, San Mateo, CA
-
P. Marbach, O. Mihatsch, M. Schulte, J.N. Tsitsiklis, Reinforcement learning for call admission control and routing in integrated service networks, in: Advances in Neural Information Processing Systems 10, Morgan Kaufmann, San Mateo, CA, 1998, pp. 922-928.
-
(1998)
Advances in Neural Information Processing Systems
, vol.10
, pp. 922-928
-
-
Marbach, P.1
Mihatsch, O.2
Schulte, M.3
Tsitsiklis, J.N.4
-
43
-
-
0031504223
-
Behavior-based control: Examples from navigation, learning, and group behavior
-
M.J. Mataric, Behavior-based control: Examples from navigation, learning, and group behavior, J. Experiment. Theoret. Artificial Intelligence 9 (2-3) (1997) 323-336.
-
(1997)
J. Experiment. Theoret. Artificial Intelligence
, vol.9
, Issue.2-3
, pp. 323-336
-
-
Mataric, M.J.1
-
44
-
-
0003543129
-
Macro-actions in reinforcement learning: An empirical analysis
-
University of Massachusetts, Department of Computer Science
-
A. McGovern, R.S. Sutton, Macro-actions in reinforcement learning: An empirical analysis, Technical Report 98-70, University of Massachusetts, Department of Computer Science, 1998.
-
(1998)
Technical Report 98-70
-
-
McGovern, A.1
Sutton, R.S.2
-
45
-
-
0031632806
-
Solving very large weakly coupled Markov decision processes
-
Madison, WI
-
N. Meuleau, M. Hauskrecht, K.-E. Kim, L. Peshkin, L.P. Kaelbling, T. Dean, C. Boutilier, Solving very large weakly coupled Markov decision processes, in: Proc. AAAI-98, Madison, WI, 1998, pp. 165-172.
-
(1998)
Proc. AAAI-98
, pp. 165-172
-
-
Meuleau, N.1
Hauskrecht, M.2
Kim, K.-E.3
Peshkin, L.4
Kaelbling, L.P.5
Dean, T.6
Boutilier, C.7
-
47
-
-
0025398889
-
Quantitative results concerning the utility of explanation-based learning
-
S. Minton, Quantitative results concerning the utility of explanation-based learning, Artificial Intelligence 42 (2-3) (1990) 363-391.
-
(1990)
Artificial Intelligence
, vol.42
, Issue.2-3
, pp. 363-391
-
-
Minton, S.1
-
48
-
-
0006488247
-
The parti-game algorithm for variable resolution reinforcement learning in multidimensional spaces
-
MIT Press, Cambridge, MA
-
A.W. Moore, The parti-game algorithm for variable resolution reinforcement learning in multidimensional spaces, in: Advances in Neural Information Processing Systems 6, MIT Press, Cambridge, MA, 1994, pp. 711-718.
-
(1994)
Advances in Neural Information Processing Systems
, vol.6
, pp. 711-718
-
-
Moore, A.W.1
-
49
-
-
0003430412
-
-
Prentice-Hall, Englewood Cliffs, NJ
-
A. Newell, H.A. Simon, Human Problem Solving, Prentice-Hall, Englewood Cliffs, NJ, 1972.
-
(1972)
Human Problem Solving
-
-
Newell, A.1
Simon, H.A.2
-
50
-
-
84899001559
-
A Q-learning based dynamic channel assignment technique for mobile communication systems
-
to appear
-
J. Nie, S. Haykin, A Q-learning based dynamic channel assignment technique for mobile communication systems, IEEE Transactions on Vehicular Technology, to appear.
-
IEEE Transactions on Vehicular Technology
-
-
Nie, J.1
Haykin, S.2
-
51
-
-
0027652475
-
Teleo-reactive programs for agent control
-
N. Nilsson, Teleo-reactive programs for agent control, J. Artificial Intelligence Res. 1 (1994) 139-158.
-
(1994)
J. Artificial Intelligence Res.
, vol.1
, pp. 139-158
-
-
Nilsson, N.1
-
53
-
-
84898956770
-
Reinforcement learning with hierarchies of machines
-
MIT Press, Cambridge, MA
-
R. Parr, S. Russell, Reinforcement learning with hierarchies of machines, in: Advances in Neural Information Processing Systems 10, MIT Press, Cambridge, MA, 1998, pp. 1043-1049.
-
(1998)
Advances in Neural Information Processing Systems
, vol.10
, pp. 1043-1049
-
-
Parr, R.1
Russell, S.2
-
55
-
-
84899003140
-
Multi-time models for temporally abstract planning
-
MIT Press, Cambridge, MA
-
D. Precup, R.S. Sutton, Multi-time models for temporally abstract planning, in: Advances in Neural Information Processing Systems 10, MIT Press, Cambridge, MA, 1998, pp. 1050-1056.
-
(1998)
Advances in Neural Information Processing Systems
, vol.10
, pp. 1050-1056
-
-
Precup, D.1
Sutton, R.S.2
-
56
-
-
0006419257
-
Planning with closed-loop macro actions
-
D. Precup, R.S. Sutton, S.P. Singh, Planning with closed-loop macro actions, in: Working Notes 1997 AAAI Fall Symposium on Model-directed Autonomous Systems, 1997, pp. 70-76.
-
(1997)
Working Notes 1997 AAAI Fall Symposium on Model-directed Autonomous Systems
, pp. 70-76
-
-
Precup, D.1
Sutton, R.S.2
Singh, S.P.3
-
57
-
-
0002955348
-
Theoretical results on reinforcement learning with temporally abstract options
-
Springer, Berlin
-
D. Precup, R.S. Sutton, S.P. Singh, Theoretical results on reinforcement learning with temporally abstract options, in: Proc. 10th European Conference on Machine Learning, Springer, Berlin, 1998.
-
(1998)
Proc. 10th European Conference on Machine Learning
-
-
Precup, D.1
Sutton, R.S.2
Singh, S.P.3
-
59
-
-
10844252596
-
Incremental development of complex behaviors through automatic construction of sensory-motor hierarchies
-
Morgan Kaufmann, San Mateo, CA
-
M. Ring, Incremental development of complex behaviors through automatic construction of sensory-motor hierarchies, in: Proc. 8th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, CA, 1991, pp. 343-347.
-
(1991)
Proc. 8th International Conference on Machine Learning
, pp. 343-347
-
-
Ring, M.1
-
60
-
-
0016069798
-
Planning in a hierarchy of abstraction spaces
-
E.D. Sacerdoti, Planning in a hierarchy of abstraction spaces, Artificial Intelligence 5 (1974) 115-135.
-
(1974)
Artificial Intelligence
, vol.5
, pp. 115-135
-
-
Sacerdoti, E.D.1
-
62
-
-
0030145238
-
Qualitative system identification: Deriving structure from behavior
-
A.C.C. Say, S. Kuru, Qualitative system identification: Deriving structure from behavior, Artificial Intelligence 83 (1) (1996) 75-141.
-
(1996)
Artificial Intelligence
, vol.83
, Issue.1
, pp. 75-141
-
-
Say, A.C.C.1
Kuru, S.2
-
63
-
-
0006459160
-
-
Technische Universität München, TR FKI-148-91
-
J. Schmidhuber, Neural Sequence Chunkers, Technische Universität München, TR FKI-148-91, 1991.
-
(1991)
Neural Sequence Chunkers
-
-
Schmidhuber, J.1
-
64
-
-
0005610003
-
Probabilistic robot navigation in partially observable environments
-
Montreal, Quebec, Morgan Kaufmann, San Mateo, CA
-
R. Simmons, S. Koenig, Probabilistic robot navigation in partially observable environments, in: Proc. IJCAI-95, Montreal, Quebec, Morgan Kaufmann, San Mateo, CA, 1995, pp. 1080-1087.
-
(1995)
Proc. IJCAI-95
, pp. 1080-1087
-
-
Simmons, R.1
Koenig, S.2
-
65
-
-
0026962175
-
Reinforcement learning with a hierarchy of abstract models
-
San Jose, CA, MIT/AAAI Press, Cambridge, MA
-
S.P. Singh, Reinforcement learning with a hierarchy of abstract models, in: Proc. AAAI-92, San Jose, CA, MIT/AAAI Press, Cambridge, MA, 1992, pp. 202-207.
-
(1992)
Proc. AAAI-92
, pp. 202-207
-
-
Singh, S.P.1
-
66
-
-
0002876837
-
Scaling reinforcement learning by learning variable temporal resolution models
-
Morgan Kaufmann, San Mateo, CA
-
S.P. Singh, Scaling reinforcement learning by learning variable temporal resolution models, in: Proc. 9th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, CA, 1992, pp. 406-415.
-
(1992)
Proc. 9th International Conference on Machine Learning
, pp. 406-415
-
-
Singh, S.P.1
-
67
-
-
0001652790
-
The efficient learning of multiple task sequences
-
Morgan Kaufmann, San Mateo, CA
-
S.P. Singh, The efficient learning of multiple task sequences, in: Advances in Neural Information Processing Systems 4, Morgan Kaufmann, San Mateo, CA, 1992, pp. 251-258.
-
(1992)
Advances in Neural Information Processing Systems
, vol.4
, pp. 251-258
-
-
Singh, S.P.1
-
68
-
-
0001027894
-
Transfer of learning by composing solutions of elemental sequential tasks
-
S.P. Singh, Transfer of learning by composing solutions of elemental sequential tasks, Machine Learning 8 (3/4) (1992) 323-340.
-
(1992)
Machine Learning
, vol.8
, Issue.3-4
, pp. 323-340
-
-
Singh, S.P.1
-
69
-
-
0006488248
-
Robust reinforcement learning in motion planning
-
Morgan Kaufmann, San Mateo, CA
-
S.P. Singh, A.G. Barto, R.A. Grupen, C.I. Connolly, Robust reinforcement learning in motion planning, in: Advances in Neural Information Processing Systems 6, Morgan Kaufmann, San Mateo, CA, 1994, pp. 655-662.
-
(1994)
Advances in Neural Information Processing Systems
, vol.6
, pp. 655-662
-
-
Singh, S.P.1
Barto, A.G.2
Grupen, R.A.3
Connolly, C.I.4
-
70
-
-
84898972974
-
Reinforcement learning for dynamic channel allocation in cellular telephone systems
-
MIT Press, Cambridge, MA
-
S.P. Singh, D. Bertsekas, Reinforcement learning for dynamic channel allocation in cellular telephone systems, in: Advances in Neural Information Processing Systems 9, MIT Press, Cambridge, MA, 1997, pp. 974-980.
-
(1997)
Advances in Neural Information Processing Systems
, vol.9
, pp. 974-980
-
-
Singh, S.P.1
Bertsekas, D.2
-
71
-
-
84922015064
-
TD models: Modeling the world at a mixture of time scales
-
Morgan Kaufmann, San Mateo, CA
-
R.S. Sutton, TD models: Modeling the world at a mixture of time scales, in: Proc. 12th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, CA, 1995, pp. 531-539.
-
(1995)
Proc. 12th International Conference on Machine Learning
, pp. 531-539
-
-
Sutton, R.S.1
-
72
-
-
0004102479
-
-
MIT Press, Cambridge, MA
-
R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
-
(1998)
Reinforcement Learning: An Introduction
-
-
Sutton, R.S.1
Barto, A.G.2
-
74
-
-
0002260073
-
Intra-option learning about temporally abstract actions
-
Morgan Kaufmann, San Mateo, CA
-
R.S. Sutton, D. Precup, S. Singh, Intra-option learning about temporally abstract actions, in: Proc. 15th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, CA, 1998, pp. 556-564.
-
(1998)
Proc. 15th International Conference on Machine Learning
, pp. 556-564
-
-
Sutton, R.S.1
Precup, D.2
Singh, S.3
-
75
-
-
0000672258
-
Improved switching among temporally abstract actions
-
MIT Press, Cambridge, MA
-
R.S. Sutton, S. Singh, D. Precup, B. Ravindran, Improved switching among temporally abstract actions, in: Advances in Neural Information Processing Systems 11, MIT Press, Cambridge, MA, 1999, pp. 1066-1072.
-
(1999)
Advances in Neural Information Processing Systems
, vol.11
, pp. 1066-1072
-
-
Sutton, R.S.1
Singh, S.2
Precup, D.3
Ravindran, B.4
-
76
-
-
0000797959
-
The problem of expensive chunks and its solution by restricting expressiveness
-
M. Tambe, A. Newell, P. Rosenbloom, The problem of expensive chunks and its solution by restricting expressiveness, Machine Learning 5 (3) (1990) 299-348.
-
(1990)
Machine Learning
, vol.5
, Issue.3
, pp. 299-348
-
-
Tambe, M.1
Newell, A.2
Rosenbloom, P.3
-
77
-
-
0029276036
-
Temporal difference learning and TD-Gammon
-
G.J. Tesauro, Temporal difference learning and TD-Gammon, Comm. ACM 38 (1995) 58-68.
-
(1995)
Comm. ACM
, vol.38
, pp. 58-68
-
-
Tesauro, G.J.1
-
78
-
-
33749882712
-
Finding structure in reinforcement learning
-
Morgan Kaufmann, San Mateo, CA
-
S. Thrun, A. Schwartz, Finding structure in reinforcement learning, in: Advances in Neural Information Processing Systems 7, Morgan Kaufmann, San Mateo, CA, 1995, pp. 385-392.
-
(1995)
Advances in Neural Information Processing Systems
, vol.7
, pp. 385-392
-
-
Thrun, S.1
Schwartz, A.2
-
79
-
-
0030418601
-
Behavior coordination for a mobile robot using modular reinforcement learning
-
M. Uchibe, M. Asada, K. Hosoda, Behavior coordination for a mobile robot using modular reinforcement learning, in: Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems, 1996, pp. 1329-1336.
-
(1996)
Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems
, pp. 1329-1336
-
-
Uchibe, M.1
Asada, M.2
Hosoda, K.3
-
82
-
-
0006496594
-
Scaling reinforcement learning techniques via modularity
-
Morgan Kaufmann, San Mateo, CA
-
L.E. Wixson, Scaling reinforcement learning techniques via modularity, in: Proc. 8th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, CA, 1991, pp. 368-372.
-
(1991)
Proc. 8th International Conference on Machine Learning
, pp. 368-372
-
-
Wixson, L.E.1