-
2
-
-
0028731609
-
Fuzzy Q-learning: A new approach for fuzzy dynamic programming problems
-
Orlando, FL
-
Berenji, H.R. (1994) Fuzzy Q-learning: a new approach for fuzzy dynamic programming problems. Third IEEE International Conference on Fuzzy Systems, Orlando, FL.
-
(1994)
Third IEEE International Conference on Fuzzy Systems
-
-
Berenji, H.R.1
-
3
-
-
84899032145
-
All learning is local: Multi-agent learning in global reward games, Advances in Neural Information Processing Systems 16
-
Chang, Y.H., Ho, T., Kaelbling, L.P. (2004) All learning is local: Multi-agent learning in global reward games, Advances in Neural Information Processing Systems 16, Vancouver, (NIPS-03).
-
(2004)
Vancouver
-
-
Chang, Y.H.1
Ho, T.2
Kaelbling, L.P.3
-
4
-
-
85078188985
-
-
AAMAS03, Melbourne, Australia
-
Chalkiadakis, G., Boutilier, C. (2003) Coordination in Multiagent Reinforcement Learning: A Bayesian Approach, AAMAS03, Melbourne, Australia, 1418.
-
(2003)
Coordination in Multiagent Reinforcement Learning: A Bayesian Approach
-
-
Chalkiadakis, G.1
Boutilier, C.2
-
5
-
-
85078179656
-
-
Claus, C., Boutilier, C. (1998) The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, Department of Computer Science, University of British Columbia, Canada (American Association for Artificial Intelligence).
-
(1998)
The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems, Department of Computer Science, University of British Columbia, Canada (American Association for Artificial Intelligence)
-
-
Claus, C.1
Boutilier, C.2
-
6
-
-
85040789943
-
-
Department of Computer Science, University of British Columbia, Vancouver, Canada Computer Science Division, University of California Berkeley
-
Dearden, R., Friedman, N., Russell, S. (1998) Bayesian Q-learning, Department of Computer Science, University of British Columbia, Vancouver, Canada Computer Science Division, University of California Berkeley.
-
(1998)
Bayesian Q-Learning
-
-
Dearden, R.1
Friedman, N.2
Russell, S.3
-
9
-
-
0028730301
-
Fuzzy Q-learning and dynamical fuzzy Q-learning
-
-
-
Glorennec, P.Y. (1994) Fuzzy Q-learning and dynamical fuzzy Q-learning. Proceedings of the Third IEEE International Conference on Fuzzy Systems, IEEE Press, Piscataway, NJ, pp. 474–479.
-
(1994)
Proceedings of the Third IEEE International Conference on Fuzzy Systems, IEEE Press, Piscataway, NJ
, pp. 474-479
-
-
Glorennec, P.Y.1
-
10
-
-
0030711314
-
Fuzzy Q-Learning
-
-
-
Glorennec, P.Y., Jouffe, L. (1997) Fuzzy Q-Learning. Proceedings of Sixth International Conference on Fuzzy Systems, Barcelona, Spain, pp. 659–662.
-
(1997)
Proceedings of Sixth International Conference on Fuzzy Systems, Barcelona, Spain
, pp. 659-662
-
-
Glorennec, P.Y.1
Jouffe, L.2
-
11
-
-
85078215532
-
-
Mixed-Initiative Interaction, IEEE Intelligent Systems, September/October
-
Hearst, M.A. (1999) Trends & Controversies, Mixed-Initiative Interaction, IEEE Intelligent Systems, September/October.
-
(1999)
Trends & Controversies
-
-
Hearst, M.A.1
-
13
-
-
85153938292
-
Reinforcement learning algorithm for partially observable Markov decision problems
-
Jaakkola, T., Singh, S.P., Jordan, M.I. (1994) Reinforcement learning algorithm for partially observable Markov decision problems, In Advances in Neural Information Processing Systems (NIPS), 7.
-
(1994)
Advances in Neural Information Processing Systems (NIPS)
, vol.7
-
-
Jaakkola, T.1
Singh, S.P.2
Jordan, M.I.3
-
15
-
-
0032073263
-
Planning and acting in partially observable stochastic domains
-
Kaelbling, L.P., Littman, M.L., Cassandra, A.R. (1998) Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101:99–134.
-
(1998)
Artificial Intelligence
, vol.101
, pp. 99-134
-
-
Kaelbling, L.P.1
Littman, M.L.2
Cassandra, A.R.3
-
16
-
-
0029679044
-
Reinforcement learning: A survey
-
Kaelbling, L.P., Littman, M.L., Moore, A.W. (1996) Reinforcement learning: a survey. Journal of Artificial Intelligence Research, 4:237–285.
-
(1996)
Journal of Artificial Intelligence Research
, vol.4
, pp. 237-285
-
-
Kaelbling, L.P.1
Littman, M.L.2
Moore, A.W.3
-
17
-
-
17444405333
-
Hidden Markov models with states depending on observations source, Pattern Recognition Letters Archive, New York
-
Li, Y. (2005) Hidden Markov models with states depending on observations source, Pattern Recognition Letters Archive, New York, NY: Elsevier Science Inc. 26(7): 977–984.
-
(2005)
NY: Elsevier Science Inc.
, vol.26
, Issue.7
, pp. 977-984
-
-
Li, Y.1
-
19
-
-
0141819580
-
PEGASUS: A policy search method for large MDPs and POMDPs
-
UAI), Proceedings of the Sixteenth Conference
-
Ng, A.Y., Jordan, M.I. (2000) PEGASUS: A policy search method for large MDPs and POMDPs, Uncertainty in Artificial Intelligence (UAI), Proceedings of the Sixteenth Conference.
-
(2000)
Uncertainty in Artificial Intelligence
-
-
Ng, A.Y.1
Jordan, M.I.2
-
20
-
-
85078084757
-
-
POMDPs for Dummies, Subtitled: POMDPs and Their Algorithms, Sans Formula!
-
Online Tutorial, Brown University, Department of Computer Science, POMDPs for Dummies, Subtitled: POMDPs and Their Algorithms, Sans Formula!, http://www.cs.brown.edu/research/ai/pomdp/tutorial/index.html.
-
Online Tutorial, Brown University, Department of Computer Science
-
-
-
24
-
-
84880707672
-
Spoken dialogue management using probabilistic reasoning
-
Hong Kong
-
Roy, N., Pineau, J., Thrun, S. (2000) Spoken dialogue management using probabilistic reasoning, In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000), Hong Kong.
-
(2000)
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000)
-
-
Roy, N.1
Pineau, J.2
Thrun, S.3
-
26
-
-
85078090479
-
-
Sarawagi, S., Cohen, W.W. (2004) Semi-Markov Conditional Random Fields for Information Extraction, NIPS 2004 (Advances in Neural Information Processing Systems 17 [Neural Information Processing Systems, NIPS 2004, December 13–18, 2004, Vancouver, British Columbia, Canada]).
-
(2004)
Semi-Markov Conditional Random Fields for Information Extraction, NIPS 2004 (Advances in Neural Information Processing Systems 17 [Neural Information Processing Systems, NIPS 2004, December 13–18, 2004, Vancouver, British Columbia, Canada])
-
-
Sarawagi, S.1
Cohen, W.W.2
-
30
-
-
0003631802
-
-
Massachusetts Institute of Technology, Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Science
-
Smyth, P., Heckerman, D., Jordan, M. (1996) Probabilistic Independence Networks for Hidden Markov Models, Massachusetts Institute of Technology, Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Science.
-
(1996)
Probabilistic Independence Networks for Hidden Markov Models
-
-
Smyth, P.1
Heckerman, D.2
Jordan, M.3
-
32
-
-
85078141188
-
-
Thacker, N.A., Lacey, A.J. (1998) Tutorial: The Kalman Filter, Imaging Science and Biomedical Engineering Division, Medical School, University of Manchester, Stopford Building, Oxford Road, Manchester, M13 9PT.
-
(1998)
Tutorial: The Kalman Filter, Imaging Science and Biomedical Engineering Division, Medical School, University of Manchester, Stopford Building, Oxford Road, Manchester, M13 9PT
-
-
Thacker, N.A.1
Lacey, A.J.2
-
33
-
-
5444243723
-
A Framework for the initialization of student models in Web-based intelligent tutoring systems
-
Tsiriga, V., Virvou, M. (2004) A Framework for the initialization of student models in Web-based intelligent tutoring systems. User Modeling and User-Adapted Interaction, 14:289–316.
-
(2004)
User Modeling and User-Adapted Interaction
, vol.14
, pp. 289-316
-
-
Tsiriga, V.1
Virvou, M.2
-
34
-
-
14344279109
-
An Application of Reinforcement Learning to Dialogue Strategy Selection in a Spoken Dialogue System for Email
-
Walker, M.A. (2000) An Application of Reinforcement Learning to Dialogue Strategy Selection in a Spoken Dialogue System for Email, Journal of Artificial Intelligence Research (JAIR), 12:387–416.
-
(2000)
Journal of Artificial Intelligence Research (JAIR)
, vol.12
, pp. 387-416
-
-
Walker, M.A.1
-
36
-
-
34249833101
-
Technical note, Q-learning
-
Watkins, C.J.H., Dayan, P. (1992) Technical note, Q-learning. Machine Learning, 8:279–292.
-
(1992)
Machine Learning
, vol.8
, pp. 279-292
-
-
Watkins, C.J.H.1
Dayan, P.2
-
37
-
-
0003504917
-
Hierarchical Optimization of Policy-Coupled Semi-Markov Decision Processes
-
Bled, Slovenia, June 27–30. (nominated for best paper award at ICML-99)
-
Wang, G., Mahadevan, S. (1999) Hierarchical Optimization of Policy-Coupled Semi-Markov Decision Processes, Proceedings of the 16th International Conference on Machine Learning (ICML ’99), Bled, Slovenia, June 27–30. (nominated for best paper award at ICML-99).
-
(1999)
Proceedings of the 16th International Conference on Machine Learning (ICML ’99)
-
-
Wang, G.1
Mahadevan, S.2
-
38
-
-
0036644611
-
Maximum entropy-based optimal threshold selection using deterministic reinforcement learning with controlled randomization
-
Yin, P.Y. (2002) Maximum entropy-based optimal threshold selection using deterministic reinforcement learning with controlled randomization. Signal Processing 82:993–1006.
-
(2002)
Signal Processing
, vol.82
, pp. 993-1006
-
-
Yin, P.Y.1