-
1
-
-
76849111294
-
-
-
[Bellman 1957] Richard Ernest Bellman. Dynamic Programming. Princeton University Press, Princeton, New Jersey, 1957.
-
(1957)
Richard Ernest Bellman, Dynamic Programming
-
-
-
2
-
-
0028564629
-
Acting optimally in partially observable stochastic domains
-
-
[Cassandra et al 1994] A. R. Cassandra, L. P. Kaelbling, and M. L. Littman. Acting optimally in partially observable stochastic domains. In Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI 94), pages 1023-1028, Seattle, Washington, August 1994. AAAI Press.
-
(1994)
Proceedings of the Twelfth National Conference on Artificial Intelligence (AAAI 94)
, pp. 1023-1028
-
-
Cassandra, A R
Kaelbling, L P
Littman, M L
-
3
-
-
0026998041
-
Reinforcement learning with perceptual aliasing: The perceptual distinctions approach
-
-
[Chrisman 1992] Lonnie Chrisman. Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI 92), pages 183-188, San Jose, California, July 1992. AAAI Press.
-
(1992)
Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI 92)
, pp. 183-188
-
-
-
4
-
-
85168159258
-
Reinforcement learning algorithm for partially observable Markov decision problems
-
-
[Jaakkola et al in press] Tommi Jaakkola, Satinder P. Singh, and Michael I. Jordan. Reinforcement learning algorithm for partially observable Markov decision problems. In Neural Information Processing Systems 7, to appear, in press.
-
Neural Information Processing Systems
-
-
Jaakkola, Tommi
Singh, Satinder P
Jordan, Michael I
-
8
-
-
0002679852
-
A survey of algorithmic methods for partially observed Markov decision processes
-
-
[Lovejoy, 1991] W. S. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes. Annals of Operations Research, 28(1-4):47-66, April 1991.
-
(1991)
Annals of Operations Research
, vol.28
, Issue.1-4
, pp. 47-66
-
-
Lovejoy, W S
-
9
-
-
0027632248
-
'Neural gas' network for vector quantization and its application to time series prediction
-
-
[Martinetz et al 1993] Thomas M. Martinetz, Stanislav G. Berkovich, and Klaus J. Schulten. 'Neural gas' network for vector quantization and its application to time series prediction. IEEE Transactions on Neural Networks, 4:558-569, 1993.
-
(1993)
IEEE Transactions on Neural Networks
, vol.4
, pp. 558-569
-
-
Martinetz, Thomas M
Berkovich, Stanislav G
Schulten, Klaus J
-
10
-
-
85151432208
-
Overcoming incomplete perception with utile distinction memory
-
-
[McCallum 1993] Andrew R. McCallum. Overcoming incomplete perception with utile distinction memory. In Proceedings of the Tenth International Conference on Machine Learning, pages 190-196, Amherst, Massachusetts, July 1993. Morgan Kaufmann.
-
Proceedings of the Tenth International Conference on Machine Learning
, pp. 190-196
-
-
McCallum, Andrew R
-
14
-
-
33847202724
-
Learning to predict by the methods of temporal differences
-
-
[Sutton, 1988] R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, August 1988.
-
(1988)
Machine Learning
, vol.3
, pp. 9-44
-
-
-
15
-
-
85168123636
-
-
-
[Watkins 1989] C. J. Watkins. Models of Delayed Reinforcement Learning. PhD thesis, Psychology Department, Cambridge University, Cambridge, United Kingdom, 1989.
-
(1989)
C J Watkins Models of Delayed Reinforcement Learning PhD thesis
-
-