1. Atlas, D. (1982). Adaptively pointing spaceborne radar for precipitation measurements. Journal of Applied Meteorology, 21, 429-443.
2. Baird, L. C. (1995). Residual algorithms: Reinforcement learning with function approximation. In A. Prieditis & S. J. Russell (Eds.), Proceedings of the Twelfth International Conference on Machine Learning (pp. 30-37). 9-12 July 1995. Tahoe City, CA/San Francisco: Morgan Kaufmann.
3. Baxter, J., & Bartlett, P. L. (2000). Reinforcement learning in POMDP via direct gradient ascent. Proceedings of the 17th International Conference on Machine Learning (pp. 41-48). 29 June-2 July 2000. Stanford, CA/San Francisco: Morgan Kaufmann.
4. Bellman, R. E. (1957). Dynamic programming (342 pp.). Princeton, NJ: Princeton University Press.
7. Bertsimas, D., & Patterson, S. S. (1998). The air traffic flow management problem with enroute capacities. Operations Research, 46, 406-422.
8. Chrisman, L. (1992). Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. Proceedings of the Tenth National Conference on Artificial Intelligence (pp. 183-188). 12-16 July 1992. San Jose/Menlo Park, CA: AAAI Press.
9. Dayan, P., & Sejnowski, T. (1994). TD(0) converges with probability 1. Machine Learning, 14, 295-301.
10. Evans, J. E., Weber, M. E., & Moser, W. R. (2006). Integrating advanced weather forecast technologies into air traffic management decision support. Lincoln Laboratory Journal, 16, 81-96.
12. Jaakkola, T., Jordan, M., & Singh, S. (1994). On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6, 1185-1201.
13. Jaakkola, T., Singh, S., & Jordan, M. (1995). Reinforcement learning algorithm for partially observable Markov decision problems. In G. Tesauro, D. S. Touretzky, & T. Leen (Eds.), Advances in neural information processing systems: Proceedings of the 1994 Conference (pp. 345-352). Cambridge, MA: MIT Press.
15. Krozel, J., Andre, A. D., & Smith, P. (2006). Future air traffic management requirements for dynamic weather avoidance routing. Preprints, 25th Digital Avionics Systems Conference (pp. 1-9). October 2006. Portland, OR: IEEE/AIAA.
17. Lovejoy, W. S. (1991). A survey of algorithmic methods for partially observable Markov decision processes. Annals of Operations Research, 28, 47-66.
18. McLaughlin, D. J., Chandrasekar, V., Droegemeier, K., Frasier, S., Kurose, J., Junyent, F., et al. (2005). Distributed Collaborative Adaptive Sensing (DCAS) for improved detection, understanding, and prediction of atmospheric hazards. Preprints-CD, AMS Ninth Symposium on Integrated Observing and Assimilation Systems for the Atmosphere, Oceans, and Land Surface. 10-13 January 2005. Paper 11.3. San Diego, CA.
20. Peng, J., & Williams, R. J. (1996). Incremental multi-step Q-learning. Machine Learning, 22, 283-290.
21. Precup, D., Sutton, R. S., & Dasgupta, S. (2001). Off-policy temporal-difference learning with function approximation. In C. E. Brodley & A. P. Danyluk (Eds.), Proceedings of the 18th International Conference on Machine Learning (pp. 417-424). 28 June-1 July 2001. Williamstown, MA/San Francisco, CA: Morgan Kaufmann.
24. Samuel, A. L. (1959). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3, 211-229.
25. Si, J., Barto, A. G., Powell, W. B., & Wunsch, D. (Eds.). (2004). Handbook of learning and approximate dynamic programming (644 pp.). Piscataway, NJ: Wiley-Interscience.
26. Singh, S. P., & Sutton, R. S. (1996). Reinforcement learning with replacing eligibility traces. Machine Learning, 22, 123-158.
27. Singh, S. P., Jaakkola, T., Littman, M. L., & Szepesvari, C. (2000). Convergence results for single-step on-policy reinforcement-learning algorithms. Machine Learning, 38, 287-308. doi:10.1023/A:1007678930559
29. Tadic, V. (2001). On the convergence of temporal-difference learning with linear function approximation. Machine Learning, 42, 241-267. doi:10.1023/A:1007609817671
30. Tsitsiklis, J. N. (2002). On the convergence of optimistic policy iteration. Journal of Machine Learning Research, 3, 59-72.
31. Tsitsiklis, J. N., & Van Roy, B. (1997). An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 42, 674-690.
32. Turing, A. M. (1948). Intelligent machinery, National Physical Laboratory report. In D. C. Ince (Ed.), Collected works of A. M. Turing: Mechanical intelligence (227 pp.). New York: Elsevier Science, 1992.
33. Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59, 433-460.
34. Watkins, C. J. C. H. (1989). Learning from delayed rewards. Ph.D. thesis, King's College, Cambridge University, Cambridge, 234 pp.
37. Williams, J. K., & Singh, S. (1999). Experimental results on learning stochastic memoryless policies for partially observable Markov decision processes. In M. S. Kearns, S. A. Solla, & D. A. Cohn (Eds.), Advances in neural information processing systems 11: Proceedings of the 1998 Conference (pp. 1073-1079). Cambridge, MA: MIT Press.