-
1
-
-
0041966002
-
Using confidence bounds for exploitation-exploration trade-offs
-
Auer, P. (2002). Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3, 397-422.
-
(2002)
Journal of Machine Learning Research
, vol.3
, pp. 397-422
-
-
Auer, P.1
-
2
-
-
27344458404
-
On prediction using variable order Markov models
-
Begleiter, R., El-Yaniv, R., & Yona, G. (2004). On prediction using variable order Markov models. Journal of Artificial Intelligence Research, 22, 385-421. (Pubitemid 41525891)
-
(2004)
Journal of Artificial Intelligence Research
, vol.22
, pp. 385-421
-
-
Begleiter, R.1
El-Yaniv, R.2
Yona, G.3
-
3
-
-
0344672463
-
Rollout algorithms for stochastic scheduling problems
-
Bertsekas, D. P., & Castanon, D. A. (1999). Rollout algorithms for stochastic scheduling problems. Journal of Heuristics, 5(1), 89-108.
-
(1999)
Journal of Heuristics
, vol.5
, Issue.1
, pp. 89-108
-
-
Bertsekas, D.P.1
Castanon, D.A.2
-
4
-
-
0032069371
-
Top-down induction of first-order logical decision trees
-
PII S0004370298000344
-
Blockeel, H., & De Raedt, L. (1998). Top-down induction of first-order logical decision trees. Artificial Intelligence, 101(1-2), 285-297. (Pubitemid 128387397)
-
(1998)
Artificial Intelligence
, vol.101
, Issue.1-2
, pp. 285-297
-
-
Blockeel, H.1
De Raedt, L.2
-
5
-
-
79956360448
-
Closing the learning-planning loop with predictive state representations
-
Richland, SC. International Foundation for Autonomous Agents and Multiagent Systems
-
Boots, B., Siddiqi, S. M., & Gordon, G. J. (2010). Closing the learning-planning loop with predictive state representations. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1, AAMAS '10, pp. 1369-1370. Richland, SC. International Foundation for Autonomous Agents and Multiagent Systems.
-
(2010)
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1, AAMAS '10
, pp. 1369-1370
-
-
Boots, B.1
Siddiqi, S.M.2
Gordon, G.J.3
-
6
-
-
0041965975
-
R-max - a general polynomial time algorithm for near-optimal reinforcement learning
-
Brafman, R. I., & Tennenholtz, M. (2003). R-max - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
-
(2003)
Journal of Machine Learning Research
, vol.3
, pp. 213-231
-
-
Brafman, R.I.1
Tennenholtz, M.2
-
7
-
-
0028564629
-
Acting optimally in partially observable stochastic domains
-
Cassandra, A. R., Kaelbling, L. P., & Littman, M. L. (1994). Acting optimally in partially observable stochastic domains. In AAAI, pp. 1023-1028.
-
(1994)
AAAI
, pp. 1023-1028
-
-
Cassandra, A.R.1
Kaelbling, L.P.2
Littman, M.L.3
-
8
-
-
55249127519
-
Progressive strategies for Monte-Carlo Tree Search
-
Chaslot, G.-B., Winands, M., Uiterwijk, J., van den Herik, H., & Bouzy, B. (2008a). Progressive strategies for Monte-Carlo Tree Search. New Mathematics and Natural Computation, 4(3), 343-357.
-
(2008)
New Mathematics and Natural Computation
, vol.4
, Issue.3
, pp. 343-357
-
-
Chaslot, G.-B.1
Winands, M.2
Uiterwijk, J.3
Van Den Herik, H.4
Bouzy, B.5
-
9
-
-
55249093890
-
Parallel Monte-Carlo tree search
-
Berlin, Heidelberg. Springer-Verlag
-
Chaslot, G. M., Winands, M. H., & Herik, H. J. (2008b). Parallel Monte-Carlo tree search. In Proceedings of the 6th International Conference on Computers and Games, pp. 60-71. Berlin, Heidelberg. Springer-Verlag.
-
(2008)
Proceedings of the 6th International Conference on Computers and Games
, pp. 60-71
-
-
Chaslot, G.M.1
Winands, M.H.2
Herik, H.J.3
-
10
-
-
84889281816
-
-
Wiley-Interscience, New York, NY, USA
-
Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. Wiley-Interscience, New York, NY, USA.
-
(1991)
Elements of information theory
-
-
Cover, T.M.1
Thomas, J.A.2
-
11
-
-
77951573287
-
Universal reinforcement learning
-
Farias, V., Moallemi, C., Van Roy, B., & Weissman, T. (2010). Universal reinforcement learning. IEEE Transactions on Information Theory, 56(5), 2441-2454.
-
(2010)
IEEE Transactions on Information Theory
, vol.56
, Issue.5
, pp. 2441-2454
-
-
Farias, V.1
Moallemi, C.2
Van Roy, B.3
Weissman, T.4
-
12
-
-
57749181518
-
Simulation-based approach to general game playing
-
Finnsson, H., & Björnsson, Y. (2008). Simulation-based approach to general game playing. In AAAI, pp. 259-264.
-
(2008)
AAAI
, pp. 259-264
-
-
Finnsson, H.1
Björnsson, Y.2
-
15
-
-
79956339609
-
-
Modification of UCT with patterns in Monte-Carlo Go. Tech. rep. 6062, INRIA, France
-
Gelly, S., Wang, Y., Munos, R., & Teytaud, O. (2006). Modification of UCT with patterns in Monte-Carlo Go. Tech. rep. 6062, INRIA, France.
-
(2006)
-
-
Gelly, S.1
Wang, Y.2
Munos, R.3
Teytaud, O.4
-
16
-
-
29344449759
-
Effective short-term opponent exploitation in simplified poker
-
Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05
-
Hoehn, B., Southey, F., Holte, R. C., & Bulitko, V. (2005). Effective short-term opponent exploitation in simplified poker. In AAAI, pp. 783-788. (Pubitemid 43006704)
-
(2005)
Proceedings of the National Conference on Artificial Intelligence
, vol.2
, pp. 783-788
-
-
Hoehn, B.1
Southey, F.2
Holte, R.C.3
Bulitko, V.4
-
17
-
-
34250765690
-
Looping suffix tree-based inference of partially observable hidden state
-
Holmes, M. P., & Isbell, C. L., Jr. (2006). Looping suffix tree-based inference of partially observable hidden state. In ICML, pp. 409-416.
-
(2006)
ICML
, pp. 409-416
-
-
Holmes, M.P.1
Isbell Jr., C.L.2
-
18
-
-
1642393842
-
The fastest and shortest algorithm for all well-defined problems
-
Hutter, M. (2002a). The fastest and shortest algorithm for all well-defined problems. International Journal of Foundations of Computer Science, 13(3), 431-443.
-
(2002)
International Journal of Foundations of Computer Science
, vol.13
, Issue.3
, pp. 431-443
-
-
Hutter, M.1
-
19
-
-
84937417436
-
Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures
-
Lecture Notes in Artificial Intelligence. Springer
-
Hutter, M. (2002b). Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures. In Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002), Lecture Notes in Artificial Intelligence. Springer.
-
(2002)
Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002)
-
-
Hutter, M.1
-
21
-
-
78049319488
-
Universal algorithmic intelligence: A mathematical top→down approach
-
Springer, Berlin
-
Hutter, M. (2007). Universal algorithmic intelligence: A mathematical top→down approach. In Artificial General Intelligence, pp. 227-290. Springer, Berlin.
-
(2007)
Artificial General Intelligence
, pp. 227-290
-
-
Hutter, M.1
-
22
-
-
0032073263
-
Planning and acting in partially observable stochastic domains
-
PII S000437029800023X
-
Kaelbling, L. P., Littman, M. L., & Cassandra, A. R. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1-2), 99-134. (Pubitemid 128387390)
-
(1998)
Artificial Intelligence
, vol.101
, Issue.1-2
, pp. 99-134
-
-
Kaelbling, L.P.1
Littman, M.L.2
Cassandra, A.R.3
-
24
-
-
0006713882
-
Inducing classification and regression trees in first order logic
-
In Džeroski, S., & Lavrač, N. (Eds.), chap. 6. Springer
-
Kramer, S., & Widmer, G. (2001). Inducing classification and regression trees in first order logic. In Džeroski, S., & Lavrač, N. (Eds.), Relational Data Mining, chap. 6. Springer.
-
(2001)
Relational Data Mining
-
-
Kramer, S.1
Widmer, G.2
-
27
-
-
79956345551
-
-
Ergodic MDPs admit self-optimising policies. Tech. rep. IDSIA-21-04, Dalle Molle Institute for Artificial Intelligence (IDSIA)
-
Legg, S., & Hutter, M. (2004). Ergodic MDPs admit self-optimising policies. Tech. rep. IDSIA-21-04, Dalle Molle Institute for Artificial Intelligence (IDSIA).
-
(2004)
-
-
Legg, S.1
Hutter, M.2
-
28
-
-
77956163718
-
-
Ph.D. thesis, Department of Informatics, University of Lugano
-
Legg, S. (2008). Machine Super Intelligence. Ph.D. thesis, Department of Informatics, University of Lugano.
-
(2008)
Machine Super Intelligence
-
-
Legg, S.1
-
31
-
-
84898982129
-
Predictive representations of state
-
Littman, M., Sutton, R., & Singh, S. (2002). Predictive representations of state. In NIPS, pp. 1555-1561.
-
(2002)
NIPS
, pp. 1555-1561
-
-
Littman, M.1
Sutton, R.2
Singh, S.3
-
34
-
-
71149083875
-
Proto-predictive representation of states with simple recurrent temporal-difference networks
-
Makino, T. (2009). Proto-predictive representation of states with simple recurrent temporal-difference networks. In ICML, pp. 697-704.
-
(2009)
ICML
, pp. 697-704
-
-
Makino, T.1
-
37
-
-
79956365352
-
A computational approximation to the AIXI model
-
Pankov, S. (2008). A computational approximation to the AIXI model. In AGI, pp. 256-267.
-
(2008)
AGI
, pp. 256-267
-
-
Pankov, S.1
-
39
-
-
79956346776
-
-
Universal learning of repeated matrix games. Tech. rep. 18-05, IDSIA
-
Poland, J., & Hutter, M. (2006). Universal learning of repeated matrix games. Tech. rep. 18-05, IDSIA.
-
(2006)
-
-
Poland, J.1
Hutter, M.2
-
40
-
-
77950356463
-
Model-based Bayesian reinforcement learning in partially observable domains
-
Poupart, P., & Vlassis, N. (2008). Model-based Bayesian reinforcement learning in partially observable domains. In ISAIM.
-
(2008)
ISAIM
-
-
Poupart, P.1
Vlassis, N.2
-
41
-
-
33749251297
-
An analytic solution to discrete Bayesian reinforcement learning
-
New York, NY, USA. ACM
-
Poupart, P., Vlassis, N., Hoey, J., & Regan, K. (2006). An analytic solution to discrete Bayesian reinforcement learning. In ICML '06: Proceedings of the 23rd international conference on Machine learning, pp. 697-704. New York, NY, USA. ACM.
-
(2006)
ICML '06: Proceedings of the 23rd international conference on Machine learning
, pp. 697-704
-
-
Poupart, P.1
Vlassis, N.2
Hoey, J.3
Regan, K.4
-
43
-
-
0030282113
-
The power of amnesia: Learning probabilistic automata with variable memory length
-
Ron, D., Singer, Y., & Tishby, N. (1996). The power of amnesia: Learning probabilistic automata with variable memory length. Machine Learning, 25(2), 117-150.
-
(1996)
Machine Learning
, vol.25
, Issue.2
, pp. 117-150
-
-
Ron, D.1
Singer, Y.2
Tishby, N.3
-
44
-
-
14344256568
-
Learning low dimensional predictive representations
-
New York, NY, USA. ACM
-
Rosencrantz, M., Gordon, G., & Thrun, S. (2004). Learning low dimensional predictive representations. In Proceedings of the twenty-first International Conference on Machine Learning, p. 88. New York, NY, USA. ACM.
-
(2004)
Proceedings of the twenty-first International Conference on Machine Learning
, pp. 88
-
-
Rosencrantz, M.1
Gordon, G.2
Thrun, S.3
-
45
-
-
85162018872
-
Bayes-adaptive POMDPs
-
In Platt, J., Koller, D., Singer, Y., & Roweis, S. (Eds.), MIT Press, Cambridge, MA
-
Ross, S., Chaib-draa, B., & Pineau, J. (2008). Bayes-adaptive POMDPs. In Platt, J., Koller, D., Singer, Y., & Roweis, S. (Eds.), Advances in Neural Information Processing Systems 20, pp. 1225-1232. MIT Press, Cambridge, MA.
-
(2008)
Advances in Neural Information Processing Systems
, vol.20
, pp. 1225-1232
-
-
Ross, S.1
Chaib-draa, B.2
Pineau, J.3
-
46
-
-
0031186687
-
Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement
-
Schmidhuber, J., Zhao, J., & Wiering, M. A. (1997). Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning, 28, 105-130. (Pubitemid 127507171)
-
(1997)
Machine Learning
, vol.28
, Issue.1
, pp. 105-130
-
-
Schmidhuber, J.1
Zhao, J.2
Wiering, M.3
-
47
-
-
0031194381
-
Discovering neural nets with low Kolmogorov complexity and high generalization capability
-
DOI 10.1016/S0893-6080(96)00127-X, PII S089360809600127X
-
Schmidhuber, J. (1997). Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks, 10(5), 857-873. (Pubitemid 27315721)
-
(1997)
Neural Networks
, vol.10
, Issue.5
, pp. 857-873
-
-
Schmidhuber, J.1
-
48
-
-
84937439050
-
The speed prior: A new simplicity measure yielding near-optimal computable predictions
-
Schmidhuber, J. (2002). The speed prior: A new simplicity measure yielding near-optimal computable predictions. In Proc. 15th Annual Conf. on Computational Learning Theory, pp. 216-228.
-
(2002)
Proc. 15th Annual Conf. on Computational Learning Theory
, pp. 216-228
-
-
Schmidhuber, J.1
-
50
-
-
1642328943
-
Optimal ordered problem solver
-
Schmidhuber, J. (2004). Optimal ordered problem solver. Machine Learning, 54, 211-254.
-
(2004)
Machine Learning
, vol.54
, pp. 211-254
-
-
Schmidhuber, J.1
-
52
-
-
33646434962
-
Resolving perceptual aliasing in the presence of noisy sensors
-
Shani, G., & Brafman, R. (2004). Resolving perceptual aliasing in the presence of noisy sensors. In NIPS.
-
(2004)
NIPS
-
-
Shani, G.1
Brafman, R.2
-
53
-
-
71149102015
-
Monte-Carlo simulation balancing
-
New York, NY, USA. ACM
-
Silver, D., & Tesauro, G. (2009). Monte-Carlo simulation balancing. In ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 945-952. New York, NY, USA. ACM.
-
(2009)
ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
, pp. 945-952
-
-
Silver, D.1
Tesauro, G.2
-
55
-
-
33749263456
-
Predictive state representations: A new theory for modeling dynamical systems
-
Singh, S., James, M., & Rudary, M. (2004). Predictive state representations: A new theory for modeling dynamical systems. In UAI, pp. 512-519.
-
(2004)
UAI
, pp. 512-519
-
-
Singh, S.1
James, M.2
Rudary, M.3
-
56
-
-
4544279425
-
A formal theory of inductive inference: Parts 1 and 2
-
224-254
-
Solomonoff, R. J. (1964). A formal theory of inductive inference: Parts 1 and 2. Information and Control, 7, 1-22 and 224-254.
-
(1964)
Information and Control
, vol.7
, pp. 1-22
-
-
Solomonoff, R.J.1
-
57
-
-
73549084301
-
Reinforcement learning in finite MDPs: PAC analysis
-
Strehl, A. L., Li, L., & Littman, M. L. (2009). Reinforcement learning in finite MDPs: PAC analysis. Journal of Machine Learning Research, 10, 2413-2444.
-
(2009)
Journal of Machine Learning Research
, vol.10
, pp. 2413-2444
-
-
Strehl, A.L.1
Li, L.2
Littman, M.L.3
-
58
-
-
33749255382
-
PAC model-free reinforcement learning
-
New York, NY, USA. ACM
-
Strehl, A. L., Li, L., Wiewiora, E., Langford, J., & Littman, M. L. (2006). PAC model-free reinforcement learning. In ICML '06: Proceedings of the 23rd international conference on Machine learning, pp. 881-888. New York, NY, USA. ACM.
-
(2006)
ICML '06: Proceedings of the 23rd international conference on Machine learning
, pp. 881-888
-
-
Strehl, A.L.1
Li, L.2
Wiewiora, E.3
Langford, J.4
Littman, M.L.5
-
59
-
-
14344258433
-
A Bayesian framework for reinforcement learning
-
Strens, M. (2000). A Bayesian framework for reinforcement learning. In ICML, pp. 943-950.
-
(2000)
ICML
, pp. 943-950
-
-
Strens, M.1
-
60
-
-
77958523564
-
A reinforcement learning algorithm in partially observable environments using short-term memory
-
Suematsu, N., & Hayashi, A. (1999). A reinforcement learning algorithm in partially observable environments using short-term memory. In NIPS, pp. 1059-1065.
-
(1999)
NIPS
, pp. 1059-1065
-
-
Suematsu, N.1
Hayashi, A.2
-
61
-
-
79956373203
-
A Bayesian approach to model learning in non-Markovian environment
-
Suematsu, N., Hayashi, A., & Li, S. (1997). A Bayesian approach to model learning in non-Markovian environment. In ICML, pp. 349-357.
-
(1997)
ICML
, pp. 349-357
-
-
Suematsu, N.1
Hayashi, A.2
Li, S.3
-
63
-
-
31844431936
-
Temporal-difference networks
-
Sutton, R. S., & Tanner, B. (2004). Temporal-difference networks. In NIPS.
-
(2004)
NIPS
-
-
Sutton, R.S.1
Tanner, B.2
-
65
-
-
77958583913
-
Reinforcement Learning via AIXI Approximation
-
Veness, J., Ng, K. S., Hutter, M., & Silver, D. (2010). Reinforcement Learning via AIXI Approximation. In Proceedings of the Conference for the Association for the Advancement of Artificial Intelligence (AAAI).
-
(2010)
Proceedings of the Conference for the Association for the Advancement of Artificial Intelligence (AAAI)
-
-
Veness, J.1
Ng, K.S.2
Hutter, M.3
Silver, D.4
-
66
-
-
84858720579
-
Bootstrapping from game tree search
-
Veness, J., Silver, D., Uther, W., & Blair, A. (2009). Bootstrapping from Game Tree Search. In Neural Information Processing Systems (NIPS).
-
(2009)
Neural Information Processing Systems (NIPS)
-
-
Veness, J.1
Silver, D.2
Uther, W.3
Blair, A.4
-
67
-
-
31844436266
-
Bayesian sparse sampling for on-line reward optimization
-
Wang, T., Lizotte, D. J., Bowling, M. H., & Schuurmans, D. (2005). Bayesian sparse sampling for on-line reward optimization. In ICML, pp. 956-963.
-
(2005)
ICML
, pp. 956-963
-
-
Wang, T.1
Lizotte, D.J.2
Bowling, M.H.3
Schuurmans, D.4
-
69
-
-
48249155421
-
Reflections on "The Context Tree Weighting Method: Basic properties"
-
Willems, F., Shtarkov, Y., & Tjalkens, T. (1997). Reflections on "The Context Tree Weighting Method: Basic properties". Newsletter of the IEEE Information Theory Society, 47(1).
-
(1997)
Newsletter of the IEEE Information Theory Society
, vol.47
, pp. 1
-
-
Willems, F.1
Shtarkov, Y.2
Tjalkens, T.3
-
70
-
-
0032022518
-
The context-tree weighting method: Extensions
-
PII S0018944898006543
-
Willems, F. M. J. (1998). The context-tree weighting method: Extensions. IEEE Transactions on Information Theory, 44, 792-798. (Pubitemid 128737641)
-
(1998)
IEEE Transactions on Information Theory
, vol.44
, Issue.2
, pp. 792-798
-
-
Willems, F.M.J.1
-
71
-
-
0030242432
-
Context weighting for general finite-context sources
-
Willems, F. M. J., Shtarkov, Y. M., & Tjalkens, T. J. (1996). Context weighting for general finite-context sources. IEEE Trans. Inform. Theory, 42, 42-1514.
-
(1996)
IEEE Trans. Inform. Theory
, vol.42
, pp. 42-1514
-
-
Willems, F.M.J.1
Shtarkov, Y.M.2
Tjalkens, T.J.3
-
72
-
-
0029307102
-
The context tree weighting method: Basic properties
-
Willems, F. M., Shtarkov, Y. M., & Tjalkens, T. J. (1995). The context tree weighting method: Basic properties. IEEE Transactions on Information Theory, 41, 653-664.
-
(1995)
IEEE Transactions on Information Theory
, vol.41
, pp. 653-664
-
-
Willems, F.M.1
Shtarkov, Y.M.2
Tjalkens, T.J.3