SCOPUS Information Search Platform

Journal of Artificial Intelligence Research

Volume 40, Issue , 2011, Pages 95-142

A Monte-Carlo AIXI approximation

(5) Veness, Joel a Ng, Kee Siong b Hutter, Marcus b Uther, William b Silver, David c

a UNIVERSITY OF NEW SOUTH WALES (Australia)

b AUSTRALIAN NATIONAL UNIVERSITY (Australia)

c MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

CONTEXT TREE WEIGHTING; MONTE CARLO; OPTIMALITY; PRACTICAL ALGORITHMS; REINFORCEMENT LEARNING AGENT; TREE SEARCH ALGORITHM;

APPROXIMATION ALGORITHMS; INTELLIGENT AGENTS; PLANT EXTRACTS; REINFORCEMENT LEARNING;

TREES (MATHEMATICS);

EID: 79956344726 PISSN: None EISSN: 10769757 Source Type: Journal
DOI: 10.1613/jair.3125 Document Type: Article

Times cited : (138)

References (72)

1
- 0041966002
- Using confidence bounds for exploitation-exploration trade-offs
- Auer, P. (2002). Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3, 397-422.
- (2002) Journal of Machine Learning Research , vol.3 , pp. 397-422
- Auer, P.¹

2
- 27344458404
- On prediction using variable order Markov models
- Begleiter, R., El-Yaniv, R., & Yona, G. (2004). On prediction using variable order Markov models. Journal of Artificial Intelligence Research, 22, 385-421. (Pubitemid 41525891)
- (2004) Journal of Artificial Intelligence Research , vol.22 , pp. 385-421
- Begleiter, R.¹ El-Yaniv, R.² Yona, G.³

3
- 0344672463
- Rollout algorithms for stochastic scheduling problems
- Bertsekas, D. P., & Castanon, D. A. (1999). Rollout algorithms for stochastic scheduling problems. Journal of Heuristics, 5(1), 89-108.
- (1999) Journal of Heuristics , vol.5 , Issue.1 , pp. 89-108
- Bertsekas, D.P.¹ Castanon, D.A.²

4
- 0032069371
- Top-down induction of first-order logical decision trees
- PII S0004370298000344
- Blockeel, H., & De Raedt, L. (1998). Top-down induction of first-order logical decision trees. Artificial Intelligence, 101(1-2), 285-297. (Pubitemid 128387397)
- (1998) Artificial Intelligence , vol.101 , Issue.1-2 , pp. 285-297
- Blockeel, H.¹ De Raedt, L.²

5
- 79956360448
- Closing the learning-planning loop with predictive state representations
- Richland, SC. International Foundation for Autonomous Agents and Multiagent Systems
- Boots, B., Siddiqi, S. M., & Gordon, G. J. (2010). Closing the learning-planning loop with predictive state representations. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1, AAMAS '10, pp. 1369-1370 Richland, SC. International Foundation for Autonomous Agents and Multiagent Systems.
- (2010) Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1, AAMAS '10 , pp. 1369-1370
- Boots, B.¹ Siddiqi, S.M.² Gordon, G.J.³

6
- 0041965975
- R-max - a general polynomial time algorithm for near-optimal reinforcement learning
- Brafman, R. I., & Tennenholtz, M. (2003). R-max - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231.
- (2003) Journal of Machine Learning Research , vol.3 , pp. 213-231
- Brafman, R.I.¹ Tennenholtz, M.²

7
- 0028564629
- Acting optimally in partially observable stochastic domains
- Cassandra, A. R., Kaelbling, L. P., & Littman, M. L. (1994). Acting optimally in partially observable stochastic domains. In AAAI, pp. 1023-1028.
- (1994) AAAI , pp. 1023-1028
- Cassandra, A.R.¹ Kaelbling, L.P.² Littman, M.L.³

8
- 55249127519
- Progressive strategies for Monte-Carlo Tree Search
- Chaslot, G.-B., Winands, M., Uiterwijk, J., van den Herik, H., & Bouzy, B. (2008a). Progressive strategies for Monte-Carlo Tree Search. New Mathematics and Natural Computation, 4(3), 343-357.
- (2008) New Mathematics and Natural Computation , vol.4 , Issue.3 , pp. 343-357
- Chaslot, G.-B.¹ Winands, M.² Uiterwijk, J.³ Van Den Herik, H.⁴ Bouzy, B.⁵

9
- 55249093890
- Parallel monte-carlo tree search
- Berlin, Heidelberg. Springer-Verlag
- Chaslot, G. M., Winands, M. H., & Herik, H. J. (2008b). Parallel monte-carlo tree search. In Proceedings of the 6th International Conference on Computers and Games, pp. 60-71 Berlin, Heidelberg. Springer-Verlag.
- (2008) Proceedings of the 6th International Conference on Computers and Games , pp. 60-71
- Chaslot, G.M.¹ Winands, M.H.² Herik, H.J.³

10
- 84889281816
- Wiley-Interscience, New York, NY, USA
- Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. Wiley-Interscience, New York, NY, USA.
- (1991) Elements of information theory
- Cover, T.M.¹ Thomas, J.A.²

11
- 77951573287
- Universal reinforcement learning
- Farias, V., Moallemi, C., Van Roy, B., & Weissman, T. (2010). Universal reinforcement learning. Information Theory, IEEE Transactions on, 56(5), 2441-2454.
- (2010) Information Theory, IEEE Transactions on , vol.56 , Issue.5 , pp. 2441-2454
- Farias, V.¹ Moallemi, C.² Van Roy, B.³ Weissman, T.⁴

12
- 57749181518
- Simulation-based approach to general game playing
- Finnsson, H., & Björnsson, Y. (2008). Simulation-based approach to general game playing. In AAAI, pp. 259-264.
- (2008) AAAI , pp. 259-264
- Finnsson, H.¹ Björnsson, Y.²

13
- 34547990649
- Combining online and offline learning in UCT
- Gelly, S., & Silver, D. (2007). Combining online and offline learning in UCT. In Proceedings of the 24th International Conference on Machine Learning, pp. 273-280.
- (2007) Proceedings of the 24th International Conference on Machine Learning , pp. 273-280
- Gelly, S.¹ Silver, D.²

14
- 77949664565
- Exploration exploitation in Go: UCT for Monte-Carlo Go
- Gelly, S., & Wang, Y. (2006). Exploration exploitation in Go: UCT for Monte-Carlo Go. In NIPS Workshop on On-line trading of Exploration and Exploitation.
- (2006) NIPS Workshop on On-line trading of Exploration and Exploitation
- Gelly, S.¹ Wang, Y.²

15
- 79956339609
- Modification of UCT with patterns in Monte-Carlo Go. Tech. rep. 6062, INRIA, France
- Gelly, S., Wang, Y., Munos, R., & Teytaud, O. (2006). Modification of UCT with patterns in Monte-Carlo Go. Tech. rep. 6062, INRIA, France.
- (2006)
- Gelly, S.¹ Wang, Y.² Munos, R.³ Teytaud, O.⁴

16
- 29344449759
- Effective short-term opponent exploitation in simplified poker
- Proceedings of the 20th National Conference on Artificial Intelligence and the 17th Innovative Applications of Artificial Intelligence Conference, AAAI-05/IAAI-05
- Hoehn, B., Southey, F., Holte, R. C., & Bulitko, V. (2005). Effective short-term opponent exploitation in simplified poker. In AAAI, pp. 783-788. (Pubitemid 43006704)
- (2005) Proceedings of the National Conference on Artificial Intelligence , vol.2 , pp. 783-788
- Hoehn, B.¹ Southey, F.² Holte, R.C.³ Bulitko, V.⁴

17
- 34250765690
- Looping suffix tree-based inference of partially observable hidden state
- Holmes, M. P., & Isbell Jr., C. L. (2006). Looping suffix tree-based inference of partially observable hidden state. In ICML, pp. 409-416.
- (2006) ICML , pp. 409-416
- Holmes, M.P.¹ Isbell Jr., C.L.²

18
- 1642393842
- The fastest and shortest algorithm for all well-defined problems
- Hutter, M. (2002a). The fastest and shortest algorithm for all well-defined problems. International Journal of Foundations of Computer Science, 13(3), 431-443.
- (2002) International Journal of Foundations of Computer Science , vol.13 , Issue.3 , pp. 431-443
- Hutter, M.¹

19
- 84937417436
- Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures
- Lecture Notes in Artificial Intelligence. Springer
- Hutter, M. (2002b). Self-optimizing and Pareto-optimal policies in general environments based on Bayes-mixtures. In Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002), Lecture Notes in Artificial Intelligence. Springer.
- (2002) Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002)
- Hutter, M.¹

20
- 21844479189
- Springer
- Hutter, M. (2005). Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer.
- (2005) Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability
- Hutter, M.¹

21
- 78049319488
- Universal algorithmic intelligence: A mathematical top→down approach
- Springer, Berlin
- Hutter, M. (2007). Universal algorithmic intelligence: A mathematical top→down approach. In Artificial General Intelligence, pp. 227-290. Springer, Berlin.
- (2007) Artificial General Intelligence , pp. 227-290
- Hutter, M.¹

22
- 0032073263
- Planning and acting in partially observable stochastic domains
- PII S000437029800023X
- Kaelbling, L. P., Littman, M. L., & Cassandra, A. R. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101, 99-134. (Pubitemid 128387390)
- (1998) Artificial Intelligence , vol.101 , Issue.1-2 , pp. 99-134
- Kaelbling, L.P.¹ Littman, M.L.² Cassandra, A.R.³

23
- 33750293964
- Bandit based Monte-Carlo planning
- Machine Learning: ECML 2006 - 17th European Conference on Machine Learning, Proceedings
- Kocsis, L., & Szepesvári, C. (2006). Bandit based Monte-Carlo planning. In ECML, pp. 282-293. (Pubitemid 44618839)
- (2006) Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) , vol.4212 , pp. 282-293
- Kocsis, L.¹ Szepesvari, C.²

24
- 0006713882
- Inducing classification and regression trees in first order logic
- In Džeroski, S., & Lavrač, N. (Eds.), chap. 6. Springer
- Kramer, S., & Widmer, G. (2001). Inducing classification and regression trees in first order logic. In Džeroski, S., & Lavrač, N. (Eds.), Relational Data Mining, chap. 6. Springer.
- (2001) Relational Data Mining
- Kramer, S.¹ Widmer, G.²

25
- 0019539634
- The performance of universal coding
- Krichevsky, R., & Trofimov, V. (1981). The performance of universal coding. IEEE Transactions on Information Theory, IT-27, 199-207.
- (1981) IEEE Transactions on Information Theory, IT-27 , pp. 199-207
- Krichevsky, R.¹ Trofimov, V.²

26
- 0007901027
- A simplified two-person poker
- Kuhn, H. W. (1950). A simplified two-person poker. In Contributions to the Theory of Games, pp. 97-103.
- (1950) Contributions to the Theory of Games , pp. 97-103
- Kuhn, H.W.¹

27
- 79956345551
- Ergodic MDPs admit self-optimising policies. Tech. rep. IDSIA-21-04, Dalle Molle Institute for Artificial Intelligence (IDSIA)
- Legg, S., & Hutter, M. (2004). Ergodic MDPs admit self-optimising policies. Tech. rep. IDSIA-21-04, Dalle Molle Institute for Artificial Intelligence (IDSIA).
- (2004)
- Legg, S.¹ Hutter, M.²

28
- 77956163718
- Ph.D. thesis, Department of Informatics, University of Lugano
- Legg, S. (2008). Machine Super Intelligence. Ph.D. thesis, Department of Informatics, University of Lugano.
- (2008) Machine Super Intelligence
- Legg, S.¹

29
- 0000202647
- Universal sequential search problems
- Levin, L. A. (1973). Universal sequential search problems. Problems of Information Transmission, 9, 265-266.
- (1973) Problems of Information Transmission , vol.9 , pp. 265-266
- Levin, L.A.¹

30
- 0003680739
- (Third edition). Springer
- Li, M., & Vitányi, P. (2008). An Introduction to Kolmogorov Complexity and Its Applications (Third edition). Springer.
- (2008) An Introduction to Kolmogorov Complexity and Its Applications
- Li, M.¹ Vitányi, P.²

31
- 84898982129
- Predictive representations of state
- Littman, M., Sutton, R., & Singh, S. (2002). Predictive representations of state. In NIPS, pp. 1555-1561.
- (2002) NIPS , pp. 1555-1561
- Littman, M.¹ Sutton, R.² Singh, S.³

32
- 4143127458
- Springer
- Lloyd, J.W. (2003). Logic for Learning: Learning Comprehensible Theories from Structured Data. Springer.
- (2003) Logic for Learning: Learning Comprehensible Theories from Structured Data
- Lloyd, J.W.¹

33
- 38049144143
- Learning modal theories
- Lloyd, J.W., & Ng, K. S. (2007). Learning modal theories. In Proceedings of the 16th International Conference on Inductive Logic Programming, LNAI 4455, pp. 320-334.
- (2007) Proceedings of the 16th International Conference on Inductive Logic Programming, LNAI 4455 , pp. 320-334
- Lloyd, J.W.¹ Ng, K.S.²

34
- 71149083875
- Proto-predictive representation of states with simple recurrent temporal-difference networks
- Makino, T. (2009). Proto-predictive representation of states with simple recurrent temporal-difference networks. In ICML, pp. 697-704.
- (2009) ICML , pp. 697-704
- Makino, T.¹

35
- 0003932121
- Ph.D. thesis, University of Rochester
- McCallum, A. K. (1996). Reinforcement Learning with Selective Perception and Hidden State. Ph.D. thesis, University of Rochester.
- (1996) Reinforcement Learning with Selective Perception and Hidden State
- McCallum, A.K.¹

36
- 33750283133
- Ph.D. thesis, The Australian National University
- Ng, K. S. (2005). Learning Comprehensible Theories from Structured Data. Ph.D. thesis, The Australian National University.
- (2005) Learning Comprehensible Theories from Structured Data
- Ng, K.S.¹

37
- 79956365352
- A computational approximation to the AIXI model
- Pankov, S. (2008). A computational approximation to the AIXI model. In AGI, pp. 256-267.
- (2008) AGI , pp. 256-267
- Pankov, S.¹

38
- 33646515747
- Defensive universal learning with experts
- Springer
- Poland, J., & Hutter, M. (2005). Defensive universal learning with experts. In Proc. 16th International Conf. on Algorithmic Learning Theory, Vol. LNAI 3734, pp. 356-370. Springer.
- (2005) Proc. 16th International Conf. on Algorithmic Learning Theory, Vol. LNAI 3734 , pp. 356-370
- Poland, J.¹ Hutter, M.²

39
- 79956346776
- Universal learning of repeated matrix games. Tech. rep. 18-05, IDSIA
- Poland, J., & Hutter, M. (2006). Universal learning of repeated matrix games. Tech. rep. 18-05, IDSIA.
- (2006)
- Poland, J.¹ Hutter, M.²

40
- 77950356463
- Model-based bayesian reinforcement learning in partially observable domains
- Poupart, P., & Vlassis, N. (2008). Model-based bayesian reinforcement learning in partially observable domains. In ISAIM.
- (2008) ISAIM
- Poupart, P.¹ Vlassis, N.²

41
- 33749251297
- An analytic solution to discrete bayesian reinforcement learning
- New York, NY, USA. ACM
- Poupart, P., Vlassis, N., Hoey, J., & Regan, K. (2006). An analytic solution to discrete bayesian reinforcement learning. In ICML '06: Proceedings of the 23rd international conference on Machine learning, pp. 697-704 New York, NY, USA. ACM.
- (2006) ICML '06: Proceedings of the 23rd international conference on Machine learning , pp. 697-704
- Poupart, P.¹ Vlassis, N.² Hoey, J.³ Regan, K.⁴

42
- 0020824731
- A universal data compression system
- Rissanen, J. (1983). A universal data compression system. IEEE Transactions on Information Theory, 29(5), 656-663.
- (1983) IEEE Transactions on Information Theory , vol.29 , Issue.5 , pp. 656-663
- Rissanen, J.¹

43
- 0030282113
- The power of amnesia: Learning probabilistic automata with variable memory length
- Ron, D., Singer, Y., & Tishby, N. (1996). The power of amnesia: Learning probabilistic automata with variable memory length. Machine Learning, 25(2), 117-150.
- (1996) Machine Learning , vol.25 , Issue.2 , pp. 117-150
- Ron, D.¹ Singer, Y.² Tishby, N.³

44
- 14344256568
- Learning low dimensional predictive representations
- New York, NY, USA. ACM
- Rosencrantz, M., Gordon, G., & Thrun, S. (2004). Learning low dimensional predictive representations. In Proceedings of the twenty-first International Conference on Machine Learning, p. 88 New York, NY, USA. ACM.
- (2004) Proceedings of the twenty-first International Conference on Machine Learning , pp. 88
- Rosencrantz, M.¹ Gordon, G.² Thrun, S.³

45
- 85162018872
- Bayes-adaptive POMDPs
- In Platt, J., Koller, D., Singer, Y., & Roweis, S. (Eds.), MIT Press, Cambridge, MA
- Ross, S., Chaib-draa, B., & Pineau, J. (2008). Bayes-adaptive POMDPs. In Platt, J., Koller, D., Singer, Y., & Roweis, S. (Eds.), Advances in Neural Information Processing Systems 20, pp. 1225-1232. MIT Press, Cambridge, MA.
- (2008) Advances in Neural Information Processing Systems , vol.20 , pp. 1225-1232
- Ross, S.¹ Chaib-draa, B.² Pineau, J.³

46
- 0031186687
- Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement
- Schmidhuber, J., Zhao, J., & Wiering, M. A. (1997). Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning, 28, 105-130. (Pubitemid 127507171)
- (1997) Machine Learning , vol.28 , Issue.1 , pp. 105-130
- Schmidhuber, J.¹ Zhao, J.² Wiering, M.³

47
- 0031194381
- Discovering neural nets with low Kolmogorov complexity and high generalization capability
- DOI 10.1016/S0893-6080(96)00127-X, PII S089360809600127X
- Schmidhuber, J. (1997). Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks, 10(5), 857-873. (Pubitemid 27315721)
- (1997) Neural Networks , vol.10 , Issue.5 , pp. 857-873
- Schmidhuber, J.¹

48
- 84937439050
- The speed prior: A new simplicity measure yielding near-optimal computable predictions
- Schmidhuber, J. (2002). The speed prior: A new simplicity measure yielding near-optimal computable predictions. In Proc. 15th Annual Conf. on Computational Learning Theory, pp. 216-228.
- (2002) Proc. 15th Annual Conf. on Computational Learning Theory , pp. 216-228
- Schmidhuber, J.¹

49
- 84898985714
- Bias-optimal incremental problem solving
- MIT Press
- Schmidhuber, J. (2003). Bias-optimal incremental problem solving. In Advances in Neural Information Processing Systems 15, pp. 1571-1578. MIT Press.
- (2003) Advances in Neural Information Processing Systems , vol.15 , pp. 1571-1578
- Schmidhuber, J.¹

50
- 1642328943
- Optimal ordered problem solver
- Schmidhuber, J. (2004). Optimal ordered problem solver. Machine Learning, 54, 211-254.
- (2004) Machine Learning , vol.54 , pp. 211-254
- Schmidhuber, J.¹

51
- 79956368281
- Ph.D. thesis, Ben-Gurion University of the Negev
- Shani, G. (2007). Learning and Solving Partially Observable Markov Decision Processes. Ph.D. thesis, Ben-Gurion University of the Negev.
- (2007) Learning and Solving Partially Observable Markov Decision Processes
- Shani, G.¹

52
- 33646434962
- Resolving perceptual aliasing in the presence of noisy sensors
- Shani, G., & Brafman, R. (2004). Resolving perceptual aliasing in the presence of noisy sensors. In NIPS.
- (2004) NIPS
- Shani, G.¹ Brafman, R.²

53
- 71149102015
- Monte-carlo simulation balancing
- New York, NY, USA. ACM
- Silver, D., & Tesauro, G. (2009). Monte-carlo simulation balancing. In ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 945-952 New York, NY, USA. ACM.
- (2009) ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning , pp. 945-952
- Silver, D.¹ Tesauro, G.²

54
- 85161963598
- Monte-Carlo planning in large POMDPs
- To appear
- Silver, D., & Veness, J. (2010). Monte-Carlo Planning in Large POMDPs. In Advances in Neural Information Processing Systems (NIPS). To appear.
- (2010) Advances in Neural Information Processing Systems (NIPS)
- Silver, D.¹ Veness, J.²

55
- 33749263456
- Predictive state representations: A new theory for modeling dynamical systems
- Singh, S., James, M., & Rudary, M. (2004). Predictive state representations: A new theory for modeling dynamical systems. In UAI, pp. 512-519.
- (2004) UAI , pp. 512-519
- Singh, S.¹ James, M.² Rudary, M.³

56
- 4544279425
- A formal theory of inductive inference: Parts 1 and 2
- Solomonoff, R. J. (1964). A formal theory of inductive inference: Parts 1 and 2. Information and Control, 7, 1-22 and 224-254.
- (1964) Information and Control , vol.7 , pp. 1-22 , 224-254
- Solomonoff, R.J.¹

57
- 73549084301
- Reinforcement learning in finite MDPs: PAC analysis
- Strehl, A. L., Li, L., & Littman, M. L. (2009). Reinforcement learning in finite MDPs: PAC analysis. Journal of Machine Learning Research, 10, 2413-2444.
- (2009) Journal of Machine Learning Research , vol.10 , pp. 2413-2444
- Strehl, A.L.¹ Li, L.² Littman, M.L.³

58
- 33749255382
- PAC model-free reinforcement learning
- New York, NY, USA. ACM
- Strehl, A. L., Li, L., Wiewiora, E., Langford, J., & Littman, M. L. (2006). PAC model-free reinforcement learning. In ICML '06: Proceedings of the 23rd international conference on Machine learning, pp. 881-888 New York, NY, USA. ACM.
- (2006) ICML '06: Proceedings of the 23rd international conference on Machine learning , pp. 881-888
- Strehl, A.L.¹ Li, L.² Wiewiora, E.³ Langford, J.⁴ Littman, M.L.⁵

59
- 14344258433
- A Bayesian framework for reinforcement learning
- Strens, M. (2000). A Bayesian framework for reinforcement learning. In ICML, pp. 943-950.
- (2000) ICML , pp. 943-950
- Strens, M.¹

60
- 77958523564
- A reinforcement learning algorithm in partially observable environments using short-term memory
- Suematsu, N., & Hayashi, A. (1999). A reinforcement learning algorithm in partially observable environments using short-term memory. In NIPS, pp. 1059-1065.
- (1999) NIPS , pp. 1059-1065
- Suematsu, N.¹ Hayashi, A.²

61
- 79956373203
- A Bayesian approach to model learning in non-Markovian environment
- Suematsu, N., Hayashi, A., & Li, S. (1997). A Bayesian approach to model learning in non-Markovian environment. In ICML, pp. 349-357.
- (1997) ICML , pp. 349-357
- Suematsu, N.¹ Hayashi, A.² Li, S.³

62
- 0004102479
- MIT Press
- Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

63
- 31844431936
- Temporal-difference networks
- Sutton, R. S., & Tanner, B. (2004). Temporal-difference networks. In NIPS.
- (2004) NIPS
- Sutton, R.S.¹ Tanner, B.²

64
- 0141444595
- Context tree weighting: Multi-alphabet sources
- Tjalkens, T. J., Shtarkov, Y. M., & Willems, F. M. J. (1993). Context tree weighting: Multi-alphabet sources. In Proceedings of the 14th Symposium on Information Theory Benelux.
- (1993) Proceedings of the 14th Symposium on Information Theory Benelux
- Tjalkens, T.J.¹ Shtarkov, Y.M.² Willems, F.M.J.³

65
- 77958583913
- Reinforcement Learning via AIXI Approximation
- Veness, J., Ng, K. S., Hutter, M., & Silver, D. (2010). Reinforcement Learning via AIXI Approximation. In Proceedings of the Conference for the Association for the Advancement of Artificial Intelligence (AAAI).
- (2010) Proceedings of the Conference for the Association for the Advancement of Artificial Intelligence (AAAI)
- Veness, J.¹ Ng, K.S.² Hutter, M.³ Silver, D.⁴

66
- 84858720579
- Bootstrapping from game tree search
- Veness, J., Silver, D., Uther, W., & Blair, A. (2009). Bootstrapping from Game Tree Search. In Neural Information Processing Systems (NIPS).
- (2009) Neural Information Processing Systems (NIPS)
- Veness, J.¹ Silver, D.² Uther, W.³ Blair, A.⁴

67
- 31844436266
- Bayesian sparse sampling for on-line reward optimization
- Wang, T., Lizotte, D. J., Bowling, M. H., & Schuurmans, D. (2005). Bayesian sparse sampling for on-line reward optimization. In ICML, pp. 956-963.
- (2005) ICML , pp. 956-963
- Wang, T.¹ Lizotte, D.J.² Bowling, M.H.³ Schuurmans, D.⁴

68
- 34249833101
- Q-learning
- Watkins, C., & Dayan, P. (1992). Q-learning. Machine Learning, 8, 279-292.
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C.¹ Dayan, P.²

69
- 48249155421
- Reflections on "The Context Tree Weighting Method: Basic properties"
- Willems, F., Shtarkov, Y., & Tjalkens, T. (1997). Reflections on "The Context Tree Weighting Method: Basic properties". Newsletter of the IEEE Information Theory Society, 47(1).
- (1997) Newsletter of the IEEE Information Theory Society , vol.47 , Issue.1
- Willems, F.¹ Shtarkov, Y.² Tjalkens, T.³

70
- 0032022518
- The context-tree weighting method: Extensions
- PII S0018944898006543
- Willems, F. M. J. (1998). The context-tree weighting method: Extensions. IEEE Transactions on Information Theory, 44, 792-798. (Pubitemid 128737641)
- (1998) IEEE Transactions on Information Theory , vol.44 , Issue.2 , pp. 792-798
- Willems, F.M.J.¹

71
- 0030242432
- Context weighting for general finite-context sources
- Willems, F. M. J., Shtarkov, Y. M., & Tjalkens, T. J. (1996). Context weighting for general finite-context sources. IEEE Trans. Inform. Theory, 42, 1514-1520.
- (1996) IEEE Trans. Inform. Theory , vol.42 , pp. 1514-1520
- Willems, F.M.J.¹ Shtarkov, Y.M.² Tjalkens, T.J.³

72
- 0029307102
- The context tree weighting method: Basic properties
- Willems, F. M., Shtarkov, Y. M., & Tjalkens, T. J. (1995). The context tree weighting method: Basic properties. IEEE Transactions on Information Theory, 41, 653-664.
- (1995) IEEE Transactions on Information Theory , vol.41 , pp. 653-664
- Willems, F.M.¹ Shtarkov, Y.M.² Tjalkens, T.J.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.