Volume 5924 LNAI, 2010, Pages 1-32

Abstraction and generalization in reinforcement learning: A summary and framework

Author keywords

[No Author keywords available]

Indexed keywords

BASIS FUNCTIONS; TRANSFER LEARNING

EID: 77950871800     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-642-11814-2_1     Document Type: Conference Paper
Times cited: 40

References (53)
  • 3. Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)
  • 5. Boyan, J.A., Moore, A.W.: Generalization in reinforcement learning: Safely approximating the value function. In: Tesauro, G., Touretzky, D.S., Leen, T.K. (eds.) Advances in Neural Information Processing Systems, vol. 7, pp. 369-376. MIT Press, Cambridge (1995)
  • 6. Brafman, R.I., Tennenholtz, M.: R-max - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research 3, 213-231 (2003)
  • 7. Brooks, R.A.: Intelligence without representation. Artificial Intelligence 47, 139-159 (1991)
  • 8. Caruana, R.: Multitask learning. Machine Learning 28, 41-75 (1997)
  • 9. Crites, R.H., Barto, A.G.: Improving elevator performance using reinforcement learning. In: Touretzky, D.S., Mozer, M.C., Hasselmo, M.E. (eds.) Advances in Neural Information Processing Systems, vol. 8, pp. 1017-1023. MIT Press, Cambridge (1996)
  • 10. Croonenborghs, T., Driessens, K., Bruynooghe, M.: Learning relational options for inductive transfer in relational reinforcement learning. In: Blockeel, H., Ramon, J., Shavlik, J., Tadepalli, P. (eds.) ILP 2007. LNCS (LNAI), vol. 4894, pp. 88-97. Springer, Heidelberg (2008)
  • 12. Dietterich, T.: An overview of MAXQ hierarchical reinforcement learning. In: Choueiry, B.Y., Walsh, T. (eds.) SARA 2000. LNCS (LNAI), vol. 1864, pp. 26-44. Springer, Heidelberg (2000)
  • 20. Kearns, M., Singh, S.: Near-optimal reinforcement learning in polynomial time. In: Proc. 15th International Conf. on Machine Learning, pp. 260-268. Morgan Kaufmann, San Francisco (1998)
  • 25. Mahadevan, S., Maggioni, M.: Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes. Journal of Machine Learning Research 8, 2169-2231 (2007)
  • 26. Muggleton, S., De Raedt, L.: Inductive logic programming: Theory and methods. Journal of Logic Programming 19-20, 629-679 (1994)
  • 37. Stone, P., Kuhlmann, G., Taylor, M.E., Liu, Y.: Keepaway soccer: From machine learning testbed to benchmark. In: Bredenfeld, A., Jacoff, A., Noda, I., Takahashi, Y. (eds.) RoboCup 2005. LNCS (LNAI), vol. 4020, pp. 93-105. Springer, Heidelberg (2006)
  • 38. Sutton, R.: Dyna, an integrated architecture for learning, planning, and reacting. SIGART Bulletin 2, 160-163 (1991)
  • 40. Sutton, R., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112, 181-211 (1999)
  • 42. Taylor, M.E., Jong, N.K., Stone, P.: Transferring instances for model-based reinforcement learning. In: Daelemans, W., Goethals, B., Morik, K. (eds.) ECML PKDD 2008, Part II. LNCS (LNAI), vol. 5212, pp. 488-505. Springer, Heidelberg (2008)
  • 44. Taylor, M.E., Stone, P.: Transfer learning for reinforcement learning domains: A survey. Journal of Machine Learning Research 10, 1633-1685 (2009)
  • 45. Taylor, M.E., Stone, P., Liu, Y.: Transfer learning via inter-task mappings for temporal difference learning. Journal of Machine Learning Research 8, 2125-2167 (2007)
  • 46. Tesauro, G.: TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation 6(2), 215-219 (1994)
  • 47. Thorndike, E., Woodworth, R.: The influence of improvement in one mental function upon the efficiency of other functions. Psychological Review 8, 247-261 (1901)
  • 48. Thrun, S.: Is learning the n-th thing any easier than learning the first? In: Advances in Neural Information Processing Systems, vol. 8, pp. 640-646 (1996)
  • 49. Torrey, L., Shavlik, J.W., Walker, T., Maclin, R.: Relational macros for transfer in reinforcement learning. In: Blockeel, H., Ramon, J., Shavlik, J., Tadepalli, P. (eds.) ILP 2007. LNCS (LNAI), vol. 4894, pp. 254-268. Springer, Heidelberg (2008)


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.