Volume 5924 LNAI, 2010, Pages 1-32

Abstraction and generalization in reinforcement learning: A summary and framework

Author keywords

[No Author keywords available]

Indexed keywords

BASIS FUNCTIONS; TRANSFER LEARNING

EID: 77950871800     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/978-3-642-11814-2_1     Document Type: Conference Paper
Times cited: 40

References (53)
  • 3. Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)
  • 5. Boyan, J.A., Moore, A.W.: Generalization in reinforcement learning: Safely approximating the value function. In: Tesauro, G., Touretzky, D.S., Leen, T.K. (eds.) Advances in Neural Information Processing Systems, vol. 7, pp. 369-376. MIT Press, Cambridge (1995)
  • 6. Brafman, R.I., Tennenholtz, M.: R-max - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research 3, 213-231 (2003)
  • 7. Brooks, R.A.: Intelligence without representation. Artificial Intelligence 47, 139-159 (1991)
  • 8. Caruana, R.: Multitask learning. Machine Learning 28, 41-75 (1997)
  • 9. Crites, R.H., Barto, A.G.: Improving elevator performance using reinforcement learning. In: Touretzky, D.S., Mozer, M.C., Hasselmo, M.E. (eds.) Advances in Neural Information Processing Systems, vol. 8, pp. 1017-1023. MIT Press, Cambridge (1996)
  • 10. Croonenborghs, T., Driessens, K., Bruynooghe, M.: Learning relational options for inductive transfer in relational reinforcement learning. In: Blockeel, H., Ramon, J., Shavlik, J., Tadepalli, P. (eds.) ILP 2007. LNCS (LNAI), vol. 4894, pp. 88-97. Springer, Heidelberg (2008)
  • 12. Dietterich, T.: An overview of MAXQ hierarchical reinforcement learning. In: Choueiry, B.Y., Walsh, T. (eds.) SARA 2000. LNCS (LNAI), vol. 1864, pp. 26-44. Springer, Heidelberg (2000)
  • 20. Kearns, M., Singh, S.: Near-optimal reinforcement learning in polynomial time. In: Proc. 15th International Conf. on Machine Learning, pp. 260-268. Morgan Kaufmann, San Francisco (1998)
  • 25. Mahadevan, S., Maggioni, M.: Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes. Journal of Machine Learning Research 8, 2169-2231 (2007)
  • 26. Muggleton, S., De Raedt, L.: Inductive logic programming: Theory and methods. Journal of Logic Programming 19-20, 629-679 (1994)
  • 37. Stone, P., Kuhlmann, G., Taylor, M.E., Liu, Y.: Keepaway soccer: From machine learning testbed to benchmark. In: Bredenfeld, A., Jacoff, A., Noda, I., Takahashi, Y. (eds.) RoboCup 2005. LNCS (LNAI), vol. 4020, pp. 93-105. Springer, Heidelberg (2006)
  • 38. Sutton, R.: Dyna, an integrated architecture for learning, planning, and reacting. SIGART Bulletin 2, 160-163 (1991)
  • 40. Sutton, R., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112, 181-211 (1999)
  • 42. Taylor, M.E., Jong, N.K., Stone, P.: Transferring instances for model-based reinforcement learning. In: Daelemans, W., Goethals, B., Morik, K. (eds.) ECML PKDD 2008, Part II. LNCS (LNAI), vol. 5212, pp. 488-505. Springer, Heidelberg (2008)
  • 44. Taylor, M.E., Stone, P.: Transfer learning for reinforcement learning domains: A survey. Journal of Machine Learning Research 10, 1633-1685 (2009)
  • 45. Taylor, M.E., Stone, P., Liu, Y.: Transfer learning via inter-task mappings for temporal difference learning. Journal of Machine Learning Research 8, 2125-2167 (2007)
  • 46. Tesauro, G.: TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation 6(2), 215-219 (1994)
  • 47. Thorndike, E., Woodworth, R.: The influence of improvement in one mental function upon the efficiency of other functions. Psychological Review 8, 247-261 (1901)
  • 48. Thrun, S.: Is learning the n-th thing any easier than learning the first? In: Advances in Neural Information Processing Systems, vol. 8, pp. 640-646 (1996)
  • 49. Torrey, L., Shavlik, J.W., Walker, T., Maclin, R.: Relational macros for transfer in reinforcement learning. In: Blockeel, H., Ramon, J., Shavlik, J., Tadepalli, P. (eds.) ILP 2007. LNCS (LNAI), vol. 4894, pp. 254-268. Springer, Heidelberg (2008)


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.