Volume 131, Issue 3, 2012, Pages 139-148

An information-theoretic approach to curiosity-driven reinforcement learning

Author keywords

Adaptive behavior; Curiosity; Exploration exploitation trade off; Information theory; Rate distortion theory; Reinforcement learning

Indexed keywords

ALGORITHM; ANIMAL; ARTICLE; EXPLORATORY BEHAVIOR; HUMAN; INFORMATION SCIENCE; LEARNING;

EID: 84865114997     PISSN: 14317613     EISSN: 16117530     Source Type: Journal    
DOI: 10.1007/s12064-011-0142-z     Document Type: Article
Times cited: 202

References (36)
  • 1
    • 49249134328
    • Predictive information and explorative behavior of autonomous robots
    • 10.1140/epjb/e2008-00175-0 1:CAS:528:DC%2BD1cXpt1Whsbk%3D
    • N Ay N Bertschinger R Der F Guttler E Olbrich 2008 Predictive information and explorative behavior of autonomous robots European Physical Journal B 63 329 339 10.1140/epjb/e2008-00175-0 1:CAS:528:DC%2BD1cXpt1Whsbk%3D
    • (2008) European Physical Journal B , vol.63 , pp. 329-339
    • Ay, N.1    Bertschinger, N.2    Der, R.3    Guttler, F.4    Olbrich, E.5
  • 4
    • 0035514587
    • Predictability, complexity, and learning
    • DOI 10.1162/089976601753195969
    • W Bialek I Nemenman N Tishby 2001 Predictability, complexity, and learning Neural Comput 13 2409 2463 11674845 10.1162/089976601753195969 1:STN:280:DC%2BD3Mrmslyiuw%3D%3D (Pubitemid 33594578)
    • (2001) Neural Computation , vol.13 , Issue.11 , pp. 2409-2463
    • Bialek, W.1    Nemenman, I.2    Tishby, N.3
  • 5
    • 0041965975
    • R-max-a general polynomial time algorithm for near-optimal reinforcement learning
    • Brafman RI, Tennenholtz M (2002) R-max-a general polynomial time algorithm for near-optimal reinforcement learning. J Mach Learn Res 3:213-231
    • (2002) J Mach Learn Res , vol.3 , pp. 213-231
    • Brafman, R.I.1    Tennenholtz, M.2
  • 7
    • 84872673342
    • Optimal manifold representation of data: An information theoretic perspective
    • Thrun S, Saul L, Schölkopf B (eds) MIT Press, Cambridge, MA
    • Chigirev DV, Bialek W (2004) Optimal manifold representation of data: an information theoretic perspective. In: Thrun S, Saul L, Schölkopf B (eds) Advances in neural information processing systems 16. MIT Press, Cambridge, MA
    • (2004) Advances in Neural Information Processing Systems 16
    • Chigirev, D.V.1    Bialek, W.2
  • 8
    • 0042264936
    • Synchronizing to the environment: Information theoretic limits on agent learning
    • JP Crutchfield DP Feldman 2001 Synchronizing to the environment: Information theoretic limits on agent learning Adv in Complex Systems 4 2 251 264
    • (2001) Adv in Complex Systems , vol.4 , Issue.2 , pp. 251-264
    • Crutchfield, J.P.1    Feldman, D.P.2
  • 9
    • 0037357392
    • Regularities unseen, randomness observed: Levels of entropy convergence
    • DOI 10.1063/1.1530990
    • JP Crutchfield DP Feldman 2003 Regularities unseen, randomness observed: Levels of entropy convergence Chaos 13 1 25 54 12675408 10.1063/1.1530990 (Pubitemid 36419900)
    • (2003) Chaos , vol.13 , Issue.1 , pp. 25-54
    • Crutchfield, J.P.1    Feldman, D.P.2
  • 10
    • 11944266539
    • Information Theory and Statistical Mechanics
    • 10.1103/PhysRev.106.620
    • ET Jaynes 1957 Information Theory and Statistical Mechanics Phys Rev 106 4 620 630 10.1103/PhysRev.106.620
    • (1957) Phys Rev , vol.106 , Issue.4 , pp. 620-630
    • Jaynes, E.T.1
  • 13
    • 34047267520
    • Intrinsic motivation systems for autonomous mental development
    • DOI 10.1109/TEVC.2006.890271, Convergent Approaches to the Understanding of Autonomous Mental Development
    • P-Y Oudeyer F Kaplan V Hafner 2007 Intrinsic motivation systems for autonomous mental development IEEE Transactions on Evolutionary Computation 11 2 265 286 10.1109/TEVC.2006.890271 (Pubitemid 46547111)
    • (2007) IEEE Transactions on Evolutionary Computation , vol.11 , Issue.2 , pp. 265-286
    • Oudeyer, P.-Y.1    Kaplan, F.2    Hafner, V.V.3
  • 16
    • 44949241322
    • Reinforcement learning of motor skills with policy gradients
    • 18482830 10.1016/j.neunet.2008.02.003
    • J Peters S Schaal 2008 Reinforcement learning of motor skills with policy gradients Neural Networks 21 4 682 697 18482830 10.1016/j.neunet.2008.02.003
    • (2008) Neural Networks , vol.21 , Issue.4 , pp. 682-697
    • Peters, J.1    Schaal, S.2
  • 17
    • 9444263770
    • Using MDP Characteristics to Guide Exploration in Reinforcement Learning
    • Machine Learning: ECML 2003
    • Ratitch B, Precup D (2003) Using MDP characteristics to guide exploration in reinforcement learning. In: Proceedings of ECML, pp 313-324 (Pubitemid 37230987)
    • (2003) Lecture Notes in Computer Science , Issue.2837 , pp. 313-324
    • Ratitch, B.1    Precup, D.2
  • 18
    • 0032202775
    • Deterministic annealing for clustering, compression, classification, regression, and related optimization problems
    • PII S0018921998078608
    • K Rose 1998 Deterministic annealing for clustering, compression, classification, regression, and related optimization problems Proc. IEEE 86 11 2210 2239 10.1109/5.726788 (Pubitemid 128720301)
    • (1998) Proceedings of the IEEE , vol.86 , Issue.11 , pp. 2210-2239
    • Rose, K.1
  • 19
    • 0000389568
    • Statistical Mechanics and Phase Transitions in Clustering
    • 10043066 10.1103/PhysRevLett.65.945
    • K Rose E Gurewitz GC Fox 1990 Statistical Mechanics and Phase Transitions in Clustering Phys. Rev. Lett 65 8 945 948 10043066 10.1103/PhysRevLett.65.945
    • (1990) Phys. Rev. Lett , vol.65 , Issue.8 , pp. 945-948
    • Rose, K.1    Gurewitz, E.2    Fox, G.C.3
  • 20
    • 0026306990
    • Curious model-building control systems
    • Schmidhuber J (1991) Curious model-building control systems. In Proceedings of IJCNN, pp 1458-1463
    • (1991) Proceedings of IJCNN , pp. 1458-1463
    • Schmidhuber, J.1
  • 21
    • 77954092659
    • Art and science as by-products of the search for novel patterns, or data compressible in unknown yet learnable ways
    • Swiss Design Network-et al. Edizioni 2009
    • Schmidhuber J (2009) Art and science as by-products of the search for novel patterns, or data compressible in unknown yet learnable ways. In: Multiple ways to design research. Research cases that reshape the design discipline. Swiss Design Network-et al. Edizioni, 2009, pp 98-112
    • (2009) Multiple Ways to Design Research. Research Cases That Reshape the Design Discipline , pp. 98-112
    • Schmidhuber, J.1
  • 22
    • 84856043672
    • A mathematical theory of communication
    • 623-656
    • Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379-423, 623-656
    • (1948) Bell Syst Tech J , vol.27 , pp. 379-423
    • Shannon, C.E.1
  • 25
    • 79051470133
    • Information-theoretic approach to interactive learning
    • doi: 10.1209/0295-5075/85/28005
    • Still S (2009) Information-theoretic approach to interactive learning. EPL 85 28005. doi: 10.1209/0295-5075/85/28005
    • (2009) EPL , vol.85 , pp. 28005
    • Still, S.1
  • 26
    • 10044254422
    • How many clusters? An information-theoretic perspective
    • DOI 10.1162/0899766042321751
    • S Still W Bialek 2004 How many clusters? An information theoretic perspective Neural Computation 16 12 2483 2506 15516271 10.1162/0899766042321751 (Pubitemid 39604007)
    • (2004) Neural Computation , vol.16 , Issue.12 , pp. 2483-2506
    • Still, S.1    Bialek, W.2
  • 27
    • 84898998530
    • Geometric clustering using the information bottleneck method
    • Thrun S, Saul LK, Schölkopf B (eds) MIT Press, Cambridge, MA
    • Still S, Bialek W, Bottou L (2004) Geometric clustering using the information bottleneck method. In: Thrun S, Saul LK, Schölkopf B (eds) Advances in neural information processing systems 16. MIT Press, Cambridge, MA
    • (2004) Advances in Neural Information Processing Systems 16
    • Still, S.1    Bialek, W.2    Bottou, L.3
  • 30
    • 68949157375
    • Transfer learning for reinforcement learning domains: A survey
    • ME Taylor P Stone 2009 Transfer learning for reinforcement learning domains: A survey Journal of Machine Learning Research 10 1 1633 1685
    • (2009) Journal of Machine Learning Research , vol.10 , Issue.1 , pp. 1633-1685
    • Taylor, M.E.1    Stone, P.2
  • 34
    • 67650915125
    • Efficient computation of optimal actions
    • 19574462 10.1073/pnas.0710743106 1:CAS:528:DC%2BD1MXptVKjsL8%3D
    • E Todorov 2009 Efficient computation of optimal actions PNAS 106 28 11478 11483 19574462 10.1073/pnas.0710743106 1:CAS:528:DC%2BD1MXptVKjsL8%3D
    • (2009) PNAS , vol.106 , Issue.28 , pp. 11478-11483
    • Todorov, E.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.