Volume 9783642398759, Issue , 2014, Pages 13-46

Behavioral hierarchy: Exploration and representation

Author keywords

[No Author keywords available]

Indexed keywords

ARTIFICIAL AGENTS; BUILDING BLOCKS; COMPLEX DOMAINS; EXPLORATORY BEHAVIOURS; HIERARCHICAL REINFORCEMENT LEARNING; MULTIPLE LEVELS; NATURAL AGENTS;

EID: 84907552069     PISSN: None     EISSN: None     Source Type: Book    
DOI: 10.1007/978-3-642-39875-9_2     Document Type: Chapter
Times cited: 19

References (100)
  • 1
    • Alur, R., McDougall, M., Yang, Z. (2002). Exploiting behavioral hierarchy for efficient model checking. In E. Brinksma & K. G. Larsen (Eds.), Computer aided verification: 14th international conference, proceedings (Lecture notes in computer science) (pp. 338-342). Berlin: Springer.
  • 2
    • Amarel, S. (1981). Problems of representation in heuristic problem solving: related issues in the development of expert systems. Technical Report CBM-TR-118, Laboratory for Computer Science, Rutgers University, New Brunswick, NJ.
  • 3
    • Anderson, J. R. (2004). An integrated theory of mind. Psychological Review, 111, 1036-1060.
  • 5
    • Bakker, B., & Schmidhuber, J. (2004). Hierarchical reinforcement learning based on subgoal discovery and subpolicy specialization. In F. Groen, N. Amato, A. Bonarini, E. Yoshida, B. Kröse (Eds.), Proceedings of the 8th conference on intelligent autonomous systems, IAS-8 (pp. 438-445). Amsterdam, The Netherlands: IOS.
  • 7
    • Barto, A., Singh, S., Chentanez, N. (2004). Intrinsically motivated learning of hierarchical collections of skills. In J. Triesch & T. Jebara (Eds.), Proceedings of the 2004 international conference on development and learning (pp. 112-119). UCSD Institute for Neural Computation.
  • 8
    • Barto, A. G. (2012). Intrinsic motivation and reinforcement learning. In G. Baldassarre & M. Mirolli (Eds.), Intrinsically motivated learning in natural and artificial systems. Berlin: Springer.
  • 10
    • Bellman, R. E. (1957). Dynamic programming. Princeton: Princeton University Press.
  • 12
    • Botvinick, M. M., & Plaut, D. C. (2004). Doing without schema hierarchies: A recurrent connectionist approach to normal and impaired routine sequential action. Psychological Review, 111, 395-429.
  • 13
    • Botvinick, M. M., Niv, Y., Barto, A. G. (2009). Hierarchically organized behavior and its neural foundations: A reinforcement-learning perspective. Cognition, 113, 262-280.
  • 14
    • Boutilier, C., Dearden, R., Goldszmidt, M. (2000). Stochastic dynamic programming with factored representations. Artificial Intelligence, 121, 49-107.
  • 20
    • Dean, T. L., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5, 142-150.
  • 21
    • Degris, T., Sigaud, O., Wuillemin, P. H. (2006). Learning the structure of factored Markov decision processes in reinforcement learning problems. In W. W. Cohen & A. Moore (Eds.), Machine learning, proceedings of the twenty-third international conference (ICML 2006). ACM international conference proceeding series (vol. 148, pp. 257-264). New York: ACM.
  • 22
    • Dietterich, T. G. (2000a). Hierarchical reinforcement learning with the MAXQ value function decomposition. Journal of Artificial Intelligence Research, 13, 227-303.
  • 23
    • Dietterich, T. G. (2000b). State abstraction in MAXQ hierarchical reinforcement learning. In S. A. Solla, T. K. Leen, K.-R. Müller (Eds.), Advances in neural information processing systems 12 (pp. 994-1000). Cambridge: MIT.
  • 24
    • Digney, B. (1996). Emergent hierarchical control structures: learning reactive/hierarchical relationships in reinforcement environments. In P. Maes, M. Mataric, J.-A. Meyer, J. Pollack, S. W. Wilson (Eds.), From animals to animats 4: proceedings of the fourth international conference on simulation of adaptive behavior (pp. 363-372). Cambridge: MIT.
  • 25
    • Diuk, C., Li, L., Leffler, B. (2009). The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning. In A. P. Danyluk, L. Bottou, M. L. Littman (Eds.), Proceedings of the 26th annual international conference on machine learning, ICML 2009. ACM international conference proceeding series (vol. 382, pp. 249-256). New York: ACM.
  • 29
    • Hart, S., & Grupen, R. (2011). Learning generalizable control programs. IEEE Transactions on Autonomous Mental Development, 3, 216-231. Special Issue on Representations and Architectures for Cognitive Systems.
  • 30
    • Hart, S., & Grupen, R. (2012). Intrinsically motivated affordance discovery and modeling. In G. Baldassarre & M. Mirolli (Eds.), Intrinsically motivated learning in natural and artificial systems. Berlin: Springer.
  • 31
    • Heckerman, D., Geiger, D., Chickering, D. (1995). Learning Bayesian networks: the combination of knowledge and statistical data. Machine Learning, 20, 197-243.
  • 32
    • Hengst, B. (2002). Discovering hierarchy in reinforcement learning with HEXQ. In C. Sammut & A. G. Hoffmann (Eds.), Machine learning, proceedings of the nineteenth international conference (ICML 2002) (pp. 243-250). San Francisco: Morgan Kaufmann.
  • 33
    • Huber, M., & Grupen, R. A. (1997). A feedback control structure for on-line learning tasks. Robotics and Autonomous Systems, 22, 303-315.
  • 34
    • Gibson, J. (1977). The theory of affordances. In R. Shaw & J. Bransford (Eds.), Perceiving, acting, and knowing: toward an ecological psychology (pp. 67-82). Hillsdale: Lawrence Erlbaum.
  • 37
    • Jonsson, A., & Barto, A. G. (2007). Active learning of dynamic Bayesian networks in Markov decision processes. In I. Miguel & W. Ruml (Eds.), Abstraction, reformulation, and approximation: 7th international symposium, SARA 2007, Whistler, Canada, July 18-21, 2007, proceedings (Lecture notes in computer science, vol. 4612, pp. 273-284). Berlin: Springer.
  • 38
    • Konidaris, G., & Barto, A. (2007). Building portable options: Skill transfer in reinforcement learning. In M. Veloso (Ed.), IJCAI 2007, proceedings of the 20th international joint conference on artificial intelligence, Hyderabad, India, 6-12 January 2007 (pp. 895-900). Menlo Park: AAAI Press.
  • 39
    • Konidaris, G., & Barto, A. (2009a). Efficient skill learning using abstraction selection. In C. Boutilier (Ed.), IJCAI 2009, proceedings of the 21st international joint conference on artificial intelligence, Pasadena, California, USA, 11-17 July 2009 (pp. 1107-1112). Menlo Park: AAAI Press.
  • 40
    • Konidaris, G., & Barto, A. (2009b). Skill discovery in continuous reinforcement learning domains using skill chaining. In Y. Bengio, D. Schuurmans, J. Lafferty, C. Williams, A. Culotta (Eds.), Advances in neural information processing systems 22 (pp. 1015-1023). NIPS Foundation.
  • 45
    • Konidaris, G. D. (2011). Autonomous robot skill acquisition. PhD thesis, Computer Science, University of Massachusetts Amherst.
  • 47
    • Langley, P., Choi, D., Rogers, S. (2009). Acquisition of hierarchical reactive skills in a unified cognitive architecture. Cognitive Systems Research, 10, 316-332.
  • 49
    • Lashley, K. S. (1951). The problem of serial order in behavior. In L. A. Jeffress (Ed.), Cerebral mechanisms in behavior: the Hixon symposium (pp. 112-136). New York: Wiley.
  • 50
    • Lewis, F. L., & Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, 9, 32-50. IEEE Circuits and Systems Society.
  • 53
    • Mahadevan, S. (2009). Learning representation and control in Markov decision processes: new frontiers. Foundations and trends in machine learning (vol. 1). Hanover: Now Publishers Inc.
  • 54
    • Mannor, S., Menache, I., Hoze, A., Klein, U. (2004). Dynamic abstraction in reinforcement learning via clustering. In C. E. Brodley (Ed.), Machine learning, proceedings of the twenty-first international conference (ICML 2004). ACM international conference proceeding series (vol. 69, pp. 560-567). New York: ACM.
  • 56
    • McGovern, A., & Barto, A. (2001). Automatic discovery of subgoals in reinforcement learning using diverse density. In C. E. Brodley & A. P. Danyluk (Eds.), Proceedings of the eighteenth international conference on machine learning (ICML 2001) (pp. 361-368). San Francisco: Morgan Kaufmann.
  • 57
    • Mehta, N., Natarajan, S., Tadepalli, P. (2008). Transfer in variable-reward hierarchical reinforcement learning. Machine Learning, 73, 289-312.
  • 58
    • Menache, I., Mannor, S., Shimkin, N. (2002). Q-Cut - Dynamic discovery of sub-goals in reinforcement learning. In Machine learning: ECML 2002, 13th European conference on machine learning. Lecture notes in computer science (vol. 2430, pp. 295-306). Berlin: Springer.
  • 60
    • Mugan, J., & Kuipers, B. (2009). Autonomously learning an action hierarchy using a learned qualitative state representation. In C. Boutilier (Ed.), IJCAI 2009, proceedings of the 21st international joint conference on artificial intelligence, Pasadena, California, USA, 11-17 July 2009 (pp. 1175-1180). Menlo Park: AAAI Press.
  • 62
    • Neumann, G., Maass, W., Peters, J. (2009). Learning complex motions by sequencing simpler motion templates. In A. P. Danyluk, L. Bottou, M. L. Littman (Eds.), Proceedings of the 26th annual international conference on machine learning, ICML 2009. ACM international conference proceeding series (vol. 382, pp. 753-760). New York: ACM.
  • 63
    • Newell, A., Shaw, J. C., Simon, H. A. (1963). GPS, a program that simulates human thought. In J. Feldman (Ed.), Computers and thought (pp. 279-293). New York: McGraw-Hill.
  • 64
    • Niekum, S., & Barto, A. G. (2011). Clustering via Dirichlet process mixture models for portable skill discovery. In J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira, K. Weinberger (Eds.), Advances in neural information processing systems 24 (NIPS) (pp. 1818-1826). Curran Associates.
  • 65
    • Osentoski, S., & Mahadevan, S. (2010). Basis function construction for hierarchical reinforcement learning. In W. van der Hoek, G. A. Kaminka, Y. Lespérance, M. Luck, S. Sen (Eds.), 9th international conference on autonomous agents and multiagent systems (AAMAS 2010) (pp. 747-754). International Foundation for Autonomous Agents and MultiAgent Systems (IFAAMAS).
  • 70
    • Pickett, M., & Barto, A. G. (2002). PolicyBlocks: An algorithm for creating useful macro-actions in reinforcement learning. In C. Sammut & A. Hoffmann (Eds.), Machine learning, proceedings of the nineteenth international conference (ICML 2002) (pp. 506-513). San Francisco: Morgan Kaufmann.
  • 71
    • Ravindran, B., & Barto, A. G. (2002). Model minimization in hierarchical reinforcement learning. In S. Koenig & R. C. Holte (Eds.), Abstraction, reformulation and approximation, 5th international symposium, SARA 2002, Kananaskis, Alberta, Canada, 2-4 August 2002, proceedings. Lecture notes in computer science (vol. 2371, pp. 196-211). Berlin: Springer.
  • 72
    • Ryan, R. M., & Deci, E. L. (2000). Intrinsic and extrinsic motivations: classic definitions and new directions. Contemporary Educational Psychology, 25, 54-67.
  • 73
    • Sacerdoti, E. D. (1974). Planning in a hierarchy of abstraction spaces. Artificial Intelligence, 5, 115-135.
  • 76
    • Schneider, D. W., & Logan, G. D. (2006). Hierarchical control of cognitive processes: switching tasks in sequences. Journal of Experimental Psychology: General, 135, 623-640.
  • 78
    • Simon, H. A. (2005). The structure of complexity in an evolving world: the role of near decomposability. In W. Callebaut & D. Rasskin-Gutman (Eds.), Modularity: understanding the development and evolution of natural complex systems (pp. ix-xiii). Cambridge: MIT.
  • 79
    • Şimşek, Ö., & Barto, A. (2004). Using relative novelty to identify useful temporal abstractions in reinforcement learning. In C. E. Brodley (Ed.), Machine learning, proceedings of the twenty-first international conference (ICML 2004). ACM international conference proceeding series (vol. 69, pp. 751-758). New York: ACM.
  • 80
    • Şimşek, Ö., & Barto, A. (2009). Skill characterization based on betweenness. In D. Koller, D. Schuurmans, Y. Bengio, L. Bottou (Eds.), Advances in neural information processing systems 21, proceedings of the twenty-second annual conference on neural information processing systems (pp. 1497-1504). Red Hook: Curran Associates, Inc.
  • 81
    • Şimşek, Ö., Wolfe, A. P., Barto, A. (2005). Identifying useful subgoals in reinforcement learning by local graph partitioning. In L. D. Raedt & S. Wrobel (Eds.), Machine learning, proceedings of the twenty-second international conference (ICML 2005). ACM international conference proceeding series (vol. 119, pp. 816-823). New York: ACM.
  • 83
    • Singh, S., Lewis, R. L., Barto, A. G., Sorg, J. (2010). Intrinsically motivated reinforcement learning: An evolutionary perspective. IEEE Transactions on Autonomous Mental Development, 2, 70-82. Special issue on Active Learning and Intrinsically Motivated Exploration in Robots: Advances and Challenges.
  • 84
    • Soni, V., & Singh, S. (2006). Reinforcement learning of hierarchical skills on the Sony Aibo robot. In L. Smith, O. Sporns, C. Yu, M. Gasser, C. Breazeal, G. Deak, J. Weng (Eds.), Fifth international conference on development and learning (ICDL). Bloomington, IN.
  • 88
    • Sutton, R. S., Precup, D., Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112, 181-211.
  • 89
    • Taylor, M. E., & Stone, P. (2009). Transfer learning for reinforcement learning domains: A survey. Journal of Machine Learning Research, 10, 1633-1685.
  • 90
    • Taylor, M. E., Stone, P., Liu, Y. (2007). Transfer learning via inter-task mappings for temporal difference learning. Journal of Machine Learning Research, 8, 2125-2167.
  • 93
    • Tesauro, G. J. (1994). TD-gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6, 215-219.
  • 96
    • Torrey, L., Shavlik, J., Walker, J., Maclin, R. (2008). Relational macros for transfer in reinforcement learning. In H. Blockeel, J. Ramon, J. Shavlik, P. Tadepalli (Eds.), Inductive logic programming: 17th international conference, ILP 2007. Lecture notes in computer science (vol. 4894, pp. 254-268). Berlin: Springer.
  • 97
    • van Seijen, H., Whiteson, S., Kester, L. (2007). Switching between representations in reinforcement learning. In R. Babuska & F. C. A. Groen (Eds.), Interactive collaborative information systems. Studies in computational intelligence (vol. 281, pp. 65-84). Berlin: Springer.
  • 98
    • Vigorito, C., & Barto, A. G. (2010). Intrinsically motivated hierarchical skill learning in structured environments. IEEE Transactions on Autonomous Mental Development, 2, 83-90. Special issue on Active Learning and Intrinsically Motivated Exploration in Robots: Advances and Challenges.
  • 100
    • White, R. W. (1959). Motivation reconsidered: the concept of competence. Psychological Review, 66, 297-333.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.