SCOPUS Information Search Platform

Philosophical Transactions of the Royal Society B: Biological Sciences

Volume 369, Issue 1655, 2014, Pages

Model-based hierarchical reinforcement learning and human action control

(2) Botvinick, Matthew (a); Weinstein, Ari (a)

(a) PRINCETON UNIVERSITY (United States)

Author keywords

Goal directed behaviour; Hierarchy; Reinforcement learning

Indexed keywords

DECISION MAKING; HIERARCHICAL SYSTEM; HUMAN BEHAVIOR; LEARNING; NUMERICAL MODEL;

BIOLOGICAL MODEL; DECISION MAKING; HUMAN; LEARNING; MOTIVATION; PHYSIOLOGY; REINFORCEMENT;

DECISION MAKING; GOALS; HUMANS; LEARNING; MODELS, NEUROLOGICAL; REINFORCEMENT (PSYCHOLOGY);

EID: 84907487070 PISSN: 09628436 EISSN: 14712970 Source Type: Journal
DOI: 10.1098/rstb.2013.0480 Document Type: Article

Times cited : (134)

References (69)

1
- 78649966665
- Dopamine in motivational control: Rewarding, aversive and alerting
- Bromberg-Martin ES, Matsumoto M, Hikosaka O. 2010 Dopamine in motivational control: rewarding, aversive and alerting. Neuron 68, 815–834. (doi:10.1016/j.neuron.2010.11.022)
- (2010) Neuron , vol.68 , pp. 815-834
- Bromberg-Martin, E.S.¹ Matsumoto, M.² Hikosaka, O.³

2
- 0037057808
- Reward, motivation, and reinforcement learning
- Dayan P, Balleine BW. 2002 Reward, motivation, and reinforcement learning. Neuron 36, 285–298. (doi:10.1016/S0896-6273(02)00963-7)
- (2002) Neuron , vol.36 , pp. 285-298
- Dayan, P.¹ Balleine, B.W.²

3
- 28044450875
- Uncertainty-based competition between prefrontal and striatal systems for behavioral control
- Daw ND, Niv Y, Dayan P. 2005 Uncertainty-based competition between prefrontal and striatal systems for behavioral control. Nat. Neurosci. 8, 1704–1711. (doi:10.1038/nn1560)
- (2005) Nat. Neurosci , vol.8 , pp. 1704-1711
- Daw, N.D.¹ Niv, Y.² Dayan, P.³

4
- 84885802926
- Goals and habits in the brain
- Dolan RJ, Dayan P. 2013 Goals and habits in the brain. Neuron 80, 312–325. (doi:10.1016/j.neuron.2013.09.007)
- (2013) Neuron , vol.80 , pp. 312-325
- Dolan, R.J.¹ Dayan, P.²

5
- 84872761547
- The ubiquity of model-based reinforcement learning
- Doll BB, Simon DA, Daw ND. 2012 The ubiquity of model-based reinforcement learning. Curr. Opin. Neurobiol. 22, 1075–1081. (doi:10.1016/j.conb.2012.08.003)
- (2012) Curr. Opin. Neurobiol , vol.22 , pp. 1075-1081
- Doll, B.B.¹ Simon, D.A.² Daw, N.D.³

6
- 33747187640
- Hove, UK: Psychology Press
- Morris R, Ward G. 2004 The cognitive psychology of planning. Hove, UK: Psychology Press.
- (2004) The cognitive psychology of planning
- Morris, R.¹ Ward, G.²

7
- 84859737036
- Goal directed decision making as probabilistic inference: A computational framework and potential neural correlates
- Solway A, Botvinick MM. 2012 Goal directed decision making as probabilistic inference: a computational framework and potential neural correlates. Psychol. Rev. 119, 120–154. (doi:10.1037/a0026435)
- (2012) Psychol. Rev , vol.119 , pp. 120-154
- Solway, A.¹ Botvinick, M.M.²

8
- 84907545889
- The algorithmic anatomy of model-based evaluation
- Daw ND, Dayan P. 2014 The algorithmic anatomy of model-based evaluation. Phil. Trans. R. Soc. B 369, 20130478. (doi:10.1098/rstb.2013.0478)
- (2014) Phil. Trans. R. Soc. B , vol.369 , pp. 20130478
- Daw, N.D.¹ Dayan, P.²

9
- 0141988716
- Recent advances in hierarchical reinforcement learning
- Barto A, Mahadevan S. 2003 Recent advances in hierarchical reinforcement learning. Discrete Event Dyn. Syst. 13, 341–379. (doi:10.1023/A:1025696116075)
- (2003) Discrete Event Dyn. Syst , vol.13 , pp. 341-379
- Barto, A.¹ Mahadevan, S.²

10
- 70350566799
- Hierarchically organized behavior and its neural foundations: A reinforcement-learning perspective
- Botvinick M, Niv Y, Barto AC. 2009 Hierarchically organized behavior and its neural foundations: a reinforcement-learning perspective. Cognition 113, 262–280. (doi:10.1016/j.cognition.2008.08.011)
- (2009) Cognition , vol.113 , pp. 262-280
- Botvinick, M.¹ Niv, Y.² Barto, A.C.³

11
- 84907552069
- Behavioral hierarchy: Exploration and representation
- (eds G Baldassarre, M Mirolli), Heidelberg, Germany: Springer
- Barto AG, Konidaris GD, Vigorito CM. 2013 Behavioral hierarchy: exploration and representation. In Computational and robotic models of hierarchical organization of behavior (eds G Baldassarre, M Mirolli), pp. 13–46. Heidelberg, Germany: Springer.
- (2013) Computational and robotic models of hierarchical organization of behavior , pp. 13-46
- Barto, A.G.¹ Konidaris, G.D.² Vigorito, C.M.³

12
- 84859341150
- Habits, action sequences and reinforcement learning
- Dezfouli A, Balleine BW. 2012 Habits, action sequences and reinforcement learning. Eur. J. Neurosci. 35, 1036–1051. (doi:10.1111/j.1460-9568.2012.08050.x)
- (2012) Eur. J. Neurosci , vol.35 , pp. 1036-1051
- Dezfouli, A.¹ Balleine, B.W.²

13
- 84880660982
- The expected value of control: An integrative theory of anterior cingulate cortex function
- Shenhav A, Botvinick M, Cohen JD. 2013 The expected value of control: an integrative theory of anterior cingulate cortex function. Neuron 79, 217–240. (doi:10.1016/j.neuron.2013.07.007)
- (2013) Neuron , vol.79 , pp. 217-240
- Shenhav, A.¹ Botvinick, M.² Cohen, J.D.³

14
- 84856318423
- Motivation of extended behaviors by anterior cingulate cortex
- Holroyd CB, Yeung N. 2012 Motivation of extended behaviors by anterior cingulate cortex. Trends Cogn. Sci. 16, 122–128. (doi:10.1016/j.tics.2011.12.008)
- (2012) Trends Cogn. Sci , vol.16 , pp. 122-128
- Holroyd, C.B.¹ Yeung, N.²

15
- 84857211334
- Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: Computational analysis
- Frank MJ, Badre D. 2012 Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis. Cereb. Cortex 22, 509–526. (doi:10.1093/cercor/bhr114)
- (2012) Cereb. Cortex , vol.22 , pp. 509-526
- Frank, M.J.¹ Badre, D.²

16
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton RS, Precup D, Singh S. 1999 Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif. Intell. 112, 181–211. (doi:10.1016/S0004-3702(99)00052-1)
- (1999) Artif. Intell , vol.112 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.³

17
- 84907555023
- Silver D, Ciosek K. 2012 Compositional planning using optimal option models. (http://arxiv.org/abs/1206.6473).
- (2012) Compositional planning using optimal option models
- Silver, D.¹ Ciosek, K.²

18
- 84880688141
- Multi-value-functions: Efficient automatic action hierarchies for multiple goal MDPs
- Stockholm, Sweden, 31 July 1999, San Francisco, CA: Morgan Kaufmann
- Moore AW, Baird L, Kaelbling L. 1999 Multi-value-functions: efficient automatic action hierarchies for multiple goal MDPs. In Proc. Int. Joint Conf. on Artificial Intelligence, Stockholm, Sweden, 31 July 1999, pp. 1316–1323. San Francisco, CA: Morgan Kaufmann.
- (1999) Proc. Int. Joint Conf. on Artificial Intelligence , pp. 1316-1323
- Moore, A.W.¹ Baird, L.² Kaelbling, L.³

19
- 84907555022
- The advantage of planning with options
- Princeton, NJ, 25–27 October 2013
- Mann TA, Mannor S. 2013 The advantage of planning with options. In First Multidisciplinary Conf. on Reinforcement Learning and Decision Making, Princeton, NJ, 25–27 October 2013.
- First Multidisciplinary Conf. on Reinforcement Learning and Decision Making
- Mann, T.A.¹ Mannor, S.²

20
- 84878190351
- Hierarchical reinforcement learning and decision making
- Botvinick MM. 2012 Hierarchical reinforcement learning and decision making. Curr. Opin. Neurobiol. 22, 956–962. (doi:10.1016/j.conb.2012.05.008)
- (2012) Curr. Opin. Neurobiol , vol.22 , pp. 956-962
- Botvinick, M.M.¹

21
- 84877341847
- The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive
- Otto AR, Gershman SJ, Markman AB, Daw ND. 2013 The curse of planning: dissecting multiple reinforcement-learning systems by taxing the central executive. Psychol. Sci. 24, 751–761. (doi:10.1177/0956797612463080)
- (2013) Psychol. Sci , vol.24 , pp. 751-761
- Otto, A.R.¹ Gershman, S.J.² Markman, A.B.³ Daw, N.D.⁴

22
- 0003506152
- State abstraction in MAXQ hierarchical reinforcement learning
- Denver, Colorado, 28 November 2000, Cambridge, MA: MIT Press
- Dietterich TG. 2000 State abstraction in MAXQ hierarchical reinforcement learning. In Advances in Neural Information Processing, Denver, Colorado, 28 November 2000, pp. 994–1000. Cambridge, MA: MIT Press.
- (2000) Advances in Neural Information Processing , pp. 994-1000
- Dietterich, T.G.¹

23
- 84898878959
- Symbol acquisition for task-level planning
- Konidaris G, Kaelbling L, Lozano-Perez T. 2013 Symbol acquisition for task-level planning. In The AAAI 2013 Workshop on Learning Rich Representations from Low-Level Sensors.
- (2013) The AAAI 2013 Workshop on Learning Rich Representations from Low-Level Sensors
- Konidaris, G.¹ Kaelbling, L.² Lozano-Perez, T.³

24
- 84871698013
- Hierarchical task and motion planning in the now
- Shanghai, China, 9 May 2011, Piscataway, NJ: IEEE Press
- Kaelbling L, Lozano-Pérez T. 2011 Hierarchical task and motion planning in the now. In IEEE Int. Conf. on Robotics and Automation, Shanghai, China, 9 May 2011, pp. 1470–1477. Piscataway, NJ: IEEE Press.
- (2011) IEEE Int. Conf. on Robotics and Automation , pp. 1470-1477
- Kaelbling, L.¹ Lozano-Pérez, T.²

25
- 84923228672
- Optimal behavioral hierarchy
- Solway A, Diuk C, Cordova N, Yee D, Barto AG, Niv Y, Botvinick M. 2014 Optimal behavioral hierarchy. PLoS Comput. Biol. 10, 1–10.
- (2014) PLoS Comput. Biol , vol.10 , pp. 1-10
- Solway, A.¹ Diuk, C.² Cordova, N.³ Yee, D.⁴ Barto, A.G.⁵ Niv, Y.⁶ Botvinick, M.⁷

26
- 79960637995
- A neural signature of hierarchical reinforcement learning
- Ribas-Fernandes JJF, Solway A, Diuk C, Barto AG, Niv Y, Botvinick M. 2011 A neural signature of hierarchical reinforcement learning. Neuron 71, 370–379. (doi:10.1016/j.neuron.2011.05.042)
- (2011) Neuron , vol.71 , pp. 370-379
- Ribas-Fernandes, J.J.F.¹ Solway, A.² Diuk, C.³ Barto, A.G.⁴ Niv, Y.⁵ Botvinick, M.⁶

27
- 84875468581
- Two simultaneous, but separable, prediction errors in human ventral striatum
- Diuk C, Tsai K, Wallis J, Botvinick M, Niv Y. 2013 Two simultaneous, but separable, prediction errors in human ventral striatum. J. Neurosci. 33, 5797–5805. (doi:10.1523/JNEUROSCI.5445-12.2013)
- (2013) J. Neurosci , vol.33 , pp. 5797-5805
- Diuk, C.¹ Tsai, K.² Wallis, J.³ Botvinick, M.⁴ Niv, Y.⁵

28
- 5644290841
- An integrated theory of mind
- Anderson JR, Bothell D, Byrne MD, Scott D, Lebiere C, Qin Y. 2004 An integrated theory of mind. Psychol. Rev. 111, 1036–1060. (doi:10.1037/0033-295X.111.4.1036)
- (2004) Psychol. Rev , vol.111 , pp. 1036-1060
- Anderson, J.R.¹ Bothell, D.² Byrne, M.D.³ Scott, D.⁴ Lebiere, C.⁵ Qin, Y.⁶

29
- 0023422739
- SOAR: An architecture for general intelligence
- Laird JE, Newell A, Rosenbloom PS. 1987 SOAR: an architecture for general intelligence. Artif. Intell. 33, 1–64. (doi:10.1016/0004-3702(87)90050-6)
- (1987) Artif. Intell , vol.33 , pp. 1-64
- Laird, J.E.¹ Newell, A.² Rosenbloom, P.S.³

30
- 79951476983
- Hierarchical control of cognitive processes: The case for skilled typewriting
- (ed. BH Ross), New York, NY: Academic Press
- Logan GD, Crump MJC. 2011 Hierarchical control of cognitive processes: the case for skilled typewriting. In The psychology of learning and motivation: advances in research and theory (ed. BH Ross), pp. 2–19. New York, NY: Academic Press.
- (2011) The psychology of learning and motivation: advances in research and theory , pp. 2-19
- Logan, G.D.¹ Crump, M.J.C.²

31
- 0004248275
- New York, NY: Holt, Rinehart and Winston
- Miller GA, Galanter E, Pribram KH. 1960 Plans and the structure of behavior. New York, NY: Holt, Rinehart and Winston.
- (1960) Plans and the structure of behavior
- Miller, G.A.¹ Galanter, E.² Pribram, K.H.³

32
- 1942443210
- Doing without schema hierarchies: A recurrent connectionist approach to normal and impaired routine sequential action
- Botvinick M, Plaut DC. 2004 Doing without schema hierarchies: a recurrent connectionist approach to normal and impaired routine sequential action. Psychol. Rev. 111, 395–429. (doi:10.1037/0033-295X.111.2.395)
- (2004) Psychol. Rev , vol.111 , pp. 395-429
- Botvinick, M.¹ Plaut, D.C.²

33
- 0034075310
- Contention scheduling and the control of routine activities
- Cooper R, Shallice T. 2000 Contention scheduling and the control of routine activities. Cogn. Neuropsychol. 17, 297–338. (doi:10.1080/026432900380427)
- (2000) Cogn. Neuropsychol , vol.17 , pp. 297-338
- Cooper, R.¹ Shallice, T.²

34
- 33750296630
- Such stuff as habits are made on: A reply to Cooper and Shallice (2006)
- Botvinick M, Plaut DC. 2006 Such stuff as habits are made on: a reply to Cooper and Shallice (2006). Psychol. Rev. 113, 917–928. (doi:10.1037/0033-295X.113.4.917)
- (2006) Psychol. Rev , vol.113 , pp. 917-928
- Botvinick, M.¹ Plaut, D.C.²

35
- 0003891643
- New York, NY: Holt
- James W. 1890 The principles of psychology. New York, NY: Holt.
- (1890) The principles of psychology
- James, W.¹

36
- 0004223940
- Cambridge, UK: Cambridge University Press
- Reason JT. 1992 Human error. Cambridge, UK: Cambridge University Press.
- (1992) Human error
- Reason, J.T.¹

37
- 0003649763
- New York, NY: Century
- Tolman EC. 1932 Purposive behavior in animals and men. New York, NY: Century.
- (1932) Purposive behavior in animals and men
- Tolman, E.C.¹

38
- 0016069798
- Planning in a hierarchy of abstraction spaces
- Sacerdoti ED. 1974 Planning in a hierarchy of abstraction spaces. Artif. Intell. 5, 115–135. (doi:10.1016/0004-3702(74)90026-5)
- (1974) Artif. Intell , vol.5 , pp. 115-135
- Sacerdoti, E.D.¹

39
- 0018594651
- A cognitive model of planning
- Hayes-Roth B, Hayes-Roth F. 1979 A cognitive model of planning. Cogn. Sci. 3, 275–310. (doi:10.1207/s15516709cog0304_1)
- (1979) Cogn. Sci , vol.3 , pp. 275-310
- Hayes-Roth, B.¹ Hayes-Roth, F.²

40
- 27444431878
- SHOP2: An HTN Planning System
- Nau D, Au T-C, Ilghami O, Kuter U, Murdock JW, Wu D, Yaman F. 2003 SHOP2: an HTN Planning System. J. Artif. Intell. Res. 20, 379–404.
- (2003) J. Artif. Intell. Res , vol.20 , pp. 379-404
- Nau, D.¹ Au, T.-C.² Ilghami, O.³ Kuter, U.⁴ Murdock, J.W.⁵ Wu, D.⁶ Yaman, F.⁷

41
- 84907555021
- Hierarchical deconstruction and memoization of goal-directed plans
- Princeton, NJ, 25–27 October 2013
- Huys Q, Lally N, Falkner P, Gershman S, Dayan P, Roiser J. 2013 Hierarchical deconstruction and memoization of goal-directed plans. Poster presented at First Multidisciplinary Conf. on Reinforcement Learning and Decision Making, Princeton, NJ, 25–27 October 2013.
- (2013) Poster presented at First Multidisciplinary Conf. on Reinforcement Learning and Decision Making
- Huys, Q.¹ Lally, N.² Falkner, P.³ Gershman, S.⁴ Dayan, P.⁵ Roiser, J.⁶

42
- 84892682926
- Actions, action sequences and decision-making: Evidence that goal-directed and habitual action control are hierarchically organized
- Dezfouli A, Balleine BW. 2013 Actions, action sequences and decision-making: evidence that goal-directed and habitual action control are hierarchically organized. PLoS Comput. Biol. 9, e1003364. (doi:10.1371/journal.pcbi.1003364)
- (2013) PLoS Comput. Biol , vol.9
- Dezfouli, A.¹ Balleine, B.W.²

43
- 84907480610
- Habits as action sequences: Hierarchical action control and changes in outcome value
- Dezfouli A, Lingawi NW, Balleine BW. 2014 Habits as action sequences: hierarchical action control and changes in outcome value. Phil. Trans. R. Soc. B 369, 20130482. (doi:10.1098/rstb.2013.0482)
- (2014) Phil. Trans. R. Soc. B , vol.369 , pp. 20130482
- Dezfouli, A.¹ Lingawi, N.W.² Balleine, B.W.³

44
- 67649342617
- Evidence of action sequence chunking in goal-directed instrumental conditioning and its dependence on the dorsomedial prefrontal cortex
- Ostlund SB, Winterbauer NE, Balleine BW. 2009 Evidence of action sequence chunking in goal-directed instrumental conditioning and its dependence on the dorsomedial prefrontal cortex. J. Neurosci. 29, 8280–8287. (doi:10.1523/JNEUROSCI.1176-09.2009)
- (2009) J. Neurosci , vol.29 , pp. 8280-8287
- Ostlund, S.B.¹ Winterbauer, N.E.² Balleine, B.W.³

45
- 33746272406
- Duration neglect in retrospective evaluations of affective episodes
- Fredrickson BL, Kahneman D. 1993 Duration neglect in retrospective evaluations of affective episodes. J. Pers. Soc. Psychol. 65, 45–55. (doi:10.1037/0022-3514.65.1.45)
- (1993) J. Pers. Soc. Psychol , vol.65 , pp. 45-55
- Fredrickson, B.L.¹ Kahneman, D.²

46
- 0039786967
- Gestalt characteristics of experiences: The defining features of summarized events
- Ariely D, Carmon Z. 2000 Gestalt characteristics of experiences: the defining features of summarized events. J. Behav. Decis. Making 13, 191–201. (doi:10.1002/(SICI)1099-0771(200004/06)13:2<191::AID-BDM330>3.0.CO;2-A)
- (2000) J. Behav. Decis. Making , vol.13 , pp. 191-201
- Ariely, D.¹ Carmon, Z.²

47
- 84879603032
- Hedonic evaluation over short and long retention intervals: The mechanism of the peak–end rule
- Geng X, Chen Z, Lam W, Zheng Q. 2013 Hedonic evaluation over short and long retention intervals: the mechanism of the peak–end rule. J. Behav. Decis. Making 26, 225–236. (doi:10.1002/bdm.1755)
- (2013) J. Behav. Decis. Making , vol.26 , pp. 225-236
- Geng, X.¹ Chen, Z.² Lam, W.³ Zheng, Q.⁴

48
- 33747688922
- Optimal predictions in everyday cognition
- Griffiths TL, Tenenbaum JB. 2006 Optimal predictions in everyday cognition. Psychol. Sci. 17, 767–773. (doi:10.1111/j.1467-9280.2006.01780.x)
- (2006) Psychol. Sci , vol.17 , pp. 767-773
- Griffiths, T.L.¹ Tenenbaum, J.B.²

49
- 82855178982
- Predicting the future as Bayesian Inference: People combine prior knowledge with observations when estimating duration and extent
- Griffiths TL, Tenenbaum JB. 2011 Predicting the future as Bayesian Inference: people combine prior knowledge with observations when estimating duration and extent. J. Exp. Psychol. Gen. 140, 725–743. (doi:10.1037/a0024899)
- (2011) J. Exp. Psychol. Gen , vol.140 , pp. 725-743
- Griffiths, T.L.¹ Tenenbaum, J.B.²

50
- 26844492851
- Underestimating the duration of future events: Memory incorrectly used or memory bias?
- Roy MM, Christenfeld NJ, McKenzie CRM. 2005 Underestimating the duration of future events: memory incorrectly used or memory bias? Psychol. Bull. 131, 738–756. (doi:10.1037/0033-2909.131.5.738)
- (2005) Psychol. Bull , vol.131 , pp. 738-756
- Roy, M.M.¹ Christenfeld, N.J.² McKenzie, C.R.M.³

51
- 84859772658
- The energetics of motivated cognition: A force-field analysis
- Kruglanski AW, Bélanger JJ, Chen X, Köpetz C, Pierro A, Mannetti L. 2012 The energetics of motivated cognition: a force-field analysis. Psychol. Rev. 119, 1–20. (doi:10.1037/a0025488)
- (2012) Psychol. Rev , vol.119 , pp. 1-20
- Kruglanski, A.W.¹ Bélanger, J.J.² Chen, X.³ Köpetz, C.⁴ Pierro, A.⁵ Mannetti, L.⁶

52
- 14644414684
- ‘Fine-to-coarse’ route planning and navigation in regionalized environments
- Wiener JM, Mallot HA. 2003 ‘Fine-to-coarse’ route planning and navigation in regionalized environments. Spat. Cogn. Comput. 3, 331–358. (doi:10.1207/s15427633scc0304_5)
- (2003) Spat. Cogn. Comput , vol.3 , pp. 331-358
- Wiener, J.M.¹ Mallot, H.A.²

53
- 33750705246
- Causal graph based decomposition of factored MDPs
- Jonsson A, Barto AG. 2006 Causal graph based decomposition of factored MDPs. J. Mach. Learn. Res. 7, 2259–2301.
- (2006) J. Mach. Learn. Res , vol.7 , pp. 2259-2301
- Jonsson, A.¹ Barto, A.G.²

54
- 80054969173
- Intrinsically motivated hierarchical skill learning in structured environments
- Vigorito CM, Barto AG. 2010 Intrinsically motivated hierarchical skill learning in structured environments. IEEE Trans. Auton. Ment. Dev. (T-AMD) 2, 83–90. (doi:10.1109/TAMD.2010.2051436)
- (2010) IEEE Trans. Auton. Ment. Dev. (T-AMD) , vol.2 , pp. 83-90
- Vigorito, C.M.¹ Barto, A.G.²

55
- 0013465036
- Discovering hierarchy in reinforcement learning with HEXQ
- Hengst B. 2002 Discovering hierarchy in reinforcement learning with HEXQ. Proc. Int. Conf. Mach. Learn. 19, 243–250.
- (2002) Proc. Int. Conf. Mach. Learn , vol.19 , pp. 243-250
- Hengst, B.¹

56
- 84858634841
- Autonomous learning of high-level states and actions in continuous environments
- Mugan J, Kuipers B. 2012 Autonomous learning of high-level states and actions in continuous environments. IEEE Trans. Auton. Ment. Dev. 4, 70–86. (doi:10.1109/TAMD.2011.2160943)
- (2012) IEEE Trans. Auton. Ment. Dev , vol.4 , pp. 70-86
- Mugan, J.¹ Kuipers, B.²

57
- 84883172973
- Computational models of executive control: Charted territory and new frontiers
- In press
- Botvinick M, Cohen JD. In press. Computational models of executive control: charted territory and new frontiers. Cogn. Sci.
- Cogn. Sci
- Botvinick, M.¹ Cohen, J.D.²

58
- 84875674596
- Neural representations of events arise from temporal community structure
- Schapiro A, Cordova N, Turk-Browne N, Rogers TT, Botvinick MM. 2013 Neural representations of events arise from temporal community structure. Nat. Neurosci. 16, 486–492. (doi:10.1038/nn.3331)
- (2013) Nat. Neurosci , vol.16 , pp. 486-492
- Schapiro, A.¹ Cordova, N.² Turk-Browne, N.³ Rogers, T.T.⁴ Botvinick, M.M.⁵

59
- 0001158047
- Improving generalization for temporal difference learning: The successor representation
- Dayan P. 1993 Improving generalization for temporal difference learning: the successor representation. Neural Comput. 5, 613–624. (doi:10.1162/neco.1993.5.4.613)
- (1993) Neural Comput , vol.5 , pp. 613-624
- Dayan, P.¹

60
- 77953260848
- States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning
- Glascher J, Daw N, Dayan P, O’Doherty JP. 2010 States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron 66, 585–595. (doi:10.1016/j.neuron.2010.04.016)
- (2010) Neuron , vol.66 , pp. 585-595
- Glascher, J.¹ Daw, N.² Dayan, P.³ O’Doherty, J.P.⁴

61
- 42749096312
- Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes
- Badre D. 2008 Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes. Trends Cogn. Sci. 12, 193–200. (doi:10.1016/j.tics.2008.02.004)
- (2008) Trends Cogn. Sci , vol.12 , pp. 193-200
- Badre, D.¹

62
- 0242497620
- The architecture of cognitive control in the human prefrontal cortex
- Koechlin E, Ody C, Kouneiher F. 2003 The architecture of cognitive control in the human prefrontal cortex. Science 302, 1181–1185. (doi:10.1126/science.1088545)
- (2003) Science , vol.302 , pp. 1181-1185
- Koechlin, E.¹ Ody, C.² Kouneiher, F.³

63
- 84896322734
- Task difficulty manipulation reveals multiple demand activity but no frontal lobe hierarchy
- Crittenden BM, Duncan J. 2012 Task difficulty manipulation reveals multiple demand activity but no frontal lobe hierarchy. Cereb. Cortex 24, 532–540. (doi:10.1093/cercor/bhs333)
- (2012) Cereb. Cortex , vol.24 , pp. 532-540
- Crittenden, B.M.¹ Duncan, J.²

64
- 84906983504
- Prefrontal cortex organization: Dissociating effects of temporal abstraction, relational abstraction, and integration with fMRI
- Nee DE, Jahn A, Brown JW. 2013 Prefrontal cortex organization: dissociating effects of temporal abstraction, relational abstraction, and integration with fMRI. Cereb. Cortex 24, 2377–2387. (doi:10.1093/cercor/bht091)
- (2013) Cereb. Cortex , vol.24 , pp. 2377-2387
- Nee, D.E.¹ Jahn, A.² Brown, J.W.³

65
- 84857065417
- The function and organization of lateral prefrontal cortex: A test of competing hypotheses
- Reynolds JR, O’Reilly RC, Cohen JD, Braver TS. 2012 The function and organization of lateral prefrontal cortex: a test of competing hypotheses. PLoS ONE 7, e30284. (doi:10.1371/journal.pone.0030284)
- (2012) PLoS ONE , vol.7
- Reynolds, J.R.¹ O’Reilly, R.C.² Cohen, J.D.³ Braver, T.S.⁴

66
- 33846607753
- Self-projection and the brain
- Buckner RL, Carroll DC. 2006 Self-projection and the brain. Trends Cogn. Sci. 11, 49–57. (doi:10.1016/j.tics.2006.11.004)
- (2006) Trends Cogn. Sci , vol.11 , pp. 49-57
- Buckner, R.L.¹ Carroll, D.C.²

67
- 0004098484
- Oxford, UK: Oxford University Press
- O’Keefe J, Nadel L. 1978 The hippocampus as a cognitive map. Oxford, UK: Oxford University Press.
- (1978) The hippocampus as a cognitive map
- O’Keefe, J.¹ Nadel, L.²

68
- 85132026293
- Integrated architectures for learning, planning, and reacting based on approximating dynamic programming
- Austin, Texas, 21 June 1990, San Francisco, CA: Morgan Kaufmann
- Sutton RS. 1990 Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. In Proc. Seventh International Conference on Machine Learning, Austin, Texas, 21 June 1990, pp. 216–224. San Francisco, CA: Morgan Kaufmann.
- (1990) Proc. Seventh International Conference on Machine Learning , pp. 216-224
- Sutton, R.S.¹

69
- 38149106939
- A biologically plausible model of human planning based on neural networks and Dyna-PI models
- Baldassarre G. 2002 A biologically plausible model of human planning based on neural networks and Dyna-PI models. In Workshop on Adaptive Behaviour in Anticipatory Learning Systems, pp. 40–60.
- (2002) Workshop on Adaptive Behaviour in Anticipatory Learning Systems , pp. 40-60
- Baldassarre, G.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.