



Volume 1, Issue NOV, 2010, Pages

Credit assignment in multiple goal embodied visuomotor behavior

Author keywords

Credit assignment; Learning; Modules; Reinforcement; Reward



EID: 80054091681     PISSN: None     EISSN: 16641078     Source Type: Journal    
DOI: 10.3389/fpsyg.2010.00173     Document Type: Article
Times cited : (31)

References (60)
  • 3
    • 0033170833
    • Animation control for real-time virtual humans
    • Badler, N., Palmer, M., and Bindiganavale, R. (1999). Animation control for real-time virtual humans. Commun. ACM 42, 64-73.
    • (1999) Commun. ACM , vol.42 , pp. 64-73
    • Badler, N.1    Palmer, M.2    Bindiganavale, R.3
  • 6
    • 66149104409
    • Simulation, situated conceptualization, and prediction
    • Barsalou, L. (2009). Simulation, situated conceptualization, and prediction. Phil. Trans. R. Soc. B 364, 1281-1289.
    • (2009) Phil. Trans. R. Soc. B , vol.364 , pp. 1281-1289
    • Barsalou, L.1
  • 7
    • 0141988716
    • Recent advances in hierarchical reinforcement learning
    • Barto, A. G., and Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete Event Dyn. Syst. 13, 41-77.
    • (2003) Discrete Event Dyn. Syst. , vol.13 , pp. 41-77
    • Barto, A.G.1    Mahadevan, S.2
  • 9
    • 0022688781
    • A robust layered control system for a mobile robot
    • Brooks, R. (1986). A robust layered control system for a mobile robot. IEEE J. Robot. Autom. 2, 14-23.
    • (1986) IEEE J. Robot. Autom. , vol.2 , pp. 14-23
    • Brooks, R.1
  • 11
    • 28044450875
    • Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
    • Daw, N. D., Niv, Y., and Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704-1711.
    • (2005) Nat. Neurosci. , vol.8 , pp. 1704-1711
    • Daw, N.D.1    Niv, Y.2    Dayan, P.3
  • 12
    • 33745223257
    • Cortical substrates for exploratory decisions in humans
    • Daw, N. D., O'Doherty, J. P., Dayan, P., Seymour, B., and Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature 441, 876-879.
    • (2006) Nature , vol.441 , pp. 876-879
    • Daw, N.D.1    O'Doherty, J.P.2    Dayan, P.3    Seymour, B.4    Dolan, R.J.5
  • 13
    • 0001234682
    • Feudal reinforcement learning
    • (San Francisco, CA: Morgan Kaufmann Publishers Inc.)
    • Dayan, P., and Hinton, G. E. (1992). "Feudal reinforcement learning," in Advances in Neural Information Processing Systems, (San Francisco, CA: Morgan Kaufmann Publishers Inc.) 5, 271-278.
    • (1992) Advances in Neural Information Processing Systems , vol.5 , pp. 271-278
    • Dayan, P.1    Hinton, G.E.2
  • 14
    • 0034248853
    • Stochastic dynamic programming with factored representations
    • Dearden, R., Boutilier, C., and Goldszmidt, M. (2000). Stochastic dynamic programming with factored representations. Artif. Intell. 121, 49-107.
    • (2000) Artif. Intell. , vol.121 , pp. 49-107
    • Dearden, R.1    Boutilier, C.2    Goldszmidt, M.3
  • 15
    • 0036618011
    • Multiple model-based reinforcement learning
    • Doya, K., Samejima, K., Katagiri, K., and Kawato, M. (2002). Multiple model-based reinforcement learning. Neural Comput. 14, 1347-1369.
    • (2002) Neural Comput. , vol.14 , pp. 1347-1369
    • Doya, K.1    Samejima, K.2    Katagiri, K.3    Kawato, M.4
  • 17
    • 77951632406
    • Taxing executive processes does not necessarily increase impulsive decision making
    • Franco-Watkins, A. M., Rickard, T. C., and Pashler, H. (2010). Taxing executive processes does not necessarily increase impulsive decision making. Exp. Psychol. 57, 193-201.
    • (2010) Exp. Psychol. , vol.57 , pp. 193-201
    • Franco-Watkins, A.M.1    Rickard, T.C.2    Pashler, H.3
  • 18
    • 77956582789
    • Embodiment as a unifying perspective for psychology
    • Glenberg, A. M. (2010). Embodiment as a unifying perspective for psychology. Wiley Interdiscip. Rev. Cogn. Sci. 1, 586-596.
    • (2010) Wiley Interdiscip. Rev. Cogn. Sci. , vol.1 , pp. 586-596
    • Glenberg, A.M.1
  • 19
    • 0042932360
    • Encoding predictive reward value in human amygdala and orbitofrontal cortex
    • Gottfried, J. A., O'Doherty, J., and Dolan, R. J. (2003). Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science 301, 1104-1107.
    • (2003) Science , vol.301 , pp. 1104-1107
    • Gottfried, J.A.1    O'Doherty, J.2    Dolan, R.J.3
  • 21
    • 33644858743
    • Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning
    • Haruno, M., and Kawato, M. (2006). Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. J. Neurophysiol. 95, 948-959.
    • (2006) J. Neurophysiol. , vol.95 , pp. 948-959
    • Haruno, M.1    Kawato, M.2
  • 23
    • 0007914441
    • Action selection methods using reinforcement learning
    • eds P. Maes, M. Mataric, J.-A. Meyer, J. Pollack, and S. W. Wilson (Cambridge, MA: MIT Press, Bradford Books)
    • Humphrys, M. (1996). "Action selection methods using reinforcement learning," in From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, eds P. Maes, M. Mataric, J.-A. Meyer, J. Pollack, and S. W. Wilson (Cambridge, MA: MIT Press, Bradford Books), 135-144.
    • (1996) From Animals to Animats 4 Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior , pp. 135-144
    • Humphrys, M.1
  • 25
    • 0023422739
    • Soar: an architecture for general intelligence
    • Laird, J. E., Newell, A., and Rosenbloom, P. S. (1987). Soar: an architecture for general intelligence. Artif. Intell. 33, 1-64.
    • (1987) Artif. Intell. , vol.33 , pp. 1-64
    • Laird, J.E.1    Newell, A.2    Rosenbloom, P.S.3
  • 26
    • 33646365064
    • Learning recursive control programs from problem solving
    • Langley, P., and Choi, D. (2006). Learning recursive control programs from problem solving. J. Mach. Learn. Res. 7, 493-518.
    • (2006) J. Mach. Learn. Res. , vol.7 , pp. 493-518
    • Langley, P.1    Choi, D.2
  • 27
    • 0030778788
    • The capacity of visual working memory for features and conjunctions
    • Luck, S. J., and Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature 390, 279-281.
    • (1997) Nature , vol.390 , pp. 279-281
    • Luck, S.J.1    Vogel, E.K.2
  • 29
    • 33747585633
    • Midbrain dopamine neurons encode decisions for future action
    • Morris, G., Nevet, A., Arkadir, D., Vaadia, E., and Bergman, H. (2006). Midbrain dopamine neurons encode decisions for future action. Nat. Neurosci. 9, 1057-1063.
    • (2006) Nat. Neurosci. , vol.9 , pp. 1057-1063
    • Morris, G.1    Nevet, A.2    Arkadir, D.3    Vaadia, E.4    Bergman, H.5
  • 30
  • 33
    • 84898956770
    • Reinforcement learning with hierarchies of machines
    • eds M. I. Jordan, M. J. Kearns, and S. A. Solla (Cambridge, MA: MIT Press)
    • Parr, R., and Russell, S. (1997). "Reinforcement learning with hierarchies of machines," in Advances in Neural Information Processing Systems, eds M. I. Jordan, M. J. Kearns, and S. A. Solla (Cambridge, MA: MIT Press), 1043-1049.
    • (1997) Advances in Neural Information Processing Systems , pp. 1043-1049
    • Parr, R.1    Russell, S.2
  • 34
    • 33748302924
    • Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans
    • Pessiglione, M., Seymour, B., Flandin, G., Dolan, R. J., and Frith, C. D. (2006). Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature 442, 1042-1045.
    • (2006) Nature , vol.442 , pp. 1042-1045
    • Pessiglione, M.1    Seymour, B.2    Flandin, G.3    Dolan, R.J.4    Frith, C.D.5
  • 35
    • 77952544599
    • Neural computations associated with goal-directed choice
    • Rangel, A., and Hare, T. (2010). Neural computations associated with goal-directed choice. Curr. Opin. Neurobiol. 20, 262-270.
    • (2010) Curr. Opin. Neurobiol. , vol.20 , pp. 262-270
    • Rangel, A.1    Hare, T.2
  • 37
    • 65349126422
    • Image statistics at the point of gaze during human navigation
    • Rothkopf, C. A., and Ballard, D. H. (2009). Image statistics at the point of gaze during human navigation. Vis. Neurosci. 26, 81-92.
    • (2009) Vis. Neurosci. , vol.26 , pp. 81-92
    • Rothkopf, C.A.1    Ballard, D.H.2
  • 38
    • 84867041360
    • Learning and coordinating repertoires of behaviors: credit assignment and module activation
    • eds G. Baldassarre and M. Mirolli (in press)
    • Rothkopf, C. A., and Ballard, D. H. (2010). "Learning and coordinating repertoires of behaviors: credit assignment and module activation," in Intrinsically Motivated Cumulative Learning in Natural and Artificial Systems, eds G. Baldassarre and M. Mirolli (in press).
    • (2010) Intrinsically Motivated Cumulative Learning in Natural and Artificial Systems
    • Rothkopf, C.A.1    Ballard, D.H.2
  • 39
    • 0036152936
    • Learning words from sights and sounds: a computational model
    • Roy, D. K., and Pentland, A. P. (2002). Learning words from sights and sounds: a computational model. Cogn. Sci. 26, 113-146.
    • (2002) Cogn. Sci. , vol.26 , pp. 113-146
    • Roy, D.K.1    Pentland, A.P.2
  • 41
    • 32844474095
    • Reinforcement learning with factored states and actions
    • Sallans, B., and Hinton, G. E. (2004). Reinforcement learning with factored states and actions. J. Mach. Learn. Res. 5, 1063-1088.
    • (2004) J. Mach. Learn. Res. , vol.5 , pp. 1063-1088
    • Sallans, B.1    Hinton, G.E.2
  • 42
    • 0742324926
    • Inter-module credit assignment in modular reinforcement learning
    • Samejima, K., Doya, K., and Kawato, M. (2003). Inter-module credit assignment in modular reinforcement learning. Neural Netw. 16, 985-994.
    • (2003) Neural Netw. , vol.16 , pp. 985-994
    • Samejima, K.1    Doya, K.2    Kawato, M.3
  • 43
    • 0034576323
    • Multiple reward signals in the brain
    • Schultz, W. (2000). Multiple reward signals in the brain. Nat. Rev. Neurosci. 1, 199-207.
    • (2000) Nat. Rev. Neurosci. , vol.1 , pp. 199-207
    • Schultz, W.1
  • 44
    • 0030896968
    • A neural substrate of prediction and reward
    • Schultz, W., Dayan, P., and Montague, P. R. (1997). A neural substrate of prediction and reward. Science 275, 1593-1599.
    • (1997) Science , vol.275 , pp. 1593-1599
    • Schultz, W.1    Dayan, P.2    Montague, P.R.3
  • 45
    • 0035189004
    • What controls attention in natural environments?
    • Shinoda, H., Hayhoe, M. M., and Shrivastava, A. (2001). What controls attention in natural environments? Vis. Res. 41, 3535-3546.
    • (2001) Vis. Res. , vol.41 , pp. 3535-3546
    • Shinoda, H.1    Hayhoe, M.M.2    Shrivastava, A.3
  • 46
    • 84899022377
    • How to dynamically merge Markov decision processes
    • Singh, S., and Cohn, D. (1998). How to dynamically merge Markov decision processes. Neural Inf. Process. Syst. 10, 1057-1063.
    • (1998) Neural Inf. Process. Syst. , vol.10 , pp. 1057-1063
    • Singh, S.1    Cohn, D.2
  • 49
    • 33846607265
    • chapter 4. Cambridge: Cambridge University Press
    • Sun, R. (2006). Cognition and Multi-Agent Interaction, chapter 4. Cambridge: Cambridge University Press, 79-99.
    • (2006) Cognition and Multi-Agent Interaction , pp. 79-99
    • Sun, R.1
  • 51
    • 0033170372
    • Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning
    • Sutton, R. S., Precup, D., and Singh, S. P. (1999). Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif. Intell. 112, 181-211.
    • (1999) Artif. Intell. , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.P.3
  • 52
    • 0002648372
    • Artificial life for computer graphics
    • Terzopoulos, D. (1999). Artificial life for computer graphics. Commun. ACM 42, 32-42.
    • (1999) Commun. ACM , vol.42 , pp. 32-42
    • Terzopoulos, D.1
  • 53
    • 0001049378
    • Artificial fishes: autonomous locomotion, perception, behavior, and learning in a simulated physical world
    • Terzopoulos, D., Tu, X., and Grzeszczuk, R. (1994). Artificial fishes: autonomous locomotion, perception, behavior, and learning in a simulated physical world. Artif. Life 1, 327-351.
    • (1994) Artif. Life , vol.1 , pp. 327-351
    • Terzopoulos, D.1    Tu, X.2    Grzeszczuk, R.3
  • 54
    • 58149442669
    • Cognitive maps in rats and men
    • Tolman, E. C. (1948). Cognitive maps in rats and men. Psychol. Rev. 55, 189-208.
    • (1948) Psychol. Rev. , vol.55 , pp. 189-208
    • Tolman, E.C.1
  • 55
    • 0018878142
    • A feature-integration theory of attention
    • Treisman, A. M. (1980). A feature-integration theory of attention. Cogn. Psychol. 12, 97-136.
    • (1980) Cogn. Psychol. , vol.12 , pp. 97-136
    • Treisman, A.M.1
  • 56
    • 0027961585
    • Why are small and large numbers enumerated differently? A limited-capacity preattentive stage in vision
    • Trick, L. M., and Pylyshyn, Z. W. (1994). Why are small and large numbers enumerated differently? A limited-capacity preattentive stage in vision. Psychol. Rev. 101, 80-102.
    • (1994) Psychol. Rev. , vol.101 , pp. 80-102
    • Trick, L.M.1    Pylyshyn, Z.W.2
  • 57
    • 0021700041
    • Visual routines
    • Ullman, S. (1984). Visual routines. Cognition 18, 97-157.
    • (1984) Cognition , vol.18 , pp. 97-157
    • Ullman, S.1
  • 58
    • 80054969173
    • Intrinsically motivated hierarchical skill learning in structured environments
    • Vigorito, C. M., and Barto, A. G. (2010). Intrinsically motivated hierarchical skill learning in structured environments. IEEE Trans. Auton. Ment. Dev. 2, 132-143.
    • (2010) IEEE Trans. Auton. Ment. Dev. , vol.2 , pp. 132-143
    • Vigorito, C.M.1    Barto, A.G.2
  • 59
    • 4344577338
    • Behavioral dynamics of human locomotion
    • Warren, W. H., and Fajen, B. R. (2004). Behavioral dynamics of human locomotion. Ecol. Psychol. 16, 61-66.
    • (2004) Ecol. Psychol. , vol.16 , pp. 61-66
    • Warren, W.H.1    Fajen, B.R.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.