Volume 21, Issue 10, 2007, Pages 1215-1229

Reinforcement learning of a continuous motor sequence with hidden states

Author keywords

Actor critic method; Pendulum swing up; Perceptual aliasing problem; Recurrent neural network; Reinforcement learning

Indexed keywords

ALGORITHMS; BEHAVIORAL RESEARCH; NEUROLOGY; PARAMETER ESTIMATION;

EID: 34547543601     PISSN: 01691864     EISSN: 15685535     Source Type: Journal    
DOI: 10.1163/156855307781389365     Document Type: Article
Times cited: 13

References (18)
  • 1
    • 23144448134
    • Novelty and reinforcement learning in the value system of developmental robots
    • X. Huang and J. Weng, Novelty and reinforcement learning in the value system of developmental robots, in: Proc. 2nd Int. Workshop on Epigenetic Robotics (2002).
    • (2002) Proc. 2nd Int. Workshop on Epigenetic Robotics
    • Huang, X.; Weng, J.
  • 2
    • 0033148990
    • Cooperative behavior acquisition for mobile robots in dynamically changing real worlds via vision-based reinforcement learning and development
    • M. Asada, E. Uchibe and K. Hosoda, Cooperative behavior acquisition for mobile robots in dynamically changing real worlds via vision-based reinforcement learning and development, Artif. Intell. 110, 275-292 (1999).
    • (1999) Artif. Intell., vol. 110, pp. 275-292
    • Asada, M.; Uchibe, E.; Hosoda, K.
  • 3
    • 0000123778
    • Self-improving reactive agents based on reinforcement learning
    • L.-J. Lin, Self-improving reactive agents based on reinforcement learning, Machine Learn. 8, 293-321 (1992).
    • (1992) Machine Learn., vol. 8, pp. 293-321
    • Lin, L.-J.
  • 4
    • 0020970738
    • Neuronlike adaptive elements that can solve difficult learning control problems
    • A. G. Barto, R. S. Sutton and C. W. Anderson, Neuronlike adaptive elements that can solve difficult learning control problems, IEEE Trans. Syst. Man Cybernet. 13, 834-846 (1983).
    • (1983) IEEE Trans. Syst. Man Cybernet., vol. 13, pp. 834-846
    • Barto, A.G.; Sutton, R.S.; Anderson, C.W.
  • 5
    • 3242752134
    • Evolving the neural controller for a robotic arm able to grasp objects on the basis of tactile sensors
    • R. Bianco and S. Nolfi, Evolving the neural controller for a robotic arm able to grasp objects on the basis of tactile sensors, Adapt. Behav. 12, 37-45 (2004).
    • (2004) Adapt. Behav., vol. 12, pp. 37-45
    • Bianco, R.; Nolfi, S.
  • 6
    • 0025600638
    • A stochastic reinforcement learning algorithm for learning real-valued functions
    • V. Gullapalli, A stochastic reinforcement learning algorithm for learning real-valued functions, Neural Networks 3, 671-692 (1990).
    • (1990) Neural Networks, vol. 3, pp. 671-692
    • Gullapalli, V.
  • 7
    • 0033629916
    • Reinforcement learning in continuous time and space
    • K. Doya, Reinforcement learning in continuous time and space, Neural Comput. 12, 219-245 (2000).
    • (2000) Neural Comput., vol. 12, pp. 219-245
    • Doya, K.
  • 9
    • 0030164858
    • Model-based learning for mobile robot navigation from the dynamical system perspective
    • J. Tani, Model-based learning for mobile robot navigation from the dynamical system perspective, IEEE Trans. Syst. Man Cybernet. B 26, 421-436 (1996).
    • (1996) IEEE Trans. Syst. Man Cybernet. B, vol. 26, pp. 421-436
    • Tani, J.
  • 11
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • R. S. Sutton, Learning to predict by the methods of temporal differences, Machine Learn. 3, 9-44 (1988).
    • (1988) Machine Learn., vol. 3, pp. 9-44
    • Sutton, R.S.
  • 12
    • 0004671869
    • Temporal Difference Learning in Continuous Time and Space
    • K. Doya, Temporal Difference Learning in Continuous Time and Space, volume 8 of Advances in Neural Information Processing Systems. MIT Press, Cambridge, MA (1996).
    • (1996) Advances in Neural Information Processing Systems, vol. 8
    • Doya, K.
  • 13
    • 0001202594
    • A learning algorithm for continually running fully recurrent neural networks
    • R. J. Williams and D. Zipser, A learning algorithm for continually running fully recurrent neural networks, Neural Comput. 1, 270-280 (1989).
    • (1989) Neural Comput., vol. 1, pp. 270-280
    • Williams, R.J.; Zipser, D.
  • 14
    • 44049116478
    • Forward models: Supervised learning with a distal teacher
    • M. I. Jordan and D. E. Rumelhart, Forward models: supervised learning with a distal teacher, Cognitive Sci. 16, 307-354 (1992).
    • (1992) Cognitive Sci., vol. 16, pp. 307-354
    • Jordan, M.I.; Rumelhart, D.E.
  • 15
    • 0000646059
    • Learning Internal Representations by Error Propagation
    • D. Rumelhart, G. Hinton and R. Williams, Learning Internal Representations by Error Propagation, volume 1 of Parallel Distributed Processing. MIT Press, Cambridge, MA (1986).
    • (1986) Parallel Distributed Processing, vol. 1
    • Rumelhart, D.; Hinton, G.; Williams, R.
  • 16
    • 0032220772
    • An interpretation of the "self" from the dynamical system perspective: A constructivist approach
    • J. Tani, An interpretation of the "self" from the dynamical system perspective: a constructivist approach, Consciousness Studies 5 (1998).
    • (1998) Consciousness Studies, vol. 5
    • Tani, J.
  • 17
    • 0344154963
    • Strategy learning with multilayer connectionist representations
    • C. W. Anderson, Strategy learning with multilayer connectionist representations, in: Proc. 4th Int. Workshop on Machine Learning, pp. 103-114 (1987).
    • (1987) Proc. 4th Int. Workshop on Machine Learning, pp. 103-114
    • Anderson, C.W.
  • 18
    • 0028392483
    • Learning long-term dependencies with gradient descent is difficult
    • Y. Bengio, P. Simard and P. Frasconi, Learning long-term dependencies with gradient descent is difficult, IEEE Trans. Neural Networks 5, 157-166 (1994).
    • (1994) IEEE Trans. Neural Networks, vol. 5, pp. 157-166
    • Bengio, Y.; Simard, P.; Frasconi, P.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.