Volume 40, Issue 3, 2000, Pages 265-299

Study of reinforcement learning in the continuous case by the means of viscosity solutions

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION THEORY; COMPUTER SIMULATION; CONVERGENCE OF NUMERICAL METHODS; DECISION THEORY; DIFFERENTIAL EQUATIONS; DYNAMIC PROGRAMMING; FINITE DIFFERENCE METHOD; FINITE ELEMENT METHOD; FUNCTION EVALUATION; MARKOV PROCESSES; PROBLEM SOLVING; THEOREM PROVING;

EID: 0034274415     PISSN: 08856125     EISSN: None     Source Type: Journal    
DOI: 10.1023/A:1007686309208     Document Type: Article
Times cited: 57

References (50)
  • 3
    • 0003267441
    • Solutions de viscosité des équations de Hamilton-Jacobi
    • Springer-Verlag.
    • Barles, G. (1994). Solutions de viscosité des équations de Hamilton-Jacobi, Springer-Verlag. Mathématiques et Applications, Vol. 17.
    • (1994) Mathématiques et Applications , vol.17
    • Barles, G.1
  • 4
    • 0024070155
    • Exit time problems in optimal control and vanishing viscosity solutions of Hamilton-Jacobi equations
    • Barles, G. & Perthame, B. (1988). Exit time problems in optimal control and vanishing viscosity solutions of Hamilton-Jacobi equations. SIAM Journal on Control and Optimization, 26, 1133-1148.
    • (1988) SIAM Journal on Control and Optimization , vol.26 , pp. 1133-1148
    • Barles, G.1    Perthame, B.2
  • 5
    • 0003184220
    • Comparison principle for Dirichlet-type Hamilton-Jacobi equations and singular perturbations of degenerated elliptic equations
    • Barles, G. & Perthame, B. (1990). Comparison principle for Dirichlet-type Hamilton-Jacobi equations and singular perturbations of degenerated elliptic equations. Applied Mathematics and Optimization, 21, 21-44.
    • (1990) Applied Mathematics and Optimization , vol.21 , pp. 21-44
    • Barles, G.1    Perthame, B.2
  • 6
    • 84974753170
    • Convergence of approximation schemes for fully nonlinear second order equations
    • Barles, G. & Souganidis, P. (1991). Convergence of approximation schemes for fully nonlinear second order equations. Asymptotic Analysis, 4, 271-283.
    • (1991) Asymptotic Analysis , vol.4 , pp. 271-283
    • Barles, G.1    Souganidis, P.2
  • 7
    • 0002355083
    • Connectionist learning for control: An overview
    • W. T. Miller, R. S. Sutton, & P. J. Werbos (Eds.), Cambridge, Massachusetts: MIT Press
    • Barto, A. G. (1990). Connectionist learning for control: An overview. In W. T. Miller, R. S. Sutton, & P. J. Werbos (Eds.), Neural Networks for Control (pp. 5-58). Cambridge, Massachusetts: MIT Press.
    • (1990) Neural Networks for Control , pp. 5-58
    • Barto, A.G.1
  • 11
    • 0000193385
    • A simplification of the back-propagation-through-time algorithm for optimal neurocontrol
    • Bersini, H. & Gorrini, V. (1997). A simplification of the back-propagation-through-time algorithm for optimal neurocontrol. IEEE Transactions on Neural Networks, 8, 437-441.
    • (1997) IEEE Transactions on Neural Networks , vol.8 , pp. 437-441
    • Bersini, H.1    Gorrini, V.2
  • 14
    • 85153940465
    • Generalization in reinforcement learning: Safely approximating the value function
    • Boyan, J. & Moore, A. (1995). Generalization in reinforcement learning: Safely approximating the value function. Advances in Neural Information Processing Systems, 7, 369-376.
    • (1995) Advances in Neural Information Processing Systems , vol.7 , pp. 369-376
    • Boyan, J.1    Moore, A.2
  • 15
    • 84967708673
    • User's guide to viscosity solutions of second order partial differential equations
    • Crandall, M., Ishii, H., & Lions, P. (1992). User's guide to viscosity solutions of second order partial differential equations. Bulletin of the American Mathematical Society, 27(1), 1-67.
    • (1992) Bulletin of the American Mathematical Society , vol.27 , Issue.1 , pp. 1-67
    • Crandall, M.1    Ishii, H.2    Lions, P.3
  • 17
    • 85156231814
    • Temporal difference learning in continuous time and space
    • Doya, K. (1996). Temporal difference learning in continuous time and space. Advances in Neural Information Processing Systems, 8, 1073-1079.
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1073-1079
    • Doya, K.1
  • 18
    • 0032024210
    • Rates of convergence for approximation schemes in optimal control
    • Dupuis, P. & James, M. R. (1998). Rates of convergence for approximation schemes in optimal control. SIAM Journal on Control and Optimization, 36(2).
    • (1998) SIAM Journal on Control and Optimization , vol.36 , Issue.2
    • Dupuis, P.1    James, M.R.2
  • 19
    • 0003294328
    • Controlled Markov Processes and Viscosity Solutions
    • Springer-Verlag.
    • Fleming, W. H. & Soner, H. M. (1993). Controlled Markov Processes and Viscosity Solutions. Springer-Verlag. Applications of Mathematics.
    • (1993) Applications of Mathematics
    • Fleming, W.H.1    Soner, H.M.2
  • 22
    • 0343942238
    • Adaptive sparse grid multilevel methods for elliptic PDEs based on finite differences
    • Notes on Numerical Fluid Mechanics: Computing, submitted
    • Griebel, M. (1998). Adaptive sparse grid multilevel methods for elliptic PDEs based on finite differences. In Proceedings Large Scale Scientific Computations. Notes on Numerical Fluid Mechanics: Computing, submitted.
    • (1998) Proceedings Large Scale Scientific Computations
    • Griebel, M.1
  • 24
    • 79957749002
    • Reinforcement learning applied to a differential game
    • Harmon, M. E., Baird, L. C., & Klopf, A. H. (1996). Reinforcement learning applied to a differential game. Adaptive Behavior, 4, 3-28.
    • (1996) Adaptive Behavior , vol.4 , pp. 3-28
    • Harmon, M.E.1    Baird, L.C.2    Klopf, A.H.3
  • 26
    • 0025484857
    • Numerical methods for stochastic control problems in continuous time
    • Kushner, H. J. (1990). Numerical methods for stochastic control problems in continuous time. SIAM J. Control and Optimization, 28, 999-1048.
    • (1990) SIAM J. Control and Optimization , vol.28 , pp. 999-1048
    • Kushner, H.J.1
  • 27
    • 0003270114
    • Numerical Methods for Stochastic Control Problems in Continuous Time
    • Springer-Verlag.
    • Kushner, H. J. & Dupuis, P. (1992). Numerical Methods for Stochastic Control Problems in Continuous Time. Springer-Verlag. Applications of Mathematics.
    • (1992) Applications of Mathematics
    • Kushner, H.J.1    Dupuis, P.2
  • 29
    • 0026880130
    • Automatic programming of behavior-based robots using reinforcement learning
    • Mahadevan, S. & Connell, J. (1992). Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence, 55, 311-365.
    • (1992) Artificial Intelligence , vol.55 , pp. 311-365
    • Mahadevan, S.1    Connell, J.2
  • 31
    • 33747997674
    • Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state-spaces
    • Moore, A. W. (1991). Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state-spaces. In Machine Learning: Proceedings of the Eighth International Workshop (pp. 333-337).
    • (1991) Machine Learning: Proceedings of the Eighth International Workshop , pp. 333-337
    • Moore, A.W.1
  • 32
    • 0029514510
    • The parti-game algorithm for variable resolution reinforcement learning in multidimensional state space
    • Moore, A. W. & Atkeson, C. (1995). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state space. Machine Learning Journal, 21.
    • (1995) Machine Learning Journal , vol.21
    • Moore, A.W.1    Atkeson, C.2
  • 33
    • 0343506477
    • A convergent reinforcement learning algorithm in the continuous case: The finite-element reinforcement learning
    • Munos, R. (1996). A convergent reinforcement learning algorithm in the continuous case: The finite-element reinforcement learning. In International Conference on Machine Learning.
    • (1996) International Conference on Machine Learning
    • Munos, R.1
  • 35
    • 0039225090
    • A convergent reinforcement learning algorithm in the continuous case based on a finite difference method
    • Munos, R. (1997b). A convergent reinforcement learning algorithm in the continuous case based on a finite difference method. In International Joint Conference on Artificial Intelligence.
    • (1997) International Joint Conference on Artificial Intelligence
    • Munos, R.1
  • 36
    • 0343506476
    • Finite-element methods with local triangulation refinement for continuous reinforcement learning problems
    • Munos, R. (1997c). Finite-element methods with local triangulation refinement for continuous reinforcement learning problems. In European Conference on Machine Learning.
    • (1997) European Conference on Machine Learning
    • Munos, R.1
  • 37
    • 0343070473
    • A general convergence theorem for reinforcement learning in the continuous case
    • Munos, R. (1998). A general convergence theorem for reinforcement learning in the continuous case. In European Conference on Machine Learning.
    • (1998) European Conference on Machine Learning
    • Munos, R.1
  • 38
    • 0033308517
    • Gradient descent approaches to neural-net-based solutions of the Hamilton-Jacobi-Bellman equation
    • Munos, R., Baird, L., & Moore, A. (1999). Gradient descent approaches to neural-net-based solutions of the Hamilton-Jacobi-Bellman equation. In International Joint Conference on Neural Networks.
    • (1999) International Joint Conference on Neural Networks
    • Munos, R.1    Baird, L.2    Moore, A.3
  • 40
    • 0012003008
    • Barycentric interpolators for continuous space and time reinforcement learning
    • Munos, R. & Moore, A. (1998). Barycentric interpolators for continuous space and time reinforcement learning. Advances in Neural Information Processing Systems, 11, 1024-1030.
    • (1998) Advances in Neural Information Processing Systems , vol.11 , pp. 1024-1030
    • Munos, R.1    Moore, A.2
  • 41
    • 84880680664
    • Variable resolution discretization for high-accuracy solutions of optimal control problems
    • Munos, R. & Moore, A. (1999). Variable resolution discretization for high-accuracy solutions of optimal control problems. In International Joint Conference on Artificial Intelligence, 1348-1355.
    • (1999) International Joint Conference on Artificial Intelligence , pp. 1348-1355
    • Munos, R.1    Moore, A.2
  • 43
    • 0343070470
    • Multi-grid methods for reinforcement learning in controlled diffusion processes
    • Pareigis, S. (1996). Multi-grid methods for reinforcement learning in controlled diffusion processes. Advances in Neural Information Processing Systems, 9.
    • (1996) Advances in Neural Information Processing Systems , vol.9
    • Pareigis, S.1
  • 48
    • 85156221438
    • Generalization in reinforcement learning: Successful examples using sparse coarse coding
    • Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. Advances in Neural Information Processing Systems, 8, 1038-1044.
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1038-1044
    • Sutton, R.S.1
  • 50
    • 0000337576
    • Simple statistical gradient-following algorithms for connectionist reinforcement learning
    • Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8, 229-256.
    • (1992) Machine Learning , vol.8 , pp. 229-256
    • Williams, R.J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.