SCOPUS Information Search Platform

Machine Learning

Volume 40, Issue 3, 2000, Pages 265-299

Study of reinforcement learning in the continuous case by the means of viscosity solutions

(1) Munos, Rémi a

a CARNEGIE MELLON UNIVERSITY (United States)

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION THEORY; COMPUTER SIMULATION; CONVERGENCE OF NUMERICAL METHODS; DECISION THEORY; DIFFERENTIAL EQUATIONS; DYNAMIC PROGRAMMING; FINITE DIFFERENCE METHOD; FINITE ELEMENT METHOD; FUNCTION EVALUATION; MARKOV PROCESSES; PROBLEM SOLVING; THEOREM PROVING;

HAMILTON-JACOBI-BELLMAN EQUATION; MARKOV DECISION PROCESS; REINFORCEMENT LEARNING; STOCHASTIC STATE DYNAMICS; VISCOSITY SOLUTIONS;

LEARNING ALGORITHMS;

EID: 0034274415 PISSN: 08856125 EISSN: None Source Type: Journal
DOI: 10.1023/A:1007686309208 Document Type: Article

Times cited : (59)

References (50)

1
- 6044232041
- Ph.D. Thesis, University Paris IX Dauphine
- Akian, M. (1990). Méthodes multigrilles en contrôle stochastique. Ph.D. Thesis, University Paris IX Dauphine.
- (1990) Méthodes Multigrilles en Contrôle Stochastique
- Akian, M.¹

2
- 85151728371
- Residual algorithms: Reinforcement learning with function approximation
- Baird, L. C. (1995). Residual algorithms: Reinforcement learning with function approximation. In Machine Learning: Proceedings of the Twelfth International Conference.
- (1995) Machine Learning: Proceedings of the Twelfth International Conference
- Baird, L.C.¹

3
- 0003267441
- Solutions de viscosité des équations de Hamilton-Jacobi
- Springer-Verlag.
- Barles, G. (1994). Solutions de viscosité des équations de Hamilton-Jacobi, Springer-Verlag. Mathématiques et Applications, Vol. 17.
- (1994) Mathématiques et Applications , vol.17
- Barles, G.¹

4
- 0024070155
- Exit time problems in optimal control and vanishing viscosity solutions of Hamilton-Jacobi equations
- Barles, G. & Perthame, B. (1988). Exit time problems in optimal control and vanishing viscosity solutions of Hamilton-Jacobi equations. SIAM Control Optimization, 26, 1133-1148.
- (1988) SIAM Control Optimization , vol.26 , pp. 1133-1148
- Barles, G.¹ Perthame, B.²

5
- 0003184220
- Comparison principle for Dirichlet-type Hamilton-Jacobi equations and singular perturbations of degenerated elliptic equations
- Barles, G. & Perthame, B. (1990). Comparison principle for Dirichlet-type Hamilton-Jacobi equations and singular perturbations of degenerated elliptic equations. Applied Mathematics and Optimization, 21, 21-44.
- (1990) Applied Mathematics and Optimization , vol.21 , pp. 21-44
- Barles, G.¹ Perthame, B.²

6
- 84974753170
- Convergence of approximation schemes for fully nonlinear second order equations
- Barles, G. & Souganidis, P. (1991). Convergence of approximation schemes for fully nonlinear second order equations. Asymptotic Analysis, 4, 271-283.
- (1991) Asymptotic Analysis , vol.4 , pp. 271-283
- Barles, G.¹ Souganidis, P.²

7
- 0002355083
- Connectionist learning for control: An overview
- W. T. Miller, R. S. Sutton, & P. J. Werbos (Eds.), Cambridge, Massachusetts: MIT Press
- Barto, A. G. (1990). Connectionist learning for control: An overview. In W. T. Miller, R. S. Sutton, & P. J. Werbos (Eds.), Neural Networks for Control (pp. 5-58). Cambridge, Massachusetts: MIT Press.
- (1990) Neural Networks for Control , pp. 5-58
- Barto, A.G.¹

8
- 0003915098
- Tech. Rep. 91-57, Computer Science Department, University of Massachusetts
- Barto, A. G., Bradtke, S. J., & Singh, S. P. (1991). Real-time learning and control using asynchronous dynamic programming. Tech. Rep. 91-57, Computer Science Department, University of Massachusetts.
- (1991) Real-time Learning and Control Using Asynchronous Dynamic Programming
- Barto, A.G.¹ Bradtke, S.J.² Singh, S.P.³

9
- 0020970738
- Neuronlike adaptive elements that can solve difficult learning control problems
- Barto, A. G., Sutton, R. S., & Anderson, C. W. (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man and Cybernetics, 13, 835-846.
- (1983) IEEE Transactions on Systems, Man and Cybernetics , vol.13 , pp. 835-846
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

10
- 85012688561
- Princeton Univ. Press
- Bellman, R. (1957). Dynamic Programming. Princeton Univ. Press.
- (1957) Dynamic Programming
- Bellman, R.¹

11
- 0000193385
- A simplification of the back-propagation-through-time algorithm for optimal neurocontrol
- Bersini, H. & Gorrini, V. (1997). A simplification of the back-propagation-through-time algorithm for optimal neurocontrol. IEEE Transactions on Neural Networks, 8, 437-441.
- (1997) IEEE Transactions on Neural Networks , vol.8 , pp. 437-441
- Bersini, H.¹ Gorrini, V.²

12
- 0003565779
- Prentice Hall
- Bertsekas, D. P. (1987). Dynamic Programming: Deterministic and Stochastic Models. Prentice Hall.
- (1987) Dynamic Programming: Deterministic and Stochastic Models
- Bertsekas, D.P.¹

13
- 0003487482
- Athena Scientific
- Bertsekas, D. P. & Tsitsiklis, J. (1996). Neuro-Dynamic Programming. Athena Scientific.
- (1996) Neuro-dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.²

14
- 85153940465
- Generalization in reinforcement learning: Safely approximating the value function
- Boyan, J. & Moore, A. (1995). Generalization in reinforcement learning: Safely approximating the value function. Advances in Neural Information Processing Systems, 7, 369-376.
- (1995) Advances in Neural Information Processing Systems , vol.7 , pp. 369-376
- Boyan, J.¹ Moore, A.²

15
- 84967708673
- User's guide to viscosity solutions of second order partial differential equations
- Crandall, M., Ishii, H., & Lions, P. (1992). User's guide to viscosity solutions of second order partial differential equations. Bulletin of the American Mathematical Society, 27(1), 1-67.
- (1992) Bulletin of the American Mathematical Society , vol.27 , Issue.1 , pp. 1-67
- Crandall, M.¹ Ishii, H.² Lions, P.³

16
- 84967758647
- Viscosity solutions of Hamilton-Jacobi equations
- Crandall, M. & Lions, P. (1983). Viscosity solutions of Hamilton-Jacobi equations. Trans. of the American Mathematical Society, 277, 1-42.
- (1983) Trans. of the American Mathematical Society , vol.277 , pp. 1-42
- Crandall, M.¹ Lions, P.²

17
- 85156231814
- Temporal difference learning in continuous time and space
- Doya, K. (1996). Temporal difference learning in continuous time and space. Advances in Neural Information Processing Systems, 8, 1073-1079.
- (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1073-1079
- Doya, K.¹

18
- 0032024210
- Rates of convergence for approximation schemes in optimal control
- Dupuis, P. & James, M. R. (1998). Rates of convergence for approximation schemes in optimal control. SIAM Journal Control and Optimization, 36(2).
- (1998) SIAM Journal Control and Optimization , vol.36 , Issue.2
- Dupuis, P.¹ James, M.R.²

19
- 0003294328
- Controlled Markov Processes and Viscosity Solutions
- Springer-Verlag.
- Fleming, W. H. & Soner, H. M. (1993). Controlled Markov Processes and Viscosity Solutions. Springer-Verlag. Applications of Mathematics.
- (1993) Applications of Mathematics
- Fleming, W.H.¹ Soner, H.M.²

20
- 0030711314
- Fuzzy Q-learning
- Glorennec, P. & Jouffe, L. (1997). Fuzzy Q-learning. In Sixth International Conference on Fuzzy Systems.
- (1997) Sixth International Conference on Fuzzy Systems
- Glorennec, P.¹ Jouffe, L.²

21
- 84880694195
- Stable function approximation in dynamic programming
- Gordon, G. (1995). Stable function approximation in dynamic programming. In International Conference on Machine Learning.
- (1995) International Conference on Machine Learning
- Gordon, G.¹

22
- 0343942238
- Adaptive sparse grid multilevel methods for elliptic PDEs based on finite differences
- Notes on Numerical Fluid Mechanics: Computing, submitted
- Griebel, M. (1998). Adaptive sparse grid multilevel methods for elliptic PDEs based on finite differences. In Proceedings Large Scale Scientific Computations. Notes on Numerical Fluid Mechanics: Computing, submitted.
- (1998) Proceedings Large Scale Scientific Computations
- Griebel, M.¹

23
- 0003572965
- Ph.D. Thesis, University of Massachusetts, Amherst
- Gullapalli, V. (1992). Reinforcement Learning and its application to control. Ph.D. Thesis, University of Massachusetts, Amherst.
- (1992) Reinforcement Learning and Its Application to Control
- Gullapalli, V.¹

24
- 79957749002
- Reinforcement learning applied to a differential game
- Harmon, M. E., Baird, L. C., & Klopf, A. H. (1996). Reinforcement learning applied to a differential game. Adaptive Behavior, 4, 3-28.
- (1996) Adaptive Behavior , vol.4 , pp. 3-28
- Harmon, M.E.¹ Baird, L.C.² Klopf, A.H.³

25
- 0029679044
- Reinforcement learning: A survey
- Kaelbling, L. P., Littman, M., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of AI Research, 4, 237-285.
- (1996) Journal of AI Research , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.² Moore, A.W.³

26
- 0025484857
- Numerical methods for stochastic control problems in continuous time
- Kushner, H. J. (1990). Numerical methods for stochastic control problems in continuous time. SIAM J. Control and Optimization, 28, 999-1048.
- (1990) SIAM J. Control and Optimization , vol.28 , pp. 999-1048
- Kushner, H.J.¹

27
- 0003270114
- Numerical Methods for Stochastic Control Problems in Continuous Time
- Springer-Verlag.
- Kushner, H. J. & Dupuis, P. (1992). Numerical Methods for Stochastic Control Problems in Continuous Time. Springer-Verlag. Applications of Mathematics.
- (1992) Applications of Mathematics
- Kushner, H.J.¹ Dupuis, P.²

28
- 0003673017
- Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Lin, L.-J. (1993). Reinforcement learning for robots using neural networks. Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania.
- (1993) Reinforcement Learning for Robots Using Neural Networks
- Lin, L.-J.¹

29
- 0026880130
- Automatic programming of behavior-based robots using reinforcement learning
- Mahadevan, S. & Connell, J. (1992). Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence, 55, 311-365.
- (1992) Artificial Intelligence , vol.55 , pp. 311-365
- Mahadevan, S.¹ Connell, J.²

30
- 0344683716
- Ph.D. Thesis, Université de Caen
- Meuleau, N. (1996). Le dilemme Exploration/Exploitation dans les systèmes d'apprentissage par renforcement. Ph.D. Thesis, Université de Caen.
- (1996) Le Dilemme Exploration/Exploitation Dans les Systèmes d'apprentissage par Renforcement
- Meuleau, N.¹

31
- 33747997674
- Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state-spaces
- Moore, A. W. (1991). Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued state-spaces. In Machine Learning: Proceedings of the Eighth International Workshop (pp. 333-337).
- (1991) Machine Learning: Proceedings of the Eighth International Workshop , pp. 333-337
- Moore, A.W.¹

32
- 0029514510
- The parti-game algorithm for variable resolution reinforcement learning in multidimensional state space
- Moore, A. W. & Atkeson, C. (1995). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state space. Machine Learning Journal, 21.
- (1995) Machine Learning Journal , vol.21
- Moore, A.W.¹ Atkeson, C.²

33
- 0343506477
- A convergent reinforcement learning algorithm in the continuous case: The finite-element reinforcement learning
- Munos, R. (1996). A convergent reinforcement learning algorithm in the continuous case: The finite-element reinforcement learning. In International Conference on Machine Learning.
- (1996) International Conference on Machine Learning
- Munos, R.¹

34
- 0342636184
- Ph.D. Thesis, Ecole des Hautes Etudes en Sciences Sociales
- Munos, R. (1997a). Apprentissage par Renforcement, étude du cas continu. Ph.D. Thesis, Ecole des Hautes Etudes en Sciences Sociales.
- (1997) Apprentissage par Renforcement, Étude du Cas Continu
- Munos, R.¹

35
- 0039225090
- A convergent reinforcement learning algorithm in the continuous case based on a finite difference method
- Munos, R. (1997b). A convergent reinforcement learning algorithm in the continuous case based on a finite difference method. In International Joint Conference on Artificial Intelligence.
- (1997) International Joint Conference on Artificial Intelligence
- Munos, R.¹

36
- 0343506476
- Finite-element methods with local triangulation refinement for continuous reinforcement learning problems
- Munos, R. (1997c). Finite-element methods with local triangulation refinement for continuous reinforcement learning problems. In European Conference on Machine Learning.
- (1997) European Conference on Machine Learning
- Munos, R.¹

37
- 0343070473
- A general convergence theorem for reinforcement learning in the continuous case
- Munos, R. (1998). A general convergence theorem for reinforcement learning in the continuous case. In European Conference on Machine Learning.
- (1998) European Conference on Machine Learning
- Munos, R.¹

38
- 0033308517
- Gradient descent approaches to neural-net-based solutions of the Hamilton-Jacobi-Bellman equation
- Munos, R., Baird, L., & Moore, A. (1999). Gradient descent approaches to neural-net-based solutions of the Hamilton-Jacobi-Bellman equation. In International Joint Conference on Neural Networks.
- (1999) International Joint Conference on Neural Networks
- Munos, R.¹ Baird, L.² Moore, A.³

39
- 36348936314
- Reinforcement learning for continuous stochastic control problems
- Munos, R. & Bourgine, P. (1997). Reinforcement learning for continuous stochastic control problems. Advances in Neural Information Processing Systems, 10.
- (1997) Advances in Neural Information Processing Systems , vol.10
- Munos, R.¹ Bourgine, P.²

40
- 0012003008
- Barycentric interpolators for continuous space and time reinforcement learning
- Munos, R. & Moore, A. (1998). Barycentric interpolators for continuous space and time reinforcement learning. Advances in Neural Information Processing Systems, 11, 1024-1030.
- (1998) Advances in Neural Information Processing Systems , vol.11 , pp. 1024-1030
- Munos, R.¹ Moore, A.²

41
- 84880680664
- Variable resolution discretization for high-accuracy solutions of optimal control problems
- Munos, R. & Moore, A. (1999). Variable resolution discretization for high-accuracy solutions of optimal control problems. In International Joint Conference on Artificial Intelligence, 1348-1355.
- (1999) International Joint Conference on Artificial Intelligence , pp. 1348-1355
- Munos, R.¹ Moore, A.²

42
- 0343070471
- Fuzzy reinforcement learning: An overview
- Nowé, A. (1995). Fuzzy reinforcement learning: An overview. Advances in Fuzzy Theory and Technology.
- (1995) Advances in Fuzzy Theory and Technology
- Nowé, A.¹

43
- 0343070470
- Multi-grid methods for reinforcement learning in controlled diffusion processes
- Pareigis, S. (1996). Multi-grid methods for reinforcement learning in controlled diffusion processes. Advances in Neural Information Processing Systems, 9.
- (1996) Advances in Neural Information Processing Systems , vol.9
- Pareigis, S.¹

44
- 0343942237
- Adaptive choice of grid and time in reinforcement learning
- Pareigis, S. (1997). Adaptive choice of grid and time in reinforcement learning. Advances in Neural Information Processing Systems, 10.
- (1997) Advances in Neural Information Processing Systems , vol.10
- Pareigis, S.¹

45
- 0003716450
- New York: Interscience
- Pontryagin, L., Boltyanskii, V., Gamkrelidze, R., & Mishchenko, E. (1962). The Mathematical Theory of Optimal Processes. New York: Interscience.
- (1962) The Mathematical Theory of Optimal Processes
- Pontryagin, L.¹ Boltyanskii, V.² Gamkrelidze, R.³ Mishchenko, E.⁴

46
- 85102627959
- A Wiley-Interscience Publication
- Puterman, M. L. (1994). Markov Decision Processes, Discrete Stochastic Dynamic Programming. A Wiley-Interscience Publication.
- (1994) Markov Decision Processes, Discrete Stochastic Dynamic Programming
- Puterman, M.L.¹

47
- 85153965130
- Reinforcement learning with soft state aggregation
- Singh, S. P., Jaakkola, T., & Jordan, M. I. (1994). Reinforcement learning with soft state aggregation. Advances in Neural Information Processing Systems, 6, 359-368.
- (1994) Advances in Neural Information Processing Systems , vol.6 , pp. 359-368
- Singh, S.P.¹ Jaakkola, T.² Jordan, M.I.³

48
- 85156221438
- Generalization in reinforcement learning: Successful examples using sparse coarse coding
- Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. Advances in Neural Information Processing Systems, 8, 1038-1044.
- (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1038-1044
- Sutton, R.S.¹

49
- 84901616862
- Online learning with random representations
- Sutton, R. & Whitehead, S. (1993). Online learning with random representations. In International Conference on Machine Learning.
- (1993) International Conference on Machine Learning
- Sutton, R.¹ Whitehead, S.²

50
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8, 229-256.
- (1992) Machine Learning , vol.8 , pp. 229-256
- Williams, R.J.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.