



Adaptive Behavior, Volume 6, Issue 2, 1997, Pages 163-217

Experiments with reinforcement learning in problems with continuous state and action spaces

Author keywords

Continuous domains; Function approximation; Memory based methods; Optimal control; Reinforcement learning; Resource preallocation

EID: 0031231885     PISSN: 1059-7123     EISSN: None     Source Type: Journal
DOI: 10.1177/105971239700600201     Document Type: Article
Times cited: 166

References (26)
  • 1. Albus, J. S. (1975). A new approach to manipulator control: The cerebellar model articulation controller (CMAC). Journal of Dynamic Systems, Measurement, and Control, 97(3), 220-227.
  • 2. Atkeson, C. G. (1991). Memory-based learning control. In Proceedings of the 1991 American Control Conference (Vol. 3). New York: American Automatic Control Council.
  • 4. Bellman, R. (1957). Dynamic programming. Princeton, NJ: Princeton University Press.
  • 6. Cichosz, P. (1996). Truncating temporal differences: On the efficient implementation of TD(λ) for reinforcement learning. Journal of Artificial Intelligence Research, 287-318.
  • 7. Kanerva, P. (1993). Sparse distributed memory and related models. In M. H. Hassoun (Ed.), Associative neural memories: Theory and implementation. New York: Oxford University Press.
  • 8. Kibler, D., & Aha, D. W. (1989). Instance-based prediction of real-valued attributes. Computational Intelligence, 5(2), 51-57.
  • 9. Lin, L. J. (1992). Self-improving reactive agents based on reinforcement learning. Machine Learning, 8(3-4), 293-321.
  • 12. Moore, A. W., & Atkeson, C. G. (1995). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. Machine Learning, 21(3), 199-233.
  • 16. Ram, A., & Santamaría, J. C. (1997). Continuous case-based reasoning. Artificial Intelligence, 90(1-2), 25-77.
  • 20. Singh, S. P., & Sutton, R. S. (1996). Reinforcement learning with replacing eligibility traces. Machine Learning, 22, 123-158.
  • 22. Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
  • 23. Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. Advances in Neural Information Processing Systems, 8, 1038-1044.
  • 24. Tham, C. K. (1995). Reinforcement learning of multiple tasks using a hierarchical CMAC architecture. Robotics and Autonomous Systems, 15(4), 247-274.
  • 25. Tsitsiklis, J. N., & Van Roy, B. (1997). An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 42(5), 674-690.
  • 26. Watkins, C. J. C. H. (1989). Learning from delayed rewards. Unpublished doctoral thesis, University of Cambridge, Cambridge, England.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.