SCOPUS Information Retrieval Platform

Machine Learning

Volume 33, Issue 1, 1998, Pages 105-115

Fast Online Q(λ)

(2) Wiering, Marco ᵃ; Schmidhuber, Jürgen ᵃ

ᵃ DALLE MOLLE INSTITUTE FOR ARTIFICIAL INTELLIGENCE IDSIA (Switzerland)

Author keywords

Lazy learning; Online Q(λ); Q learning; Reinforcement learning; TD(λ)

Indexed keywords

ALGORITHMS; APPROXIMATION THEORY; ERROR ANALYSIS; MARKOV PROCESSES; MATHEMATICAL OPERATORS; ONLINE SYSTEMS; PROBABILITY DISTRIBUTIONS; COMPUTATIONAL COMPLEXITY; LEARNING ALGORITHMS; TABLE LOOKUP;

LAZY LEARNING; Q-LEARNING; REINFORCEMENT LEARNING; STATE/ACTION SPACE;

LEARNING SYSTEMS;

EID: 0032182997 PISSN: 08856125 EISSN: None Source Type: Journal
DOI: 10.1023/A:1007562800292 Document Type: Article

Times cited : (64)

References (21)

1
- 0016556021
- A new approach to manipulator control: The cerebellar model articulation controller (CMAC)
- Albus, J.S. (1975). A new approach to manipulator control: The cerebellar model articulation controller (CMAC). Dynamic Systems, Measurement and Control, 97, 220-227.
- (1975) Dynamic Systems, Measurement and Control , vol.97 , pp. 220-227
- Albus, J.S.¹

2
- 0031074521
- Locally weighted learning
- Atkeson, C.G., Schaal, S., & Moore, A.W. (1997). Locally weighted learning. Artificial Intelligence Review, 11, 11-73.
- (1997) Artificial Intelligence Review , vol.11 , pp. 11-73
- Atkeson, C.G.¹ Schaal, S.² Moore, A.W.³

3
- 0020970738
- Neuronlike adaptive elements that can solve difficult learning control problems
- Barto, A.G., Sutton, R.S., & Anderson, C.W. (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13, 834-846.
- (1983) IEEE Transactions on Systems, Man, and Cybernetics , vol.SMC-13 , pp. 834-846
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

4
- 0003487482
- Belmont, MA: Athena Scientific
- Bertsekas, D.P., & Tsitsiklis, J.N. (1996). Neuro-dynamic programming. Belmont, MA: Athena Scientific.
- (1996) Neuro-dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

5
- 0010878888
- (Technical Report IRIDIA-94-14). Université Libre de Bruxelles
- Caironi, P.V.C., & Dorigo, M. (1994). Training Q-agents (Technical Report IRIDIA-94-14). Université Libre de Bruxelles.
- (1994) Training Q-agents
- Caironi, P.V.C.¹ Dorigo, M.²

6
- 0007512578
- Truncating temporal differences: On the efficient implementation of TD(λ) for reinforcement learning
- Cichosz, P. (1995). Truncating temporal differences: On the efficient implementation of TD(λ) for reinforcement learning. Journal of Artificial Intelligence Research, 2, 287-318.
- (1995) Journal of Artificial Intelligence Research , vol.2 , pp. 287-318
- Cichosz, P.¹

7
- 0347763086
- Supervised learning with growing cell structures
- J. Cowan, G. Tesauro, & J. Alspector (Eds.), San Mateo, CA: Morgan Kaufmann
- Fritzke, B. (1994). Supervised learning with growing cell structures. In J. Cowan, G. Tesauro, & J. Alspector (Eds.), Advances in neural information processing systems (Vol. 6, pp. 255-262). San Mateo, CA: Morgan Kaufmann.
- (1994) Advances in Neural Information Processing Systems , vol.6 , pp. 255-262
- Fritzke, B.¹

8
- 0029751419
- The effect of representation and knowledge on goal-directed exploration with reinforcement learning algorithms
- Koenig, S., & Simmons, R.G. (1996). The effect of representation and knowledge on goal-directed exploration with reinforcement learning algorithms. Machine Learning, 22, 228-250.
- (1996) Machine Learning , vol.22 , pp. 228-250
- Koenig, S.¹ Simmons, R.G.²

9
- 0003527079
- Springer
- Kohonen, T. (1988). Self-organization and associative memory (2nd ed.). Springer.
- (1988) Self-organization and Associative Memory (2nd Ed.)
- Kohonen, T.¹

10
- 0003673017
- Ph.D. thesis, Carnegie Mellon University, Pittsburgh
- Lin, L.-J. (1993). Reinforcement learning for robots using neural networks. Ph.D. thesis, Carnegie Mellon University, Pittsburgh.
- (1993) Reinforcement Learning for Robots Using Neural Networks
- Lin, L.-J.¹

11
- 0000955979
- Incremental multi-step Q-learning
- Peng, J., & Williams, R. (1996). Incremental multi-step Q-learning. Machine Learning, 22, 283-290.
- (1996) Machine Learning , vol.22 , pp. 283-290
- Peng, J.¹ Williams, R.²

12
- 0345161982
- (Technical Report CUED/F-INFENG-TR 166). UK: Cambridge University
- Rummery, G., & Niranjan, M. (1994). On-line Q-learning using connectionist systems (Technical Report CUED/F-INFENG-TR 166). UK: Cambridge University.
- (1994) On-line Q-learning Using Connectionist Systems
- Rummery, G.¹ Niranjan, M.²

13
- 0029753630
- Reinforcement learning with replacing eligibility traces
- Singh, S., & Sutton, R. (1996). Reinforcement learning with replacing eligibility traces. Machine Learning, 22, 123-158.
- (1996) Machine Learning , vol.22 , pp. 123-158
- Singh, S.¹ Sutton, R.²

14
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton, R.S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

15
- 0000723997
- Generalization in reinforcement learning: Successful examples using sparse coarse coding
- D.S. Touretzky, M.C. Mozer, & M.E. Hasselmo (Eds.), Cambridge, MA: MIT Press
- Sutton, R.S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. In D.S. Touretzky, M.C. Mozer, & M.E. Hasselmo (Eds.), Advances in neural information processing systems (Vol. 8, pp. 1033-1045). Cambridge, MA: MIT Press.
- (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1033-1045
- Sutton, R.S.¹

16
- 2542485629
- Practical issues in temporal difference learning
- D.S. Lippman, J.E. Moody, & D.S. Touretzky (Eds.), San Mateo, CA: Morgan Kaufmann
- Tesauro, G. (1992). Practical issues in temporal difference learning. In D.S. Lippman, J.E. Moody, & D.S. Touretzky (Eds.), Advances in neural information processing systems (Vol. 4, pp. 259-266). San Mateo, CA: Morgan Kaufmann.
- (1992) Advances in Neural Information Processing Systems , vol.4 , pp. 259-266
- Tesauro, G.¹

17
- 0003411271
- (Technical Report CMU-CS-92-102). Carnegie Mellon University
- Thrun, S. (1992). Efficient exploration in reinforcement learning (Technical Report CMU-CS-92-102). Carnegie Mellon University.
- (1992) Efficient Exploration in Reinforcement Learning
- Thrun, S.¹

18
- 0004049893
- Ph.D. thesis, King's College, Cambridge, England
- Watkins, C.J.C.H. (1989). Learning from delayed rewards. Ph.D. thesis, King's College, Cambridge, England.
- (1989) Learning from Delayed Rewards
- Watkins, C.J.C.H.¹

19
- 34249833101
- Technical note: Q-learning
- Watkins, C.J.C.H., & Dayan, P. (1992). Technical note: Q-learning. Machine Learning, 8, 279-292.
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

20
- 0003619736
- Ph.D. thesis, University of Rochester
- Whitehead, S. (1992). Reinforcement learning for the adaptive control of perception and action. Ph.D. thesis, University of Rochester.
- (1992) Reinforcement Learning for the Adaptive Control of Perception and Action
- Whitehead, S.¹

21
- 2542475311
- Speeding up Q(λ)-learning
- C. Nedellec, & C. Rouveirol (Eds.), Berlin: Springer Verlag
- Wiering, M.A., & Schmidhuber, J. (1998). Speeding up Q(λ)-learning. In C. Nedellec, & C. Rouveirol (Eds.), Machine Learning: Proceedings of the Tenth European Conference. Berlin: Springer Verlag.
- (1998) Machine Learning: Proceedings of the Tenth European Conference
- Wiering, M.A.¹ Schmidhuber, J.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.