SCOPUS Information Search Platform

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Volume 38, Issue 4, 2008, Pages 930-936

Ensemble algorithms in reinforcement learning

Authors (2): Wiering, Marco A. (a); van Hasselt, Hado (b)

a UNIVERSITY OF GRONINGEN (Netherlands)

b UTRECHT UNIVERSITY (Netherlands)

Author keywords

Dynamic mazes; Ensemble algorithms; Partially observable environments; Reinforcement learning (RL)

Indexed keywords

ALGORITHMS; CHLORINE COMPOUNDS; EDUCATION; LEARNING SYSTEMS; MATHEMATICAL PROGRAMMING; PROBABILITY; REINFORCEMENT; REINFORCEMENT LEARNING; SYSTEMS ENGINEERING;

ACTOR-CRITIC; BOLTZMANN; DYNAMIC MAZES; ENHANCE LEARNING; ENSEMBLE METHODS; ENSEMBLE ALGORITHMS; LEARNING AUTOMATON; MAJORITY VOTING; MAZE PROBLEMS; PARTIALLY OBSERVABLE ENVIRONMENTS; Q-LEARNING; REINFORCEMENT LEARNING (RL); SINGLE AGENTS; SINGLE-VALUE; VALUE FUNCTIONS;

LEARNING ALGORITHMS;

ALGORITHM; ARTICLE; ARTIFICIAL NEURAL NETWORK; COMPUTER SIMULATION; FEEDBACK SYSTEM; REINFORCEMENT; SYSTEM ANALYSIS; SYSTEMS THEORY; THEORETICAL MODEL;

ALGORITHMS; COMPUTER SIMULATION; FEEDBACK; MODELS, THEORETICAL; NEURAL NETWORKS (COMPUTER); PROGRAMMING, LINEAR; REINFORCEMENT (PSYCHOLOGY); SYSTEMS THEORY;

EID: 49049105169 PISSN: 10834419 EISSN: None Source Type: Journal
DOI: 10.1109/TSMCB.2008.920231 Document Type: Article

Times cited : (196)

References (23)

1
- 0004102479
- Cambridge, MA: MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction Cambridge, MA: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

2
- 0029679044
- Reinforcement learning: A survey
- May
- L. P. Kaelbling, M. L. Littman, and A. W. Moore, "Reinforcement learning: A survey," J. Artif. Intell. Res., vol. 4, pp. 237-285, May 1996.
- (1996) J. Artif. Intell. Res , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

3
- 0004049893
- Learning from delayed rewards
- Ph.D. dissertation, King's College, Cambridge, U.K.
- C. J. C. H. Watkins, "Learning from delayed rewards," Ph.D. dissertation, King's College, Cambridge, U.K., 1989.
- (1989)
- Watkins, C.J.C.H.¹

4
- 49049097809
- G. Rummery and M. Niranjan, "On-line Q-learning using connectionist systems," Cambridge Univ., Cambridge, U.K., Tech. Rep. CUED/F-INFENG-TR 166, 1994.

5
- 85156221438
- Generalization in reinforcement learning: Successful examples using sparse coarse coding
- D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, Eds. Cambridge, MA: MIT Press
- R. S. Sutton, "Generalization in reinforcement learning: Successful examples using sparse coarse coding," in Advances in Neural Information Processing Systems, vol. 8, D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, Eds. Cambridge, MA: MIT Press, 1996, pp. 1038-1045.
- (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1038-1045
- Sutton, R.S.¹

6
- 34548771972
- Two novel on-policy reinforcement learning algorithms based on TD(λ)-methods
- M. Wiering and H. van Hasselt, "Two novel on-policy reinforcement learning algorithms based on TD(λ)-methods," in Proc. IEEE Int. Symp. Adaptive Dyn. Program. Reinforcement Learn., 2007, pp. 280-287.
- (2007) Proc. IEEE Int. Symp. Adaptive Dyn. Program. Reinforcement Learn , pp. 280-287
- Wiering, M.¹ van Hasselt, H.²

7
- 84898939480
- Policy gradient methods for reinforcement learning with function approximation
- Cambridge, MA: MIT Press
- R. Sutton, D. McAllester, S. Singh, and Y. Mansour, "Policy gradient methods for reinforcement learning with function approximation," in Advances in Neural Information Processing Systems, vol. 12. Cambridge, MA: MIT Press, 2000, pp. 1057-1063.
- (2000) Advances in Neural Information Processing Systems , vol.12 , pp. 1057-1063
- Sutton, R.¹ McAllester, D.² Singh, S.³ Mansour, Y.⁴

8
- 0013535965
- Infinite-horizon policy-gradient estimation
- J. Baxter and P. Bartlett, "Infinite-horizon policy-gradient estimation," J. Artif. Intell. Res., vol. 15, pp. 319-350, 2001.
- (2001) J. Artif. Intell. Res , vol.15 , pp. 319-350
- Baxter, J.¹ Bartlett, P.²

9
- 0027684215
- Prioritized sweeping: Reinforcement learning with less data and less time
- Oct
- A. W. Moore and C. G. Atkeson, "Prioritized sweeping: Reinforcement learning with less data and less time," Mach. Learn., vol. 13, no. 1, pp. 103-130, Oct. 1993.
- (1993) Mach. Learn , vol.13 , Issue.1 , pp. 103-130
- Moore, A.W.¹ Atkeson, C.G.²

10
- 33646398129
- Neural fitted Q iteration - First experiences with a data efficient neural reinforcement learning method
- M. Riedmiller, "Neural fitted Q iteration - First experiences with a data efficient neural reinforcement learning method," in Proc. 16th ECML, 2005, pp. 317-328.
- (2005) Proc. 16th ECML , pp. 317-328
- Riedmiller, M.¹

11
- 0030211964
- Bagging predictors
- Aug
- L. Breiman, "Bagging predictors," Mach. Learn., vol. 24, no. 2, pp. 123-140, Aug. 1996.
- (1996) Mach. Learn , vol.24 , Issue.2 , pp. 123-140
- Breiman, L.¹

12
- 0002978642
- Experiments with a new boosting algorithm
- Y. Freund and R. E. Schapire, "Experiments with a new boosting algorithm," in Proc. 13th Int. Conf. Mach. Learn., 1996, pp. 148-156.
- (1996) Proc. 13th Int. Conf. Mach. Learn , pp. 148-156
- Freund, Y.¹ Schapire, R.E.²

13
- 0001940458
- Adaptive mixtures of local experts
- R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, "Adaptive mixtures of local experts," Neural Comput., vol. 3, no. 1, pp. 79-87, 1991.
- (1991) Neural Comput , vol.3 , Issue.1 , pp. 79-87
- Jacobs, R.A.¹ Jordan, M.I.² Nowlan, S.J.³ Hinton, G.E.⁴

14
- 0001652790
- The efficient learning of multiple task sequences
- J. Moody, S. Hanson, and R. Lippmann, Eds. San Mateo, CA: Morgan Kaufmann
- S. P. Singh, "The efficient learning of multiple task sequences," in Advances in Neural Information Processing Systems, vol. 4, J. Moody, S. Hanson, and R. Lippmann, Eds. San Mateo, CA: Morgan Kaufmann, 1992, pp. 251-258.
- (1992) Advances in Neural Information Processing Systems , vol.4 , pp. 251-258
- Singh, S.P.¹

15
- 0029390263
- Reinforcement learning of multiple tasks using a hierarchical CMAC architecture
- C. Tham, "Reinforcement learning of multiple tasks using a hierarchical CMAC architecture," Robot. Auton. Syst., vol. 15, no. 4, pp. 247-274, 1995.
- (1995) Robot. Auton. Syst , vol.15 , Issue.4 , pp. 247-274
- Tham, C.¹

16
- 0032772352
- Multi-agent reinforcement learning: Weighting and partitioning
- Jun
- R. Sun and T. Peterson, "Multi-agent reinforcement learning: Weighting and partitioning," Neural Netw., vol. 12, no. 4/5, pp. 727-753, Jun. 1999.
- (1999) Neural Netw , vol.12 , Issue.4-5 , pp. 727-753
- Sun, R.¹ Peterson, T.²

17
- 21844465127
- Tree-based batch mode reinforcement learning
- Dec
- D. Ernst, P. Geurts, and L. Wehenkel, "Tree-based batch mode reinforcement learning," J. Mach. Learn. Res., vol. 6, pp. 503-556, Dec. 2005.
- (2005) J. Mach. Learn. Res , vol.6 , pp. 503-556
- Ernst, D.¹ Geurts, P.² Wehenkel, L.³

18
- 34249833101
- Q-learning
- C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, no. 3/4, pp. 279-292, 1992.
- (1992) Mach. Learn , vol.8 , Issue.3-4 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

19
- 33847202724
- Learning to predict by the methods of temporal differences
- Aug
- R. S. Sutton, "Learning to predict by the methods of temporal differences," Mach. Learn., vol. 3, no. 1, pp. 9-44, Aug. 1988.
- (1988) Mach. Learn , vol.3 , Issue.1 , pp. 9-44
- Sutton, R.S.¹

20
- 0016082525
- Learning automata - A survey
- Jul
- K. S. Narendra and M. A. L. Thathachar, "Learning automata - A survey," IEEE Trans. Syst., Man, Cybern., vol. SMC-4, no. 4, pp. 323-334, Jul. 1974.
- (1974) IEEE Trans. Syst., Man, Cybern , vol.SMC-4 , Issue.4 , pp. 323-334
- Narendra, K.S.¹ Thathachar, M.A.L.²

21
- 85153940465
- Generalization in reinforcement learning: Safely approximating the value function
- G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds. Cambridge, MA: MIT Press
- J. A. Boyan and A. W. Moore, "Generalization in reinforcement learning: Safely approximating the value function," in Advances in Neural Information Processing Systems, vol. 7, G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds. Cambridge, MA: MIT Press, 1995, pp. 369-376.
- (1995) Advances in Neural Information Processing Systems , vol.7 , pp. 369-376
- Boyan, J.A.¹ Moore, A.W.²

22
- 0038595393
- Carnegie Mellon Univ, Pittsburgh, PA, Tech. Rep. CMU-CS-95-103
- G. Gordon, "Stable function approximation in dynamic programming," Carnegie Mellon Univ., Pittsburgh, PA, Tech. Rep. CMU-CS-95-103, 1995.
- (1995) Stable function approximation in dynamic programming
- Gordon, G.¹

23
- 0030421566
- Generalized maze navigation: SRN critics solve what feedforward or Hebbian nets cannot
- P. Werbos and X. Pang, "Generalized maze navigation: SRN critics solve what feedforward or Hebbian nets cannot," in Proc. IEEE Int. Conf. Syst., Man, Cybern., 1996, vol. 3, pp. 1764-1769.
- (1996) Proc. IEEE Int. Conf. Syst., Man, Cybern , vol.3 , pp. 1764-1769
- Werbos, P.¹ Pang, X.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.