SCOPUS Information Retrieval Platform

International Journal of Intelligent Systems

Volume 23, Issue 2, 2008, Pages 213-245

Two steps reinforcement learning

(2) Fernández, Fernando (a); Borrajo, Daniel (a)

(a) UNIVERSIDAD CARLOS III DE MADRID (Spain)

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; APPROXIMATION THEORY; DECISION TREES; INTELLIGENT SYSTEMS; STATE SPACE METHODS;

PROTOTYPE CLASSIFIERS; SPACE DISCRETIZATION; VALUE FUNCTIONS;

REINFORCEMENT LEARNING;

EID: 38949129339 PISSN: 08848173 EISSN: 1098111X Source Type: Journal
DOI: 10.1002/int.20255 Document Type: Conference Paper

Times cited : (25)

References (43)

1
- 0004102479
- Cambridge, MA: MIT Press
- Sutton RS, Barto AG. Reinforcement learning: An introduction. Cambridge, MA: MIT Press, 1998.
- (1998) Reinforcement learning: An introduction
- Sutton, R.S.¹ Barto, A.G.²

2
- 0029679044
- Reinforcement learning: A survey
- Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: A survey. Int J Artif Intelli Res 1996;4:237-285.
- (1996) Int J Artif Intelli Res , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

3
- 0004049893
- PhD thesis, Cambridge University, Cambridge, England
- Watkins CJCH. Learning from delayed rewards. PhD thesis, Cambridge University, Cambridge, England, 1989.
- (1989) Learning from delayed rewards
- Watkins, C.J.C.H.¹

4
- 85012688561
- Princeton, NJ. Princeton University Press;
- Bellman R. Dynamic programming. Princeton, NJ. Princeton University Press; 1957.
- (1957) Dynamic programming
- Bellman, R.¹

5
- 0001133021
- Generalization in reinforcement learning: Safely approximating the value function
- Boyan JA, Moore AW. Generalization in reinforcement learning: Safely approximating the value function. Adva Neural Inform Process Syst 1995;7.
- (1995) Adva Neural Inform Process Syst , vol.7
- Boyan, J.A.¹ Moore, A.W.²

6
- 0036832953
- Variable resolution discretization in optimal control
- Munos R, Moore A. Variable resolution discretization in optimal control. Machine Learning 2002;49(2/3):291-323.
- (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 291-323
- Munos, R.¹ Moore, A.²

7
- 0031231885
- Experiments with reinforcement learning in problems with continuous state and action spaces
- Santamaría JC, Sutton RS, Ram A. Experiments with reinforcement learning in problems with continuous state and action spaces. Adaptive Behavior 1998;6(2): 163-218.
- (1998) Adaptive Behavior , vol.6 , Issue.2 , pp. 163-218
- Santamaría, J.C.¹ Sutton, R.S.² Ram, A.³

8
- 0034274415
- A study of reinforcement learning in the continuous case by the means of viscosity solutions
- Munos R. A study of reinforcement learning in the continuous case by the means of viscosity solutions. Machine Learning 1999;40:265-299.
- (1999) Machine Learning , vol.40 , pp. 265-299
- Munos, R.¹

9
- 25944467789
- On determinism handling while learning reduced state space representations
- Lyon France, July
- Fernández F, Borrajo D. On determinism handling while learning reduced state space representations. In: Proc Eur Conf on Artificial Intelligence (ECAI 2002), Lyon (France); July 2002. pp 280-284.
- (2002) Proc Eur Conf on Artificial Intelligence (ECAI 2002) , pp. 280-284
- Fernández, F.¹ Borrajo, D.²

10
- 0002267046
- Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued spaces
- Moore AW. Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued spaces. In: Proc Eighth Int Machine Learning Workshop, 1991.
- (1991) Proc Eighth Int Machine Learning Workshop
- Moore, A.W.¹

11
- 13244294436
- PhD thesis, University of Birmingham
- Reynolds SI. Reinforcement learning with exploration. PhD thesis, University of Birmingham, 2002.
- (2002) Reinforcement learning with exploration
- Reynolds, S.I.¹

12
- 21244478586
- Variable resolution discretization in the joint space
- Monson CK, Wingate D, Seppi KD, Peterson TS. Variable resolution discretization in the joint space. In: Proc Int Conf on Machine Learning and Applications (ICMLA 2004), 2004.
- (2004) Proc Int Conf on Machine Learning and Applications (ICMLA 2004)
- Monson, C.K.¹ Wingate, D.² Seppi, K.D.³ Peterson, T.S.⁴

13
- 0029514510
- The parti game algorithm for variable resolution reinforcement learning in multidimensional state-spaces
- Moore AW, Atkeson CG. The parti game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. Machine Learning 1995;21(3):199-233.
- (1995) Machine Learning , vol.21 , Issue.3 , pp. 199-233
- Moore, A.W.¹ Atkeson, C.G.²

14
- 23144434851
- PhD thesis, Carnegie Mellon University, August
- Uther WTB. Tree based hierarchical reinforcement learning. PhD thesis, Carnegie Mellon University, August 2002.
- (2002) Tree based hierarchical reinforcement learning
- Uther, W.T.B.¹

15
- 0003487482
- Belmont, MA: Athena Scientific;
- Bertsekas DP, Tsitsiklis JN. Neuro-Dynamic programming. Belmont, MA: Athena Scientific; 1996.
- (1996) Neuro-Dynamic programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

16
- 0029752470
- Feature-based methods for large scale dynamic programming
- Tsitsiklis JN, Van Roy B. Feature-based methods for large scale dynamic programming. Machine Learning, 1996;22:59-94.
- (1996) Machine Learning , vol.22 , pp. 59-94
- Tsitsiklis, J.N.¹ Van Roy, B.²

17
- 0003208321
- Gradient descent for general reinforcement learning
- Baird LC. Gradient descent for general reinforcement learning. Neural Infor Process Syst 1998; 11.
- (1998) Neural Infor Process Syst , vol.11
- Baird, L.C.¹

18
- 0001046225
- Practical issues in temporal difference learning
- Tesauro G. Practical issues in temporal difference learning. Machine Learning, 1992;8:257-277.
- (1992) Machine Learning , vol.8 , pp. 257-277
- Tesauro, G.¹

19
- 0004267735
- Boston, MA: Kluwer Academic Publishers;
- Aha D. Lazy learning. Boston, MA: Kluwer Academic Publishers; 1997.
- (1997) Lazy learning
- Aha, D.¹

20
- 0001898381
- Practical reinforcement learning in continuous spaces
- Smart WD, Kaelbling LP. Practical reinforcement learning in continuous spaces. In: Proc Int Conf of Machine Learning, 2000; pp 903-907.
- (2000) Proc Int Conf of Machine Learning , pp. 903-907
- Smart, W.D.¹ Kaelbling, L.P.²

21
- 0031341345
- Neural reinforcement learning for behaviour synthesis
- Touzet C. Neural reinforcement learning for behaviour synthesis. Robotics and Auton Syst, 1997;22:251-281.
- (1997) Robotics and Auton Syst , vol.22 , pp. 251-281
- Touzet, C.¹

22
- 84944872843
- VQQL. Applying vector quantization to reinforcement learning
- RoboCup-99: Robot Soccer World Cup III, Springer Verlag; Berlin;
- Fernández F, Borrajo D. VQQL. Applying vector quantization to reinforcement learning. In: RoboCup-99: Robot Soccer World Cup III, Lecture Notes in Artificial Intelligence, vol 1856 Springer Verlag; Berlin; 2000. pp 292-303.
- (2000) Lecture Notes in Artificial Intelligence , vol.1856 , pp. 292-303
- Fernández, F.¹ Borrajo, D.²

23
- 5644261272
- Learning in large cooperative multi-robot domains
- Fernández F, Parker L. Learning in large cooperative multi-robot domains. Int J Robot Automation 2001;16(4):217-226.
- (2001) Int J Robot Automation , vol.16 , Issue.4 , pp. 217-226
- Fernández, F.¹ Parker, L.²

24
- 0020102027
- Least squares quantization in PCM
- Lloyd SP. Least squares quantization in PCM. IEEE Trans Infor Theory 1982;28:127-135.
- (1982) IEEE Trans Infor Theory , vol.28 , pp. 127-135
- Lloyd, S.P.¹

25
- 0003959189
- Boston, MA: Kluwer Academic Publishers
- Gersho A, Gray RM. Vector quantization and signal compression. Boston, MA: Kluwer Academic Publishers, 1992.
- (1992) Vector quantization and signal compression
- Gersho, A.¹ Gray, R.M.²

26
- 84880668656
- The Robocup synthetic agent challenge
- San Francisco, CA
- Kitano H, Tambe M, Stone P, Veloso M, Coradeschi S, Osawa H, Matsubara H, Noda I, Asada M. The Robocup synthetic agent challenge. In: Proc Fifteenth Int Joint Conference on Artificial Intelligence (IJCAI97), San Francisco, CA, 1997. pp 24-49.
- (1997) Proc Fifteenth Int Joint Conference on Artificial Intelligence (IJCAI97) , pp. 24-49
- Kitano, H.¹ Tambe, M.² Stone, P.³ Veloso, M.⁴ Coradeschi, S.⁵ Osawa, H.⁶ Matsubara, H.⁷ Noda, I.⁸ Asada, M.⁹

27
- 0036573011
- Distributed algorithms for multi-robot observation of multiple moving targets
- Parker LE. Distributed algorithms for multi-robot observation of multiple moving targets. Auton Robots 2002; 12(3):231-255.
- (2002) Auton Robots , vol.12 , Issue.3 , pp. 231-255
- Parker, L.E.¹

28
- 38949213221
- Techniques for learning in multi-robot teams
- Parker LE, Touzet C, Fernández F. Techniques for learning in multi-robot teams. In: Robot teams: From diversity to polymorphism. A. K. Peters Publishers, 2002; pp 191-236.

29
- 29344446348
- A reinforcement learning algorithm in cooperative multirobot domains
- Fernández F, Borrajo D, Parker L. A reinforcement learning algorithm in cooperative multirobot domains. J Intel Robot Syst 2005;43(2-4):161-174.
- (2005) J Intel Robot Syst , vol.43 , Issue.2-4 , pp. 161-174
- Fernández, F.¹ Borrajo, D.² Parker, L.³

30
- 21844465127
- Tree-based batch mode reinforcement learning
- Ernst D. Tree-based batch mode reinforcement learning. J Machine Learning Res 2005;6:503-556.
- (2005) J Machine Learning Res , vol.6 , pp. 503-556
- Ernst, D.¹

31
- 22944476070
- PhD thesis, Graduate Division of the University of California at Berkeley
- Forbes JRN. Reinforcement learning for autonomous vehicles. PhD thesis, Graduate Division of the University of California at Berkeley, 2002.
- (2002) Reinforcement learning for autonomous vehicles
- Forbes, J.R.N.¹

32
- 0042312608
- Feature weighting in k-means clustering
- Modha DS, Spangler WS. Feature weighting in k-means clustering. Machine Learning 2003;52:217-237.
- (2003) Machine Learning , vol.52 , pp. 217-237
- Modha, D.S.¹ Spangler, W.S.²

33
- 0004090962
- PhD thesis, Department of Computer Science at Brown University, Providence, RI, May
- Smart WD. Making reinforcement learning work on real robots. PhD thesis, Department of Computer Science at Brown University, Providence, RI, May 2002.
- (2002) Making reinforcement learning work on real robots
- Smart, W.D.¹

34
- 33744584654
- Induction of decision trees
- Quinlan JR. Induction of decision trees. Machine Learning 1986;1(1):81-106.
- (1986) Machine Learning , vol.1 , Issue.1 , pp. 81-106
- Quinlan, J.R.¹

35
- 0002129041
- Generating accurate rule sets without global optimization
- Frank E, Witten IH. Generating accurate rule sets without global optimization. In: Proc Fifteenth Int Conf on Machine Learning 1998.
- (1998) Proc Fifteenth Int Conf on Machine Learning
- Frank, E.¹ Witten, I.H.²

36
- 34047098195
- Data mining
- San Francisco, CA Morgan Kaufmann;
- Witten IH, Frank E. Data mining. Practical machine learning tools and techniques with Java implementations. San Francisco, CA Morgan Kaufmann; 2000.
- (2000) Data mining. Practical machine learning tools and techniques with Java implementations
- Witten, I.H.¹ Frank, E.²

37
- 0346046894
- Automatic finding of good classifiers following a biologically inspired metaphor
- Fernández F, Isasi P. Automatic finding of good classifiers following a biologically inspired metaphor. Comput Inform 2002; 21(3):205-220.
- (2002) Comput Inform , vol.21 , Issue.3 , pp. 205-220
- Fernández, F.¹ Isasi, P.²

38
- 3142672346
- Evolutionary design of nearest prototype classifiers
- Fernández F, Isasi P. Evolutionary design of nearest prototype classifiers. J Heuristics 2004; 10(4):431-454.
- (2004) J Heuristics , vol.10 , Issue.4 , pp. 431-454
- Fernández, F.¹ Isasi, P.²

39
- 0242536865
- Adaptive resolution model-free reinforcement learning: Decision boundary partitioning
- Reynolds SI. Adaptive resolution model-free reinforcement learning: Decision boundary partitioning. In: Proc Int Conf Machine Learning 2000; pp 783-790.
- (2000) Proc Int Conf Machine Learning , pp. 783-790
- Reynolds, S.I.¹

40
- 0042353224
- Multigrid Q-learning
- Technical report, Colorado State University, Boulder, CO
- Anderson C, Crawford-Hines S. Multigrid Q-learning. Technical report, Colorado State University, Boulder, CO, 1994.
- (1994) Multigrid Q-learning
- Anderson, C.¹ Crawford-Hines, S.²

41
- 0025484857
- Numerical methods for stochastic control problems in continuous time
- Kushner HJ. Numerical methods for stochastic control problems in continuous time. SIAM J. Control Optim 28, 1990.
- (1990) SIAM J. Control Optim , vol.28
- Kushner, H.J.¹

42
- 10044297591
- K-d decision tree: An accelerated and memory efficient nearest neighbor classifier
- Shibata T, Kato T, Wada T. K-d decision tree: An accelerated and memory efficient nearest neighbor classifier. In: Proc Third IEEE Int Conf on Data Mining 2003; pp 641-644.
- (2003) Proc Third IEEE Int Conf on Data Mining , pp. 641-644
- Shibata, T.¹ Kato, T.² Wada, T.³

43
- 0035312760
- Relational reinforcement learning
- Dzeroski S, De Raedt L, Driessens K. Relational reinforcement learning. Machine Learning 43, 2001.
- (2001) Machine Learning , vol.43
- Dzeroski, S.¹ De Raedt, L.² Driessens, K.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.