SCOPUS Information Retrieval Platform

Journal of Heuristics

Volume 14, Issue 2, 2008, Pages 135-168

Accelerating autonomous learning by using heuristic selection of actions

(3) Bianchi, Reinaldo A C (a); Ribeiro, Carlos H C (b); Costa, Anna H R (c)

a CENTRO UNIVERSITÁRIO DA FEI (Brazil)

b INSTITUTO TECNOLÓGICO DE AERONÁUTICA (Brazil)

c UNIVERSITY OF SÃO PAULO (Brazil)

Author keywords

Action selection; Heuristic function; Reinforcement learning; Robot navigation

Indexed keywords

CONVERGENCE OF NUMERICAL METHODS; HEURISTIC METHODS; MOBILE ROBOTS; REINFORCEMENT LEARNING; ROBOTICS;

CONTROL POLICIES; HEURISTIC SELECTION; LEARNING PROCESSES; ROBOT NAVIGATION;

LEARNING ALGORITHMS;

EID: 41249102188
PISSN: 13811231
EISSN: 15729397
Source Type: Journal
DOI: 10.1007/s10732-007-9031-5
Document Type: Article

Times cited : (72)

References (29)

1
- 0016555419
- Data storage in the cerebellar model articulation controller (CMAC)
- Albus, J.S.: Data storage in the cerebellar model articulation controller (CMAC). J. Dyn. Syst. Meas. Control 97, 228-233 (1975)
- (1975) J. Dyn. Syst. Meas. Control , vol.97 , pp. 228-233
- Albus, J.S.¹

2
- 0003565779
- Prentice-Hall Upper Saddle River
- Bertsekas, D.P.: Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall, Upper Saddle River (1987)
- (1987) Dynamic Programming: Deterministic and Stochastic Models
- Bertsekas, D.P.¹

3
- 0003565783
- Athena Scientific Belmont
- Bertsekas, D.P.: Dynamic Programming and Optimal Control, vol. 1. Athena Scientific, Belmont (1995)
- (1995) Dynamic Programming and Optimal Control, Vol. 1
- Bertsekas, D.P.¹

4
- 79952394323
- Ph.D. thesis, University of São Paulo
- Bianchi, R.A.C.: Using heuristics to accelerate reinforcement learning algorithms (in Portuguese). Ph.D. thesis, University of São Paulo (2004)
- (2004) Using Heuristics to Accelerate Reinforcement Learning Algorithms (In Portuguese)
- Bianchi, R.A.C.¹

5
- 0034612523
- Inspiration for optimization from social insect behaviour
- Bonabeau, E., Dorigo, M., Theraulaz, G.: Inspiration for optimization from social insect behaviour. Nature 406 (2000)
- (2000) Nature , vol.406
- Bonabeau, E.¹ Dorigo, M.² Theraulaz, G.³

6
- 23144451490
- State value learning with an anticipatory learning classifier system in a Markov decision process
- Butz, M.V.: State value learning with an anticipatory learning classifier system in a Markov decision process. Technical report 2002018 at the Illinois Genetic Algorithms Laboratory (2002)
- (2002) Technical Report 2002018 at the Illinois Genetic Algorithms Laboratory
- Butz, M.V.¹

7
- 0043247546
- Accelerating reinforcement learning by composing solutions of automatically identified subtasks
- Drummond, C.: Accelerating reinforcement learning by composing solutions of automatically identified subtasks. J. Artif. Intell. Res. 16, 59-104 (2002)
- (2002) J. Artif. Intell. Res. , vol.16 , pp. 59-104
- Drummond, C.¹

8
- 0024684020
- Using occupancy grids for mobile robot perception and navigation
- Elfes, A.: Using occupancy grids for mobile robot perception and navigation. Computer 22, 46-57 (1989)
- (1989) Computer , vol.22 , pp. 46-57
- Elfes, A.¹

9
- 0036832959
- Structure in the space of value functions
- Foster, D., Dayan, P.: Structure in the space of value functions. Mach. Learn. 49, 325-346 (2002)
- (2002) Mach. Learn. , vol.49 , pp. 325-346
- Foster, D.¹ Dayan, P.²

10
- 0000016031
- Markov localization for mobile robots in dynamic environments
- Fox, D., Burgard, W., Thrun, S.: Markov localization for mobile robots in dynamic environments. J. Artif. Intell. Res. 11, 391-427 (1999)
- (1999) J. Artif. Intell. Res. , vol.11 , pp. 391-427
- Fox, D.¹ Burgard, W.² Thrun, S.³

11
- 84899829959
- A formal basis for the heuristic determination of minimum cost paths
- Hart, P.E., Nilsson, N.J., Raphael, B.: A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 4, 100-107 (1968)
- (1968) IEEE Trans. Syst. Sci. Cybern. , vol.4 , pp. 100-107
- Hart, P.E.¹ Nilsson, N.J.² Raphael, B.³

12
- 0029679044
- Reinforcement learning: A survey
- Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: a survey. J. Artif. Intell. Res. 4, 237-285 (1996)
- (1996) J. Artif. Intell. Res. , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

13
- 0006473335
- The Saphira architecture for autonomous mobile robots
- MIT Cambridge
- Konolige, K., Myers, K.: The Saphira architecture for autonomous mobile robots. In: AI-based Mobile Robots: Case Studies of Successful Robot Systems. MIT, Cambridge (1996)
- (1996) AI-based Mobile Robots: Case Studies of Successful Robot Systems
- Konolige, K.¹ Myers, K.²

14
- 0036832960
- Continuous-action Q-learning
- Millan, J.R., Posenato, D., Dedieu, E.: Continuous-action Q-learning. Mach. Learn. 49, 247-266 (2002)
- (2002) Mach. Learn. , vol.49 , pp. 247-266
- Millan, J.R.¹ Posenato, D.² Dedieu, E.³

15
- 0004255908
- McGraw-Hill New York
- Mitchell, T.: Machine Learning. McGraw-Hill, New York (1997)
- (1997) Machine Learning
- Mitchell, T.¹

16
- 0027684215
- Prioritized sweeping: Reinforcement learning with less data and less time
- Moore, A.W., Atkeson, C.G.: Prioritized sweeping: reinforcement learning with less data and less time. Mach. Learn. 13, 103-130 (1993)
- (1993) Mach. Learn. , vol.13 , pp. 103-130
- Moore, A.W.¹ Atkeson, C.G.²

17
- 0036832953
- Variable resolution discretization in optimal control
- Munos, R., Moore, A.W.: Variable resolution discretization in optimal control. Mach. Learn. 49, 291-323 (2002)
- (2002) Mach. Learn. , vol.49 , pp. 291-323
- Munos, R.¹ Moore, A.W.²

18
- 84977063352
- Efficient learning and planning within the dyna framework
- Peng, J., Williams, R.J.: Efficient learning and planning within the dyna framework. Adapt. Behav. 1, 437-454 (1993)
- (1993) Adapt. Behav. , vol.1 , pp. 437-454
- Peng, J.¹ Williams, R.J.²

19
- 0003998452
- Wiley New York
- Puterman, M.L.: Markovian Decision Problems. Wiley, New York (1994)
- (1994) Markovian Decision Problems
- Puterman, M.L.¹

20
- 0003636089
- On-line Q-learning using connectionist systems
- Cambridge University Engineering Department
- Rummery, G., Niranjan, M.: On-line Q-learning using connectionist systems. Technical report CUED/F-INFENG/TR 166, Cambridge University Engineering Department (1994)
- (1994) Technical Report CUED/F-INFENG/TR 166
- Rummery, G.¹ Niranjan, M.²

21
- 0003584577
- 2nd edn. Prentice-Hall Englewood Cliffs
- Russell, S., Norvig, P.: Artificial Intelligence: a Modern Approach, 2nd edn. Prentice-Hall, Englewood Cliffs (2002)
- (2002) Artificial Intelligence: A Modern Approach
- Russell, S.¹ Norvig, P.²

22
- 0004234658
- McGraw-Hill New York
- Spiegel, M.R.: Probability and Statistics. McGraw-Hill, New York (1975)
- (1975) Probability and Statistics
- Spiegel, M.R.¹

23
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton, R.S.: Learning to predict by the methods of temporal differences. Mach. Learn. 3, 9-44 (1988)
- (1988) Mach. Learn. , vol.3 , pp. 9-44
- Sutton, R.S.¹

24
- 85132026293
- Integrated architectures for learning, planning and reacting based on approximating dynamic programming
- Morgan Kaufmann Austin
- Sutton, R.S.: Integrated architectures for learning, planning and reacting based on approximating dynamic programming. In: Proceedings of the 7th International Conference on Machine Learning. Morgan Kaufmann, Austin (1990)
- (1990) Proceedings of the 7th International Conference on Machine Learning
- Sutton, R.S.¹

25
- 85156221438
- Generalization in reinforcement learning: Successful examples using sparse coarse coding
- Sutton, R.S.: Generalization in reinforcement learning: successful examples using sparse coarse coding. Adv. Neural. Inf. Process. Syst. 8, 1038-1044 (1996)
- (1996) Adv. Neural. Inf. Process. Syst. , vol.8 , pp. 1038-1044
- Sutton, R.S.¹

26
- 0008876345
- Ph.D. thesis, Jozsef Attila University, Szeged, Hungary
- Szepesvári, C.: Static and dynamic aspects of optimal sequential decision making. Ph.D. thesis, Jozsef Attila University, Szeged, Hungary (1997)
- (1997) Static and Dynamic Aspects of Optimal Sequential Decision Making
- Szepesvári, C.¹

27
- 0003629453
- CS-96-11, Brown University, Department of Computer Science, Providence, RI
- Szepesvári, C., Littman, M.: Generalized Markov decision processes: dynamic-programming and reinforcement-learning algorithms. CS-96-11, Brown University, Department of Computer Science, Providence, RI (1996)
- (1996) Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms
- Szepesvári, C.¹ Littman, M.²

28
- 0035336711
- Robust Monte Carlo localization for mobile robots
- Thrun, S., Fox, D., Burgard, W., Dellaert, F.: Robust Monte Carlo localization for mobile robots. Artif. Intell. 128, 99-141 (2001)
- (2001) Artif. Intell. , vol.128 , pp. 99-141
- Thrun, S.¹ Fox, D.² Burgard, W.³ Dellaert, F.⁴

29
- 0004049893
- Ph.D. thesis, University of Cambridge
- Watkins, C.J.C.H.: Learning from delayed rewards. Ph.D. thesis, University of Cambridge (1989)
- (1989) Learning from Delayed Rewards
- Watkins, C.J.C.H.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.