Volume 14, Issue 2, 2008, Pages 135-168

Accelerating autonomous learning by using heuristic selection of actions

Author keywords

Action selection; Heuristic function; Reinforcement learning; Robot navigation

Indexed keywords

CONVERGENCE OF NUMERICAL METHODS; HEURISTIC METHODS; MOBILE ROBOTS; REINFORCEMENT LEARNING; ROBOTICS;

EID: 41249102188     PISSN: 13811231     EISSN: 15729397     Source Type: Journal    
DOI: 10.1007/s10732-007-9031-5     Document Type: Article
Times cited: 72

References (29)
  • 1
    • 0016555419
    • Data storage in the cerebellar model articulation controller (CMAC)
    • Albus, J.S.: Data storage in the cerebellar model articulation controller (CMAC). J. Dyn. Syst. Meas. Control 97, 228-233 (1975)
    • (1975) J. Dyn. Syst. Meas. Control , vol.97 , pp. 228-233
    • Albus, J.S.1
  • 5
    • 0034612523
    • Inspiration for optimization from social insect behaviour
    • Bonabeau, E., Dorigo, M., Theraulaz, G.: Inspiration for optimization from social insect behaviour. Nature 406 (2000)
    • (2000) Nature , vol.406
    • Bonabeau, E.1    Dorigo, M.2    Theraulaz, G.3
  • 7
    • 0043247546
    • Accelerating reinforcement learning by composing solutions of automatically identified subtasks
    • Drummond, C.: Accelerating reinforcement learning by composing solutions of automatically identified subtasks. J. Artif. Intell. Res. 16, 59-104 (2002)
    • (2002) J. Artif. Intell. Res. , vol.16 , pp. 59-104
    • Drummond, C.1
  • 8
    • 0024684020
    • Using occupancy grids for mobile robot perception and navigation
    • Elfes, A.: Using occupancy grids for mobile robot perception and navigation. Computer 22, 46-57 (1989)
    • (1989) Computer , vol.22 , pp. 46-57
    • Elfes, A.1
  • 9
    • 0036832959
    • Structure in the space of value functions
    • Foster, D., Dayan, P.: Structure in the space of value functions. Mach. Learn. 49, 325-346 (2002)
    • (2002) Mach. Learn. , vol.49 , pp. 325-346
    • Foster, D.1    Dayan, P.2
  • 10
    • 0000016031
    • Markov localization for mobile robots in dynamic environments
    • Fox, D., Burgard, W., Thrun, S.: Markov localization for mobile robots in dynamic environments. J. Artif. Intell. Res. 11, 391-427 (1999)
    • (1999) J. Artif. Intell. Res. , vol.11 , pp. 391-427
    • Fox, D.1    Burgard, W.2    Thrun, S.3
  • 11
    • 84899829959
    • A formal basis for the heuristic determination of minimum cost paths
    • Hart, P.E., Nilsson, N.J., Raphael, B.: A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 4, 100-107 (1968)
    • (1968) IEEE Trans. Syst. Sci. Cybern. , vol.4 , pp. 100-107
    • Hart, P.E.1    Nilsson, N.J.2    Raphael, B.3
  • 16
    • 0027684215
    • Prioritized sweeping: Reinforcement learning with less data and less time
    • Moore, A.W., Atkeson, C.G.: Prioritized sweeping: reinforcement learning with less data and less time. Mach. Learn. 13, 103-130 (1993)
    • (1993) Mach. Learn. , vol.13 , pp. 103-130
    • Moore, A.W.1    Atkeson, C.G.2
  • 17
    • 0036832953
    • Variable resolution discretization in optimal control
    • Munos, R., Moore, A.W.: Variable resolution discretization in optimal control. Mach. Learn. 49, 291-323 (2002)
    • (2002) Mach. Learn. , vol.49 , pp. 291-323
    • Munos, R.1    Moore, A.W.2
  • 18
    • 84977063352
    • Efficient learning and planning within the dyna framework
    • Peng, J., Williams, R.J.: Efficient learning and planning within the dyna framework. Adapt. Behav. 1, 437-454 (1993)
    • (1993) Adapt. Behav. , vol.1 , pp. 437-454
    • Peng, J.1    Williams, R.J.2
  • 20
    • 0003636089
    • On-line Q-learning using connectionist systems
    • Cambridge University Engineering Department
    • Rummery, G., Niranjan, M.: On-line Q-learning using connectionist systems. Technical report CUED/F-INFENG/TR 166, Cambridge University Engineering Department (1994)
    • (1994) Technical Report CUED/F-INFENG/TR 166
    • Rummery, G.1    Niranjan, M.2
  • 23
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R.S.: Learning to predict by the methods of temporal differences. Mach. Learn. 3, 9-44 (1988)
    • (1988) Mach. Learn. , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 24
    • 85132026293
    • Integrated architectures for learning, planning and reacting based on approximating dynamic programming
    • Morgan Kaufmann Austin
    • Sutton, R.S.: Integrated architectures for learning, planning and reacting based on approximating dynamic programming. In: Proceedings of the 7th International Conference on Machine Learning. Morgan Kaufmann, Austin (1990)
    • (1990) Proceedings of the 7th International Conference on Machine Learning
    • Sutton, R.S.1
  • 25
    • 85156221438
    • Generalization in reinforcement learning: Successful examples using sparse coarse coding
    • Sutton, R.S.: Generalization in reinforcement learning: successful examples using sparse coarse coding. Adv. Neural. Inf. Process. Syst. 8, 1038-1044 (1996)
    • (1996) Adv. Neural. Inf. Process. Syst. , vol.8 , pp. 1038-1044
    • Sutton, R.S.1
  • 28
    • 0035336711
    • Robust Monte Carlo localization for mobile robots
    • Thrun, S., Fox, D., Burgard, W., Dellaert, F.: Robust Monte Carlo localization for mobile robots. Artif. Intell. 128, 99-141 (2001)
    • (2001) Artif. Intell. , vol.128 , pp. 99-141
    • Thrun, S.1    Fox, D.2    Burgard, W.3    Dellaert, F.4


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.