Adaptive Behavior, Volume 15, Issue 1, 2007, Pages 33-50

Empirical studies in action selection with reinforcement learning

Author keywords

Autonomic computing; Evolutionary computation; Neural networks; Reinforcement learning; Robot soccer; Temporal difference methods

EID: 33847264400     PISSN: 1059-7123     EISSN: 1741-2633     Source Type: Journal
DOI: 10.1177/1059712306076253     Document Type: Article
Times cited: 40

References (60)
  • 2. Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multi-armed bandit problem. Machine Learning, 47, 235-256.
  • 3. Baird, L., & Moore, A. (1999). Gradient descent for general reinforcement learning. In M. S. Kearns, S. A. Solla, & D. A. Cohn (Eds.), Advances in neural information processing systems 11. Cambridge, MA: MIT Press.
  • 4. Baldwin, J. M. (1896). A new factor in evolution. The American Naturalist, 30, 441-451.
  • 6. Bellman, R. E. (1956). A problem in the sequential design of experiments. Sankhya, 16, 221-229.
  • 8. Boyan, J. A., & Moore, A. W. (1995). Generalization in reinforcement learning: Safely approximating the value function. In G. Tesauro, D. S. Touretzky, & T. K. Leen (Eds.), Advances in neural information processing systems 7. Cambridge, MA: MIT Press.
  • 9. Crites, R. H., & Barto, A. G. (1998). Elevator group control using multiple reinforcement learning agents. Machine Learning, 33, 235-262.
  • 14. Hinton, G. E., & Nowlan, S. J. (1987). How learning can guide evolution. Complex Systems, 1, 495-502.
  • 18. Kephart, J. O., & Chess, D. M. (2003). The vision of autonomic computing. Computer, 36, 41-50.
  • 20. Konda, V. R., & Tsitsiklis, J. N. (1999). Actor-critic algorithms. In M. S. Kearns, S. A. Solla, & D. A. Cohn (Eds.), Advances in neural information processing systems 11 (pp. 1008-1014). Cambridge, MA: MIT Press.
  • 23. Lin, L.-J. (1992). Self-improving reactive agents based on reinforcement learning, planning, and teaching. Machine Learning, 8, 293-321.
  • 27. Moriarty, D. E., & Miikkulainen, R. (1996). Efficient reinforcement learning through symbiotic evolution. Machine Learning, 22, 11-32.
  • 30. Nolfi, S., Elman, J. L., & Parisi, D. (1994). Learning and evolution in neural networks. Adaptive Behavior, 2, 5-28.
  • 32. Powell, M. J. D. (1987). Radial basis functions for multivariate interpolation: A review. In J. C. Mason & M. G. Cox (Eds.), Algorithms for approximation (pp. 143-167). Oxford: Clarendon Press.
  • 35. Riedmiller, M. (2005). Neural fitted Q iteration - First experiences with a data efficient neural reinforcement learning method. In Proceedings of the 16th European Conference on Machine Learning (pp. 317-328). Porto, Portugal.
  • 38. Runarsson, T. P., & Lucas, S. M. (2005). Co-evolution versus self-play temporal difference learning for acquiring position evaluation in small-board Go. IEEE Transactions on Evolutionary Computation, 9, 628-640.
  • 40. Singh, S. P., & Sutton, R. S. (1996). Reinforcement learning with replacing eligibility traces. Machine Learning, 22, 123-158.
  • 42. Stagge, P. (1998). Averaging efficiently in the presence of noise. In Parallel problem solving from nature (Vol. 5, pp. 188-197). Amsterdam, the Netherlands.
  • 44. Stanley, K. O., & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10, 99-127.
  • 47. Stone, P., Sutton, R. S., & Kuhlmann, G. (2005). Reinforcement learning for RoboCup-soccer keepaway. Adaptive Behavior, 13, 165-188.
  • 48. Stone, P., Kuhlmann, G., Taylor, M. E., & Liu, Y. (2006). Keepaway soccer: From machine learning testbed to benchmark. In I. Noda, A. Jacoff, A. Bredenfeld, & Y. Takahashi (Eds.), RoboCup-2005: Robot Soccer World Cup IX (Vol. 4020, pp. 93-105). Berlin: Springer-Verlag.
  • 49. Sutton, R. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
  • 50. Sutton, R., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In S. A. Solla, T. K. Leen, & K.-R. Muller (Eds.), Advances in neural information processing systems (Vol. 12, pp. 1057-1063). Cambridge, MA: MIT Press.
  • 51. Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. In D. S. Touretzky, M. C. Mozer, & M. E. Hasselmo (Eds.), Advances in neural information processing systems 8 (pp. 1038-1044). Cambridge, MA: MIT Press.
  • 56. Whiteson, S., & Stone, P. (2006a). Evolutionary function approximation for reinforcement learning. Journal of Machine Learning Research, 7, 877-917.
  • 58. Whiteson, S., Kohl, N., Miikkulainen, R., & Stone, P. (2005). Evolving keepaway soccer players through task decomposition. Machine Learning, 59, 5-30.
  • 59. Whitley, D., Dominic, S., Das, R., & Anderson, C. W. (1993). Genetic reinforcement learning for neurocontrol problems. Machine Learning, 13, 259-284.
  • 60. Yao, X. (1999). Evolving artificial neural networks. Proceedings of the IEEE, 87, 1423-1447.


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.