Volume 1, 2009, Pages 554-561

An empirical analysis of value function-based and policy search reinforcement learning

Author keywords

Function approximation; Policy search; Reinforcement learning; Temporal difference learning

Indexed keywords

MULTI AGENT SYSTEMS; REINFORCEMENT LEARNING

EID: 84899831232     PISSN: 15488403     EISSN: 15582914     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 16

References (23)
  • 2
    • 0003787146
    • Princeton University Press, Princeton, NJ, June
    • R. E. Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, June 1957.
    • (1957) Dynamic Programming
    • Bellman, R.E.1
  • 3
    • 85162049326
    • Incremental natural actor-critic algorithms
    • J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, MIT Press, Cambridge, MA
    • S. Bhatnagar, R. Sutton, M. Ghavamzadeh, and M. Lee. Incremental natural actor-critic algorithms. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 105-112. MIT Press, Cambridge, MA, 2008.
    • (2008) Advances in Neural Information Processing Systems , vol.20 , pp. 105-112
    • Bhatnagar, S.1    Sutton, R.2    Ghavamzadeh, M.3    Lee, M.4
  • 5
    • 85156187730
    • Improving elevator performance using reinforcement learning
    • D. S. Touretzky, M. Mozer, and M. E. Hasselmo, editors, NIPS, Denver, CO, November 27-30, 1995, MIT Press
    • R. H. Crites and A. G. Barto. Improving elevator performance using reinforcement learning. In D. S. Touretzky, M. Mozer, and M. E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, NIPS, Denver, CO, November 27-30, 1995, pages 1017-1023. MIT Press, 1996.
    • (1996) Advances in Neural Information Processing Systems , vol.8 , pp. 1017-1023
    • Crites, R.H.1    Barto, A.G.2
  • 7
    • 33646243319
    • A natural policy gradient
    • T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, MIT Press
    • S. Kakade. A natural policy gradient. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, pages 1531-1538. MIT Press, 2001.
    • (2001) Advances in Neural Information Processing Systems , vol.14 , pp. 1531-1538
    • Kakade, S.1
  • 10
    • 0012327484
    • Using eligibility traces to find the best memoryless policy in partially observable Markov decision processes
    • Morgan Kaufmann
    • J. Loch and S. Singh. Using eligibility traces to find the best memoryless policy in partially observable Markov decision processes. In Proceedings of the Fifteenth International Conference on Machine Learning, pages 323-331. Morgan Kaufmann, 1998.
    • (1998) Proceedings of the Fifteenth International Conference on Machine Learning , pp. 323-331
    • Loch, J.1    Singh, S.2
  • 11
    • 84898980684
    • Autonomous helicopter flight via reinforcement learning
    • S. Thrun, L. Saul, and B. Schölkopf, editors, MIT Press, Cambridge, MA
    • A. Y. Ng, H. J. Kim, M. I. Jordan, and S. Sastry. Autonomous helicopter flight via reinforcement learning. In S. Thrun, L. Saul, and B. Schölkopf, editors, Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA, 2004.
    • (2004) Advances in Neural Information Processing Systems , vol.16
    • Ng, A.Y.1    Kim, H.J.2    Jordan, M.I.3    Sastry, S.4
  • 12
    • 84898960655
    • A convergent form of approximate policy iteration
    • S. Becker, S. Thrun, and K. Obermayer, editors, MIT Press, Cambridge, MA
    • T. J. Perkins and D. Precup. A convergent form of approximate policy iteration. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 1595-1602. MIT Press, Cambridge, MA, 2003.
    • (2003) Advances in Neural Information Processing Systems , vol.15 , pp. 1595-1602
    • Perkins, T.J.1    Precup, D.2
  • 13
    • 40649106649
    • Natural actor-critic
    • J. Peters and S. Schaal. Natural actor-critic. Neurocomputing, 71 (7-9): 1180-1190, 2008.
    • (2008) Neurocomputing , vol.71 , Issue.7-9 , pp. 1180-1190
    • Peters, J.1    Schaal, S.2
  • 16
    • 27544506565
    • Reinforcement learning for RoboCup-soccer keepaway
    • P. Stone, R. S. Sutton, and G. Kuhlmann. Reinforcement learning for RoboCup-soccer keepaway. Adaptive Behavior, 13(3):165-188, 2005.
    • (2005) Adaptive Behavior , vol.13 , Issue.3 , pp. 165-188
    • Stone, P.1    Sutton, R.S.2    Kuhlmann, G.3
  • 18
    • 33845344721
    • Learning Tetris using the noisy cross-entropy method
    • I. Szita and A. Lorincz. Learning Tetris using the noisy cross-entropy method. Neural Computation, 18:2936-2941, 2006.
    • (2006) Neural Computation , vol.18 , pp. 2936-2941
    • Szita, I.1    Lorincz, A.2
  • 20
    • 27544473171
    • Behavior transfer for value-function-based reinforcement learning
    • F. Dignum, V. Dignum, S. Koenig, S. Kraus, M. P. Singh, and M. Wooldridge, editors, New York, NY, July, ACM Press
    • M. E. Taylor and P. Stone. Behavior transfer for value-function-based reinforcement learning. In F. Dignum, V. Dignum, S. Koenig, S. Kraus, M. P. Singh, and M. Wooldridge, editors, The Fourth International Joint Conference on Autonomous Agents and Multiagent Systems, pages 53-59, New York, NY, July 2005. ACM Press.
    • (2005) The Fourth International Joint Conference on Autonomous Agents and Multiagent Systems , pp. 53-59
    • Taylor, M.E.1    Stone, P.2
  • 21
    • 34548031419
    • On the use of hybrid reinforcement learning for autonomic resource allocation
    • G. Tesauro, N. K. Jong, R. Das, and M. N. Bennani. On the use of hybrid reinforcement learning for autonomic resource allocation. Cluster Computing, 10(3):287-299, 2007.
    • (2007) Cluster Computing , vol.10 , Issue.3 , pp. 287-299
    • Tesauro, G.1    Jong, N.K.2    Das, R.3    Bennani, M.N.4
  • 23
    • 33646714634
    • Evolutionary function approximation for reinforcement learning
    • May
    • S. Whiteson and P. Stone. Evolutionary function approximation for reinforcement learning. Journal of Machine Learning Research, 7:877-917, May 2006.
    • (2006) Journal of Machine Learning Research , vol.7 , pp. 877-917
    • Whiteson, S.1    Stone, P.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.