Volume 30, 2007, Pages 659-684

Learning to play using low-complexity rule-based policies: Illustrations through Ms. Pac-Man

Author keywords

[No Author keywords available]

Indexed keywords

COMBINATORIAL MATHEMATICS; COMPUTATIONAL COMPLEXITY; GAME THEORY; OPTIMIZATION

EID: 38349162555     PISSN: None     EISSN: 10769757     Source Type: Journal    
DOI: 10.1613/jair.2368     Document Type: Article
Times cited: 55

References (38)
  • 1
    • 17444384857
    • Application of the cross-entropy method to the buffer allocation problem in a simulation-based environment
    • Allon, G., Kroese, D. P., Raviv, T., & Rubinstein, R. Y. (2005). Application of the cross-entropy method to the buffer allocation problem in a simulation-based environment. Annals of Operations Research, 134, 137-151.
    • (2005) Annals of Operations Research , vol.134 , pp. 137-151
    • Allon, G.1    Kroese, D.P.2    Raviv, T.3    Rubinstein, R.Y.4
  • 3
    • 38349096103
    • Reinforcement learning and chess
    • Nova Science Publishers, Inc.
    • Baxter, J., Tridgell, A., & Weaver, L. (2001). Machines that learn to play games, chap. Reinforcement learning and chess, pp. 91-116. Nova Science Publishers, Inc.
    • (2001) Machines that Learn to Play Games , pp. 91-116
    • Baxter, J.1    Tridgell, A.2    Weaver, L.3
  • 5
    • 38349171751
    • Bonet, J. S. D., & Stauffer, C. P. (1999). Learning to play Pac-Man using incremental reinforcement learning.. [Online; accessed 09 October 2006].
  • 8
    • 38349150853
    • Courtillat, P. (2001). NoN-SeNS Pacman 1.6 with C sourcecode.. [Online; accessed 09 October 2006].
  • 9
    • 38349116176
    • Cross-entropic learning of a machine for the decision in a partially observable universe
    • To appear
    • Dambreville, F. (2006). Cross-entropic learning of a machine for the decision in a partially observable universe. Journal of Global Optimization. To appear.
    • (2006) Journal of Global Optimization
    • Dambreville, F.1
  • 11
    • 84901386407
    • Gallagher, M., & Ryan, A. (2003). Learning to play pac-man: An evolutionary, rule-based approach. In et. al., R. S. (Ed.), Proc. Congress on Evolutionary Computation, pp. 2462-2469.
  • 12
    • 0000746883
    • Escaping brittleness: The possibilities of general-purpose learning algorithms applied to parallel rule-based systems
    • Mitchell, Michalski, & Carbonell (Eds.), chap. 20, Morgan Kaufmann
    • Holland, J. H. (1986). Escaping brittleness: The possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In Mitchell, Michalski, & Carbonell (Eds.), Machine Learning, an Artificial Intelligence Approach. Volume II, chap. 20, pp. 593-623. Morgan Kaufmann.
    • (1986) Machine Learning, an Artificial Intelligence Approach , vol.2 , pp. 593-623
    • Holland, J.H.1
  • 17
    • 17444377167
    • On the convergence of the cross-entropy method
    • Margolin, L. (2004). On the convergence of the cross-entropy method. Annals of Operations Research, 134, 201-214.
    • (2004) Annals of Operations Research , vol.134 , pp. 201-214
    • Margolin, L.1
  • 18
    • 17444414191
    • Basis function adaptation in temporal difference reinforcement learning
    • Menache, I., Mannor, S., & Shimkin, N. (2005). Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research, 134(1), 215-238.
    • (2005) Annals of Operations Research , vol.134 , Issue.1 , pp. 215-238
    • Menache, I.1    Mannor, S.2    Shimkin, N.3
  • 19
    • 0031215849
    • The equation for response to selection and its use for prediction
    • Muehlenbein, H. (1998). The equation for response to selection and its use for prediction. Evolutionary Computation, 5, 303-346.
    • (1998) Evolutionary Computation , vol.5 , pp. 303-346
    • Muehlenbein, H.1
  • 22
    • 0000228665
    • The cross-entropy method for combinatorial and continuous optimization
    • Rubinstein, R. Y. (1999). The cross-entropy method for combinatorial and continuous optimization. Methodology and Computing in Applied Probability, 1, 127-190.
    • (1999) Methodology and Computing in Applied Probability , vol.1 , pp. 127-190
    • Rubinstein, R.Y.1
  • 23
    • 0001201756
    • Some studies in machine learning using the game of checkers
    • Samuel, A. L. (1959). Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 6, 211-229.
    • (1959) IBM Journal of Research and Development , vol.6 , pp. 211-229
    • Samuel, A.L.1
  • 25
    • 0041644968
    • A computer scientist's view of life, the universe, and everything
    • Freksa, C., Jantzen, M., & Valk, R. (Eds.), Foundations of Computer Science: Potential - Theory - Cognition, Springer, Berlin
    • Schmidhuber, J. (1997). A computer scientist's view of life, the universe, and everything. In Freksa, C., Jantzen, M., & Valk, R. (Eds.), Foundations of Computer Science: Potential - Theory - Cognition, Vol. 1337 of Lecture Notes in Computer Science, pp. 201-208. Springer, Berlin.
    • (1997) Lecture Notes in Computer Science , vol.1337 , pp. 201-208
    • Schmidhuber, J.1
  • 29
    • 33745683202
    • Szabó, Z., Póczos, B., & Lorincz, A. (2006). Cross-entropy optimization for independent process analysis. In ICA, pp. 909-916.
  • 30
    • 38349174986
    • How to select the 100 voxels that are best for prediction - a simplistic approach
    • Tech. rep., Eötvös Loránd University, Hungary
    • Szita, I. (2006). How to select the 100 voxels that are best for prediction - a simplistic approach. Tech. rep., Eötvös Loránd University, Hungary.
    • (2006)
    • Szita, I.1
  • 31
    • 33845344721
    • Learning Tetris using the noisy cross-entropy method
    • Szita, I., & Lorincz, A. (2006). Learning Tetris using the noisy cross-entropy method. Neural Computation, 18(12), 2936-2941.
    • (2006) Neural Computation , vol.18 , Issue.12 , pp. 2936-2941
    • Szita, I.1    Lorincz, A.2
  • 32
    • 0000985504
    • TD-Gammon, a self-teaching backgammon program, achieves master-level play
    • Tesauro, G. (1994). TD-Gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6(2), 215-219.
    • (1994) Neural Computation , vol.6 , Issue.2 , pp. 215-219
    • Tesauro, G.1
  • 35
    • 0037158688
    • Fast hands-free writing by gaze direction
    • Ward, D. J., & MacKay, D. J. C. (2002). Fast hands-free writing by gaze direction. Nature, 418, 838.
    • (2002) Nature , vol.418 , pp. 838
    • Ward, D.J.1    MacKay, D.J.C.2
  • 36
    • 38349110468
    • Wikipedia (2006). Pac-Man - Wikipedia, the free encyclopedia. Wikipedia. [Online; accessed 20 May 2007].


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.