Volume 1, 2000, Pages 2992-2997

Evolutionary computation versus reinforcement learning

Author keywords

[No Author keywords available]

Indexed keywords

DYNAMIC PROGRAMMING; EVOLUTIONARY ALGORITHMS; INDUSTRIAL ELECTRONICS; STOCHASTIC SYSTEMS

EID: 84969172313     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/IECON.2000.972474     Document Type: Conference Paper
Times cited: 8

References (52)
  • 3
    • 77954563357
    • On the length of programs for computing finite binary sequences: Statistical considerations
    • G.J. Chaitin. On the length of programs for computing finite binary sequences: statistical considerations. Journal of the ACM, 16:145-159, 1969.
    • (1969) Journal of the ACM , vol.16 , pp. 145-159
    • Chaitin, G.J.1
  • 4
    • 0002258659
    • A representation for the adaptive generation of simple sequential programs
    • J.J. Grefenstette, editor, Hillsdale NJ, Lawrence Erlbaum Associates
    • N. L. Cramer. A representation for the adaptive generation of simple sequential programs. In J.J. Grefenstette, editor, Proceedings of an International Conference on Genetic Algorithms and Their Applications, Hillsdale NJ, 1985. Lawrence Erlbaum Associates.
    • (1985) Proceedings of an International Conference on Genetic Algorithms and Their Applications
    • Cramer, N.L.1
  • 5
    • 0001234682
    • Feudal reinforcement learning
    • D. S. Lippman, J. E. Moody, and D. S. Touretzky, editors, San Mateo, CA: Morgan Kaufmann
    • P. Dayan and G. Hinton. Feudal reinforcement learning. In D. S. Lippman, J. E. Moody, and D. S. Touretzky, editors, Advances in Neural Information Processing Systems 5, pages 271-278. San Mateo, CA: Morgan Kaufmann, 1993.
    • (1993) Advances in Neural Information Processing Systems 5 , pp. 271-278
    • Dayan, P.1    Hinton, G.2
  • 7
    • 0007907759
    • Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments
    • Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, MIT Press, Bradford Books
    • B.L. Digney. Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments. In Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA, pages 363-372. MIT Press, Bradford Books, 1996.
    • (1996) From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA , pp. 363-372
    • Digney, B.L.1
  • 8
    • 84969283790
    • Neural subgoal generation using backpropagation
    • George G. Lendaris, Stephen Grossberg, and Bart Kosko, editors, Lawrence Erlbaum Associates, Inc., Publishers, Hillsdale, July
    • M. Eldracher and B. Baginski. Neural subgoal generation using backpropagation. In George G. Lendaris, Stephen Grossberg, and Bart Kosko, editors, World Congress on Neural Networks, pages III-145-III-148. Lawrence Erlbaum Associates, Inc., Publishers, Hillsdale, July 1993.
    • (1993) World Congress on Neural Networks , pp. III145-III148
    • Eldracher, M.1    Baginski, B.2
  • 11
    • 85153938292
    • Reinforcement learning algorithm for partially observable Markov decision problems
    • G. Tesauro, D. S. Touretzky, and T. K. Leen, editors, MIT Press, Cambridge MA
    • T. Jaakkola, S. P. Singh, and M. I. Jordan. Reinforcement learning algorithm for partially observable Markov decision problems. In G. Tesauro, D. S. Touretzky, and T. K. Leen, editors, Advances in Neural Information Processing Systems 7, pages 345-352. MIT Press, Cambridge MA, 1995.
    • (1995) Advances in Neural Information Processing Systems 7 , pp. 345-352
    • Jaakkola, T.1    Singh, S.P.2    Jordan, M.I.3
  • 13
    • 84899026236
    • Finite-sample convergence rates for Q-learning and indirect algorithms
    • M. Kearns, S. A. Solla, and D. Cohn, editors, MIT Press, Cambridge MA
    • M. Kearns and S. Singh. Finite-sample convergence rates for Q-learning and indirect algorithms. In M. Kearns, S. A. Solla, and D. Cohn, editors, Advances in Neural Information Processing Systems 12. MIT Press, Cambridge MA, 1999.
    • (1999) Advances in Neural Information Processing Systems 12
    • Kearns, M.1    Singh, S.2
  • 14
    • 84967643367
    • Three approaches to the quantitative definition of information
    • A.N. Kolmogorov. Three approaches to the quantitative definition of information. Problems of Information Transmission, 1:1-11, 1965.
    • (1965) Problems of Information Transmission , vol.1 , pp. 1-11
    • Kolmogorov, A.N.1
  • 15
    • 0007915412
    • Theory formation by heuristic search
    • D. Lenat. Theory formation by heuristic search. Machine Learning, 21, 1983.
    • (1983) Machine Learning , vol.21
    • Lenat, D.1
  • 19
    • 85138579181
    • Learning policies for partially observable environments: Scaling up
    • A. Prieditis and S. Russell, editors, Morgan Kaufmann Publishers, San Francisco, CA
    • M.L. Littman, A.R. Cassandra, and L.P. Kaelbling. Learning policies for partially observable environments: Scaling up. In A. Prieditis and S. Russell, editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 362-370. Morgan Kaufmann Publishers, San Francisco, CA, 1995.
    • (1995) Machine Learning: Proceedings of the Twelfth International Conference , pp. 362-370
    • Littman, M.L.1    Cassandra, A.R.2    Kaelbling, L.P.3
  • 20
    • 0002242826
    • Learning to use selective attention and short-term memory in sequential tasks
    • Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, MIT Press, Bradford Books
    • R. A. McCallum. Learning to use selective attention and short-term memory in sequential tasks. In Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA, pages 315-324. MIT Press, Bradford Books, 1996.
    • (1996) From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA , pp. 315-324
    • McCallum, R.A.1
  • 21
    • 0027684215
    • Prioritized sweeping: Reinforcement learning with less data and less time
    • A. Moore and C. G. Atkeson. Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning, 13:103-130, 1993.
    • (1993) Machine Learning , vol.13 , pp. 103-130
    • Moore, A.1    Atkeson, C.G.2
  • 22
    • 0000111025
    • An approach to the synthesis of life
    • C.G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors, Addison Wesley Publishing Company
    • T. S. Ray. An approach to the synthesis of life. In C.G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors, Artificial Life II, pages 371-408. Addison Wesley Publishing Company, 1992.
    • (1992) Artificial Life II , pp. 371-408
    • Ray, T.S.1
  • 24
    • 10844252596
    • Incremental development of complex behaviors through automatic construction of sensory-motor hierarchies
    • L. Birnbaum and G. Collins, editors, Morgan Kaufmann
    • M. B. Ring. Incremental development of complex behaviors through automatic construction of sensory-motor hierarchies. In L. Birnbaum and G. Collins, editors, Machine Learning: Proceedings of the Eighth International Workshop, pages 343-347. Morgan Kaufmann, 1991.
    • (1991) Machine Learning: Proceedings of the Eighth International Workshop , pp. 343-347
    • Ring, M.B.1
  • 25
    • 0007912190
    • Learning sequential tasks by incrementally adding higher orders
    • J. D. Cowan, S. J. Hanson, and C. L. Giles, editors, Morgan Kaufmann
    • M. B. Ring. Learning sequential tasks by incrementally adding higher orders. In J. D. Cowan, S. J. Hanson, and C. L. Giles, editors, Advances in Neural Information Processing Systems 5, pages 115-122. Morgan Kaufmann, 1993.
    • (1993) Advances in Neural Information Processing Systems 5 , pp. 115-122
    • Ring, M.B.1
  • 28
    • 0001201756
    • Some studies in machine learning using the game of checkers
    • A. L. Samuel. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3:210-229, 1959.
    • (1959) IBM Journal of Research and Development , vol.3 , pp. 210-229
    • Samuel, A.L.1
  • 30
    • 0003274202
    • Learning to generate sub-goals for action sequences
    • T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Elsevier Science Publishers B.V., North-Holland
    • J. Schmidhuber. Learning to generate sub-goals for action sequences. In T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Artificial Neural Networks, pages 967-972. Elsevier Science Publishers B.V., North-Holland, 1991.
    • (1991) Artificial Neural Networks , pp. 967-972
    • Schmidhuber, J.1
  • 31
    • 0000728324
    • Reinforcement learning in Markovian and non-Markovian environments
    • D. S. Lippman, J. E. Moody, and D. S. Touretzky, editors, San Mateo, CA: Morgan Kaufmann
    • J. Schmidhuber. Reinforcement learning in Markovian and non-Markovian environments. In D. S. Lippman, J. E. Moody, and D. S. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 500-506. San Mateo, CA: Morgan Kaufmann, 1991.
    • (1991) Advances in Neural Information Processing Systems 3 , pp. 500-506
    • Schmidhuber, J.1
  • 32
    • 85152633484
    • Discovering solutions with low Kolmogorov complexity and high generalization capability
    • A. Prieditis and S. Russell, editors, Morgan Kaufmann Publishers, San Francisco, CA
    • J. Schmidhuber. Discovering solutions with low Kolmogorov complexity and high generalization capability. In A. Prieditis and S. Russell, editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 488-496. Morgan Kaufmann Publishers, San Francisco, CA, 1995.
    • (1995) Machine Learning: Proceedings of the Twelfth International Conference , pp. 488-496
    • Schmidhuber, J.1
  • 33
    • 0007918330
    • A general method for incremental self-improvement and multi-agent learning
    • X. Yao, editor, Scientific Publ. Co., Singapore
    • J. Schmidhuber. A general method for incremental self-improvement and multi-agent learning. In X. Yao, editor, Evolutionary Computation: Theory and Applications, pages 81-123. Scientific Publ. Co., Singapore, 1999.
    • (1999) Evolutionary Computation: Theory and Applications , pp. 81-123
    • Schmidhuber, J.1
  • 35
    • 0000156236
    • Reinforcement learning with self-modifying policies
    • S. Thrun and L. Pratt, editors, Kluwer
    • J. Schmidhuber, J. Zhao, and N. Schraudolph. Reinforcement learning with self-modifying policies. In S. Thrun and L. Pratt, editors, Learning to learn, pages 293-309. Kluwer, 1997.
    • (1997) Learning to Learn , pp. 293-309
    • Schmidhuber, J.1    Zhao, J.2    Schraudolph, N.3
  • 37
    • 0031186687
    • Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement
    • J. Schmidhuber, J. Zhao, and M. Wiering. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning, 28:105-130, 1997.
    • (1997) Machine Learning , vol.28 , pp. 105-130
    • Schmidhuber, J.1    Zhao, J.2    Wiering, M.3
  • 39
    • 0001652790
    • The efficient learning of multiple task sequences
    • J.E. Moody, S.J. Hanson, and R.P. Lippman, editors, San Mateo, CA, Morgan Kaufmann
    • S.P. Singh. The efficient learning of multiple task sequences. In J.E. Moody, S.J. Hanson, and R.P. Lippman, editors, Advances in Neural Information Processing Systems 4, pages 251-258, San Mateo, CA, 1992. Morgan Kaufmann.
    • (1992) Advances in Neural Information Processing Systems 4 , pp. 251-258
    • Singh, S.P.1
  • 40
    • 4544279425
    • A formal theory of inductive inference. Part I
    • R.J. Solomonoff. A formal theory of inductive inference. Part I. Information and Control, 7:1-22, 1964.
    • (1964) Information and Control , vol.7 , pp. 1-22
    • Solomonoff, R.J.1
  • 41
    • 0022825723
    • An application of algorithmic probability to problems in artificial intelligence
    • L. N. Kanal and J. F. Lemmer, editors, Elsevier Science Publishers
    • R.J. Solomonoff. An application of algorithmic probability to problems in artificial intelligence. In L. N. Kanal and J. F. Lemmer, editors, Uncertainty in Artificial Intelligence, pages 473-491. Elsevier Science Publishers, 1986.
    • (1986) Uncertainty in Artificial Intelligence , pp. 473-491
    • Solomonoff, R.J.1
  • 43
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
    • (1988) Machine Learning , vol.3 , pp. 9-44
    • Sutton, R.S.1
  • 44
    • 84922015064
    • TD models: Modeling the world at a mixture of time scales
    • A. Prieditis and S. Russell, editors, Morgan Kaufmann Publishers, San Francisco, CA
    • R. S. Sutton. TD models: Modeling the world at a mixture of time scales. In A. Prieditis and S. Russell, editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 531-539. Morgan Kaufmann Publishers, San Francisco, CA, 1995.
    • (1995) Machine Learning: Proceedings of the Twelfth International Conference , pp. 531-539
    • Sutton, R.S.1
  • 46
    • 0003362676
    • The evolution of mental models
    • Kenneth E. Kinnear, Jr., editor, MIT Press
    • A. Teller. The evolution of mental models. In Kenneth E. Kinnear, Jr., editor, Advances in Genetic Programming, pages 199-219. MIT Press, 1994.
    • (1994) Advances in Genetic Programming , pp. 199-219
    • Teller, A.1
  • 47
    • 0000985504
    • TD-gammon, a self-teaching backgammon program, achieves master-level play
    • G. Tesauro. TD-gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6(2):215-219, 1994.
    • (1994) Neural Computation , vol.6 , Issue.2 , pp. 215-219
    • Tesauro, G.1
  • 48
    • 0029390263
    • Reinforcement learning of multiple tasks using a hierarchical CMAC architecture
    • C.K. Tham. Reinforcement learning of multiple tasks using a hierarchical CMAC architecture. Robotics and Autonomous Systems, 15(4):247-274, 1995.
    • (1995) Robotics and Autonomous Systems , vol.15 , Issue.4 , pp. 247-274
    • Tham, C.K.1
  • 52
    • 0000337576
    • Simple statistical gradient-following algorithms for connectionist reinforcement learning
    • R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229-256, 1992.
    • (1992) Machine Learning , vol.8 , pp. 229-256
    • Williams, R.J.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.