SCOPUS Information Search Platform

IECON Proceedings (Industrial Electronics Conference)

Volume 1, 2000, Pages 2992-2997

Evolutionary computation versus reinforcement learning

(1) Schmidhuber, J. (a)

(a) Dalle Molle Institute for Artificial Intelligence, IDSIA (Switzerland)

Author keywords

[No Author keywords available]

Indexed keywords

DYNAMIC PROGRAMMING; EVOLUTIONARY ALGORITHMS; INDUSTRIAL ELECTRONICS; STOCHASTIC SYSTEMS;

CREDIT ASSIGNMENT; FITNESS FUNCTIONS; PARTIAL OBSERVABILITY; STOCHASTIC ENVIRONMENT; STOCHASTIC POLICY; UNKNOWN ENVIRONMENTS; VALUE FUNCTIONS;

REINFORCEMENT LEARNING;

EID: 84969172313
PISSN: None
EISSN: None
Source Type: Conference Proceeding
DOI: 10.1109/IECON.2000.972474
Document Type: Conference Paper

Times cited: 8

References (52)

1
- 0003479517
- Morgan Kaufmann Publishers, San Francisco, CA, USA
- W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone. Genetic Programming: An Introduction. Morgan Kaufmann Publishers, San Francisco, CA, USA, 1998.
- (1998) Genetic Programming: An Introduction
- Banzhaf, W.¹ Nordin, P.² Keller, R.E.³ Francone, F.D.⁴

2
- 84967064032
- Princeton University Press
- R. Bellman. Adaptive Control Processes. Princeton University Press, 1961.
- (1961) Adaptive Control Processes
- Bellman, R.¹

3
- 77954563357
- On the length of programs for computing finite binary sequences: Statistical considerations
- G.J. Chaitin. On the length of programs for computing finite binary sequences: statistical considerations. Journal of the ACM, 16:145-159, 1969.
- (1969) Journal of the ACM , vol.16 , pp. 145-159
- Chaitin, G.J.¹

4
- 0002258659
- A representation for the adaptive generation of simple sequential programs
- J.J. Grefenstette, editor, Hillsdale NJ, Lawrence Erlbaum Associates
- N. L. Cramer. A representation for the adaptive generation of simple sequential programs. In J.J. Grefenstette, editor, Proceedings of an International Conference on Genetic Algorithms and Their Applications, Hillsdale NJ, 1985. Lawrence Erlbaum Associates.
- (1985) Proceedings of an International Conference on Genetic Algorithms and Their Applications
- Cramer, N.L.¹

5
- 0001234682
- Feudal reinforcement learning
- D. S. Lippman, J. E. Moody, and D. S. Touretzky, editors, San Mateo, CA: Morgan Kaufmann
- P. Dayan and G. Hinton. Feudal reinforcement learning. In D. S. Lippman, J. E. Moody, and D. S. Touretzky, editors, Advances in Neural Information Processing Systems 5, pages 271-278. San Mateo, CA: Morgan Kaufmann, 1993.
- (1993) Advances in Neural Information Processing Systems 5 , pp. 271-278
- Dayan, P.¹ Hinton, G.²

6
- 0344299454
- Fortgeschrittenenpraktikum, Institut für Informatik, Lehrstuhl Prof. Radig, Technische Universität München
- D. Dickmanns, J. Schmidhuber, and A. Winklhofer. Der genetische Algorithmus: Eine Implementierung in Prolog. Fortgeschrittenenpraktikum, Institut für Informatik, Lehrstuhl Prof. Radig, Technische Universität München, 1987.
- (1987) Der Genetische Algorithmus: Eine Implementierung in Prolog
- Dickmanns, D.¹ Schmidhuber, J.² Winklhofer, A.³

7
- 0007907759
- Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments
- Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, MIT Press, Bradford Books
- B.L. Digney. Emergent hierarchical control structures: Learning reactive/hierarchical relationships in reinforcement environments. In Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA, pages 363-372. MIT Press, Bradford Books, 1996.
- (1996) From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA , pp. 363-372
- Digney, B.L.¹

8
- 84969283790
- Neural subgoal generation using backpropagation
- George G. Lendaris, Stephen Grossberg, and Bart Kosko, editors, Lawrence Erlbaum Associates, Inc., Publishers, Hillsdale, July
- M. Eldracher and B. Baginski. Neural subgoal generation using backpropagation. In George G. Lendaris, Stephen Grossberg, and Bart Kosko, editors, World Congress on Neural Networks, pages III-145-III-148. Lawrence Erlbaum Associates, Inc., Publishers, Hillsdale, July 1993.
- (1993) World Congress on Neural Networks , pp. III-145-III-148
- Eldracher, M.¹ Baginski, B.²

9
- 0003463297
- University of Michigan Press, Ann Arbor
- J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, 1975.
- (1975) Adaptation in Natural and Artificial Systems
- Holland, J.H.¹

10
- 0007914441
- Action selection methods using reinforcement learning
- Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, MIT Press, Bradford Books
- M. Humphrys. Action selection methods using reinforcement learning. In Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA, pages 135-144. MIT Press, Bradford Books, 1996.
- (1996) From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA , pp. 135-144
- Humphrys, M.¹

11
- 85153938292
- Reinforcement learning algorithm for partially observable Markov decision problems
- G. Tesauro, D. S. Touretzky, and T. K. Leen, editors, MIT Press, Cambridge MA
- T. Jaakkola, S. P. Singh, and M. I. Jordan. Reinforcement learning algorithm for partially observable Markov decision problems. In G. Tesauro, D. S. Touretzky, and T. K. Leen, editors, Advances in Neural Information Processing Systems 7, pages 345-352. MIT Press, Cambridge MA, 1995.
- (1995) Advances in Neural Information Processing Systems 7 , pp. 345-352
- Jaakkola, T.¹ Singh, S.P.² Jordan, M.I.³

12
- 0003442794
- Technical report, Brown University, Providence RI
- L.P. Kaelbling, M.L. Littman, and A.R. Cassandra. Planning and acting in partially observable stochastic domains. Technical report, Brown University, Providence RI, 1995.
- (1995) Planning and Acting in Partially Observable Stochastic Domains
- Kaelbling, L.P.¹ Littman, M.L.² Cassandra, A.R.³

13
- 84899026236
- Finite-sample convergence rates for Q-learning and indirect algorithms
- M. Kearns, S. A. Solla, and D. Cohn, editors, MIT Press, Cambridge MA
- M. Kearns and S. Singh. Finite-sample convergence rates for Q-learning and indirect algorithms. In M. Kearns, S. A. Solla, and D. Cohn, editors, Advances in Neural Information Processing Systems 12. MIT Press, Cambridge MA, 1999.
- (1999) Advances in Neural Information Processing Systems 12
- Kearns, M.¹ Singh, S.²

14
- 84967643367
- Three approaches to the quantitative definition of information
- A.N. Kolmogorov. Three approaches to the quantitative definition of information. Problems of Information Transmission, 1:1-11, 1965.
- (1965) Problems of Information Transmission , vol.1 , pp. 1-11
- Kolmogorov, A.N.¹

15
- 0007915412
- Theory formation by heuristic search
- D. Lenat. Theory formation by heuristic search. Artificial Intelligence, 21, 1983.
- (1983) Artificial Intelligence , vol.21
- Lenat, D.¹

16
- 0003680739
- Springer
- M. Li and P. M. B. Vitányi. An Introduction to Kolmogorov Complexity and its Applications. Springer, 1993.
- (1993) An Introduction to Kolmogorov Complexity and Its Applications
- Li, M.¹ Vitányi, P.M.B.²

17
- 0003673017
- PhD thesis, Carnegie Mellon University, Pittsburgh, January
- L.J. Lin. Reinforcement Learning for Robots Using Neural Networks. PhD thesis, Carnegie Mellon University, Pittsburgh, January 1993.
- (1993) Reinforcement Learning for Robots Using Neural Networks
- Lin, L.J.¹

18
- 0003861655
- PhD thesis, Brown University, March
- M.L. Littman. Algorithms for Sequential Decision Making. PhD thesis, Brown University, March 1996.
- (1996) Algorithms for Sequential Decision Making
- Littman, M.L.¹

19
- 85138579181
- Learning policies for partially observable environments: Scaling up
- A. Prieditis and S. Russell, editors, Morgan Kaufmann Publishers, San Francisco, CA
- M.L. Littman, A.R. Cassandra, and L.P. Kaelbling. Learning policies for partially observable environments: Scaling up. In A. Prieditis and S. Russell, editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 362-370. Morgan Kaufmann Publishers, San Francisco, CA, 1995.
- (1995) Machine Learning: Proceedings of the Twelfth International Conference , pp. 362-370
- Littman, M.L.¹ Cassandra, A.R.² Kaelbling, L.P.³

20
- 0002242826
- Learning to use selective attention and short-term memory in sequential tasks
- Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, MIT Press, Bradford Books
- R. A. McCallum. Learning to use selective attention and short-term memory in sequential tasks. In Pattie Maes, Maja Mataric, Jean-Arcady Meyer, Jordan Pollack, and Stewart W. Wilson, editors, From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA, pages 315-324. MIT Press, Bradford Books, 1996.
- (1996) From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA , pp. 315-324
- McCallum, R.A.¹

21
- 0027684215
- Prioritized sweeping: Reinforcement learning with less data and less time
- A. Moore and C. G. Atkeson. Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning, 13:103-130, 1993.
- (1993) Machine Learning , vol.13 , pp. 103-130
- Moore, A.¹ Atkeson, C.G.²

22
- 0000111025
- An approach to the synthesis of life
- C.G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors, Addison Wesley Publishing Company
- T. S. Ray. An approach to the synthesis of life. In C.G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors, Artificial Life II, pages 371-408. Addison Wesley Publishing Company, 1992.
- (1992) Artificial Life II , pp. 371-408
- Ray, T.S.¹

23
- 84964572821
- Dissertation, Published by Frommann-Holzboog
- I. Rechenberg. Evolutionsstrategie - Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Dissertation, 1971. Published 1973 by Frommann-Holzboog.
- (1971) Evolutionsstrategie - Optimierung Technischer Systeme Nach Prinzipien der Biologischen Evolution
- Rechenberg, I.¹

24
- 10844252596
- Incremental development of complex behaviors through automatic construction of sensory-motor hierarchies
- L. Birnbaum and G. Collins, editors, Morgan Kaufmann
- M. B. Ring. Incremental development of complex behaviors through automatic construction of sensory-motor hierarchies. In L. Birnbaum and G. Collins, editors, Machine Learning: Proceedings of the Eighth International Workshop, pages 343-347. Morgan Kaufmann, 1991.
- (1991) Machine Learning: Proceedings of the Eighth International Workshop , pp. 343-347
- Ring, M.B.¹

25
- 0007912190
- Learning sequential tasks by incrementally adding higher orders
- J. D. Cowan, S. J. Hanson, and C. L. Giles, editors, Morgan Kaufmann
- M. B. Ring. Learning sequential tasks by incrementally adding higher orders. In J. D. Cowan, S. J. Hanson, and C. L. Giles, editors, Advances in Neural Information Processing Systems 5, pages 115-122. Morgan Kaufmann, 1993.
- (1993) Advances in Neural Information Processing Systems 5 , pp. 115-122
- Ring, M.B.¹

26
- 0003588579
- PhD thesis, University of Texas at Austin, Austin, Texas 78712, August
- M. B. Ring. Continual Learning in Reinforcement Environments. PhD thesis, University of Texas at Austin, Austin, Texas 78712, August 1994.
- (1994) Continual Learning in Reinforcement Environments
- Ring, M.B.¹

27
- 0000108169
- Probabilistic incremental program evolution
- R. P. Salustowicz and J. Schmidhuber. Probabilistic incremental program evolution. Evolutionary Computation, 5(2):123-141, 1997.
- (1997) Evolutionary Computation , vol.5 , Issue.2 , pp. 123-141
- Salustowicz, R.P.¹ Schmidhuber, J.²

28
- 0001201756
- Some studies in machine learning using the game of checkers
- A. L. Samuel. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3:210-229, 1959.
- (1959) IBM Journal of Research and Development , vol.3 , pp. 210-229
- Samuel, A.L.¹

29
- 0008006333
- Institut für Informatik, Technische Universität München
- J. Schmidhuber. Evolutionary principles in self-referential learning, or on learning how to learn: the meta-meta-... hook. Institut für Informatik, Technische Universität München, 1987.
- (1987) Evolutionary Principles in Self-referential Learning, or on Learning How to Learn: The Meta-meta-... Hook
- Schmidhuber, J.¹

30
- 0003274202
- Learning to generate sub-goals for action sequences
- T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Elsevier Science Publishers B.V., North-Holland
- J. Schmidhuber. Learning to generate sub-goals for action sequences. In T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Artificial Neural Networks, pages 967-972. Elsevier Science Publishers B.V., North-Holland, 1991.
- (1991) Artificial Neural Networks , pp. 967-972
- Schmidhuber, J.¹

31
- 0000728324
- Reinforcement learning in Markovian and non-Markovian environments
- D. S. Lippman, J. E. Moody, and D. S. Touretzky, editors, San Mateo, CA: Morgan Kaufmann
- J. Schmidhuber. Reinforcement learning in Markovian and non-Markovian environments. In D. S. Lippman, J. E. Moody, and D. S. Touretzky, editors, Advances in Neural Information Processing Systems 3, pages 500-506. San Mateo, CA: Morgan Kaufmann, 1991.
- (1991) Advances in Neural Information Processing Systems 3 , pp. 500-506
- Schmidhuber, J.¹

32
- 85152633484
- Discovering solutions with low Kolmogorov complexity and high generalization capability
- A. Prieditis and S. Russell, editors, Morgan Kaufmann Publishers, San Francisco, CA
- J. Schmidhuber. Discovering solutions with low Kolmogorov complexity and high generalization capability. In A. Prieditis and S. Russell, editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 488-496. Morgan Kaufmann Publishers, San Francisco, CA, 1995.
- (1995) Machine Learning: Proceedings of the Twelfth International Conference , pp. 488-496
- Schmidhuber, J.¹

33
- 0007918330
- A general method for incremental self-improvement and multi-agent learning
- X. Yao, editor, Scientific Publ. Co., Singapore
- J. Schmidhuber. A general method for incremental self-improvement and multi-agent learning. In X. Yao, editor, Evolutionary Computation: Theory and Applications, pages 81-123. Scientific Publ. Co., Singapore, 1999.
- (1999) Evolutionary Computation: Theory and Applications , pp. 81-123
- Schmidhuber, J.¹

34
- 0002487353
- Direct policy search and uncertain policy evaluation
- American Association for Artificial Intelligence, Menlo Park, Calif.
- J. Schmidhuber and J. Zhao. Direct policy search and uncertain policy evaluation. In AAAI Spring Symposium on Search under Uncertain and Incomplete Information, Stanford Univ., pages 119-124. American Association for Artificial Intelligence, Menlo Park, Calif., 1999.
- (1999) AAAI Spring Symposium on Search under Uncertain and Incomplete Information, Stanford Univ. , pp. 119-124
- Schmidhuber, J.¹ Zhao, J.²

35
- 0000156236
- Reinforcement learning with self-modifying policies
- S. Thrun and L. Pratt, editors, Kluwer
- J. Schmidhuber, J. Zhao, and N. Schraudolph. Reinforcement learning with self-modifying policies. In S. Thrun and L. Pratt, editors, Learning to learn, pages 293-309. Kluwer, 1997.
- (1997) Learning to Learn , pp. 293-309
- Schmidhuber, J.¹ Zhao, J.² Schraudolph, N.³

36
- 0008010287
- Technical Report IDSIA-69-96, IDSIA
- J. Schmidhuber, J. Zhao, and M. Wiering. Simple principles of metalearning. Technical Report IDSIA-69-96, IDSIA, 1996.
- (1996) Simple Principles of Metalearning
- Schmidhuber, J.¹ Zhao, J.² Wiering, M.³

37
- 0031186687
- Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement
- J. Schmidhuber, J. Zhao, and M. Wiering. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning, 28:105-130, 1997.
- (1997) Machine Learning , vol.28 , pp. 105-130
- Schmidhuber, J.¹ Zhao, J.² Wiering, M.³

38
- 0007958081
- Dissertation, Published by Birkhäuser, Basel
- H. P. Schwefel. Numerische Optimierung von Computer-Modellen. Dissertation, 1974. Published 1977 by Birkhäuser, Basel.
- (1974) Numerische Optimierung von Computer-Modellen
- Schwefel, H.P.¹

39
- 0001652790
- The efficient learning of multiple task sequences
- J.E. Moody, S.J. Hanson, and R.P. Lippman, editors, San Mateo, CA, Morgan Kaufmann
- S.P. Singh. The efficient learning of multiple task sequences. In J.E. Moody, S.J. Hanson, and R.P. Lippman, editors, Advances in Neural Information Processing Systems 4, pages 251-258, San Mateo, CA, 1992. Morgan Kaufmann.
- (1992) Advances in Neural Information Processing Systems 4 , pp. 251-258
- Singh, S.P.¹

40
- 4544279425
- A formal theory of inductive inference. Part I
- R.J. Solomonoff. A formal theory of inductive inference. Part I. Information and Control, 7:1-22, 1964.
- (1964) Information and Control , vol.7 , pp. 1-22
- Solomonoff, R.J.¹

41
- 0022825723
- An application of algorithmic probability to problems in artificial intelligence
- L. N. Kanal and J. F. Lemmer, editors, Elsevier Science Publishers
- R.J. Solomonoff. An application of algorithmic probability to problems in artificial intelligence. In L. N. Kanal and J. F. Lemmer, editors, Uncertainty in Artificial Intelligence, pages 473-491. Elsevier Science Publishers, 1986.
- (1986) Uncertainty in Artificial Intelligence , pp. 473-491
- Solomonoff, R.J.¹

42
- 0033720075
- Self-segmentation of sequences: Automatic formation of hierarchies of sequential behaviors
- R. Sun and C. Sessions. Self-segmentation of sequences: automatic formation of hierarchies of sequential behaviors. IEEE Transactions on Systems, Man, and Cybernetics: Part B Cybernetics, 30(3), 2000.
- (2000) IEEE Transactions on Systems, Man, and Cybernetics: Part B Cybernetics , vol.30 , Issue.3
- Sun, R.¹ Sessions, C.²

43
- 33847202724
- Learning to predict by the methods of temporal differences
- R. S. Sutton. Learning to predict by the methods of temporal differences. Machine Learning, 3:9-44, 1988.
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

44
- 84922015064
- TD models: Modeling the world at a mixture of time scales
- A. Prieditis and S. Russell, editors, Morgan Kaufmann Publishers, San Francisco, CA
- R. S. Sutton. TD models: Modeling the world at a mixture of time scales. In A. Prieditis and S. Russell, editors, Machine Learning: Proceedings of the Twelfth International Conference, pages 531-539. Morgan Kaufmann Publishers, San Francisco, CA, 1995.
- (1995) Machine Learning: Proceedings of the Twelfth International Conference , pp. 531-539
- Sutton, R.S.¹

45
- 0000672258
- Improved switching among temporally abstract actions
- MIT Press
- R. S. Sutton, S. Singh, D. Precup, and B. Ravindran. Improved switching among temporally abstract actions. In Advances in Neural Information Processing Systems 11. MIT Press, 1999. To appear.
- (1999) Advances in Neural Information Processing Systems 11
- Sutton, R.S.¹ Singh, S.² Precup, D.³ Ravindran, B.⁴

46
- 0003362676
- The evolution of mental models
- Kenneth E. Kinnear, Jr., editor, MIT Press
- A. Teller. The evolution of mental models. In Kenneth E. Kinnear, Jr., editor, Advances in Genetic Programming, pages 199-219. MIT Press, 1994.
- (1994) Advances in Genetic Programming , pp. 199-219
- Teller, A.¹

47
- 0000985504
- TD-gammon, a self-teaching backgammon program, achieves master-level play
- G. Tesauro. TD-gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6(2):215-219, 1994.
- (1994) Neural Computation , vol.6 , Issue.2 , pp. 215-219
- Tesauro, G.¹

48
- 0029390263
- Reinforcement learning of multiple tasks using a hierarchical CMAC architecture
- C.K. Tham. Reinforcement learning of multiple tasks using a hierarchical CMAC architecture. Robotics and Autonomous Systems, 15(4):247-274, 1995.
- (1995) Robotics and Autonomous Systems , vol.15 , Issue.4 , pp. 247-274
- Tham, C.K.¹

49
- 0004049893
- PhD thesis, King's College, Cambridge
- C.J.C.H. Watkins. Learning from Delayed Rewards. PhD thesis, King's College, Cambridge, 1989.
- (1989) Learning from Delayed Rewards
- Watkins, C.J.C.H.¹

50
- 0031215211
- HQ-learning
- M. Wiering and J. Schmidhuber. HQ-learning. Adaptive Behavior, 6(2):219-246, 1998.
- (1998) Adaptive Behavior , vol.6 , Issue.2 , pp. 219-246
- Wiering, M.¹ Schmidhuber, J.²

51
- 0010888394
- Solving POMDPs with Levin search and EIRA
- L. Saitta, editor, Morgan Kaufmann Publishers, San Francisco, CA
- M.A. Wiering and J. Schmidhuber. Solving POMDPs with Levin search and EIRA. In L. Saitta, editor, Machine Learning: Proceedings of the Thirteenth International Conference, pages 534-542. Morgan Kaufmann Publishers, San Francisco, CA, 1996.
- (1996) Machine Learning: Proceedings of the Thirteenth International Conference , pp. 534-542
- Wiering, M.A.¹ Schmidhuber, J.²

52
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8:229-256, 1992.
- (1992) Machine Learning , vol.8 , pp. 229-256
- Williams, R.J.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.