SCOPUS Information Retrieval Platform

Volume 31, Issue 1-3, 1998, Pages 55-85

Module-Based Reinforcement Learning: Experiments with a Real Robot

(3) Kalmár, Zsolt a,b Szepesvári, Csaba a,b Lörincz, András a,b

Author keywords

Feature space; Local control; Markovian Decision Problems; Module based RL; Problem decomposition; Reinforcement learning; Robot learning; Subgoals; Switching control

Indexed keywords

FEATURE SPACE; LOCAL CONTROL; MARKOVIAN DECISION PROBLEMS; MODULE-BASED RL; PROBLEM DECOMPOSITION; REINFORCEMENT LEARNING (RL); SUBGOALS; SWITCHING CONTROL;

CONTROL EQUIPMENT; DISCRETE TIME CONTROL SYSTEMS; LEARNING ALGORITHMS; MARKOV PROCESSES; MATHEMATICAL MODELS; PROBLEM SOLVING; DECISION THEORY;

ROBOTS; ROBOT LEARNING;

FEATURE SPACE; LOCAL CONTROL; MODULE-BASED REINFORCEMENT LEARNING; PROBLEM DECOMPOSITION; SWITCHING CONTROL;

EID: 0032045145 PISSN: 08856125 EISSN: None Source Type: Journal
DOI: 10.1023/a:1007440607681 Document Type: Article

Times cited : (29)

References (72)

1
- 0030149709
- Purposive behavior acquisition for a real robot by vision-based reinforcement learning
- Asada, M., Noda, S., Tawaratsumida, S. & Hosoda, K. (1996). Purposive behavior acquisition for a real robot by vision-based reinforcement learning. Machine Learning, 23:279-303.
- (1996) Machine Learning , vol.23 , pp. 279-303
- Asada, M.¹ Noda, S.² Tawaratsumida, S.³ Hosoda, K.⁴

2
- 0029210635
- Learning to act using real-time dynamic programming
- Barto, A.G., Bradtke, S.J. & Singh, S.P. (1995). Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1):81-138.
- (1995) Artificial Intelligence , vol.72 , Issue.1 , pp. 81-138
- Barto, A.G.¹ Bradtke, S.J.² Singh, S.P.³

3
- 0003787146
- Princeton University Press, Princeton, New Jersey
- Bellman, R. (1957). Dynamic Programming. Princeton University Press, Princeton, New Jersey.
- (1957) Dynamic Programming
- Bellman, R.¹

4
- 2342578440
- Sixth European Workshop on Learning Robots
- Springer, Berlin, 1998
- Birk, A. & Demiris, J. (1998). Sixth European Workshop on Learning Robots. Lecture Notes in Artificial Intelligence. Springer, Berlin, 1998.
- (1998) Lecture Notes in Artificial Intelligence
- Birk, A.¹ Demiris, J.²

5
- 0031185898
- Modeling agents as qualitative decision makers
- Brafman, R.I. & Tennenholtz, M. (1997). Modeling agents as qualitative decision makers. Artificial Intelligence, 94(1):217-268.
- (1997) Artificial Intelligence , vol.94 , Issue.1 , pp. 217-268
- Brafman, R.I.¹ Tennenholtz, M.²

6
- 0003672832
- PhD thesis, Laboratory of Information and Decision, MIT, 77 Massachusetts Avenue, Cambridge, MA 02139-4307 USA
- Branicky, M.S. (1995). Studies in Hybrid Systems: Modeling, Analysis, and Control. PhD thesis, Laboratory of Information and Decision, MIT, 77 Massachusetts Avenue, Cambridge, MA 02139-4307 USA.
- (1995) Studies in Hybrid Systems: Modeling, Analysis, and Control
- Branicky, M.S.¹

7
- 0344226898
- Technical report lids-p-2239, Laboratory for Information and Decision Systems, MIT, 77 Massachusetts Avenue, Cambridge, MA 02139-4307 USA
- Branicky, M.S., Borkar, V.S. & Mitter, S.K. (1994). A unified framework for hybrid control: Background, model, and theory. Technical report lids-p-2239, Laboratory for Information and Decision Systems, MIT, 77 Massachusetts Avenue, Cambridge, MA 02139-4307 USA.
- (1994) A Unified Framework for Hybrid Control: Background, Model, and Theory
- Branicky, M.S.¹ Borkar, V.S.² Mitter, S.K.³

8
- 0003106852
- Hybrid models for motion control systems
- Birkhäuser, Boston
- Brockett, R.W. (1993). Hybrid models for motion control systems. In Essays in Control: Perspectives in the Theory and its Applications, pages 29-53. Birkhäuser, Boston.
- (1993) Essays in Control: Perspectives in the Theory and Its Applications , pp. 29-53
- Brockett, R.W.¹

9
- 0001845570
- Artificial life and real robots
- MIT Press
- Brooks, R. (1991a). Artificial life and real robots. In Proceedings of the First European Conference on Artificial Life (ECAL), pages 3-10. MIT Press.
- (1991) Proceedings of the First European Conference on Artificial Life (ECAL) , pp. 3-10
- Brooks, R.¹

10
- 0001937317
- Elephants don't play chess
- Bradford-MIT Press, 1991
- Brooks, R.A. (1991b). Elephants don't play chess. In Designing Autonomous Agents. Bradford-MIT Press, 1991.
- (1991) Designing Autonomous Agents
- Brooks, R.A.¹

11
- 0026998041
- Reinforcement learning with perceptual aliasing: The perceptual distinctions approach
- San Jose, CA. AAAI Press
- Chrisman, L. (1992). Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 183-188, San Jose, CA. AAAI Press.
- (1992) Proceedings of the Tenth National Conference on Artificial Intelligence , pp. 183-188
- Chrisman, L.¹

12
- 0030167564
- Behavior analysis and training: A methodology for behavior engineering
- Colombetti, M., Dorigo, M. & Borghi, G. (1996). Behavior analysis and training: A methodology for behavior engineering. IEEE Transactions on Systems, Man, and Cybernetics-Part B, 26(3):365-380.
- (1996) IEEE Transactions on Systems, Man, and Cybernetics-Part B , vol.26 , Issue.3 , pp. 365-380
- Colombetti, M.¹ Dorigo, M.² Borghi, G.³

13
- 0021577685
- A qualitative physics based on confluences
- de Kleer, J. & Brown, J.S. (1984). A qualitative physics based on confluences. Artificial Intelligence, 24(1-3):7-83.
- (1984) Artificial Intelligence , vol.24 , Issue.1-3 , pp. 7-83
- de Kleer, J.¹ Brown, J.S.²

14
- 0029326107
- Alecsys and the autonomouse: Learning to control a real robot by distributed classifier systems
- Dorigo, M. (1995). Alecsys and the autonomouse: Learning to control a real robot by distributed classifier systems. Machine Learning, 19(3):209-240.
- (1995) Machine Learning , vol.19 , Issue.3 , pp. 209-240
- Dorigo, M.¹

15
- 0028739953
- Robot shaping: Developing autonomous agents through learning
- Dorigo, M. & Colombetti, M. (1994). Robot shaping: Developing autonomous agents through learning. Artificial Intelligence, 71:321-370.
- (1994) Artificial Intelligence , vol.71 , pp. 321-370
- Dorigo, M.¹ Colombetti, M.²

16
- 0343860991
- Technical report 98-115, Research Group on Artificial Intelligence, JATE-MTA
- Gábor, Z., Kalmár, Zs. & Szepesvári, Cs. (1998). Multi-criteria reinforcement learning. Technical report 98-115, Research Group on Artificial Intelligence, JATE-MTA.
- (1998) Multi-criteria Reinforcement Learning
- Gábor, Z.¹ Kalmár, Zs.² Szepesvári, Cs.³

17
- 0004242478
- Lecture Notes in Computer Science. Springer-Verlag, New York
- Grossman, R.L., Nerode, A., Ravn, A. P. & Rischel, H. (1993). Hybrid Systems, volume 736 of Lecture Notes in Computer Science. Springer-Verlag, New York.
- (1993) Hybrid Systems , vol.736
- Grossman, R.L.¹ Nerode, A.² Ravn, A.P.³ Rischel, H.⁴

18
- 0029751418
- The loss from imperfect value functions in expectation-based and minimax-based tasks
- Heger, M. (1996). The loss from imperfect value functions in expectation-based and minimax-based tasks. Machine Learning, 22:197-225.
- (1996) Machine Learning , vol.22 , pp. 197-225
- Heger, M.¹

19
- 0020749243
- Vector-valued dynamic programming
- Henig, M.I. (1983). Vector-valued dynamic programming. SIAM J. Control and Optimization, 21(3):490-499.
- (1983) SIAM J. Control and Optimization , vol.21 , Issue.3 , pp. 490-499
- Henig, M.I.¹

20
- 0000439891
- On the convergence of stochastic iterative dynamic programming algorithms
- Jaakkola, T., Jordan, M.I. & Singh, S.P. (1994). On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6):1185-1201.
- (1994) Neural Computation , vol.6 , Issue.6 , pp. 1185-1201
- Jaakkola, T.¹ Jordan, M.I.² Singh, S.P.³

21
- 0029679044
- Reinforcement learning: A survey
- Kaelbling, L.P., Littman, M.L. & Moore, A.W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4:237-285.
- (1996) Journal of Artificial Intelligence Research , vol.4 , pp. 237-285
- Kaelbling, L.P.¹ Littman, M.L.² Moore, A.W.³

22
- 0028745065
- Generalization in an autonomous agent
- Orlando, Florida. IEEE Inc
- Kalmár, Zs., Szepesvári, Cs. & Lorincz, A. (1994). Generalization in an autonomous agent. In Proc. of IEEE WCCI ICNN'94, volume 3, pages 1815-1817, Orlando, Florida. IEEE Inc.
- (1994) Proc. of IEEE WCCI ICNN'94 , vol.3 , pp. 1815-1817
- Kalmár, Zs.¹ Szepesvári, Cs.² Lorincz, A.³

23
- 0029205333
- Generalized dynamic concept model as a route to construct adaptive autonomous agents
- Kalmár, Zs., Szepesvári, Cs. & Lorincz, A. (1995). Generalized dynamic concept model as a route to construct adaptive autonomous agents. Neural Network World, 5:353-360.
- (1995) Neural Network World , vol.5 , pp. 353-360
- Kalmár, Zs.¹ Szepesvári, Cs.² Lorincz, A.³

24
- 2342560826
- Module based reinforcement learning for a real robot
- Kalmár, Zs., Szepesvári, Cs. & Lorincz, A. (1997). Module based reinforcement learning for a real robot. In Proc. of the 6th European Workshop on Learning Robots, pages 22-32.
- (1997) Proc. of the 6th European Workshop on Learning Robots , pp. 22-32
- Kalmár, Zs.¹ Szepesvári, Cs.² Lorincz, A.³

25
- 2342475494
- Complexity analysis of real-time reinforcement learning applied to finding shortest paths in deterministic domains
- Koenig, S. & Simmons, R.G. (1997). Complexity analysis of real-time reinforcement learning applied to finding shortest paths in deterministic domains. Machine Learning: A Special Issue on Reinforcement Learning, 12:234-345.
- (1997) Machine Learning: A Special Issue on Reinforcement Learning , vol.12 , pp. 234-345
- Koenig, S.¹ Simmons, R.G.²

26
- 0003828024
- Pitman Publisher, Massachusetts
- Korf, R.E. (1985a). Learning to solve problems by searching for macro-operators. Pitman Publisher, Massachusetts.
- (1985) Learning to Solve Problems by Searching for Macro-operators
- Korf, R.E.¹

27
- 0022045044
- Macro-operators: A weak method for learning
- Korf, R.E. (1985b). Macro-operators: A weak method for learning. Artificial Intelligence, 26:35-77.
- (1985) Artificial Intelligence , vol.26 , pp. 35-77
- Korf, R.E.¹

28
- 0023421864
- Planning as search: A quantitative approach
- Korf, R.E. (1987). Planning as search: A quantitative approach. Artificial Intelligence, 33:65-88.
- (1987) Artificial Intelligence , vol.33 , pp. 65-88
- Korf, R.E.¹

29
- 0026961481
- Automatic programming of robots using genetic programming
- Menlo Park, CA. AAAI Press/The MIT Press
- Koza, J.R. & Rice, J.P. (1992). Automatic programming of robots using genetic programming. In Proceedings of Tenth National Conference on Artificial Intelligence, pages 194-201, Menlo Park, CA. AAAI Press/The MIT Press.
- (1992) Proceedings of Tenth National Conference on Artificial Intelligence , pp. 194-201
- Koza, J.R.¹ Rice, J.P.²

30
- 0022062142
- A survey of some results in stochastic adaptive controls
- Kumar, P.R. (1985). A survey of some results in stochastic adaptive controls. SIAM Journal of Control and Optimization, 23:329-380.
- (1985) SIAM Journal of Control and Optimization , vol.23 , pp. 329-380
- Kumar, P.R.¹

31
- 0003861655
- PhD thesis, Department of Computer Science, Brown University. Also Technical Report CS-96-09
- Littman, M.L. (1996). Algorithms for Sequential Decision Making. PhD thesis, Department of Computer Science, Brown University. Also Technical Report CS-96-09.
- (1996) Algorithms for Sequential Decision Making
- Littman, M.L.¹

32
- 0001961616
- A Generalized Reinforcement Learning Model: Convergence and applications
- Littman, M.L. & Szepesvári, Cs. (1996). A Generalized Reinforcement Learning Model: Convergence and applications. In Int. Conf. on Machine Learning, pages 310-318.
- (1996) Int. Conf. on Machine Learning , pp. 310-318
- Littman, M.L.¹ Szepesvári, Cs.²

33
- 2342628252
- A design framework for hierarchical, hybrid control
- submitted
- Lygeros, J., Godbole, D.N. & Sastry, S.S. (1997). A design framework for hierarchical, hybrid control. IEEE Transactions on Automatic Control, special issue on Hybrid Systems, (submitted).
- (1997) IEEE Transactions on Automatic Control, Special Issue on Hybrid Systems
- Lygeros, J.¹ Godbole, D.N.² Sastry, S.S.³

34
- 0010879246
- Adaptive action selection
- Lawrence Erlbaum Associates
- Maes, P. (1991a). Adaptive action selection. In Proc. of the Thirteenth Annual Conf. of the Cognitive Science Society. Lawrence Erlbaum Associates.
- (1991) Proc. of the Thirteenth Annual Conf. of the Cognitive Science Society
- Maes, P.¹

35
- 0002765109
- A bottom-up mechanism for behavior selection in an artificial creature
- J.A. Meyer and S. Wilson, editors, MIT Press
- Maes, P. (1991b). A bottom-up mechanism for behavior selection in an artificial creature. In J.A. Meyer and S. Wilson, editors, Proc. of the First International Conference on Simulation of Adaptive Behavior. MIT Press.
- (1991) Proc. of the First International Conference on Simulation of Adaptive Behavior
- Maes, P.¹

36
- 0002434059
- Learning behavior networks from experience
- MIT Press, Cambridge, Massachusetts
- Maes, P. (1992). Learning behavior networks from experience. In Toward a Practice of Autonomous Systems (Proc. First European Conference on Artificial Life), pages 48-57. MIT Press, Cambridge, Massachusetts.
- (1992) Toward a Practice of Autonomous Systems (Proc. First European Conference on Artificial Life) , pp. 48-57
- Maes, P.¹

37
- 84976813028
- Learning to coordinate behaviors
- Boston, MA
- Maes, P. & Brooks, R.A. (1990). Learning to coordinate behaviors. In Proc. of AAAI-90, pages 796-802, Boston, MA.
- (1990) Proc. of AAAI-90 , pp. 796-802
- Maes, P.¹ Brooks, R.A.²

38
- 0026880130
- Automatic programming of behavior-based robots using reinforcement learning
- Mahadevan, S. & Connell, J. (1992). Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence, 55:311-365.
- (1992) Artificial Intelligence , vol.55 , pp. 311-365
- Mahadevan, S.¹ Connell, J.²

39
- 0030647149
- Reinforcement learning in the multi-robot domain
- Matarić, M. (1997). Reinforcement learning in the multi-robot domain. Autonomous Robots, 4.
- (1997) Autonomous Robots , vol.4
- Matarić, M.¹

40
- 85151432208
- Overcoming incomplete perception with utile distinction memory
- Amherst, Massachusetts. Morgan Kaufmann
- McCallum, R.A. (1993). Overcoming incomplete perception with utile distinction memory. In Proceedings of the Tenth International Conference on Machine Learning, pages 190-196, Amherst, Massachusetts. Morgan Kaufmann.
- (1993) Proceedings of the Tenth International Conference on Machine Learning , pp. 190-196
- McCallum, R.A.¹

41
- 84947933152
- Finite-element methods with local triangulation refinement for continuous reinforcement learning problems
- M.van Someren and G. Widmer, editors, Lecture Notes in Artificial Intelligence, Springer, Berlin
- Munos, R. (1997). Finite-element methods with local triangulation refinement for continuous reinforcement learning problems. In M.van Someren and G. Widmer, editors, Machine Learning: ECML'97 (9th European Conf. on Machine Learning, Proceedings), volume 1224 of Lecture Notes in Artificial Intelligence, pages 170-183. Springer, Berlin.
- (1997) Machine Learning: ECML'97 (9th European Conf. on Machine Learning, Proceedings) , vol.1224 , pp. 170-183
- Munos, R.¹

42
- 0003430412
- Prentice-Hall, Englewood Cliffs, NJ
- Newell, A. & Simon, H.A. (1972). Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ.
- (1972) Human Problem Solving
- Newell, A.¹ Simon, H.A.²

43
- 0344506168
- Reinforcement learning with hierarchies of machines
- Cambridge, MA. MIT Press. in press
- Parr, R. & Russell, S. (1997). Reinforcement learning with hierarchies of machines. In Advances in Neural Information Processing Systems 11, Cambridge, MA. MIT Press. in press.
- (1997) Advances in Neural Information Processing Systems 11
- Parr, R.¹ Russell, S.²

44
- 0004235832
- Princeton University Press, Princeton, NJ
- Pólya, Gy. (1945). How to solve it? Princeton University Press, Princeton, NJ.
- (1945) How to Solve It?
- Pólya, Gy.¹

45
- 0006419257
- Planning with closed-loop macro actions
- AAAI Press/The MIT Press. in press
- Precup, D., Sutton, R.S. & Singh, S.P. (1997). Planning with closed-loop macro actions. In Working notes of the 1997 AAAI Fall Symposium on Model-directed Autonomous Systems. AAAI Press/The MIT Press. in press.
- (1997) Working Notes of the 1997 AAAI Fall Symposium on Model-directed Autonomous Systems
- Precup, D.¹ Sutton, R.S.² Singh, S.P.³

46
- 0003644137
- Holden Day, San Francisco, California
- Ross, S.M. (1970). Applied Probability Models with Optimization Applications. Holden Day, San Francisco, California.
- (1970) Applied Probability Models with Optimization Applications
- Ross, S.M.¹

47
- 0016069798
- Planning in a hierarchy of abstraction spaces
- Sacerdoti, E.D. (1974). Planning in a hierarchy of abstraction spaces. Artificial Intelligence, 5:115-135.
- (1974) Artificial Intelligence , vol.5 , pp. 115-135
- Sacerdoti, E.D.¹

48
- 0006506831
- Algorithms for design of hybrid systems
- Sastry, S. (1997). Algorithms for design of hybrid systems. In Proc. of Int. Conf. of Information Sciences.
- (1997) Proc. of Int. Conf. of Information Sciences
- Sastry, S.¹

49
- 0030145238
- Qualitative system identification: Deriving structure from behavior
- Say, A.C.C. & Kuru, S. (1996). Qualitative system identification: Deriving structure from behavior. Artificial Intelligence, 83(1):75-141.
- (1996) Artificial Intelligence , vol.83 , Issue.1 , pp. 75-141
- Say, A.C.C.¹ Kuru, S.²

50
- 84899022377
- How to dynamically merge Markov decision processes
- Cambridge, MA. MIT Press. in press
- Singh, S. & Cohn, D. (1997). How to dynamically merge Markov decision processes. In Advances in Neural Information Processing Systems 11, Cambridge, MA. MIT Press. in press.
- (1997) Advances in Neural Information Processing Systems 11
- Singh, S.¹ Cohn, D.²

51
- 2342564758
- On the convergence of single-step on-policy reinforcement-learning algorithms
- accepted
- Singh, S., Jaakkola, T., Littman, M.L. & Szepesvári, Cs. (1997). On the convergence of single-step on-policy reinforcement-learning algorithms. Machine Learning. accepted.
- (1997) Machine Learning
- Singh, S.¹ Jaakkola, T.² Littman, M.L.³ Szepesvári, Cs.⁴

52
- 0026962175
- Reinforcement learning with a hierarchy of abstract models
- San Jose, CA. AAAI Press
- Singh, S.P. (1992). Reinforcement learning with a hierarchy of abstract models. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 202-207, San Jose, CA. AAAI Press.
- (1992) Proceedings of the Tenth National Conference on Artificial Intelligence , pp. 202-207
- Singh, S.P.¹

53
- 2142812536
- Learning without state-estimation in partially observable Markovian decision processes
- Singh, S.P., Jaakkola, T. & Jordan, M.I. (1995). Learning without state-estimation in partially observable Markovian decision processes. In Proc. of the Eleventh Machine Learning Conference, pages 284-292.
- (1995) Proc. of the Eleventh Machine Learning Conference , pp. 284-292
- Singh, S.P.¹ Jaakkola, T.² Jordan, M.I.³

54
- 0000723997
- Generalization in reinforcement learning: Successful examples using sparse coarse coding
- Sutton, R.S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. Advances in Neural Information Processing Systems, 8.
- (1996) Advances in Neural Information Processing Systems , vol.8
- Sutton, R.S.¹

55
- 0003617454
- PhD thesis, University of Massachusetts, Amherst, MA
- Sutton, R.S. (1984). Temporal Credit Assignment in Reinforcement Learning. PhD thesis, University of Massachusetts, Amherst, MA.
- (1984) Temporal Credit Assignment in Reinforcement Learning
- Sutton, R.S.¹

56
- 0028742076
- Dynamic Concept Model learns optimal policies
- Orlando, Florida. IEEE Inc
- Szepesvári, Cs. (1994). Dynamic Concept Model learns optimal policies. In Proc. of IEEE WCCI ICNN'94, volume 3, pages 1738-1742, Orlando, Florida. IEEE Inc.
- (1994) Proc. of IEEE WCCI ICNN'94 , vol.3 , pp. 1738-1742
- Szepesvári, C.¹

57
- 84947910334
- Learning and exploitation do not conflict under minimax optimality
- M.van Someren and G. Widmer, editors, Lecture Notes in Artificial Intelligence, Springer, Berlin
- Szepesvári, Cs. (1997a). Learning and exploitation do not conflict under minimax optimality. In M.van Someren and G. Widmer, editors, Machine Learning: ECML'97 (9th European Conf. on Machine Learning, Proceedings), volume 1224 of Lecture Notes in Artificial Intelligence, pages 242-249. Springer, Berlin.
- (1997) Machine Learning: ECML'97 (9th European Conf. on Machine Learning, Proceedings) , vol.1224 , pp. 242-249
- Szepesvári, Cs.¹

58
- 0008876345
- PhD thesis, Bolyai Institute of Mathematics, University of Szeged, Szeged, Aradi vrt. tere 1, HUNGARY, 6720
- Szepesvári, Cs. (1997b). Static and Dynamic Aspects of Optimal Sequential Decision Making. PhD thesis, Bolyai Institute of Mathematics, University of Szeged, Szeged, Aradi vrt. tere 1, HUNGARY, 6720.
- (1997) Static and Dynamic Aspects of Optimal Sequential Decision Making
- Szepesvári, Cs.¹

59
- 2342450292
- Generalized Markov Decision Processes: Dynamic programming and reinforcement learning algorithms
- in preparation
- Szepesvári, Cs. & Littman, M.L. (1997). Generalized Markov Decision Processes: Dynamic programming and reinforcement learning algorithms. Neural Computation. in preparation.
- (1997) Neural Computation
- Szepesvári, Cs.¹ Littman, M.L.²

60
- 84977014241
- Behavior of an adaptive self-organizing autonomous agent working with cues and competing concepts
- Szepesvári, Cs. & Lorincz, A. (1994). Behavior of an adaptive self-organizing autonomous agent working with cues and competing concepts. Adaptive Behavior, 2(2):131-160.
- (1994) Adaptive Behavior , vol.2 , Issue.2 , pp. 131-160
- Szepesvári, Cs.¹ Lorincz, A.²

61
- 0002210775
- Van Nostrand Reinhold, Florence KY
- Thrun, S.B. (1992). The role of exploration in learning control. Van Nostrand Reinhold, Florence KY.
- (1992) The Role of Exploration in Learning Control
- Thrun, S.B.¹

62
- 33749882712
- Finding structure in reinforcement learning
- Gerald Tesauro, David S. Touretzky, and Todd K. Leen, editors, The MIT Press, Cambridge
- Thrun, S. & Schwartz, A. (1995). Finding structure in reinforcement learning. In Gerald Tesauro, David S. Touretzky, and Todd K. Leen, editors, Advances in Neural Information Processing Systems, volume 7, pages 385-392. The MIT Press, Cambridge.
- (1995) Advances in Neural Information Processing Systems , vol.7 , pp. 385-392
- Thrun, S.¹ Schwartz, A.²

63
- 0344195285
- Genetic algorithm with alphabet optimization
- Tóth, G.J., Kovács, Sz. & Lorincz, A. (1995). Genetic algorithm with alphabet optimization. Biological Cybernetics, 73:61-68.
- (1995) Biological Cybernetics , vol.73 , pp. 61-68
- Tóth, G.J.¹ Kovács, Sz.² Lorincz, A.³

64
- 2342562099
- Asynchronous stochastic approximation and Q-learning
- Tsitsiklis, J.N. (1994). Asynchronous stochastic approximation and Q-learning. Machine Learning, 8(3-4):257-277.
- (1994) Machine Learning , vol.8 , Issue.3-4 , pp. 257-277
- Tsitsiklis, J.N.¹

65
- 0029752470
- Feature-based methods for large scale dynamic programming
- Tsitsiklis, J.N. & Van Roy, B. (1996). Feature-based methods for large scale dynamic programming. Machine Learning, 22:59-94.
- (1996) Machine Learning , vol.22 , pp. 59-94
- Tsitsiklis, J.N.¹ Van Roy, B.²

66
- 0008813539
- Technical Report LIDS-P-2322, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology
- Tsitsiklis, J.N. & Van Roy, B. (1995). An analysis of temporal difference learning with function approximation. Technical Report LIDS-P-2322, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology.
- (1995) An Analysis of Temporal Difference Learning with Function Approximation
- Tsitsiklis, J.N.¹ Van Roy, B.²

67
- 0003702006
- PhD thesis, University of Edinburgh
- Tyrrell, T. (1993). Computational Mechanisms for Action Selection. PhD thesis, University of Edinburgh.
- (1993) Computational Mechanisms for Action Selection
- Tyrrell, T.¹

68
- 0030418601
- Behavior coordination for a mobile robot using modular reinforcement learning
- Uchibe, E., Asada, M. & Hosoda, K. (1996). Behavior coordination for a mobile robot using modular reinforcement learning. In Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, pages 1329-1336.
- (1996) Proc. of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems , pp. 1329-1336
- Uchibe, E.¹ Asada, M.² Hosoda, K.³

69
- 34249833101
- Q-learning
- Watkins, C.J.C.H. & Dayan, P. (1992). Q-learning. Machine Learning, 8(3):279-292.
- (1992) Machine Learning , vol.8 , Issue.3 , pp. 279-292
- Watkins, C.J.C.H.¹ Dayan, P.²

70
- 0002557583
- Advanced forecasting methods for global crisis warning and models of intelligence
- Werbos, P.J. (1977). Advanced forecasting methods for global crisis warning and models of intelligence. General Systems Yearbook, 22:25-38.
- (1977) General Systems Yearbook , vol.22 , pp. 25-38
- Werbos, P.J.¹

71
- 0031215211
- HQ-learning
- Wiering, M. & Schmidhuber, J. (1997). HQ-learning. Adaptive Behavior, 6(2).
- (1997) Adaptive Behavior , vol.6 , Issue.2
- Wiering, M.¹ Schmidhuber, J.²

72
- 2342534771
- Optimal control by means of switching
- Zabczyk, J. (1973). Optimal control by means of switching. Studia Mathematica, 65:161-171.
- (1973) Studia Mathematica , vol.65 , pp. 161-171
- Zabczyk, J.¹

* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.