SCOPUS Information Retrieval Platform

Journal of Artificial Intelligence Research

Volume 45, 2012, Pages 515-564

Safe exploration of state and action spaces in reinforcement learning

(2) Garcia, Javier a Fernandez, Fernando a

a UNIVERSIDAD CARLOS III DE MADRID (Spain)

Author keywords

[No Author keywords available]

Indexed keywords

ACTION SPACES; BUSINESS MANAGEMENT; COMPLEX TRANSITIONS; CONTINUOUS STATE; EXPLORATION TECHNIQUES; HIGH-DIMENSIONAL; ROBUST BEHAVIOR; TRIAL-AND-ERROR PROCESS;

ARTIFICIAL INTELLIGENCE;

REINFORCEMENT LEARNING;

EID: 84875199879 PISSN: None EISSN: 10769757 Source Type: Journal
DOI: 10.1613/jair.3761 Document Type: Article

Times cited : (162)

References (62)

1
- 0028401306
- Case-based reasoning: Foundational issues, methodological variations, and system approaches
- Aamodt, A., & Plaza, E. (1994). Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. AI Communications, 7 (1), 39-59.
- (1994) AI Communications , vol.7 , Issue.1 , pp. 39-59
- Aamodt, A.¹ Plaza, E.²

2
- 84883027643
- Autonomous Autorotation of an RC Helicopter
- Abbeel, P., Coates, A., Hunter, T., & Ng, A. Y. (2008). Autonomous Autorotation of an RC Helicopter. In ISER, pp. 385-394.
- (2008) ISER , pp. 385-394
- Abbeel, P.¹ Coates, A.² Hunter, T.³ Ng, A.Y.⁴

3
- 77955809093
- Autonomous helicopter aerobatics through apprenticeship learning
- Abbeel, P., Coates, A., & Ng, A. Y. (2010). Autonomous helicopter aerobatics through apprenticeship learning. I. J. Robotic Res., 29 (13), 1608-1639.
- (2010) I. J. Robotic Res. , vol.29 , Issue.13 , pp. 1608-1639
- Abbeel, P.¹ Coates, A.² Ng, A.Y.³

4
- 50249164874
- Robocup 2007: Robot soccer world cup xi
- Springer-Verlag, Berlin, Heidelberg
- Abbott, R. G. (2008). Robocup 2007: Robot soccer world cup xi. chap. Behavioral Cloning for Simulator Validation, pp. 329-336. Springer-Verlag, Berlin, Heidelberg.
- (2008) Chap. Behavioral Cloning for Simulator Validation , pp. 329-336
- Abbott, R.G.¹

5
- 0000217085
- Tolerating noisy, irrelevant and novel attributes in instance-based learning algorithms
- Aha, D. W. (1992). Tolerating Noisy, Irrelevant and Novel Attributes in Instance-Based Learning Algorithms. International Journal of Man-Machine Studies, 36 (2), 267-287.
- (1992) International Journal of Man-Machine Studies , vol.36 , Issue.2 , pp. 267-287
- Aha, D.W.¹

6
- 0025725905
- Instance-based learning algorithms
- Aha, D. W., & Kibler, D. (1991). Instance-based learning algorithms. In Machine Learning, pp. 37-66.
- (1991) Machine Learning , pp. 37-66
- Aha, D.W.¹ Kibler, D.²

7
- 1942515258
- Behavioral cloning of student pilots with modular neural networks
- Morgan Kaufmann
- Anderson, C. W., Draper, B. A., & Peterson, D. A. (2000). Behavioral cloning of student pilots with modular neural networks. In Proceedings of the Seventeenth International Conference on Machine Learning, pp. 25-32. Morgan Kaufmann.
- (2000) Proceedings of the Seventeenth International Conference on Machine Learning , pp. 25-32
- Anderson, C.W.¹ Draper, B.A.² Peterson, D.A.³

8
- 63149159130
- A survey of robot learning from demonstration
- Argall, B., Chernova, S., Veloso, M., & Browning, B. (2009). A Survey of Robot Learning from Demonstration. Robotics and Autonomous Systems, 57 (5), 469-483.
- (2009) Robotics and Autonomous Systems , vol.57 , Issue.5 , pp. 469-483
- Argall, B.¹ Chernova, S.² Veloso, M.³ Browning, B.⁴

9
- 84901708832
- Case-based reasoning: Survey and future directions
- Puppe, F. (Ed.), Vol. 1570 of Lecture Notes in Computer Science, Springer
- Bartsch-Spörl, B., Lenz, M., & Hübner, A. (1999). Case-based reasoning: Survey and future directions. In Puppe, F. (Ed.), XPS, Vol. 1570 of Lecture Notes in Computer Science, pp. 67-89. Springer.
- (1999) XPS , pp. 67-89
- Bartsch-Spörl, B.¹ Lenz, M.² Hübner, A.³

10
- 70350352555
- Improving reinforcement learning by using case-based heuristics
- Lecture Notes in Artificial Intelligence, Springer
- Bianchi, R., Ros, R., & de Mántaras, R. L. (2009). Improving reinforcement learning by using case-based heuristics. Vol. 5650, pp. 75-89. Lecture Notes in Artificial Intelligence, Springer.
- (2009) Lecture Notes in Artificial Intelligence , vol.5650 , pp. 75-89
- Bianchi, R.¹ Ros, R.² De Mántaras, R.L.³

11
- 72249118874
- SIMBA: A simulator for business education and research
- Borrajo, F., Bueno, Y., de Pablo, I., Santos, B. N., Fernandez, F., Garcia, J., & Sagredo, I. (2010). SIMBA: A Simulator for Business Education and Research. Decision Support Systems, 48 (3), 498-506.
- (2010) Decision Support Systems , vol.48 , Issue.3 , pp. 498-506
- Borrajo, F.¹ Bueno, Y.² De Pablo, I.³ Santos, B.N.⁴ Fernandez, F.⁵ García, J.⁶ Sagredo, I.⁷

12
- 84875147306
- Proceedings of the workshop on value function approximation, machine learning conference 1995
- Boyan, J., Moore, A., & Sutton, R. (1995). Proceedings of the workshop on value function approximation, machine learning conference 1995. Technical Report CMU-CS-95-206.
- (1995) Technical Report CMU-CS-95-206
- Boyan, J.¹ Moore, A.² Sutton, R.³

13
- 60349110367
- Confidence-based policy learning from demonstration using gaussian mixture models
- Chernova, S., & Veloso, M. (2007). Confidence-based policy learning from demonstration using gaussian mixture models. In Joint Conference on Autonomous Agents and Multi-Agent Systems.
- (2007) Joint Conference on Autonomous Agents and Multi-Agent Systems
- Chernova, S.¹ Veloso, M.²

14
- 67650691600
- Multi-thresholded approach to demonstration selection for interactive robot learning
- New York, NY, USA. ACM
- Chernova, S., & Veloso, M. (2008). Multi-thresholded approach to demonstration selection for interactive robot learning. In Proceedings of the 3rd ACM/IEEE international conference on Human robot interaction, HRI '08, pp. 225-232, New York, NY, USA. ACM.
- (2008) Proceedings of the 3rd ACM/IEEE International Conference on Human Robot Interaction, HRI '08 , pp. 225-232
- Chernova, S.¹ Veloso, M.²

15
- 0007512578
- Truncating temporal differences: On the efficient implementation of td(lambda) for reinforcement learning
- Cichosz, P. (1995). Truncating temporal differences: On the efficient implementation of td(lambda) for reinforcement learning. Journal of Artificial Intelligence Research (JAIR), 2, 287-318.
- (1995) Journal of Artificial Intelligence Research (JAIR) , vol.2 , pp. 287-318
- Cichosz, P.¹

16
- 84875144505
- Truncated temporal differences with function approximation: Successful examples using cmac
- Cichosz, P. (1996). Truncated temporal differences with function approximation: Successful examples using cmac. In Proceedings of the Thirteenth European Symposium on Cybernetics and Systems Research (EMCSR-96).
- (1996) Proceedings of the Thirteenth European Symposium on Cybernetics and Systems Research (EMCSR-96)
- Cichosz, P.¹

17
- 0033077715
- Risk-sensitive and minimax control of discrete-time, finite-state markov decision processes
- Coraluppi, S. P., & Marcus, S. I. (1999). Risk-Sensitive and Minimax Control of Discrete-Time, Finite-State Markov Decision Processes. Automatica, 35, 301-309.
- (1999) Automatica , vol.35 , pp. 301-309
- Coraluppi, S.P.¹ Marcus, S.I.²

18
- 77956549914
- Risk-aware decision making and dynamic programming
- Defourny, B., Ernst, D., & Wehenkel, L. (2008). Risk-aware decision making and dynamic programming. In NIPS 2008 Workshop on Model Uncertainty and Risk in RL.
- (2008) NIPS 2008 Workshop on Model Uncertainty and Risk in RL
- Defourny, B.¹ Ernst, D.² Wehenkel, L.³

19
- 1942421161
- Relational instance based regression for relational rl
- Driessens, K., & Ramon, J. (2003). Relational instance based regression for relational rl. In International Conference of Machine Learning (ICML), pp. 123-130.
- (2003) International Conference of Machine Learning (ICML) , pp. 123-130
- Driessens, K.¹ Ramon, J.²

20
- 4444312102
- Integrating guidance into relational reinforcement learning
- Driessens, K., & Džeroski, S. (2004). Integrating guidance into relational reinforcement learning. Machine Learning, 57 (3), 271-304.
- (2004) Machine Learning , vol.57 , Issue.3 , pp. 271-304
- Driessens, K.¹ Džeroski, S.²

21
- 39549117816
- Local feature weighting in nearest prototype classification
- Fernandez, F., & Isasi, P. (2008). Local feature weighting in nearest prototype classification. Neural Networks, IEEE Transactions on, 19 (1), 40-53.
- (2008) Neural Networks, IEEE Transactions on , vol.19 , Issue.1 , pp. 40-53
- Fernandez, F.¹ Isasi, P.²

22
- 38949129339
- Two steps reinforcement learning
- Fernandez, F., & Borrajo, D. (2008). Two steps reinforcement learning. International Journal of Intelligent Systems, 23 (2), 213-245.
- (2008) International Journal of Intelligent Systems , vol.23 , Issue.2 , pp. 213-245
- Fernandez, F.¹ Borrajo, D.²

23
- 80052252232
- Toward a domain-independent case-based reasoning approach for imitation: Three case studies in gaming
- Floyd, M. W., & Esfandiari, B. (2010). Toward a domain-independent case-based reasoning approach for imitation: Three case studies in gaming. In Workshop on Case-Based Reasoning for Computer Games at the 18th International Conference on Case-Based Reasoning (ICCBR), pp. 55-64.
- (2010) Workshop on Case-Based Reasoning for Computer Games at the 18th International Conference on Case-Based Reasoning (ICCBR) , pp. 55-64
- Floyd, M.W.¹ Esfandiari, B.²

24
- 52449097334
- A case-based reasoning approach to imitating robocup players
- Floyd, M. W., Esfandiari, B., & Lam, K. (2008). A Case-Based Reasoning Approach to Imitating Robocup Players. In Proceedings of the 21st International Florida Artificial Intelligence Research Society Conference, pp. 251-256.
- (2008) Proceedings of the 21st International Florida Artificial Intelligence Research Society Conference , pp. 251-256
- Floyd, M.W.¹ Esfandiari, B.² Lam, K.³

25
- 1942419282
- The University of New South
- Forbes, J., & Andre, D. (2002). Representations for learning control policies. In The University of New South, pp. 7-14.
- (2002) Representations for Learning Control Policies , pp. 7-14
- Forbes, J.¹ Andre, D.²

26
- 26944491842
- Cbr for state value function approximation in reinforcement learning
- Springer
- Gabel, T., & Riedmiller, M. (2005). Cbr for state value function approximation in reinforcement learning. In Proceedings of the 6th International Conference on Case-Based Reasoning (ICCBR 2005), pp. 206-221. Springer.
- (2005) Proceedings of the 6th International Conference on Case-Based Reasoning (ICCBR 2005) , pp. 206-221
- Gabel, T.¹ Riedmiller, M.²

27
- 13444290317
- Reinforcement learning with bounded risk
- Morgan Kaufmann
- Geibel, P. (2001). Reinforcement Learning with Bounded Risk. In Proceedings of the 18th International Conference on Machine Learning, pp. 162-169. Morgan Kaufmann.
- (2001) Proceedings of the 18th International Conference on Machine Learning , pp. 162-169
- Geibel, P.¹

28
- 31144477417
- Risk-sensitive reinforcement learning applied to control under constraints
- Geibel, P., & Wysotzki, F. (2005). Risk-sensitive Reinforcement Learning Applied to Control under Constraints. Journal of Artificial Intelligence Research (JAIR), 24, 81-108.
- (2005) Journal of Artificial Intelligence Research (JAIR) , vol.24 , pp. 81-108
- Geibel, P.¹ Wysotzki, F.²

29
- 79956136559
- Safe exploration for reinforcement learning
- Hans, A., Schneegass, D., Schäfer, A. M., & Udluft, S. (2008). Safe Exploration for Reinforcement Learning. In European Symposium on Artificial Neural Network, pp. 143-148.
- (2008) European Symposium on Artificial Neural Network , pp. 143-148
- Hans, A.¹ Schneegass, D.² Schäfer, A.M.³ Udluft, S.⁴

30
- 85120861483
- Consideration of risk in reinforcement learning
- Heger, M. (1994). Consideration of Risk in Reinforcement Learning. In 11th International Conference on Machine Learning, pp. 105-111.
- (1994) 11th International Conference on Machine Learning , pp. 105-111
- Heger, M.¹

31
- 55749100315
- Seeding the initial population of a multi-objective evolutionary algorithm using gradient-based information
- IEEE
- Hernández-Díaz, A. G., Coello, C. A. C., Perez, F., Caballero, R., Luque, J. M., & Santana-Quintero, L. V. (2008). Seeding the initial population of a multi-objective evolutionary algorithm using gradient-based information. In IEEE Congress on Evolutionary Computation, pp. 1617-1624. IEEE.
- (2008) IEEE Congress on Evolutionary Computation , pp. 1617-1624
- Hernández-Díaz, A.G.¹ Coello, C.A.C.² Perez, F.³ Caballero, R.⁴ Luque, J.M.⁵ Santana-Quintero, L.V.⁶

32
- 84875140551
- Tech. rep. arXiv e-Prints 1105.1749, arXiv
- Hester, T., Quinlan, M., & Stone, P. (2011). A real-time model-based reinforcement learning architecture for robot control. Tech. rep. arXiv e-Prints 1105.1749, arXiv.
- (2011) A Real-time Model-based Reinforcement Learning Architecture for Robot Control
- Hester, T.¹ Quinlan, M.² Stone, P.³

33
- 84867438662
- Essex wizards 2001 team description
- Birk, A. Coradeschi, S. & Tadokoro, S. (Eds.), Vol. 2377 of Lecture Notes in Computer Science, Springer
- Hu, H., Kostiadis, K., Hunter, M., & Kalyviotis, N. (2001). Essex wizards 2001 team description. In Birk, A., Coradeschi, S., & Tadokoro, S. (Eds.), RoboCup, Vol. 2377 of Lecture Notes in Computer Science, pp. 511-514. Springer.
- (2001) RoboCup , pp. 511-514
- Hu, H.¹ Kostiadis, K.² Hunter, M.³ Kalyviotis, N.⁴

34
- 84875156987
- Jiang, A. X. (2004). Multiagent reinforcement learning in stochastic games with continuous action spaces.
- (2004) Multiagent Reinforcement Learning in Stochastic Games with Continuous Action Spaces
- Jiang, A.X.¹

35
- 0029679044
- Reinforcement learning: A survey
- Kaelbling, L., Littman, M., & Moore, A. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research (JAIR), 4, 237-285.
- (1996) Journal of Artificial Intelligence Research (JAIR) , vol.4 , pp. 237-285
- Kaelbling, L.¹ Littman, M.² Moore, A.³

36
- 84866396617
- Reinforcement learning for games: Failures and successes
- New York, NY, USA. ACM
- Konen, W., & Bartz-Beielstein, T. (2009). Reinforcement learning for games: failures and successes. In Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, GECCO '09, pp. 2641-2648, New York, NY, USA. ACM.
- (2009) Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, GECCO '09 , pp. 2641-2648
- Konen, W.¹ Bartz-Beielstein, T.²

37
- 72749107057
- Neuroevolutionary reinforcement learning for generalized helicopter control
- Koppejan, R., & Whiteson, S. (2009). Neuroevolutionary reinforcement learning for generalized helicopter control. In GECCO 2009: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 145-152.
- (2009) GECCO 2009: Proceedings of the Genetic and Evolutionary Computation Conference , pp. 145-152
- Koppejan, R.¹ Whiteson, S.²

38
- 80955137547
- Neuroevolutionary reinforcement learning for generalized control of simulated helicopters
- Koppejan, R., & Whiteson, S. (2011). Neuroevolutionary reinforcement learning for generalized control of simulated helicopters. Evolutionary Intelligence, 4, 219-241.
- (2011) Evolutionary Intelligence , vol.4 , pp. 219-241
- Koppejan, R.¹ Whiteson, S.²

39
- 84875159999
- No. May
- Lee, J.-Y., & Lee, J.-J. (2008). Multiple Designs of Fuzzy Controllers for Car Parking Using Evolutionary Algorithm, pp. 1-6. No. May.
- (2008) Multiple Designs of Fuzzy Controllers for Car Parking Using Evolutionary Algorithm , pp. 1-6
- Lee, J.-Y.¹ Lee, J.-J.²

40
- 0004281114
- Oxford University Press
- Luenberger, D. G. (1998). Investment science. Oxford University Press.
- (1998) Investment Science
- Luenberger, D.G.¹

41
- 9444276079
- Reinforcement learning for average reward zero-sum games
- Shawe-Taylor, J. & Singer, Y. (Eds.), Vol. 3120 of Lecture Notes in Computer Science, Springer
- Mannor, S. (2004). Reinforcement learning for average reward zero-sum games. In Shawe-Taylor, J., & Singer, Y. (Eds.), COLT, Vol. 3120 of Lecture Notes in Computer Science, pp. 49-63. Springer.
- (2004) COLT , pp. 49-63
- Mannor, S.¹

42
- 77951530503
- Exa: An effective algorithm for continuous actions reinforcement learning problems
- Martin H, J., & de Lope, J. (2009). Exa: An effective algorithm for continuous actions reinforcement learning problems. In Industrial Electronics, 2009. IECON '09. 35th Annual Conference of IEEE, pp. 2063-2068.
- (2009) Industrial Electronics, 2009. IECON '09. 35th Annual Conference of IEEE , pp. 2063-2068
- Martin H, J.¹ De Lope, J.²

43
- 78651248230
- Learning autonomous helicopter flight with evolutionary reinforcement learning
- Martín H., J. A., & Lope, J. (2009). Learning Autonomous Helicopter Flight with Evolutionary Reinforcement Learning. In 12th International Conference on Computer Aided Systems Theory (EUROCAST), pp. 75-82.
- (2009) 12th International Conference on Computer Aided Systems Theory (EUROCAST) , pp. 75-82
- Martín, H.J.A.¹ Lope, J.²

44
- 0036832952
- Risk-Sensitive reinforcement learning
- Mihatsch, O., & Neuneier, R. (2002). Risk-Sensitive reinforcement learning. Machine Learning, 49 (2-3), 267-290.
- (2002) Machine Learning , vol.49 , Issue.2-3 , pp. 267-290
- Mihatsch, O.¹ Neuneier, R.²

45
- 84867130083
- CoRR, abs/1205.4810
- Moldovan, T. M., & Abbeel, P. (2012). Safe exploration in markov decision processes. CoRR, abs/1205.4810.
- (2012) Safe Exploration in Markov Decision Processes
- Moldovan, T.M.¹ Abbeel, P.²

46
- 0016082525
- Learning automata - A survey
- Narendra, K. S., & Thathachar, M. A. L. (1974). Learning automata - a survey. IEEE Transactions on Systems, Man, and Cybernetics, SMC-4(4), 323-334.
- (1974) IEEE Transactions on Systems, Man, and Cybernetics, SMC-4 , Issue.4 , pp. 323-334
- Narendra, K.S.¹ Thathachar, M.A.L.²

47
- 0003891507
- Prentice-Hall, Inc. Upper Saddle River, NJ, USA
- Narendra, K. S., & Thathachar, M. A. L. (1989). Learning automata: an introduction. Prentice-Hall, Inc., Upper Saddle River, NJ, USA.
- (1989) Learning Automata: An Introduction
- Narendra, K.S.¹ Thathachar, M.A.L.²

48
- 3042583887
- Autonomous helicopter flight via reinforcement learning
- Thrun, S. Saul, L. K. & Schölkopf, B. (Eds.), MIT Press
- Ng, A. Y., Kim, H. J., Jordan, M. I., & Sastry, S. (2003). Autonomous Helicopter Flight via Reinforcement Learning. In Thrun, S., Saul, L. K., & Schölkopf, B. (Eds.), NIPS. MIT Press.
- (2003) NIPS
- Ng, A.Y.¹ Kim, H.J.² Jordan, M.I.³ Sastry, S.⁴

49
- 84875131154
- Robot learning
- Sammut, C. & Webb, G. I. (Eds.), Springer
- Peters, J., Tedrake, R., Roy, N., & Morimoto, J. (2010). Robot learning. In Sammut, C., & Webb, G. I. (Eds.), Encyclopedia of Machine Learning, pp. 865-869. Springer.
- (2010) Encyclopedia of Machine Learning , pp. 865-869
- Peters, J.¹ Tedrake, R.² Roy, N.³ Morimoto, J.⁴

50
- 0242667271
- Genetic programming with user-driven selection: Experiments on the evolution of algorithms for image enhancement
- Morgan Kaufmann
- Poli, R., & Cagnoni, S. (1997). Genetic programming with user-driven selection: Experiments on the evolution of algorithms for image enhancement. In Genetic Programming 1997: Proceedings of the Second Annual Conference, pp. 269-277. Morgan Kaufmann.
- (1997) Genetic Programming 1997: Proceedings of the Second Annual Conference , pp. 269-277
- Poli, R.¹ Cagnoni, S.²

51
- 62949112174
- A collaborative reinforcement learning approach to urban traffic control optimization
- Salkham, A., Cunningham, R., Garg, A., & Cahill, V. (2008). A collaborative reinforcement learning approach to urban traffic control optimization. In Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on, Vol. 2, pp. 560-566.
- (2008) Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on , vol.2 , pp. 560-566
- Salkham, A.¹ Cunningham, R.² Garg, A.³ Cahill, V.⁴

52
- 0031231885
- Experiments with reinforcement learning in problems with continuous state and action spaces
- Santamaría, J. C., Sutton, R. S., & Ram, A. (1998). Experiments with reinforcement learning in problems with continuous state and action spaces. Adaptive Behavior, 6, 163-218.
- (1998) Adaptive Behavior , vol.6 , pp. 163-218
- Santamaría, J.C.¹ Sutton, R.S.² Ram, A.³

53
- 80054035256
- Transfer learning in real-time strategy games using hybrid cbr/rl
- Sharma, M., Holmes, M., Santamaria, J., Irani, A., Isbell, C., & Ram, A. (2007). Transfer learning in real-time strategy games using hybrid cbr/rl. In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence.
- (2007) Proceedings of the Twentieth International Joint Conference on Artificial Intelligence
- Sharma, M.¹ Holmes, M.² Santamaria, J.³ Irani, A.⁴ Isbell, C.⁵ Ram, A.⁶

54
- 55749091103
- Evolutionary reinforcement learning of artificial neural networks
- Siebel, N. T., & Sommer, G. (2007). Evolutionary reinforcement learning of artificial neural networks. International Journal of Hybrid Intelligent Systems, 4, 171-183.
- (2007) International Journal of Hybrid Intelligent Systems , vol.4 , pp. 171-183
- Siebel, N.T.¹ Sommer, G.²

55
- 0001898381
- Practical reinforcement learning in continuous spaces
- Morgan Kaufmann
- Smart, W. D., & Kaelbling, L. P. (2000). Practical reinforcement learning in continuous spaces. In Artificial Intelligence, pp. 903-910. Morgan Kaufmann.
- (2000) Artificial Intelligence , pp. 903-910
- Smart, W.D.¹ Kaelbling, L.P.²

56
- 0036058423
- Effective reinforcement learning for mobile robots
- IEEE
- Smart, W. D., & Kaelbling, L. P. (2002). Effective reinforcement learning for mobile robots. In ICRA, pp. 3404-3410. IEEE.
- (2002) ICRA , pp. 3404-3410
- Smart, W.D.¹ Kaelbling, L.P.²

57
- 0004102479
- The MIT Press
- Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. The MIT Press.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

58
- 77955839705
- Parameterized maneuver learning for autonomous helicopter flight
- Tang, J., Singh, A., Goehausen, N., & Abbeel, P. (2010). Parameterized maneuver learning for autonomous helicopter flight. In International Conference on Robotics and Automation (ICRA).
- (2010) International Conference on Robotics and Automation (ICRA)
- Tang, J.¹ Singh, A.² Goehausen, N.³ Abbeel, P.⁴

59
- 84899448409
- Metric learning for reinforcement learning agents
- Taylor, M. E., Kulis, B., & Sha, F. (2011). Metric learning for reinforcement learning agents. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS).
- (2011) Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS)
- Taylor, M.E.¹ Kulis, B.² Sha, F.³

60
- 34548807200
- Reinforcement learning in continuous action spaces
- Van Hasselt, H., & Wiering, M. A. (2007). Reinforcement Learning in Continuous Action Spaces. In Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on, pp. 272-279.
- (2007) Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on , pp. 272-279
- Van Hasselt, H.¹ Wiering, M.A.²

61
- 0008954974
- University of Edinburgh
- Wyatt, J. (1997). Exploration and Inference in Learning from Reinforcement. University of Edinburgh.
- (1997) Exploration and Inference in Learning from Reinforcement
- Wyatt, J.¹

62
- 0033362601
- Evolving artificial neural networks
- Yao, X. (1999). Evolving artificial neural networks. PIEEE: Proceedings of the IEEE, 87, 1423-1447.
- (1999) PIEEE: Proceedings of the IEEE , vol.87 , pp. 1423-1447
- Yao, X.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.