SCOPUS Information Search Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 6911 LNAI, Issue PART 1, 2011, Pages 312-327

Preference-based policy iteration: Leveraging preference learning for reinforcement learning

Authors (4): Cheng, Weiwei (a); Fürnkranz, Johannes (b); Hüllermeier, Eyke (a); Park, Sang Hyeun (b)

a PHILIPPS UNIVERSITY MARBURG (Germany)

b DARMSTADT UNIVERSITY OF TECHNOLOGY (Germany)

Author keywords

[No Author keywords available]

Indexed keywords

CLASSIFICATION METHODS; NOVEL METHODS; POLICY ITERATION; POLICY MODEL; PREFERENCE LEARNING; PREFERENCE-BASED; QUALITATIVE FEEDBACK; RANKING FUNCTIONS; ROLL-OUTS; SUBFIELDS; LABEL RANKINGS; POLICY LEARNING;

KNOWLEDGE MANAGEMENT; REINFORCEMENT LEARNING; DATA MINING; ITERATIVE METHODS; LEARNING ALGORITHMS; NUMERICAL METHODS; OBJECT ORIENTED PROGRAMMING;

NUMERICAL METHODS; LEARNING SYSTEMS;

EID: 80052414142 PISSN: 03029743 EISSN: 16113349 Source Type: Book Series
DOI: 10.1007/978-3-642-23780-5_30 Document Type: Conference Paper

Times cited: 33

References (17)

1
- 0020970738
- Neuron-like elements that can solve difficult learning control problems
- Barto, A.G., Sutton, R.S., Anderson, C.: Neuron-like elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man and Cybernetics 13, 835-846 (1983)
- (1983) IEEE Transactions on Systems, Man and Cybernetics , vol.13 , pp. 835-846
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.³

2
- 70349984547
- Natural actor-critic algorithms
- Bhatnagar, S., Sutton, R.S., Ghavamzadeh, M., Lee, M.: Natural actor-critic algorithms. Automatica 45(11), 2471-2482 (2009)
- (2009) Automatica , vol.45 , Issue.11 , pp. 2471-2482
- Bhatnagar, S.¹ Sutton, R.S.² Ghavamzadeh, M.³ Lee, M.⁴

3
- 48349140736
- Rollout sampling approximate policy iteration
- Dimitrakakis, C., Lagoudakis, M.G.: Rollout sampling approximate policy iteration. Machine Learning 72(3), 157-171 (2008)
- (2008) Machine Learning , vol.72 , Issue.3 , pp. 157-171
- Dimitrakakis, C.¹ Lagoudakis, M.G.²

4
- 33744466799
- Approximate policy iteration with a policy language bias: Solving relational Markov decision processes
- Fern, A., Yoon, S.W., Givan, R.: Approximate policy iteration with a policy language bias: Solving relational Markov decision processes. Journal of Artificial Intelligence Research 25, 75-118 (2006)
- (2006) Journal of Artificial Intelligence Research , vol.25 , pp. 75-118
- Fern, A.¹ Yoon, S.W.² Givan, R.³

5
- 84884290516
- Preference Learning
- Fürnkranz, J., Hüllermeier, E. (eds.): Preference Learning. Springer, Heidelberg (2010)
- (2010) Preference Learning
- Fürnkranz, J.¹ Hüllermeier, E.²

6
- 84865254474
- Rollout allocation strategies for classification-based policy iteration
- Auer, P., Kaski, S., Szepesvári, C. (eds.)
- Gabillon, V., Lazaric, A., Ghavamzadeh, M.: Rollout allocation strategies for classification-based policy iteration. In: Auer, P., Kaski, S., Szepesvári, C. (eds.) Proceedings of the ICML 2010 Workshop on Reinforcement Learning and Search in Very Large Spaces (2010)
- Proceedings of the ICML 2010 Workshop on Reinforcement Learning and Search in Very Large Spaces (2010)
- Gabillon, V.¹ Lazaric, A.² Ghavamzadeh, M.³

7
- 76749092270
- The WEKA data mining software: An update
- Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.: The WEKA data mining software: An update. SIGKDD Explorations 11(1), 10-18 (2009)
- (2009) SIGKDD Explorations , vol.11 , Issue.1 , pp. 10-18
- Hall, M.¹ Frank, E.² Holmes, G.³ Pfahringer, B.⁴ Reutemann, P.⁵ Witten, I.⁶

8
- 52949143827
- Label ranking by learning pairwise preferences
- Hüllermeier, E., Fürnkranz, J., Cheng, W., Brinker, K.: Label ranking by learning pairwise preferences. Artificial Intelligence 172, 1897-1916 (2008)
- (2008) Artificial Intelligence , vol.172 , pp. 1897-1916
- Hüllermeier, E.¹ Fürnkranz, J.² Cheng, W.³ Brinker, K.⁴

9
- 56449088242
- Non-parametric policy gradients: A unified treatment of propositional and relational domains
- Cohen, W.W., McCallum, A., Roweis, S.T. (eds.) ACM, Helsinki
- Kersting, K., Driessens, K.: Non-parametric policy gradients: a unified treatment of propositional and relational domains. In: Cohen, W.W., McCallum, A., Roweis, S.T. (eds.) Proceedings of the 25th International Conference on Machine Learning (ICML 2008), pp. 456-463. ACM, Helsinki (2008)
- (2008) Proceedings of the 25th International Conference on Machine Learning (ICML 2008) , pp. 456-463
- Kersting, K.¹ Driessens, K.²

10
- 4043069840
- On actor-critic algorithms
- Konda, V.R., Tsitsiklis, J.N.: On actor-critic algorithms. SIAM Journal on Control and Optimization 42(4), 1143-1166 (2003)
- (2003) SIAM Journal on Control and Optimization , vol.42 , Issue.4 , pp. 1143-1166
- Konda, V.R.¹ Tsitsiklis, J.N.²

11
- 1942420814
- Reinforcement learning as classification: Leveraging modern classifiers
- Fawcett, T.E., Mishra, N. (eds.) AAAI Press, Washington, DC
- Lagoudakis, M.G., Parr, R.: Reinforcement learning as classification: Leveraging modern classifiers. In: Fawcett, T.E., Mishra, N. (eds.) Proceedings of the 20th International Conference on Machine Learning (ICML 2003), pp. 424-431. AAAI Press, Washington, DC (2003)
- (2003) Proceedings of the 20th International Conference on Machine Learning (ICML 2003) , pp. 424-431
- Lagoudakis, M.G.¹ Parr, R.²

12
- 33847202724
- Learning to predict by the methods of temporal differences
- Sutton, R.S.: Learning to predict by the methods of temporal differences. Machine Learning 3, 9-44 (1988)
- (1988) Machine Learning , vol.3 , pp. 9-44
- Sutton, R.S.¹

13
- 84898939480
- Policy gradient methods for reinforcement learning with function approximation
- Solla, S.A., Leen, T.K., Müller, K.-R. (eds.) MIT Press, Denver
- Sutton, R.S., McAllester, D.A., Singh, S.P., Mansour, Y.: Policy gradient methods for reinforcement learning with function approximation. In: Solla, S.A., Leen, T.K., Müller, K.-R. (eds.) Advances in Neural Information Processing Systems 12 (NIPS-1999), pp. 1057-1063. MIT Press, Denver (1999)
- (1999) Advances in Neural Information Processing Systems 12 (NIPS-1999) , pp. 1057-1063
- Sutton, R.S.¹ McAllester, D.A.² Singh, S.P.³ Mansour, Y.⁴

14
- 84890217212
- Label ranking algorithms: A survey
- Fürnkranz and Hüllermeier
- Vembu, S., Gärtner, T.: Label ranking algorithms: A survey. In: Fürnkranz and Hüllermeier [5], pp. 45-64.
- Preference Learning , pp. 45-64
- Vembu, S.¹ Gärtner, T.²

15
- 34249833101
- Q-learning
- Watkins, C.J., Dayan, P.: Q-learning. Machine Learning 8, 279-292 (1992)
- (1992) Machine Learning , vol.8 , pp. 279-292
- Watkins, C.J.¹ Dayan, P.²

16
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 229-256 (1992)
- (1992) Machine Learning , vol.8 , pp. 229-256
- Williams, R.J.¹

17
- 70449449564
- Reinforcement learning design for cancer clinical trials
- Zhao, Y., Kosorok, M., Zeng, D.: Reinforcement learning design for cancer clinical trials. Statistics in Medicine 28, 3295-3315 (2009)
- (2009) Statistics in Medicine , vol.28 , pp. 3295-3315
- Zhao, Y.¹ Kosorok, M.² Zeng, D.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.