SCOPUS Information Retrieval Platform

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume 7207 LNAI, 2012, Pages 2-6

Beyond reward: The problem of knowledge and data

(1) Sutton, Richard S a

a University of Alberta (Canada)

Author keywords

[No Author keywords available]

Indexed keywords

EID: 84864841464
PISSN: 03029743
EISSN: 16113349
Source Type: Book Series
DOI: 10.1007/978-3-642-31951-8_2
Document Type: Conference Paper

Times cited: 5

References (24)

1
- 85151728371
- Residual algorithms: Reinforcement learning with function approximation
- Baird, L.C.: Residual algorithms: Reinforcement learning with function approximation. In: Proceedings of the Twelfth International Conference on Machine Learning, pp. 30-37 (1995)
- (1995) Proceedings of the Twelfth International Conference on Machine Learning , pp. 30-37
- Baird, L.C.¹

2
- 84864878046
- Scaling-up knowledge for a cognizant robot
- Degris, T., Modayil, J.: Scaling-up knowledge for a cognizant robot. In: Notes of the AAAI Spring Symposium on Designing Intelligent Robots: Reintegrating AI (2012)
- (2012) Notes of the AAAI Spring Symposium on Designing Intelligent Robots: Reintegrating AI
- Degris, T.¹ Modayil, J.²

3
- 84880873347
- Building portable options: Skill transfer in reinforcement learning
- Konidaris, G., Barto, A.G.: Building portable options: Skill transfer in reinforcement learning. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 895-900 (2007)
- (2007) Proceedings of the 20th International Joint Conference on Artificial Intelligence , pp. 895-900
- Konidaris, G.¹ Barto, A.G.²

4
- 54249139924
- MSc. thesis, University of Alberta
- Koop, A.: Investigating Experience: Temporal Coherence and Empirical Knowledge Representation. MSc. thesis, University of Alberta (2007)
- (2007) Investigating Experience: Temporal Coherence and Empirical Knowledge Representation
- Koop, A.¹

5
- 84864655352
- PhD. thesis, University of Alberta
- Maei, H.R.: Gradient Temporal-Difference Learning Algorithms. PhD. thesis, University of Alberta (2011)
- (2011) Gradient Temporal-Difference Learning Algorithms
- Maei, H.R.¹

6
- 77954101982
- GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces
- Maei, H.R., Sutton, R.S.: GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces. In: Proceedings of the Third Conference on Artificial General Intelligence (2010)
- (2010) Proceedings of the Third Conference on Artificial General Intelligence
- Maei, H.R.¹ Sutton, R.S.²

7
- 79951481923
- Convergent temporal-difference learning with arbitrary smooth function approximation
- MIT Press
- Maei, H.R., Szepesvári, C., Bhatnagar, S., Precup, D., Silver, D., Sutton, R.S.: Convergent temporal-difference learning with arbitrary smooth function approximation. In: Advances in Neural Information Processing Systems, vol. 22. MIT Press (2009)
- (2009) Advances in Neural Information Processing Systems , vol.22
- Maei, H.R.¹ Szepesvári, C.² Bhatnagar, S.³ Precup, D.⁴ Silver, D.⁵ Sutton, R.S.⁶

8
- 77956541799
- Toward off-policy learning control with function approximation
- Maei, H.R., Szepesvári, C., Bhatnagar, S., Sutton, R.S.: Toward off-policy learning control with function approximation. In: Proceedings of the 27th International Conference on Machine Learning (2010)
- (2010) Proceedings of the 27th International Conference on Machine Learning
- Maei, H.R.¹ Szepesvári, C.² Bhatnagar, S.³ Sutton, R.S.⁴

9
- 14344250635
- Dynamic abstraction in reinforcement learning via clustering
- Mannor, S., Menache, I., Hoze, A., Klein, U.: Dynamic abstraction in reinforcement learning via clustering. In: Proceedings of the Twenty-First International Conference on Machine Learning (2004)
- (2004) Proceedings of the Twenty-First International Conference on Machine Learning
- Mannor, S.¹ Menache, I.² Hoze, A.³ Klein, U.⁴

10
- 0003543129
- Technical Report 98-70, University of Massachusetts, Department of Computer Science
- McGovern, A., Sutton, R.S.: Macro-actions in reinforcement learning: An empirical analysis. Technical Report 98-70, University of Massachusetts, Department of Computer Science (1998)
- (1998) Macro-actions in reinforcement learning: An empirical analysis
- McGovern, A.¹ Sutton, R.S.²

11
- 84864861073
- Multi-timescale nexting in a reinforcement learning robot
- Modayil, J., White, A., Sutton, R.S.: Multi-timescale nexting in a reinforcement learning robot. In: Proceedings of the 2012 Conference on Simulation of Adaptive Behaviour (to appear, 2012)
- (2012) Proceedings of the 2012 Conference on Simulation of Adaptive Behaviour (to appear)
- Modayil, J.¹ White, A.² Sutton, R.S.³

12
- 0003989214
- PhD thesis, University of California at Berkeley
- Parr, R.: Hierarchical Control and Learning for Markov Decision Processes. PhD thesis, University of California at Berkeley (1998)
- (1998) Hierarchical Control and Learning for Markov Decision Processes
- Parr, R.¹

13
- 0003392384
- PhD thesis, University of Massachusetts
- Precup, D.: Temporal Abstraction in Reinforcement Learning. PhD thesis, University of Massachusetts (2000)
- (2000) Temporal Abstraction in Reinforcement Learning
- Precup, D.¹

14
- 84864829509
- MSc. thesis, University of Alberta
- Rafols, E.J.: Temporal Abstraction in Temporal-difference Networks. MSc. thesis, University of Alberta (2006)
- (2006) Temporal Abstraction in Temporal-difference Networks
- Rafols, E.J.¹

15
- 84899031920
- Intrinsically motivated reinforcement learning
- Singh, S., Barto, A.G., Chentanez, N.: Intrinsically motivated reinforcement learning. In: Advances in Neural Information Processing Systems, vol. 17, pp. 1281-1288 (2005)
- (2005) Advances in Neural Information Processing Systems , vol.17 , pp. 1281-1288
- Singh, S.¹ Barto, A.G.² Chentanez, N.³

16
- 84912073624
- Learning options in reinforcement learning
- In: Koenig, S., Holte, R.C. (eds.) Springer, Heidelberg
- Stolle, M., Precup, D.: Learning Options in Reinforcement Learning. In: Koenig, S., Holte, R.C. (eds.) SARA 2002. LNCS (LNAI), vol. 2371, pp. 212-223. Springer, Heidelberg (2002)
- (2002) SARA 2002. LNCS (LNAI) , vol.2371 , pp. 212-223
- Stolle, M.¹ Precup, D.²

17
- 84864837762
- http://richsutton.com/IncIdeas/KeytoAI.html
- Sutton, R.S.: "Verification" and "Verification, the key to AI" (2001), http://richsutton.com/IncIdeas/Verification.html, http://richsutton.com/IncIdeas/KeytoAI.html
- (2001) Verification and Verification, the key to AI
- Sutton, R.S.¹

18
- 77954125318
- The grand challenge of predictive empirical abstract knowledge
- Sutton, R.S.: The grand challenge of predictive empirical abstract knowledge. In: Working Notes of the IJCAI 2009 Workshop on Grand Challenges for Reasoning from Experiences (2009)
- (2009) Working Notes of the IJCAI 2009 Workshop on Grand Challenges for Reasoning from Experiences
- Sutton, R.S.¹

19
- 0004102479
- MIT Press
- Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press (1998)
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

20
- 71149099079
- Fast gradient-descent methods for temporal-difference learning with linear function approximation
- Sutton, R.S., Maei, H.R., Precup, D., Bhatnagar, S., Silver, D., Szepesvári, C., Wiewiora, E.: Fast gradient-descent methods for temporal-difference learning with linear function approximation. In: Proceedings of the 26th International Conference on Machine Learning (2009)
- (2009) Proceedings of the 26th International Conference on Machine Learning
- Sutton, R.S.¹ Maei, H.R.² Precup, D.³ Bhatnagar, S.⁴ Silver, D.⁵ Szepesvári, C.⁶ Wiewiora, E.⁷

21
- 84899464022
- Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction
- Sutton, R.S., Modayil, J., Delp, M., Degris, T., Pilarski, P.M., White, A., Precup, D.: Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction. In: Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems, AAMAS (2011)
- (2011) Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems, AAMAS
- Sutton, R.S.¹ Modayil, J.² Delp, M.³ Degris, T.⁴ Pilarski, P.M.⁵ White, A.⁶ Precup, D.⁷

22
- 0033170372
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
- Sutton, R.S., Precup, D., Singh, S.: Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence 112, 181-211 (1999)
- (1999) Artificial Intelligence , vol.112 , pp. 181-211
- Sutton, R.S.¹ Precup, D.² Singh, S.³

23
- 77956513316
- A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
- MIT Press
- Sutton, R.S., Szepesvári, C., Maei, H.R.: A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. In: Advances in Neural Information Processing Systems, vol. 21. MIT Press (2009)
- (2009) Advances in Neural Information Processing Systems , vol.21
- Sutton, R.S.¹ Szepesvári, C.² Maei, H.R.³

24
- 0031143730
- An analysis of temporal-difference learning with function approximation
- Tsitsiklis, J.N., Van Roy, B.: An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control 42, 674-690 (1997)
- (1997) IEEE Transactions on Automatic Control , vol.42 , pp. 674-690
- Tsitsiklis, J.N.¹ Van Roy, B.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.