Volume 1, Issue , 2003, Pages 424-431

Reinforcement Learning as Classification: Leveraging Modern Classifiers

Author keywords

[No Author keywords available]

Indexed keywords

ALGORITHMS; CLASSIFICATION (OF INFORMATION); CRYSTAL ORIENTATION; FEATURE EXTRACTION; MARKOV PROCESSES; MONTE CARLO METHODS; NEURAL NETWORKS; PARAMETER ESTIMATION;

EID: 1942420814     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (148)

References (18)
  • 2
    • Collobert, R., & Bengio, S. (2001). SVMTorch: Support vector machines for large-scale regression problems. Journal of Machine Learning Research (JMLR), 1, 143-160.
  • 6
    • Jaakkola, T., Jordan, M., & Singh, S. (1994). On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6, 1185-1201.
  • 7
    • Jaakkola, T., Singh, S. P., & Jordan, M. I. (1995). Reinforcement learning algorithm for partially observable Markov decision problems. Advances in Neural Information Processing Systems 7 (pp. 345-352). Cambridge, Massachusetts: MIT Press.
  • 12
    • Ng, A. Y., Harada, D., & Russell, S. (1999). Policy invariance under reward transformations: Theory and application to reward shaping. Proc. 16th International Conf. on Machine Learning (pp. 278-287). San Francisco, CA: Morgan Kaufmann.
  • 16
    • Wang, H., Tanaka, K., & Griffin, M. (1996). An approach to fuzzy control of nonlinear systems: Stability and design issues. IEEE Transactions on Fuzzy Systems, 4, 14-23.
  • 17
    • Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8, 229-256.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.