Volume, Issue, 2007, Pages

Reinforcement learning in continuous action spaces through sequential Monte Carlo methods

Author keywords

[No Author keywords available]

Indexed keywords

APPROXIMATION ALGORITHMS; LEARNING ALGORITHMS; REINFORCEMENT LEARNING;

EID: 85161968592     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited : (88)

References (14)
  • 1
    • M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2):174-188, 2002. DOI: 10.1109/78.978374
  • 6
    • H. Kimura and S. Kobayashi. Reinforcement learning for continuous action using stochastic gradient ascent. In 5th Intl. Conf. on Intelligent Autonomous Systems, pages 288-295, 1998.
  • 9
    • J. del R. Millan, D. Posenato, and E. Dedieu. Continuous-action Q-learning. Machine Learning, 49(2-3):247-265, 2002. DOI: 10.1023/A:1017988514716
  • 10
    • J. Peters and S. Schaal. Policy gradient methods for robotics. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 2219-2225, 2006. DOI: 10.1109/IROS.2006.282564
  • 11
    • J. C. Santamaria, R. S. Sutton, and A. Ram. Experiments with reinforcement learning in problems with continuous state and action spaces. Adaptive Behavior, 6(2):163-217, 1998.
  • 12
    • A. A. Sherstov and P. Stone. Function approximation via tile coding: Automating parameter choice. In SARA 2005, LNAI, pages 194-205. Springer Verlag, 2005.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.