Machine Learning, Volume 81, Issue 3, 2010, Pages 359-397

Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains

Author keywords

Adaptive resolution; Efficient exploration; Kernel functions; Reinforcement learning

Indexed keywords

ADAPTIVE APPROXIMATION; ADAPTIVE RESOLUTION; APPROXIMATION SCHEME; COARSE TO FINE; COARSER RESOLUTION; CONTINUOUS STATE SPACE; DETERMINISTIC DOMAINS; EFFICIENT EXPLORATION; EXPLORATION TECHNIQUES; KERNEL FUNCTION; LEARNING RATES; MISTAKE BOUNDS; MODEL-BASED; ONLINE LEARNING; OPTIMAL VALUE FUNCTIONS; ORIGINAL ALGORITHMS; STATE SPACE; UNCERTAINTY INTERVALS

EID: 78649716899     PISSN: 0885-6125     EISSN: 1573-0565     Source Type: Journal
DOI: 10.1007/s10994-010-5186-7     Document Type: Article
Times cited: 32

References (32)
  • 1
    • Albus, J. S. (1975). A new approach to manipulator control: The cerebellar model articulation controller (CMAC). Journal of Dynamic Systems, Measurement and Control, 97, 220-227.
  • 2
    • Antos, A., Szepesvári, C., & Munos, R. (2008). Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. Machine Learning, 71(1), 89-129. DOI: 10.1007/s10994-007-5038-2.
  • 4
    • Bernstein, A. (2007). Adaptive state aggregation for reinforcement learning. Master's thesis, Technion-Israel Institute of Technology. URL: http://tx.technion.ac.il/~andreyb/MSc-Thesis-final.pdf.
  • 8
    • Boutilier, C., Dean, T., & Hanks, S. (1999). Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, 11, 1-94.
  • 9
    • Brafman, R. I., & Tennenholtz, M. (2002). R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 3, 213-231. DOI: 10.1162/153244303765208377.
  • 11
    • Chow, C.-S., & Tsitsiklis, J. N. (1991). An optimal one-way multigrid algorithm for discrete-time stochastic control. IEEE Transactions on Automatic Control, 36(8), 898-914. DOI: 10.1109/9.133184.
  • 12
    • Doya, K. (2000). Reinforcement learning in continuous time and space. Neural Computation, 12, 219-245. DOI: 10.1162/089976600300015961.
  • 14
    • Kakade, S. M. (2003). On the sample complexity of reinforcement learning. PhD thesis, Gatsby Computational Neuroscience Unit, University College London, UK.
  • 15
    • Kearns, M., & Singh, S. P. (2002). Near-optimal reinforcement learning in polynomial time. Machine Learning, 49(2-3), 209-232. DOI: 10.1023/A:1017984413808.
  • 16
    • Konda, V. R., & Tsitsiklis, J. N. (2003). On actor-critic algorithms. SIAM Journal on Control and Optimization, 42(4), 1143-1166. DOI: 10.1137/S0363012901385691.
  • 18
    • Moore, A. W., & Atkeson, C. G. (1995). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. Machine Learning, 21, 199-233.
  • 19
    • Munos, R., & Moore, A. W. (2002). Variable resolution discretization in optimal control. Machine Learning, 49(2-3), 291-323. DOI: 10.1023/A:1017992615625.
  • 22
    • Ormoneit, D., & Sen, S. (2002). Kernel-based reinforcement learning. Machine Learning, 49(2-3), 161-178. DOI: 10.1023/A:1017928328829.
  • 29
    • Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. In Advances in Neural Information Processing Systems 8 (NIPS) (pp. 1038-1044).
  • 31
    • Unser, M. (1999). Splines: A perfect fit for signal and image processing. IEEE Signal Processing Magazine, 16, 22-38. DOI: 10.1109/79.799930.
  • 32
    • Whitt, W. (1978). Approximations of dynamic programs, I. Mathematics of Operations Research, 3(3), 231-243. DOI: 10.1287/moor.3.3.231.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.