Volume 6, Issue 2, 2010, Pages 577-590

A reinforcement learning model using deterministic state-action sequences

Author keywords

Inductive bias; Macro action; Radial basis function; Reinforcement learning

Indexed keywords

ACTION SELECTION; ACTION SEQUENCES; ACTOR CRITIC; EXPLICIT FORM; EXPLORATION AND EXPLOITATION; INDUCTIVE BIAS; LONG TERM MEMORY; LOW TEMPERATURES; NEURAL MODELS; NEW APPROACHES; OPTIMAL ACTION POLICIES; PRIMITIVE ACTIONS; RADIAL BASIS FUNCTIONS; RADIAL-BASIS FUNCTION; REINFORCEMENT LEARNING MODELS; TEMPERATURE PARAMETERS

EID: 77649312856     PISSN: 13494198     EISSN: None     Source Type: Journal    
DOI: None     Document Type: Article
Times cited: 6

References (17)
  • 2
    • 63649153086
    • A Q-learning system for container transfer scheduling based on shipping order at container terminals
    • Y. Hirashima, A Q-learning system for container transfer scheduling based on shipping order at container terminals, Int. J. of Innovative Computing, Information and Control, vol.4, no.3, pp.547-558, 2008.
    • (2008) Int. J. of Innovative Computing, Information and Control , vol.4 , Issue.3 , pp. 547-558
    • Hirashima, Y.1
  • 3
    • 0000148778
    • A heuristic approach to the discovery of macro-operators
    • G. A. Iba, A heuristic approach to the discovery of macro-operators, Machine Learning, vol.3, pp.285-317, 1989.
    • (1989) Machine Learning , vol.3 , pp. 285-317
    • Iba, G.A.1
  • 5
    • 0013465187
    • Automatic discovery of subgoals in reinforcement learning using diverse density
    • A. McGovern and A. G. Barto, Automatic discovery of subgoals in reinforcement learning using diverse density, Proc. of Int. Conf. on Machine Learning, 2001.
    • (2001) Proc. of Int. Conf. on Machine Learning
    • McGovern, A.1    Barto, A.G.2
  • 6
    • 0032312876
    • Reinforcement learning of dynamic motor sequence: Learning to standup
    • J. Morimoto and K. Doya, Reinforcement learning of dynamic motor sequence: learning to standup, Proc. of Int. Conf. on Intel. Robots and Sys., vol.3, pp.1721-1726, 1998.
    • (1998) Proc. of Int. Conf. on Intel. Robots and Sys. , vol.3 , pp. 1721-1726
    • Morimoto, J.1    Doya, K.2
  • 8
    • 64349083302
    • Task segmentation in a mobile robot by mnSOM and clustering with spatio-temporal contiguity
    • M. A. Muslim, M. Ishikawa, and T. Furukawa, Task segmentation in a mobile robot by mnSOM and clustering with spatio-temporal contiguity, Int. J. of Innovative Computing, Information and Control, vol.5, no.4, pp.865-875, 2009.
    • (2009) Int. J. of Innovative Computing, Information and Control , vol.5 , Issue.4 , pp. 865-875
    • Muslim, M.A.1    Ishikawa, M.2    Furukawa, T.3
  • 9
    • 77649304760
    • A memory-based reinforcement learning algorithm to prevent unlearning in neural networks
    • J. C. Rajapakse and L. Wang (eds.), Springer
    • S. Ozawa and S. Abe, A memory-based reinforcement learning algorithm to prevent unlearning in neural networks, in Neural Information Processing: Research and Development, J. C. Rajapakse and L. Wang (eds.), Springer, pp.238-255, 2004.
    • (2004) Neural Information Processing: Research and Development , pp. 238-255
    • Ozawa, S.1    Abe, S.2
  • 10
    • 27744441220
    • Incremental learning of feature space and classifier for face recognition
    • DOI 10.1016/j.neunet.2005.06.016, PII S0893608005001176
    • S. Ozawa, S. L. Toh, S. Abe, S. Pang, and N. Kasabov, Incremental learning of feature space and classifier for face recognition, Neural Networks, vol.18, pp.575-584, 2005. (Pubitemid 43186577)
    • (2005) Neural Networks , vol.18 , Issue.5-6 , pp. 575-584
    • Ozawa, S.1    Toh, S.L.2    Abe, S.3    Pang, S.4    Kasabov, N.5
  • 11
    • 63049087592
    • A multitask learning model for online pattern recognition
    • S. Ozawa, A. Roy, and D. Roussinov, A multitask learning model for online pattern recognition, IEEE Trans. on Neural Networks, vol.20, no.3, pp.430-445, 2009.
    • (2009) IEEE Trans. on Neural Networks , vol.20 , Issue.3 , pp. 430-445
    • Ozawa, S.1    Roy, A.2    Roussinov, D.3
  • 12
    • 14344250461
    • PolicyBlocks: An algorithm for creating useful macro-actions in reinforcement learning
    • M. Pickett and A. G. Barto, PolicyBlocks: An algorithm for creating useful macro-actions in reinforcement learning, Proc. of Int. Conf. on Machine Learning, pp.506-513, 2002.
    • (2002) Proc. of Int. Conf. on Machine Learning , pp. 506-513
    • Pickett, M.1    Barto, A.G.2
  • 13
    • 0001071040
    • A resource allocating network for function interpolation
    • J. Platt, A resource allocating network for function interpolation, Neural Computation, vol.3, pp.213-225, 1991.
    • (1991) Neural Computation , vol.3 , pp. 213-225
    • Platt, J.1
  • 15
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • R. S. Sutton, D. Precup, and S. Singh, Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning, Artificial Intelligence, vol.112, pp.181-211, 1999.
    • (1999) Artificial Intelligence , vol.112 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3
  • 16
    • 77649332914
    • Finding structure in reinforcement learning
    • the MIT Press
    • S. Thrun and A. Schwartz, Finding structure in reinforcement learning, Advances in Neural Info. Processing Systems 7, the MIT Press, 1995.
    • (1995) Advances in Neural Info. Processing Systems 7
    • Thrun, S.1    Schwartz, A.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.