Volume 1545, 1998, Pages 29-45

Modular reinforcement learning: An application to a real robot task

Author keywords

[No Author keywords available]

Indexed keywords

CONTROLLERS; MACHINE LEARNING; MARKOV PROCESSES; ROBOTS

EID: 84878320217     PISSN: 03029743     EISSN: 16113349     Source Type: Book Series    
DOI: 10.1007/3-540-49240-2_3     Document Type: Conference Paper
Times cited : (4)

References (25)
  • 1
    • 0030149709
    • Purposive behavior acquisition for a real robot by vision-based reinforcement learning
    • M. Asada, S. Noda, S. Tawaratsumida, and K. Hosoda. Purposive behavior acquisition for a real robot by vision-based reinforcement learning. Machine Learning, 23:279-303, 1996.
    • (1996) Machine Learning , vol.23 , pp. 279-303
    • Asada, M.1    Noda, S.2    Tawaratsumida, S.3    Hosoda, K.4
  • 2
    • 0029210635
    • Learning to act using real-time dynamic programming
    • A. Barto, S. J. Bradtke, and S. Singh. Learning to act using real-time dynamic programming. Artificial Intelligence, 72(1):81-138, 1995.
    • (1995) Artificial Intelligence , vol.72 , Issue.1 , pp. 81-138
    • Barto, A.1    Bradtke, S.J.2    Singh, S.3
  • 3
    • 0003787146
    • Princeton University Press, Princeton, New Jersey
    • R. Bellman. Dynamic Programming. Princeton University Press, Princeton, New Jersey, 1957.
    • (1957) Dynamic Programming
    • Bellman, R.1
  • 4
    • 0001937317
    • Elephants don't play chess
    • Bradford-MIT Press
    • R. Brooks. Elephants don't play chess. In Designing Autonomous Agents. Bradford-MIT Press, 1991.
    • (1991) Designing Autonomous Agents
    • Brooks, R.1
  • 5
    • 0000439891
    • On the convergence of stochastic iterative dynamic programming algorithms
    • T. Jaakkola, M. Jordan, and S. Singh. On the convergence of stochastic iterative dynamic programming algorithms. Neural Computation, 6(6):1185-1201, November 1994.
    • (1994) Neural Computation , vol.6 , Issue.6 , pp. 1185-1201
    • Jaakkola, T.1    Jordan, M.2    Singh, S.3
  • 6
    • 0028745065
    • Z. Kalmar, C. Szepesvári, and A. Lorincz. Generalization in an autonomous agent. In Proc. of IEEE WCCI ICNN'94, volume 3, pages 1815-1817, Orlando, Florida, June 1994. IEEE Inc.
    • Kalmar, Z.1    Szepesvári, C.2    Lorincz, A.3
  • 7
    • 0029205333
    • Generalized dynamic concept model as a route to construct adaptive autonomous agents
    • Z. Kalmar, C. Szepesvári, and A. Lorincz. Generalized dynamic concept model as a route to construct adaptive autonomous agents. Neural Network World, 5:353-360, 1995.
    • (1995) Neural Network World , vol.5 , pp. 353-360
    • Kalmar, Z.1    Szepesvári, C.2    Lorincz, A.3
  • 8
    • 0032045145
    • Z. Kalmar, C. Szepesvári, and A. Lorincz. Module based reinforcement learning: Experiments with a real robot. Machine Learning, 31:55-85, 1998. Joint special issue on "Learning Robots" with the J. of Autonomous Robots.
    • Kalmar, Z.1    Szepesvári, C.2    Lorincz, A.3
  • 9
    • 84947778935
    • M. Littman. Algorithms for Sequential Decision Making. PhD thesis, Department of Computer Science, Brown University, February 1996. Also Technical Report CS-96-09.
    • Littman, M.1
  • 10
    • 0001961616
    • A Generalized Reinforcement Learning Model: Convergence and applications
    • M. Littman and C. Szepesvári. A Generalized Reinforcement Learning Model: Convergence and applications. In Int. Conf. on Machine Learning, pages 310-318, 1996.
    • (1996) Int. Conf. on Machine Learning , pp. 310-318
    • Littman, M.1    Szepesvári, C.2
  • 11
    • 85138579181
    • Learning policies for partially observable environments: Scaling up
    • In A. Prieditis and S. Russell, editors, San Francisco, CA, Morgan Kaufmann
    • M. L. Littman, A. Cassandra, and L. P. Kaelbling. Learning policies for partially observable environments: Scaling up. In A. Prieditis and S. Russell, editors, Proceedings of the Twelfth International Conference on Machine Learning, pages 362-370, San Francisco, CA, 1995. Morgan Kaufmann.
    • (1995) Proceedings of the Twelfth International Conference on Machine Learning , pp. 362-370
    • Littman, M.L.1    Cassandra, A.2    Kaelbling, L.P.3
  • 12
  • 13
    • 0026880130
    • Automatic programming of behavior-based robots using reinforcement learning
    • S. Mahadevan and J. Connell. Automatic programming of behavior-based robots using reinforcement learning. Artificial Intelligence, 55:311-365, 1992.
    • (1992) Artificial Intelligence , vol.55 , pp. 311-365
    • Mahadevan, S.1    Connell, J.2
  • 14
    • 0030647149
    • Reinforcement learning in the multi-robot domain
    • M. Mataric. Reinforcement learning in the multi-robot domain. Autonomous Robots, 4, 1997.
    • (1997) Autonomous Robots , vol.4
    • Mataric, M.1
  • 16
    • 2142812536
    • Learning without state-estimation in partially observable Markovian decision processes
    • S. Singh, T. Jaakkola, and M. Jordan. Learning without state-estimation in partially observable Markovian decision processes. In Proc. of the Eleventh Machine Learning Conference, pages 284-292, 1995.
    • (1995) Proc. of the Eleventh Machine Learning Conference , pp. 284-292
    • Singh, S.1    Jaakkola, T.2    Jordan, M.3
  • 18
    • 0000723997
    • Generalization in reinforcement learning: Successful examples using sparse coarse coding
    • R. S. Sutton. Generalization in reinforcement learning: Successful examples using sparse coarse coding. Advances in Neural Information Processing Systems, 8, 1996.
    • (1996) Advances in Neural Information Processing Systems , vol.8
    • Sutton, R.S.1
  • 19
    • 30844447222
    • A unified analysis of value-function-based reinforcement-learning algorithms
    • submitted
    • C. Szepesvári and M. Littman. A unified analysis of value-function-based reinforcement-learning algorithms. Neural Computation, 1997. submitted.
    • (1997) Neural Computation
    • Szepesvári, C.1    Littman, M.2
  • 20
    • 84977014241
    • Behavior of an adaptive self-organizing autonomous agent working with cues and competing concepts
    • C. Szepesvári and A. Lorincz. Behavior of an adaptive self-organizing autonomous agent working with cues and competing concepts. Adaptive Behavior, 2(2):131-160, 1994.
    • (1994) Adaptive Behavior , vol.2 , Issue.2 , pp. 131-160
    • Szepesvári, C.1    Lorincz, A.2
  • 23
    • 0029752470
    • Feature-based methods for large scale dynamic programming
    • J. N. Tsitsiklis and B. Van Roy. Feature-based methods for large scale dynamic programming. Machine Learning, 22:59-94, 1996.
    • (1996) Machine Learning , vol.22 , pp. 59-94
    • Tsitsiklis, J.N.1    Van Roy, B.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.