



Volume , Issue , 2005, Pages 601-608

Dynamic preferences in multi-criteria reinforcement

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER NETWORKS; NUMERICAL METHODS; OPTIMIZATION; PROBLEM SOLVING; TIME VARYING SYSTEMS; VECTORS

EID: 31844444500     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1145/1102351.1102427     Document Type: Conference Paper
Times cited: 134

References (21)
  • 1
    • 14344251217
    • Apprenticeship learning via inverse reinforcement learning
    • Abbeel, P., & Ng, A. (2004). Apprenticeship learning via inverse reinforcement learning. In Proceedings of ICML-04.
    • (2004) Proceedings of ICML-04
    • Abbeel, P.1    Ng, A.2
  • 3
    • 0036930295
    • A POMDP formulation of preference elicitation problems
    • Boutilier, C. (2002). A POMDP formulation of preference elicitation problems. In Proceedings AAAI-02.
    • (2002) Proceedings AAAI-02
    • Boutilier, C.1
  • 5
    • 0000184142
    • Constrained Markov decision models with weighted discounted rewards
    • Feinberg, E., & Schwartz, A. (1995). Constrained Markov decision models with weighted discounted rewards. Mathematics of Operations Research, 20, 302-320.
    • (1995) Mathematics of Operations Research , vol.20 , pp. 302-320
    • Feinberg, E.1    Schwartz, A.2
  • 10
    • 51149092685
    • Study of distance vector routing protocols for mobile ad hoc networks
    • Lu, Y., Wang, W., Zhong, Y., & Bhargava, B. (2003). Study of distance vector routing protocols for mobile ad hoc networks. In Proceedings of PerCom-03.
    • (2003) Proceedings of PerCom-03
    • Lu, Y.1    Wang, W.2    Zhong, Y.3    Bhargava, B.4
  • 11
    • 0029752592
    • Average reward reinforcement learning: Foundations, algorithms, and empirical results
    • Mahadevan, S. (1996). Average reward reinforcement learning: Foundations, algorithms, and empirical results. Machine Learning, 22, 159-195.
    • (1996) Machine Learning , vol.22 , pp. 159-195
    • Mahadevan, S.1
  • 12
    • 79960013704
    • A geometric approach to multi-criterion reinforcement learning
    • Mannor, S., & Shimkin, N. (2004). A geometric approach to multi-criterion reinforcement learning. JMLR, 5, 325-360.
    • (2004) JMLR , vol.5 , pp. 325-360
    • Mannor, S.1    Shimkin, N.2
  • 13
    • 0042547347
    • Algorithms for inverse reinforcement learning
    • Ng, A. Y., & Russell, S. (2000). Algorithms for inverse reinforcement learning. In Proceedings of ICML-00.
    • (2000) Proceedings of ICML-00
    • Ng, A.Y.1    Russell, S.2
  • 14
    • 0346738900
    • Flexible decomposition algorithms for weakly coupled Markov decision problems
    • Parr, R. (1998). Flexible decomposition algorithms for weakly coupled Markov decision problems. In Proceedings UAI-98.
    • (1998) Proceedings UAI-98
    • Parr, R.1
  • 16
  • 17
    • 85152626183
    • A reinforcement learning method for maximizing undiscounted rewards
    • Schwartz, A. (1993). A reinforcement learning method for maximizing undiscounted rewards. In Proceedings of ICML-93.
    • (1993) Proceedings of ICML-93
    • Schwartz, A.1
  • 18
    • 18144424551
    • TPOT-RL applied to network routing
    • Stone, P. (2000). TPOT-RL applied to network routing. In Proceedings of ICML-00.
    • (2000) Proceedings of ICML-00
    • Stone, P.1
  • 19
    • 0032050241
    • Model-based average reward reinforcement learning
    • Tadepalli, P., & Ok, D. (1998). Model-based average reward reinforcement learning. AI Journal, 100, 177-223.
    • (1998) AI Journal , vol.100 , pp. 177-223
    • Tadepalli, P.1    Ok, D.2
  • 20
    • 13444294406
    • A multi-agent policy-gradient approach to network routing
    • Tao, N., Baxter, J., & Weaver, L. (2001). A multi-agent policy-gradient approach to network routing. In Proceedings of ICML-01.
    • (2001) Proceedings of ICML-01
    • Tao, N.1    Baxter, J.2    Weaver, L.3
  • 21
    • 0040030981
    • Multi-objective infinite-horizon discounted Markov decision processes
    • White, D. (1982). Multi-objective infinite-horizon discounted Markov decision processes. Journal of Mathematical Analysis and Applications, 89, 639-647.
    • (1982) Journal of Mathematical Analysis and Applications , vol.89 , pp. 639-647
    • White, D.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.