SCOPUS Information Retrieval Platform - View Paper

ICML 2005 - Proceedings of the 22nd International Conference on Machine Learning

Volume , Issue , 2005, Pages 601-608

Dynamic preferences in multi-criteria reinforcement learning

Authors (2): Natarajan, Sriraam (a); Tadepalli, Prasad (a)

(a) Oregon State University (United States)

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTER NETWORKS; NUMERICAL METHODS; OPTIMIZATION; PROBLEM SOLVING; TIME VARYING SYSTEMS; VECTORS;

MULTI-CRITERIA REINFORCEMENT; NUMERIC WEIGHTS; SCALAR REWARDS; TIME-VARYING PREFERENCES;

LEARNING SYSTEMS;

EID: 31844444500 PISSN: None EISSN: None Source Type: Conference Proceeding
DOI: 10.1145/1102351.1102427 Document Type: Conference Paper

Times cited: 134

References (21)

1
- 14344251217
- Apprenticeship learning via inverse reinforcement learning
- Abbeel, P., & Ng, A. (2004). Apprenticeship learning via inverse reinforcement learning. In Proceedings of ICML-04.
- (2004) Proceedings of ICML-04
- Abbeel, P.¹ Ng, A.²

2
- 0003989208
- Constrained Markov decision processes
- Altman, E. (1999). Constrained Markov decision processes. Chapman and Hall. First edition.
- (1999) Constrained Markov Decision Processes
- Altman, E.¹

3
- 0036930295
- A POMDP formulation of preference elicitation problems
- Boutilier, C. (2002). A POMDP formulation of preference elicitation problems. In Proceedings AAAI-02.
- (2002) Proceedings AAAI-02
- Boutilier, C.¹

4
- 4544323223
- Learning an agent's utility function by observing behavior
- Chajewska, U., Koller, D., & Ormoneit, D. (2001). Learning an agent's utility function by observing behavior. In Proceedings of ICML-01.
- (2001) Proceedings of ICML-01
- Chajewska, U.¹ Koller, D.² Ormoneit, D.³

5
- 0000184142
- Constrained Markov decision models with weighted discounted rewards
- Feinberg, E., & Schwartz, A. (1995). Constrained Markov decision models with weighted discounted rewards. Mathematics of Operations Research, 20, 302-320.
- (1995) Mathematics of Operations Research , vol.20 , pp. 302-320
- Feinberg, E.¹ Schwartz, A.²

6
- 0343860991
- Multicriteria reinforcement learning
- Gabor, Z., Kalmar, Z., & Szepesvari, C. (1998). Multicriteria reinforcement learning. In Proceedings of ICML-98.
- (1998) Proceedings of ICML-98
- Gabor, Z.¹ Kalmar, Z.² Szepesvari, C.³

7
- 0012296128
- Multiagent planning with factored MDPs
- Guestrin, C., Koller, D., & Parr, R. (2001). Multiagent planning with factored MDPs. In Proceedings NIPS-01.
- (2001) Proceedings NIPS-01
- Guestrin, C.¹ Koller, D.² Parr, R.³

8
- 0032073263
- Planning and acting in partially observable stochastic domains
- Kaelbling, L. P., Littman, M., & Cassandra, A. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101.
- (1998) Artificial Intelligence , vol.101
- Kaelbling, L.P.¹ Littman, M.² Cassandra, A.³

9
- 0004112961
- Computer networking - a top-down approach featuring the internet
- Kurose, J. F., & Ross, K. W. (2003). Computer networking - a top-down approach featuring the internet. Pearson Education. Second edition.
- (2003) Computer Networking - A Top-down Approach Featuring the Internet
- Kurose, J.F.¹ Ross, K.W.²

10
- 51149092685
- Study of distance vector routing protocols for mobile ad hoc networks
- Lu, Y., Wang, W., Zhong, Y., & Bhargava, B. (2003). Study of distance vector routing protocols for mobile ad hoc networks. In Proceedings of PerCom-03.
- (2003) Proceedings of PerCom-03
- Lu, Y.¹ Wang, W.² Zhong, Y.³ Bhargava, B.⁴

11
- 0029752592
- Average reward reinforcement learning: Foundations, algorithms, and empirical results
- Mahadevan, S. (1996). Average reward reinforcement learning: Foundations, algorithms, and empirical results. Machine Learning, 22, 159-195.
- (1996) Machine Learning , vol.22 , pp. 159-195
- Mahadevan, S.¹

12
- 79960013704
- A geometric approach to multi-criterion reinforcement learning
- Mannor, S., & Shimkin, N. (2004). A geometric approach to multi-criterion reinforcement learning. JMLR, 5, 325-360.
- (2004) JMLR , vol.5 , pp. 325-360
- Mannor, S.¹ Shimkin, N.²

13
- 0042547347
- Algorithms for inverse reinforcement learning
- Ng, A. Y., & Russell, S. (2000). Algorithms for inverse reinforcement learning. In Proceedings of ICML-00.
- (2000) Proceedings of ICML-00
- Ng, A.Y.¹ Russell, S.²

14
- 0346738900
- Flexible decomposition algorithms for weakly coupled Markov decision problems
- Parr, R. (1998). Flexible decomposition algorithms for weakly coupled Markov decision problems. In Proceedings UAI-98.
- (1998) Proceedings UAI-98
- Parr, R.¹

15
- 0003998452
- Markov decision processes
- Puterman, M. L. (1994). Markov decision processes. J. Wiley and Sons.
- (1994) Markov Decision Processes
- Puterman, M.L.¹

16
- 1942484759
- Q-decomposition for reinforcement learning agents
- Russell, S., & Zimdars, A. L. (2003). Q-decomposition for reinforcement learning agents. In Proceedings of ICML-03.
- (2003) Proceedings of ICML-03
- Russell, S.¹ Zimdars, A.L.²

17
- 85152626183
- A reinforcement learning method for maximizing undiscounted rewards
- Schwartz, A. (1993). A reinforcement learning method for maximizing undiscounted rewards. In Proceedings of ICML-93.
- (1993) Proceedings of ICML-93
- Schwartz, A.¹

18
- 18144424551
- TPOT-RL applied to network routing
- Stone, P. (2000). TPOT-RL applied to network routing. In Proceedings of ICML-00.
- (2000) Proceedings of ICML-00
- Stone, P.¹

19
- 0032050241
- Model-based average reward reinforcement learning
- Tadepalli, P., & Ok, D. (1998). Model-based average reward reinforcement learning. AI Journal, 100, 177-223.
- (1998) AI Journal , vol.100 , pp. 177-223
- Tadepalli, P.¹ Ok, D.²

20
- 13444294406
- A multi-agent policy-gradient approach to network routing
- Tao, N., Baxter, J., & Weaver, L. (2001). A multi-agent policy-gradient approach to network routing. In Proceedings of ICML-01.
- (2001) Proceedings of ICML-01
- Tao, N.¹ Baxter, J.² Weaver, L.³

21
- 0040030981
- Multi-objective infinite-horizon discounted Markov decision processes
- White, D. (1982). Multi-objective infinite-horizon discounted Markov decision processes. Journal of Mathematical Analysis and Applications, 89, 639-647.
- (1982) Journal of Mathematical Analysis and Applications , vol.89 , pp. 639-647
- White, D.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.