Journal of Artificial Intelligence Research, Volume 24, 2005, Pages 81-108

Risk-sensitive reinforcement learning applied to control under constraints

Author keywords

[No Author keywords available]

Indexed keywords

CONSTRAINT THEORY; ERROR ANALYSIS; FUNCTIONS; HEURISTIC METHODS; MARKOV PROCESSES; REINFORCEMENT; RISK MANAGEMENT;

EID: 31144477417 | PISSN: 1076-9757 | EISSN: 1076-9757 | Source Type: Journal
DOI: 10.1613/jair.1666 | Document Type: Article
Times cited: 328

References (39)
  • 2  Baird, L. (1995). Residual algorithms: Reinforcement learning with function approximation. In Proc. 12th International Conference on Machine Learning, pp. 30-37. Morgan Kaufmann.
  • 3  Bawa, V. S. (1975). Optimal rules for ordering uncertain prospects. Journal of Financial Economics, 2(1), 95-121.
  • 7  Blythe, J. (1999). Decision-theoretic planning. AI Magazine, 20(2), 37-54.
  • 8  Borkar, V. (2002). Q-learning for risk-sensitive control. Mathematics of Operations Research, 27(2), 294-311.
  • 9  Coraluppi, S., & Marcus, S. (1999). Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes. Automatica, 35, 301-309.
  • 10 Crites, R. H., & Barto, A. G. (1998). Elevator group control using multiple reinforcement learning agents. Machine Learning, 33(2-3), 235-262.
  • 12 Feinberg, E., & Shwartz, A. (1994). Markov decision models with weighted discounted criteria. Mathematics of Operations Research, 19, 152-168.
  • 13 Feinberg, E., & Shwartz, A. (1996). Constrained discounted dynamic programming. Mathematics of Operations Research, 21, 922-945.
  • 14 Feinberg, E., & Shwartz, A. (1999). Constrained dynamic programming with two discount factors: Applications and an algorithm. IEEE Transactions on Automatic Control, 44, 628-630.
  • 15 Fishburn, P. C. (1977). Mean-risk analysis with risk associated with below-target returns. American Economic Review, 67(2), 116-126.
  • 16 Freund, R. (1956). The introduction of risk into a programming model. Econometrica, 24, 253-263.
  • 22 Koenig, S., & Simmons, R. G. (1994). Risk-sensitive planning with probabilistic decision graphs. In Doyle, J., Sandewall, E., & Torasso, P. (Eds.), KR'94: Principles of Knowledge Representation and Reasoning, pp. 363-373, San Francisco, California. Morgan Kaufmann.
  • 23 Kushmerick, N., Hanks, S., & Weld, D. S. (1994). An algorithm for probabilistic least-commitment planning. In AAAI, pp. 1073-1078.
  • 24 Li, P., Wendt, M., Arellano-Garcia, H., & Wozny, G. (2002). Optimal operation of distillation processes under uncertain inflows accumulated in a feed tank. AIChE Journal, 48, 1198-1211.
  • 26 Liu, Y., Goodwin, R., & Koenig, S. (2003). Risk-averse auction agents. In AAMAS, pp. 353-360.
  • 29 Mihatsch, O., & Neuneier, R. (2002). Risk-sensitive reinforcement learning. Machine Learning, 49(2-3), 267-290.
  • 30 Neuneier, R., & Mihatsch, O. (1999). Risk-sensitive reinforcement learning. In Kearns, M. S., Solla, S. A., & Cohn, D. A. (Eds.), Advances in Neural Information Processing Systems, Vol. 11. MIT Press.
  • 32 Roy, A. D. (1952). Safety first and the holding of assets. Econometrica, 20(3), 431-449.
  • 36 Tsitsiklis, J. N. (1994). Asynchronous stochastic approximation and Q-learning. Machine Learning, 16(3), 185-202.
  • 38 Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3-4), 279-292. Special Issue on Reinforcement Learning.
  • 39 Wendt, M., Li, P., & Wozny, G. (2002). Non-linear chance constrained process optimization under uncertainty. Ind. Eng. Chem. Res., 41(15), 3621-3629.


* This information was extracted and analyzed by KISTI from Elsevier's SCOPUS database.