1. Bertsekas, D. P., and Shreve, S. E., 2007, Stochastic Optimal Control: The Discrete-Time Case, 1st ed., Athena Scientific, Nashua, NH.
2. Gosavi, A., 2004, "Reinforcement Learning for Long-Run Average Cost," Eur. J. Oper. Res., 155, pp. 654-674.
3. Bertsekas, D. P., and Tsitsiklis, J. N., 1996, Neuro-Dynamic Programming (Optimization and Neural Computation Series, 3), 1st ed., Athena Scientific, Nashua, NH.
4. Sutton, R. S., and Barto, A. G., 1998, Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning), MIT Press, Cambridge, MA.
6. Samuel, A. L., 1959, "Some Studies in Machine Learning Using the Game of Checkers," IBM J. Res. Dev., 3, pp. 210-229.
7. Samuel, A. L., 1967, "Some Studies in Machine Learning Using the Game of Checkers. II: Recent Progress," IBM J. Res. Dev., 11, pp. 601-617.
8. Sutton, R. S., 1984, "Temporal Credit Assignment in Reinforcement Learning," Ph.D. thesis, University of Massachusetts, Amherst, MA.
9. Sutton, R. S., 1988, "Learning to Predict by the Methods of Temporal Differences," Mach. Learn., 3, pp. 9-44.
10. Watkins, C. J. C. H., 1989, "Learning From Delayed Rewards," Ph.D. thesis, King's College, Cambridge, England.
11. Watkins, C. J. C. H., and Dayan, P., 1992, "Q-Learning," Mach. Learn., 8, pp. 279-292.
12. Kaelbling, L. P., Littman, M. L., and Moore, A. W., 1996, "Reinforcement Learning: A Survey," J. Artif. Intell. Res., 4, pp. 237-285.
13. Schwartz, A., 1993, "A Reinforcement Learning Method for Maximizing Undiscounted Rewards," Proceedings of the Tenth International Conference on Machine Learning, Amherst, MA, pp. 298-305.
14. Mahadevan, S., 1996, "Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results," Mach. Learn., 22, pp. 159-195.
15. Sutton, R. S., 1990, "Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming," Proceedings of the Seventh International Conference on Machine Learning, Austin, TX, pp. 216-224.
16. Sutton, R. S., 1991, "Planning by Incremental Dynamic Programming," Proceedings of the Eighth International Workshop on Machine Learning (ML91), Evanston, IL, pp. 353-357.
17. Moore, A. W., and Atkeson, C. G., 1993, "Prioritized Sweeping: Reinforcement Learning With Less Data and Less Real Time," Mach. Learn., 13, pp. 103-130.
18. Peng, J., and Williams, R. J., 1993, "Efficient Learning and Planning Within the Dyna Framework," Proceedings of the IEEE International Conference on Neural Networks, San Francisco, CA, pp. 168-174.
19. Barto, A. G., Bradtke, S. J., and Singh, S. P., 1995, "Learning to Act Using Real-Time Dynamic Programming," Artif. Intell., 72, pp. 81-138.
20. Malikopoulos, A. A., Papalambros, P. Y., and Assanis, D. N., 2007, "A State-Space Representation Model and Learning Algorithm for Real-Time Decision-Making Under Uncertainty," Proceedings of the 2007 ASME International Mechanical Engineering Congress and Exposition, Seattle, WA, Nov. 11-15.
22. Malikopoulos, A. A., Papalambros, P. Y., and Assanis, D. N., 2007, "A Learning Algorithm for Optimal Internal Combustion Engine Calibration in Real Time," Proceedings of the ASME 2007 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Las Vegas, NV, Sept. 4-7.
23. Malikopoulos, A. A., 2008, "Real-Time, Self-Learning Identification and Stochastic Optimal Control of Advanced Powertrain Systems," Ph.D. thesis, Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI.
24. Malikopoulos, A. A., Assanis, D. N., and Papalambros, P. Y., 2007, "Real-Time, Self-Learning Optimization of Diesel Engine Calibration," Proceedings of the 2007 Fall Technical Conference of the ASME Internal Combustion Engine Division, Charleston, SC, Oct. 14-17.
25. Malikopoulos, A. A., Assanis, D. N., and Papalambros, P. Y., 2008, "Optimal Engine Calibration for Individual Driving Styles," Proceedings of the SAE 2008 World Congress and Exhibition, Detroit, MI, Apr. 14-17, SAE Paper No. 2008-01-1367.
26. Iwata, K., Ito, N., Yamauchi, K., and Ishii, N., 2000, "Combining Exploitation-Based and Exploration-Based Approach in Reinforcement Learning," Proceedings of Intelligent Data Engineering and Automated Learning (IDEAL 2000), Hong Kong, China, pp. 326-331.
27. Ishii, S., Yoshida, W., and Yoshimoto, J., 2002, "Control of Exploitation-Exploration Meta-Parameter in Reinforcement Learning," Neural Networks, 15, pp. 665-687.
28. Chan-Geon, P., and Sung-Bong, Y., 2003, "Implementation of the Agent Using Universal On-Line Q-Learning by Balancing Exploration and Exploitation in Reinforcement Learning," Journal of KISS: Software and Applications, 30, pp. 672-680.
29. Miyazaki, K., and Yamamura, M., 1997, "Marco Polo: A Reinforcement Learning System Considering Tradeoff Exploitation and Exploration Under Markovian Environments," Journal of Japanese Society for Artificial Intelligence, 12, pp. 78-89.
30. Hernandez-Aguirre, A., Buckles, B. P., and Martinez-Alcantara, A., 2000, "The Probably Approximately Correct (PAC) Population Size of a Genetic Algorithm," Proceedings of the 12th IEEE International Conference on Tools With Artificial Intelligence, pp. 199-202.
31. Malikopoulos, A. A., 2008, "Convergence Properties of a Computational Learning Model for Unknown Markov Chains," Proceedings of the 2008 ASME Dynamic Systems and Control Conference, Ann Arbor, MI, Oct. 20-22.
32. Anderson, C. W., 1989, "Learning to Control an Inverted Pendulum Using Neural Networks," IEEE Control Syst. Mag., 9, pp. 31-37.
33. Williams, V., and Matsuoka, K., 1991, "Learning to Balance the Inverted Pendulum Using Neural Networks," Proceedings of the 1991 IEEE International Joint Conference on Neural Networks, Singapore, pp. 214-219.
34. Zhidong, D., Zaixing, Z., and Peifa, J., 1995, "A Neural-Fuzzy BOXES Control System With Reinforcement Learning and Its Applications to Inverted Pendulum," Proceedings of the 1995 IEEE International Conference on Systems, Man and Cybernetics: Intelligent Systems for the 21st Century, Vancouver, BC, Canada, pp. 1250-1254.
35. Jeen-Shing, W., and McLaren, R., 1997, "A Modified Defuzzifier for Control of the Inverted Pendulum Using Learning," Proceedings of the 1997 Annual Meeting of the North American Fuzzy Information Processing Society (NAFIPS), Syracuse, NY, pp. 118-123.
36. Mustapha, S. M., and Lachiver, G., 2000, "A Modified Actor-Critic Reinforcement Learning Algorithm," Proceedings of the 2000 Canadian Conference on Electrical and Computer Engineering, Halifax, NS, Canada, pp. 605-609.
37. Si, J., and Wang, Y. T., 2001, "On-Line Learning Control by Association and Reinforcement," IEEE Trans. Neural Netw., 12, pp. 264-276.
38. Zhang, B. S., Leigh, I., and Leigh, J. R., 1995, "Learning Control Based on Pattern Recognition Applied to Vehicle Cruise Control Systems," Proceedings of the American Control Conference, Seattle, WA, pp. 3101-3105.
39. Shahdi, S. A., and Shouraki, S. B., 2003, "Use of Active Learning Method to Develop an Intelligent Stop and Go Cruise Control," Proceedings of the IASTED International Conference on Intelligent Systems and Control, Salzburg, Austria, pp. 87-90.
40. TESIS, http://www.tesis.de/en/.
41
-
-
26444601262
-
Cooperative Multi-Agent Learning: The State of the Art
-
Panait, L., and Luke, S., 2005, "Cooperative Multi-Agent Learning: The State of the Art," Auton. Agents Multi-Agent Syst., 11, pp. 387-434.
-
(2005)
Auton. Agents Multi-Agent Syst.
, vol.11
, pp. 387-434
-
-
Panait, L.1
Luke, S.2
|