



Volume 131, Issue 4, 2009, Pages 1-8

A real-time computational learning model for sequential decision-making problems under uncertainty

Author keywords

[No Author keywords available]

Indexed keywords

ALTERNATIVE APPROACH; COMPUTATIONAL LEARNING MODEL; CONTROL POLICY; CONTROL PROBLEMS; CONTROLLED MARKOV CHAINS; DECISION-MAKING PROBLEM; DYNAMIC SYSTEMS; EVALUATION FUNCTION; EXISTING METHOD; EXPECTED COSTS; LEARNING MECHANISM; LEARNING MODELS; OPTIMAL CONTROL POLICY; POLE BALANCING; REAL TIME; SIMULATION-BASED; STATE SPACE REPRESENTATION; STATE TRANSITIONS; STOCHASTIC DISTURBANCES; STOCHASTIC FRAMEWORK; SYSTEM RESPONSE;

EID: 77955875058     PISSN: 0022-0434     EISSN: 1528-9028     Source Type: Journal
DOI: 10.1115/1.3117200     Document Type: Article
Times cited: 21

References (41)
  • 2
    • Gosavi, A., 2004, "Reinforcement Learning for Long-Run Average Cost," Eur. J. Oper. Res., 155, pp. 654-674.
  • 3
    • Bertsekas, D. P., and Tsitsiklis, J. N., 1996, Neuro-Dynamic Programming (Optimization and Neural Computation Series, 3), 1st ed., Athena Scientific, Belmont, MA.
  • 6
    • Samuel, A. L., 1959, "Some Studies in Machine Learning Using the Game of Checkers," IBM J. Res. Dev., 3, pp. 210-229.
  • 7
    • Samuel, A. L., 1967, "Some Studies in Machine Learning Using the Game of Checkers. II: Recent Progress," IBM J. Res. Dev., 11, pp. 601-617.
  • 9
    • Sutton, R. S., 1988, "Learning to Predict by the Methods of Temporal Differences," Mach. Learn., 3, pp. 9-44.
  • 14
    • Mahadevan, S., 1996, "Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results," Mach. Learn., 22, pp. 159-195.
  • 15
    • Sutton, R. S., 1990, "Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming," Proceedings of the Seventh International Conference on Machine Learning, Austin, TX, pp. 216-224.
  • 17
    • Moore, A. W., and Atkeson, C. G., 1993, "Prioritized Sweeping: Reinforcement Learning With Less Data and Less Time," Mach. Learn., 13, pp. 103-130.
  • 19
    • Barto, A. G., Bradtke, S. J., and Singh, S. P., 1995, "Learning to Act Using Real-Time Dynamic Programming," Artif. Intell., 72, pp. 81-138.
  • 27
    • Ishii, S., Yoshida, W., and Yoshimoto, J., 2002, "Control of Exploitation-Exploration Meta-Parameter in Reinforcement Learning," Neural Networks, 15, pp. 665-687.
  • 28
    • Chan-Geon, P., and Sung-Bong, Y., 2003, "Implementation of the Agent Using Universal On-Line Q-Learning by Balancing Exploration and Exploitation in Reinforcement Learning," Journal of KISS: Software and Applications, 30, pp. 672-680.
  • 29
    • Miyazaki, K., and Yamamura, M., 1997, "Marco Polo: A Reinforcement Learning System Considering Tradeoff Exploitation and Exploration Under Markovian Environments," Journal of Japanese Society for Artificial Intelligence, 12, pp. 78-89.
  • 31
  • 32
    • Anderson, C. W., 1989, "Learning to Control an Inverted Pendulum Using Neural Networks," IEEE Control Syst. Mag., 9, pp. 31-37.
  • 37
    • Si, J., and Wang, Y. T., 2001, "On-line Learning Control by Association and Reinforcement," IEEE Trans. Neural Netw., 12, pp. 264-276.
  • 38
    • Zhang, B. S., Leigh, I., and Leigh, J. R., 1995, "Learning Control Based on Pattern Recognition Applied to Vehicle Cruise Control Systems," Proceedings of the American Control Conference, Seattle, WA, pp. 3101-3105.
  • 40
    • TESIS, http://www.tesis.de/en/.
  • 41
    • Panait, L., and Luke, S., 2005, "Cooperative Multi-Agent Learning: The State of the Art," Auton. Agents Multi-Agent Syst., 11, pp. 387-434.


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.