SCOPUS Information Retrieval Platform

Volume 46, Issue 7, 2016, Pages 1628-1639

F-Discrepancy for Efficient Sampling in Approximate Dynamic Programming

(2) Cervellera, Cristiano a; Macciò, Danilo a

a INSTITUTE OF INTELLIGENT SYSTEMS FOR AUTOMATION (Italy)

Author keywords

Approximate dynamic programming (ADP); F discrepancy; Markovian decision problem (MDP); state sampling; value function approximation

Indexed keywords

ALGORITHMS; IMPORTANCE SAMPLING; PROBABILITY DISTRIBUTIONS;

APPROXIMATE DYNAMIC PROGRAMMING; APPROXIMATE SOLUTION; CONTINUOUS STATE; EFFICIENT SAMPLING; MARKOVIAN DECISION PROBLEMS; MARKOVIAN PROCESS; STATE TRAJECTORY; UNIFORM SAMPLING;

DYNAMIC PROGRAMMING;

EID: 84938514337 PISSN: 21682267 EISSN: None Source Type: Journal
DOI: 10.1109/TCYB.2015.2453123 Document Type: Article

Times cited: (9)

References (39)

1
- 0003565783
- 2nd ed. Belmont, CA, USA: Athena Scientific
- D. Bertsekas, Dynamic Programming and Optimal Control, vol. 1, 2nd ed. Belmont, CA, USA: Athena Scientific, 2000.
- (2000) Dynamic Programming and Optimal Control , vol.1
- Bertsekas, D.¹

2
- 84949764394
- 2nd ed. Hoboken, NJ, USA: Wiley
- W. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2nd ed. Hoboken, NJ, USA: Wiley, 2011.
- (2011) Approximate Dynamic Programming: Solving the Curses of Dimensionality
- Powell, W.¹

3
- 0003998452
- New York, NY, USA: Wiley
- M. Puterman, Markov Decision Processes. New York, NY, USA: Wiley, 1994.
- (1994) Markov Decision Processes
- Puterman, M.¹

4
- 79551685808
- Reinforcement learning for partially observable dynamic processes: Adaptive dynamic programming using measured output data
- Feb.
- F. L. Lewis and K. G. Vamvoudakis, "Reinforcement learning for partially observable dynamic processes: Adaptive dynamic programming using measured output data," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 41, no. 1, pp. 14-25, Feb. 2011.
- (2011) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.41 , Issue.1 , pp. 14-25
- Lewis, F.L.¹ Vamvoudakis, K.G.²

5
- 80052899788
- Incremental state aggregation for value function estimation in reinforcement learning
- Oct.
- T. Mori and S. Ishii, "Incremental state aggregation for value function estimation in reinforcement learning," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 41, no. 5, pp. 1407-1416, Oct. 2011.
- (2011) IEEE Trans. Syst., Man, Cybern. B, Cybern. , vol.41 , Issue.5 , pp. 1407-1416
- Mori, T.¹ Ishii, S.²

6
- 84912135349
- Acceleration of reinforcement learning by policy evaluation using nonstationary iterative method
- Dec.
- K. Senda, S. Hattori, T. Hishinuma, and T. Kohda, "Acceleration of reinforcement learning by policy evaluation using nonstationary iterative method," IEEE Trans. Cybern., vol. 44, no. 12, pp. 2696-2705, Dec. 2014.
- (2014) IEEE Trans. Cybern. , vol.44 , Issue.12 , pp. 2696-2705
- Senda, K.¹ Hattori, S.² Hishinuma, T.³ Kohda, T.⁴

7
- 84912071084
- A clustering-based graph Laplacian framework for value function approximation in reinforcement learning
- Dec.
- X. Xu, Z. Huang, D. Graves, and W. Pedrycz, "A clustering-based graph Laplacian framework for value function approximation in reinforcement learning," IEEE Trans. Cybern., vol. 44, no. 12, pp. 2613-2625, Dec. 2014.
- (2014) IEEE Trans. Cybern. , vol.44 , Issue.12 , pp. 2613-2625
- Xu, X.¹ Huang, Z.² Graves, D.³ Pedrycz, W.⁴

8
- 84919769592
- Stochastic abstract policies: Generalizing knowledge to improve reinforcement learning
- Jan.
- M. L. Koga, V. Freire, and A. H. R. Costa, "Stochastic abstract policies: Generalizing knowledge to improve reinforcement learning," IEEE Trans. Cybern., vol. 45, no. 1, pp. 77-88, Jan. 2015.
- (2015) IEEE Trans. Cybern. , vol.45 , Issue.1 , pp. 77-88
- Koga, M.L.¹ Freire, V.² Costa, A.H.R.³

9
- 84912026937
- Revisiting approximate dynamic programming and its convergence
- Dec.
- A. Heydari, "Revisiting approximate dynamic programming and its convergence," IEEE Trans. Cybern., vol. 44, no. 12, pp. 2733-2743, Dec. 2014.
- (2014) IEEE Trans. Cybern. , vol.44 , Issue.12 , pp. 2733-2743
- Heydari, A.¹

10
- 84921377021
- Continuous-time Q-learning for infinite-horizon discounted cost linear quadratic regulator problems
- Feb.
- M. Palanisamy, H. Modares, F. L. Lewis, and M. Aurangzeb, "Continuous-time Q-learning for infinite-horizon discounted cost linear quadratic regulator problems," IEEE Trans. Cybern., vol. 45, no. 2, pp. 165-176, Feb. 2015.
- (2015) IEEE Trans. Cybern. , vol.45 , Issue.2 , pp. 165-176
- Palanisamy, M.¹ Modares, H.² Lewis, F.L.³ Aurangzeb, M.⁴

11
- 84912122528
- Finite-approximation-error-based discrete-time iterative adaptive dynamic programming
- Dec.
- Q. Wei, F.-Y. Wang, D. Liu, and X. Yang, "Finite-approximation-error-based discrete-time iterative adaptive dynamic programming," IEEE Trans. Cybern., vol. 44, no. 12, pp. 2820-2833, Dec. 2014.
- (2014) IEEE Trans. Cybern. , vol.44 , Issue.12 , pp. 2820-2833
- Wei, Q.¹ Wang, F.-Y.² Liu, D.³ Yang, X.⁴

12
- 84960101128
- Optimal tracking control of unknown discrete-time linear systems using input-output measured data
- B. Kiumarsi, F. L. Lewis, M.-B. Naghibi-Sistani, and A. Karimpour, "Optimal tracking control of unknown discrete-time linear systems using input-output measured data," IEEE Trans. Cybern., DOI: 10.1109/TCYB.2014.2384016.
- IEEE Trans. Cybern.
- Kiumarsi, B.¹ Lewis, F.L.² Naghibi-Sistani, M.-B.³ Karimpour, A.⁴

13
- 84880065287
- Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics
- Jan.
- A. Heydari and S. N. Balakrishnan, "Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 1, pp. 145-157, Jan. 2013.
- (2013) IEEE Trans. Neural Netw. Learn. Syst. , vol.24 , Issue.1 , pp. 145-157
- Heydari, A.¹ Balakrishnan, S.N.²

14
- 84864491417
- Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality
- K. G. Vamvoudakis, F. L. Lewis, and G. R. Hudas, "Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality," Automatica, vol. 48, no. 8, pp. 1598-1611, 2012.
- (2012) Automatica , vol.48 , Issue.8 , pp. 1598-1611
- Vamvoudakis, K.G.¹ Lewis, F.L.² Hudas, G.R.³

15
- 84904389431
- Neural-network-based constrained optimal control scheme for discrete-time switched nonlinear system using dual heuristic programming
- Jul.
- H. Zhang, C. Qin, and Y. Luo, "Neural-network-based constrained optimal control scheme for discrete-time switched nonlinear system using dual heuristic programming," IEEE Trans. Autom. Sci. Eng., vol. 11, no. 3, pp. 839-849, Jul. 2014.
- (2014) IEEE Trans. Autom. Sci. Eng. , vol.11 , Issue.3 , pp. 839-849
- Zhang, H.¹ Qin, C.² Luo, Y.³

16
- 84906778934
- Adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification
- Oct.
- Q. Wei and D. Liu, "Adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification," IEEE Trans. Autom. Sci. Eng., vol. 11, no. 4, pp. 1020-1036, Oct. 2014.
- (2014) IEEE Trans. Autom. Sci. Eng. , vol.11 , Issue.4 , pp. 1020-1036
- Wei, Q.¹ Liu, D.²

17
- 84968468700
- Polynomial approximation-A new computational technique in dynamic programming allocation processes
- R. Bellman, R. Kalaba, and B. Kotkin, "Polynomial approximation-A new computational technique in dynamic programming allocation processes," Math. Comput., vol. 17, no. 82, pp. 155-161, 1963.
- (1963) Math. Comput. , vol.17 , Issue.82 , pp. 155-161
- Bellman, R.¹ Kalaba, R.² Kotkin, B.³

18
- 0027601994
- Numerical solution of continuous-state dynamic programs using linear and spline interpolation
- S. A. Johnson, J. Stedinger, C. A. Shoemaker, Y. Li, and J. A. Tejada-Guibert, "Numerical solution of continuous-state dynamic programs using linear and spline interpolation," Oper. Res., vol. 41, no. 3, pp. 484-500, 1993.
- (1993) Oper. Res. , vol.41 , Issue.3 , pp. 484-500
- Johnson, S.A.¹ Stedinger, J.² Shoemaker, C.A.³ Li, Y.⁴ Tejada-Guibert, J.A.⁵

19
- 0001820934
- Applying experimental design and regression splines to high-dimensional continuous-state stochastic dynamic programming
- V. Chen, D. Ruppert, and C. A. Shoemaker, "Applying experimental design and regression splines to high-dimensional continuous-state stochastic dynamic programming," Oper. Res., vol. 47, no. 1, pp. 38-53, 1999.
- (1999) Oper. Res. , vol.47 , Issue.1 , pp. 38-53
- Chen, V.¹ Ruppert, D.² Shoemaker, C.A.³

20
- 84884969734
- Low-discrepancy sampling for approximate dynamic programming with local approximators
- Mar.
- C. Cervellera, M. Gaggero, and D. Macciò, "Low-discrepancy sampling for approximate dynamic programming with local approximators," Comput. Oper. Res., vol. 43, pp. 108-115, Mar. 2014.
- (2014) Comput. Oper. Res. , vol.43 , pp. 108-115
- Cervellera, C.¹ Gaggero, M.² Macciò, D.³

21
- 84961378056
- Leader-based optimal coordination control for the consensus problem of multiagent differential games via fuzzy adaptive dynamic programming
- Feb.
- H. Zhang, J. Zhang, G.-H. Yang, and Y. Luo, "Leader-based optimal coordination control for the consensus problem of multiagent differential games via fuzzy adaptive dynamic programming," IEEE Trans. Fuzzy Syst., vol. 23, no. 1, pp. 152-163, Feb. 2015.
- (2015) IEEE Trans. Fuzzy Syst. , vol.23 , Issue.1 , pp. 152-163
- Zhang, H.¹ Zhang, J.² Yang, G.-H.³ Luo, Y.⁴

22
- 0003487482
- Belmont, CA, USA: Athena Scientific
- D. Bertsekas and J. Tsitsiklis, Neuro-Dynamic Programming. Belmont, CA, USA: Athena Scientific, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.¹ Tsitsiklis, J.²

23
- 77956759955
- Management of water resources systems in the presence of uncertainties by nonlinear approximators and deterministic sampling techniques
- M. Baglietto, C. Cervellera, M. Sanguineti, and R. Zoppoli, "Management of water resources systems in the presence of uncertainties by nonlinear approximators and deterministic sampling techniques," Comput. Optim. Appl., vol. 47, no. 2, pp. 349-376, 2010.
- (2010) Comput. Optim. Appl. , vol.47 , Issue.2 , pp. 349-376
- Baglietto, M.¹ Cervellera, C.² Sanguineti, M.³ Zoppoli, R.⁴

24
- 0036013020
- Measuring the goodness of orthogonal array discretizations for stochastic programming and stochastic dynamic programming
- V. C. P. Chen, "Measuring the goodness of orthogonal array discretizations for stochastic programming and stochastic dynamic programming," SIAM J. Optim., vol. 12, no. 2, pp. 322-344, 2001.
- (2001) SIAM J. Optim. , vol.12 , Issue.2 , pp. 322-344
- Chen, V.C.P.¹

25
- 33746257756
- Neural network and regression spline value function approximations for stochastic dynamic programming
- C. Cervellera, V. Chen, and A. Wen, "Neural network and regression spline value function approximations for stochastic dynamic programming," Comput. Oper. Res., vol. 34, no. 1, pp. 70-90, 2006.
- (2006) Comput. Oper. Res. , vol.34 , Issue.1 , pp. 70-90
- Cervellera, C.¹ Chen, V.² Wen, A.³

26
- 78249259323
- A comparison of global and semi-local approximation in T-stage stochastic optimization
- C. Cervellera and D. Macciò, "A comparison of global and semi-local approximation in T-stage stochastic optimization," Eur. J. Oper. Res., vol. 208, no. 2, pp. 109-118, 2011.
- (2011) Eur. J. Oper. Res. , vol.208 , Issue.2 , pp. 109-118
- Cervellera, C.¹ Macciò, D.²

27
- 36148965498
- Efficient sampling in approximate dynamic programming algorithms
- C. Cervellera and M. Muselli, "Efficient sampling in approximate dynamic programming algorithms," Comput. Optim. Appl., vol. 38, no. 3, pp. 417-443, 2007.
- (2007) Comput. Optim. Appl. , vol.38 , Issue.3 , pp. 417-443
- Cervellera, C.¹ Muselli, M.²

28
- 84871395855
- Adaptive value function approximation for continuous-state stochastic dynamic programming
- H. Fan, P. K. Tarun, and V. Chen, "Adaptive value function approximation for continuous-state stochastic dynamic programming," Comput. Oper. Res., vol. 40, no. 4, pp. 1076-1084, 2013.
- (2013) Comput. Oper. Res. , vol.40 , Issue.4 , pp. 1076-1084
- Fan, H.¹ Tarun, P.K.² Chen, V.³

29
- 68349126329
- Non-uniform low-discrepancy sequence generation and integration of singular integrands
- H. Niederreiter and D. Talay, Eds. Berlin, Germany: Springer
- J. Hartinger and R. Kainhofer, "Non-uniform low-discrepancy sequence generation and integration of singular integrands," in Monte Carlo and Quasi-Monte Carlo Methods 2004, H. Niederreiter and D. Talay, Eds. Berlin, Germany: Springer, 2006, pp. 163-179.
- (2006) Monte Carlo and Quasi-Monte Carlo Methods 2004 , pp. 163-179
- Hartinger, J.¹ Kainhofer, R.²

30
- 84908469557
- An analysis based on F-discrepancy for sampling in regression tree learning
- Jul. Beijing, China
- C. Cervellera, M. Gaggero, and D. Macciò, "An analysis based on F-discrepancy for sampling in regression tree learning," in Proc. Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2014, Beijing, China, pp. 1115-1121.
- (2014) Proc. Int. Joint Conf. Neural Netw. (IJCNN) , pp. 1115-1121
- Cervellera, C.¹ Gaggero, M.² Macciò, D.³

31
- 0003860442
- London, U.K.: Chapman and Hall
- K. T. Fang and Y. Wang, Number-Theoretic Methods in Statistics. London, U.K.: Chapman and Hall, 1994.
- (1994) Number-Theoretic Methods in Statistics
- Fang, K.T.¹ Wang, Y.²

32
- 0003834629
- Philadelphia, PA, USA: SIAM
- H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods. Philadelphia, PA, USA: SIAM, 1992.
- (1992) Random Number Generation and Quasi-Monte Carlo Methods
- Niederreiter, H.¹

33
- 84875913709
- High dimensional integration-The quasi-Monte Carlo way
- May
- J. Dick, F. Y. Kuo, and I. H. Sloan, "High dimensional integration-The quasi-Monte Carlo way," Acta Numer., vol. 22, pp. 133-288, May 2013.
- (2013) Acta Numer. , vol.22 , pp. 133-288
- Dick, J.¹ Kuo, F.Y.² Sloan, I.H.³

34
- 0035649406
- The inverse of the star-discrepancy depends linearly on the dimension
- S. Heinrich, E. Novak, G. Wasilkowski, and H. Woźniakowski, "The inverse of the star-discrepancy depends linearly on the dimension," Acta Arith., vol. 96, no. 3, pp. 279-302, 2001.
- (2001) Acta Arith. , vol.96 , Issue.3 , pp. 279-302
- Heinrich, S.¹ Novak, E.² Wasilkowski, G.³ Woźniakowski, H.⁴

35
- 84861366753
- A new randomized algorithm to approximate the star discrepancy based on threshold accepting
- M. Gnewuch, M. Wahlström, and C. Winzen, "A new randomized algorithm to approximate the star discrepancy based on threshold accepting," SIAM J. Numer. Anal., vol. 50, no. 2, pp. 781-807, 2012.
- (2012) SIAM J. Numer. Anal. , vol.50 , Issue.2 , pp. 781-807
- Gnewuch, M.¹ Wahlström, M.² Winzen, C.³

36
- 0027599793
- Universal approximation bounds for superpositions of a sigmoidal function
- May
- A. Barron, "Universal approximation bounds for superpositions of a sigmoidal function," IEEE Trans. Inf. Theory, vol. 39, no. 3, pp. 930-945, May 1993.
- (1993) IEEE Trans. Inf. Theory , vol.39 , Issue.3 , pp. 930-945
- Barron, A.¹

37
- 0001219859
- Regularization theory and neural networks architectures
- F. Girosi, M. Jones, and T. Poggio, "Regularization theory and neural networks architectures," Neural Comput., vol. 7, no. 2, pp. 219-269, 1995.
- (1995) Neural Comput. , vol.7 , Issue.2 , pp. 219-269
- Girosi, F.¹ Jones, M.² Poggio, T.³

38
- 0000796112
- A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training
- L. K. Jones, "A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training," Ann. Stat., vol. 20, no. 1, pp. 608-613, 1992.
- (1992) Ann. Stat. , vol.20 , Issue.1 , pp. 608-613
- Jones, L.K.¹

39
- 0028543366
- Training feedforward networks with the Marquardt algorithm
- Nov.
- M. Hagan and M. Menhaj, "Training feedforward networks with the Marquardt algorithm," IEEE Trans. Neural Netw., vol. 5, no. 6, pp. 989-993, Nov. 1994.
- (1994) IEEE Trans. Neural Netw. , vol.5 , Issue.6 , pp. 989-993
- Hagan, M.¹ Menhaj, M.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.