SCOPUS Information Retrieval Platform

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Volume 37, Issue 3, 2007, Pages 515-527

Reinforcement learning for resource allocation in LEO satellite networks

a SURANAREE UNIVERSITY OF TECHNOLOGY (Thailand)

b IMPERIAL COLLEGE LONDON (United Kingdom)

Author keywords

Call admission control (CAC); Low Earth orbit (LEO) satellite network; Reinforcement learning (RL); Routing; Temporal difference (TD) learning

Indexed keywords

COMPUTATIONAL COMPLEXITY; CONGESTION CONTROL (COMMUNICATION); DECISION MAKING; DYNAMIC PROGRAMMING; MARKOV PROCESSES; RESOURCE ALLOCATION; ROUTING ALGORITHMS;

LOW EARTH ORBIT (LEO) SATELLITE NETWORKS; TEMPORAL-DIFFERENCE LEARNING;

REINFORCEMENT LEARNING;

ALGORITHM; ARTICLE; ARTIFICIAL INTELLIGENCE; AUTOMATED PATTERN RECOGNITION; COMPUTER NETWORK; DECISION SUPPORT SYSTEM; METHODOLOGY; RESOURCE ALLOCATION; SIGNAL PROCESSING; SPACE FLIGHT;

ALGORITHMS; ARTIFICIAL INTELLIGENCE; COMPUTER COMMUNICATION NETWORKS; DECISION SUPPORT TECHNIQUES; PATTERN RECOGNITION, AUTOMATED; RESOURCE ALLOCATION; SIGNAL PROCESSING, COMPUTER-ASSISTED; SPACECRAFT;

EID: 34249107342 PISSN: 10834419 EISSN: None Source Type: Journal
DOI: 10.1109/TSMCB.2006.886173 Document Type: Article

Times cited : (31)

References (41)

1
- 0000719863
- Packet routing in dynamically changing networks: A reinforcement learning approach
- San Mateo, CA: Morgan Kaufmann
- J. A. Boyan and M. L. Littman, "Packet routing in dynamically changing networks: A reinforcement learning approach," in Advances in Neural Information Processing Systems 6. San Mateo, CA: Morgan Kaufmann, 1994.
- (1994) Advances in Neural Information Processing Systems 6
- Boyan, J.A.¹ Littman, M.L.²

2
- 0005977691
- Reinforcement learning for admission control and routing
- Ph.D. dissertation, Uppsala Univ, Uppsala, Sweden
- J. Carlstrom, "Reinforcement learning for admission control and routing," Ph.D. dissertation, Uppsala Univ., Uppsala, Sweden, 2000.
- (2000)
- Carlstrom, J.¹

3
- 0037093591
- Provision of guaranteed services in broadband LEO satellite networks
- May
- O. Ercetin, S. Krishnamurthy, S. Dao, and L. Tassiulas, "Provision of guaranteed services in broadband LEO satellite networks," Comput. Netw., vol. 39, no. 1, pp. 61-77, May 2002.
- (2002) Comput. Netw , vol.39 , Issue.1 , pp. 61-77
- Ercetin, O.¹ Krishnamurthy, S.² Dao, S.³ Tassiulas, L.⁴

4
- 0037093588
- Signalling for inter-satellite link routing in broadband non-GEO satellite systems
- May
- L. Franck and G. Maral, "Signalling for inter-satellite link routing in broadband non-GEO satellite systems," Comput. Netw., vol. 39, no. 1, pp. 79-92, May 2002.
- (2002) Comput. Netw , vol.39 , Issue.1 , pp. 79-92
- Franck, L.¹ Maral, G.²

5
- 21244434788
- Self-aware networks and QoS
- Sep
- E. Gelenbe, R. Lent, and A. Nunez, "Self-aware networks and QoS," Proc. IEEE, vol. 92, no. 9, pp. 1478-1489, Sep. 2004.
- (2004) Proc. IEEE , vol.92 , Issue.9 , pp. 1478-1489
- Gelenbe, E.¹ Lent, R.² Nunez, A.³

6
- 0033902857
- Performance study of adaptive routing algorithms for LEO satellite constellations under self-similar and Poisson traffic
- Jan
- I. Gragopoulos, E. Papapetrou, and F. Pavlidou, "Performance study of adaptive routing algorithms for LEO satellite constellations under self-similar and Poisson traffic," Space Commun., vol. 16, no. 1, pp. 15-22, Jan. 2000.
- (2000) Space Commun , vol.16 , Issue.1 , pp. 15-22
- Gragopoulos, I.¹ Papapetrou, E.² Pavlidou, F.³

7
- 0242696084
- Adaptive provisioning of differentiated services networks based on reinforcement learning
- Nov
- T. C.-K. Hui and C.-K. Tham, "Adaptive provisioning of differentiated services networks based on reinforcement learning," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 33, no. 4, pp. 492-501, Nov. 2003.
- (2003) IEEE Trans. Syst., Man, Cybern. C, Appl. Rev , vol.33 , Issue.4 , pp. 492-501
- Hui, T.C.-K.¹ Tham, C.-K.²

8
- 0034313087
- Predictive call admission control scheme for low Earth orbit satellite networks
- Nov
- B. W. Kim, S. L. Min, H. S. Yang, and C. S. Kim, "Predictive call admission control scheme for low Earth orbit satellite networks," IEEE Trans. Veh. Technol., vol. 49, no. 6, pp. 2320-2335, Nov. 2000.
- (2000) IEEE Trans. Veh. Technol , vol.49 , Issue.6 , pp. 2320-2335
- Kim, B.W.¹ Min, S.L.² Yang, H.S.³ Kim, C.S.⁴

9
- 0042758707
- Actor-critic algorithms
- Ph.D. dissertation, MIT, Cambridge, MA
- V. R. Konda, "Actor-critic algorithms," Ph.D. dissertation, MIT, Cambridge, MA, 2002.
- (2002)
- Konda, V.R.¹

10
- 4043069840
- On actor-critic algorithms
- Aug
- V. R. Konda and J. N. Tsitsiklis, "On actor-critic algorithms," SIAM J. Control Optim., vol. 42, no. 4, pp. 1143-1166, Aug. 2003.
- (2003) SIAM J. Control Optim , vol.42 , Issue.4 , pp. 1143-1166
- Konda, V.R.¹ Tsitsiklis, J.N.²

11
- 0033876565
- Call admission control and routing in integrated services networks using Neuro-dynamic programming
- Feb
- P. Marbach, O. Mihatsch, and J. N. Tsitsiklis, "Call admission control and routing in integrated services networks using Neuro-dynamic programming," IEEE J. Sel. Areas Commun., vol. 18, no. 2, pp. 197-208, Feb. 2000.
- (2000) IEEE J. Sel. Areas Commun , vol.18 , Issue.2 , pp. 197-208
- Marbach, P.¹ Mihatsch, O.² Tsitsiklis, J.N.³

12
- 0036082856
- Reinforcement learning for adaptive routing
- L. Peshkin and V. Savova, "Reinforcement learning for adaptive routing," in Proc. Int. Joint Conf. Neural Netw., 2002, pp. 1825-1830.
- (2002) Proc. Int. Joint Conf. Neural Netw , pp. 1825-1830
- Peshkin, L.¹ Savova, V.²

13
- 84898972974
- Reinforcement learning for dynamic channel allocation in cellular telephone systems
- Cambridge, MA: MIT Press
- S. Singh and D. P. Bertsekas, "Reinforcement learning for dynamic channel allocation in cellular telephone systems," in Advances in Neural Information Processing Systems 10. Cambridge, MA: MIT Press, 1997, pp. 974-980.
- (1997) Advances in Neural Information Processing Systems 10 , pp. 974-980
- Singh, S.¹ Bertsekas, D.P.²

14
- 0033876566
- Adaptive call admission control under quality-of-service constraints: A reinforcement learning solution
- Feb
- H. Tong and T. X. Brown, "Adaptive call admission control under quality-of-service constraints: A reinforcement learning solution," IEEE J. Sel. Areas Commun., vol. 18, no. 2, pp. 209-220, Feb. 2000.
- (2000) IEEE J. Sel. Areas Commun , vol.18 , Issue.2 , pp. 209-220
- Tong, H.¹ Brown, T.X.²

15
- 13444294406
- A multi-agent, policy-gradient approach to network routing
- N. Tao, J. Baxter, and L. Weaver, "A multi-agent, policy-gradient approach to network routing," in Proc. 18th Int. Mach. Learn. Conf., 2001, pp. 553-560.
- (2001) Proc. 18th Int. Mach. Learn. Conf , pp. 553-560
- Tao, N.¹ Baxter, J.² Weaver, L.³

16
- 0001046225
- Practical issues in temporal difference learning
- May
- G. Tesauro, "Practical issues in temporal difference learning," Mach. Learn., vol. 8, no. 3/4, pp. 257-277, May 1992.
- (1992) Mach. Learn , vol.8 , Issue.3-4 , pp. 257-277
- Tesauro, G.¹

17
- 0033361288
- An optimized routing scheme and a channel reservation strategy for a low Earth orbit satellite system
- Sep
- P. Tam, J. Lui, H. W. Chan, C. Sze, and C. N. Sze, "An optimized routing scheme and a channel reservation strategy for a low Earth orbit satellite system," in Proc. IEEE VTC, Sep. 1999, vol. 5, pp. 2870-2874.
- (1999) Proc. IEEE VTC , vol.5 , pp. 2870-2874
- Tam, P.¹ Lui, J.² Chan, H.W.³ Sze, C.⁴ Sze, C.N.⁵

18
- 0034225151
- A routing algorithm for connection-oriented low Earth orbit (LEO) satellite networks with dynamic connectivity
- May
- H. Uzunalioglu, I. F. Akyildiz, and M. Bender, "A routing algorithm for connection-oriented low Earth orbit (LEO) satellite networks with dynamic connectivity," Wireless Netw., vol. 6, no. 3, pp. 181-190, May 2000.
- (2000) Wireless Netw , vol.6 , Issue.3 , pp. 181-190
- Uzunalioglu, H.¹ Akyildiz, I.F.² Bender, M.³

19
- 0036823395
- Markov decision theory framework for resource allocation in LEO satellite constellations
- Oct
- W. Usaha and J. Barria, "Markov decision theory framework for resource allocation in LEO satellite constellations," Proc. Inst. Electr. Eng. - Commun., vol. 149, no. 5, pp. 270-276, Oct. 2002.
- (2002) Proc. Inst. Electr. Eng. - Commun , vol.149 , Issue.5 , pp. 270-276
- Usaha, W.¹ Barria, J.²

20
- 0003565783
- Belmont, MA: Athena Scientific
- D. P. Bertsekas, Dynamic Programming and Optimal Control. Belmont, MA: Athena Scientific, 1995.
- (1995) Dynamic Programming and Optimal Control
- Bertsekas, D.P.¹

21
- 7744231640
- Resource allocation in networks with dynamic topology
- Ph.D. dissertation, Imperial College London, London, U.K
- W. Usaha, "Resource allocation in networks with dynamic topology," Ph.D. dissertation, Imperial College London, London, U.K., 2004.
- (2004)
- Usaha, W.¹

22
- 0031628198
- Probabilistic routing protocol for low Earth orbit satellite networks
- Jun
- H. Uzunalioglu, "Probabilistic routing protocol for low Earth orbit satellite networks," in Proc. ICC, Jun. 1998, vol. 1, pp. 89-93.
- (1998) Proc. ICC , vol.1 , pp. 89-93
- Uzunalioglu, H.¹

23
- 0030735996
- ATM-based routing in LEO/MEO satellite networks with intersatellite links
- Jan
- M. Werner, C. Delucchi, H. Vogel, G. Maral, and J. De Ridder, "ATM-based routing in LEO/MEO satellite networks with intersatellite links," IEEE J. Sel. Areas Commun., vol. 15, no. 1, pp. 69-82, Jan. 1997.
- (1997) IEEE J. Sel. Areas Commun , vol.15 , Issue.1 , pp. 69-82
- Werner, M.¹ Delucchi, C.² Vogel, H.³ Maral, G.⁴ De Ridder, J.⁵

24
- 0031259023
- A dynamic routing concept for ATM-based satellite personal communication network
- Oct
- M. Werner, "A dynamic routing concept for ATM-based satellite personal communication network," IEEE J. Sel. Areas Commun., vol. 15, no. 8, pp. 1636-1648, Oct. 1997.
- (1997) IEEE J. Sel. Areas Commun , vol.15 , Issue.8 , pp. 1636-1648
- Werner, M.¹

25
- 0032292104
- A neural network based approach to distributed adaptive routing of LEO intersatellite link traffic
- May
- M. Werner, C. Mayer, G. Maral, and M. Holzbock, "A neural network based approach to distributed adaptive routing of LEO intersatellite link traffic," in Proc. IEEE VTC, May 1998, vol. 2, pp. 1498-1502.
- (1998) Proc. IEEE VTC , vol.2 , pp. 1498-1502
- Werner, M.¹ Mayer, C.² Maral, G.³ Holzbock, M.⁴

26
- 85156225449
- High performance job-shop scheduling with a time delay TD(λ) network
- Cambridge, MA: MIT Press
- W. Zhang and T. G. Dietterich, "High performance job-shop scheduling with a time delay TD(λ) network," in Advances in Neural Information Processing Systems 8. Cambridge, MA: MIT Press, 1996, pp. 1024-1030.
- (1996) Advances in Neural Information Processing Systems 8 , pp. 1024-1030
- Zhang, W.¹ Dietterich, T.G.²

27
- 0003874616
- Lab. Inf. Decision Syst, MIT, Cambridge, MA, Rep. LIDS-P-2434
- J. Abounadi, D. P. Bertsekas, and V. S. Borkar, "Learning algorithms for Markov decision processes," Lab. Inf. Decision Syst., MIT, Cambridge, MA, Rep. LIDS-P-2434, 1998.
- (1998) Learning algorithms for Markov decision processes
- Abounadi, J.¹ Bertsekas, D.P.² Borkar, V.S.³

28
- 0020970738
- Neuronlike adaptive elements that can solve difficult learning control problems
- Sep./Oct
- A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. SMC-13, no. 5, pp. 834-846, Sep./Oct. 1983.
- (1983) IEEE Trans. Syst., Man, Cybern , vol.SMC-13 , Issue.5 , pp. 834-846
- Barto, A.G.¹ Sutton, R.S.² Anderson, C.W.³

29
- 0013535965
- Infinite-horizon policy gradient estimation
- Jul.-Dec
- J. Baxter and P. L. Bartlett, "Infinite-horizon policy gradient estimation," J. Artif. Intell. Res., vol. 15, no. 4, pp. 319-350, Jul.-Dec. 2001.
- (2001) J. Artif. Intell. Res , vol.15 , Issue.4 , pp. 319-350
- Baxter, J.¹ Bartlett, P.L.²

30
- 0003487482
- Belmont, MA: Athena Scientific
- D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming. Belmont, MA: Athena Scientific, 1996.
- (1996) Neuro-Dynamic Programming
- Bertsekas, D.P.¹ Tsitsiklis, J.N.²

31
- 34249002045
- Variance reduction techniques for gradient estimates in reinforcement learning
- Vancouver, BC, Canada: MIT Press
- E. Greensmith, P. L. Bartlett, and J. Baxter, "Variance reduction techniques for gradient estimates in reinforcement learning," in Advances in Neural Information Processing Systems 14. Vancouver, BC, Canada: MIT Press, 2001.
- (2001) Advances in Neural Information Processing Systems 14
- Greensmith, E.¹ Bartlett, P.L.² Baxter, J.³

32
- 85153938292
- Reinforcement learning algorithm for partially observable Markov decision problems
- San Mateo, CA: Morgan Kaufmann
- T. Jaakkola, S. P. Singh, and M. I. Jordan, "Reinforcement learning algorithm for partially observable Markov decision problems," in Advances in Neural Information Processing Systems 7. San Mateo, CA: Morgan Kaufmann, 1995, pp. 345-352.
- (1995) Advances in Neural Information Processing Systems 7 , pp. 345-352
- Jaakkola, T.¹ Singh, S.P.² Jordan, M.I.³

33
- 0343893613
- Actor-critic like learning algorithms for Markov decision processes
- V. R. Konda and V. S. Borkar, "Actor-critic like learning algorithms for Markov decision processes," SIAM J. Control Optim., vol. 38, no. 1, pp. 94-123, 1999.
- (1999) SIAM J. Control Optim , vol.38 , Issue.1 , pp. 94-123
- Konda, V.R.¹ Borkar, V.S.²

34
- 0029752592
- Average reward reinforcement learning: Foundations, algorithms and empirical results
- S. Mahadevan, "Average reward reinforcement learning: Foundations, algorithms and empirical results," Mach. Learn., vol. 22, no. 1-3, pp. 159-196, 1996.
- (1996) Mach. Learn , vol.22 , Issue.1-3 , pp. 159-196
- Mahadevan, S.¹

35
- 0035249254
- Simulation-based optimization of Markov reward processes
- Feb
- P. Marbach and J. N. Tsitsiklis, "Simulation-based optimization of Markov reward processes," IEEE Trans. Autom. Control, vol. 46, no. 2, pp. 191-209, Feb. 2001.
- (2001) IEEE Trans. Autom. Control , vol.46 , Issue.2 , pp. 191-209
- Marbach, P.¹ Tsitsiklis, J.N.²

36
- 0004102479
- Cambridge, MA: MIT Press
- R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

37
- 0031143730
- An analysis of temporal-difference learning with function approximation
- May
- J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Trans. Autom. Control, vol. 42, no. 5, pp. 674-690, May 1997.
- (1997) IEEE Trans. Autom. Control , vol.42 , Issue.5 , pp. 674-690
- Tsitsiklis, J.N.¹ Van Roy, B.²

38
- 0033221519
- Average cost temporal-difference learning
- Nov
- J. N. Tsitsiklis, "Average cost temporal-difference learning," Automatica, vol. 35, no. 11, pp. 1799-1808, Nov. 1999.
- (1999) Automatica , vol.35 , Issue.11 , pp. 1799-1808
- Tsitsiklis, J.N.¹

39
- 21444437925
- The optimal reward baseline for gradient based reinforcement learning
- Seattle, WA, Aug
- L. Weaver and N. Tao, "The optimal reward baseline for gradient based reinforcement learning," in Proc. 17th Conf. Uncertain. Artif. Intell., Seattle, WA, Aug. 2001, pp. 538-545.
- (2001) Proc. 17th Conf. Uncertain. Artif. Intell , pp. 538-545
- Weaver, L.¹ Tao, N.²

40
- 0000337576
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- May
- R. J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Mach. Learn., vol. 8, no. 3/4, pp. 229-256, May 1992.
- (1992) Mach. Learn , vol.8 , Issue.3-4 , pp. 229-256
- Williams, R.J.¹

41
- 7744247423
- A reinforcement learning ticket-based probing path discovery scheme for MANETs
- W. Usaha and J. Barria, "A reinforcement learning ticket-based probing path discovery scheme for MANETs," Ad Hoc Netw., vol. 2, no. 3, pp. 319-334, 2004.
- (2004) Ad Hoc Netw , vol.2 , Issue.3 , pp. 319-334
- Usaha, W.¹ Barria, J.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.