Volume 37, Issue 3, 2007, Pages 515-527

Reinforcement learning for resource allocation in LEO satellite networks

Author keywords

Call admission control (CAC); Low Earth orbit (LEO) satellite network; Reinforcement learning (RL); Routing; Temporal difference (TD) learning

Indexed keywords

COMPUTATIONAL COMPLEXITY; CONGESTION CONTROL (COMMUNICATION); DECISION MAKING; DYNAMIC PROGRAMMING; MARKOV PROCESSES; RESOURCE ALLOCATION; ROUTING ALGORITHMS;

EID: 34249107342     PISSN: 10834419     EISSN: None     Source Type: Journal    
DOI: 10.1109/TSMCB.2006.886173     Document Type: Article
Times cited: 31

References (41)
  • 1
    • 0000719863
    • Packet routing in dynamically changing networks: A reinforcement learning approach
    • San Mateo, CA: Morgan Kaufmann
    • J. A. Boyan and M. L. Littman, "Packet routing in dynamically changing networks: A reinforcement learning approach," in Advances in Neural Information Processing Systems 6. San Mateo, CA: Morgan Kaufmann, 1994.
    • (1994) Advances in Neural Information Processing Systems 6
    • Boyan, J.A.1    Littman, M.L.2
  • 2
    • 0005977691
    • Reinforcement learning for admission control and routing
    • Ph.D. dissertation, Uppsala Univ, Uppsala, Sweden
    • J. Carlstrom, "Reinforcement learning for admission control and routing," Ph.D. dissertation, Uppsala Univ., Uppsala, Sweden, 2000.
    • (2000)
    • Carlstrom, J.1
  • 3
    • 0037093591
    • Provision of guaranteed services in broadband LEO satellite networks
    • May
    • O. Ercetin, S. Krishnamurthy, S. Dao, and L. Tassiulas, "Provision of guaranteed services in broadband LEO satellite networks," Comput. Netw., vol. 39, no. 1, pp. 61-77, May 2002.
    • (2002) Comput. Netw , vol.39 , Issue.1 , pp. 61-77
    • Ercetin, O.1    Krishnamurthy, S.2    Dao, S.3    Tassiulas, L.4
  • 4
    • 0037093588
    • Signalling for inter-satellite link routing in broadband non-GEO satellite systems
    • May
    • L. Franck and G. Maral, "Signalling for inter-satellite link routing in broadband non-GEO satellite systems," Comput. Netw., vol. 39, no. 1, pp. 79-92, May 2002.
    • (2002) Comput. Netw , vol.39 , Issue.1 , pp. 79-92
    • Franck, L.1    Maral, G.2
  • 5
    • 21244434788
    • Self-aware networks and QoS
    • Sep
    • E. Gelenbe, R. Lent, and A. Nunez, "Self-aware networks and QoS," Proc. IEEE, vol. 92, no. 9, pp. 1478-1489, Sep. 2004.
    • (2004) Proc. IEEE , vol.92 , Issue.9 , pp. 1478-1489
    • Gelenbe, E.1    Lent, R.2    Nunez, A.3
  • 6
    • 0033902857
    • Performance study of adaptive routing algorithms for LEO satellite constellations under self-similar and Poisson traffic
    • Jan
    • I. Gragopoulos, E. Papapetrou, and F. Pavlidou, "Performance study of adaptive routing algorithms for LEO satellite constellations under self-similar and Poisson traffic," Space Commun., vol. 16, no. 1, pp. 15-22, Jan. 2000.
    • (2000) Space Commun , vol.16 , Issue.1 , pp. 15-22
    • Gragopoulos, I.1    Papapetrou, E.2    Pavlidou, F.3
  • 7
    • 0242696084
    • Adaptive provisioning of differentiated services networks based on reinforcement learning
    • Nov
    • T. C.-K. Hui and C.-K. Tham, "Adaptive provisioning of differentiated services networks based on reinforcement learning," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 33, no. 4, pp. 492-501, Nov. 2003.
    • (2003) IEEE Trans. Syst., Man, Cybern. C, Appl. Rev , vol.33 , Issue.4 , pp. 492-501
    • Hui, T.C.-K.1    Tham, C.-K.2
  • 8
    • 0034313087
    • Predictive call admission control scheme for low Earth orbit satellite networks
    • Nov
    • B. W. Kim, S. L. Min, H. S. Yang, and C. S. Kim, "Predictive call admission control scheme for low Earth orbit satellite networks," IEEE Trans. Veh. Technol., vol. 49, no. 6, pp. 2320-2335, Nov. 2000.
    • (2000) IEEE Trans. Veh. Technol , vol.49 , Issue.6 , pp. 2320-2335
    • Kim, B.W.1    Min, S.L.2    Yang, H.S.3    Kim, C.S.4
  • 9
    • 0042758707
    • Actor-critic algorithms
    • Ph.D. dissertation, MIT, Cambridge, MA
    • V. R. Konda, "Actor-critic algorithms," Ph.D. dissertation, MIT, Cambridge, MA, 2002.
    • (2002)
    • Konda, V.R.1
  • 10
    • 4043069840
    • On actor-critic algorithms
    • Aug
    • V. R. Konda and J. N. Tsitsiklis, "On actor-critic algorithms," SIAM J. Control Optim., vol. 42, no. 4, pp. 1143-1166, Aug. 2003.
    • (2003) SIAM J. Control Optim , vol.42 , Issue.4 , pp. 1143-1166
    • Konda, V.R.1    Tsitsiklis, J.N.2
  • 11
    • 0033876565
    • Call admission control and routing in integrated services networks using Neuro-dynamic programming
    • Feb
    • P. Marbach, O. Mihatsch, and J. N. Tsitsiklis, "Call admission control and routing in integrated services networks using Neuro-dynamic programming," IEEE J. Sel. Areas Commun., vol. 18, no. 2, pp. 197-208, Feb. 2000.
    • (2000) IEEE J. Sel. Areas Commun , vol.18 , Issue.2 , pp. 197-208
    • Marbach, P.1    Mihatsch, O.2    Tsitsiklis, J.N.3
  • 13
    • 84898972974
    • Reinforcement learning for dynamic channel allocation in cellular telephone systems
    • Cambridge, MA: MIT Press
    • S. Singh and D. P. Bertsekas, "Reinforcement learning for dynamic channel allocation in cellular telephone systems," in Advances in Neural Information Processing Systems 10. Cambridge, MA: MIT Press, 1997, pp. 974-980.
    • (1997) Advances in Neural Information Processing Systems 10 , pp. 974-980
    • Singh, S.1    Bertsekas, D.P.2
  • 14
    • 0033876566
    • Adaptive call admission control under quality-of-service constraints: A reinforcement learning solution
    • Feb
    • H. Tong and T. X. Brown, "Adaptive call admission control under quality-of-service constraints: A reinforcement learning solution," IEEE J. Sel. Areas Commun., vol. 18, no. 2, pp. 209-220, Feb. 2000.
    • (2000) IEEE J. Sel. Areas Commun , vol.18 , Issue.2 , pp. 209-220
    • Tong, H.1    Brown, T.X.2
  • 15
    • 13444294406
    • A multi-agent, policy-gradient approach to network routing
    • N. Tao, J. Baxter, and L. Weaver, "A multi-agent, policy-gradient approach to network routing," in Proc. 18th Int. Mach. Learn. Conf. 2001, pp. 553-560.
    • (2001) Proc. 18th Int. Mach. Learn. Conf , pp. 553-560
    • Tao, N.1    Baxter, J.2    Weaver, L.3
  • 16
    • 0001046225
    • Practical issues in temporal difference learning
    • May
    • G. Tesauro, "Practical issues in temporal difference learning," Mach. Learn., vol. 8, no. 3/4, pp. 257-277, May 1992.
    • (1992) Mach. Learn , vol.8 , Issue.3-4 , pp. 257-277
    • Tesauro, G.1
  • 17
    • 0033361288
    • An optimized routing scheme and a channel reservation strategy for a low Earth orbit satellite system
    • Sep
    • P. Tam, J. Lui, H. W. Chan, C. Sze, and C. N. Sze, "An optimized routing scheme and a channel reservation strategy for a low Earth orbit satellite system," in Proc. IEEE VTC, Sep. 1999, vol. 5, pp. 2870-2874.
    • (1999) Proc. IEEE VTC , vol.5 , pp. 2870-2874
    • Tam, P.1    Lui, J.2    Chan, H.W.3    Sze, C.4    Sze, C.N.5
  • 18
    • 0034225151
    • A routing algorithm for connection-oriented low Earth orbit (LEO) satellite networks with dynamic connectivity
    • May
    • H. Uzunalioglu, I. F. Akyildiz, and M. Bender, "A routing algorithm for connection-oriented low Earth orbit (LEO) satellite networks with dynamic connectivity," Wireless Netw., vol. 6, no. 3, pp. 181-190, May 2000.
    • (2000) Wireless Netw , vol.6 , Issue.3 , pp. 181-190
    • Uzunalioglu, H.1    Akyildiz, I.F.2    Bender, M.3
  • 19
    • 0036823395
    • Markov decision theory framework for resource allocation in LEO satellite constellations
    • Oct
    • W. Usaha and J. Barria, "Markov decision theory framework for resource allocation in LEO satellite constellations," Proc. Inst. Electr. Eng. - Commun., vol. 149, no. 5, pp. 270-276, Oct. 2002.
    • (2002) Proc. Inst. Electr. Eng. - Commun , vol.149 , Issue.5 , pp. 270-276
    • Usaha, W.1    Barria, J.2
  • 21
    • 7744231640
    • Resource allocation in networks with dynamic topology
    • Ph.D. dissertation, Imperial College London, London, U.K
    • W. Usaha, "Resource allocation in networks with dynamic topology," Ph.D. dissertation, Imperial College London, London, U.K., 2004.
    • (2004)
    • Usaha, W.1
  • 22
    • 0031628198
    • Probabilistic routing protocol for low Earth orbit satellite networks
    • Jun
    • H. Uzunalioglu, "Probabilistic routing protocol for low Earth orbit satellite networks," in Proc. ICC, Jun. 1998, vol. 1, pp. 89-93.
    • (1998) Proc. ICC , vol.1 , pp. 89-93
    • Uzunalioglu, H.1
  • 23
    • 0030735996
    • ATM-based routing in LEO/MEO satellite networks with intersatellite links
    • Jan
    • M. Werner, C. Delucchi, H. Vogel, G. Maral, and J. De Ridder, "ATM-based routing in LEO/MEO satellite networks with intersatellite links," IEEE J. Sel. Areas Commun., vol. 15, no. 1, pp. 69-82, Jan. 1997.
    • (1997) IEEE J. Sel. Areas Commun , vol.15 , Issue.1 , pp. 69-82
    • Werner, M.1    Delucchi, C.2    Vogel, H.3    Maral, G.4    De Ridder, J.5
  • 24
    • 0031259023
    • A dynamic routing concept for ATM-based satellite personal communication network
    • Oct
    • M. Werner, "A dynamic routing concept for ATM-based satellite personal communication network," IEEE J. Sel. Areas Commun., vol. 15, no. 8, pp. 1636-1648, Oct. 1997.
    • (1997) IEEE J. Sel. Areas Commun , vol.15 , Issue.8 , pp. 1636-1648
    • Werner, M.1
  • 25
    • 0032292104
    • A neural network based approach to distributed adaptive routing of LEO intersatellite link traffic
    • May
    • M. Werner, C. Mayer, G. Maral, and M. Holzbock, "A neural network based approach to distributed adaptive routing of LEO intersatellite link traffic," in Proc. IEEE VTC, May 1998, vol. 2, pp. 1498-1502.
    • (1998) Proc. IEEE VTC , vol.2 , pp. 1498-1502
    • Werner, M.1    Mayer, C.2    Maral, G.3    Holzbock, M.4
  • 26
    • 85156225449
    • High performance job-shop scheduling with a time delay TD(λ) network
    • Cambridge, MA: MIT Press
    • W. Zhang and T. G. Dietterich, "High performance job-shop scheduling with a time delay TD(λ) network," in Advances in Neural Information Processing Systems 8. Cambridge, MA: MIT Press, 1996, pp. 1024-1030.
    • (1996) Advances in Neural Information Processing Systems 8 , pp. 1024-1030
    • Zhang, W.1    Dietterich, T.G.2
  • 28
    • 0020970738
    • Neuronlike adaptive elements that can solve difficult learning control problems
    • Sep./Oct
    • A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. SMC-13, no. 5, pp. 834-846, Sep./Oct. 1983.
    • (1983) IEEE Trans. Syst., Man, Cybern , vol.SMC-13 , Issue.5 , pp. 834-846
    • Barto, A.G.1    Sutton, R.S.2    Anderson, C.W.3
  • 29
    • 0013535965
    • Infinite-horizon policy gradient estimation
    • Jul.-Dec
    • J. Baxter and P. L. Bartlett, "Infinite-horizon policy gradient estimation," J. Artif. Intell. Res., vol. 15, no. 4, pp. 319-350, Jul.-Dec. 2001.
    • (2001) J. Artif. Intell. Res , vol.15 , Issue.4 , pp. 319-350
    • Baxter, J.1    Bartlett, P.L.2
  • 32
    • 85153938292
    • Reinforcement learning algorithm for partially observable Markov decision problems
    • San Mateo, CA: Morgan Kaufmann
    • T. Jaakkola, S. P. Singh, and M. I. Jordan, "Reinforcement learning algorithm for partially observable Markov decision problems," in Advances in Neural Information Processing Systems 7. San Mateo, CA: Morgan Kaufmann, 1995, pp. 345-352.
    • (1995) Advances in Neural Information Processing Systems 7 , pp. 345-352
    • Jaakkola, T.1    Singh, S.P.2    Jordan, M.I.3
  • 33
    • 0343893613
    • Actor-critic like learning algorithms for Markov decision processes
    • V. R. Konda and V. S. Borkar, "Actor-critic like learning algorithms for Markov decision processes," SIAM J. Control Optim., vol. 38, no. 1, pp. 94-123, 1999.
    • (1999) SIAM J. Control Optim , vol.38 , Issue.1 , pp. 94-123
    • Konda, V.R.1    Borkar, V.S.2
  • 34
    • 0029752592
    • Average reward reinforcement learning: Foundations, algorithms and empirical results
    • S. Mahadevan, "Average reward reinforcement learning: Foundations, algorithms and empirical results," Mach. Learn., vol. 22, no. 1-3, pp. 159-196, 1996.
    • (1996) Mach. Learn , vol.22 , Issue.1-3 , pp. 159-196
    • Mahadevan, S.1
  • 35
    • 0035249254
    • Simulation-based optimization of Markov reward processes
    • Feb
    • P. Marbach and J. N. Tsitsiklis, "Simulation-based optimization of Markov reward processes," IEEE Trans. Autom. Control, vol. 46, no. 2, pp. 191-209, Feb. 2001.
    • (2001) IEEE Trans. Autom. Control , vol.46 , Issue.2 , pp. 191-209
    • Marbach, P.1    Tsitsiklis, J.N.2
  • 37
    • 0031143730
    • An analysis of temporal-difference learning with function approximation
    • May
    • J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Trans. Autom. Control, vol. 42, no. 5, pp. 674-690, May 1997.
    • (1997) IEEE Trans. Autom. Control , vol.42 , Issue.5 , pp. 674-690
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 38
    • 0033221519
    • Average cost temporal-difference learning
    • Nov
    • J. N. Tsitsiklis, "Average cost temporal-difference learning," Automatica, vol. 35, no. 11, pp. 1799-1808, Nov. 1999.
    • (1999) Automatica , vol.35 , Issue.11 , pp. 1799-1808
    • Tsitsiklis, J.N.1
  • 39
    • 21444437925
    • The optimal reward baseline for gradient based reinforcement learning
    • Seattle, WA, Aug
    • L. Weaver and N. Tao, "The optimal reward baseline for gradient based reinforcement learning," in Proc. 17th Conf. Uncertain. Artif. Intell., Seattle, WA, Aug. 2001, pp. 538-545.
    • (2001) Proc. 17th Conf. Uncertain. Artif. Intell , pp. 538-545
    • Weaver, L.1    Tao, N.2
  • 40
    • 0000337576
    • Simple statistical gradient-following algorithms for connectionist reinforcement learning
    • May
    • R. J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Mach. Learn., vol. 8, no. 3/4, pp. 229-256, May 1992.
    • (1992) Mach. Learn , vol.8 , Issue.3-4 , pp. 229-256
    • Williams, R.J.1
  • 41
    • 7744247423
    • A reinforcement learning ticket-based probing path discovery scheme for MANETs
    • W. Usaha and J. Barria, "A reinforcement learning ticket-based probing path discovery scheme for MANETs," Ad Hoc Netw., vol. 2, no. 3, pp. 319-334, 2004.
    • (2004) Ad Hoc Netw , vol.2 , Issue.3 , pp. 319-334
    • Usaha, W.1    Barria, J.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.