-
1
-
-
0000719863
-
Packet routing in dynamically changing networks: A reinforcement learning approach
-
San Mateo, CA: Morgan Kaufmann
-
J. A. Boyan and M. L. Littman, "Packet routing in dynamically changing networks: A reinforcement learning approach," in Advances in Neural Information Processing Systems 6. San Mateo, CA: Morgan Kaufmann, 1994.
-
(1994)
Advances in Neural Information Processing Systems 6
-
-
Boyan, J.A.1
Littman, M.L.2
-
2
-
-
0005977691
-
Reinforcement learning for admission control and routing
-
Ph.D. dissertation, Uppsala Univ, Uppsala, Sweden
-
J. Carlstrom, "Reinforcement learning for admission control and routing," Ph.D. dissertation, Uppsala Univ., Uppsala, Sweden, 2000.
-
(2000)
-
-
Carlstrom, J.1
-
3
-
-
0037093591
-
Provision of guaranteed services in broadband LEO satellite networks
-
May
-
O. Ercetin, S. Krishnamurthy, S. Dao, and L. Tassiulas, "Provision of guaranteed services in broadband LEO satellite networks," Comput. Netw., vol. 39, no. 1, pp. 61-77, May 2002.
-
(2002)
Comput. Netw
, vol.39
, Issue.1
, pp. 61-77
-
-
Ercetin, O.1
Krishnamurthy, S.2
Dao, S.3
Tassiulas, L.4
-
4
-
-
0037093588
-
Signalling for inter-satellite link routing in broadband non-GEO satellite systems
-
May
-
L. Franck and G. Maral, "Signalling for inter-satellite link routing in broadband non-GEO satellite systems," Comput. Netw., vol. 39, no. 1, pp. 79-92, May 2002.
-
(2002)
Comput. Netw
, vol.39
, Issue.1
, pp. 79-92
-
-
Franck, L.1
Maral, G.2
-
5
-
-
21244434788
-
Self-aware networks and QoS
-
Sep
-
E. Gelenbe, R. Lent, and A. Nunez, "Self-aware networks and QoS," Proc. IEEE, vol. 92, no. 9, pp. 1478-1489, Sep. 2004.
-
(2004)
Proc. IEEE
, vol.92
, Issue.9
, pp. 1478-1489
-
-
Gelenbe, E.1
Lent, R.2
Nunez, A.3
-
6
-
-
0033902857
-
Performance study of adaptive routing algorithms for LEO satellite constellations under self-similar and Poisson traffic
-
Jan
-
I. Gragopoulos, E. Papapetrou, and F. Pavlidou, "Performance study of adaptive routing algorithms for LEO satellite constellations under self-similar and Poisson traffic," Space Commun., vol. 16, no. 1, pp. 15-22, Jan. 2000.
-
(2000)
Space Commun
, vol.16
, Issue.1
, pp. 15-22
-
-
Gragopoulos, I.1
Papapetrou, E.2
Pavlidou, F.3
-
7
-
-
0242696084
-
Adaptive provisioning of differentiated services networks based on reinforcement learning
-
Nov
-
T. C.-K. Hui and C.-K. Tham, "Adaptive provisioning of differentiated services networks based on reinforcement learning," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 33, no. 4, pp. 492-501, Nov. 2003.
-
(2003)
IEEE Trans. Syst., Man, Cybern. C, Appl. Rev
, vol.33
, Issue.4
, pp. 492-501
-
-
Hui, T.C.-K.1
Tham, C.-K.2
-
8
-
-
0034313087
-
Predictive call admission control scheme for low Earth orbit satellite networks
-
Nov
-
B. W. Kim, S. L. Min, H. S. Yang, and C. S. Kim, "Predictive call admission control scheme for low Earth orbit satellite networks," IEEE Trans. Veh. Technol., vol. 49, no. 6, pp. 2320-2335, Nov. 2000.
-
(2000)
IEEE Trans. Veh. Technol
, vol.49
, Issue.6
, pp. 2320-2335
-
-
Kim, B.W.1
Min, S.L.2
Yang, H.S.3
Kim, C.S.4
-
9
-
-
0042758707
-
Actor-critic algorithms
-
Ph.D. dissertation, MIT, Cambridge, MA
-
V. R. Konda, "Actor-critic algorithms," Ph.D. dissertation, MIT, Cambridge, MA, 2002.
-
(2002)
-
-
Konda, V.R.1
-
10
-
-
4043069840
-
On actor-critic algorithms
-
Aug
-
V. R. Konda and J. N. Tsitsiklis, "On actor-critic algorithms," SIAM J. Control Optim., vol. 42, no. 4, pp. 1143-1166, Aug. 2003.
-
(2003)
SIAM J. Control Optim
, vol.42
, Issue.4
, pp. 1143-1166
-
-
Konda, V.R.1
Tsitsiklis, J.N.2
-
11
-
-
0033876565
-
Call admission control and routing in integrated services networks using Neuro-dynamic programming
-
Feb
-
P. Marbach, O. Mihatsch, and J. N. Tsitsiklis, "Call admission control and routing in integrated services networks using Neuro-dynamic programming," IEEE J. Sel. Areas Commun., vol. 18, no. 2, pp. 197-208, Feb. 2000.
-
(2000)
IEEE J. Sel. Areas Commun
, vol.18
, Issue.2
, pp. 197-208
-
-
Marbach, P.1
Mihatsch, O.2
Tsitsiklis, J.N.3
-
13
-
-
84898972974
-
Reinforcement learning for dynamic channel allocation in cellular telephone systems
-
Cambridge, MA: MIT Press
-
S. Singh and D. P. Bertsekas, "Reinforcement learning for dynamic channel allocation in cellular telephone systems," in Advances in Neural Information Processing Systems 10. Cambridge, MA: MIT Press, 1997, pp. 974-980.
-
(1997)
Advances in Neural Information Processing Systems 10
, pp. 974-980
-
-
Singh, S.1
Bertsekas, D.P.2
-
14
-
-
0033876566
-
Adaptive call admission control under quality-of-service constraints: A reinforcement learning solution
-
Feb
-
H. Tong and T. X. Brown, "Adaptive call admission control under quality-of-service constraints: A reinforcement learning solution," IEEE J. Sel. Areas Commun., vol. 18, no. 2, pp. 209-220, Feb. 2000.
-
(2000)
IEEE J. Sel. Areas Commun
, vol.18
, Issue.2
, pp. 209-220
-
-
Tong, H.1
Brown, T.X.2
-
15
-
-
13444294406
-
A multi-agent, policy-gradient approach to network routing
-
N. Tao, J. Baxter, and L. Weaver, "A multi-agent, policy-gradient approach to network routing," in Proc. 18th Int. Mach. Learn. Conf., 2001, pp. 553-560.
-
(2001)
Proc. 18th Int. Mach. Learn. Conf
, pp. 553-560
-
-
Tao, N.1
Baxter, J.2
Weaver, L.3
-
16
-
-
0001046225
-
Practical issues in temporal difference learning
-
May
-
G. Tesauro, "Practical issues in temporal difference learning," Mach. Learn., vol. 8, no. 3/4, pp. 257-277, May 1992.
-
(1992)
Mach. Learn
, vol.8
, Issue.3-4
, pp. 257-277
-
-
Tesauro, G.1
-
17
-
-
0033361288
-
An optimized routing scheme and a channel reservation strategy for a low Earth orbit satellite system
-
Sep
-
P. Tam, J. Lui, H. W. Chan, C. Sze, and C. N. Sze, "An optimized routing scheme and a channel reservation strategy for a low Earth orbit satellite system," in Proc. IEEE VTC, Sep. 1999, vol. 5, pp. 2870-2874.
-
(1999)
Proc. IEEE VTC
, vol.5
, pp. 2870-2874
-
-
Tam, P.1
Lui, J.2
Chan, H.W.3
Sze, C.4
Sze, C.N.5
-
18
-
-
0034225151
-
A routing algorithm for connection-oriented low Earth orbit (LEO) satellite networks with dynamic connectivity
-
May
-
H. Uzunalioglu, I. F. Akyildiz, and M. Bender, "A routing algorithm for connection-oriented low Earth orbit (LEO) satellite networks with dynamic connectivity," Wireless Netw., vol. 6, no. 3, pp. 181-190, May 2000.
-
(2000)
Wireless Netw
, vol.6
, Issue.3
, pp. 181-190
-
-
Uzunalioglu, H.1
Akyildiz, I.F.2
Bender, M.3
-
19
-
-
0036823395
-
Markov decision theory framework for resource allocation in LEO satellite constellations
-
Oct
-
W. Usaha and J. Barria, "Markov decision theory framework for resource allocation in LEO satellite constellations," Proc. Inst. Electr. Eng. - Commun., vol. 149, no. 5, pp. 270-276, Oct. 2002.
-
(2002)
Proc. Inst. Electr. Eng. - Commun
, vol.149
, Issue.5
, pp. 270-276
-
-
Usaha, W.1
Barria, J.2
-
21
-
-
7744231640
-
Resource allocation in networks with dynamic topology
-
Ph.D. dissertation, Imperial College London, London, U.K
-
W. Usaha, "Resource allocation in networks with dynamic topology," Ph.D. dissertation, Imperial College London, London, U.K., 2004.
-
(2004)
-
-
Usaha, W.1
-
22
-
-
0031628198
-
Probabilistic routing protocol for low Earth orbit satellite networks
-
Jun
-
H. Uzunalioglu, "Probabilistic routing protocol for low Earth orbit satellite networks," in Proc. ICC, Jun. 1998, vol. 1, pp. 89-93.
-
(1998)
Proc. ICC
, vol.1
, pp. 89-93
-
-
Uzunalioglu, H.1
-
23
-
-
0030735996
-
ATM-based routing in LEO/MEO satellite networks with intersatellite links
-
Jan
-
M. Werner, C. Delucchi, H. Vogel, G. Maral, and J. De Ridder, "ATM-based routing in LEO/MEO satellite networks with intersatellite links," IEEE J. Sel. Areas Commun., vol. 15, no. 1, pp. 69-82, Jan. 1997.
-
(1997)
IEEE J. Sel. Areas Commun
, vol.15
, Issue.1
, pp. 69-82
-
-
Werner, M.1
Delucchi, C.2
Vogel, H.3
Maral, G.4
De Ridder, J.5
-
24
-
-
0031259023
-
A dynamic routing concept for ATM-based satellite personal communication network
-
Oct
-
M. Werner, "A dynamic routing concept for ATM-based satellite personal communication network," IEEE J. Sel. Areas Commun., vol. 15, no. 8, pp. 1636-1648, Oct. 1997.
-
(1997)
IEEE J. Sel. Areas Commun
, vol.15
, Issue.8
, pp. 1636-1648
-
-
Werner, M.1
-
25
-
-
0032292104
-
A neural network based approach to distributed adaptive routing of LEO intersatellite link traffic
-
May
-
M. Werner, C. Mayer, G. Maral, and M. Holzbock, "A neural network based approach to distributed adaptive routing of LEO intersatellite link traffic," in Proc. IEEE VTC, May 1998, vol. 2, pp. 1498-1502.
-
(1998)
Proc. IEEE VTC
, vol.2
, pp. 1498-1502
-
-
Werner, M.1
Mayer, C.2
Maral, G.3
Holzbock, M.4
-
26
-
-
85156225449
-
High performance job-shop scheduling with a time delay TD(λ) network
-
Cambridge, MA: MIT Press
-
W. Zhang and T. G. Dietterich, "High performance job-shop scheduling with a time delay TD(λ) network," in Advances in Neural Information Processing Systems 8. Cambridge, MA: MIT Press, 1996, pp. 1024-1030.
-
(1996)
Advances in Neural Information Processing Systems 8
, pp. 1024-1030
-
-
Zhang, W.1
Dietterich, T.G.2
-
27
-
-
0003874616
-
-
Lab. Inf. Decision Syst, MIT, Cambridge, MA, Rep. LIDS-P-2434
-
J. Abounadi, D. P. Bertsekas, and V. S. Borkar, "Learning algorithms for Markov decision processes," Lab. Inf. Decision Syst., MIT, Cambridge, MA, Rep. LIDS-P-2434, 1998.
-
(1998)
Learning algorithms for Markov decision processes
-
-
Abounadi, J.1
Bertsekas, D.P.2
Borkar, V.S.3
-
28
-
-
0020970738
-
Neuronlike adaptive elements that can solve difficult learning control problems
-
Sep./Oct
-
A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. SMC-13, no. 5, pp. 834-846, Sep./Oct. 1983.
-
(1983)
IEEE Trans. Syst., Man, Cybern
, vol.SMC-13
, Issue.5
, pp. 834-846
-
-
Barto, A.G.1
Sutton, R.S.2
Anderson, C.W.3
-
29
-
-
0013535965
-
Infinite-horizon policy gradient estimation
-
Jul.-Dec
-
J. Baxter and P. L. Bartlett, "Infinite-horizon policy gradient estimation," J. Artif. Intell. Res., vol. 15, no. 4, pp. 319-350, Jul.-Dec. 2001.
-
(2001)
J. Artif. Intell. Res
, vol.15
, Issue.4
, pp. 319-350
-
-
Baxter, J.1
Bartlett, P.L.2
-
31
-
-
34249002045
-
Variance reduction techniques for gradient estimates in reinforcement learning
-
Vancouver, BC, Canada: MIT Press
-
E. Greensmith, P. L. Bartlett, and J. Baxter, "Variance reduction techniques for gradient estimates in reinforcement learning," in Advances in Neural Information Processing Systems 14. Vancouver, BC, Canada: MIT Press, 2001.
-
(2001)
Advances in Neural Information Processing Systems 14
-
-
Greensmith, E.1
Bartlett, P.L.2
Baxter, J.3
-
32
-
-
85153938292
-
Reinforcement learning algorithm for partially observable Markov decision problems
-
San Mateo, CA: Morgan Kaufmann
-
T. Jaakkola, S. P. Singh, and M. I. Jordan, "Reinforcement learning algorithm for partially observable Markov decision problems," in Advances in Neural Information Processing Systems 7. San Mateo, CA: Morgan Kaufmann, 1995, pp. 345-352.
-
(1995)
Advances in Neural Information Processing Systems 7
, pp. 345-352
-
-
Jaakkola, T.1
Singh, S.P.2
Jordan, M.I.3
-
33
-
-
0343893613
-
Actor-critic like learning algorithms for Markov decision processes
-
V. R. Konda and V. S. Borkar, "Actor-critic like learning algorithms for Markov decision processes," SIAM J. Control Optim., vol. 38, no. 1, pp. 94-123, 1999.
-
(1999)
SIAM J. Control Optim
, vol.38
, Issue.1
, pp. 94-123
-
-
Konda, V.R.1
Borkar, V.S.2
-
34
-
-
0029752592
-
Average reward reinforcement learning: Foundations, algorithms and empirical results
-
S. Mahadevan, "Average reward reinforcement learning: Foundations, algorithms and empirical results," Mach. Learn., vol. 22, no. 1-3, pp. 159-196, 1996.
-
(1996)
Mach. Learn
, vol.22
, Issue.1-3
, pp. 159-196
-
-
Mahadevan, S.1
-
35
-
-
0035249254
-
Simulation-based optimization of Markov reward processes
-
Feb
-
P. Marbach and J. N. Tsitsiklis, "Simulation-based optimization of Markov reward processes," IEEE Trans. Autom. Control, vol. 46, no. 2, pp. 191-209, Feb. 2001.
-
(2001)
IEEE Trans. Autom. Control
, vol.46
, Issue.2
, pp. 191-209
-
-
Marbach, P.1
Tsitsiklis, J.N.2
-
37
-
-
0031143730
-
An analysis of temporal-difference learning with function approximation
-
May
-
J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Trans. Autom. Control, vol. 42, no. 5, pp. 674-690, May 1997.
-
(1997)
IEEE Trans. Autom. Control
, vol.42
, Issue.5
, pp. 674-690
-
-
Tsitsiklis, J.N.1
Van Roy, B.2
-
38
-
-
0033221519
-
Average cost temporal-difference learning
-
Nov
-
J. N. Tsitsiklis, "Average cost temporal-difference learning," Automatica, vol. 35, no. 11, pp. 1799-1808, Nov. 1999.
-
(1999)
Automatica
, vol.35
, Issue.11
, pp. 1799-1808
-
-
Tsitsiklis, J.N.1
-
39
-
-
21444437925
-
The optimal reward baseline for gradient based reinforcement learning
-
Seattle, WA, Aug
-
L. Weaver and N. Tao, "The optimal reward baseline for gradient based reinforcement learning," in Proc. 17th Conf. Uncertain. Artif. Intell., Seattle, WA, Aug. 2001, pp. 538-545.
-
(2001)
Proc. 17th Conf. Uncertain. Artif. Intell
, pp. 538-545
-
-
Weaver, L.1
Tao, N.2
-
40
-
-
0000337576
-
Simple statistical gradient-following algorithms for connectionist reinforcement learning
-
May
-
R. J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning," Mach. Learn., vol. 8, no. 3/4, pp. 229-256, May 1992.
-
(1992)
Mach. Learn
, vol.8
, Issue.3-4
, pp. 229-256
-
-
Williams, R.J.1
-
41
-
-
7744247423
-
A reinforcement learning ticket-based probing path discovery scheme for MANETs
-
W. Usaha and J. Barria, "A reinforcement learning ticket-based probing path discovery scheme for MANETs," Ad Hoc Netw., vol. 2, no. 3, pp. 319-334, 2004.
-
(2004)
Ad Hoc Netw
, vol.2
, Issue.3
, pp. 319-334
-
-
Usaha, W.1
Barria, J.2