-
3
-
-
84929155218
-
Human aware path planning in urban environments with nonstationary MDPs
-
Hong Kong, China
-
Allamaraju R, Kingravi H, Axelrod A, Chowdhary G, Grande R, Crick C, Sheng W, How J (2014) Human aware path planning in urban environments with nonstationary MDPs. In: IEEE international conference on robotics and automation, Hong Kong, China
-
(2014)
IEEE international conference on robotics and automation
-
-
Allamaraju, R.1
Kingravi, H.2
Axelrod, A.3
Chowdhary, G.4
Grande, R.5
Crick, C.6
Sheng, W.7
How, J.8
-
5
-
-
0032022695
-
The information-theoretic capacity of discrete-time queues
-
Bedekar AS, Azizoglu M (1998) The information-theoretic capacity of discrete-time queues. IEEE Trans Inf Theory 44(2):446-461
-
(1998)
IEEE Trans Inf Theory
, vol.44
, Issue.2
, pp. 446-461
-
-
Bedekar, A.S.1
Azizoglu, M.2
-
6
-
-
84870655865
-
-
Technical report
-
Bodik P, Hong W, Guestrin C, Madden S, Paskin M, Thibaux R (2004) Intel lab data. Technical report
-
(2004)
Intel lab data
-
-
Bodik, P.1
Hong, W.2
Guestrin, C.3
Madden, S.4
Paskin, M.5
Thibaux, R.6
-
7
-
-
0030675610
-
Efficient reinforcement learning: Model-based acrobot control
-
IEEE
-
Boone G (1997) Efficient reinforcement learning: Model-based acrobot control. In: Robotics and Automation, 1997. Proceedings., 1997 IEEE International Conference on, IEEE, vol 1, pp 229-234
-
(1997)
Robotics and Automation, 1997. Proceedings., 1997 IEEE International Conference on
, vol.1
, pp. 229-234
-
-
Boone, G.1
-
9
-
-
33749246501
-
Hidden-mode Markov decision processes for nonstationary sequential decision making
-
Springer
-
Choi SP, Yeung DY, Zhang NL (2001) Hidden-mode Markov decision processes for nonstationary sequential decision making. In: Sequence learning, Springer, pp 264-287
-
(2001)
Sequence learning
, pp. 264-287
-
-
Choi, S.P.1
Yeung, D.Y.2
Zhang, N.L.3
-
10
-
-
50249102821
-
The rate-distortion function of a Poisson process with a queueing distortion measure
-
DCC 2008, IEEE
-
Coleman TP, Kiyavash N, Subramanian VG (2008) The rate-distortion function of a Poisson process with a queueing distortion measure. In: Data Compression Conference, DCC 2008, IEEE, pp 63-72
-
(2008)
Data Compression Conference
, pp. 63-72
-
-
Coleman, T.P.1
Kiyavash, N.2
Subramanian, V.G.3
-
11
-
-
0038891993
-
Sparse on-line Gaussian processes
-
Csató L, Opper M (2002) Sparse on-line Gaussian processes. Neural Comput 14(3):641-668
-
(2002)
Neural Comput
, vol.14
, Issue.3
, pp. 641-668
-
-
Csató, L.1
Opper, M.2
-
12
-
-
33745223257
-
Cortical substrates for exploratory decisions in humans
-
Daw ND, O’Doherty JP, Dayan P, Seymour B, Dolan RJ (2006) Cortical substrates for exploratory decisions in humans. Nature 441(7095):876-879
-
(2006)
Nature
, vol.441
, Issue.7095
, pp. 876-879
-
-
Daw, N.D.1
O’Doherty, J.P.2
Dayan, P.3
Seymour, B.4
Dolan, R.J.5
-
13
-
-
84939061075
-
Traffic modeling for telecommunications networks
-
Frost VS, Melamed B (1994) Traffic modeling for telecommunications networks. IEEE Commun Mag 32(3):70-81
-
(1994)
IEEE Commun Mag
, vol.32
, Issue.3
, pp. 70-81
-
-
Frost, V.S.1
Melamed, B.2
-
16
-
-
84890920160
-
A tutorial on linear function approximators for dynamic programming and reinforcement learning
-
Geramifard A, Walsh TJ, Tellex S, Chowdhary G, Roy N, How JP (2013) A tutorial on linear function approximators for dynamic programming and reinforcement learning. Foundations and Trends® in Machine Learning 6(4):375-451. doi:10.1561/2200000042
-
(2013)
Foundations and Trends® in Machine Learning
, vol.6
, Issue.4
, pp. 375-451
-
-
Geramifard, A.1
Walsh, T.J.2
Tellex, S.3
Chowdhary, G.4
Roy, N.5
How, J.P.6
-
19
-
-
79551524402
-
Solving non-stationary bandit problems by random sampling from sibling Kalman filters
-
Springer
-
Granmo OC, Berg S (2010) Solving non-stationary bandit problems by random sampling from sibling Kalman filters. In: Trends in applied intelligent systems, Springer, pp 199-208
-
(2010)
Trends in applied intelligent systems
, pp. 199-208
-
-
Granmo, O.C.1
Berg, S.2
-
20
-
-
84871756682
-
A survey of actor-critic reinforcement learning: Standard and natural policy gradients
-
Grondman I, Busoniu L, Lopes GA, Babuska R (2012) A survey of actor-critic reinforcement learning: Standard and natural policy gradients. IEEE Trans Syst, Man, and Cybern, Part C: Appl Rev 42(6):1291-1307
-
(2012)
IEEE Trans Syst, Man, and Cybern, Part C: Appl Rev
, vol.42
, Issue.6
, pp. 1291-1307
-
-
Grondman, I.1
Busoniu, L.2
Lopes, G.A.3
Babuska, R.4
-
22
-
-
43749104456
-
Mutual information and conditional mean estimation in Poisson channels
-
Guo D, Shamai S, Verdú S (2008) Mutual information and conditional mean estimation in Poisson channels. IEEE Trans Inf Theory 54(5):1837-1849
-
(2008)
IEEE Trans Inf Theory
, vol.54
, Issue.5
, pp. 1837-1849
-
-
Guo, D.1
Shamai, S.2
Verdú, S.3
-
24
-
-
0035397657
-
Binomial and Poisson distributions as maximum entropy distributions
-
Harremoës P (2001) Binomial and Poisson distributions as maximum entropy distributions. IEEE Trans Inf Theory 47(5):2039-2041
-
(2001)
IEEE Trans Inf Theory
, vol.47
, Issue.5
, pp. 2039-2041
-
-
Harremoës, P.1
-
27
-
-
51649090077
-
A Nash equilibrium related to the Poisson channel
-
Harremoës P, Vignat C et al (2003) A Nash equilibrium related to the Poisson channel. Commun Inf Syst 3(3):183-190
-
(2003)
Commun Inf Syst
, vol.3
, Issue.3
, pp. 183-190
-
-
Harremoës, P.1
Vignat, C.2
-
28
-
-
34547516258
-
Approximating the Kullback Leibler divergence between Gaussian mixture models
-
Hershey JR, Olsen PA (2007) Approximating the Kullback Leibler divergence between Gaussian mixture models. In: ICASSP (4), pp 317-320
-
(2007)
ICASSP
, vol.4
, pp. 317-320
-
-
Hershey, J.R.1
Olsen, P.A.2
-
29
-
-
84874698101
-
Texplore: Real-time sample-efficient reinforcement learning for robots
-
Hester T, Stone P (2013) Texplore: real-time sample-efficient reinforcement learning for robots. Mach Learn 90(3):385-429
-
(2013)
Mach Learn
, vol.90
, Issue.3
, pp. 385-429
-
-
Hester, T.1
Stone, P.2
-
30
-
-
34247645455
-
Log-concavity and the maximum entropy property of the Poisson distribution
-
Johnson O (2007) Log-concavity and the maximum entropy property of the Poisson distribution. Stoch Process Appl 117(6):791-802
-
(2007)
Stoch Process Appl
, vol.117
, Issue.6
, pp. 791-802
-
-
Johnson, O.1
-
32
-
-
8344223694
-
A nonstationary Poisson view of internet traffic
-
INFOCOM 2004, IEEE
-
Karagiannis T, Molle M, Faloutsos M, Broido A (2004) A nonstationary Poisson view of internet traffic. In: INFOCOM 2004. Twenty-third annual joint conference of the IEEE computer and communications societies. IEEE, vol 3, pp 1558-1569
-
(2004)
Twenty-third annual joint conference of the IEEE computer and communications societies
, vol.3
, pp. 1558-1569
-
-
Karagiannis, T.1
Molle, M.2
Faloutsos, M.3
Broido, A.4
-
33
-
-
84866711082
-
Anytime motion planning using the RRT*
-
IEEE
-
Karaman S, Walter M, Perez A, Frazzoli E, Teller S (2011) Anytime motion planning using the RRT*. In: International conference on robotics and automation. IEEE, pp 1478-1483
-
(2011)
International conference on robotics and automation
, pp. 1478-1483
-
-
Karaman, S.1
Walter, M.2
Perez, A.3
Frazzoli, E.4
Teller, S.5
-
36
-
-
38649118249
-
Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems
-
Koulouriotis DE, Xanthopoulos A (2008) Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems. Appl Math Comput 196(2):913-922
-
(2008)
Appl Math Comput
, vol.196
, Issue.2
, pp. 913-922
-
-
Koulouriotis, D.E.1
Xanthopoulos, A.2
-
37
-
-
4644323293
-
Least-squares policy iteration
-
URL
-
Lagoudakis MG, Parr R (2003) Least-squares policy iteration. J Mach Learn Res 4:1107-1149, URL http://dl.acm.org/citation.cfm-id=945365.964290
-
(2003)
J Mach Learn Res
, vol.4
, pp. 1107-1149
-
-
Lagoudakis, M.G.1
Parr, R.2
-
38
-
-
84892407563
-
Expected entropy as a measure and criterion of randomness of binary sequences
-
Leśniewicz M (2014) Expected entropy as a measure and criterion of randomness of binary sequences. Przeglad Elektrotechniczny 90(1):42-46
-
(2014)
Przeglad Elektrotechniczny
, vol.90
, Issue.1
, pp. 42-46
-
-
Leśniewicz, M.1
-
39
-
-
85028936685
-
-
Markov decision processes (MDP) toolbox (2012). http://www7.inra.fr/mia/T/MDPtoolbox/MDPtoolbox.html
-
(2012)
-
-
-
41
-
-
84902145619
-
Efficient distributed sensing using adaptive censoring-based inference
-
Mu B, Chowdhary G, How J (2014) Efficient distributed sensing using adaptive censoring-based inference. Automatica
-
(2014)
Automatica
-
-
Mu, B.1
Chowdhary, G.2
How, J.3
-
43
-
-
0037319560
-
Entropy and the timing capacity of discrete queues
-
Prabhakar B, Gallager R (2003) Entropy and the timing capacity of discrete queues. IEEE Trans Inf Theory 49(2):357-370
-
(2003)
IEEE Trans Inf Theory
, vol.49
, Issue.2
, pp. 357-370
-
-
Prabhakar, B.1
Gallager, R.2
-
45
-
-
84874248431
-
Towards optimization of a human-inspired heuristic for solving explore-exploit problems
-
Reverdy P, Wilson RC, Holmes P, Leonard NE (2012) Towards optimization of a human-inspired heuristic for solving explore-exploit problems. In: CDC, pp 2820-2825
-
(2012)
CDC
, pp. 2820-2825
-
-
Reverdy, P.1
Wilson, R.C.2
Holmes, P.3
Leonard, N.E.4
-
47
-
-
0016036648
-
Information rates and data-compression schemes for Poisson processes
-
Rubin I (1974) Information rates and data-compression schemes for Poisson processes. IEEE Trans Inf Theory 20(2):200-210
-
(1974)
IEEE Trans Inf Theory
, vol.20
, Issue.2
, pp. 200-210
-
-
Rubin, I.1
-
48
-
-
84865131152
-
A generalized representer theorem
-
Helmbold D, Williamson B (eds), Lecture notes in computer science, Springer, Berlin, URL
-
Scholkopf B, Herbrich R, Smola A (2001) A generalized representer theorem. In: Helmbold D, Williamson B (eds) Computational learning theory. Lecture notes in computer science. Springer, Berlin, pp 416-426. URL http://dx.doi.org/10.1007/3-540-44581-1_27
-
(2001)
Computational learning theory
, pp. 416-426
-
-
Scholkopf, B.1
Herbrich, R.2
Smola, A.3
-
53
-
-
0031143730
-
An analysis of temporal difference learning with function approximation
-
Tsitsiklis JN, Roy BV (1997) An analysis of temporal difference learning with function approximation. IEEE Trans Autom Control 42(5):674-690
-
(1997)
IEEE Trans Autom Control
, vol.42
, Issue.5
, pp. 674-690
-
-
Tsitsiklis, J.N.1
Roy, B.V.2
-
54
-
-
85042936847
-
Bayesian reinforcement learning
-
Springer
-
Vlassis N, Ghavamzadeh M, Mannor S, Poupart P (2012) Bayesian reinforcement learning. In: Reinforcement learning, Springer, pp 359-386
-
(2012)
Reinforcement learning
, pp. 359-386
-
-
Vlassis, N.1
Ghavamzadeh, M.2
Mannor, S.3
Poupart, P.4
-
58
-
-
84925600345
-
Humans use directed and random exploration to solve the explore-exploit dilemma
-
Wilson RC, Geana A, White JM, Ludvig EA, Cohen JD (2014) Humans use directed and random exploration to solve the explore-exploit dilemma. J Exp Psychol: Gen 143(6):2074
-
(2014)
J Exp Psychol: Gen
, vol.143
, Issue.6
, pp. 2074
-
-
Wilson, R.C.1
Geana, A.2
White, J.M.3
Ludvig, E.A.4
Cohen, J.D.5