



Volume , Issue , 2012, Pages 644-653

Hilbert space embeddings of POMDPs

Author keywords

[No Author keywords available]

Indexed keywords

BELLMAN EQUATIONS; FEATURE SPACE; NONPARAMETRIC APPROACHES; OPTIMAL VALUE FUNCTIONS; POLICY LEARNING; REPRODUCING KERNEL HILBERT SPACES; VALUE FUNCTIONS; VALUE ITERATION;

EID: 84879146831     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 39

References (26)
  • 1
    • 84880715629
    • Reinforcement learning in POMDPs without resets
    • [Even-Dar, 2005] Eyal Even-Dar. Reinforcement learning in POMDPs without resets. In IJCAI, pages 690-695, 2005.
    • (2005) IJCAI , pp. 690-695
    • Even-Dar, E.1
  • 2
    • 0041494125
    • Efficient SVM training using low-rank kernel representations
    • [Fine and Scheinberg, 2001] S. Fine and K. Scheinberg. Efficient SVM training using low-rank kernel representations. JMLR, 2:243-264, 2001.
    • (2001) JMLR , vol.2 , pp. 243-264
    • Fine, S.1    Scheinberg, K.2
  • 3
    • 85161986095
    • Kernel measures of conditional dependence
    • [Fukumizu et al., 2008] K. Fukumizu, A. Gretton, X. Sun, and B. Scholkopf. Kernel measures of conditional dependence. In NIPS, 2008.
    • (2008) NIPS
    • Fukumizu, K.1    Gretton, A.2    Sun, X.3    Scholkopf, B.4
  • 9
    • 84867133646
    • Modelling transition dynamics in MDPs with RKHS embeddings
    • [Grunewalder et al., 2012] S. Grunewalder, G. Lever, L. Baldassarre, M. Pontil, and A. Gretton. Modelling transition dynamics in MDPs with RKHS embeddings. In ICML, 2012.
    • (2012) ICML
    • Grunewalder, S.1    Lever, G.2    Baldassarre, L.3    Pontil, M.4    Gretton, A.5
  • 10
    • 0001770240
    • Value-function approximations for partially observable Markov decision processes
    • [Hauskrecht, 2000] M. Hauskrecht. Value-function approximations for partially observable Markov decision processes. JAIR, 13:33-94, 2000.
    • (2000) JAIR , vol.13 , pp. 33-94
    • Hauskrecht, M.1
  • 11
    • 85138579181
    • Learning policies for partially observable environments: Scaling up
    • [Littman, 1995] M. Littman, A. Cassandra, and L. Kaelbling. Learning policies for partially observable environments: Scaling up. In ICML, 1995.
    • (1995) ICML
    • Littman, M.1    Cassandra, A.2    Kaelbling, L.3
  • 12
    • 84880772945
    • Point-based value iteration: An anytime algorithm for POMDPs
    • [Pineau et al., 2003] J. Pineau, G. Gordon, and S. Thrun. Point-based value iteration: An anytime algorithm for POMDPs. In IJCAI, pages 1025-1032, 2003.
    • (2003) IJCAI , pp. 1025-1032
    • Pineau, J.1    Gordon, G.2    Thrun, S.3
  • 13
    • 33750724397
    • Point-based value iteration for continuous POMDPs
    • [Porta et al., 2006] J. M. Porta, N. Vlassis, and P. Poupart. Point-based value iteration for continuous POMDPs. JMLR, 7:2329-2367, 2006.
    • (2006) JMLR , vol.7 , pp. 2329-2367
    • Porta, J.M.1    Vlassis, N.2    Poupart, P.3
  • 14
    • 33749251297
    • An analytic solution to discrete Bayesian reinforcement learning
    • [Poupart et al., 2006] Pascal Poupart, Nikos A. Vlassis, Jesse Hoey, and Kevin Regan. An analytic solution to discrete Bayesian reinforcement learning. In ICML, 2006.
    • (2006) ICML
    • Poupart, P.1    Vlassis, N.A.2    Hoey, J.3    Regan, K.4
  • 15
    • 52249086942
    • Online planning algorithms for POMDPs
    • [Ross et al., 2008] S. Ross, J. Pineau, S. Paquet, and B. Chaib-draa. Online planning algorithms for POMDPs. JAIR, 32(1):663-704, 2008.
    • (2008) JAIR , vol.32 , Issue.1 , pp. 663-704
    • Ross, S.1    Pineau, J.2    Paquet, S.3    Chaib-Draa, B.4
  • 16
    • 85161963598
    • Monte-Carlo planning in large POMDPs
    • [Silver and Veness, 2010] David Silver and Joel Veness. Monte-Carlo planning in large POMDPs. In NIPS, 2010.
    • (2010) NIPS
    • Silver, D.1    Veness, J.2
  • 17
    • 33750297371
    • Heuristic search value iteration for POMDPs
    • [Smith and Simmons, 2004] T. Smith and R. Simmons. Heuristic search value iteration for POMDPs. In UAI, 2004.
    • (2004) UAI
    • Smith, T.1    Simmons, R.2
  • 18
    • 70049118151
    • A Hilbert space embedding for distributions
    • [Smola et al., 2007] A. Smola, A. Gretton, L. Song, and B. Scholkopf. A Hilbert space embedding for distributions. In ALT, 2007.
    • (2007) ALT
    • Smola, A.1    Gretton, A.2    Song, L.3    Scholkopf, B.4
  • 20
    • 71149099279
    • Hilbert space embeddings of conditional distributions with applications to dynamical systems
    • [Song et al., 2009] L. Song, J. Huang, A. Smola, and K. Fukumizu. Hilbert space embeddings of conditional distributions with applications to dynamical systems. In ICML, 2009.
    • (2009) ICML
    • Song, L.1    Huang, J.2    Smola, A.3    Fukumizu, K.4
  • 21
    • 77956540831
    • Hilbert space embeddings of hidden Markov models
    • [Song et al., 2010] L. Song, B. Boots, S. Siddiqi, G. Gordon, and A. Smola. Hilbert space embeddings of hidden Markov models. In ICML, 2010.
    • (2010) ICML
    • Song, L.1    Boots, B.2    Siddiqi, S.3    Gordon, G.4    Smola, A.5
  • 22
    • 84860645997
    • Nonparametric tree graphical models via kernel embeddings
    • [Song et al., 2010] L. Song, A. Gretton, and C. Guestrin. Nonparametric tree graphical models via kernel embeddings. In AISTATS, pages 765-772, 2010.
    • (2010) AISTATS , pp. 765-772
    • Song, L.1    Gretton, A.2    Guestrin, C.3
  • 24
    • 31144472319
    • Perseus: Randomized point-based value iteration for POMDPs
    • [Spaan and Vlassis, 2005] M. T. J. Spaan and N. Vlassis. Perseus: Randomized point-based value iteration for POMDPs. JAIR, 24:195-220, 2005.
    • (2005) JAIR , vol.24 , pp. 195-220
    • Spaan, M.T.J.1    Vlassis, N.2
  • 25
    • 77951953755
    • Hilbert space embeddings and metrics on probability measures
    • [Sriperumbudur et al., 2010] B. Sriperumbudur, A. Gretton, K. Fukumizu, G. Lanckriet, and B. Scholkopf. Hilbert space embeddings and metrics on probability measures. JMLR, 11:1517-1561, 2010.
    • (2010) JMLR , vol.11 , pp. 1517-1561
    • Sriperumbudur, B.1    Gretton, A.2    Fukumizu, K.3    Lanckriet, G.4    Scholkopf, B.5
  • 26
    • 84898978676
    • Monte Carlo POMDPs
    • [Thrun, 2000] S. Thrun. Monte Carlo POMDPs. In NIPS, 2000.
    • (2000) NIPS
    • Thrun, S.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.