Volume , Issue , 2008, Pages 5558-5564

Shannon meets Bellman: Feature based Markovian models for detection and optimization

Author keywords

[No Author keywords available]

Indexed keywords

GAUSSIAN DISTRIBUTION; HIDDEN MARKOV MODELS; LEARNING ALGORITHMS; TRELLIS CODES;

EID: 62949191986     PISSN: 07431546     EISSN: 25762370     Source Type: Conference Proceeding    
DOI: 10.1109/CDC.2008.4739405     Document Type: Conference Paper
Times cited: 8

References (29)
  • 1. E. Altman. Constrained Markov decision processes. Stochastic Modeling. Chapman & Hall/CRC, Boca Raton, FL, 1999.
  • 3. V. S. Borkar. Average cost dynamic programming equations for controlled Markov chains with partial observations. SIAM J. Control Optim., 39(3):673-681 (electronic), 2000.
  • 4. V. S. Borkar. Dynamic programming for ergodic control with partial observations. Stoch. Proc. Applns., 103(2):293-310, 2003.
  • 5. V. S. Borkar and S. P. Meyn. The O.D.E. method for convergence of stochastic approximation and reinforcement learning. SIAM J. Control Optim., 38(2):447-469, 2000. (Also presented at the IEEE CDC, December 1998.)
  • 6. A. Chorin, O. Hald, and R. Kupferman. Optimal prediction with memory. Physica D, 166:239-257, 2002.
  • 8. M. D. Donsker and S. R. S. Varadhan. Asymptotic evaluation of certain Markov process expectations for large time, I and II. Comm. Pure Appl. Math., 28:1-47 and 279-301, 1975.
  • 9. R. J. Elliott, L. Aggoun, and J. B. Moore. Hidden Markov Models: Estimation and Control. Applications of Mathematics, vol. 29. Springer-Verlag, New York, 1995.
  • 10. E. A. Feinberg and A. Shwartz, editors. Handbook of Markov Decision Processes: Methods and Applications. International Series in Operations Research & Management Science, vol. 40. Kluwer Academic Publishers, Boston, MA, 2002.
  • 12. V. R. Konda and V. S. Borkar. Actor-critic-type learning algorithms for Markov decision processes. SIAM J. Control Optim., 38(1):94-123 (electronic), 1999.
  • 13. V. R. Konda and J. N. Tsitsiklis. On actor-critic algorithms. SIAM J. Control Optim., 42(4):1143-1166 (electronic), 2003.
  • 14. I. Kontoyiannis and S. P. Meyn. Spectral theory and limit theorems for geometrically ergodic Markov processes. Ann. Appl. Probab., 13:304-362, 2003. (Presented at the INFORMS Applied Probability Conference, NYC, July 2001.)
  • 15. S. Lippman. Applying a new device in the optimization of exponential queueing systems. Operations Res., 23:687-710, 1975.
  • 16. G. Mathew and S. Meyn. Learning macroscopic dynamics for optimal prediction. Submitted to the 2008 IEEE Conf. on Decision and Control. (Preliminary version presented at the Information Theory and Applications (ITA) Workshop, UCSD, 2008.)
  • 17. F. S. Melo, S. Meyn, and M. Isabel Ribeiro. An analysis of reinforcement learning with function approximation. In Proceedings of ICML, pages 664-671, 2008.
  • 19. S. P. Meyn. The policy iteration algorithm for average reward Markov decision processes with general state space. IEEE Trans. Automat. Control, 42(12):1663-1680, 1997.
  • 21. S. P. Meyn and R. L. Tweedie. Markov Chains and Stochastic Stability. Springer-Verlag, London, second edition, 1993. (2008 edition to appear, Cambridge University Press, Cambridge Mathematical Library; 1993 edition online: http://black.csl.uiuc.edu/~meyn/pages/book.html.)
  • 22. H. Mori. Transport, collective motion, and Brownian motion. Progress of Theoretical Physics, 33:423-455, 1965.
  • 23. D. Ormoneit and Ś. Sen. Kernel-based reinforcement learning. Mach. Learn., 49(2-3):161-178, 2002.
  • 24. P. J. Schweitzer. Perturbation theory and finite Markov chains. J. Appl. Prob., 5:401-403, 1968.
  • 25. C. E. Shannon. A mathematical theory of communication. Bell System Tech. J., 27:379-423, 623-656, 1948.
  • 27. L. Tassiulas and A. Ephremides. Jointly optimal routing and scheduling in packet radio networks. IEEE Trans. Inform. Theory, 38(1):165-168, 1992.
  • 28. J. N. Tsitsiklis and B. Van Roy. Feature-based methods for large scale dynamic programming. Mach. Learn., 22(1-3):59-94, 1996.


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.