



Volume 55, Issue 5 II, 2007, Pages 2170-2181

Q-learning algorithms for constrained Markov decision processes with randomized monotone policies: Application to MIMO transmission control

Author keywords

Constrained Markov decision process (CMDP); Delay constraints; Monotone policies; Q learning; Randomized policies; Reinforcement learning; Supermodularity; Transmission scheduling; V-BLAST

Indexed keywords

CONVERGENCE OF NUMERICAL METHODS; LEARNING ALGORITHMS; MARKOV PROCESSES; OPTIMIZATION; QUALITY OF SERVICE; SCHEDULING ALGORITHMS; STOCHASTIC CONTROL SYSTEMS; TIME VARYING SYSTEMS;

EID: 34247898874     PISSN: 1053587X     EISSN: None     Source Type: Journal    
DOI: 10.1109/TSP.2007.893228     Document Type: Article
Times cited: 82

References (26)
  • 1
    • 85066695690
    • V-BLAST power and rate control under delay constraints in Markovian fading channels - Optimality of randomized monotonic policies
    • accepted for publication
    • D. V. Djonin and V. Krishnamurthy, "V-BLAST power and rate control under delay constraints in Markovian fading channels - Optimality of randomized monotonic policies," IEEE Trans. Signal Process., 2007, accepted for publication.
    • (2007) IEEE Trans. Signal Process
    • Djonin, D.V.1    Krishnamurthy, V.2
  • 2
    • 0003233588
    • Transmission policies for time varying channels with average delay constraints
    • Sep
    • B. E. Collins and R. Cruz, "Transmission policies for time varying channels with average delay constraints," in Proc. Allerton Conf. Commun., Contr. Comput., Sep. 1999, pp. 709-717.
    • (1999) Proc. Allerton Conf. Commun., Contr. Comput , pp. 709-717
    • Collins, B.E.1    Cruz, R.2
  • 3
    • 85008016250
    • Optimal and suboptimal packet scheduling over time-varying fading channels
    • Feb
    • A. K. Karmokar, D. V. Djonin, and V. K. Bhargava, "Optimal and suboptimal packet scheduling over time-varying fading channels," IEEE Trans. Wireless Commun., vol. 5, no. 2, pp. 446-457, Feb. 2006.
    • (2006) IEEE Trans. Wireless Commun , vol.5 , Issue.2 , pp. 446-457
    • Karmokar, A.K.1    Djonin, D.V.2    Bhargava, V.K.3
  • 4
    • 34247862895
    • Delay limited optimal and suboptimal power and bit loading algorithms for OFDM systems over correlated fading
    • St. Louis
    • M. J. Hossain, D. V. Djonin, and V. K. Bhargava, "Delay limited optimal and suboptimal power and bit loading algorithms for OFDM systems over correlated fading," in Proc. GLOBECOM 2005, St. Louis, 2005, pp. 3448-3453.
    • (2005) Proc. GLOBECOM 2005 , pp. 3448-3453
    • Hossain, M.J.1    Djonin, D.V.2    Bhargava, V.K.3
  • 6
    • 0032022195
    • On limits of wireless communication in a fading environment when using multiple antennas
    • Mar
    • G. J. Foschini and M. J. Gans, "On limits of wireless communication in a fading environment when using multiple antennas," Wireless Pers. Commun., vol. 6, pp. 311-335, Mar. 1998.
    • (1998) Wireless Pers. Commun , vol.6 , pp. 311-335
    • Foschini, G.J.1    Gans, M.J.2
  • 7
    • 0030234863
    • Layered space-time architecture for wireless communication in a fading environment when using multielement antennas
    • Oct
    • G. J. Foschini, "Layered space-time architecture for wireless communication in a fading environment when using multielement antennas," Bell Labs Tech. J., pp. 41-59, Oct. 1996.
    • (1996) Bell Labs Tech. J , pp. 41-59
    • Foschini, G.J.1
  • 8
    • 0035189565
    • Low complexity algorithm for rate quantization in extended V-BLAST
    • S. Chung, H. C. Howard, and A. Lozano, "Low complexity algorithm for rate quantization in extended V-BLAST," in Proc. IEEE VTC' 2001, 2001, pp. 910-914.
    • (2001) Proc. IEEE VTC' 2001 , pp. 910-914
    • Chung, S.1    Howard, H.C.2    Lozano, A.3
  • 9
    • 0345308612
    • Low complexity per-antenna rate and power control approach for closed-loop V-BLAST
    • Nov
    • H. Zhang, L. Dai, S. Zhou, and Y. Yao, "Low complexity per-antenna rate and power control approach for closed-loop V-BLAST," IEEE Trans. Commun., vol. 51, no. 11, pp. 1783-1787, Nov. 2003.
    • (2003) IEEE Trans. Commun , vol.51 , Issue.11 , pp. 1783-1787
    • Zhang, H.1    Dai, L.2    Zhou, S.3    Yao, Y.4
  • 10
    • 4544239777
    • Spreading code optimization and adaptation in CDMA via discrete stochastic approximation
    • Sep
    • V. Krishnamurthy, X. Wang, and G. Yin, "Spreading code optimization and adaptation in CDMA via discrete stochastic approximation," IEEE Trans. Inf. Theory, vol. 50, no. 9, pp. 1927-1949, Sep. 2004.
    • (2004) IEEE Trans. Inf. Theory , vol.50 , Issue.9 , pp. 1927-1949
    • Krishnamurthy, V.1    Wang, X.2    Yin, G.3
  • 11
    • 0020970738
    • Neuron-like elements that can solve difficult learning control problems
    • A. Barto, R. Sutton, and C. Anderson, "Neuron-like elements that can solve difficult learning control problems," IEEE Trans. Syst., Man, Cybern., vol. SMC-13, pp. 834-846, 1983.
    • (1983) IEEE Trans. Syst., Man, Cybern , vol.SMC-13 , pp. 834-846
    • Barto, A.1    Sutton, R.2    Anderson, C.3
  • 12
    • 0035249254
    • Simulation-based optimization of Markov reward processes
    • Feb
    • P. Marbach and J. N. Tsitsiklis, "Simulation-based optimization of Markov reward processes," IEEE Trans. Autom. Contr., vol. 46, no. 2, pp. 191-209, Feb. 2001.
    • (2001) IEEE Trans. Autom. Contr , vol.46 , Issue.2 , pp. 191-209
    • Marbach, P.1    Tsitsiklis, J.N.2
  • 16
    • 0035439783
    • Degrees of freedom in adaptive modulation: A unified view
    • Sep
    • S. T. Chung and A. J. Goldsmith, "Degrees of freedom in adaptive modulation: A unified view," IEEE Trans. Commun., vol. 49, no. 9, pp. 1561-1571, Sep. 2001.
    • (2001) IEEE Trans. Commun , vol.49 , Issue.9 , pp. 1561-1571
    • Chung, S.T.1    Goldsmith, A.J.2
  • 24
    • 0022151359
    • Optimal policies for controlled Markov chains with a constraint
    • F. J. Beutler and K. W. Ross, "Optimal policies for controlled Markov chains with a constraint," J. Math. Anal. Appl., vol. 112, pp. 236-252, 1985.
    • (1985) J. Math. Anal. Appl , vol.112 , pp. 236-252
    • Beutler, F.J.1    Ross, K.W.2
  • 25
    • 1542348670
    • Constrained stochastic approximation algorithms for adaptive control of constrained Markov decision processes
    • F. Vazquez Abad and V. Krishnamurthy, "Constrained stochastic approximation algorithms for adaptive control of constrained Markov decision processes," in Proc. 42nd IEEE Conf. Decision Contr., 2003, pp. 2823-2828.
    • (2003) Proc. 42nd IEEE Conf. Decision Contr , pp. 2823-2828
    • Vazquez Abad, F.1    Krishnamurthy, V.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.