Volume , Issue , 2011, Pages 143-150

Bayesian active learning with basis functions

Author keywords

[No Author keywords available]

Indexed keywords

ACTIVE LEARNING; APPROXIMATE DYNAMIC PROGRAMMING; BASIS FUNCTIONS; CURSE OF DIMENSIONALITY; EXPLORATION/EXPLOITATION DILEMMAS; LINEAR COMBINATIONS; NUMERICAL EXPERIMENTS; VALUE FUNCTION APPROXIMATION;

EID: 80052219755     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.1109/ADPRL.2011.5967365     Document Type: Conference Paper
Times cited : (9)

References (35)
  • 2
    • 32044460264
    • Understanding the fine structure of electricity prices
    • H. Geman and A. Roncoroni, "Understanding the Fine Structure of Electricity Prices," The Journal of Business, vol. 79, no. 3, 2006.
    • (2006) The Journal of Business , vol.79 , Issue.3
    • Geman, H.1    Roncoroni, A.2
  • 8
    • 0001046225
    • Practical issues in temporal difference learning
    • G. Tesauro, "Practical Issues in Temporal Difference Learning," Machine Learning, vol. 8, no. 3-4, pp. 257-277, 1992.
    • (1992) Machine Learning , vol.8 , Issue.3-4 , pp. 257-277
    • Tesauro, G.1
  • 9
    • 4544257178
    • Approximating Q-values with basis function representations
    • Hillsdale, NJ, M. Mozer, D. Touretzky, and P. Smolensky, Eds.
    • P. Sabes, "Approximating Q-values with basis function representations," in Proceedings of the Fourth Connectionist Models Summer School, Hillsdale, NJ, M. Mozer, D. Touretzky, and P. Smolensky, Eds., 1993, pp. 264-271.
    • (1993) Proceedings of the Fourth Connectionist Models Summer School , pp. 264-271
    • Sabes, P.1
  • 10
    • 17444414191
    • Basis function adaptation in temporal difference reinforcement learning
    • DOI 10.1007/s10479-005-5732-z
    • I. Menache, S. Mannor, and N. Shimkin, "Basis function adaptation in temporal-difference reinforcement learning," Annals of Operations Research, vol. 134, no. 1, pp. 215-238, 2005.
    • (2005) Annals of Operations Research , vol.134 , Issue.1 , pp. 215-238
    • Menache, I.1    Mannor, S.2    Shimkin, N.3
  • 13
    • 3843131884
    • A new criterion using information gain for action selection strategy in reinforcement learning
    • K. Iwata, K. Ikeda, and H. Sakai, "A new criterion using information gain for action selection strategy in reinforcement learning," IEEE Transactions on Neural Networks, vol. 15, no. 4, pp. 792-799, 2004.
    • (2004) IEEE Transactions on Neural Networks , vol.15 , Issue.4 , pp. 792-799
    • Iwata, K.1    Ikeda, K.2    Sakai, H.3
  • 24
    • 0030590294
    • Bayesian look ahead one-stage sampling allocations for selection of the best population
    • DOI 10.1016/0378-3758(95)00169-7
    • S. Gupta and K. Miescke, "Bayesian look ahead one-stage sampling allocations for selection of the best population," Journal of Statistical Planning and Inference, vol. 54, no. 2, pp. 229-244, 1996.
    • (1996) Journal of Statistical Planning and Inference , vol.54 , Issue.2 , pp. 229-244
    • Gupta, S.S.1    Miescke, K.J.2
  • 25
    • 55549135706
    • A knowledge gradient policy for sequential information collection
    • P. I. Frazier, W. B. Powell, and S. Dayanik, "A knowledge gradient policy for sequential information collection," SIAM Journal on Control and Optimization, vol. 47, no. 5, pp. 2410-2439, 2008.
    • (2008) SIAM Journal on Control and Optimization , vol.47 , Issue.5 , pp. 2410-2439
    • Frazier, P.I.1    Powell, W.B.2    Dayanik, S.3
  • 28
    • 79951586758
    • Optimal learning of transition probabilities in the two-agent newsvendor problem
    • B. Johansson, S. Jain, J. Montoya-Torres, J. Hugan, and E. Yücesan, Eds.
    • I. O. Ryzhov, M. R. Valdez-Vivas, and W. B. Powell, "Optimal Learning of Transition Probabilities in the Two-Agent Newsvendor Problem," in Proceedings of the 2010 Winter Simulation Conference, B. Johansson, S. Jain, J. Montoya-Torres, J. Hugan, and E. Yücesan, Eds., 2010.
    • (2010) Proceedings of the 2010 Winter Simulation Conference
    • Ryzhov, I.O.1    Valdez-Vivas, M.R.2    Powell, W.B.3
  • 30
    • 79961092747
    • The knowledge-gradient algorithm for sequencing experiments in drug discovery
    • to appear
    • D. Negoescu, P. Frazier, and W. Powell, "The Knowledge-Gradient Algorithm for Sequencing Experiments in Drug Discovery," INFORMS J. on Computing (to appear), 2010.
    • (2010) INFORMS J. on Computing
    • Negoescu, D.1    Frazier, P.2    Powell, W.3
  • 31
    • 33748998787
    • Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
    • DOI 10.1007/s10994-006-8365-9
    • A. George and W. B. Powell, "Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming," Machine Learning, vol. 65, no. 1, pp. 167-198, 2006.
    • (2006) Machine Learning , vol.65 , Issue.1 , pp. 167-198
    • George, A.P.1    Powell, W.B.2
  • 33
    • 70449498873
    • The knowledge-gradient policy for correlated normal rewards
    • P. I. Frazier, W. B. Powell, and S. Dayanik, "The knowledge-gradient policy for correlated normal rewards," INFORMS J. on Computing, vol. 21, no. 4, pp. 599-613, 2009.
    • (2009) INFORMS J. on Computing , vol.21 , Issue.4 , pp. 599-613
    • Frazier, P.I.1    Powell, W.B.2    Dayanik, S.3
  • 34
    • 0000792991
    • The stochastic behavior of commodity prices: Implications for valuation and hedging
    • E. Schwartz, "The stochastic behavior of commodity prices: Implications for valuation and hedging," Journal of Finance, vol. 52, no. 3, pp. 923-973, 1997.
    • (1997) Journal of Finance , vol.52 , Issue.3 , pp. 923-973
    • Schwartz, E.S.1
  • 35
    • 77956513316
    • A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation
    • R. Sutton, C. Szepesvári, and H. Maei, "A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation," Advances in Neural Information Processing Systems, vol. 21, pp. 1609-1616, 2008.
    • (2008) Advances in Neural Information Processing Systems , vol.21 , pp. 1609-1616
    • Sutton, R.1    Szepesvári, C.2    Maei, H.3


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.