Volume , Issue PART 3, 2013, Pages 1961-1969

Concurrent reinforcement learning from customer interactions

Author keywords

[No Author keywords available]

Indexed keywords

CUSTOMER SATISFACTION; INDUSTRY; REINFORCEMENT LEARNING

EID: 84897527953     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: None     Document Type: Conference Paper
Times cited: 31

References (17)
  • 3
    • 49949111037
    • Parallel dynamic programming
    • Kronsjö, L. and Shumsheruddin, D., editors, John Wiley & Sons, Inc.
    • Archibald, T. (1992). Parallel dynamic programming. In Kronsjö, L. and Shumsheruddin, D., editors, Advances in parallel algorithms, pages 343-367. John Wiley & Sons, Inc.
    • (1992) Advances in Parallel Algorithms , pp. 343-367
    • Archibald, T.1
  • 6
    • 77956543367
    • Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine
    • Graepel, T., Candela, J. Q., Borchert, T., and Herbrich, R. (2010). Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine. In 27th International Conference on Machine Learning, pages 13-20.
    • (2010) 27th International Conference on Machine Learning , pp. 13-20
    • Graepel, T.1    Candela, J.Q.2    Borchert, T.3    Herbrich, R.4
  • 7
    • 49949106524
    • Parallel reinforcement learning with linear function approximation
    • Grounds, M. and Kudenko, D. (2007). Parallel reinforcement learning with linear function approximation. In Adaptive Agents and Multi-Agent Systems, pages 60-74.
    • (2007) Adaptive Agents and Multi-Agent Systems , pp. 60-74
    • Grounds, M.1    Kudenko, D.2
  • 8
    • 0032073263
    • Planning and acting in partially observable stochastic domains
    • PII S000437029800023X
    • Kaelbling, L., Littman, M., and Cassandra, A. (1998). Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101:99-134. (Pubitemid 128387390)
    • (1998) Artificial Intelligence , vol.101 , Issue.1-2 , pp. 99-134
    • Kaelbling, L.P.1    Littman, M.L.2    Cassandra, A.R.3
  • 9
    • 84897487186
    • A contextual-bandit approach to personalized news article recommendation
    • abs/1003.0146
    • Li, L., Chu, W., Langford, J., and Schapire, R. E. (2010). A contextual-bandit approach to personalized news article recommendation. CoRR, abs/1003.0146.
    • (2010) CoRR
    • Li, L.1    Chu, W.2    Langford, J.3    Schapire, R.E.4
  • 10
    • 85149834820
    • Markov games as a framework for multi-agent reinforcement learning
    • Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In 11th International Conference on Machine Learning, pages 157-163.
    • (1994) 11th International Conference on Machine Learning , pp. 157-163
    • Littman, M.L.1
  • 13
    • 33847202724
    • Learning to predict by the methods of temporal differences
    • Sutton, R. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1):9-44.
    • (1988) Machine Learning , vol.3 , Issue.1 , pp. 9-44
    • Sutton, R.1
  • 14
    • 0026971570
    • Adapting bias by gradient descent: An incremental version of delta-bar-delta
    • Sutton, R. (1992). Adapting bias by gradient descent: An incremental version of delta-bar-delta. In 10th National Conference on Artificial Intelligence, pages 171-176.
    • (1992) 10th National Conference on Artificial Intelligence , pp. 171-176
    • Sutton, R.1
  • 16
    • 0033170372
    • Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
    • DOI 10.1016/S0004-3702(99)00052-1
    • Sutton, R., Precup, D., and Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1-2):181-211. (Pubitemid 32079890)
    • (1999) Artificial Intelligence , vol.112 , Issue.1-2 , pp. 181-211
    • Sutton, R.S.1    Precup, D.2    Singh, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.