2011, Pages 1878-1883

Sample efficient on-line learning of optimal dialogue policies with Kalman temporal differences

Author keywords

[No Author keywords available]

Indexed keywords

MACHINE LEARNING METHODS; NATURAL LANGUAGE PROCESSING; OPTIMAL POLICIES; OPTIMAL STRATEGIES; POLICY OPTIMIZATION; STATE OF THE ART; SYSTEM BEHAVIORS; TEMPORAL DIFFERENCES

EID: 84881039547     PISSN: 10450823     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.5591/978-1-57735-516-8/IJCAI11-314     Document Type: Conference Paper
Times cited: 36

References (23)
  • 6
    • 84880694195
    • Stable Function Approximation in Dynamic Programming
    • Geoffrey Gordon. Stable Function Approximation in Dynamic Programming. In ICML'95, 1995.
    • (1995) ICML'95
    • Gordon, G.1
  • 7
    • 51449120317
    • Hybrid reinforcement/supervised learning of dialogue policies from fixed data sets
    • James Henderson, Oliver Lemon, and Kallirroi Georgila. Hybrid reinforcement/supervised learning of dialogue policies from fixed data sets. Computational Linguistics, 2008.
    • (2008) Computational Linguistics
    • Henderson, J.1    Lemon, O.2    Georgila, K.3
  • 9
    • 85024429815
    • A new approach to linear filtering and prediction problems
    • Rudolf Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME-Journal of Basic Engineering, 82(Series D):35-45, 1960.
    • (1960) Transactions of the ASME-Journal of Basic Engineering , vol.82 , pp. 35-45
    • Kalman, R.1
  • 11
    • 85009087667
    • Information state and dialogue management in the TRINDI dialogue move engine toolkit
    • Staffan Larsson and David R. Traum. Information state and dialogue management in the TRINDI dialogue move engine toolkit. Natural Language Engineering, 2000.
    • (2000) Natural Language Engineering
    • Larsson, S.1    Traum, D.R.2
  • 12
    • 84893350028
    • An ISU dialogue system exhibiting reinforcement learning of dialogue policies: Generic slot-filling in the TALK in-car system
    • Oliver Lemon, Kallirroi Georgila, James Henderson, and Matthew Stuttle. An ISU dialogue system exhibiting reinforcement learning of dialogue policies: generic slot-filling in the TALK in-car system. In EACL'06, Morristown, NJ, USA, 2006.
    • (2006) EACL'06, Morristown, NJ, USA
    • Lemon, O.1    Georgila, K.2    Henderson, J.3    Stuttle, M.4
  • 14
    • 0033894474
    • Stochastic model of human-machine interaction for learning dialog strategies
    • DOI 10.1109/89.817450
    • Esther Levin, Roberto Pieraccini, and Wieland Eckert. A stochastic model of human-machine interaction for learning dialog strategies. IEEE Transactions on Speech and Audio Processing, 8(1):11-23, 2000. (Pubitemid 30540744)
    • (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , Issue.1 , pp. 11-23
    • Levin, E.1    Pieraccini, R.2    Eckert, W.3
  • 15
    • 70450186275
    • Reinforcement Learning for Dialog Management using Least-Squares Policy Iteration and Fast Feature Selection
    • Lihong Li, Suhrid Balakrishnan, and Jason Williams. Reinforcement Learning for Dialog Management using Least-Squares Policy Iteration and Fast Feature Selection. In InterSpeech'09, Brighton (UK), 2009.
    • (2009) InterSpeech'09, Brighton (UK)
    • Li, L.1    Balakrishnan, S.2    Williams, J.3
  • 16
    • 33750253118
    • A probabilistic framework for dialog simulation and optimal strategy learning
    • DOI 10.1109/TSA.2005.855836
    • O. Pietquin and T. Dutoit. A probabilistic framework for dialog simulation and optimal strategy learning. IEEE Transactions on Audio, Speech and Language Processing, 14(2):589-599, 2006. (Pubitemid 46405357)
    • (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.2 , pp. 589-599
    • Pietquin, O.1    Dutoit, T.2
  • 19
    • 33747607273
    • A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies
    • Jost Schatzmann, Karl Weilhammer, Matt Stuttle, and Steve Young. A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies. The Knowledge Engineering Review, 21(2):97-126, June 2006.
    • (2006) The Knowledge Engineering Review , vol.21 , Issue.2 , pp. 97-126
    • Schatzmann, J.1    Weilhammer, K.2    Stuttle, M.3    Young, S.4
  • 23
    • 33750703175
    • Partially observable Markov decision processes for spoken dialog systems
    • Jason Williams and Steve Young. Partially observable Markov decision processes for spoken dialog systems. Computer Speech and Language, 21(2):393-422, 2007.
    • (2007) Computer Speech and Language , vol.21 , Issue.2 , pp. 393-422
    • Williams, J.1    Young, S.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.