Volume 6, Issue 8, 2012, Pages 891-902

A comprehensive reinforcement learning framework for dialogue management optimization

Author keywords

Dialogue management; Reinforcement learning; Spoken dialogue system

Indexed keywords

DIALOGUE MANAGEMENT; DIALOGUE STRATEGY; DIALOGUE SYSTEMS; GOAL-ORIENTED; INTERACTION STRATEGY; NON-STATIONARITIES; SPOKEN DIALOGUE SYSTEM; TEMPORAL DIFFERENCES;

EID: 84872138024     PISSN: 19324553     EISSN: None     Source Type: Journal    
DOI: 10.1109/JSTSP.2012.2229257     Document Type: Article
Times cited: 46

References (48)
  • 2
    • 0001700171
    • A Markovian decision process
    • R. Bellman, "A Markovian Decision Process," J. Math. Mech., vol. 6, pp. 679-684, 1957.
    • (1957) J. Math. Mech. , vol.6 , pp. 679-684
    • Bellman, R.1
  • 4
    • 0031624616
    • Using Markov decision process for learning dialogue strategies
    • E. Levin and R. Pieraccini, "Using Markov decision process for learning dialogue strategies," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP'98), 1998, pp. 201-204.
    • (1998) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP'98) , pp. 201-204
    • Levin, E.1    Pieraccini, R.2
  • 7
    • 33750253118
    • A probabilistic framework for dialog simulation and optimal strategy learning
    • DOI 10.1109/TSA.2005.855836
    • O. Pietquin and T. Dutoit, "A probabilistic framework for dialog simulation and optimal strategy learning," IEEE Trans. Speech Audio Process., vol. 14, no. 2, pp. 589-599, Mar. 2006. (Pubitemid 46405357)
    • (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.2 , pp. 589-599
    • Pietquin, O.1    Dutoit, T.2
  • 8
    • 33747607273
    • A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies
    • DOI 10.1017/S0269888906000944
    • J. Schatzmann, K. Weilhammer, M. Stuttle, and S. Young, "A survey of statistical user simulation techniques for RL of dialogue management strategies," Knowl. Eng. Rev., vol. 21, no. 2, pp. 97-126, 2006. (Pubitemid 44266297)
    • (2006) Knowledge Engineering Review , vol.21 , Issue.2 , pp. 97-126
    • Schatzmann, J.1    Weilhammer, K.2    Stuttle, M.3    Young, S.4
  • 10
    • 84865777012
    • A survey on metrics for the evaluation of user simulations
    • O. Pietquin and H. Hastie, "A survey on metrics for the evaluation of user simulations," Knowledge Eng. Rev., 2011.
    • (2011) Knowledge Eng. Rev.
    • Pietquin, O.1    Hastie, H.2
  • 14
    • 70450186275
    • Reinforcement learning for dialog management using least-squares policy iteration and fast feature selection
    • L. Li, S. Balakrishnan, and J. Williams, "Reinforcement learning for dialog management using least-squares policy iteration and fast feature selection," in Proc. Interspeech'09, 2009.
    • (2009) Proc. Interspeech'09
    • Li, L.1    Balakrishnan, S.2    Williams, J.3
  • 15
    • 80052060715
    • Sample-efficient batch reinforcement learning for dialogue management optimization
    • O. Pietquin, M. Geist, S. Chandramohan, and H. Frezza-Buet, "Sample-efficient batch reinforcement learning for dialogue management optimization," ACM Trans. Speech Audio Process., vol. 7, no. 3, pp. 1-21, 2011.
    • (2011) ACM Trans. Speech Audio Process. , vol.7 , Issue.3 , pp. 1-21
    • Pietquin, O.1    Geist, M.2    Chandramohan, S.3    Frezza-Buet, H.4
  • 16
    • 51449120317
    • Hybrid reinforcement/supervised learning of dialogue policies from fixed data sets
    • J. Henderson, O. Lemon, and K. Georgila, "Hybrid reinforcement/supervised learning of dialogue policies from fixed data sets," Comput. Linguist., vol. 34, no. 4, pp. 487-511, 2008.
    • (2008) Comput. Linguist. , vol.34 , Issue.4 , pp. 487-511
    • Henderson, J.1    Lemon, O.2    Georgila, K.3
  • 17
    • 84867601978
    • Managing uncertainty within the KTD framework
    • M. Geist and O. Pietquin, "Managing uncertainty within the KTD framework," in Proc. AL&E Workshop, 2011, Journal of Machine Learning Research W&CP.
    • (2011) Proc. AL&E Workshop
    • Geist, M.1    Pietquin, O.2
  • 20
    • 79959813974
    • Natural belief-critic: A reinforcement algorithm for parameter estimation in statistical spoken dialogue systems
    • F. Jurčíček, B. Thomson, S. Keizer, M. Gašić, F. Mairesse, K. Yu, and S. Young, "Natural belief-critic: A reinforcement algorithm for parameter estimation in statistical spoken dialogue systems," in Proc. Interspeech'10, 2010.
    • (2010) Proc. Interspeech'10
    • Jurčíček, F.1    Thomson, B.2    Keizer, S.3    Gašić, M.4    Mairesse, F.5    Yu, K.6    Young, S.7
  • 21
    • 78651465938
    • Kalman temporal differences
    • M. Geist and O. Pietquin, "Kalman temporal differences," J. Artif. Intell. Res. (JAIR), vol. 39, pp. 483-532, 2010.
    • (2010) J. Artif. Intell. Res. , vol.39 , pp. 483-532
    • Geist, M.1    Pietquin, O.2
  • 22
    • 51349089807
    • DIPPER: Description and formalisation of an information-state update dialogue system architecture
    • J. Bos, E. Klein, O. Lemon, and T. Oka, "DIPPER: Description and formalisation of an information-state update dialogue system architecture," in Proc. SIGdial'03, 2003.
    • (2003) Proc. SIGdial'03
    • Bos, J.1    Klein, E.2    Lemon, O.3    Oka, T.4
  • 24
    • 0031143730
    • An analysis of temporal-difference learning with function approximation
    • PII S0018928697034375
    • J. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Trans. Autom. Control, vol. 42, no. 5, pp. 674-690, May 1997. (Pubitemid 127760263)
    • (1997) IEEE Transactions on Automatic Control , vol.42 , Issue.5 , pp. 674-690
    • Tsitsiklis, J.N.1    Van Roy, B.2
  • 25
    • 80051605697
    • Bayesian reinforcement learning for POMDP-based dialogue systems
    • S. Png and J. Pineau, "Bayesian reinforcement learning for POMDP-based dialogue systems," in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP'11), 2011, pp. 2156-2159.
    • (2011) Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP'11) , pp. 2156-2159
    • Png, S.1    Pineau, J.2
  • 26
    • 85009087667
    • Information state and dialogue management in the TRINDI dialogue move engine toolkit
    • S. Larsson and D. R. Traum, "Information state and dialogue management in the TRINDI dialogue move engine toolkit," Natural Lang. Eng., vol. 6, pp. 323-340, 2000.
    • (2000) Natural Lang. Eng. , vol.6 , pp. 323-340
    • Larsson, S.1    Traum, D.R.2
  • 31
    • 85024429815
    • A new approach to linear filtering and prediction problems
    • R. Kalman, "A new approach to linear filtering and prediction problems," Trans. ASME-J. Basic Eng., vol. 82, no. Series D, pp. 35-45, 1960.
    • (1960) Trans. ASME-J. Basic Eng. , vol.82 , pp. 35-45
    • Kalman, R.1
  • 34
    • 33745211240
    • Learning user simulations for information state update dialogue systems
    • K. Georgila, J. Henderson, and O. Lemon, "Learning user simulations for information state update dialogue systems," in Proc. Eur. Conf. Speech Commun. Technol. (Interspeech - Eurospeech'05), 2005.
    • (2005) Proc. Eur. Conf. Speech Commun. Technol.
    • Georgila, K.1    Henderson, J.2    Lemon, O.3
  • 39
    • 70349231178
    • The hidden information state model: A practical framework for POMDP-based spoken dialogue management
    • S. Young, M. Gašić, S. Keizer, F. Mairesse, J. Schatzmann, B. Thomson, and K. Yu, "The hidden information state model: A practical framework for POMDP-based spoken dialogue management," Comput. Speech Lang., vol. 24, no. 2, pp. 150-174, 2010.
    • (2010) Comput. Speech Lang. , vol.24 , Issue.2 , pp. 150-174
    • Young, S.1    Gašić, M.2    Keizer, S.3    Mairesse, F.4    Schatzmann, J.5    Thomson, B.6    Yu, K.7
  • 47
    • 0029276036
    • Temporal difference learning and TD-Gammon
    • G. Tesauro, "Temporal difference learning and TD-Gammon," Commun. Assoc. for Comput. Mach. (ACM), vol. 38, no. 3, pp. 58-68, 1995.
    • (1995) Commun. Assoc. for Comput. Mach. (ACM) , vol.38 , Issue.3 , pp. 58-68
    • Tesauro, G.1


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.