Volume 4, 2016, Pages 2431-2441

On-line active reward learning for policy optimisation in spoken dialogue systems

Author keywords

[No Author keywords available]

Indexed keywords

COMPUTATIONAL LINGUISTICS; E-LEARNING; GAUSSIAN DISTRIBUTION; GAUSSIAN NOISE (ELECTRONIC); NETWORK CODING; RECURRENT NEURAL NETWORKS; SPEECH PROCESSING;

EID: 85012034455     PISSN: None     EISSN: None     Source Type: Conference Proceeding    
DOI: 10.18653/v1/p16-1230     Document Type: Conference Paper
Times cited : (175)

References (51)
  • 6
    • 84988354666
    • Hyper-parameter optimisation of Gaussian process reinforcement learning for statistical dialogue management
    • Lu Chen, Pei-Hao Su, and Milica Gasic. 2015. Hyper-parameter optimisation of gaussian process reinforcement learning for statistical dialogue management. In Proc of SigDial.
    • (2015) Proc of SigDial
    • Chen, L.1    Su, P.-H.2    Gasic, M.3
  • 9
    • 84880360544
    • Pomdp-based control of workflows for crowdsourcing
    • Peng Dai, Christopher H Lin, Daniel S Weld, et al. 2013. Pomdp-based control of workflows for crowdsourcing. Artificial Intelligence, 202.
    • (2013) Artificial Intelligence , vol.202
    • Dai, P.1    Lin, C.H.2    Weld, D.S.3
  • 12
    • 84976225037
    • Task completion transfer learning for reward inference
    • Layla El Asri, Romain Laroche, and Olivier Pietquin. 2014. Task completion transfer learning for reward inference. In Proc of MLIS.
    • (2014) Proc of MLIS
    • El Asri, L.1    Laroche, R.2    Pietquin, O.3
  • 13
    • 84897936325
    • Gaussian processes for pomdp-based dialogue manager optimization
    • Milica Gasic and Steve Young. 2014. Gaussian processes for pomdp-based dialogue manager optimization. TASLP, 22(1): 28-40.
    • (2014) TASLP , vol.22 , Issue.1 , pp. 28-40
    • Gasic, M.1    Young, S.2
  • 14
    • 85011977836
    • Online policy optimisation of spoken dialogue systems via live interaction with human subjects
    • Milica Gasic, Filip Jurcicek, Blaise Thomson, Kai Yu, and Steve Young. 2011. Online policy optimisation of spoken dialogue systems via live interaction with human subjects. In IEEE ASRU.
    • (2011) IEEE ASRU
    • Gasic, M.1    Jurcicek, F.2    Thomson, B.3    Yu, K.4    Young, S.5
  • 16
    • 85012030469
    • Hybrid speech recognition with deep bidirectional lstm
    • Alex Graves, Navdeep Jaitly, and Abdel-rahman Mohamed. 2013. Hybrid speech recognition with deep bidirectional lstm. In IEEE ASRU.
    • (2013) IEEE ASRU
    • Graves, A.1    Jaitly, N.2    Mohamed, A.-R.3
  • 17
    • 84988334938
    • Discriminative spoken language understanding using word confusion networks
    • Matthew Henderson, Milica Gasic, Blaise Thomson, Pirros Tsiakoulis, Kai Yu, and Steve Young. 2012. Discriminative spoken language understanding using word confusion networks. In IEEE SLT.
    • (2012) IEEE SLT
    • Henderson, M.1    Gasic, M.2    Thomson, B.3    Tsiakoulis, P.4    Yu, K.5    Young, S.6
  • 20
    • 50649102302
    • Active learning with Gaussian processes for object categorization
    • Ashish Kapoor, Kristen Grauman, Raquel Urtasun, and Trevor Darrell. 2007. Active learning with gaussian processes for object categorization. In Proc of ICCV.
    • (2007) Proc of ICCV
    • Kapoor, A.1    Grauman, K.2    Urtasun, R.3    Darrell, T.4
  • 22
    • 85011972357
    • Issues in the evaluation of spoken dialogue systems using objective and subjective measures
    • L.B. Larsen. 2003. Issues in the evaluation of spoken dialogue systems using objective and subjective measures. In IEEE ASRU.
    • (2003) IEEE ASRU
    • Larsen, L.B.1
  • 23
    • 85135155957
    • A stochastic model of computer-human interaction for learning dialogue strategies
    • Esther Levin and Roberto Pieraccini. 1997. A stochastic model of computer-human interaction for learning dialogue strategies. Eurospeech.
    • (1997) Eurospeech
    • Levin, E.1    Pieraccini, R.2
  • 24
    • 84937861081
    • Neural word embedding as implicit matrix factorization
    • Omer Levy and Yoav Goldberg. 2014. Neural word embedding as implicit matrix factorization. In NIPS.
    • (2014) NIPS
    • Levy, O.1    Goldberg, Y.2
  • 26
    • 84898956512
    • Distributed representations of words and phrases and their compositionality
    • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In NIPS.
    • (2013) NIPS
    • Mikolov, T.1    Sutskever, I.2    Chen, K.3    Corrado, G.S.4    Dean, J.5
  • 27
    • 56349122110
    • Approximations for binary Gaussian process classification
    • Hannes Nickisch and Carl Edward Rasmussen. 2008. Approximations for binary gaussian process classification. JMLR, 9(10).
    • (2008) JMLR , vol.9 , Issue.10
    • Nickisch, H.1    Rasmussen, C.E.2
  • 28
    • 47349127315
    • Automating spoken dialogue management design using machine learning: An industry perspective
    • Tim Paek and Roberto Pieraccini. 2008. Automating spoken dialogue management design using machine learning: An industry perspective. Speech communication, 50.
    • (2008) Speech Communication , vol.50
    • Paek, T.1    Pieraccini, R.2
  • 30
    • 79952788706
    • Learning and evaluation of dialogue strategies for new applications: Empirical methods for optimization from small data sets
    • Verena Rieser and Oliver Lemon. 2011. Learning and evaluation of dialogue strategies for new applications: Empirical methods for optimization from small data sets. Computational Linguistics, 37(1).
    • (2011) Computational Linguistics , vol.37 , Issue.1
    • Rieser, V.1    Lemon, O.2
  • 32
    • 0012280661
    • Spoken dialogue management using probabilistic reasoning
    • Nicholas Roy, Joelle Pineau, and Sebastian Thrun. 2000. Spoken dialogue management using probabilistic reasoning. In Proc of SigDial.
    • (2000) Proc of SigDial
    • Roy, N.1    Pineau, J.2    Thrun, S.3
  • 33
    • 84860604462
    • Learning agents for uncertain environments
    • Stuart Russell. 1998. Learning agents for uncertain environments. In Proc of COLT.
    • (1998) Proc of COLT
    • Russell, S.1
  • 34
    • 33747607273
    • A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies
    • Jost Schatzmann, Karl Weilhammer, Matt Stuttle, and Steve Young. 2006. A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies. The knowledge engineering review, 21 (02): 97-126.
    • (2006) The Knowledge Engineering Review , vol.21 , Issue.2 , pp. 97-126
    • Schatzmann, J.1    Weilhammer, K.2    Stuttle, M.3    Young, S.4
  • 37
    • 84959164720
    • Learning from real users: Rating dialogue success with neural networks for reinforcement learning in spoken dialogue systems
    • Pei-Hao Su, David Vandyke, Milica Gasic, Dongho Kim, Nikola Mrksic, Tsung-Hsien Wen, and Steve Young. 2015a. Learning from real users: Rating dialogue success with neural networks for reinforcement learning in spoken dialogue systems. In Proc of Interspeech.
    • (2015) Proc of Interspeech
    • Su, P.-H.1    Vandyke, D.2    Gasic, M.3    Kim, D.4    Mrksic, N.5    Wen, T.-H.6    Young, S.7
  • 38
    • 84988411847
    • Reward shaping with recurrent neural networks for speeding up on-line policy learning in spoken dialogue systems
    • Pei-Hao Su, David Vandyke, Milica Gasic, Nikola Mrksic, Tsung-Hsien Wen, and Steve Young. 2015b. Reward shaping with recurrent neural networks for speeding up on-line policy learning in spoken dialogue systems. In Proc of SigDial.
    • (2015) Proc of SigDial
    • Su, P.-H.1    Vandyke, D.2    Gasic, M.3    Mrksic, N.4    Wen, T.-H.5    Young, S.6
  • 39
    • 84878412864
    • Preference-learning based inverse reinforcement learning for dialog control
    • Hiroaki Sugiyama, Toyomi Meguro, and Yasuhiro Minami. 2012. Preference-learning based inverse reinforcement learning for dialog control. In Proc of Interspeech.
    • (2012) Proc of Interspeech
    • Sugiyama, H.1    Meguro, T.2    Minami, Y.3
  • 40
    • 77950862681
    • Bayesian update of dialogue state: A pomdp framework for spoken dialogue systems
    • Blaise Thomson and Steve Young. 2010. Bayesian update of dialogue state: A pomdp framework for spoken dialogue systems. Computer Speech and Language, 24: 562-588.
    • (2010) Computer Speech and Language , vol.24 , pp. 562-588
    • Thomson, B.1    Young, S.2
  • 41
    • 80053495924
    • Word representations: A simple and general method for semi-supervised learning
    • Joseph Turian, Lev Ratinov, and Yoshua Bengio. 2010. Word representations: a simple and general method for semi-supervised learning. In Proc of ACL.
    • (2010) Proc of ACL
    • Turian, J.1    Ratinov, L.2    Bengio, Y.3
  • 42
    • 85011977965
    • Quality-adaptive spoken dialogue initiative selection and implications on reward modelling
    • Stefan Ultes and Wolfgang Minker. 2015. Quality-adaptive spoken dialogue initiative selection and implications on reward modelling. In Proc of SigDial.
    • (2015) Proc of SigDial
    • Ultes, S.1    Minker, W.2
  • 43
    • 57249084011
    • Visualizing data using t-sne
    • Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-sne. JMLR, 9: 85.
    • (2008) JMLR , vol.9 , pp. 85
    • Van Der Maaten, L.1    Hinton, G.2
  • 46
    • 85065183198
    • PARADISE: A framework for evaluating spoken dialogue agents
    • Marilyn A. Walker, Diane J. Litman, Candace A. Kamm, and Alicia Abella. 1997. PARADISE: A framework for evaluating spoken dialogue agents. In Proc of EACL.
    • (1997) Proc of EACL
    • Walker, M.A.1    Litman, D.J.2    Kamm, C.A.3    Abella, A.4
  • 47
    • 33750703175
    • Partially observable Markov decision processes for spoken dialog systems
    • Jason D. Williams and Steve Young. 2007. Partially observable Markov decision processes for spoken dialog systems. Computer Speech and Language, 21(2): 393-422.
    • (2007) Computer Speech and Language , vol.21 , Issue.2 , pp. 393-422
    • Williams, J.D.1    Young, S.2
  • 48
    • 84872169460
    • Predicting user satisfaction in spoken dialog system evaluation with collaborative filtering
    • Zhaojun Yang, G Levow, and Helen Meng. 2012. Predicting user satisfaction in spoken dialog system evaluation with collaborative filtering. IEEE Journal of Selected Topics in Signal Processing, 6(99): 971-981.
    • (2012) IEEE Journal of Selected Topics in Signal Processing , vol.6 , Issue.99 , pp. 971-981
    • Yang, Z.1    Levow, G.2    Meng, H.3
  • 49
    • 84876688858
    • Pomdp-based statistical spoken dialogue systems: A review
    • Steve Young, Milica Gasic, Blaise Thomson, and Jason Williams. 2013. Pomdp-based statistical spoken dialogue systems: a review. In Proc of IEEE, Volume 99, pages 1-20.
    • (2013) Proc of IEEE , vol.99 , pp. 1-20
    • Young, S.1    Gasic, M.2    Thomson, B.3    Williams, J.4
  • 50
    • 85011971999
    • Active learning from weak and strong labelers
    • Chicheng Zhang and Kamalika Chaudhuri. 2015. Active learning from weak and strong labelers. CoRR, abs/1510.02847.
    • (2015) CoRR, abs/1510.02847
    • Zhang, C.1    Chaudhuri, K.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.