SCOPUS Information Retrieval Platform

IJCAI International Joint Conference on Artificial Intelligence

2011, Pages 1878-1883

Sample efficient on-line learning of optimal dialogue policies with Kalman temporal differences

Authors (3): Pietquin, Olivier (a, b); Geist, Matthieu (a); Chandramohan, Senthilkumar (a)

a UMI Georgia Tech CNRS 2958 (France)

b CNRS (France)

Author keywords

[No Author keywords available]

Indexed keywords

MACHINE LEARNING METHODS; NATURAL LANGUAGE PROCESSING; OPTIMAL POLICIES; OPTIMAL STRATEGIES; POLICY OPTIMIZATION; STATE OF THE ART; SYSTEM BEHAVIORS; TEMPORAL DIFFERENCES;

ARTIFICIAL INTELLIGENCE; NATURAL LANGUAGE PROCESSING SYSTEMS; OPTIMIZATION; REINFORCEMENT LEARNING;

LEARNING ALGORITHMS;

EID: 84881039547; PISSN: 10450823; EISSN: None; Source Type: Conference Proceeding
DOI: 10.5591/978-1-57735-516-8/IJCAI11-314; Document Type: Conference Paper

Times cited: 36

References (23)

1
- 0001700171
- A Markovian Decision Process
- Richard Bellman. A Markovian Decision Process. Journal of Mathematics and Mechanics, 1957.
- (1957) Journal of Mathematics and Mechanics
- Bellman, R.¹

2
- 79959828819
- Optimizing Spoken Dialogue Management with Fitted Value Iteration
- Senthilkumar Chandramohan, Matthieu Geist, and Olivier Pietquin. Optimizing Spoken Dialogue Management with Fitted Value Iteration. In Interspeech'10, Makuhari (Japan), 2010.
- Interspeech'10, Makuhari (Japan), 2010
- Chandramohan, S.¹ Geist, M.² Pietquin, O.³

3
- 84857755225
- Gaussian processes for fast policy optimisation of POMDP-based dialogue managers
- Milica Gasic, Filip Jurcicek, Simon Keizer, François Mairesse, Blaise Thomson, Kai Yu, and Steve Young. Gaussian processes for fast policy optimisation of POMDP-based dialogue managers. In SIGDIAL'10, Tokyo, Japan, 2010.
- SIGDIAL'10, Tokyo, Japan, 2010
- Gasic, M.¹ Jurcicek, F.² Keizer, S.³ Mairesse, F.⁴ Thomson, B.⁵ Yu, K.⁶ Young, S.⁷

4
- 78651465938
- Kalman Temporal Differences
- Matthieu Geist and Olivier Pietquin. Kalman Temporal Differences. Journal of Artificial Intelligence Research (JAIR), 39:489-532, October 2010.
- (2010) Journal of Artificial Intelligence Research (JAIR) , vol.39 , pp. 489-532
- Geist, M.¹ Pietquin, O.²

5
- 84881043838
- Managing Uncertainty within Value Function Approximation in Reinforcement Learning
- Matthieu Geist and Olivier Pietquin. Managing Uncertainty within Value Function Approximation in Reinforcement Learning. In Journal of Machine Learning Research, Workshop & Conference Proceedings (JMLR W&CP): Active Learning and Experimental Design, Sardinia, Italy, 2010.
- Journal of Machine Learning Research, Workshop & Conference Proceedings (JMLR W&CP): Active Learning and Experimental Design, Sardinia, Italy, 2010
- Geist, M.¹ Pietquin, O.²

6
- 84880694195
- Stable Function Approximation in Dynamic Programming
- Geoffrey Gordon. Stable Function Approximation in Dynamic Programming. In ICML'95, 1995.
- (1995) ICML'95
- Gordon, G.¹

7
- 51449120317
- Hybrid reinforcement/supervised learning of dialogue policies from fixed data sets
- James Henderson, Oliver Lemon, and Kallirroi Georgila. Hybrid reinforcement/supervised learning of dialogue policies from fixed data sets. Computational Linguistics, 2008.
- (2008) Computational Linguistics
- Henderson, J.¹ Lemon, O.² Georgila, K.³

8
- 79959813974
- Natural Belief-Critic: A reinforcement algorithm for parameter estimation in statistical spoken dialogue systems
- Filip Jurcicek, Blaise Thomson, Simon Keizer, Milica Gasic, François Mairesse, Kai Yu, and Steve Young. Natural Belief-Critic: a reinforcement algorithm for parameter estimation in statistical spoken dialogue systems. In Interspeech'10, Makuhari (Japan), 2010.
- Interspeech'10, Makuhari (Japan), 2010
- Jurcicek, F.¹ Thomson, B.² Keizer, S.³ Gasic, M.⁴ Mairesse, F.⁵ Yu, K.⁶ Young, S.⁷

9
- 85024429815
- A new approach to linear filtering and prediction problems
- Rudolf Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME-Journal of Basic Engineering, 82(Series D):35-45, 1960.
- (1960) Transactions of the ASME-Journal of Basic Engineering , vol.82 , pp. 35-45
- Kalman, R.¹

10
- 4644323293
- Least-squares policy iteration
- Michail Lagoudakis and Ron Parr. Least-squares policy iteration. Journal of Machine Learning Research, 4:1107-1149, 2003.
- (2003) Journal of Machine Learning Research , vol.4 , pp. 1107-1149
- Lagoudakis, M.¹ Parr, R.²

11
- 85009087667
- Information state and dialogue management in the TRINDI dialogue move engine toolkit
- Staffan Larsson and David R. Traum. Information state and dialogue management in the TRINDI dialogue move engine toolkit. Natural Language Engineering, 2000.
- (2000) Natural Language Engineering
- Larsson, S.¹ Traum, D.R.²

12
- 84893350028
- An ISU dialogue system exhibiting reinforcement learning of dialogue policies: Generic slot-filling in the TALK in-car system
- Oliver Lemon, Kallirroi Georgila, James Henderson, and Matthew Stuttle. An ISU dialogue system exhibiting reinforcement learning of dialogue policies: generic slot-filling in the TALK in-car system. In EACL'06, Morristown, NJ, USA, 2006.
- EACL'06, Morristown, NJ, USA, 2006
- Lemon, O.¹ Georgila, K.² Henderson, J.³ Stuttle, M.⁴

13
- 0031624616
- Using Markov decision process for learning dialogue strategies
- Esther Levin and Roberto Pieraccini. Using Markov decision process for learning dialogue strategies. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP'98), Seattle, Washington, 1998.
- Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP'98), Seattle, Washington, 1998
- Levin, E.¹ Pieraccini, R.²

14
- 0033894474
- Stochastic model of human-machine interaction for learning dialog strategies
- DOI 10.1109/89.817450
- Esther Levin, Roberto Pieraccini, and Wieland Eckert. A stochastic model of human-machine interaction for learning dialog strategies. IEEE Transactions on Speech and Audio Processing, 8(1):11-23, 2000. (Pubitemid 30540744)
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , Issue.1 , pp. 11-23
- Levin, E.¹ Pieraccini, R.² Eckert, W.³

15
- 70450186275
- Reinforcement Learning for Dialog Management using Least-Squares Policy Iteration and Fast Feature Selection
- Lihong Li, Suhrid Balakrishnan, and Jason Williams. Reinforcement Learning for Dialog Management using Least-Squares Policy Iteration and Fast Feature Selection. In InterSpeech'09, Brighton (UK), 2009.
- InterSpeech'09, Brighton (UK), 2009
- Li, L.¹ Balakrishnan, S.² Williams, J.³

16
- 33750253118
- A probabilistic framework for dialog simulation and optimal strategy learning
- DOI 10.1109/TSA.2005.855836
- O. Pietquin and T. Dutoit. A probabilistic framework for dialog simulation and optimal strategy learning. IEEE Transactions on Audio, Speech and Language Processing, 14(2):589-599, 2006. (Pubitemid 46405357)
- (2006) IEEE Transactions on Audio, Speech and Language Processing , vol.14 , Issue.2 , pp. 589-599
- Pietquin, O.¹ Dutoit, T.²

17
- 80052060715
- Sample-Efficient Batch Reinforcement Learning for Dialogue Management Optimization
- O. Pietquin, M. Geist, S. Chandramohan, and H. Frezza-Buet. Sample-Efficient Batch Reinforcement Learning for Dialogue Management Optimization. ACM Transactions on Speech and Language Processing, 2011.
- (2011) ACM Transactions on Speech and Language Processing
- Pietquin, O.¹ Geist, M.² Chandramohan, S.³ Frezza-Buet, H.⁴

18
- 33846257740
- Effects of the user model on simulation-based learning of dialogue strategies
- Jost Schatzmann, Matthew N. Stuttle, Karl Weilhammer, and Steve Young. Effects of the user model on simulation-based learning of dialogue strategies. In ASRU'05, San Juan, Puerto Rico, 2005.
- ASRU'05, San Juan, Puerto Rico, 2005
- Schatzmann, J.¹ Stuttle, M.N.² Weilhammer, K.³ Young, S.⁴

19
- 33747607273
- A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies
- Jost Schatzmann, Karl Weilhammer, Matt Stuttle, and Steve Young. A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies. The Knowledge Engineering Review, 21(2):97-126, June 2006.
- (2006) The Knowledge Engineering Review , vol.21 , Issue.2 , pp. 97-126
- Schatzmann, J.¹ Weilhammer, K.² Stuttle, M.³ Young, S.⁴

20
- 84898955256
- Reinforcement learning for spoken dialogue systems
- S. Singh, M. Kearns, D. Litman, and M. Walker. Reinforcement learning for spoken dialogue systems. In NIPS'99, Denver, USA, 1999.
- NIPS'99, Denver, USA, 1999
- Singh, S.¹ Kearns, M.² Litman, D.³ Walker, M.⁴

21
- 0004102479
- Reinforcement Learning: An Introduction
- Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. The MIT Press, 3rd edition, March 1998.
- (1998) Reinforcement Learning: An Introduction
- Sutton, R.S.¹ Barto, A.G.²

22
- 85065183198
- PARADISE: A framework for evaluating spoken dialogue agents
- M. Walker, D. Litman, C. Kamm, and A. Abella. PARADISE: A framework for evaluating spoken dialogue agents. In ACL'97, Madrid (Spain), 1997.
- ACL'97, Madrid (Spain), 1997
- Walker, M.¹ Litman, D.² Kamm, C.³ Abella, A.⁴

23
- 33750703175
- Partially observable Markov decision processes for spoken dialog systems
- Jason Williams and Steve Young. Partially observable Markov decision processes for spoken dialog systems. Computer Speech and Language, 21(2):393-422, 2007.
- (2007) Computer Speech and Language , vol.21 , Issue.2 , pp. 393-422
- Williams, J.¹ Young, S.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.