SCOPUS Information Search Platform

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Volume 2015-August, 2015, Pages 4844-4848

Phonological vocoding using artificial neural networks

(3) Cernak, Milos (a); Potard, Blaise (a); Garner, Philip N. (a)

(a) Idiap Research Institute (Switzerland)

Author keywords

low bit rate speech coding; Parametric vocoding; phonology

Indexed keywords

DEEP NEURAL NETWORKS; NEURAL NETWORKS; SIGNAL ENCODING; SPEECH CODING; SPEECH COMMUNICATION; VOCODERS;

LOW BIT-RATE SPEECH CODING; NEURAL NETWORK CLASSIFIER; PARAMETRIC VOCODING; PHONOLOGICAL FEATURES; PHONOLOGY; RECONSTRUCTION PROCESS; SCALAR QUANTIZATION; SIGNAL PARAMETERS;

AUDIO SIGNAL PROCESSING;

EID: 84946076199 PISSN: 15206149 EISSN: None Source Type: Conference Proceeding
DOI: 10.1109/ICASSP.2015.7178891 Document Type: Conference Paper

Times cited : (25)

References (30)

1
- 84890528919
- On the (un)importance of the contextual factors in HMM-based speech synthesis and coding
- May, IEEE
- M. Cernak, P. Motlicek, and P. N. Garner, "On the (un)importance of the contextual factors in HMM-based speech synthesis and coding," in Proc. of ICASSP, May 2013, pp. 8140-8143, IEEE
- (2013) Proc. of ICASSP , pp. 8140-8143
- Cernak, M.¹ Motlicek, P.² Garner, P.N.³

2
- 84906268958
- Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture
- Aug. 2013
- Milos Cernak, Xingyu Na, and Philip N. Garner, "Syllable-Based Pitch Encoding for Low Bit Rate Speech Coding with Recognition/Synthesis Architecture," in Proc. of Interspeech, Aug. 2013, pp. 3449-3452
- (2013) Proc. of Interspeech , pp. 3449-3452
- Cernak, M.¹ Na, X.² Garner, P.N.³

3
- 84910046086
- Stress and accent transmission in HMM-based syllable-context very low bit rate speech coding
- Sept
- Milos Cernak, Alexandros Lazaridis, Philip N. Garner, and Petr Motlicek, "Stress and Accent Transmission in HMM-Based Syllable-Context Very Low Bit Rate Speech Coding," in Proc. of Interspeech, Sept. 2014, pp. 2799-2803
- (2014) Proc. of Interspeech , pp. 2799-2803
- Cernak, M.¹ Lazaridis, A.² Garner, P.N.³ Motlicek, P.⁴

4
- 84890490547
- Statistical parametric speech synthesis using deep neural networks
- May, IEEE
- Heiga Zen, A. Senior, and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," in Proc. of ICASSP, May 2013, pp. 7962-7966, IEEE
- (2013) Proc. of ICASSP , pp. 7962-7966
- Zen, H.¹ Senior, A.² Schuster, M.³

5
- 84905251808
- On the training aspects of deep neural network (DNN) for parametric TTS synthesis
- May, IEEE
- Yao Qian, Yuchen Fan, Wenping Hu, and F. K. Soong, "On the training aspects of Deep Neural Network (DNN) for parametric TTS synthesis," in Proc. of ICASSP, May 2014, pp. 3829-3833, IEEE
- (2014) Proc. of ICASSP , pp. 3829-3833
- Qian, Y.¹ Fan, Y.² Hu, W.³ Soong, F.K.⁴

6
- 84929157442
- Combining a vector space representation of linguistic context with a deep neural network for text-to-speech synthesis
- Heng Lu, Simon King, and Oliver Watts, "Combining a Vector Space Representation of Linguistic Context with a Deep Neural Network for Text-To-Speech Synthesis," in Proc. of 8th ISCA Workshop on Speech Synthesis, 2013, pp. 281-285
- (2013) Proc. of 8th ISCA Workshop on Speech Synthesis , pp. 281-285
- Lu, H.¹ King, S.² Watts, O.³

7
- 84905234316
- Spectral modeling using neural autoregressive distribution estimators for statistical parametric speech synthesis
- May, IEEE
- Xiang Yin, Zhen-Hua Ling, and Li-Rong Dai, "Spectral modeling using neural autoregressive distribution estimators for statistical parametric speech synthesis," in Proc. of ICASSP, May 2014, pp. 3824-3828, IEEE
- (2014) Proc. of ICASSP , pp. 3824-3828
- Yin, X.¹ Ling, Z.-H.² Dai, L.-R.³

8
- 0024909981
- A phonetic vocoder
- May, IEEE
- J. Picone and G. R. Doddington, "A phonetic vocoder," in Proc. of ICASSP, May 1989, pp. 580-583, IEEE
- (1989) Proc. of ICASSP , pp. 580-583
- Picone, J.¹ Doddington, G.R.²

9
- 0034297586
- Detection of phonological features in continuous speech using neural networks
- Oct
- Simon King and Paul Taylor, "Detection of phonological features in continuous speech using neural networks," Computer Speech & Language, vol. 14, no. 4, pp. 333-353, Oct. 2000
- (2000) Computer Speech & Language , vol.14 , Issue.4 , pp. 333-353
- King, S.¹ Taylor, P.²

10
- 84862931515
- Experiments on cross-language attribute detection and phone recognition with minimal target-specific training data
- Mar
- S. M. Siniscalchi, Dau-Cheng Lyu, T. Svendsen, and Chin-Hui Lee, "Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data," IEEE Trans. on Audio, Speech, and Language Processing, vol. 20, no. 3, pp. 875-887, Mar. 2012
- (2012) IEEE Trans. on Audio, Speech, and Language Processing , vol.20 , Issue.3 , pp. 875-887
- Siniscalchi, S.M.¹ Lyu, D.-C.² Svendsen, T.³ Lee, C.-H.⁴

11
- 0004119259
- Harper & Row, New York, NY
- N. Chomsky and M. Halle, The Sound Pattern of English, Harper & Row, New York, NY, 1968
- (1968) The Sound Pattern of English
- Chomsky, N.¹ Halle, M.²

12
- 0004145667
- 7 edition, Jan
- Peter Ladefoged and Keith Johnson, A Course in Phonetics, Cengage Learning, 7 edition, Jan. 2014
- (2014) A Course in Phonetics, Cengage Learning
- Ladefoged, P.¹ Johnson, K.²

13
- 0041385414
- Longman, Harlow, Essex
- J. Harris and G. Lindsey, The elements of phonological representation, pp. 34-79, Longman, Harlow, Essex, 1995
- (1995) The Elements of Phonological Representation , pp. 34-79
- Harris, J.¹ Lindsey, G.²

14
- 84867329143
- Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition
- March, IEEE SPS
- Dong Yu, Sabato Siniscalchi, Li Deng, and Chin-Hui Lee, "Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition," in Proc. of ICASSP, March 2012, IEEE SPS
- (2012) Proc. of ICASSP
- Yu, D.¹ Siniscalchi, S.² Deng, L.³ Lee, C.-H.⁴

15
- 0038988922
- (4th Edition) (Allyn & Bacon Communication Sciences and Disorders), Pearson, 4 edition, Mar
- Jacqueline Bauman-Waengler, Articulatory and Phonological Impairments: A Clinical Focus (4th Edition) (Allyn & Bacon Communication Sciences and Disorders), Pearson, 4 edition, Mar. 2011
- (2011) Articulatory and Phonological Impairments: A Clinical Focus
- Bauman-Waengler, J.¹

16
- 84973386174
- Corpus description of the ESTER evaluation campaign for the rich transcription of French broadcast news
- S. Galliano, E. Geoffrois, G. Gravier, J.-F. Bonastre, D. Mostefa, and K. Choukri, "Corpus description of the ESTER evaluation campaign for the rich transcription of French broadcast news," in Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006), 2006, pp. 315-320
- (2006) Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) , pp. 315-320
- Galliano, S.¹ Geoffrois, E.² Gravier, G.³ Bonastre, J.-F.⁴ Mostefa, D.⁵ Choukri, K.⁶

17
- 0022896067
- B.D.L.E.X.: A data and cognition base of spoken French
- G. Perennou, "B.D.L.E.X.: A data and cognition base of spoken French," in Proc. of ICASSP, 1986, vol. 11, pp. 325-328
- (1986) Proc. of ICASSP , vol.11 , pp. 325-328
- Perennou, G.¹

18
- 70350498327
- The HMM-based speech synthesis system version 2.0
- H. Zen, T. Nose, J. Yamagishi, S. Sako, T. Masuko, A. W. Black, and K. Tokuda, "The HMM-based Speech Synthesis System Version 2.0," in Proc. of ISCA SSW6, 2007, pp. 131-136
- (2007) Proc. of ISCA SSW6 , pp. 131-136
- Zen, H.¹ Nose, T.² Yamagishi, J.³ Sako, S.⁴ Masuko, T.⁵ Black, A.W.⁶ Tokuda, K.⁷

19
- 85135145174
- Acoustic modeling based on the MDL principle for speech recognition
- Koichi Shinoda and Takao Watanabe, "Acoustic modeling based on the MDL principle for speech recognition," in Proc. of Eurospeech, 1997, pp. I-99-102
- (1997) Proc. of Eurospeech , pp. I-99-102
- Shinoda, K.¹ Watanabe, T.²

20
- 33745805403
- A fast learning algorithm for deep belief nets
- July
- Geoffrey E. Hinton, Simon Osindero, and Yee W. Teh, "A Fast Learning Algorithm for Deep Belief Nets," Neural Comput., vol. 18, no. 7, pp. 1527-1554, July 2006
- (2006) Neural Comput , vol.18 , Issue.7 , pp. 1527-1554
- Hinton, G.E.¹ Osindero, S.² Teh, Y.W.³

21
- 84858953642
- The Kaldi speech recognition toolkit
- Dec., IEEE SPS, IEEE Catalog No.: CFP11SRW-USB
- Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, Jan Silovsky, Georg Stemmer, and Karel Vesely, "The Kaldi speech recognition toolkit," in Proc. of ASRU, Dec. 2011, IEEE SPS, IEEE Catalog No.: CFP11SRW-USB
- (2011) Proc. of ASRU
- Povey, D.¹ Ghoshal, A.² Boulianne, G.³ Burget, L.⁴ Glembek, O.⁵ Goel, N.⁶ Hannemann, M.⁷ Motlicek, P.⁸ Qian, Y.⁹ Schwarz, P.¹⁰ Silovsky, J.¹¹ Stemmer, G.¹² Vesely, K.¹³

22
- 0028996993
- Speech parameter generation from HMM using dynamic features
- May, IEEE
- K. Tokuda, T. Kobayashi, and S. Imai, "Speech parameter generation from HMM using dynamic features," in Proc. of ICASSP, May 1995, vol. 1, pp. 660-663, IEEE
- (1995) Proc. of ICASSP , vol.1 , pp. 660-663
- Tokuda, K.¹ Kobayashi, T.² Imai, S.³

23
- 84946044425
- Tech. Rep. Idiap-RR-03-2015, Idiap, Jan
- Philip N. Garner, Milos Cernak, and Blaise Potard, "A simple continuous excitation model for parametric vocoding," Tech. Rep. Idiap-RR-03-2015, Idiap, Jan. 2015
- (2015) A Simple Continuous Excitation Model for Parametric Vocoding
- Garner, P.N.¹ Cernak, M.² Potard, B.³

24
- 0027247004
- Mel-cepstral distance measure for objective speech quality assessment
- May, IEEE
- R. F. Kubichek, "Mel-cepstral distance measure for objective speech quality assessment," in Proc. of ICASSP, May 1993, vol. 1, pp. 125-128, IEEE
- (1993) Proc. of ICASSP , vol.1 , pp. 125-128
- Kubichek, R.F.¹

25
- 84874245805
- Reactive and continuous control of HMM-based speech synthesis
- Dec., IEEE
- M. Astrinaki, N. d'Alessandro, B. Picart, T. Drugman, and T. Dutoit, "Reactive and continuous control of HMM-based speech synthesis," in Spoken Language Technology Workshop (SLT), 2012 IEEE, Dec. 2012, pp. 252-257, IEEE
- (2012) Spoken Language Technology Workshop (SLT), 2012 IEEE , pp. 252-257
- Astrinaki, M.¹ d'Alessandro, N.² Picart, B.³ Drugman, T.⁴ Dutoit, T.⁵

26
- 84928118106
- Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity
- Budapest, Hungary
- H. Kawahara, H. Katayose, A. de Cheveigne, and R. D. Patterson, "Fixed point analysis of frequency to instantaneous frequency mapping for accurate estimation of F0 and periodicity," in Proc. of Eurospeech, Budapest, Hungary, 1999
- (1999) Proc. of Eurospeech
- Kawahara, H.¹ Katayose, H.² De Cheveigne, A.³ Patterson, R.D.⁴

27
- 77949895900
- Ph.D. thesis, Universität Wien, Vienna
- M. A. Pöchtrager, The Structure of Length, Ph.D. thesis, Universität Wien, Vienna, 2006
- (2006) The Structure of Length
- Pöchtrager, M.A.¹

28
- 84946038494
- Voice source modelling using deep neural networks for statistical parametric speech synthesis
- Lisbon, Portugal, September
- Tuomo Raitio, Heng Lu, John Kane, Antti Suni, Martti Vainio, Simon King, and Paavo Alku, "Voice source modelling using deep neural networks for statistical parametric speech synthesis," in Proc. of EUSIPCO, Lisbon, Portugal, September 2014
- (2014) Proc. of EUSIPCO
- Raitio, T.¹ Lu, H.² Kane, J.³ Suni, A.⁴ Vainio, M.⁵ King, S.⁶ Alku, P.⁷

29
- 85032752177
- Parametric representation of speech signals
- J. L. Flanagan, "Parametric representation of speech signals," IEEE Signal Processing Magazine, vol. 27, no. 3, pp. 141-145, 2010
- (2010) IEEE Signal Processing Magazine , vol.27 , Issue.3 , pp. 141-145
- Flanagan, J.L.¹

30
- 84936526522
- Towards an articulatory phonology
- May
- Catherine P. Browman and Louis M. Goldstein, "Towards an articulatory phonology," Phonology, vol. 3, pp. 219-252, May 1986
- (1986) Phonology , vol.3 , pp. 219-252
- Browman, C.P.¹ Goldstein, L.M.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.