SCOPUS Information Search Platform

IEEE Transactions on Speech and Audio Processing

Volume 11, Issue 6, 2003, Pages 617-625

Automatic Phonetic Segmentation

Authors (3): Toledano, Doroteo Torre a,b; Hernández Gómez, Luis A. c; Villarrubia Grande, Luis a

a Speech Technology Group (Spain)

b MASSACHUSETTS INSTITUTE OF TECHNOLOGY (United States)

c UNIVERSIDAD POLITÉCNICA DE MADRID (Spain)

Author keywords

Speech analysis; Speech recognition; Speech synthesis

Indexed keywords

FUZZY SETS; MARKOV PROCESSES; MATHEMATICAL MODELS; NEURAL NETWORKS; SPEECH RECOGNITION; STATISTICAL METHODS;

AUTOMATIC PHONETIC SEGMENTATION;

SPEECH ANALYSIS;

EID: 0347968276 PISSN: 10636676 EISSN: None Source Type: Journal
DOI: 10.1109/TSA.2003.813579 Document Type: Article

Times cited : (143)

References (36)

1
- 85078422187
- A preliminary statistical evaluation of manual and automatic segmentation discrepancies
- P. Cosi, D. Falavigna, and M. Omologo, "A preliminary statistical evaluation of manual and automatic segmentation discrepancies," in Proceedings EUROSPEECH, 1991, pp. 693-696.
- (1991) Proceedings EUROSPEECH , pp. 693-696
- Cosi, P.¹ Falavigna, D.² Omologo, M.³

2
- 0346892469
- Automatic speech segmentation for concatenative inventory selection
- J. P. H. Van Santen, Ed: Springer
- A. Ljolje, J. Hirschberg, and J. P. H. Van Santen, "Automatic speech segmentation for concatenative inventory selection," in Progress in Speech Synthesis, J. P. H. Van Santen, Ed: Springer, 1997, pp. 305-311.
- (1997) Progress in Speech Synthesis , pp. 305-311
- Ljolje, A.¹ Hirschberg, J.² Van Santen, J.P.H.³

3
- 0344612854
- Automatic segmentation of speech for TTS
- A. Ljolje and M. D. Riley, "Automatic segmentation of speech for TTS," in Proceedings EUROSPEECH, 1993, pp. 1445-1448.
- (1993) Proceedings EUROSPEECH , pp. 1445-1448
- Ljolje, A.¹ Riley, M.D.²

4
- 85128350155
- Techniques for accurate automatic annotation of speech waveforms
- Sydney, NSW, Australia
- S. Cox, R. Brady, and P. Jackson, "Techniques for accurate automatic annotation of speech waveforms," in Proceedings of the International Conference on Spoken Language Processing, vol. V, Sydney, NSW, Australia, 1998, pp. 1947-1950.
- (1998) Proceedings of the International Conference on Spoken Language Processing , vol.5 , pp. 1947-1950
- Cox, S.¹ Brady, R.² Jackson, P.³

5
- 85128407266
- Phonetic alignment: Speech synthesis based vs. hybrid HMM/ANN
- Sydney, NSW, Australia
- F. Malfrère, O. Deroo, and T. Dutoit, "Phonetic alignment: Speech synthesis based vs. hybrid HMM/ANN," in Proceedings of the International Conference on Spoken Language Processing, vol. IV, Sydney, NSW, Australia, 1998, pp. 1571-1574.
- (1998) Proceedings of the International Conference on Spoken Language Processing , vol.4 , pp. 1571-1574
- Malfrère, F.¹ Deroo, O.² Dutoit, T.³

6
- 0346126969
- The aligner: Text-to-speech alignment using Markov models
- J. P. H. Van Santen, Ed: Springer
- C. W. Wightman and D. T. Talkin, "The aligner: Text-to-speech alignment using Markov models," in Progress in Speech Synthesis, J. P. H. Van Santen, Ed: Springer, 1997, pp. 313-323.
- (1997) Progress in Speech Synthesis , pp. 313-323
- Wightman, C.W.¹ Talkin, D.T.²

7
- 0028996888
- Using explicit segmentation to improve HMM phone recognition
- C. D. Mitchell, M. P. Harper, and L. H. Jamieson, "Using explicit segmentation to improve HMM phone recognition," in Proceedings of the International Conference on Acoustics Speech and Signal Processing, vol. 1, 1995, pp. 229-232.
- (1995) Proceedings of the International Conference on Acoustics Speech and Signal Processing , vol.1 , pp. 229-232
- Mitchell, C.D.¹ Harper, M.P.² Jamieson, L.H.³

8
- 33745197155
- Automatic segmentation and labeling of English and Italian speech databases
- B. Angelini, F. Brugnara, D. Falavigna, D. Giuliani, R. Gretter, and M. Omologo, "Automatic segmentation and labeling of English and Italian speech databases," in Proceedings EUROSPEECH, 1993, pp. 653-656.
- (1993) Proceedings EUROSPEECH , pp. 653-656
- Angelini, B.¹ Brugnara, F.² Falavigna, D.³ Giuliani, D.⁴ Gretter, R.⁵ Omologo, M.⁶

9
- 0346262149
- Automatic diphone extraction for an Italian text-to-speech synthesis system
- B. Angelini, C. Barolo, D. Falavigna, M. Omologo, and S. Sandri, "Automatic diphone extraction for an Italian text-to-speech synthesis system," in Proceedings EUROSPEECH, vol. II, 1997, pp. 581-584.
- (1997) Proceedings EUROSPEECH , vol.2 , pp. 581-584
- Angelini, B.¹ Barolo, C.² Falavigna, D.³ Omologo, M.⁴ Sandri, S.⁵

10
- 0346892468
- Labeller - A system for automatic labeling of speech continuous signal
- R. Gubrynowicz and A. Wrzoskowicz, "Labeller - A system for automatic labeling of speech continuous signal," in Proceedings EUROSPEECH, 1993, pp. 297-300.
- (1993) Proceedings EUROSPEECH , pp. 297-300
- Gubrynowicz, R.¹ Wrzoskowicz, A.²

11
- 0348153019
- A segmental approach versus a centisecond one for automatic phonetic time-alignment
- A. Farhat, G. Pérennou, and R. André-Obrecht, "A segmental approach versus a centisecond one for automatic phonetic time-alignment," in Proceedings EUROSPEECH, 1993, pp. 657-660.
- (1993) Proceedings EUROSPEECH , pp. 657-660
- Farhat, A.¹ Pérennou, G.² André-Obrecht, R.³

12
- 0347522318
- A nonlinear filtering method applied to automatic segmentation of multilingual speech corpora
- H. Kabré, G. Pérennou, and N. Vigouroux, "A nonlinear filtering method applied to automatic segmentation of multilingual speech corpora," in Proceedings EUROSPEECH, 1991, pp. 689-702.
- (1991) Proceedings EUROSPEECH , pp. 689-702
- Kabré, H.¹ Pérennou, G.² Vigouroux, N.³

13
- 0348153017
- A new fast algorithm for automatic segmentation of continuous speech
- Sydney, NSW, Australia
- I. Gholampour and K. Nayebi, "A new fast algorithm for automatic segmentation of continuous speech," in Proceedings of the International Conference on Spoken Language Processing, vol. IV, Sydney, NSW, Australia, 1998, pp. 1555-1558.
- (1998) Proceedings of the International Conference on Spoken Language Processing , vol.4 , pp. 1555-1558
- Gholampour, I.¹ Nayebi, K.²

14
- 0346892465
- Automatic labeling of speech using an acoustic-phonetic knowledge base
- C. G. J. Houben, "Automatic labeling of speech using an acoustic-phonetic knowledge base," in Proceedings EUROSPEECH, vol. 2, 1989, pp. 104-107.
- (1989) Proceedings EUROSPEECH , vol.2 , pp. 104-107
- Houben, C.G.J.¹

15
- 0348206344
- Segment based variable frame rate speech analysis and recognition using a spectral variation function
- G. Flammia, P. Dalsgaard, O. Andersen, and B. Lindberg, "Segment based variable frame rate speech analysis and recognition using a spectral variation function," in Proceedings of the International Conference on Spoken Language Processing, 1992, pp. 983-986.
- (1992) Proceedings of the International Conference on Spoken Language Processing , pp. 983-986
- Flammia, G.¹ Dalsgaard, P.² Andersen, O.³ Lindberg, B.⁴

16
- 0346262153
- On the use of F0 features in automatic segmentation for speech synthesis
- Sydney, NSW, Australia
- T. Saito, "On the use of F0 features in automatic segmentation for speech synthesis," in Proceedings of the International Conference on Spoken Language Processing, vol. VII, Sydney, NSW, Australia, 1998, pp. 2839-2842.
- (1998) Proceedings of the International Conference on Spoken Language Processing , vol.7 , pp. 2839-2842
- Saito, T.¹

17
- 0348153016
- Robust automatic extraction of diphones with variable boundaries
- Madrid, Spain
- D. Yarrington, H. Timothy, and G. Ball, "Robust automatic extraction of diphones with variable boundaries," in Proceedings EUROSPEECH, Madrid, Spain, 1995, pp. 1845-1848.
- (1995) Proceedings EUROSPEECH , pp. 1845-1848
- Yarrington, D.¹ Timothy, H.² Ball, G.³

18
- 0024922971
- Acoustic segmentation and phonetic classification in the SUMMIT system
- V. Zue, J. Glass, M. Phillips, and S. Seneff, "Acoustic segmentation and phonetic classification in the SUMMIT system," in Proceedings of the International Conference on Acoustics Speech and Signal Processing, 1989, pp. 389-392.
- (1989) Proceedings of the International Conference on Acoustics Speech and Signal Processing , pp. 389-392
- Zue, V.¹ Glass, J.² Phillips, M.³ Seneff, S.⁴

19
- 0346262154
- Automatic segmentation: Data-driven units of speech
- Rhodes, Greece
- W. Beet and L. Baghay-Ravary, "Automatic segmentation: Data-driven units of speech," in Proceedings EUROSPEECH, Rhodes, Greece, 1997, pp. 505-508.
- (1997) Proceedings EUROSPEECH , pp. 505-508
- Beet, W.¹ Baghay-Ravary, L.²

20
- 0348152140
- A multi-level automatic segmentation system: SAPHO and VERIPHONE
- C. Dours, M. Calmes, H. Kabré, J. M. Pécatte, G. Pérennou, and N. Vigouroux, "A multi-level automatic segmentation system: SAPHO and VERIPHONE," in Proceedings EUROSPEECH, vol. 2, 1989, pp. 83-86.
- (1989) Proceedings EUROSPEECH , vol.2 , pp. 83-86
- Dours, C.¹ Calmes, M.² Kabré, H.³ Pécatte, J.M.⁴ Pérennou, G.⁵ Vigouroux, N.⁶

21
- 0024899359
- Phoneme segmentation using spectrogram reading knowledge
- K. Hatazaki, Y. Komori, T. Kawabata, and K. Shikano, "Phoneme segmentation using spectrogram reading knowledge," in Proceedings of the International Conference on Acoustics Speech and Signal Processing, 1989, pp. 393-396.
- (1989) Proceedings of the International Conference on Acoustics Speech and Signal Processing , pp. 393-396
- Hatazaki, K.¹ Komori, Y.² Kawabata, T.³ Shikano, K.⁴

22
- 0348152141
- An efficient labeling tool for the QUICKSIG speech database
- Sydney, NSW, Australia
- M. Karjalainen, T. Altosaar, and M. Huttunen, "An efficient labeling tool for the QUICKSIG speech database," in Proceedings of the International Conference on Spoken Language Processing, vol. IV, Sydney, NSW, Australia, 1998, pp. 1535-1538.
- (1998) Proceedings of the International Conference on Spoken Language Processing , vol.4 , pp. 1535-1538
- Karjalainen, M.¹ Altosaar, T.² Huttunen, M.³

23
- 0347387926
- Fast automatic segmentation and labeling: Results on TIMIT and EUROM0
- Madrid, Spain
- A. Vorstermans, J. M. Martens, and B. Van Colle, "Fast automatic segmentation and labeling: Results on TIMIT and EUROM0," in Proceedings EUROSPEECH, Madrid, Spain, 1995, pp. 1397-1400.
- (1995) Proceedings EUROSPEECH , pp. 1397-1400
- Vorstermans, A.¹ Martens, J.M.² Van Colle, B.³

24
- 0004565879
- High-quality speech synthesis for phonetic speech segmentation
- F. Malfrère and T. Dutoit, "High-quality speech synthesis for phonetic speech segmentation," in Proceedings EUROSPEECH, 1997, pp. 2631-2634.
- (1997) Proceedings EUROSPEECH , pp. 2631-2634
- Malfrère, F.¹ Dutoit, T.²

25
- 0346892467
- Automatic segmentation and quality evaluation of speech unit inventories for concatenation-based multilingual PSOLA text-to-speech systems
- O. Boëffard, B. Cherbonnel, F. Emerard, and S. White, "Automatic segmentation and quality evaluation of speech unit inventories for concatenation-based multilingual PSOLA text-to-speech systems," in Proceedings EUROSPEECH, 1993, pp. 1449-1452.
- (1993) Proceedings EUROSPEECH , pp. 1449-1452
- Boëffard, O.¹ Cherbonnel, B.² Emerard, F.³ White, S.⁴

26
- 84890480439
- Automatic segmental and prosodic labeling of Mandarin speech database
- Sydney, NSW, Australia
- F. C. Chou, C. Y. Tseng, and L. S. Lee, "Automatic segmental and prosodic labeling of Mandarin speech database," in Proceedings of the International Conference on Spoken Language Processing, vol. IV, Sydney, NSW, Australia, 1998, pp. 1263-1266.
- (1998) Proceedings of the International Conference on Spoken Language Processing , vol.4 , pp. 1263-1266
- Chou, F.C.¹ Tseng, C.Y.² Lee, L.S.³

27
- 0346262152
- Real-time probabilistic segmentation for segment-based speech recognition
- Sydney, NSW, Australia, Nov. 1998
- S. Lee and J. Glass, "Real-time probabilistic segmentation for segment-based speech recognition," in Proceedings of the International Conference on Spoken Language Processing, Sydney, NSW, Australia, Nov. 1998.
- (1998) Proceedings of the International Conference on Spoken Language Processing
- Lee, S.¹ Glass, J.²

28
- 85133541701
- Trying to mimic human segmentation of speech using HMM and fuzzy logic post-correction rules
- Jenolan Caves, Australia
- D. T. Toledano, M. A. Rodríguez, and J. G. Escalada, "Trying to mimic human segmentation of speech using HMM and fuzzy logic post-correction rules," in Proceedings of the 3rd ESCA/COCOSDA International Workshop on Speech Synthesis, Jenolan Caves, Australia, 1998, pp. 207-212.
- (1998) Proceedings of the 3rd ESCA/COCOSDA International Workshop on Speech Synthesis , pp. 207-212
- Toledano, D.T.¹ Rodríguez, M.A.² Escalada, J.G.³

29
- 0348153018
- Trying to mimic human segmentation of speech using HMM and fuzzy logic post-correction rules
- N. Campbell, Ed., to be published
- D. T. Toledano, M. A. Rodríguez, J. G. Escalada, and L. A. Hernández, "Trying to mimic human segmentation of speech using HMM and fuzzy logic post-correction rules," in Progress in Speech Synthesis, N. Campbell, Ed., to be published.
- Progress in Speech Synthesis
- Toledano, D.T.¹ Rodríguez, M.A.² Escalada, J.G.³ Hernández, L.A.⁴

30
- 0033694372
- Neural network boundary refining for automatic speech segmentation
- Istanbul, Turkey, June
- D. T. Toledano, "Neural network boundary refining for automatic speech segmentation," presented at the Proceedings of the International Conference on Acoustics Speech and Signal Processing 2000, Istanbul, Turkey, June 2000.
- (2000) Proceedings of the International Conference on Acoustics Speech and Signal Processing 2000
- Toledano, D.T.¹

31
- 50549091068
- Local refinement of phonetic boundaries: A general framework and its application using different transition models
- Aalborg, Denmark, Sept.
- D. T. Toledano and L. A. Hernández, "Local refinement of phonetic boundaries: A general framework and its application using different transition models," in Proceedings EUROSPEECH, Aalborg, Denmark, Sept. 2001.
- (2001) Proceedings EUROSPEECH
- Toledano, D.T.¹ Hernández, L.A.²

32
- 84999697223
- HMMs for automatic phonetic segmentation
- Canaria, Spain, May
- _, "HMMs for automatic phonetic segmentation," in Proceedings of the Third International Conference on Language Resources and Evaluation (LREC), vol. V, Canaria, Spain, May 2002, pp. 1558-1563.
- (2002) Proceedings of the Third International Conference on Language Resources and Evaluation (LREC) , vol.5 , pp. 1558-1563

33
- 0003571972
- Cambridge University
- S. Young, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, The HTK Book, Version 2.1: Cambridge University, 1997.
- (1997) The HTK Book, Version 2.1
- Young, S.¹ Odell, J.² Ollason, D.³ Valtchev, V.⁴ Woodland, P.⁵

34
- 0029288633
- Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
- Apr.
- C. J. Leggetter and P. C. Woodland, "Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models," Computer Speech and Language, vol. 9, no. 2, pp. 171-185, Apr. 1995.
- (1995) Computer Speech and Language , vol.9 , Issue.2 , pp. 171-185
- Leggetter, C.J.¹ Woodland, P.C.²

35
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian observations of Markov chains
- J. L. Gauvain and C. H. Lee, "Maximum a posteriori estimation for multivariate Gaussian observations of Markov chains," IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, 1994.
- (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.L.¹ Lee, C.H.²

36
- 0030372637
- A probabilistic framework for feature-based speech recognition
- Philadelphia, PA, Oct.
- J. Glass, J. Chang, and M. McCandless, "A probabilistic framework for feature-based speech recognition," in Proc. Int. Conf. Speech Language Processing 96, Philadelphia, PA, Oct. 1996, pp. 2277-2280.
- (1996) Proc. Int. Conf. Speech Language Processing 96 , pp. 2277-2280
- Glass, J.¹ Chang, J.² McCandless, M.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.