SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 17, Issue 1, 2009, Pages 138-149

Unsupervised adaptation of categorical prosody models for prosody labeling and speech recognition

Authors (2): Ananthakrishnan, Sankaranarayanan a,b; Narayanan, Shrikanth a

a University of Southern California (United States)

b BBN Technologies (United States)

Author keywords

Categorical prosody models; Lattice enrichment speech recognition; Unsupervised adaptation

Indexed keywords

ACOUSTIC COMPONENTS; ACOUSTIC MODEL; AUTOMATIC SPEECH RECOGNITION SYSTEM; BASELINE SYSTEMS; BOSTON UNIVERSITY; BREAK INDICES; CATEGORICAL PROSODY MODELS; CLASSIFICATION ERROR RATE; HUMAN SPEECH; LATTICE ENRICHMENT SPEECH RECOGNITION; PITCH ACCENTS; PROSODY LABELING; PROSODY MODEL; RELATIVE REDUCTION; SEED MODEL; SPEECH RECOGNIZER; UNSUPERVISED ADAPTATION; WORD ERROR RATE;

ACOUSTICS; CONCENTRATION (PROCESS); LINGUISTICS; REMELTING; SOFTWARE AGENTS; SYNTACTICS;

SPEECH RECOGNITION;

EID: 70350458869 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2008.2005347 Document Type: Article

Times cited: 20

References (25)

1
- 85119213703
- ToBI: A standard scheme for labeling prosody
- K. Silverman, M. Beckman, J. Pitrelli, M. Ostendorf, C. Wightman, P. Price, J. Pierrehumbert, and J. Hirschberg, "ToBI: A standard scheme for labeling prosody," in Proc. Int. Conf. Spoken Lang. Process., 1992, pp. 867-869.
- (1992) Proc. Int. Conf. Spoken Lang. Process , pp. 867-869
- Silverman, K.¹ Beckman, M.² Pitrelli, J.³ Ostendorf, M.⁴ Wightman, C.⁵ Price, P.⁶ Pierrehumbert, J.⁷ Hirschberg, J.⁸

2
- 0003665661
- Intonation Systems: A Survey of Twenty Languages
- D. Hirst and A. Di Cristo, Eds., Intonation Systems: A Survey of Twenty Languages. Cambridge, U.K.: Cambridge Univ. Press, 1998.
- (1998) Intonation Systems: A Survey of Twenty Languages

3
- 33646805961
- IViE-A comparative transcription system for intonational variation in English
- E. Grabe, F. Nolan, and K. Farrar, "IViE-A comparative transcription system for intonational variation in English," in Proc. Int. Conf. Spoken Lang. Process., 1998, pp. 1259-1262.
- (1998) Proc. Int. Conf. Spoken Lang. Process , pp. 1259-1262
- Grabe, E.¹ Nolan, F.² Farrar, K.³

4
- 60849083145
- Automatic prosodic event detection using acoustic, lexical, and syntactic evidence
- S. Ananthakrishnan and S. Narayanan, "Automatic prosodic event detection using acoustic, lexical, and syntactic evidence," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 1, pp. 216-228, Jan. 2008.
- (2008) IEEE Trans. Audio, Speech, Lang. Process , vol.16 , Issue.1 , pp. 216-228
- Ananthakrishnan, S.¹ Narayanan, S.²

5
- 34547540499
- Prosody models for conversational speech recognition
- M. Ostendorf, I. Shafran, and R. Bates, "Prosody models for conversational speech recognition," in Proc. 2nd Plenary Meeting Symp. Prosody and Speech Process., 2003, pp. 147-154.
- (2003) Proc. 2nd Plenary Meeting Symp. Prosody and Speech Process , pp. 147-154
- Ostendorf, M.¹ Shafran, I.² Bates, R.³

6
- 85009102907
- Lexical stress modeling for improved speech recognition of spontaneous telephone speech in the JUPITER domain
- C. Wang and S. Seneff, "Lexical stress modeling for improved speech recognition of spontaneous telephone speech in the JUPITER domain," in Proc. 7th Eur. Conf. Speech Commun. Technol., 2001, pp. 2761-2764.
- (2001) Proc. 7th Eur. Conf. Speech Commun. Technol. , pp. 2761-2764
- Wang, C.¹ Seneff, S.²

7
- 34547496179
- Speech recognition models of the interdependence among syntax, prosody and segmental acoustics
- M. Hasegawa-Johnson, J. Cole, C. Shih, K. Chen, A. Cohen, S. Chavarria, H. Kim, T. Yoon, S. Borys, and J.-Y. Choi, "Speech recognition models of the interdependence among syntax, prosody and segmental acoustics," in Proc. HLT/NAACL, 2004, pp. 56-63.
- (2004) Proc. HLT/NAACL , pp. 56-63
- Hasegawa-Johnson, M.¹ Cole, J.² Shih, C.³ Chen, K.⁴ Cohen, A.⁵ Chavarria, S.⁶ Kim, H.⁷ Yoon, T.⁸ Borys, S.⁹ Choi, J.-Y.¹⁰

8
- 33744970676
- Prosody dependent speech recognition on radio news corpus of American English
- K. Chen, M. Hasegawa-Johnson, A. Cohen, S. Borys, S.-S. Kim, J. Cole, and J.-Y. Choi, "Prosody dependent speech recognition on radio news corpus of American English," IEEE Trans. Audio, Speech, Lang. Process., vol. 14, no. 1, pp. 232-245, Jan. 2006.
- (2006) IEEE Trans. Audio, Speech, Lang. Process , vol.14 , Issue.1 , pp. 232-245
- Chen, K.¹ Hasegawa-Johnson, M.² Cohen, A.³ Borys, S.⁴ Kim, S.-S.⁵ Cole, J.⁶ Choi, J.-Y.⁷

9
- 0004115604
- The Boston University Radio News Corpus
- M. Ostendorf, P. Price, and S. Shattuck-Hufnagel, "The Boston University Radio News Corpus," Boston Univ., Boston, MA, Tech. Rep. ECS-95-1001, Mar. 1995.
- (1995) Boston Univ., Boston, MA, Tech. Rep. ECS-95-1001
- Ostendorf, M.¹ Price, P.² Shattuck-Hufnagel, S.³

10
- 34547525606
- Improved speech recognition using acoustic and lexical correlates of pitch accent in a N-best rescoring framework
- S. Ananthakrishnan and S. Narayanan, "Improved speech recognition using acoustic and lexical correlates of pitch accent in a N-best rescoring framework," in Proc. Int. Conf. Acoust., Speech, Signal Process., 2007, pp. 873-876.
- (2007) Proc. Int. Conf. Acoust., Speech, Signal Process , pp. 873-876
- Ananthakrishnan, S.¹ Narayanan, S.²

11
- 51449107991
- Prosody-enriched lattices for improved syllable recognition
- S. Ananthakrishnan and S. Narayanan, "Prosody-enriched lattices for improved syllable recognition," in Proc. Int. Conf. Spoken Lang. Process., Antwerp, Belgium, Sep. 2007.
- (2007) Proc. Int. Conf. Spoken Lang. Process
- Ananthakrishnan, S.¹ Narayanan, S.²

12
- 0035156005
- Automatic ToBI prediction and alignment to speed manual labeling of prosody
- A. Syrdal, J. Hirschberg, J. McGory, and M. Beckman, "Automatic ToBI prediction and alignment to speed manual labeling of prosody," Speech Commun., vol. 33, pp. 135-151, 2001.
- (2001) Speech Commun. , vol.33 , pp. 135-151
- Syrdal, A.¹ Hirschberg, J.² McGory, J.³ Beckman, M.⁴

13
- 78149415050
- Discourse structure in spoken language: Studies on speech corpora
- C. Nakatani, J. Hirschberg, and B. Grosz, "Discourse structure in spoken language: Studies on speech corpora," in Proc. AAAI Spring Symp. Empirical Methods in Discourse Interpretation and Generation, Mar. 1995, pp. 106-112.
- (1995) Proc. AAAI Spring Symp. Empirical Methods in Discourse Interpretation and Generation , pp. 106-112
- Nakatani, C.¹ Hirschberg, J.² Grosz, B.³

14
- 67449109656
- The design and implementation of the TRAINS-96 system: A prototype mixed-initiative planning assistant
- G. Ferguson, J. Allen, B. Miller, and E. Ringger, "The design and implementation of the TRAINS-96 system: A prototype mixed-initiative planning assistant," Univ. of Rochester, Rochester, Tech. Rep. TN96-5, Oct. 1996.
- (1996) Univ. of Rochester, Rochester, Tech. Rep. TN96-5
- Ferguson, G.¹ Allen, J.² Miller, B.³ Ringger, E.⁴

15
- 33947647913
- A prosodically labeled database of spontaneous speech
- M. Ostendorf, I. Shafran, S. Shattuck-Hufnagel, L. Carmichael, and W. Byrne, "A prosodically labeled database of spontaneous speech," in Proc. ISCA Workshop Prosody in Speech Recognition and Understanding, Oct. 2001, pp. 119-121.
- (2001) Proc. ISCA Workshop Prosody in Speech Recognition and Understanding , pp. 119-121
- Ostendorf, M.¹ Shafran, I.² Shattuck-Hufnagel, S.³ Carmichael, L.⁴ Byrne, W.⁵

16
- 51449117928
- A novel algorithm for unsupervised prosodic language model adaptation
- S. Ananthakrishnan and S. Narayanan, "A novel algorithm for unsupervised prosodic language model adaptation," in Proc. Int. Conf. Acoust., Speech, Signal Process., Las Vegas, NV, 2008.
- (2008) Proc. Int. Conf. Acoust., Speech, Signal Process
- Ananthakrishnan, S.¹ Narayanan, S.²

17
- 70350481607
- CSR-II (WSJ1) complete
- "CSR-II (WSJ1) Complete," Linguistic Data Consortium, Philadelphia, PA, 1994.
- (1994) Linguistic Data Consortium

18
- 0141589558
- SONIC: The University of Colorado continuous speech recognizer
- B. Pellom, "SONIC: The University of Colorado continuous speech recognizer," Univ. of Colorado, Boulder, CO, Tech. Rep. TR-CSLR-2001-2101, Mar. 2001.
- (2001) Univ. of Colorado, Boulder, CO, Tech. Rep. TR-CSLR-2001-2101
- Pellom, B.¹

19
- 84891308106
- SRILM-An extensible language modeling toolkit
- A. Stolcke, "SRILM-An extensible language modeling toolkit," in Proc. Int. Conf. Spoken Lang. Process., Denver, CO, 2002, vol. 2, pp. 901-904.
- (2002) Proc. Int. Conf. Spoken Lang. Process , vol.2 , pp. 901-904
- Stolcke, A.¹

20
- 0034296009
- Finding consensus in speech recognition: Word error minimization and other applications of confusion networks
- L. Mangu, E. Brill, and A. Stolcke, "Finding consensus in speech recognition: Word error minimization and other applications of confusion networks," Comput. Speech Lang., vol. 14, no. 4, pp. 373-400, 2000.
- (2000) Comput. Speech Lang. , vol.14 , Issue.4 , pp. 373-400
- Mangu, L.¹ Brill, E.² Stolcke, A.³

21
- 34249306038
- Minimum word error rate decoding
- G. Evermann, "Minimum word error rate decoding," M.S. thesis, Cambridge Univ., Cambridge, U.K., 1999.
- (1999) Minimum Word Error Rate Decoding
- Evermann, G.¹

22
- 0003857778
- A Gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models
- J. Bilmes, "A Gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models," Univ. of Berkeley, Berkeley, CA, Tech. Rep. ICSI-TR-97-021, 1997.
- (1997) Univ. of Berkeley, Berkeley, CA, Tech. Rep. ICSI-TR-97-021
- Bilmes, J.¹

23
- 0028419019
- Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
- J.-L. Gauvain and C.-H. Lee, "Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains," IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291-298, Apr. 1994.
- (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.2 , pp. 291-298
- Gauvain, J.-L.¹ Lee, C.-H.²

24
- 0040262052
- Bayesian learning of Gaussian mixture densities for hidden Markov models
- J.-L. Gauvain and C.-H. Lee, "Bayesian learning of Gaussian mixture densities for hidden Markov models," in Proc. DARPA Speech and Natural Language Workshop, Pacific Grove, CA, 1991, pp. 272-277, Morgan-Kaufmann.
- (1991) Proc. DARPA Speech and Natural Language Workshop , pp. 272-277
- Gauvain, J.-L.¹ Lee, C.-H.²

25
- 0003822743
- The HTK Book
- S. Young, G. Evermann, T. Hain, D. Kershaw, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK Book. Cambridge, U.K.: Cambridge Univ., Dec. 2002.
- (2002) The HTK Book
- Young, S.¹ Evermann, G.² Hain, T.³ Kershaw, D.⁴ Moore, G.⁵ Odell, J.⁶ Ollason, D.⁷ Povey, D.⁸ Valtchev, V.⁹ Woodland, P.¹⁰

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.