



Volume 16, Issue 4, 2008, Pages 797-811

Exploiting acoustic and syntactic features for automatic prosody labeling in a maximum entropy framework

Author keywords

Acoustic prosodic representation; Maximum entropy model; Phrasing; Prominence; Spoken language processing; Supertags; Suprasegmental information; ToBI annotation

Indexed keywords

MAXIMUM ENTROPY MODEL; PHRASING; PROMINENCE; SPOKEN LANGUAGE PROCESSING; SUPERTAGS; SUPRASEGMENTAL INFORMATION; TOBI ANNOTATION

EID: 60849095980     PISSN: 15587916     EISSN: None     Source Type: Journal    
DOI: 10.1109/TASL.2008.917071     Document Type: Article
Times cited : (86)

References (64)
  • 2
    • 0028518062
    • Automatic labeling of prosodic patterns
    • Oct
    • C. W. Wightman and M. Ostendorf, "Automatic labeling of prosodic patterns," IEEE Trans. Speech Audio Process., vol. 2, no. 4, pp. 469-481, Oct. 1994.
    • (1994) IEEE Trans. Speech Audio Process , vol.2 , Issue.4 , pp. 469-481
    • Wightman, C.W.1    Ostendorf, M.2
  • 3
    • 0033719622
    • Improving intonational phrasing with syntactic information
    • P. Koehn, S. Abney, J. Hirschberg, and M. Collins, "Improving intonational phrasing with syntactic information," in Proc. ICASSP, 2000, pp. 1289-1290.
    • (2000) Proc. ICASSP , pp. 1289-1290
    • Koehn, P.1    Abney, S.2    Hirschberg, J.3    Collins, M.4
  • 4
    • 0038405804
    • Information structure and the syntax-phonology interface
    • M. Steedman, "Information structure and the syntax-phonology interface," Linguist. Inquiry, vol. 31, no. 4, pp. 649-689, 2000.
    • (2000) Linguist. Inquiry , vol.31 , Issue.4 , pp. 649-689
    • Steedman, M.1
  • 5
    • 0000106333
    • On stress and linguistic rhythm
    • M. Liberman and A. Prince, "On stress and linguistic rhythm," Linguist. Inquiry, vol. 8, no. 2, pp. 249-336, 1977.
    • (1977) Linguist. Inquiry , vol.8 , Issue.2 , pp. 249-336
    • Liberman, M.1    Prince, A.2
  • 7
    • 4544383281
    • The TILT intonation model
    • P. Taylor, "The TILT intonation model," in Proc. ICSLP, 1998, vol. 4, pp. 1383-1386.
    • (1998) Proc. ICSLP , vol.4 , pp. 1383-1386
    • Taylor, P.1
  • 8
    • 0010987926
    • Modelling the dynamic characteristics of voice fundamental frequency with application to analysis and synthesis of intonation
    • H. Fujisaki and K. Hirose, "Modelling the dynamic characteristics of voice fundamental frequency with application to analysis and synthesis of intonation," in Proc. 13th Int. Congr. Linguists, 1982, pp. 57-70.
    • (1982) Proc. 13th Int. Congr. Linguists , pp. 57-70
    • Fujisaki, H.1    Hirose, K.2
  • 9
    • 33646805961
    • IViE - A comparative transcription system for intonational variation in English
    • Sydney, Australia
    • E. Grabe, F. Nolan, and K. Farrar, "IViE - A comparative transcription system for intonational variation in English," in Proc. ICSLP, Sydney, Australia, 1998, pp. 1259-1262.
    • (1998) Proc. ICSLP , pp. 1259-1262
    • Grabe, E.1    Nolan, F.2    Farrar, K.3
  • 10
    • 85133236749
    • Coding fundamental frequency patterns for multilingual synthesis with INTSINT in the MULTEXT project
    • Sep
    • D. J. Hirst, N. Ide, and J. Véronis, "Coding fundamental frequency patterns for multilingual synthesis with INTSINT in the MULTEXT project," in Proc. 2nd ESCA/IEEE Workshop Speech Synth., Sep. 1994, pp. 77-81.
    • (1994) Proc. 2nd ESCA/IEEE Workshop Speech Synth , pp. 77-81
    • Hirst, D.J.1    Ide, N.2    Véronis, J.3
  • 11
    • 0010125082
    • A prosody-only decision-tree model for disfluency detection
    • Rhodes, Greece
    • E. E. Shriberg, R. A. Bates, and A. Stolcke, "A prosody-only decision-tree model for disfluency detection," in Proc. Eurospeech '97, Rhodes, Greece, 1997, pp. 2383-2386.
    • (1997) Proc. Eurospeech , pp. 2383-2386
    • Shriberg, E.E.1    Bates, R.A.2    Stolcke, A.3
  • 12
  • 15
    • 0027684991
    • Pitch accent in context: Predicting intonational prominence from text
    • J. Hirschberg, "Pitch accent in context: Predicting intonational prominence from text," Artif. Intell., vol. 63, no. 1-2, pp. 305-340, 1993.
    • (1993) Artif. Intell , vol.63 , Issue.1-2 , pp. 305-340
    • Hirschberg, J.1
  • 16
    • 85123467304
    • Word informativeness and automatic pitch accent modeling
    • College Park, MD
    • S. Pan and K. McKeown, "Word informativeness and automatic pitch accent modeling," in Proc. EMNLP/VLC, College Park, MD, 1999, pp. 148-157.
    • (1999) Proc. EMNLP/VLC , pp. 148-157
    • Pan, S.1    McKeown, K.2
  • 17
    • 22944477565
    • Pitch accent prediction using ensemble machine learning
    • X. Sun, "Pitch accent prediction using ensemble machine learning," in Proc. ICSLP, 2002, pp. 561-564.
    • (2002) Proc. ICSLP , pp. 561-564
    • Sun, X.1
  • 18
    • 0343353984
    • Prosody recognition from speech utterances using acoustic and linguistic based models of prosodic events
    • Budapest, Hungary
    • A. Conkie, G. Riccardi, and R. C. Rose, "Prosody recognition from speech utterances using acoustic and linguistic based models of prosodic events," in Proc. Eurospeech, Budapest, Hungary, 1999, pp. 523-526.
    • (1999) Proc. Eurospeech , pp. 523-526
    • Conkie, A.1    Riccardi, G.2    Rose, R.C.3
  • 19
    • 33646806777
    • An automatic prosody recognizer using a coupled multi-stream acoustic model and a syntactic-prosodic language model
    • Philadelphia, PA, Mar
    • S. Ananthakrishnan and S. Narayanan, "An automatic prosody recognizer using a coupled multi-stream acoustic model and a syntactic-prosodic language model," in Proc. ICASSP, Philadelphia, PA, Mar. 2005, pp. 269-272.
    • (2005) Proc. ICASSP , pp. 269-272
    • Ananthakrishnan, S.1    Narayanan, S.2
  • 21
    • 85149102760
    • Using conditional random fields to predict pitch accent in conversational speech
    • ACL
    • M. Gregory and Y. Altun, "Using conditional random fields to predict pitch accent in conversational speech," in Proc. 42nd Annu. Meeting Assoc. Comput. Linguist. (ACL), 2004, pp. 677-704.
    • (2004) Proc. 42nd Annu. Meeting Assoc. Comput. Linguist , pp. 677-704
    • Gregory, M.1    Altun, Y.2
  • 22
    • 0034854347
    • Joint prosody prediction and unit selection for concatenative speech synthesis
    • I. Bulyko and M. Ostendorf, "Joint prosody prediction and unit selection for concatenative speech synthesis," in Proc. ICASSP, 2001, pp. 781-784.
    • (2001) Proc. ICASSP , pp. 781-784
    • Bulyko, I.1    Ostendorf, M.2
  • 23
    • 0141591472
    • Automatic prosody labeling using both text and acoustic information
    • Apr
    • X. Ma, W. Zhang, Q. Shi, W. Zhu, and L. Shen, "Automatic prosody labeling using both text and acoustic information," in Proc. ICASSP, Apr. 2003, vol. 1, pp. 516-519.
    • (2003) Proc. ICASSP , vol.1 , pp. 516-519
    • Ma, X.1    Zhang, W.2    Shi, Q.3    Zhu, W.4    Shen, L.5
  • 24
    • 0347128737
    • Intonation and dialogue context as constraints for speech recognition
    • P. Taylor, S. King, S. Isard, and H. Wright, "Intonation and dialogue context as constraints for speech recognition," Lang. Speech, vol. 41, no. 3-4, pp. 493-512, 2000.
    • (2000) Lang. Speech , vol.41 , Issue.3-4 , pp. 493-512
    • Taylor, P.1    King, S.2    Isard, S.3    Wright, H.4
  • 26
    • 0030145829
    • Training intonational phrasing rules automatically for English and Spanish text-to-speech
    • J. Hirschberg and P. Prieto, "Training intonational phrasing rules automatically for English and Spanish text-to-speech," Speech Commun., vol. 18, no. 3, pp. 281-290, 1996.
    • (1996) Speech Commun , vol.18 , Issue.3 , pp. 281-290
    • Hirschberg, J.1    Prieto, P.2
  • 27
    • 4544275067
    • An automatic prosody labeling system using ANN-based syntactic-prosodic model and GMM-based acoustic-prosodic model
    • K. Chen, M. Hasegawa-Johnson, and A. Cohen, "An automatic prosody labeling system using ANN-based syntactic-prosodic model and GMM-based acoustic-prosodic model," in Proc. ICASSP, 2004, pp. I-509-I-512.
    • (2004) Proc. ICASSP
    • Chen, K.1    Hasegawa-Johnson, M.2    Cohen, A.3
  • 28
    • 84937375300
    • Intonation across Spanish, in the tones and break indices framework
    • M. E. Beckman, M. Diaz-Campos, J. T. McGory, and T. A. Morgan, "Intonation across Spanish, in the tones and break indices framework," Probus, vol. 14, pp. 9-36.
    • Probus , vol.14 , pp. 9-36
    • Beckman, M.E.1    Diaz-Campos, M.2    McGory, J.T.3    Morgan, T.A.4
  • 29
    • 84934349409
    • Intonational structure in Japanese and English
    • M. E. Beckman and J. B. Pierrehumbert, "Intonational structure in Japanese and English," Phonology Yearbook, vol. 3, pp. 255-309.
    • Phonology Yearbook , vol.3 , pp. 255-309
    • Beckman, M.E.1    Pierrehumbert, J.B.2
  • 31
    • 85149113268
    • A prosodic analysis of discourse segments in direction-giving monologues
    • J. Hirschberg and C. Nakatani, "A prosodic analysis of discourse segments in direction-giving monologues," in Proc. 34th Conf. Assoc. Computat. Linguist., 1996, pp. 286-293.
    • (1996) Proc. 34th Conf. Assoc. Computat. Linguist , pp. 286-293
    • Hirschberg, J.1    Nakatani, C.2
  • 33
    • 0026734712
    • Segmental durations in the vicinity of prosodic phrase boundaries
    • C. W. Wightman, S. Shattuck-Hufnagel, M. Ostendorf, and P. J. Price, "Segmental durations in the vicinity of prosodic phrase boundaries," J. Acoust. Soc. Amer., vol. 91, no. 3, pp. 1707-1717, 1992.
    • (1992) J. Acoust. Soc. Amer , vol.91 , Issue.3 , pp. 1707-1717
    • Wightman, C.W.1    Shattuck-Hufnagel, S.2    Ostendorf, M.3    Price, P.J.4
  • 34
    • 0030181584
    • Prediction of abstract prosodic labels for speech synthesis
    • Oct
    • K. Ross and M. Ostendorf, "Prediction of abstract prosodic labels for speech synthesis," Comput. Speech Lang., vol. 10, pp. 155-185, Oct. 1996.
    • (1996) Comput. Speech Lang , vol.10 , pp. 155-185
    • Ross, K.1    Ostendorf, M.2
  • 37
    • 0026850770
    • Automatic classification of intonational phrase boundaries
    • M. Q. Wang and J. Hirschberg, "Automatic classification of intonational phrase boundaries," Comput. Speech Lang., vol. 6, pp. 175-196, 1992.
    • (1992) Comput. Speech Lang , vol.6 , pp. 175-196
    • Wang, M.Q.1    Hirschberg, J.2
  • 38
    • 85030872484
    • Evaluation of prosodic transcription labeling reliability in the ToBI framework
    • J. F. Pitrelli, M. E. Beckman, and J. Hirschberg, "Evaluation of prosodic transcription labeling reliability in the ToBI framework," in Proc. ICSLP, 1994, pp. 123-126.
    • (1994) Proc. ICSLP , pp. 123-126
    • Pitrelli, J.F.1    Beckman, M.E.2    Hirschberg, J.3
  • 39
    • 0034274273
    • VERBMOBIL: The use of prosody in the linguistic components of a speech understanding system
    • Sep
    • E. Nöth, A. Batliner, A. Kießling, R. Kompe, and H. Niemann, "VERBMOBIL: The use of prosody in the linguistic components of a speech understanding system," IEEE Trans. Speech Audio Process., vol. 8, no. 5, pp. 519-532, Sep. 2000.
    • (2000) IEEE Trans. Speech Audio Process , vol.8 , Issue.5 , pp. 519-532
    • Nöth, E.1    Batliner, A.2    Kießling, A.3    Kompe, R.4    Niemann, H.5
  • 40
    • 64949115111
    • Predicting prosodic boundaries using linguistic features
    • Dresden, Germany, CD-ROM
    • T.-J. Yoon, "Predicting prosodic boundaries using linguistic features," in Proc. ISCA Int. Conf. Speech Prosody, Dresden, Germany, 2006, CD-ROM.
    • (2006) Proc. ISCA Int. Conf. Speech Prosody
    • Yoon, T.-J.1
  • 41
    • 64949087377
    • A prosodically labeled database of spontaneous speech
    • M. Ostendorf, I. Shafran, S. Shattuck-Hufnagel, L. Carmichael, and W. Byrne, "A prosodically labeled database of spontaneous speech," in Proc. ISCA Workshop Prosody in Speech Recognition and Understanding, 2001, pp. 119-121.
  • 45
    • 0002721964
    • Prosody/parse scoring and its application in ATIS
    • Morristown, NJ, Assoc. Comput. Linguist
    • N. M. Veilleux and M. Ostendorf, "Prosody/parse scoring and its application in ATIS," in HLT'93: Proc. Workshop Human Lang. Technol., Morristown, NJ, 1993, pp. 335-340, Assoc. Comput. Linguist.
    • (1993) HLT'93: Proc. Workshop Human Lang. Technol , pp. 335-340
    • Veilleux, N.M.1    Ostendorf, M.2
  • 47
    • 84858405009
    • Exploiting acoustic and syntactic features for prosody labeling in a maximum entropy framework
    • V. K. Rangarajan Sridhar, S. Bangalore, and S. Narayanan, "Exploiting acoustic and syntactic features for prosody labeling in a maximum entropy framework," in Proc. NAACL-HLT, 2007, pp. 1-8.
    • (2007) Proc. NAACL-HLT , pp. 1-8
    • Rangarajan Sridhar, V.K.1    Bangalore, S.2    Narayanan, S.3
  • 48
    • 0002652285
    • A maximum entropy approach to natural language processing
    • A. Berger, S. D. Pietra, and V. D. Pietra, "A maximum entropy approach to natural language processing," Comput. Linguist., vol. 22, no. 1, pp. 39-71, 1996.
    • (1996) Comput. Linguist , vol.22 , Issue.1 , pp. 39-71
    • Berger, A.1    Pietra, S.D.2    Pietra, V.D.3
  • 49
    • 9444278508
    • Performance guarantees for regularized maximum entropy density estimation
    • Banff, Canada, Springer-Verlag
    • M. Dudik, S. Phillips, and R. E. Schapire, "Performance guarantees for regularized maximum entropy density estimation," in Proc. COLT, Banff, Canada, 2004, pp. 472-486, Springer-Verlag.
    • (2004) Proc. COLT , pp. 472-486
    • Dudik, M.1    Phillips, S.2    Schapire, R.E.3
  • 50
    • 32144432078
    • Scaling large margin classifiers for spoken language understanding
    • P. Haffner, "Scaling large margin classifiers for spoken language understanding," Speech Commun., vol. 48, no. IV, pp. 239-261, 2006.
    • (2006) Speech Commun , vol.48 , Issue.IV , pp. 239-261
    • Haffner, P.1
  • 51
    • 0003356044
    • Supertagging: An approach to almost parsing
    • S. Bangalore and A. K. Joshi, "Supertagging: An approach to almost parsing," Computat. Linguist., vol. 25, no. 2, pp. 237-265, 1999.
    • (1999) Computat. Linguist , vol.25 , Issue.2 , pp. 237-265
    • Bangalore, S.1    Joshi, A.K.2
  • 52
    • 0002709670
    • Tree-adjoining grammars
    • A. Joshi and Y. Schabes, A. Salomaa and G. Rozenberg, Eds., Berlin: Springer-Verlag
    • A. Joshi and Y. Schabes, A. Salomaa and G. Rozenberg, Eds., "Tree-adjoining grammars," in Handbook of Formal Languages and Automata. Berlin: Springer-Verlag, 1996.
    • (1996) Handbook of Formal Languages and Automata
  • 53
    • 64949199415
    • A Lexicalized Tree-Adjoining Grammar for English
    • "A Lexicalized Tree-Adjoining Grammar for English," Univ. of Pennsylvania, Philadelphia, Tech. Rep., 2001 [Online], Available: http://www.cis.upenn.edu/xtag/gramrelease.html, XTAG
  • 55
    • 64949114482
    • A uniform method of grammar extraction and its applications
    • F. Xia, M. Palmer, and A. Joshi, "A uniform method of grammar extraction and its applications," in Proc. Empirical Methods in Natural Lang. Process., 2000, pp. 53-62.
  • 56
    • 64949202919
    • Factoring Global Inference by Enriching Local Representations
    • AT&T Labs-Research, Tech. Rep
    • S. Bangalore, A. Emami, and P. Haffner, "Factoring Global Inference by Enriching Local Representations," AT&T Labs-Research, Tech. Rep., 2005.
    • (2005)
    • Bangalore, S.1    Emami, A.2    Haffner, P.3
  • 58
    • 64949142316
    • Speech recognition models of the interdependence among syntax, prosody, and segmental acoustics
    • M. Hasegawa-Johnson, J. Cole, C. Shih, K. Chen, A. Cohen, S. Chavarria, H. Kim, T. Yoon, S. Borys, and J.-Y. Choi, "Speech recognition models of the interdependence among syntax, prosody, and segmental acoustics," in Proc. HLT/NAACL, Workshop Higher-Level Knowledge in Automatic Speech Recognition and Understanding, May 2004, pp. 56-63.
  • 59
    • 84871623487
    • Learning prosodic features using a tree representation
    • Aalborg, Denmark
    • J. Hirschberg and O. Rambow, "Learning prosodic features using a tree representation," in Proc. Eurospeech, Aalborg, Denmark, 2001, pp. 1175-1180.
    • (2001) Proc. Eurospeech , pp. 1175-1180
    • Hirschberg, J.1    Rambow, O.2
  • 64
    • 64949143594
    • Exploiting prosodic features for dialog act tagging in a discriminative modeling framework
    • V. K. Rangarajan Sridhar, S. Bangalore, and S. Narayanan, "Exploiting prosodic features for dialog act tagging in a discriminative modeling framework," in Proc. ICSLP, 2007.
    • (2007) Proc. ICSLP
    • Rangarajan Sridhar, V.K.1    Bangalore, S.2    Narayanan, S.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.