SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 17, Issue 1, 2009, Pages 2-12

Automatic detection of disfluency boundaries in spontaneous speech of children using audio-visual information

(2) Yildirim, Serdar a,b Narayanan, Shrikanth a

a UNIVERSITY OF SOUTHERN CALIFORNIA (United States)

b MUSTAFA KEMAL UNIVERSITY (Turkey)

Author keywords

Disfluency detection; Feature selection; Information fusion; Spoken language processing; Spontaneous children speech

Indexed keywords

AUDIO-VISUAL INFORMATION; AUTOMATIC DETECTION; AUTOMATIC RECOGNITION; COGNITIVE STATE; COMPUTER GAME; DECISION LEVELS; DETECTION ACCURACY; DETECTION ERROR RATE; DETECTION SYSTEM; DISFLUENCIES; DISFLUENCY DETECTION; FEATURE LEVEL; FEATURE SELECTION; INFORMATION SOURCES; LANGUAGE FEATURES; MULTI-MODAL; SPOKEN LANGUAGE PROCESSING; SPONTANEOUS CHILDREN SPEECH; SPONTANEOUS SPEECH; VISUAL INFORMATION;

ERROR DETECTION; HUMAN COMPUTER INTERACTION; INFORMATION FUSION; LINGUISTICS; SPEECH RECOGNITION; VISUAL COMMUNICATION;

INFORMATION USE;

EID: 70350442414 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2008.2006728 Document Type: Article

Times cited : (26)

References (40)

1
- 51849142134
- Evaluating the effect of predicting oral reading miscues
- Geneva, Switzerland
- S. Banerjee, J. E. Beck, and J. Mostow, "Evaluating the effect of predicting oral reading miscues," in Proc. Eurospeech, Geneva, Switzerland, 2003, pp. 3165-3168.
- (2003) Proc. Eurospeech , pp. 3165-3168
- Banerjee, S.¹ Beck, J.E.² Mostow, J.³

2
- 70450158698
- Analysis and detection of reading miscues for interactive literacy tutors
- Geneva, Switzerland, Aug., Article 1524
- K. Lee, A. Hagen, N. Romanyshyn, S. Martin, and B. Pellom, "Analysis and detection of reading miscues for interactive literacy tutors," in Proc. 20th Int. Conf. Comput. Linguist. (Coling), Geneva, Switzerland, Aug. 2004, Article 1524.
- (2004) Proc. 20th Int. Conf. Comput. Linguist. (Coling)
- Lee, K.¹ Hagen, A.² Romanyshyn, N.³ Martin, S.⁴ Pellom, B.⁵

3
- 67650622233
- Automatic detection and classification of disfluent reading miscues in young children's speech for the purpose of assessment
- Antwerp, Belgium, Aug.
- M. Black, J. Tepperman, S. Lee, P. Price, and S. Narayanan, "Automatic detection and classification of disfluent reading miscues in young children's speech for the purpose of assessment," in Proc. InterSpeech ICSLP, Antwerp, Belgium, Aug. 2007, pp. 206-209.
- (2007) Proc. InterSpeech ICSLP , pp. 206-209
- Black, M.¹ Tepperman, J.² Lee, S.³ Price, P.⁴ Narayanan, S.⁵

4
- 0003798906
- Preliminaries to a Theory of Speech Disfluencies
- Univ. of California, Berkeley
- E. E. Shriberg, "Preliminaries to a Theory of Speech Disfluencies," Ph.D. dissertation, Univ. of California, Berkeley, 1994.
- (1994) Ph.D. dissertation
- Shriberg, E.E.¹

5
- 0029765629
- Statistical language modeling for speech disfluencies
- Atlanta, GA
- A. Stolcke and E. Shriberg, "Statistical language modeling for speech disfluencies," in Proc. ICASSP, Atlanta, GA, 1996, vol.1, pp. 405-408.
- (1996) Proc. ICASSP , vol.1 , pp. 405-408
- Stolcke, A.¹ Shriberg, E.²

6
- 0010125082
- A prosody-only decision-tree model for disfluency detection
- E. Shriberg, R. Bates, and A. Stolcke, "A prosody-only decision-tree model for disfluency detection," in Proc. Eurospeech, 1997, pp. 2383-2386.
- (1997) Proc. Eurospeech , pp. 2383-2386
- Shriberg, E.¹ Bates, R.² Stolcke, A.³

7
- 85128394891
- Automatic detection of sentence boundaries and disfluencies based on recognized words
- A. Stolcke, E. Shriberg, R. Bates, M. Ostendorf, D. Hakkani, M. Plauche, G. Tür, and Y. Lu, "Automatic detection of sentence boundaries and disfluencies based on recognized words," in Proc. ICSLP, 1998, no.5, pp. 2247-2250.
- (1998) Proc. ICSLP , Issue.5 , pp. 2247-2250
- Stolcke, A.¹ Shriberg, E.² Bates, R.³ Ostendorf, M.⁴ Hakkani, D.⁵ Plauche, M.⁶ Tür, G.⁷ Lu, Y.⁸

8
- 85009223733
- Automatic disfluency identification in conversational speech using multiple knowledge sources
- Geneva, Switzerland
- Y. Liu, E. Shriberg, and A. Stolcke, "Automatic disfluency identification in conversational speech using multiple knowledge sources," in Proc. Eurospeech, Geneva, Switzerland, 2003, pp. 957-960.
- (2003) Proc. Eurospeech , pp. 957-960
- Liu, Y.¹ Shriberg, E.² Stolcke, A.³

9
- 0032969462
- Acoustics of children's speech: Developmental changes of temporal and spectral parameters
- Mar.
- S. Lee, A. Potamianos, and S. Narayanan, "Acoustics of children's speech: Developmental changes of temporal and spectral parameters," J. Acoust. Soc. Amer., vol.105, pp. 1455-1468, Mar. 1999.
- (1999) J. Acoust. Soc. Amer. , vol.105 , pp. 1455-1468
- Lee, S.¹ Potamianos, A.² Narayanan, S.³

10
- 0036475971
- Creating conversational interfaces for children
- Feb.
- S. Narayanan and A. Potamianos, "Creating conversational interfaces for children," IEEE Trans. Speech Audio Process., vol.10, no.2, pp. 65-78, Feb. 2002.
- (2002) IEEE Trans. Speech Audio Process , vol.10 , Issue.2 , pp. 65-78
- Narayanan, S.¹ Potamianos, A.²

11
- 0029747582
- A study of speech recognition for children and elderly
- J. Wilpon and C. Jacobsen, "A study of speech recognition for children and elderly," in Proc. ICASSP, 1996, pp. 349-352.
- (1996) Proc. ICASSP , pp. 349-352
- Wilpon, J.¹ Jacobsen, C.²

12
- 0031644298
- Improvements in children's speech recognition performance
- S. Das, D. Nix, and M. Picheny, "Improvements in children's speech recognition performance," in Proc. ICASSP, 1998, pp. 433-436.
- (1998) Proc. ICASSP , pp. 433-436
- Das, S.¹ Nix, D.² Picheny, M.³

13
- 84946707630
- Children's speech recognition with application to interactive books and tutors
- St. Thomas, Virgin Islands, Dec.
- A. Hagen, B. Pellom, and R. Cole, "Children's speech recognition with application to interactive books and tutors," in Proc. IEEE ASRU Workshop, St. Thomas, Virgin Islands, Dec. 2003.
- (2003) Proc. IEEE ASRU Workshop
- Hagen, A.¹ Pellom, B.² Cole, R.³

14
- 85009291880
- An analysis of the causes of increased error rates in children's speech recognition
- Denver, CO
- Q. Li and M. J. Russell, "An analysis of the causes of increased error rates in children's speech recognition," in Proc. ICSLP, Denver, CO, 2002, pp. 2337-2340.
- (2002) Proc. ICSLP , pp. 2337-2340
- Li, Q.¹ Russell, M.J.²

15
- 0038418668
- Designing and evaluating conversational interfaces with animated characters
- J. Cassell, J. Sullivan, S. Prevost, and E. Churchill, Eds., Cambridge, MA: MIT Press
- S. L. Oviatt and B. Adams, "Designing and evaluating conversational interfaces with animated characters," in Embodied Conversational Agents, J. Cassell, J. Sullivan, S. Prevost, and E. Churchill, Eds. Cambridge, MA: MIT Press, 2000, pp. 319-343.
- (2000) Embodied Conversational Agents , pp. 319-343

16
- 4544316886
- A multi-pass linear fold algorithm for sentence boundary detection using prosodic cues
- May
- D. Wang and S. Narayanan, "A multi-pass linear fold algorithm for sentence boundary detection using prosodic cues," in Proc. ICASSP, May 2004, vol.1, pp. 525-528.
- (2004) Proc. ICASSP , vol.1 , pp. 525-528
- Wang, D.¹ Narayanan, S.²

17
- 84982977686
- Catchments, prosody, and discourse
- D. McNeill, F. Quek, K.-E. McCullough, S. Duncan, N. Furuyama, R. Bryll, X.-F. Ma, and R. Ansari, "Catchments, prosody, and discourse," Gesture, vol.1, no.1, pp. 9-33, 2001.
- (2001) Gesture , vol.1 , Issue.1 , pp. 9-33
- McNeill, D.¹ Quek, F.² McCullough, K.-E.³ Duncan, S.⁴ Furuyama, N.⁵ Bryll, R.⁶ Ma, X.-F.⁷ Ansari, R.⁸

18
- 84994124293
- Multimodal human discourse: Gesture and speech
- F. Quek, D. McNeill, R. Bryll, S. Duncan, X.-F. Ma, C. Kirbas, K. E. McCullough, and R. Ansari, "Multimodal human discourse: Gesture and speech," in ACM Trans. Comput.-Human Interaction, 2002, vol.9, no.3, pp. 171-193.
- (2002) ACM Trans. Comput.-Human Interaction , vol.9 , Issue.3 , pp. 171-193
- Quek, F.¹ McNeill, D.² Bryll, R.³ Duncan, S.⁴ Ma, X.-F.⁵ Kirbas, C.⁶ McCullough, K.E.⁷ Ansari, R.⁸

19
- 42949107237
- Interrelation between speech and facial gestures in emotional utterances: A single subject study
- Nov.
- C. Busso and S. Narayanan, "Interrelation between speech and facial gestures in emotional utterances: A single subject study," IEEE Trans. Speech, Audio, Lang. Process., vol.15, no.8, pp. 2331-2347, Nov. 2007.
- (2007) IEEE Trans. Speech, Audio, Lang. Process , vol.15 , Issue.8 , pp. 2331-2347
- Busso, C.¹ Narayanan, S.²

20
- 42949167982
- Rigid head motion in expressive speech animation: Analysis and synthesis
- Mar.
- C. Busso, Z. Deng, M. Grimm, U. Neumann, and S. Narayanan, "Rigid head motion in expressive speech animation: Analysis and synthesis," IEEE Trans. Speech, Audio, Lang. Process., vol.15, no.3, pp. 1075-1086, Mar. 2007.
- (2007) IEEE Trans. Speech, Audio, Lang. Process , vol.15 , Issue.3 , pp. 1075-1086
- Busso, C.¹ Deng, Z.² Grimm, M.³ Neumann, U.⁴ Narayanan, S.⁵

21
- 10244254624
- Analysis of speech and gesture frequency during fluent and hesitant phases in speech
- Orlando, FL, Jul.
- L. Valbonesi, R. Ansari, D. McNeill, F. Quek, S. Duncan, K. McCullough, and R. Bryll, "Analysis of speech and gesture frequency during fluent and hesitant phases in speech," in Proc. 6th Multi-Conf. Syst., Cybern., Inf. (SCI 2002), Orlando, FL, Jul. 14-18, 2002.
- (2002) Proc. 6th Multi-Conf. Syst., Cybern., Inf. (SCI 2002)
- Valbonesi, L.¹ Ansari, R.² McNeill, D.³ Quek, F.⁴ Duncan, S.⁵ McCullough, K.⁶ Bryll, R.⁷

22
- 3042515718
- Non-verbal cues for discourse structure
- Toulouse, France, Jul.
- J. Cassell, Y. I. Nakano, T. W. Bickmore, C. L. Sidner, and C. Rich, "Non-verbal cues for discourse structure," in Proc. 39th Meeting Assoc. Comput. Linguist., Toulouse, France, Jul. 2001, pp. 106-115.
- (2001) Proc. 39th Meeting Assoc. Comput. Linguist. , pp. 106-115
- Cassell, J.¹ Nakano, Y.I.² Bickmore, T.W.³ Sidner, C.L.⁴ Rich, C.⁵

23
- 70350481599
- Gesture patterns during speech repairs
- Denver, CO
- L. Chen, M. Harper, and F. Quek, "Gesture patterns during speech repairs," in Proc. ICSLP, Denver, CO, 2002, pp. 629-632.
- (2002) Proc. ICSLP , pp. 629-632
- Chen, L.¹ Harper, M.² Quek, F.³

24
- 16244416858
- Prosody based audiovisual coanalysis for coverbal gesture recognition
- Apr.
- S. Kettebekov, M. Yeasin, and R. Sharma, "Prosody based audiovisual coanalysis for coverbal gesture recognition," IEEE Trans. Multimedia, vol.7, no.2, pp. 234-242, Apr. 2005.
- (2005) IEEE Trans. Multimedia , vol.7 , Issue.2 , pp. 234-242
- Kettebekov, S.¹ Yeasin, M.² Sharma, R.³

25
- 14944345809
- Multimodal model integration for sentence unit detection
- State College, PA
- L. Chen, Y. Liu, M. Harper, and E. Shriberg, "Multimodal model integration for sentence unit detection," in Proc. ICMI, State College, PA, 2004, pp. 121-128.
- (2004) Proc. ICMI , pp. 121-128
- Chen, L.¹ Liu, Y.² Harper, M.³ Shriberg, E.⁴

26
- 21244500957
- Logistic model trees
- N. Landwehr, M. Hall, and E. Frank, "Logistic model trees," Mach. Learn. J., vol.59, no.1-2, pp. 161-205, 2005.
- (2005) Mach. Learn. J. , vol.59 , Issue.1-2 , pp. 161-205
- Landwehr, N.¹ Hall, M.² Frank, E.³

27
- 85009243632
- Cu animate tools for enabling conversations with animated characters
- J. Ma, J. Yan, and R. Cole, "Cu animate tools for enabling conversations with animated characters," in Proc. ICSLP, 2002, vol.1, pp. 197-200.
- (2002) Proc. ICSLP , vol.1 , pp. 197-200
- Ma, J.¹ Yan, J.² Cole, R.³

28
- 0038120523
- Jul. 20, 2005, (version 4.3.19) [Computer program]
- P. Boersma and D. Weenink, "Praat: Doing Phonetics by Computer," (version 4.3.19) [Computer program], Jul. 20, 2005. [Online]. Available: http://www.praat.org
- Praat: Doing Phonetics by Computer
- Boersma, P.¹ Weenink, D.²

29
- 84859899617
- Anvil-A generic annotation tool for multimodal dialogue
- M. Kipp, "Anvil-A generic annotation tool for multimodal dialogue," in Proc. Eurospeech, 2001, pp. 1367-1370.
- (2001) Proc. Eurospeech , pp. 1367-1370
- Kipp, M.¹

30
- 74049094559
- Analyzing the interplay between spoken language and gestural cues in conversational child-machine interactions in pre/early literate age group
- paper ID 047
- S. Montanari, S. Yildirim, S. Khurana, M. Landes, L. Lawyer, E. Andersen, and S. Narayanan, "Analyzing the interplay between spoken language and gestural cues in conversational child-machine interactions in pre/early literate age group," in Proc. InStil, Jul. 2004, paper ID 047.
- (2004) Proc. InStil, Jul.
- Montanari, S.¹ Yildirim, S.² Khurana, S.³ Landes, M.⁴ Lawyer, L.⁵ Andersen, E.⁶ Narayanan, S.⁷

31
- 85009115741
- Reference marking in children's computer-directed speech: An integrated analysis of discourse and gesture
- Oct.
- S. Montanari, S. Yildirim, E. Andersen, and S. Narayanan, "Reference marking in children's computer-directed speech: An integrated analysis of discourse and gesture," in Proc. ICSLP, Oct. 2004, pp. 1841-1844.
- (2004) Proc. ICSLP , pp. 1841-1844
- Montanari, S.¹ Yildirim, S.² Andersen, E.³ Narayanan, S.⁴

32
- 0029219786
- Predicting spoken disfluencies during human-computer interaction
- S. Oviatt, "Predicting spoken disfluencies during human-computer interaction," Comput. Speech Lang., vol.9, pp. 19-35, 1995.
- (1995) Comput. Speech Lang. , vol.9 , pp. 19-35
- Oviatt, S.¹

33
- 0039100034
- Disfluencies in switchboard
- E. Shriberg, "Disfluencies in switchboard," in Proc. ICSLP, 1996, pp. 11-14.
- (1996) Proc. ICSLP , pp. 11-14
- Shriberg, E.¹

34
- 84891308106
- Srilm-An extensible language modeling toolkit
- A. Stolcke, "Srilm-An extensible language modeling toolkit," in Proc. ICSLP, 2002, vol.2, pp. 901-904.
- (2002) Proc. ICSLP , vol.2 , pp. 901-904
- Stolcke, A.¹

35
- 0028378716
- Performance of optical flow techniques
- J. L. Barron, D. J. Fleet, and S. Beauchemin, "Performance of optical flow techniques," Int. J. Comput. Vision, vol.12, pp. 43-77, 1994.
- (1994) Int. J. Comput. Vision , vol.12 , pp. 43-77
- Barron, J.L.¹ Fleet, D.J.² Beauchemin, S.³

36
- 0029218929
- Recursive filters for optical flow
- Jan.
- D. J. Fleet and K. Langley, "Recursive filters for optical flow," IEEE Trans. Pattern Anal. Mach. Intell., vol.17, no.1, pp. 61-67, Jan. 1995.
- (1995) IEEE Trans. Pattern Anal. Mach. Intell. , vol.17 , Issue.1 , pp. 61-67
- Fleet, D.J.¹ Langley, K.²

37
- 0036379257
- Measuring the structure of dynamic visual signals
- R. A. Peters, C. W. G. Clifford, and C. S. Evans, "Measuring the structure of dynamic visual signals," Animal Behaviour, vol.64, pp. 131-146, 2002.
- (2002) Animal Behaviour , vol.64 , pp. 131-146
- Peters, R.A.¹ Clifford, C.W.G.² Evans, C.S.³

38
- 49149118926
- R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed. New York: Wiley, 2001
- (2001) Pattern Classification, 2nd ed. New York: Wiley
- Duda, R.¹ Hart, P.² Stork, D.³

39
- 0033713738
- Combining multiple classifiers by averaging or by multiplying
- D. Tax, M. van Breukelen, R. Duin, and J. Kittler, "Combining multiple classifiers by averaging or by multiplying," Pattern Recognition, vol.33, pp. 1475-1485, 2000.
- (2000) Pattern Recognition , vol.33 , pp. 1475-1485
- Tax, D.¹ Van Breukelen, M.² Duin, R.³ Kittler, J.⁴

40
- 0037403516
- Measure of diversity in classifier ensembles
- L. Kuncheva and C. Whitaker, "Measure of diversity in classifier ensembles," Mach. Learn., vol.51, pp. 181-207, 2003.
- (2003) Mach. Learn. , vol.51 , pp. 181-207
- Kuncheva, L.¹ Whitaker, C.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.