SCOPUS Information Retrieval Platform

International Journal on Document Analysis and Recognition

Volume 2, Issue 4, 2000, Pages 147-162

Multimedia document retrieval using speech and speaker recognition

(5) Viswanathan, Mahesh a; Beigi, Homayoon S M a; Dharanipragada, Satya a; Maali, Fereydoun b; Tritschler, Alain a

a IBM T J WATSON RESEARCH CENTER (United States)

b Signal Recognition Corporation (United States)

Author keywords

Audio indexing; Speaker recognition; Speaker segmentation; Speech recognition; Spoken document analysis

Indexed keywords

AUDIO INDEXING; MULTIMEDIA CONTENTS; MULTIMEDIA DOCUMENTS; SPEAKER IDENTIFICATION; SPEAKER RECOGNITION; SPEAKER RECOGNITION SYSTEM; SPEAKER SEGMENTATIONS; SPOKEN DOCUMENT;

ABSTRACTING; INDEXING (OF INFORMATION); SPEECH RECOGNITION;

INFORMATION RETRIEVAL;

EID: 33750180399 PISSN: 14332833 EISSN: 14332825 Source Type: Journal
DOI: 10.1007/PL00021522 Document Type: Article

Times cited : (4)

References (29)

1
- 0016355478
- A new look at the statistical model for identification
- [Akaike74]
- [Akaike74] H. Akaike. A new look at the statistical model for identification. IEEE Trans on Autom Control, AC19, 1974, pp. 716-723
- (1974) IEEE Trans on Autom Control , vol.AC19 , pp. 716-723
- Akaike, H.¹

2
- 85079084846
- Robust methods for context-dependent features and models in a continuous speech recognizer
- In, Adelaide, Australia, [Bahl94]
- [Bahl94] L.R. Bahl, P.V. deSouza, P.S. Gopalakrishnan, D. Nahamoo, and M.A. Picheny. Robust methods for context-dependent features and models in a continuous speech recognizer. In: Proc Int Conf on Acoustics, Speech, and Signal Process, Adelaide, Australia, 1994, pp. 533-536
- (1994) Proc Int Conf on Acoustics, Speech, and Signal Process , pp. 533-536
- Bahl, L.R.¹ deSouza, P.V.² Gopalakrishnan, P.S.³ Nahamoo, D.⁴ Picheny, M.A.⁵

3
- 0028996843
- Performance of the IBM Large Vocabulary Continuous Speech Recognition System on the ARPA Wall Street Journal Task
- Detroit, MI, [Bahl95]
- [Bahl95] L.R. Bahl, S. Balakrishnan-Aiyer, J.R. Bellegarda, M. Franz, P.S. Gopalakrishnan, D. Nahamoo, M. Novak, M. Padmanabhan, M.A. Picheny, S. Roukos. Performance of the IBM Large Vocabulary Continuous Speech Recognition System on the ARPA Wall Street Journal Task. Proc Int Conf on Acoustics, Speech, and Signal Process, Detroit, MI, 1995, pp. 41-44
- (1995) Proc Int Conf on Acoustics, Speech, and Signal Process , pp. 41-44
- Bahl, L.R.¹ Balakrishnan-Aiyer, S.² Bellegarda, J.R.³ Franz, M.⁴ Gopalakrishnan, P.S.⁵ Nahamoo, D.⁶ Novak, M.⁷ Padmanabhan, M.⁸ Picheny, M.A.⁹ Roukos, S.¹⁰

4
- 84892185828
- A distance measure between collections of distributions and its application to speaker recognition
- Seattle, Washington, [Beigi98a]
- [Beigi98a] H.S.M. Beigi, S. Maes, J.S. Sorenson. A distance measure between collections of distributions and its application to speaker recognition. Proc Int Conf on Acoustics, Speech, and Signal Process, Seattle, Washington, 1998, pp. 753-756
- (1998) Proc Int Conf on Acoustics, Speech, and Signal Process , pp. 753-756
- Beigi, H.S.M.¹ Maes, S.² Sorenson, J.S.³

5
- 80755126546
- IBM model-based and frame-by-frame speaker-recognition
- Avignon, France, [Beigi98b]
- [Beigi98b] H.S.M. Beigi, S. Maes, U.V. Chaudhari, J.S. Sorenson. IBM model-based and frame-by-frame speaker-recognition. Proc Speaker Recognition and its Commercial and Forensic Appl, Avignon, France, 1998
- (1998) Proc Speaker Recognition and its Commercial and Forensic Appl
- Beigi, H.S.M.¹ Maes, S.² Chaudhari, U.V.³ Sorenson, J.S.⁴

6
- 33748086078
- Multi-environment speaker verification
- Summit, NJ, [Chaudhari99]
- [Chaudhari99] U.V. Chaudhari, H.S.M. Beigi, S.H. Maes, J.S. Sorensen. Multi-environment speaker verification. Proc 2nd Annual Workshop on Autom identification Advanced Technol (Auto ID '99), Summit, NJ, 1999, pp. 19-22
- (1999) Proc 2nd Annual Workshop on Autom identification Advanced Technol (Auto ID '99) , pp. 19-22
- Chaudhari, U.V.¹ Beigi, H.S.M.² Maes, S.H.³ Sorensen, J.S.⁴

7
- 0002595416
- Speaker, environment and channel change detection and clustering via the Bayesian information criterion
- Lansdowne, VA, [Chen98]
- [Chen98] S.S. Chen, P.S. Gopalakrishnan. Speaker, environment and channel change detection and clustering via the Bayesian information criterion. Proc DARPA Broadcast News Transcription and Understanding Workshop, Lansdowne, VA, 1998, pp. 127-132
- (1998) Proc DARPA Broadcast News Transcription and Understanding Workshop , pp. 127-132
- Chen, S.S.¹ Gopalakrishnan, P.S.²

8
- 85135281671
- Speech recognition with automatic punctuation
- Budapest, Hungary, [Chen99]
- [Chen99] C.J. Chen. Speech recognition with automatic punctuation. Proc EuroSpeech99, Budapest, Hungary, 1999, pp. 447-480
- (1999) Proc EuroSpeech99 , pp. 447-480
- Chen, C.J.¹

9
- 85135270151
- Speaker-based segmentation for audio data indexing
- Budapest, Hungary, [Delacourt99]
- [Delacourt99] P. Delacourt, D. Kryze, C.J. Wellekens. Speaker-based segmentation for audio data indexing. Proc EuroSpeech99, Budapest, Hungary, 1999, pp. 1195-1198
- (1999) Proc EuroSpeech99 , pp. 1195-1198
- Delacourt, P.¹ Kryze, D.² Wellekens, C.J.³

10
- 0003323143
- Experimental results in audio-indexing
- Chantilly, VA, [Dharanipragada97]
- [Dharanipragada97] S. Dharanipragada, S. Roukos. Experimental results in audio-indexing. Proc DARPA Speech Recognition Workshop, Chantilly, VA, 1997, pp. 2-5
- (1997) Proc DARPA Speech Recognition Workshop , pp. 2-5
- Dharanipragada, S.¹ Roukos, S.²

11
- 85135252230
- Story Segmentation and Topic Detection for Recognized Speech
- Budapest, Hungary, [Dharanipragada99a]
- [Dharanipragada99a] S. Dharanipragada, M. Franz, J.S. McCarley, S. Roukos, and R.T. Ward. Story Segmentation and Topic Detection for Recognized Speech. Proc EuroSpeech99, Budapest, Hungary, 1999, pp. 2435-2438
- (1999) Proc EuroSpeech99 , pp. 2435-2438
- Dharanipragada, S.¹ Franz, M.² McCarley, J.S.³ Roukos, S.⁴ Ward, R.T.⁵

12
- 0342321399
- Audio Indexing for Broadcast News
- E.M. Voorhees, D.K. Harman (eds.) NIST Special Publication 500-242, [Dharanipragada99b]
- [Dharanipragada99b] S. Dharanipragada, M. Franz, S. Roukos. Audio Indexing for Broadcast News. Proc Seventh Text REtrieval Conference (TREC-7), E.M. Voorhees, D.K. Harman (eds.) NIST Special Publication 500-242, 1999, pp. 115-120
- (1999) Proc Seventh Text REtrieval Conference (TREC-7) , pp. 115-120
- Dharanipragada, S.¹ Franz, M.² Roukos, S.³

13
- 0004072715
- Digital Speech Processing
- Marcel Dekker, [Furui89]
- [Furui89] S. Furui. Digital Speech Processing, Synthesis and Recognition. Marcel Dekker, 1989
- (1989) Synthesis and Recognition
- Furui, S.¹

14
- 0026400244
- Segregation of Speakers for Speech Recognition and Speaker Identification
- Seattle, Washington, [Gish91]
- [Gish91] H. Gish, M. Siu, R. Rohlicek. Segregation of Speakers for Speech Recognition and Speaker Identification. Proc Int Conf on Acoustics, Speech, and Signal Processing, Seattle, Washington, 1991, pp. 873-876
- (1991) Proc Int Conf on Acoustics, Speech, and Signal Processing , pp. 873-876
- Gish, H.¹ Siu, M.² Rohlicek, R.³

15
- 0028516097
- Text-Independent Speaker Identification
- [Gish94]
- [Gish94] H. Gish, M. Schmidt. Text-Independent Speaker Identification. IEEE Signal Processing, Volume 11, Number 4, 1994, pp. 18-32
- (1994) IEEE Signal Processing , vol.11 , Issue.4 , pp. 18-32
- Gish, H.¹ Schmidt, M.²

16
- 0028996969
- A Tree Search Strategy for Large Vocabulary Continuous Speech Recognition
- Detroit, Michigan, [Gopalakrishnan95]
- [Gopalakrishnan95] P.S. Gopalakrishnan, L.R. Bahl, R. Mercer. A Tree Search Strategy for Large Vocabulary Continuous Speech Recognition. Proc Int Conf on Acoustics, Speech, and Signal Processing, Detroit, Michigan, 1995, pp. 572-575
- (1995) Proc Int Conf on Acoustics, Speech, and Signal Processing , pp. 572-575
- Gopalakrishnan, P.S.¹ Bahl, L.R.² Mercer, R.³

17
- 84892706143
- Pushing Streaming Video-Indexing Video Archives
- October-December, [Grosky97]
- [Grosky97] W. Grosky. Pushing Streaming Video-Indexing Video Archives. IEEE Multimedia, October-December 1997, pp. 7-8
- (1997) IEEE Multimedia , pp. 7-8
- Grosky, W.¹

18
- 0039141124
- Finding Information in Audio: A New Paradigm for Audio Browsing and Retrieval
- Cambridge, United Kingdom, April, [Hirschberg99]
- [Hirschberg99] J. Hirschberg, S. Whittaker, D. Hindle, F. Periera, A. Singhal. Finding Information in Audio: A New Paradigm for Audio Browsing and Retrieval. Proc ESCA Tutorial and Research Workshop, Cambridge, United Kingdom, April 1999
- (1999) Proc ESCA Tutorial and Research Workshop
- Hirschberg, J.¹ Whittaker, S.² Hindle, D.³ Periera, F.⁴ Singhal, A.⁵

19
- 85119434191
- Fast Speaker Change Detection for Broadcast News Transcription and Indexing
- Budapest, Hungary, [Liu99]
- [Liu99] D. Liu, F. Kubala. Fast Speaker Change Detection for Broadcast News Transcription and Indexing. Proc EuroSpeech99, Budapest, Hungary, 1999, pp. 1031-1034
- (1999) Proc EuroSpeech99 , pp. 1031-1034
- Liu, D.¹ Kubala, F.²

20
- 0006317638
- Overview of the 1997 DARPA Speech Recognition Workshop
- Chantilly, Virginia, [Pallet97]
- [Pallet97] D. Pallet. Overview of the 1997 DARPA Speech Recognition Workshop. Proc DARPA Speech Recognition Workshop, Chantilly, Virginia, 1997, pp. 1-2
- (1997) Proc DARPA Speech Recognition Workshop , pp. 1-2
- Pallet, D.¹

21
- 0004244302
- Prentice-Hall, [Rabiner93]
- [Rabiner93] L. Rabiner, B.H. Juang, Fundamentals of Speech Recognition, Prentice-Hall, 1993
- (1993) Fundamentals of Speech Recognition
- Rabiner, L.¹ Juang, B.H.²

22
- 0001319911
- Okapi at TREC3
- D.K. Harman, editor, NIST Special Publication 500-226, Gaithersburg, Maryland, [Robertson95]
- [Robertson95] S.E. Robertson, S. Walker, K. Sparck-Jones, M.M. Hancock-Beaulieu, M. Gatford. Okapi at TREC3. Proc Third Text REtrieval Conference (TREC-3) D.K. Harman, editor, NIST Special Publication 500-226, Gaithersburg, Maryland, 1995, pp. 109-126
- (1995) Proc Third Text REtrieval Conference (TREC-3) , pp. 109-126
- Robertson, S.E.¹ Walker, S.² Sparck-Jones, K.³ Hancock-Beaulieu, M.M.⁴ Gatford, M.⁵

23
- 0001941052
- In, Marcel Dekker, [Rosenberg92], S. Furui, M.M. Sondhi (eds.)
- [Rosenberg92] A.E. Rosenberg, F.K. Soong. In: Advances in Speech Signal Processing, S. Furui, M.M. Sondhi (eds.) Marcel Dekker, 1992, pp. 701-738
- (1992) Advances in Speech Signal Processing , pp. 701-738
- Rosenberg, A.E.¹ Soong, F.K.²

24
- 0003882234
- AddisonWesley, [Salton89]
- [Salton89] G. Salton. Automatic Text Processing. AddisonWesley, 1989
- (1989) Automatic Text Processing
- Salton, G.¹

25
- 0032660827
- Name-It: Naming and Detecting Faces in News Videos
- January-March, [Satoh99]
- [Satoh99] S. Satoh, Y. Nakamura, T. Kanade. Name-It: Naming and Detecting Faces in News Videos. IEEE Multimedia, Volume 6, Number 1, January-March 1999, pp. 22-35
- (1999) IEEE Multimedia , vol.6 , Issue.1 , pp. 22-35
- Satoh, S.¹ Nakamura, Y.² Kanade, T.³

26
- 84892698756
- The CueVideo Spoken Media Retrieval System
- 95018, 10143, April 22 [Srinivasan99]
- [Srinivasan99] S. Srinivasan, D. Petkovic, D. Ponceleon, M. Viswanathan. The CueVideo Spoken Media Retrieval System. IBM Research Report RJ 10143 (95018), April 22, 1999
- (1999) IBM Research Report RJ
- Srinivasan, S.¹ Petkovic, D.² Ponceleon, D.³ Viswanathan, M.⁴

27
- 78650540904
- Improved Speaker Segmentation and Segments Clustering Using the Bayesian Information Criterion
- Budapest, Hungary, [Tritschler99]
- [Tritschler99] A. Tritschler, R.A. Gopinath. Improved Speaker Segmentation and Segments Clustering Using the Bayesian Information Criterion. Proc EuroSpeech99, Budapest, Hungary, 1999, pp. 679-682
- (1999) Proc EuroSpeech99 , pp. 679-682
- Tritschler, A.¹ Gopinath, R.A.²

28
- 0002415750
- Retrieval from Spoken Documents Using Content And Speaker Information
- Bangalore, India, [Viswanathan99]
- [Viswanathan99] M. Viswanathan, H.S.M. Beigi, S. Dharanipragada, A. Tritschler. Retrieval from Spoken Documents Using Content And Speaker Information. Proc Int Conf on Document Analysis and Retrieval (ICDAR99), Bangalore, India, 1999, pp. 567-572
- (1999) Proc Int Conf on Document Analysis and Retrieval (ICDAR99) , pp. 567-572
- Viswanathan, M.¹ Beigi, H.S.M.² Dharanipragada, S.³ Tritschler, A.⁴

29
- 0030242072
- Content-based Classification, Search, and Retrieval of Audio
- Fall, [Wold96]
- [Wold96] E. Wold, T. Blum, D. Keislar. Content-based Classification, Search, and Retrieval of Audio. IEEE Multimedia, Volume 3, Number 3, Fall 1996, pp. 27-36
- (1996) IEEE Multimedia , vol.3 , Issue.3 , pp. 27-36
- Wold, E.¹ Blum, T.² Keislar, D.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.