-
1
-
-
84889324982
-
-
A. Solomonoff, A. Mielke, M. Schmidt, H. Gish, Clustering speakers by their voices, in: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, Seattle, USA, May 1998, pp. 757-760.
-
-
-
-
2
-
-
33745200276
-
-
R. Sinha, S.E. Tranter, M.J.F. Gales, P.C. Woodland, The Cambridge University March 2005 speaker diarisation system, in: Proceedings of the European Conference on Speech Communication and Technology, Lisbon, Portugal, September 2005, pp. 2437-2440.
-
-
-
-
3
-
-
38949205073
-
-
ISO/IEC 15938-4:2001, Multimedia content description interface-part 4: audio, Version 1.0, 2001.
-
-
-
-
4
-
-
0003769779
-
-
Manjunath B.S., Salembier P., and Sikora T. Introduction to MPEG-7: Multimedia Content Description Language (2002), Wiley, West Sussex, England
-
5
-
-
4544361760
-
-
H.G. Kim, T. Sikora, Comparison of MPEG-7 audio spectrum projection features and MFCC applied to speaker recognition, sound classification and audio segmentation, in: Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, Montreal, Canada, May 2004, pp. 925-928.
-
-
-
-
6
-
-
84979955147
-
-
H.G. Kim, T. Sikora, Audio spectrum projection based on several basis decomposition algorithms applied to general sound recognition and audio segmentation, in: Proceedings of the 12th European Signal Processing Conference, Vienna, Austria, September 2004, pp. 1047-1050.
-
-
-
-
7
-
-
34547324377
-
-
M. Kotti, E. Benetos, C. Kotropoulos, Automatic speaker change detection with the Bayesian information criterion using MPEG-7 features and a fusion scheme, in: Proceedings of the 2006 IEEE International Symposium on Circuits and Systems, Kos, Greece, May 2006.
-
-
-
-
8
-
-
34247559206
-
-
M. Kotti, L.G.P.M. Martins, E. Benetos, J.S. Cardoso, C. Kotropoulos, Automatic speaker segmentation using multiple features and distance measures: a comparison of three approaches, in: Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, Toronto, Canada, July 2006, pp. 1101-1104.
-
-
-
-
9
-
-
64149092838
-
-
W.H. Tsai, S.S. Cheng, H.M. Wang, Speaker clustering of speech utterances using a voice characteristic reference space, in: Proceedings of the International Conference on Spoken Language Processing, Jeju Island, Korea, October 2004.
-
-
-
-
10
-
-
4544247119
-
-
D. Liu, F. Kubala, Online speaker clustering, in: Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Montreal, Canada, May 2004, pp. 333-336.
-
-
-
-
11
-
-
84875953283
-
-
S.S. Chen, P.S. Gopalakrishnan, Clustering via the Bayesian information criterion with applications in speech recognition, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, Seattle, USA, May 1998, pp. 645-648.
-
-
-
-
12
-
-
0141809272
-
-
S. Meignier, J.F. Bonastre, S. Igounet, E-HMM approach for learning and adapting sound models for speaker indexing, in: Proceedings of the Odyssey Speaker and Language Recognition Workshop, Crete, Greece, June 2001, pp. 175-180.
-
-
-
-
13
-
-
85009289298
-
-
J. Ajmera, H. Bourlard, I. Lapidot, I. McCowan, Unknown-multiple speaker clustering using HMM, in: Proceedings of the International Conference on Spoken Language Processing, CO, USA, September 2002, pp. 573-576.
-
-
-
-
14
-
-
33745185104
-
-
X. Zhu, C. Barras, S. Meignier, J.-L. Gauvain, Combining speaker identification and BIC for speaker diarization, in: Proceedings of the InterSpeech, Lisbon, Portugal, September 2005, pp. 2441-2444.
-
-
-
-
15
-
-
29044442235
-
Meignier S., Moraru D., Fredouille C., Bonastre J.F., and Besacier L. Step-by-step and integrated approaches in broadcast news speaker diarization. Comput. Speech Language 20 2-3 (April-July 2006) 303-330
-
16
-
-
34047266609
-
Barras C., Zhu X., Meignier S., and Gauvain J.L. Multistage speaker diarization of broadcast news. IEEE Trans. Audio Speech Language Process. 14 5 (September 2006) 1505-1512
-
18
-
-
0031233424
-
Campbell J.P. Speaker recognition: a tutorial. Proc. IEEE 85 9 (September 1997) 1437-1462
-
19
-
-
0034505639
-
-
V. Wan, W.M. Campbell, Support vector machines for speaker verification and identification, in: Proceedings of the Neural Networks for Signal Processing, vol. 10, Sydney, Australia, December 2000, pp. 775-784.
-
-
-
-
21
-
-
34047266379
-
Gales M.J.F., Kim D.Y., Woodland P.C., Chan H.Y., Mrva D., Sinha R., and Tranter S.E. Progress in the CU-HTK broadcast news transcription system. IEEE Trans. Audio Speech Language Process. 14 5 (September 2006) 1513-1525
-
22
-
-
38949191461
-
-
National Institute of Standards and Technology (NIST)-The Segmentation Task: Find the Story Boundaries 〈http://www.nist.gov/speech/tests/tdt/tdt99/presentations/NIST_segmentation/index.htm〉.
-
-
-
-
23
-
-
38949110169
-
-
The Center for Spoken Language Research of the Colorado University (CSLR) 〈http://cslr.colorado.edu/〉.
-
-
-
-
24
-
-
38949163851
-
-
International Computer Science Institute-Speech Research Group Berkeley 〈http://www.icsi.berkeley.edu/groups/speech/〉.
-
-
-
-
25
-
-
38949198872
-
-
Speech Analysis and Interpretation Laboratory (SAIL) at the University of Southern California 〈http://sail.usc.edu/projectsIntro.php〉.
-
-
-
-
26
-
-
38949096946
-
-
International Speech Technology and Research (STAR) Laboratory at Stanford Research Institute (SRI) 〈http://www.speech.sri.com/projects/sieve/〉.
-
-
-
-
27
-
-
38949095530
-
-
Microsoft Audio Projects 〈http://research.microsoft.com/users/llu/Audioprojects.aspx〉.
-
-
-
-
28
-
-
38949093086
-
-
The Institut Dalle Molle d'Intelligence Artificielle Perceptive (IDIAP) Research Institute 〈http://www.idiap.ch/speech_processing.php〉.
-
-
-
-
29
-
-
38949141761
-
-
The Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI) Spoken Language Processing Group 〈http://www.limsi.fr/TLP〉.
-
-
-
-
30
-
-
38949115082
-
-
The Department of Speech, Music and Hearing of the Royal Institute of Technology (KTH) at Stockholm 〈http://www.speech.kth.se〉.
-
-
-
-
31
-
-
38949152506
-
-
The Chair of Computer Science VI, Computer Science Department, Aachen University 〈http://www-i6.informatik.rwth-aachen.de〉.
-
-
-
-
32
-
-
38949134678
-
-
The Infant Speech Segmentation Project at Berkeley University 〈http://www-gse.berkeley.edu/research/completed/InfantSpeech.html〉.
-
-
-
-
33
-
-
38949101288
-
-
Language Science Research Group, Washington University 〈http://lsrg.cs.wustl.edu〉.
-
-
-
-
34
-
-
38949182068
-
-
The University College of London Psychology Speech Group, speech segmentation issues 〈http://www.speech.psychol.ucl.ac.uk〉.
-
-
-
-
36
-
-
0037700756
-
-
L. Lu, H. Zhang, Speaker change detection and tracking in real-time news broadcast analysis, in: Proceedings of the ACM Multimedia 2002, Juan-les-Pins, France, December 2002, pp. 602-610.
-
-
-
-
37
-
-
17444365032
-
Lu L., and Zhang H. Unsupervised speaker segmentation and tracking in real-time audio content analysis. Multimedia Systems 10 4 (April 2005) 332-343
-
38
-
-
38949090534
-
-
A. Tritschler, R. Gopinath, Improved speaker segmentation and segments clustering using the Bayesian information criterion, in: Proceedings of the 6th European Conference on Speech Communication and Technology, Budapest, Hungary, September 1999, pp. 679-682.
-
-
-
-
39
-
-
0034273195
-
Delacourt P., and Wellekens C.J. DISTBIC: a speaker-based segmentation for audio data indexing. Speech Comm. 32 (September 2000) 111-126
-
40
-
-
85009282223
-
-
S. Kwon, S. Narayanan, Speaker change detection using a new weighted distance measure, in: Proceedings of the International Conference on Spoken Language, vol. 4, CO, USA, September 2002, pp. 2537-2540.
-
-
-
-
41
-
-
85143189670
-
-
T. Wu, L. Lu, K. Chen, H. Zhang, UBM-based real-time speaker segmentation for broadcasting news, in: Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, Hong Kong, April 2003, pp. 193-196.
-
-
-
-
42
-
-
85009212151
-
-
S.S. Cheng, H.M. Wang, A sequential metric-based audio segmentation method via the Bayesian information criterion, in: Proceedings of the 8th European Conference on Speech Communication and Technology, Geneva, Switzerland, September 2003, pp. 945-948.
-
-
-
-
44
-
-
33646789869
-
-
H. Kim, D. Elter, T. Sikora, Hybrid speaker-based segmentation system using model-level clustering, in: Proceedings of the 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. I, Philadelphia, USA, March 2005, pp. 745-748.
-
-
-
-
46
-
-
27644599375
-
Kwon S., and Narayanan S. Unsupervised speaker indexing using generic models. IEEE Trans. Speech Audio Process. 13 5 (September 2005) 1004-1013
-
47
-
-
33745000055
-
Wu C.H., Chiu Y.H., Shia C.J., and Lin C.Y. Automatic segmentation and identification of mixed-language speech using delta-BIC and LSA-based GMMs. IEEE Trans. Audio Speech Language Process. 14 1 (January 2006) 266-276
-
48
-
-
38949101287
-
-
T. Wu, L. Lu, K. Chen, H. Zhang, Universal background models for real-time speaker change detection, in: Proceedings of the 9th International Conference on Multimedia Modeling, Tamshui, Taiwan, January 2003, pp. 135-149.
-
-
-
-
49
-
-
4544280424
-
-
S.E. Tranter, K. Yu, G. Evermann, P.C. Woodland, Generating and evaluating segmentations for automatic speech recognition of conversational telephone speech, in: Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, Montreal, Canada, May 2004, pp. 433-477.
-
-
-
-
50
-
-
0141814632
-
-
D. Wang, L. Lu, H.J. Zhang, Speech segmentation without speech recognition, in: Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Hong Kong, April 2003, pp. 468-471.
-
-
-
-
51
-
-
33947127409
-
Wu C.H., and Hsieh C.H. Multiple change-point audio segmentation and classification using an MDL-based Gaussian model. IEEE Trans. Audio Speech Language Process. 14 2 (March 2006) 647-657
-
52
-
-
4544369704
-
-
R. Huang, J.H.L. Hansen, Advances in unsupervised audio segmentation for the broadcast news and NGSW corpora, in: Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, Montreal, Canada, May 2004, pp. 741-744.
-
-
-
-
53
-
-
84889435599
-
-
Kim H.-G., Moreau N., and Sikora T. MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval (2005), Wiley, West Sussex, England
-
54
-
-
0003424145
-
-
Deller J.R., Hansen J.H.L., and Proakis J.G. Discrete-Time Processing of Speech Signals (1999), Wiley, IEEE, New York
-
55
-
-
0004056285
-
-
Huang X.D., Acero A., and Hon H.-W. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development (2001), Pearson Education, Prentice-Hall, Upper Saddle River, NJ
-
56
-
-
0000618817
-
Sondhi M.M. New methods of pitch extraction. IEEE Trans. Audio Electroacoustics 16 2 (June 1968) 262-266
-
57
-
-
0002038020
-
Hess W.J. Pitch and voicing determination. In: Furui S., and Sondhi M.M. (Eds). Advances in Speech Signal Processing (1991), Marcel Dekker Inc., New York
-
58
-
-
33746410556
-
Ververidis D., and Kotropoulos C. Emotional speech recognition: resources, features, and methods. Speech Comm. 48 9 (September 2006) 1162-1181
-
59
-
-
84990950602
-
-
B. Li, Y. Li, C. Wang, C. Zhang, A new efficient pitch-tracking algorithm, in: Proceedings of the 2003 IEEE International Conference on Robotics, Intelligent Systems and Signal Processing, vol. 2, Hunan, China, October 2003, pp. 1102-1107.
-
-
-
-
60
-
-
0033692969
-
-
T. Kemp, M. Schmidt, M. Westphal, A. Waibel, Strategies for automatic segmentation of audio data, in: Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3, Istanbul, Turkey, June 2000, pp. 1423-1426.
-
-
-
-
61
-
-
33646769986
-
-
M. Collet, D. Charlet, F. Bimbot, A correlation metric for speaker tracking using anchor models, in: Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Hong Kong, April 2003, pp. 713-716.
-
-
-
-
62
-
-
0032139769
-
Pellom B.L., and Hansen J.H.L. Automatic segmentation of speech recorded in unknown noisy channel characteristics. Speech Comm. 25 1-3 (August 1998) 97-116
-
63
-
-
0037401304
-
Ajmera J., McCowan I., and Bourlard H. Speech/music segmentation using entropy and dynamism features in a HMM classification framework. Speech Comm. 40 3 (May 2003) 351-363
-
64
-
-
4544303183
-
-
N. Mesgarani, S. Shamma, M. Slaney, Speech discrimination based on multiscale spectro-temporal modulations, in: Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Montreal, Canada, May 2004, pp. 601-604.
-
-
-
-
65
-
-
84863671030
-
-
J.A. Arias, J. Pinquier, R. André-Obrecht, Evaluation of classification techniques for audio indexing, in: Proceedings of the 13th European Signal Processing Conference, Antalya, Turkey, September 2005.
-
-
-
-
66
-
-
33644539859
-
Harb H., and Chen L. Audio-based description and structuring of videos. Internat. J. Digital Libraries 6 1 (February 2006) 70-81
-
67
-
-
0029352294
-
Bimbot F., Magrin-Chagnolleau I., and Mathan L. Second-order statistical measures for text-independent speaker identification. Speech Comm. 17 1-2 (August 1995) 177-192
-
68
-
-
85008020310
-
Hansen J.H.L., Huang R., Zhou B., Seadle M., Deller J.R., Gurijala A.R., Kurimo M., and Angkititrakul P. SpeechFind: advances in spoken document retrieval for a national gallery of the spoken word. IEEE Trans. Speech Audio Process. 13 5 (September 2005) 712-730
-
69
-
-
85143190520
-
-
M. Cettolo, M. Vescovi, Efficient audio segmentation algorithms based on the BIC, in: Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 6, Hong Kong, April 2003, pp. 537-540.
-
-
-
-
70
-
-
85009210477
-
-
M. Vescovi, M. Cettolo, R. Rizzi, A DP algorithm for speaker change detection, in: Proceedings of the 8th European Conference on Speech Communication and Technology, Geneva, Switzerland, September 2003, pp. 2997-3000.
-
-
-
-
71
-
-
38949102855
-
-
Q. Jin, K. Laskowski, T. Schultz, A. Waibel, Speaker segmentation and clustering in meetings, in: Proceedings of the NIST Meeting Recognition Workshop, Montreal, Canada, May 2004, pp. 112-117.
-
-
-
-
72
-
-
10844275417
-
Cettolo M., Vescovi M., and Rizzi R. Evaluation of BIC-based algorithms for audio segmentation. Comput. Speech Language 19 (April 2005) 1004-1013
-
73
-
-
0001011286
-
Campbell N.A. Robust procedures in multivariate analysis I: robust covariance estimation. Appl. Statist. 29 3 (1980) 231-237
-
74
-
-
85009128756
-
-
S. Cheng, H. Wang, Metric SEQDAC: a hybrid approach for audio segmentation, in: Proceedings of the 8th International Conference on Spoken Language Processing, Jeju, Korea, October 2004, pp. 1617-1620.
-
-
-
-
75
-
-
0026400244
-
-
H. Gish, M.H. Siu, R. Rohlicek, Segregation of speakers for speech recognition and speaker identification, in: Proceedings of the 1991 IEEE International Conference on Acoustics, Speech, and Signal Processing, Toronto, Canada, April 1991, pp. 873-876.
-
-
-
-
76
-
-
4544339441
-
-
J. Ajmera, G. Lathoud, I. McCowan, Clustering and segmenting speakers and their locations in meetings, in: Proceedings of the 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Montreal, Canada, May 2004, pp. 605-608.
-
-
-
-
77
-
-
38949206466
-
-
D.P.W. Ellis, J.C. Liu, Speaker turn segmentation based on between-channel differences, in: Proceedings of the NIST Meeting Recognition Workshop, Montreal, Canada, May 2004, pp. 112-117.
-
-
-
-
78
-
-
38949203922
-
-
J. Alabiso, R. MacIntyre, D. Graff, 1997 English Broadcast News Transcripts (HUB4), Linguistic Data Consortium, Philadelphia, 1998.
-
-
-
-
81
-
-
85021249401
-
-
M. Federico, D. Giordani, P. Coletti, Development and evaluation of an Italian broadcast news corpus, in: Proceedings of the 2nd International Conference on Language Resources and Evaluation, Athens, Greece, May-June 2000, pp. 921-924.
-
-
-
-
82
-
-
38949190789
-
-
S. Chen, P. Gopalakrishnan, Speaker, environment and channel change detection and clustering via the Bayesian information criterion, in: Proceedings of the DARPA Broadcast News Transcription Understanding Workshop, Lansdowne, VA, February 1998, pp. 127-132.
-
-
-
-
83
-
-
33745184949
-
Wang H.M., Chen B., Kuo J.W., and Cheng S.S. MATBN: a Mandarin Chinese broadcast news corpus. Comput. Linguistics Chinese Language Process. 10 2 (June 2005) 219-236
-
84
-
-
38949211653
-
-
Graff D. TDT3 Mandarin Audio (2001), Linguistic Data Consortium, Philadelphia
-
85
-
-
0242323752
-
Zhu Y., and Rong X. Unified fusion rules for multisensor multihypothesis network decision systems. IEEE Trans. System Man Cybernet. 33 4 (July 2003) 502-513
-
86
-
-
38949122539
-
-
M. Kotti, E. Benetos, C. Kotropoulos, Computationally efficient and robust BIC-based speaker segmentation, IEEE Trans. Audio Speech Language Process., in revision.
-
-
-
-
87
-
-
38949110862
-
-
The Linguistic Data Consortium 〈http://www.ldc.upenn.edu/〉.
-
-
-
-
88
-
-
35348882681
-
Phonemic segmentation using the generalised Gamma distribution and small sample Bayesian information criterion
-
Almpanidis G., and Kotropoulos C. Phonemic segmentation using the generalised Gamma distribution and small sample Bayesian information criterion. Speech Comm. 50 1 (January 2008) 38-55
-
(2008)
Speech Comm.
, vol.50
, Issue.1
, pp. 38-55
-
-
Almpanidis, G.1
Kotropoulos, C.2
-
89
-
-
33745190484
-
-
W.-H. Tsai, H.-M. Wang, Speaker clustering of unknown utterances based on maximum purity estimation, in: Proceedings of the European Conference on Speech Communication and Technology, Lisbon, Portugal, September 2005.
-
-
-
-
90
-
-
38949136524
-
-
J.-L. Gauvain, L. Lamel, G. Adda, Partitioning and transcription of broadcast news data, in: Proceedings of the International Conference on Spoken Language Processing, Sydney, Australia, December 1998, pp. 1335-1338.
-
-
-
-
92
-
-
84946742526
-
-
J. Ajmera, C. Wooters, A robust speaker clustering algorithm, in: Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, Virgin Islands, November 2003, pp. 411-416.
-
-
-
-
93
-
-
38949104255
-
-
I. Voitovetsky, H. Guterman, A. Cohen, Validity criterion for unsupervised speaker recognition, in: Proceedings of the First Workshop Text, Speech, and Dialogue, Brno, Czech Republic, September 1998, pp. 321-326.
-
-
-
-
94
-
-
0031331636
-
-
I. Voitovetsky, H. Guterman, A. Cohen, Unsupervised speaker classification using self-organizing maps, in: Proceedings of the IEEE Workshop Neural Networks for Signal Processing, Amelia Island, USA, September 1997, pp. 578-587.
-
-
-
-
95
-
-
84864281086
-
-
I. Lapidot, H. Guterman, Resolution limitation in speakers clustering and segmentation problems, in: Proceedings of the 2001: A Speaker Odyssey, The Speaker Recognition Workshop, Chania, Greece, June 18-22, 2001, pp. 169-173.
-
-
-
-
96
-
-
0036650810
-
Unsupervised speaker recognition based on competition between self-organizing maps
-
Lapidot I., Guterman H., and Cohen A. Unsupervised speaker recognition based on competition between self-organizing maps. IEEE Trans. Neural Networks 13 4 (July 2002) 877-887
-
(2002)
IEEE Trans. Neural Networks
, vol.13
, Issue.4
, pp. 877-887
-
-
Lapidot, I.1
Guterman, H.2
Cohen, A.3
-
97
-
-
38949193377
-
-
1998 HUB4 Broadcast News Evaluation English Test Material, Linguistic Data Consortium, Philadelphia, 2000.
-
-
-
-
99
-
-
38949107492
-
-
M. Przybocki, A. Martin, 2001 NIST Speaker Recognition Evaluation Corpus, Linguistic Data Consortium, Philadelphia, 2002.
-
-
-
-
100
-
-
38949099187
-
-
H. Jin, F. Kubala, R. Schwartz, Automatic speaker clustering, in: Proceedings of the Speech Recognition Workshop, Chantilly, Virginia, 1997, pp. 108-111.
-
-
-
-
101
-
-
38949098345
-
-
Linguistic Data Consortium, Philadelphia
-
Fiscus J., Garofolo J., Przybocki M., Fisher W., and Pallett D. 1997 English Broadcast News Speech (HUB4) (1998), Linguistic Data Consortium, Philadelphia
-
(1998)
1997 English Broadcast News Speech (HUB4)
-
-
Fiscus, J.1
Garofolo, J.2
Przybocki, M.3
Fisher, W.4
Pallett, D.5
-
102
-
-
38949156053
-
-
C. Barras, X. Zhu, S. Meignier, J.-L. Gauvain, Improving speaker diarization, in: Proceedings of the Fall Rich Transcription Workshop (RT-04), Palisades, NY, November 2004 [Online]. Available: 〈http://www.limsi.fr/Individu/barras/publis/rt04f_diarization.pdf〉.
-
-
-
-
103
-
-
84863340525
-
-
Linguistic Data Consortium, Philadelphia
-
Graff D., Alabiso J., Fiscus J., Garofolo J., Fisher W., and Pallett D. 1996 English Broadcast News Dev and Eval (HUB4) (1997), Linguistic Data Consortium, Philadelphia
-
(1997)
1996 English Broadcast News Dev and Eval (HUB4)
-
-
Graff, D.1
Alabiso, J.2
Fiscus, J.3
Garofolo, J.4
Fisher, W.5
Pallett, D.6