-
1
-
-
33646380923
-
Approaches and applications of speaker diarization
-
D.A. Reynolds and P.A. Torres-Carrasquillo, "Approaches and Applications of Speaker Diarization," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, pp. 953-956, 2010.
-
(2010)
Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing
, pp. 953-956
-
-
Reynolds, D.A.1
Torres-Carrasquillo, P.A.2
-
2
-
-
47749152568
-
The rich transcription 2007 meeting recognition evaluation
-
Springer-Verlag
-
J.G. Fiscus, J. Ajot, and J.S. Garofolo, "The Rich Transcription 2007 Meeting Recognition Evaluation," Multimodal Technologies for Perception of Humans, pp. 373-389, Springer-Verlag, 2008.
-
(2008)
Multimodal Technologies for Perception of Humans
, pp. 373-389
-
-
Fiscus, J.G.1
Ajot, J.2
Garofolo, J.S.3
-
3
-
-
0002606824
-
Transcription of broadcast news shows with the IBM large vocabulary speech recognition system
-
R. Bakis, S. Chen, P. Gopalakrishnan, R. Gopinath, L. Polymenakos, and M. Franz, "Transcription of Broadcast News Shows with the IBM Large Vocabulary Speech Recognition System," Proc. Speech Recognition Workshop, pp. 67-72, 1997.
-
(1997)
Proc. Speech Recognition Workshop
, pp. 67-72
-
-
Bakis, R.1
Chen, S.2
Gopalakrishnan, P.3
Gopinath, R.4
Polymenakos, L.5
Franz, M.6
-
4
-
-
85128356454
-
Partitioning and transcription of broadcast news data
-
J.-L. Gauvain, L. Lamel, and G. Adda, "Partitioning and Transcription of Broadcast News Data," Proc. Int'l Conf. Spoken Language Processing, pp. 1335-1338, 1998.
-
(1998)
Proc. Int'l Conf. Spoken Language Processing
, pp. 1335-1338
-
-
Gauvain, J.-L.1
Lamel, L.2
Adda, G.3
-
6
-
-
33745196256
-
Spectral cross-correlation features for audio indexing of broadcast news and meetings
-
9th European Conference on Speech Communication and Technology, Eurospeech Interspeech
-
M. Yamaguchi, M. Yamashita, and S. Matsunaga, "Spectral Cross-Correlation Features for Audio Indexing of Broadcast News and Meetings," Proc. Ninth European Conf. Speech Comm. and Technology, pp. 613-616, 2005.
-
(2005)
9th European Conference on Speech Communication and Technology
, pp. 613-616
-
-
Yamaguchi, M.1
Yamashita, M.2
Matsunaga, S.3
-
7
-
-
0034273195
-
DISTBIC: A speaker-based segmentation for audio data indexing
-
P. Delacourt, D. Kryze, and C.J. Wellekens, "DISTBIC: A Speaker-Based Segmentation for Audio Data Indexing," Speech Comm., vol. 32, pp. 111-126, 2000.
-
(2000)
Speech Comm.
, vol.32
, pp. 111-126
-
-
Delacourt, P.1
Kryze, D.2
Wellekens, C.J.3
-
8
-
-
84875953283
-
Clustering via the Bayesian information criterion with applications in speech recognition
-
S.S. Chen and P. Gopalakrishnan, "Clustering via the Bayesian Information Criterion with Applications in Speech Recognition," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, vol. 2, pp. 645-648, 1998.
-
(1998)
Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing
, vol.2
, pp. 645-648
-
-
Chen, S.S.1
Gopalakrishnan, P.2
-
9
-
-
0034857759
-
Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition
-
K. Mori and S. Nakagawa, "Speaker Change Detection and Speaker Clustering Using VQ Distortion for Broadcast News Speech Recognition," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, vol. 1, pp. 413-416, 2001.
-
(2001)
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
, vol.1
, pp. 413-416
-
-
Mori, K.1
Nakagawa, S.2
-
10
-
-
85009142161
-
A novel method for two-speaker segmentation
-
R. Gangadharaiah, B. Narayanaswamy, and N. Balakrishnan, "A Novel Method for Two-Speaker Segmentation," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, 2004.
-
(2004)
Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing
-
-
Gangadharaiah, R.1
Narayanaswamy, B.2
Balakrishnan, N.3
-
13
-
-
24644451644
-
Pixels that sound
-
E. Kidron, Y.Y. Schechner, and M. Elad, "Pixels That Sound," Proc. IEEE CS Conf. Computer Vision and Pattern Recognition, pp. 88-95, 2005.
-
(2005)
Proc. IEEE CS Conf. Computer Vision and Pattern Recognition
, pp. 88-95
-
-
Kidron, E.1
Schechner, Y.Y.2
Elad, M.3
-
14
-
-
84899028297
-
Using audio-visual synchrony to locate sounds
-
MIT Press
-
J. Hershey and J. Movellan, "Using Audio-Visual Synchrony to Locate Sounds," Advances in Neural Information Processing Systems, vol. 12, pp. 813-819, MIT Press, 1999.
-
(1999)
Advances in Neural Information Processing Systems
, vol.12
, pp. 813-819
-
-
Hershey, J.1
Movellan, J.2
-
15
-
-
84947720445
-
Audiovisual segmentation and the cocktail party effect
-
T. Darrell, J.W. Fisher III, and P. Viola, "Audiovisual Segmentation and the Cocktail Party Effect," Proc. Int'l Conf. Multimodal Interfaces, pp. 32-40, 2000.
-
(2000)
Proc. Int'l Conf. Multimodal Interfaces
, pp. 32-40
-
-
Darrell, T.1
Fisher III, J.W.2
Viola, P.3
-
16
-
-
4243096131
-
Multimodal processing by finding common cause
-
H.J. Nock, G. Iyengar, and C. Neti, "Multimodal Processing by Finding Common Cause," Comm. ACM, vol. 47, no. 1, pp. 51-56, 2004.
-
(2004)
Comm. ACM
, vol.47
, Issue.1
, pp. 51-56
-
-
Nock, H.J.1
Iyengar, G.2
Neti, C.3
-
17
-
-
84908470296
-
Audio-visual synchrony for detection of monologues in video archives
-
G. Iyengar, H.J. Nock, and C. Neti, "Audio-Visual Synchrony for Detection of Monologues in Video Archives," Proc. IEEE Int'l Conf. Multimedia and Expo, pp. 329-332, 2003.
-
(2003)
Proc. IEEE Int'l Conf. Multimedia and Expo
, pp. 329-332
-
-
Iyengar, G.1
Nock, H.J.2
Neti, C.3
-
19
-
-
0038715064
-
Distributed meetings: A meeting capture and broadcasting system
-
R. Cutler, Y. Rui, A. Gupta, J. Cadiz, I. Tashev, L.-W. He, A. Colburn, Z. Zhang, Z. Liu, and S. Silverberg, "Distributed Meetings: A Meeting Capture and Broadcasting System," Proc. 10th ACM Int'l Conf. Multimedia, 2002.
-
(2002)
Proc. 10th ACM Int'l Conf. Multimedia
-
-
Cutler, R.1
Rui, Y.2
Gupta, A.3
Cadiz, J.4
Tashev, I.5
He, L.-W.6
Colburn, A.7
Zhang, Z.8
Liu, Z.9
Silverberg, S.10
-
20
-
-
21244492850
-
Real-time speaker tracking using particle filter sensor fusion
-
DOI 10.1109/JPROC.2003.823146, Sequential State Estimation: From Kalman Filters to Particles Filters
-
Y. Chen and Y. Rui, "Real-Time Speaker Tracking Using Particle Filter Sensor Fusion," Proc. IEEE, vol. 92, no. 3, pp. 485-494, Mar. 2004.
-
(2004)
Proceedings of the IEEE
, vol.92
, Issue.3
, pp. 485-494
-
-
Chen, Y.1
Rui, Y.2
-
21
-
-
4544347587
-
Multiple person and speaker activity tracking with a particle filter
-
May
-
N. Checka, K. Wilson, M. Siracusa, and T. Darrell, "Multiple Person and Speaker Activity Tracking with a Particle Filter," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, May 2004.
-
(2004)
Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing
-
-
Checka, N.1
Wilson, K.2
Siracusa, M.3
Darrell, T.4
-
22
-
-
81855192139
-
-
IDIAP-RR 66, IDIAP
-
D. Gatica-Perez, G. Lathoud, J.-M. Odobez, and I. McCowan, "Multimodal Multispeaker Probabilistic Tracking in Meetings," IDIAP-RR 66, IDIAP, 2004.
-
(2004)
Multimodal Multispeaker Probabilistic Tracking in Meetings
-
-
Gatica-Perez, D.1
Lathoud, G.2
Odobez, J.-M.3
McCowan, I.4
-
25
-
-
0024610919
-
A tutorial on hidden Markov models and selected applications in speech recognition
-
Feb.
-
L.R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
-
(1989)
Proc. IEEE
, vol.77
, Issue.2
, pp. 257-286
-
-
Rabiner, L.R.1
-
26
-
-
33644532974
-
Robust real-time object detection
-
P. Viola and M. Jones, "Robust Real-Time Object Detection," Proc. Second Int'l Workshop Statistical and Computational Theories of Vision-Modeling, Learning, Computing, and Sampling, 2001.
-
(2001)
Proc. Second Int'l Workshop Statistical and Computational Theories of Vision-Modeling, Learning, Computing, and Sampling
-
-
Viola, P.1
Jones, M.2
-
27
-
-
24644524200
-
Visual categorization with bags of keypoints
-
C. Dance, J. Willamowski, L. Fan, C. Bray, and G. Csurka, "Visual Categorization with Bags of Keypoints," Proc. European Conf. Computer Vision Int'l Workshop Statistical Learning in Computer Vision, 2004.
-
(2004)
Proc. European Conf. Computer Vision Int'l Workshop Statistical Learning in Computer Vision
-
-
Dance, C.1
Willamowski, J.2
Fan, L.3
Bray, C.4
Csurka, G.5
-
28
-
-
0009622481
-
Learning joint statistical models for audio-visual fusion and segregation
-
MIT Press
-
T. Darrell, J.W. Fisher III, W.T. Freeman, and P. Viola, "Learning Joint Statistical Models for Audio-Visual Fusion and Segregation," Advances in Neural Information Processing Systems, vol. 13, pp. 772-778, MIT Press, 2000.
-
(2000)
Advances in Neural Information Processing Systems
, vol.13
, pp. 772-778
-
-
Darrell, T.1
Fisher III, J.W.2
Freeman, W.T.3
Viola, P.4
-
29
-
-
0001185873
-
An essay towards solving a problem in the doctrine of chances
-
T. Bayes, "An Essay Towards Solving a Problem in the Doctrine of Chances," Philosophical Trans. Royal Soc., vol. 53, pp. 370-418, 1763.
-
(1763)
Philosophical Trans. Royal Soc.
, vol.53
, pp. 370-418
-
-
Bayes, T.1
-
31
-
-
34547231084
-
EM detection of common origin of multi-modal cues
-
DOI 10.1145/1180995.1181037, ICMI'06: 8th International Conference on Multimodal Interfaces, Conference Proceedings
-
A.K. Noulas and B.J.A. Kröse, "EM Detection of Common Origin of Multi-Modal Cues," Proc. Int'l Conf. Multimodal Interfaces, pp. 201-208, 2006.
-
(2006)
ICMI'06: 8th International Conference on Multimodal Interfaces, Conference Proceedings
, pp. 201-208
-
-
Noulas, A.K.1
Kröse, B.J.A.2
-
34
-
-
58049136519
-
Announcing the AMI meeting corpus
-
Jan.-Mar.
-
J. Carletta, "Announcing the AMI Meeting Corpus," The ELRA Newsletter, vol. 1, no. 1, pp. 3-5, Jan.-Mar. 2006.
-
(2006)
The ELRA Newsletter
, vol.1
, Issue.1
, pp. 3-5
-
-
Carletta, J.1
-
36
-
-
34548346846
-
Automatic cluster complexity and quantity selection: Towards robust speaker diarization
-
X. Anguera, C. Wooters, and J. Hernando, "Automatic Cluster Complexity and Quantity Selection: Towards Robust Speaker Diarization," Proc. Third Joint Workshop Multimodal Interaction and Related Machine Learning Algorithms, pp. 248-256, 2006.
-
(2006)
Proc. Third Joint Workshop Multimodal Interaction and Related Machine Learning Algorithms
, pp. 248-256
-
-
Anguera, X.1
Wooters, C.2
Hernando, J.3