SCOPUS Information Retrieval Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2013, Pages 2929-2933

Robust audio-codebooks for Large-Scale Event Detection in consumer videos

(6) Rawat, Shourabh (a); Schulam, Peter F. (a); Burger, Susanne (a); Ding, Duo (a); Wang, Yipei (a); Metze, Florian (a)

(a) Carnegie Mellon University (United States)

Author keywords

Audio codebook models; Multimedia Event Detection; Video retrieval

Indexed keywords

SPEECH RECOGNITION; STATISTICS;

BAG-OF-WORDS MODELS; CLUSTERING APPROACH; COMPACT REPRESENTATION; LATENT DIRICHLET ALLOCATIONS; MULTIMEDIA EVENT DETECTIONS; OVERALL EFFECTIVENESS; TEMPORAL INFORMATION; VIDEO RETRIEVAL;

IMAGE RETRIEVAL;

EID: 84906214187 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (20)

References (21)

1
- 84906223343
- "TRECVID multimedia event detection, evaluation 2012," 2012, http://www.nist.gov/itl/iad/mig/med12.cfm.
- (2012) TRECVID Multimedia Event Detection, Evaluation 2012

2
- 84906258497
- Manual annotation of environmental noise in audio streams
- Pittsburgh
- S. Burger, Q. Jin, P. Schulam, and F. Metze, "Manual annotation of environmental noise in audio streams," Carnegie Mellon University, Language Technologies Institute, Technical Report CMU-LTI-12-017, Pittsburgh, 2012.
- (2012) Carnegie Mellon University, Language Technologies Institute, Technical Report CMU-LTI-12-017
- Burger, S.¹ Jin, Q.² Schulam, P.³ Metze, F.⁴

3
- 84905270442
- IBM Research and Columbia University TRECVID-2011 multimedia event detection (MED) system
- L. Cao, S. Chang, N. Codella, C. Cotton, D. Ellis, L. Gong, M. Hill, G. Hua, J. Kender, M. Merler et al., "IBM Research and Columbia University TRECVID-2011 multimedia event detection (MED) system," TRECVID Multimedia Event Detection Task (MED), 2011.
- (2011) TRECVID Multimedia Event Detection Task (MED)
- Cao, L.¹ Chang, S.² Codella, N.³ Cotton, C.⁴ Ellis, D.⁵ Gong, L.⁶ Hill, M.⁷ Hua, G.⁸ Kender, J.⁹ Merler, M.¹⁰

4
- 84878551166
- Event-based video retrieval using audio
- Q. Jin, P. Schulam, S. Rawat, S. Burger, D. Ding, and F. Metze, "Event-based video retrieval using audio," in Proc. of Interspeech, 2012.
- (2012) Proc. of Interspeech
- Jin, Q.¹ Schulam, P.² Rawat, S.³ Burger, S.⁴ Ding, D.⁵ Metze, F.⁶

5
- 84878606595
- Bag-of-audio-words approach for multimedia event classification
- S. Pancoast and M. Akbacak, "Bag-of-audio-words approach for multimedia event classification," in Proc. of Interspeech, 2012.
- (2012) Proc. of Interspeech
- Pancoast, S.¹ Akbacak, M.²

6
- 33645887246
- Support vector machines using GMM supervectors for speaker verification
- W. Campbell, D. Sturim, and D. Reynolds, "Support vector machines using GMM supervectors for speaker verification," Signal Processing Letters, IEEE, vol. 13, no. 5, pp. 308-311, 2006.
- (2006) Signal Processing Letters, IEEE , vol.13 , Issue.5 , pp. 308-311
- Campbell, W.¹ Sturim, D.² Reynolds, D.³

7
- 84878580398
- Compact audio representation for event detection in consumer media
- X. Zhuang, S. Tsakalidis, S. Wu, P. Natarajan, R. Prasad, and P. Natarajan, "Compact audio representation for event detection in consumer media," in Proc. of Interspeech, 2012.
- (2012) Proc. of Interspeech
- Zhuang, X.¹ Tsakalidis, S.² Wu, S.³ Natarajan, P.⁴ Prasad, R.⁵ Natarajan, P.⁶

8
- 84878582006
- Consumer-level multimedia event detection through unsupervised audio signal modeling
- B. Byun, I. Kim, S. M. Siniscalchi, and C.-H. Lee, "Consumer-level multimedia event detection through unsupervised audio signal modeling," in Proc. of Interspeech, 2012.
- (2012) Proc. of Interspeech
- Byun, B.¹ Kim, I.² Siniscalchi, S.M.³ Lee, C.-H.⁴

9
- 84865744986
- Unsupervised learning of acoustic unit descriptors for audio content representation and classification
- S. Chaudhuri, M. Harvilla, and B. Raj, "Unsupervised learning of acoustic unit descriptors for audio content representation and classification," in Proc. of Interspeech, 2011.
- (2011) Proc. of Interspeech
- Chaudhuri, S.¹ Harvilla, M.² Raj, B.³

10
- 77952671498
- Visual word ambiguity
- J. van Gemert, C. Veenman, A. Smeulders, and J. Geusebroek, "Visual word ambiguity," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 32, no. 7, pp. 1271-1283, 2010.
- (2010) Pattern Analysis and Machine Intelligence, IEEE Transactions on , vol.32 , Issue.7 , pp. 1271-1283
- Van Gemert, J.¹ Veenman, C.² Smeulders, A.³ Geusebroek, J.⁴

11
- 77249101259
- Comparing compact codebooks for visual categorization
- J. van Gemert, C. Snoek, C. Veenman, A. Smeulders, and J. Geusebroek, "Comparing compact codebooks for visual categorization," Computer Vision and Image Understanding, vol. 114, no. 4, pp. 450-462, 2010.
- (2010) Computer Vision and Image Understanding , vol.114 , Issue.4 , pp. 450-462
- Van Gemert, J.¹ Snoek, C.² Veenman, C.³ Smeulders, A.⁴ Geusebroek, J.⁵

12
- 37849036011
- Evaluating bag-of-visual-words representations in scene classification
- ACM
- J. Yang, Y. Jiang, A. Hauptmann, and C. Ngo, "Evaluating bag-of-visual-words representations in scene classification," in Proceedings of the international workshop on Workshop on multimedia information retrieval. ACM, 2007, pp. 197-206.
- (2007) Proceedings of the International Workshop on Workshop on Multimedia Information Retrieval. , pp. 197-206
- Yang, J.¹ Jiang, Y.² Hauptmann, A.³ Ngo, C.⁴

13
- 33745934686
- Creating efficient codebooks for visual recognition
- IEEE
- F. Jurie and B. Triggs, "Creating efficient codebooks for visual recognition," in Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, vol. 1. IEEE, 2005, pp. 604-610.
- (2005) Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on , vol.1 , pp. 604-610
- Jurie, F.¹ Triggs, B.²

14
- 77955746721
- Audio-based semantic concept classification for consumer video
- K. Lee and D. Ellis, "Audio-based semantic concept classification for consumer video," Audio, Speech, and Language Processing, IEEE Transactions on, vol. 18, no. 6, pp. 1406-1416, 2010.
- (2010) Audio, Speech, and Language Processing, IEEE Transactions on , vol.18 , Issue.6 , pp. 1406-1416
- Lee, K.¹ Ellis, D.²

15
- 0030369274
- Inclusion of temporal information into features for speech recognition
- IEEE
- B. Milner, "Inclusion of temporal information into features for speech recognition," in Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on, vol. 1. IEEE, 1996, pp. 256-259.
- (1996) Spoken Language, 1996 ICSLP 96 Proceedings, Fourth International Conference on , vol.1 , pp. 256-259
- Milner, B.¹

16
- 84864146684
- Temporal pooling and multiscale learning for automatic annotation and ranking of music audio
- P. Hamel, S. Lemieux, Y. Bengio, and D. Eck, "Temporal pooling and multiscale learning for automatic annotation and ranking of music audio," in Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR11), 2011.
- (2011) Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR11)
- Hamel, P.¹ Lemieux, S.² Bengio, Y.³ Eck, D.⁴

17
- 84867599761
- Semi-supervised learning helps in sound event classification
- IEEE
- Z. Zhang and B. Schuller, "Semi-supervised learning helps in sound event classification," in Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on. IEEE, 2012, pp. 333-336.
- (2012) Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on , pp. 333-336
- Zhang, Z.¹ Schuller, B.²

18
- 78650977476
- Opensmile: The Munich versatile and fast open-source audio feature extractor
- ACM
- F. Eyben, M. Wollmer, and B. Schuller, "Opensmile: The Munich versatile and fast open-source audio feature extractor," in Proceedings of the international conference on Multimedia. ACM, 2010, pp. 1459-1462.
- (2010) Proceedings of the International Conference on Multimedia , pp. 1459-1462
- Eyben, F.¹ Wollmer, M.² Schuller, B.³

19
- 78650898296
- An n-gram model for unstructured audio signals toward information retrieval
- IEEE
- S. Kim, S. Sundaram, P. Georgiou, and S. Narayanan, "An n-gram model for unstructured audio signals toward information retrieval," in Multimedia Signal Processing (MMSP), 2010 IEEE International Workshop on. IEEE, 2010, pp. 477-480.
- (2010) Multimedia Signal Processing (MMSP), 2010 IEEE International Workshop on , pp. 477-480
- Kim, S.¹ Sundaram, S.² Georgiou, P.³ Narayanan, S.⁴

20
- 0141607824
- Latent Dirichlet allocation
- D. Blei, A. Ng, and M. Jordan, "Latent Dirichlet allocation," The Journal of Machine Learning Research, vol. 3, pp. 993-1022, 2003.
- (2003) The Journal of Machine Learning Research , vol.3 , pp. 993-1022
- Blei, D.¹ Ng, A.² Jordan, M.³

21
- 1542370010
- A. K. McCallum, "Mallet: A machine learning for language toolkit," 2002, http://mallet.cs.umass.edu.
- (2002) Mallet: A Machine Learning for Language Toolkit
- McCallum, A.K.¹

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.