SCOPUS Information Retrieval Platform

Volume 25, Issue 1, 2014, Pages 49-69

Multimedia event detection with multimodal feature fusion and temporal concept localization

Authors (10): Oh, Sangmin (a); McCloskey, Scott (b); Kim, Ilseo (a); Vahdat, Arash (c); Cannons, Kevin J (c); Hajimirsadeghi, Hossein (c); Mori, Greg (c); Perera, A G Amitha (a); Pandey, Megha (a); Corso, Jason J (d)

a KITWARE INC (United States)

b HONEYWELL TECHNOLOGY CENTER (United States)

c SIMON FRASER UNIVERSITY (Canada)

d THE STATE UNIVERSITY OF NEW YORK (United States)

Author keywords

Classification; Fusion; Machine learning; Multimedia

Indexed keywords

EID: 84894902895
PISSN: 09328092
EISSN: 14321769
Source Type: Journal
DOI: 10.1007/s00138-013-0525-x
Document Type: Article

Times cited: 41

References (60)

1
- 84894900796
- http://www.lscom.org/

2
- 84894901277
- TRECVID 2011 Multimedia Event Detection Evaluation Plan Version 3.0. http://www.nist.gov/itl/iad/mig/upload/MED11-EvalPlan-V03-20110801a.pdf
- TRECVID 2011 Multimedia Event Detection Evaluation Plan Version 3.0

3
- 14344252374
- Multiple kernel learning, conic duality, and the smo algorithm
- Bach, F.R.; Lanckriet, G.R.G.; Jordan, M.I.: Multiple kernel learning, conic duality, and the smo algorithm. In: ICML (2004)
- (2004) ICML
- Bach, F.R.¹ Lanckriet, G.R.G.² Jordan, M.I.³

4
- 78650994996
- Explicit and implicit concept-based video retrieval with bipartite graph propagation model
- Bao, L.; Cao, J.; Zhang, Y.; Li, J.; yu Chen, M.; Hauptmann, A.G.: Explicit and implicit concept-based video retrieval with bipartite graph propagation model. In: ACM Multimedia (2010)
- (2010) ACM Multimedia
- Bao, L.¹ Cao, J.² Zhang, Y.³ Li, J.⁴ Yu Chen, M.⁵ Hauptmann, A.G.⁶

5
- 1542287501
- Modeling annotated data
- Blei, D.M.; Jordan, M.I.: Modeling annotated data. In: ACM SIGIR, pp. 127-134 (2003)
- (2003) ACM SIGIR , pp. 127-134
- Blei, D.M.¹ Jordan, M.I.²

6
- 84878582006
- Consumer-level multimedia event detection through unsupervised audio signal modeling
- Byun, B.; Kim, I.; Siniscalchi, S.M.; Lee, C.H.: Consumer-level multimedia event detection through unsupervised audio signal modeling. In: InterSpeech (2012)
- (2012) InterSpeech
- Byun, B.¹ Kim, I.² Siniscalchi, S.M.³ Lee, C.H.⁴

7
- 84905252171
- Cao, L.; Chang, S.F.; Codella, N.; Cotton, C.; Ellis, D.; Gong, L.; Hill, M.; Hua, G.; Kender, J.; Merler, M.; Mu, Y.; Smith, J.R.; Yu, F.X.: IBM research and Columbia University TRECVID-2012 multimedia event detection (MED), multimedia event recounting (MER), and semantic indexing (SIN) systems (2012)
- (2012) IBM Research and Columbia University TRECVID-2012 Multimedia Event Detection (MED), Multimedia Event Recounting (MER), and Semantic Indexing (SIN) Systems
- Cao, L.¹ Chang, S.F.² Codella, N.³ Cotton, C.⁴ Ellis, D.⁵ Gong, L.⁶ Hill, M.⁷ Hua, G.⁸ Kender, J.⁹ Merler, M.¹⁰ Mu, Y.¹¹ Smith, J.R.¹² Yu, F.X.¹³

8
- 50649087214
- Spatially coherent latent topic model for concurrent segmentation and classification of objects and scenes
- Cao, L.; Fei-Fei, L.: Spatially coherent latent topic model for concurrent segmentation and classification of objects and scenes. In: ICCV (2007)
- (2007) ICCV
- Cao, L.¹ Fei-Fei, L.²

9
- 79955702502
- Libsvm: A library for support vector machines
- 10.1145/1961189.1961199
- Chang, C.C.; Lin, C.J.: Libsvm: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 27:1-27:27 (2011)
- (2011) ACM Trans. Intell. Syst. Technol. , vol.2 , Issue.3 , pp. 27:1-27:27
- Chang, C.C.¹ Lin, C.J.²

10
- 33645146449
- Histograms of oriented gradients for human detection
- Dalal, N.; Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)
- (2005) CVPR
- Dalal, N.¹ Triggs, B.²

11
- 80052876786
- What does classifying more than 10,000 image categories tell us?
- Deng, J.; Berg, A.C.; Li, K.; Fei-Fei, L.: What does classifying more than 10,000 image categories tell us? In: ECCV (2010)
- (2010) ECCV
- Deng, J.¹ Berg, A.C.² Li, K.³ Fei-Fei, L.⁴

12
- 85198028989
- ImageNet: A large-scale hierarchical image database
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
- (2009) CVPR
- Deng, J.¹ Dong, W.² Socher, R.³ Li, L.J.⁴ Li, K.⁵ Fei-Fei, L.⁶

13
- 77955422240
- Object detection with discriminatively trained part based models
- 10.1109/TPAMI.2009.167
- Felzenszwalb, P.; Girshick, R.; McAllester, D.; Ramanan, D.: Object detection with discriminatively trained part based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627-1645 (2010)
- (2010) IEEE Trans. Pattern Anal. Mach. Intell. , vol.32 , Issue.9 , pp. 1627-1645
- Felzenszwalb, P.¹ Girshick, R.² McAllester, D.³ Ramanan, D.⁴

14
- 78650976705
- Towards a universal detector by mining concepts with small semantic gaps
- Feng, J.; Zheng, Y.; Yan, S.: Towards a universal detector by mining concepts with small semantic gaps. In: ACM Multimedia (2010)
- (2010) ACM Multimedia
- Feng, J.¹ Zheng, Y.² Yan, S.³

15
- 80053231413
- Topic models for image annotation and text illustration
- Feng, Y.; Lapata, M.: Topic models for image annotation and text illustration. In: NAACL HLT (2010)
- (2010) NAACL HLT
- Feng, Y.¹ Lapata, M.²

16
- 14344255188
- A mfom learning approach to robust multiclass multi-label text categorization
- Gao, S.; Wu, W.; Lee, C.H.; Chua, T.S.: A mfom learning approach to robust multiclass multi-label text categorization. In: ICML (2004)
- (2004) ICML
- Gao, S.¹ Wu, W.² Lee, C.H.³ Chua, T.S.⁴

17
- 77953202699
- TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation
- Guillaumin, M.; Mensink, T.; Verbeek, J.; Schmid, C.: TagProp: discriminative metric learning in nearest neighbor models for image auto-annotation. In: ICCV (2009)
- (2009) ICCV
- Guillaumin, M.¹ Mensink, T.² Verbeek, J.³ Schmid, C.⁴

18
- 64549150985
- Video retrieval based on semantic concepts
- 10.1109/JPROC.2008.916355
- Hauptmann, A.G.; Christel, M.G.; Yan, R.: Video retrieval based on semantic concepts. Proc. IEEE 96(4), 602-622 (2008)
- (2008) Proc. IEEE , vol.96 , Issue.4 , pp. 602-622
- Hauptmann, A.G.¹ Christel, M.G.² Yan, R.³

19
- 80054815184
- A survey on visual content-based video indexing and retrieval
- Hu, W.; Xie, N.; Li, L.; Zeng, X.; Maybank, S.J.: A survey on visual content-based video indexing and retrieval. IEEE Trans. Syst. Man Cybern. Part C 41(6), 797-819 (2011). URL: http://dx.doi.org/10.1109/TSMCC.2011.2109710
- (2011) IEEE Trans. Syst. Man Cybern. Part C , vol.41 , Issue.6 , pp. 797-819
- Hu, W.¹ Xie, N.² Li, L.³ Zeng, X.⁴ Maybank, S.J.⁵

20
- 25144471298
- Score normalization in multimodal biometric systems
- 10.1016/j.patcog.2005.01.012
- Jain, A.; Nandakumar, K.; Ross, A.: Score normalization in multimodal biometric systems. Pattern Recogn. 38(12), 2270-2285 (2005)
- (2005) Pattern Recogn. , vol.38 , Issue.12 , pp. 2270-2285
- Jain, A.¹ Nandakumar, K.² Ross, A.³

21
- 84871359352
- Leveraging high-level and low-level features for multimedia event detection
- Jiang, L.; Hauptmann, A.G.; Xiang, G.: Leveraging high-level and low-level features for multimedia event detection. In: ACM-MM (2012)
- (2012) ACM-MM
- Jiang, L.¹ Hauptmann, A.G.² Xiang, G.³

22
- 84455170074
- Audio-visual grouplet: Temporal audio-visual interactions for general video concept classification
- Jiang, W.; Loui, A.C.: Audio-visual grouplet: temporal audio-visual interactions for general video concept classification. In: ACM Multimedia (2011)
- (2011) ACM Multimedia
- Jiang, W.¹ Loui, A.C.²

23
- 84905157258
- Combining multiple modalities, contextual concepts, and temporal matching
- Jiang, Y.G.; Zeng, X.; Ye, G.; Bhattacharya, S.; Ellis, D.; Shah, M.; Chang, S.F.: Combining multiple modalities, contextual concepts, and temporal matching. In: NIST TRECVID Workshop (2010)
- (2010) NIST TRECVID Workshop
- Jiang, Y.G.¹ Zeng, X.² Ye, G.³ Bhattacharya, S.⁴ Ellis, D.⁵ Shah, M.⁶ Chang, S.F.⁷

24
- 0032203256
- Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method
- 10.1109/5.726793
- Katagiri, S.; Juang, B.H.; Lee, C.H.: Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method. Proc. IEEE 86, 2345-2373 (1998)
- (1998) Proc. IEEE , vol.86 , pp. 2345-2373
- Katagiri, S.¹ Juang, B.H.² Lee, C.H.³

25
- 82455163885
- Optimization of average precision with maximal figure-of-merit learning
- Kim, I.; Lee, C.H.: Optimization of average precision with maximal figure-of-merit learning. In: MLSP (2011)
- (2011) MLSP
- Kim, I.¹ Lee, C.H.²

26
- 84894904810
- Explicit performance metric optimization for fusion-based video retrieval
- Kim, I.; Oh, S.; Byun, B.; Perera, A.G.A.; Lee, C.H.: Explicit performance metric optimization for fusion-based video retrieval. In: ECCV Workshops, no. 3 (2012)
- (2012) ECCV Workshops , Issue.3
- Kim, I.¹ Oh, S.² Byun, B.³ Perera, A.G.A.⁴ Lee, C.H.⁵

27
- 84894904810
- Explicit performance metric optimization for fusion-based video retrieval
- Kim, I.; Oh, S.; Byun, B.; Perera, A.G.A.; Lee, C.H.: Explicit performance metric optimization for fusion-based video retrieval. In: ECCV Workshop (2012)
- (2012) ECCV Workshop
- Kim, I.¹ Oh, S.² Byun, B.³ Perera, A.G.A.⁴ Lee, C.H.⁵

28
- 0032021555
- On combining classifiers
- 10.1109/34.667881
- Kittler, J.; Hatef, M.; Duin, R.P.W.; Matas, J.: On combining classifiers. PAMI 20, 226-239 (1998)
- (1998) PAMI , vol.20 , pp. 226-239
- Kittler, J.¹ Hatef, M.² Duin, R.P.W.³ Matas, J.⁴

29
- 84898426452
- A spatio-temporal descriptor based on 3d-gradients
- Klaser, A.; Marszalek, M.; Schmid, C.: A spatio-temporal descriptor based on 3d-gradients. In: BMVC (2008)
- (2008) BMVC
- Klaser, A.¹ Marszalek, M.² Schmid, C.³

30
- 84894902092
- Double fusion for multimedia event detection
- Lan, Z.Z.; Bao, L.; Yu, S.I.; Liu, W.; Hauptmann, A.G.: Double fusion for multimedia event detection. In: ICME (2012)
- (2012) ICME
- Lan, Z.Z.¹ Bao, L.² Yu, S.I.³ Liu, W.⁴ Hauptmann, A.G.⁵

31
- 80052874098
- Learning hierarchical spatio-temporal features for action recognition with independent subspace analysis
- Le, Q.; Zou, W.; Yeung, S.; Ng, A.: Learning hierarchical spatio-temporal features for action recognition with independent subspace analysis. In: CVPR (2011)
- (2011) CVPR
- Le, Q.¹ Zou, W.² Yeung, S.³ Ng, A.⁴

32
- 0023800699
- A segment model based approach to speech recognition
- Lee, C.H.; Soong, F.K.; Juang, B.H.: A segment model based approach to speech recognition. In: ICASSP (1988)
- (1988) ICASSP
- Lee, C.H.¹ Soong, F.K.² Juang, B.H.³

33
- 77955746721
- Audio-based semantic concept classification for consumer video
- Lee, K.; Ellis, D.P.W.: Audio-based semantic concept classification for consumer video. IEEE Trans. Audio Speech Lang. Process. 18(6), 1406-1416 (2010)
- (2010) IEEE Trans. Audio Speech Lang. Process. , vol.18 , Issue.6 , pp. 1406-1416
- Lee, K.¹ Ellis, D.P.W.²

34
- 85162513516
- Object bank: A high-level image representation for scene classification & semantic feature sparsification
- Li, L.J.; Su, H.; Xing, E.P.; Li, F.F.: Object bank: A high-level image representation for scene classification & semantic feature sparsification. In: NIPS (2010)
- (2010) NIPS
- Li, L.J.¹ Su, H.² Xing, E.P.³ Li, F.F.⁴

35
- 84887358015
- Local expert forest of score fusion for video event classification
- Liu, J.; McCloskey, S.; Liu, Y.: Local expert forest of score fusion for video event classification. In: ECCV (2012)
- (2012) ECCV
- Liu, J.¹ McCloskey, S.² Liu, Y.³

36
- 84856667921
- Linear dependency modeling for feature fusion
- Ma, A.J.; Yuen, P.C.: Linear dependency modeling for feature fusion. In: ICCV, pp. 2041-2048 (2011)
- (2011) ICCV , pp. 2041-2048
- Ma, A.J.¹ Yuen, P.C.²

37
- 51949098112
- Classification using intersection kernel support vector machines is efficient
- Maji, S.; Berg, A.C.; Malik, J.: Classification using intersection kernel support vector machines is efficient. In: CVPR (2008)
- (2008) CVPR
- Maji, S.¹ Berg, A.C.² Malik, J.³

38
- 70449580491
- A new baseline for image annotation
- Makadia, A.; Pavlovic, V.; Kumar, S.: A new baseline for image annotation. In: ECCV (2008)
- (2008) ECCV
- Makadia, A.¹ Pavlovic, V.² Kumar, S.³

39
- 84866712341
- Multimodal feature fusion for robust event detection in web videos
- Natarajan, P.; Wu, S.; Vitaladevuni, S.N.P.; Zhuang, X.; Tsakalidis, S.; Park, U.; Prasad, R.; Natarajan, P.: Multimodal feature fusion for robust event detection in web videos. In: CVPR (2012)
- (2012) CVPR
- Natarajan, P.¹ Wu, S.² Vitaladevuni, S.N.P.³ Zhuang, X.⁴ Tsakalidis, S.⁵ Park, U.⁶ Prasad, R.⁷ Natarajan, P.⁸

40
- 31844433358
- Predicting good probabilities with supervised learning
- Niculescu-Mizil, A.; Caruana, R.: Predicting good probabilities with supervised learning. In: ICML (2005)
- (2005) ICML
- Niculescu-Mizil, A.¹ Caruana, R.²

41
- 0035328421
- Modeling the shape of the scene: A holistic representation of the spatial envelope
- 10.1023/A:1011139631724 0990.68601
- Oliva, A.; Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145-175 (2001)
- (2001) Int. J. Comput. Vis. , vol.42 , Issue.3 , pp. 145-175
- Oliva, A.¹ Torralba, A.²

42
- 84905223557
- TRECVID 2011 - An overview of the goals, tasks, data, evaluation mechanisms and metrics
- NIST, USA
- Over, P.; Awad, G.; Michel, M.; Fiscus, J.; Antonishek, B.; Smeaton, A.F.; Kraaij, W.; Quénot, G.: TRECVID 2011 - an overview of the goals, tasks, data, evaluation mechanisms and metrics. In: Proceedings of TRECVID 2011. NIST, USA (2011)
- (2011) Proceedings of TRECVID 2011
- Over, P.¹ Awad, G.² Michel, M.³ Fiscus, J.⁴ Antonishek, B.⁵ Smeaton, A.F.⁶ Kraaij, W.⁷ Quénot, G.⁸

43
- 84905274625
- TRECVID 2012-an overview of the goals, tasks, data, evaluation mechanisms and metrics
- NIST, USA
- Over, P.; Fiscus, J.; Sanders, G.; Shaw, B.; Awad, G.; Michel, M.; Smeaton, A.; Kraaij, W.; Quénot, G.: TRECVID 2012-an overview of the goals, tasks, data, evaluation mechanisms and metrics. In: Proceedings of TRECVID 2012. NIST, USA (2012)
- (2012) Proceedings of TRECVID 2012
- Over, P.¹ Fiscus, J.² Sanders, G.³ Shaw, B.⁴ Awad, G.⁵ Michel, M.⁶ Smeaton, A.⁷ Kraaij, W.⁸ Quénot, G.⁹

44
- 77955999239
- Topic regression multi-modal latent dirichlet allocation for image annotation
- Putthividhya, D.; Attias, H.T.; Nagarajan, S.S.: Topic regression multi-modal latent dirichlet allocation for image annotation. In: CVPR (2010)
- (2010) CVPR
- Putthividhya, D.¹ Attias, H.T.² Nagarajan, S.S.³

45
- 70349195978
- On the importance of modeling temporal information in music tag annotation
- Reed, J.; Lee, C.H.: On the importance of modeling temporal information in music tag annotation. In: ICASSP (2009)
- (2009) ICASSP
- Reed, J.¹ Lee, C.H.²

46
- 77955426203
- Evaluating color descriptors for object and scene recognition
- 10.1109/TPAMI.2009.154
- van de Sande, K.E.A.; Gevers, T.; Snoek, C.G.M.: Evaluating color descriptors for object and scene recognition. PAMI 32(9), 1582-1596 (2010)
- (2010) PAMI , vol.32 , Issue.9 , pp. 1582-1596
- Van De Sande, K.E.A.¹ Gevers, T.² Snoek, C.G.M.³

47
- 78149315356
- Robust fusion: Extreme value theory for recognition score normalization
- Scheirer, W.; Rocha, A.; Micheals, R.; Boult, T.: Robust fusion: extreme value theory for recognition score normalization. In: ECCV, pp. 481-495 (2010)
- (2010) ECCV , pp. 481-495
- Scheirer, W.¹ Rocha, A.² Micheals, R.³ Boult, T.⁴

48
- 84908571977
- Multimedia semantic indexing using model vectors
- Smith, J.; Naphade, M.; Natsev, A.: Multimedia semantic indexing using model vectors. In: ICME (2003)
- (2003) ICME
- Smith, J.¹ Naphade, M.² Natsev, A.³

49
- 34547172608
- The challenge problem for automated detection of 101 semantic concepts in multimedia
- Snoek, C.G.M.; Worring, M.; van Gemert, J.C.; Geusebroek, J.M.; Smeulders, A.W.: The challenge problem for automated detection of 101 semantic concepts in multimedia. In: Proceedings of ACM Multimedia (2006)
- (2006) Proceedings of ACM Multimedia
- Snoek, C.G.M.¹ Worring, M.² Van Gemert, J.C.³ Geusebroek, J.M.⁴ Smeulders, A.W.⁵

50
- 84866707906
- Evaluation of low-level features and their combinations for complex event detection in open source videos
- Tamrakar, A.; Ali, S.; Yu, Q.; Liu, J.; Javed, O.; Divakaran, A.; Cheng, H.; Sawhney, H.S.: Evaluation of low-level features and their combinations for complex event detection in open source videos. In: CVPR (2012)
- (2012) CVPR
- Tamrakar, A.¹ Ali, S.² Yu, Q.³ Liu, J.⁴ Javed, O.⁵ Divakaran, A.⁶ Cheng, H.⁷ Sawhney, H.S.⁸

51
- 67650999671
- Optimal classifier fusion in a non-bayesian probabilistic framework
- 10.1109/TPAMI.2008.224
- Terrades, O.R.; Valveny, E.; Tabbone, S.: Optimal classifier fusion in a non-bayesian probabilistic framework. PAMI 31(9), 1630-1644 (2009)
- (2009) PAMI , vol.31 , Issue.9 , pp. 1630-1644
- Terrades, O.R.¹ Valveny, E.² Tabbone, S.³

52
- 78049411640
- An acoustic segment model approach to incorporating temporal information into speaker modeling for text-independent speaker recognition
- Tsao, Y.; Sun, H.; Li, H.; Lee, C.H.: An acoustic segment model approach to incorporating temporal information into speaker modeling for text-independent speaker recognition. In: ICASSP (2010)
- (2010) ICASSP
- Tsao, Y.¹ Sun, H.² Li, H.³ Lee, C.H.⁴

53
- 77953196456
- Multiple kernels for object detection
- Vedaldi, A.; Gulshan, V.; Varma, M.; Zisserman, A.: Multiple kernels for object detection. In: ICCV (2009)
- (2009) ICCV
- Vedaldi, A.¹ Gulshan, V.² Varma, M.³ Zisserman, A.⁴

54
- 84901598107
- Vedaldi, A.; Zisserman, A.: Efficient additive kernels via explicit feature maps (2011)
- (2011) Efficient Additive Kernels Via Explicit Feature Maps
- Vedaldi, A.¹ Zisserman, A.²

55
- 70450178502
- Simultaneous image classification and annotation
- Wang, C.; Blei, D.M.; Fei-Fei, L.: Simultaneous image classification and annotation. In: CVPR (2009)
- (2009) CVPR
- Wang, C.¹ Blei, D.M.² Fei-Fei, L.³

56
- 70450216856
- Max-margin hidden conditional random fields for human action recognition
- Wang, Y.; Mori, G.: Max-margin hidden conditional random fields for human action recognition. In: CVPR (2009)
- (2009) CVPR
- Wang, Y.¹ Mori, G.²

57
- 77955988947
- SUN database: Large-scale scene recognition from abbey to zoo
- Xiao, J.; Hays, J.; Ehinger, K.; Oliva, A.; Torralba, A.: SUN database: large-scale scene recognition from abbey to zoo. In: CVPR (2010)
- (2010) CVPR
- Xiao, J.¹ Hays, J.² Ehinger, K.³ Oliva, A.⁴ Torralba, A.⁵

58
- 84877737125
- Kernel latent svm for visual recognition
- Yang, W.; Wang, Y.; Vahdat, A.; Mori, G.: Kernel latent svm for visual recognition. In: Advances in Neural Information Processing Systems (NIPS) (2012)
- (2012) Advances in Neural Information Processing Systems (NIPS)
- Yang, W.¹ Wang, Y.² Vahdat, A.³ Mori, G.⁴

59
- 84866712367
- Robust late fusion with rank minimization
- Ye, G.; Liu, D.; Jhuo, I.H.; Chang, S.F.: Robust late fusion with rank minimization. In: CVPR (2012)
- (2012) CVPR
- Ye, G.¹ Liu, D.² Jhuo, I.H.³ Chang, S.F.⁴

60
- 84885614497
- Text classification with kernels on the multinomial manifold
- Zhang, D.; Chen, X.; Lee, W.S.: Text classification with kernels on the multinomial manifold. In: SIGIR (2005)
- (2005) SIGIR
- Zhang, D.¹ Chen, X.² Lee, W.S.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.