Volume 25, Issue 1, 2005, Pages 5-35

Multimodal video indexing: A review of the state-of-the-art

Author keywords

Analysis framework; Multimodal integration; Multimodal video indexing; Review; Video segmentation

Indexed keywords

ALGORITHMS; INTEGRATION; MULTIMEDIA SYSTEMS; PROJECT MANAGEMENT; STATISTICAL METHODS; TEXT PROCESSING;

EID: 10044236762     PISSN: 13807501     EISSN: None     Source Type: Journal    
DOI: 10.1023/B:MTAP.0000046380.27575.a5     Document Type: Review
Times cited : (351)

References (103)
  • 1
    • 0002121497
    • Part-of-speech tagging and partial parsing
    • S. Young and G. Bloothooft (Eds.), Kluwer Academic Publishers, Dordrecht
    • S. Abney, "Part-of-speech tagging and partial parsing," in Corpus-Based Methods in Language and Speech Processing, S. Young and G. Bloothooft (Eds.), Kluwer Academic Publishers, Dordrecht, 1997, pp. 118-136.
    • (1997) Corpus-based Methods in Language and Speech Processing , pp. 118-136
    • Abney, S.1
  • 2
    • 0000718946
    • The advanced video information system: Data structures and query processing
    • S. Adali, K.S. Candan, S.S. Chen, K. Erol, and V.S. Subrahmanian, "The advanced video information system: Data structures and query processing," Multimedia Systems, Vol. 4, No. 4, pp. 172-186, 1996.
    • (1996) Multimedia Systems , vol.4 , Issue.4 , pp. 172-186
    • Adali, S.1    Candan, K.S.2    Chen, S.S.3    Erol, K.4    Subrahmanian, V.S.5
  • 3
    • 0035368101
    • Multi-modal dialogue scene detection using hidden markov models for content-based multimedia indexing
    • A.A. Alatan, A.N. Akansu, and W. Wolf, "Multi-modal dialogue scene detection using hidden markov models for content-based multimedia indexing," Multimedia Tools and Applications, Vol. 14, No. 2, pp. 137-151, 2001.
    • (2001) Multimedia Tools and Applications , vol.14 , Issue.2 , pp. 137-151
    • Alatan, A.A.1    Akansu, A.N.2    Wolf, W.3
  • 4
    • 0031611061
    • Region-based parametric motion segmentation using color information
    • Y. Altunbasak, P.E. Eren, and A.M. Tekalp, "Region-based parametric motion segmentation using color information," Graphical Models and Image Processing, Vol. 60, No. 1, pp. 13-23, 1998.
    • (1998) Graphical Models and Image Processing , vol.60 , Issue.1 , pp. 13-23
    • Altunbasak, Y.1    Eren, P.E.2    Tekalp, A.M.3
  • 5
    • 0036502392
    • Event based indexing of broadcasted sports video by intermodal collaboration
    • N. Babaguchi, Y. Kawai, and T. Kitahashi, "Event based indexing of broadcasted sports video by intermodal collaboration," IEEE Transactions on Multimedia, Vol. 4, No. 1, pp. 68-75, 2002.
    • (2002) IEEE Transactions on Multimedia , vol.4 , Issue.1 , pp. 68-75
    • Babaguchi, N.1    Kawai, Y.2    Kitahashi, T.3
  • 7
    • 0035309512
    • Content-based indexing and retrieval of TV news
    • M. Bertini, A. Del Bimbo, and P. Pala, "Content-based indexing and retrieval of TV news," Pattern Recognition Letters, Vol. 22, No. 5, pp. 503-516, 2001.
    • (2001) Pattern Recognition Letters , vol.22 , Issue.5 , pp. 503-516
    • Bertini, M.1    Del Bimbo, A.2    Pala, P.3
  • 8
    • 0032632354
    • An algorithm that learns what's in a name
    • D. Bikel, R. Schwartz, and R.M. Weischedel, "An algorithm that learns what's in a name," Machine Learning, Vol. 34, Nos. 1-3, pp. 211-231, 1999.
    • (1999) Machine Learning , vol.34 , Issue.1-3 , pp. 211-231
    • Bikel, D.1    Schwartz, R.2    Weischedel, R.M.3
  • 15
    • 0033892811
    • Interactive maps for a digital video library
    • M. Christel, A. Olligschlaeger, and C. Huang, "Interactive maps for a digital video library," IEEE Multimedia, Vol. 7, No. 1, pp. 60-67, 2000.
    • (2000) IEEE Multimedia , vol.7 , Issue.1 , pp. 60-67
    • Christel, M.1    Olligschlaeger, A.2    Huang, C.3
  • 16
    • 0032595005
    • Semantics in visual information retrieval
    • C. Colombo, A. Del Bimbo, and P. Pala, "Semantics in visual information retrieval," IEEE Multimedia, Vol. 6, No. 3, pp. 38-53, 1999.
    • (1999) IEEE Multimedia , vol.6 , Issue.3 , pp. 38-53
    • Colombo, C.1    Del Bimbo, A.2    Pala, P.3
  • 17
    • 10044262950
    • Convera. http://www.convera.com.
  • 23
    • 0029458263
    • Automatic recognition of film genres
    • San Francisco, USA
    • S. Fischer, R. Lienhart, and W. Effelsberg, "Automatic recognition of film genres," in ACM Multimedia 1995, San Francisco, USA, 1995, pp. 295-304.
    • (1995) ACM Multimedia 1995 , pp. 295-304
    • Fischer, S.1    Lienhart, R.2    Effelsberg, W.3
  • 26
    • 0029456574
    • Query by humming - Musical information retrieval in an audio database
    • San Francisco, USA
    • A. Ghias, J. Logan, D. Chamberlin, and B.C. Smith, "Query by humming - musical information retrieval in an audio database," in ACM Multimedia 1995, San Francisco, USA, 1995.
    • (1995) ACM Multimedia 1995
    • Ghias, A.1    Logan, J.2    Chamberlin, D.3    Smith, B.C.4
  • 29
    • 0034269926
    • A semantic event-detection approach and its application to detecting hunts in wildlife video
    • N. Haering, R. Qian, and I. Sezan, "A semantic event-detection approach and its application to detecting hunts in wildlife video," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 10, No. 6, pp. 857-868, 2000.
    • (2000) IEEE Transactions on Circuits and Systems for Video Technology , vol.10 , Issue.6 , pp. 857-868
    • Haering, N.1    Qian, R.2    Sezan, I.3
  • 33
    • 10044239072
    • Topic labeling of multilingual broadcast news in the informedia digital video library
    • Berkeley, USA
    • A.G. Hauptmann, D. Lee, and P.E. Kennedy, "Topic labeling of multilingual broadcast news in the informedia digital video library," in ACM DL/SIGIR MIDAS Workshop, Berkeley, USA, 1999.
    • (1999) ACM DL/SIGIR MIDAS Workshop
    • Hauptmann, A.G.1    Lee, D.2    Kennedy, P.E.3
  • 34
    • 0031673913
    • Story segmentation and detection of commercials in broadcast news video
    • Santa Barbara, USA
    • A.G. Hauptmann and M.J. Witbrock, "Story segmentation and detection of commercials in broadcast news video," in ADL-98 Advances in Digital Libraries, Santa Barbara, USA, 1998, pp. 168-179.
    • (1998) ADL-98 Advances in Digital Libraries , pp. 168-179
    • Hauptmann, A.G.1    Witbrock, M.J.2
  • 36
    • 34247627935
    • Automatic video indexing based on shot classification
    • Vol. 1554 of Lecture Notes in Computer Science, Springer-Verlag: Osaka, Japan
    • I. Ide, K. Yamamoto, and H. Tanaka, "Automatic video indexing based on shot classification," in First International Conference on Advanced Multimedia Content Processing, Vol. 1554 of Lecture Notes in Computer Science, Springer-Verlag: Osaka, Japan, 1999.
    • (1999) First International Conference on Advanced Multimedia Content Processing , vol.1554
    • Ide, I.1    Yamamoto, K.2    Tanaka, H.3
  • 38
    • 84976799884
    • Metadata in video databases
    • R. Jain and A. Hampapur, "Metadata in video databases," ACM SIGMOD, Vol. 23, No. 4, pp. 27-33, 1994.
    • (1994) ACM SIGMOD , vol.23 , Issue.4 , pp. 27-33
    • Jain, R.1    Hampapur, A.2
  • 39
    • 0033327198
    • Learning to recognize speech by watching television
    • P.J. Jang and A.G. Hauptmann, "Learning to recognize speech by watching television," IEEE Intelligent Systems, Vol. 14, No. 5, pp. 51-58, 1999.
    • (1999) IEEE Intelligent Systems , vol.14 , Issue.5 , pp. 51-58
    • Jang, P.J.1    Hauptmann, A.G.2
  • 43
    • 0035308233
    • Classification of general audio data for content-based retrieval
    • D. Li, I.K. Sethi, N. Dimitrova, and T. McGee, "Classification of general audio data for content-based retrieval," Pattern Recognition Letters, Vol. 22, No. 5, pp. 533-544, 2001.
    • (2001) Pattern Recognition Letters , vol.22 , Issue.5 , pp. 533-544
    • Li, D.1    Sethi, I.K.2    Dimitrova, N.3    McGee, T.4
  • 44
    • 0033909041
    • Automatic text detection and tracking in digital video
    • H. Li, D. Doermann, and O. Kia, "Automatic text detection and tracking in digital video," IEEE Transactions on Image Processing, Vol. 9, No. 1, pp. 147-156, 2000.
    • (2000) IEEE Transactions on Image Processing , vol.9 , Issue.1 , pp. 147-156
    • Li, H.1    Doermann, D.2    Kia, O.3
  • 47
    • 0032115209
    • Video handling with music and speech detection
    • K. Minami, A. Akutsu, H. Hamada, and Y. Tomomura, "Video handling with music and speech detection," IEEE Multimedia, Vol. 5, No. 3, pp. 17-25, 1998.
    • (1998) IEEE Multimedia , vol.5 , Issue.3 , pp. 17-25
    • Minami, K.1    Akutsu, A.2    Hamada, H.3    Tomomura, Y.4
  • 48
    • 84905368120
    • Video annotation for content-based retrieval using human behavior analysis and domain knowledge
    • Grenoble, France
    • H. Miyamori and S. Iisaku, "Video annotation for content-based retrieval using human behavior analysis and domain knowledge," in IEEE International Conference on Automatic Face and Gesture Recognition, Grenoble, France, 2000, pp. 26-30.
    • (2000) IEEE International Conference on Automatic Face and Gesture Recognition , pp. 26-30
    • Miyamori, H.1    Iisaku, S.2
  • 51
    • 0032595006
    • Everything you always wanted to know about MPEG-7: Part 1
    • F. Nack and A.T. Lindsay, "Everything you always wanted to know about MPEG-7: Part 1," IEEE Multimedia, Vol. 6, No. 3, pp. 65-77, 1999.
    • (1999) IEEE Multimedia , vol.6 , Issue.3 , pp. 65-77
    • Nack, F.1    Lindsay, A.T.2
  • 52
    • 0033203799
    • Everything you always wanted to know about MPEG-7: Part 2
    • F. Nack and A.T. Lindsay, "Everything you always wanted to know about MPEG-7: Part 2," IEEE Multimedia, Vol. 6, No. 4, pp. 64-73, 1999.
    • (1999) IEEE Multimedia , vol.6 , Issue.4 , pp. 64-73
    • Nack, F.1    Lindsay, A.T.2
  • 54
    • 0031374433
    • Speaker identification and video analysis for hierarchical video shot classification
    • Washington DC, USA
    • J. Nam, A. Enis Cetin, and A.H. Tewfik, "Speaker identification and video analysis for hierarchical video shot classification," in IEEE International Conference on Image Processing, Washington DC, USA, 1997, Vol. 2.
    • (1997) IEEE International Conference on Image Processing , vol.2
    • Nam, J.1    Cetin, A.E.2    Tewfik, A.H.3
  • 55
    • 0035281949
    • A probabilistic framework for semantic video indexing, filtering, and retrieval
    • M.R. Naphade and T.S. Huang, "A probabilistic framework for semantic video indexing, filtering, and retrieval," IEEE Transactions on Multimedia, Vol. 3, No. 1, pp. 141-151, 2001.
    • (2001) IEEE Transactions on Multimedia , vol.3 , Issue.1 , pp. 141-151
    • Naphade, M.R.1    Huang, T.S.2
  • 56
    • 0033896326
    • Detection of moving objects in video using a robust motion similarity measure
    • H.T. Nguyen, M. Worring, and A. Dev, "Detection of moving objects in video using a robust motion similarity measure," IEEE Transactions on Image Processing, Vol. 9, No. 1, pp. 137-141, 2000.
    • (2000) IEEE Transactions on Image Processing , vol.9 , Issue.1 , pp. 137-141
    • Nguyen, H.T.1    Worring, M.2    Dev, A.3
  • 57
    • 0027887659
    • A design space for multimodal systems: Concurrent processing and data fusion
    • Amsterdam, the Netherlands
    • L. Nigay and J. Coutaz, "A design space for multimodal systems: concurrent processing and data fusion," in INTERCHI'93 Proceedings, Amsterdam, the Netherlands, 1993, pp. 172-178.
    • (1993) INTERCHI'93 Proceedings , pp. 172-178
    • Nigay, L.1    Coutaz, J.2
  • 58
    • 0031344674
    • The state of the art in text filtering
    • D.W. Oard, "The state of the art in text filtering," User Modeling and User-Adapted Interaction, Vol. 7, No. 3, pp. 141-178, 1997.
    • (1997) User Modeling and User-Adapted Interaction , vol.7 , Issue.3 , pp. 141-178
    • Oard, D.W.1
  • 67
    • 10044260677
    • Face detection methods: A critical evaluation
    • Intelligent Sensory Information Systems, University of Amsterdam, 2000
    • T.V. Pham and M. Worring, "Face detection methods: A critical evaluation," Technical Report 2000-11, Intelligent Sensory Information Systems, University of Amsterdam, 2000.
    • Technical Report 2000-11
    • Pham, T.V.1    Worring, M.2
  • 68
    • 10044230195
    • Praja. http://www.praja.com.
  • 69
    • 0024610919
    • A tutorial on hidden markov models and selected applications in speech recognition
    • L.R. Rabiner, "A tutorial on hidden markov models and selected applications in speech recognition," Proceedings of the IEEE, Vol. 77, No. 2, pp. 257-286, 1989.
    • (1989) Proceedings of the IEEE , vol.77 , Issue.2 , pp. 257-286
    • Rabiner, L.R.1
  • 71
    • 0034440695
    • Automatically extracting highlights for TV baseball programs
    • Los Angeles, USA
    • Y. Rui, A. Gupta, and A. Acero, "Automatically extracting highlights for TV baseball programs," in ACM Multimedia 2000, Los Angeles, USA, 2000, pp. 105-115.
    • (2000) ACM Multimedia 2000 , pp. 105-115
    • Rui, Y.1    Gupta, A.2    Acero, A.3
  • 73
    • 0032306091
    • Identification of story units in audio-visual sequences by joint audio and video processing
    • Chicago, USA
    • C. Saraceno and R. Leonardi, "Identification of story units in audio-visual sequences by joint audio and video processing," in IEEE International Conference on Image Processing, Chicago, USA, 1998.
    • (1998) IEEE International Conference on Image Processing
    • Saraceno, C.1    Leonardi, R.2
  • 74
    • 0032660827
    • Name-It: Naming and detecting faces in news videos
    • S. Satoh, Y. Nakamura, and T. Kanade, "Name-It: Naming and detecting faces in news videos," IEEE Multimedia, Vol. 6, No. 1, pp. 22-35, 1999.
    • (1999) IEEE Multimedia , vol.6 , Issue.1 , pp. 22-35
    • Satoh, S.1    Nakamura, Y.2    Kanade, T.3
  • 76
    • 0033682228
    • A statistical method for 3D object detection applied to faces and cars
    • Hilton Head, USA
    • H. Schneiderman and T. Kanade, "A statistical method for 3D object detection applied to faces and cars," in IEEE Computer Vision and Pattern Recognition, Hilton Head, USA, 2000.
    • (2000) IEEE Computer Vision and Pattern Recognition
    • Schneiderman, H.1    Kanade, T.2
  • 80
    • 0029373840
    • Automatic indexing and content-based retrieval of captioned images
    • R.K. Srihari, "Automatic indexing and content-based retrieval of captioned images," IEEE Computer, Vol. 28, No. 9, pp. 49-56, 1995.
    • (1995) IEEE Computer , vol.28 , Issue.9 , pp. 49-56
    • Srihari, R.K.1
  • 87
    • 0032455864
    • On image classification: City images vs. landscapes
    • A. Vailaya, A.K. Jain, and H.-J. Zhang, "On image classification: City images vs. landscapes," Pattern Recognition, Vol. 31, pp. 1921-1936, 1998.
    • (1998) Pattern Recognition , vol.31 , pp. 1921-1936
    • Vailaya, A.1    Jain, A.K.2    Zhang, H.-J.3
  • 88
    • 0036999134
    • Systematic evaluation of logical story unit segmentation
    • J. Vendrig and M. Worring, "Systematic evaluation of logical story unit segmentation," IEEE Transactions on Multimedia, Vol. 4, No. 4, pp. 492-499, 2002.
    • (2002) IEEE Transactions on Multimedia , vol.4 , Issue.4 , pp. 492-499
    • Vendrig, J.1    Worring, M.2
  • 89
    • 10044257375
    • Virage. http://www.virage.com.
  • 90
    • 85032751556
    • Multimedia content analysis using both audio and visual clues
    • Y. Wang, Z. Liu, and J. Huang, "Multimedia content analysis using both audio and visual clues," IEEE Signal Processing Magazine, Vol. 17, No. 6, pp. 12-36, 2000.
    • (2000) IEEE Signal Processing Magazine , vol.17 , Issue.6 , pp. 12-36
    • Wang, Y.1    Liu, Z.2    Huang, J.3
  • 92
    • 0030242072
    • Content-based classification, search, and retrieval of audio
    • E. Wold, T. Blum, D. Keislar, and J. Wheaton, "Content-based classification, search, and retrieval of audio," IEEE Multimedia, Vol. 3, No. 3, pp. 27-36, 1996.
    • (1996) IEEE Multimedia , vol.3 , Issue.3 , pp. 27-36
    • Wold, E.1    Blum, T.2    Keislar, D.3    Wheaton, J.4
  • 93
    • 0030241856
    • Spatio-temporal segmentation of image sequences for object-oriented low bit-rate image coding
    • L. Wu, J. Benois-Pineau, and D. Barba, "Spatio-temporal segmentation of image sequences for object-oriented low bit-rate image coding," Image Communication, Vol. 8, No. 6, pp. 513-544, 1996.
    • (1996) Image Communication , vol.8 , Issue.6 , pp. 513-544
    • Wu, L.1    Benois-Pineau, J.2    Barba, D.3
  • 97
    • 34250082473
    • Automatic partitioning of full-motion video
    • H.-J. Zhang, A. Kankanhalli, and S.W. Smoliar, "Automatic partitioning of full-motion video," Multimedia Systems, Vol. 1, No. 1, pp. 10-28, 1993.
    • (1993) Multimedia Systems , vol.1 , Issue.1 , pp. 10-28
    • Zhang, H.-J.1    Kankanhalli, A.2    Smoliar, S.W.3
  • 98
    • 0000464746
    • Automatic parsing and indexing of news video
    • H.-J. Zhang, S.Y. Tan, S.W. Smoliar, and G. Yihong, "Automatic parsing and indexing of news video," Multimedia Systems, Vol. 2, No. 6, pp. 256-266, 1995.
    • (1995) Multimedia Systems , vol.2 , Issue.6 , pp. 256-266
    • Zhang, H.-J.1    Tan, S.Y.2    Smoliar, S.W.3    Yihong, G.4
  • 102
    • 84925836212
    • Rule-based video classification system for basketball video indexing
    • Los Angeles, USA
    • W. Zhou, A. Vellaikal, and C.-C.J. Kuo, "Rule-based video classification system for basketball video indexing," in ACM Multimedia 2000, Los Angeles, USA, 2000.
    • (2000) ACM Multimedia 2000
    • Zhou, W.1    Vellaikal, A.2    Kuo, C.-C.J.3
  • 103
    • 79952051658
    • Automatic news video segmentation and categorization based on closed-captioned text
    • Tokyo, Japan
    • W. Zhu, C. Toklu, and S.-P. Liou, "Automatic news video segmentation and categorization based on closed-captioned text," in IEEE International Conference on Multimedia & Expo, Tokyo, Japan, 2001, pp. 1036-1039.
    • (2001) IEEE International Conference on Multimedia & Expo , pp. 1036-1039
    • Zhu, W.1    Toklu, C.2    Liou, S.-P.3


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.