



Volume , Issue , 2008, Pages 123-162

Audio content analysis

Author keywords

[No Author keywords available]

Indexed keywords


EID: 84892229349     PISSN: None     EISSN: None     Source Type: Book    
DOI: 10.1007/978-1-84800-076-6_5     Document Type: Chapter
Times cited : (4)

References (106)
  • 2
    • 0036288688
    • A new speaker change detection method for two-speaker segmentation
    • Adami, A. G., Kajarekar, S. S. and Hermansky, H. (2002), A new speaker change detection method for two-speaker segmentation, in 'Proceedings of the ICASSP', Vol. 4, pp. 3908-3911.
    • (2002) Proceedings of the ICASSP , vol.4 , pp. 3908-3911
    • Adami, A.G.1    Kajarekar, S.S.2    Hermansky, H.3
  • 5
    • 0031233424
    • Speaker recognition: A tutorial
    • Campbell, J. P. (1997), 'Speaker recognition: A tutorial', Proc. IEEE 85(9), 1437-1462.
    • (1997) Proc. IEEE , vol.85 , Issue.9 , pp. 1437-1462
    • Campbell, J.P.1
  • 6
    • 0032638667
    • A comparison of features for speech, music discrimination
    • Carey, M. J., Parris, E. S. and Lloyd-Thomas, H. (1999), A comparison of features for speech, music discrimination, in 'Proceedings of the ICASSP', Vol. 1, pp. 149-152.
    • (1999) Proceedings of the ICASSP , vol.1 , pp. 149-152
    • Carey, M.J.1    Parris, E.S.2    Lloyd-Thomas, H.3
  • 7
    • 0035364397
    • MPEG-7 sound-recognition tools
    • Casey, M. (2001), 'MPEG-7 sound-recognition tools', IEEE Trans. Circ. Syst. Video Tech. 11(6), 737-747.
    • (2001) IEEE Trans. Circ. Syst. Video Tech. , vol.11 , Issue.6 , pp. 737-747
    • Casey, M.1
  • 8
    • 33845439374
    • Foafing the music: Bridging the semantic gap in music recommendation
    • of LNCS
    • Celma, O. (2006), Foafing the music: Bridging the semantic gap in music recommendation, in 'Proceedings of the 5th International Semantic Web Conference', Vol. 4273 of LNCS, pp. 927-934.
    • (2006) Proceedings of the 5th International Semantic Web Conference , vol.4273 , pp. 927-934
    • Celma, O.1
  • 11
    • 0002595416
    • Speaker, environment and channel change detection and clustering via the Bayesian information criterion
    • Chen, S. S. and Gopalakrishnan, P. S. (1998), Speaker, environment and channel change detection and clustering via the Bayesian information criterion, in 'Proceedings of the DARPA Speech Recognition Workshop'.
    • (1998) Proceedings of the DARPA Speech Recognition Workshop
    • Chen, S.S.1    Gopalakrishnan, P.S.2
  • 12
    • 85009212151
    • A sequential metric-based audio segmentation method via the Bayesian information criterion
    • Cheng, S.-S. and Wang, H.-M. (2003), A sequential metric-based audio segmentation method via the Bayesian information criterion, in 'Proceedings of the EUROSPEECH', pp. 945-948.
    • (2003) Proceedings of the EUROSPEECH , pp. 945-948
    • Cheng, S.-S.1    Wang, H.-M.2
  • 16
    • 0038368765
    • Combination of similarity measures for effective spoken document retrieval
    • Crestani, F. (2003), 'Combination of similarity measures for effective spoken document retrieval', J. Inform. Sci. 29(2), 87-96.
    • (2003) J. Inform. Sci. , vol.29 , Issue.2 , pp. 87-96
    • Crestani, F.1
  • 17
    • 34250285795
    • U-statistic hierarchical clustering
    • D'Andrade, R. (1978), U-statistic hierarchical clustering, in 'Psychometrika', Vol. 43, pp. 59-68.
    • (1978) Psychometrika , vol.43 , pp. 59-68
    • D'Andrade, R.1
  • 18
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Davis, S. B. and Mermelstein, P. (1980), 'Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences', IEEE Trans. Acoust., Speech, Signal Process. 28(4), 357-366.
    • (1980) IEEE Trans. Acoust., Speech, Signal Process. , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 19
    • 0034273195
    • DISTBIC: A speaker-based segmentation for audio data indexing
    • Delacourt, P. and Wellekens, C. J. (2000), 'DISTBIC: A speaker-based segmentation for audio data indexing', Speech Comm. 32(1), 111-126.
    • (2000) Speech Comm. , vol.32 , Issue.1 , pp. 111-126
    • Delacourt, P.1    Wellekens, C.J.2
  • 23
    • 84908266591
    • The beat spectrum: A new approach to rhythm analysis
    • Foote, J. (2001), The beat spectrum: A new approach to rhythm analysis, in 'Proceedings of the ICME', pp. 881-884.
    • (2001) Proceedings of the ICME , pp. 881-884
    • Foote, J.1
  • 24
    • 57649180845
    • Content-based retrieval of music and audio
    • C.-C. Jay Kuo et al., ed.
    • Foote, J. T. (1997), Content-based retrieval of music and audio, in C.-C. Jay Kuo et al., ed., 'Proceedings of the Electronic Imaging', Vol. 3229, pp. 138-147.
    • (1997) Proceedings of the Electronic Imaging , vol.3229 , pp. 138-147
    • Foote, J.T.1
  • 25
    • 85128356454
    • Partitioning and transcription of broadcast news data
    • Gauvain, J.-L., Lamel, L. and Adda, G. (1998), Partitioning and transcription of broadcast news data, in 'Proceedings of the ICSLP', Vol. 5, pp. 1335-1338.
    • (1998) Proceedings of the ICSLP , vol.5 , pp. 1335-1338
    • Gauvain, J.-L.1    Lamel, L.2    Adda, G.3
  • 26
    • 0028516097
    • Text-independent speaker identification
    • Gish, H. and Schmidt, M. (1994), 'Text-independent speaker identification', IEEE Signal Process Mag. 11(4), 18-32.
    • (1994) IEEE Signal Process Mag. , vol.11 , Issue.4 , pp. 18-32
    • Gish, H.1    Schmidt, M.2
  • 27
    • 0026400244
    • Segregation of speakers for speech recognition and speaker identification
    • Gish, H., Siu, M.-H. and Rohlicek, R. (1991), Segregation of speakers for speech recognition and speaker identification, in 'Proceedings of the ICASSP', pp. 873-876.
    • (1991) Proceedings of the ICASSP , pp. 873-876
    • Gish, H.1    Siu, M.-H.2    Rohlicek, R.3
  • 28
    • 0030372637
    • A probabilistic framework for feature-based speech recognition
    • Glass, J., Chang, J. and McCandless, M. (1996), A probabilistic framework for feature-based speech recognition, in 'Proceedings of the ICSLP', Vol. 4, pp. 2277-2280.
    • (1996) Proceedings of the ICSLP , vol.4 , pp. 2277-2280
    • Glass, J.1    Chang, J.2    McCandless, M.3
  • 29
    • 84892200966
    • A first approach to speech retrieval
    • Technical Report 238, ETH Zürich
    • Glavitsch, U. (1995), A first approach to speech retrieval, Technical Report 238, ETH Zürich, Institute of Information Systems.
    • (1995) Institute of Information Systems
    • Glavitsch, U.1
  • 30
    • 0027151606
    • Recognition of environmental sounds
    • Goldhor, R. S. (1993), Recognition of environmental sounds, in 'Proceedings of the ICASSP', Vol. 1, pp. 149-152.
    • (1993) Proceedings of the ICASSP , vol.1 , pp. 149-152
    • Goldhor, R.S.1
  • 32
    • 33847144778
    • Boosting for content-based audio classification and retrieval: An evaluation
    • Guo, G., Zhang, H.-J. and Li, S. Z. (2001), Boosting for content-based audio classification and retrieval: An evaluation, in 'Proceedings of the ICME', pp. 1200-1203.
    • (2001) Proceedings of the ICME , pp. 1200-1203
    • Guo, G.1    Zhang, H.-J.2    Li, S.Z.3
  • 34
    • 0033639509
    • Overview of the sixth text retrieval conference (TREC-6)
    • Harman, D. (2000), 'Overview of the sixth text retrieval conference (TREC-6)', Inform. Process. Manag. 36(1), 3-35.
    • (2000) Inform. Process. Manag. , vol.36 , Issue.1 , pp. 3-35
    • Harman, D.1
  • 35
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • Hermansky, H. (1990), 'Perceptual linear predictive (PLP) analysis of speech', J. Acoust. Soc. Am. 87(4), 1738-1752.
    • (1990) J. Acoust. Soc. Am. , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 36
    • 17444404830
    • Automatic classification of musical instrument sounds
    • Herrera, P., Peeters, G. and Dubnov, S. (2003), 'Automatic classification of musical instrument sounds', J. New. Music Res. 32(1), 3-21.
    • (2003) J. New. Music Res. , vol.32 , Issue.1 , pp. 3-21
    • Herrera, P.1    Peeters, G.2    Dubnov, S.3
  • 37
    • 85009070699
    • Automatic metric-based speech segmentation for broadcast news via principal component analysis
    • Hung, J.-W., Wang, H.-M. and Lee, L.-S. (2000), Automatic metric-based speech segmentation for broadcast news via principal component analysis, in 'Proceedings of the ICSLP', Vol. 4, pp. 121-124.
    • (2000) Proceedings of the ICSLP , vol.4 , pp. 121-124
    • Hung, J.-W.1    Wang, H.-M.2    Lee, L.-S.3
  • 42
    • 0014129195
    • Hierarchical clustering schemes
    • Johnson, S. C. (1967), 'Hierarchical clustering schemes', Psychometrika 32(3), 241-254.
    • (1967) Psychometrika , vol.32 , Issue.3 , pp. 241-254
    • Johnson, S.C.1
  • 45
    • 0033692969
    • Strategies for automatic segmentation of audio data
    • Kemp, T., Schmidt, M., Westphal, M. and Waibel, A. (2000), Strategies for automatic segmentation of audio data, in 'Proceedings ICASSP', Vol. 3, pp. 1423-1426.
    • (2000) Proceedings ICASSP , vol.3 , pp. 1423-1426
    • Kemp, T.1    Schmidt, M.2    Westphal, M.3    Waibel, A.4
  • 48
    • 33646789869
    • Hybrid speaker-based segmentation system using model-level clustering
    • Kim, H.-G., Ertelt, D. and Sikora, T. (2005), Hybrid speaker-based segmentation system using model-level clustering, in 'Proceedings of the ICASSP', Vol. 1, pp. 745-748.
    • (2005) Proceedings of the ICASSP , vol.1 , pp. 745-748
    • Kim, H.-G.1    Ertelt, D.2    Sikora, T.3
  • 50
    • 33748519104
    • Signal processing methods for the transcription of music
    • PhD thesis, Finland
    • Klapuri, A. (2004), Signal Processing Methods for the Transcription of Music, PhD thesis, Tampere University of Technology, Tampere, Finland.
    • (2004) Tampere University of Technology, Tampere
    • Klapuri, A.1
  • 51
    • 33745217037
    • Using syllable-based indexing features and language models to improve German spoken document retrieval
    • Larson, M. and Eickeler, S. (2003), Using syllable-based indexing features and language models to improve German spoken document retrieval, in 'Proceedings of the EUROSPEECH', pp. 1217-1220.
    • (2003) Proceedings of the EUROSPEECH , pp. 1217-1220
    • Larson, M.1    Eickeler, S.2
  • 52
    • 84892243397
    • Kluwer Academic Publishers, chapter Appendix I.2
    • Lee, K.-F. (1989), Automatic Speech Recognition, Kluwer Academic Publishers, chapter Appendix I.2, p. 147.
    • (1989) Automatic Speech Recognition , pp. 147
    • Lee, K.-F.1
  • 53
    • 85119434191
    • Fast speaker change detection for broadcast news transcription and indexing
    • Liu, D. and Kubala, F. (1999), Fast speaker change detection for broadcast news transcription and indexing, in 'Proceedings of the EUROSPEECH', Vol. 3, pp. 1031-1034.
    • (1999) Proceedings of the EUROSPEECH , vol.3 , pp. 1031-1034
    • Liu, D.1    Kubala, F.2
  • 55
    • 0032181880
    • Audio feature extraction and analysis for scene segmentation and classification
    • Liu, Z., Wang, Y. and Chen, T. (1998), 'Audio feature extraction and analysis for scene segmentation and classification', J. VLSI Signal Process. 20(1/2), 61-79.
    • (1998) J. VLSI Signal Process. , vol.20 , Issue.1-2 , pp. 61-79
    • Liu, Z.1    Wang, Y.2    Chen, T.3
  • 57
    • 33645326073
    • Real-time unsupervised speaker change detection
    • Lu, L. and Zhang, H. J. (2002a), Real-time unsupervised speaker change detection, in 'Proceedings of the ICPR', Vol. 2, pp. 358-361.
    • (2002) Proceedings of the ICPR , vol.2 , pp. 358-361
    • Lu, L.1    Zhang, H.J.2
  • 59
    • 84873533162
    • An investigation of feature models for music genre classification using the support vector classifier
    • Meng, A. and Shawe-Taylor, J. (2005), An investigation of feature models for music genre classification using the support vector classifier, in 'Proceedings of the ISMIR', pp. 604-609.
    • (2005) Proceedings of the ISMIR , pp. 604-609
    • Meng, A.1    Shawe-Taylor, J.2
  • 61
    • 33745218075
    • Comparison of different phone-based spoken document retrieval methods with text and spoken queries
    • Moreau, N., Jin, S. and Sikora, T. (2005), Comparison of different phone-based spoken document retrieval methods with text and spoken queries, in 'Proceedings of the EUROSPEECH', pp. 641-644.
    • (2005) Proceedings of the EUROSPEECH , pp. 641-644
    • Moreau, N.1    Jin, S.2    Sikora, T.3
  • 62
    • 0034857759
    • Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition
    • Mori, K. and Nakagawa, S. (2001), Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition, in 'Proceedings of the ICASSP', Vol. 1, pp. 413-416.
    • (2001) Proceedings of the ICASSP , vol.1 , pp. 413-416
    • Mori, K.1    Nakagawa, S.2
  • 63
    • 0027311597
    • A new speech recognition method based on VQ-distortion and HMM
    • Nakagawa, S. and Suzuki, H. (1993), A new speech recognition method based on VQ-distortion and HMM, in 'Proceedings of the ICASSP', Vol. 2, pp. 676-679.
    • (1993) Proceedings of the ICASSP , vol.2 , pp. 676-679
    • Nakagawa, S.1    Suzuki, H.2
  • 65
    • 0033692609
    • Information fusion for spoken document retrieval
    • Ng, K. (2000), Information fusion for spoken document retrieval, in 'Proceedings ICASSP', Vol. 6, pp. 2405-2408.
    • (2000) Proceedings ICASSP , vol.6 , pp. 2405-2408
    • Ng, K.1
  • 66
    • 0031636298
    • Phonetic recognition for spoken document retrieval
    • Ng, K. and Zue, V. W. (1998), Phonetic recognition for spoken document retrieval, in 'Proceedings ICASSP', Vol. 1, pp. 325-328.
    • (1998) Proceedings ICASSP , vol.1 , pp. 325-328
    • Ng, K.1    Zue, V.W.2
  • 67
    • 25444508285
    • Survey of sparse and non-sparse methods in source separation
    • O'Grady, P. D., Pearlmutter, B. A. and Rickard, S. T. (2005), 'Survey of sparse and non-sparse methods in source separation', Int. J. Imag. Syst. Tech. 15(1), 18-33.
    • (2005) Int. J. Imag. Syst. Tech. , vol.15 , Issue.1 , pp. 18-33
    • O'Grady, P.D.1    Pearlmutter, B.A.2    Rickard, S.T.3
  • 68
    • 38549108667
    • Musical metadata and knowledge management
    • D. Schwartz, ed., Idea Group
    • Pachet, F. (2005), Musical metadata and knowledge management, in D. Schwartz, ed., 'Encyclopedia of Knowledge Management', Idea Group, pp. 672-677.
    • (2005) Encyclopedia of Knowledge Management , pp. 672-677
    • Pachet, F.1
  • 71
    • 34547541889
    • Applied clustering for automatic speaker-based segmentation of audio material
    • Pietquin, O., Couvreur, L. and Couvreur, P. (2001), Applied clustering for automatic speaker-based segmentation of audio material, in 'JORBEL', Vol. 41, pp. 69-81.
    • (2001) JORBEL , vol.41 , pp. 69-81
    • Pietquin, O.1    Couvreur, L.2    Couvreur, P.3
  • 72
    • 0034444712
    • Integrating visual, audio and text analysis for news video
    • Qi, W., Gu, L., Jiang, H., Chen, X. and Zhang, H. (2000), Integrating visual, audio and text analysis for news video, in 'Proceedings of the ICIP', Vol. 3, pp. 520-523.
    • (2000) Proceedings of the ICIP , vol.3 , pp. 520-523
    • Qi, W.1    Gu, L.2    Jiang, H.3    Chen, X.4    Zhang, H.5
  • 75
    • 0029386354
    • Keyword detection in conversational speech utterances using hidden Markov model based continuous speech recognition
    • Rose, R. (1995), 'Keyword detection in conversational speech utterances using hidden Markov model based continuous speech recognition', Comput. Speech Lang. 9(4), 309-333.
    • (1995) Comput. Speech Lang. , vol.9 , Issue.4 , pp. 309-333
    • Rose, R.1
  • 77
    • 4544228318
    • Identity verification using speech and face information
    • Sanderson, C. and Paliwal, K. K. (2004), 'Identity verification using speech and face information', Digit. Signal Process. 14(5), 449-480.
    • (2004) Digit. Signal Process. , vol.14 , Issue.5 , pp. 449-480
    • Sanderson, C.1    Paliwal, K.K.2
  • 78
    • 0029765670
    • Real-time discrimination of broadcast speech/music
    • Saunders, J. (1996), Real-time discrimination of broadcast speech/music, in 'Proceedings of the ICASSP', Vol. 2, pp. 993-996.
    • (1996) Proceedings of the ICASSP , vol.2 , pp. 993-996
    • Saunders, J.1
  • 79
    • 2642521115
    • Assessing the retrieval effectiveness of a speech retrieval system by simulating recognition errors
    • Schaeuble, P. and Glavitsch, U. (1994), Assessing the retrieval effectiveness of a speech retrieval system by simulating recognition errors, in 'Proceedings Workshop on Human Language Technology', pp. 370-372.
    • (1994) Proceedings Workshop on Human Language Technology , pp. 370-372
    • Schaeuble, P.1    Glavitsch, U.2
  • 80
    • 0030648077
    • Construction and evaluation of a robust multifeature speech/music discriminator
    • Scheirer, E. and Slaney, M. (1997), Construction and evaluation of a robust multifeature speech/music discriminator, in 'Proceedings of the ICASSP', Vol. 2, pp. 1331-1334.
    • (1997) Proceedings of the ICASSP , vol.2 , pp. 1331-1334
    • Scheirer, E.1    Slaney, M.2
  • 81
    • 0000120766
    • Estimating the dimension of a model
    • Schwarz, G. (1978), Estimating the dimension of a model, in 'Annals of Statistics', Vol. 6, pp. 461-464.
    • (1978) Annals of Statistics , vol.6 , pp. 461-464
    • Schwarz, G.1
  • 82
    • 85069154781
    • Musical sound modeling with sinusoids plus noise
    • C. Roads, S. T. Pope, A. Piccialli and G. D. Poli, eds, Swets & Zeitlinger The Netherlands
    • Serra, X. (1997), Musical sound modeling with sinusoids plus noise, in C. Roads, S. T. Pope, A. Piccialli and G. D. Poli, eds, 'Musical Signal Processing', Swets & Zeitlinger The Netherlands, pp. 91-122.
    • (1997) Musical Signal Processing , pp. 91-122
    • Serra, X.1
  • 84
    • 84945116938
    • Non-negative matrix factorization for polyphonic music transcription
    • Smaragdis, P. and Brown, J. C. (2003), Non-negative matrix factorization for polyphonic music transcription, in 'Proceedings of the WASPAA', pp. 177-180.
    • (2003) Proceedings of the WASPAA , pp. 177-180
    • Smaragdis, P.1    Brown, J.C.2
  • 85
    • 85031608427
    • Speaker tracking and detection with multiple speakers
    • Sönmez, K., Heck, L. and Weintraub, M. (1999), Speaker tracking and detection with multiple speakers, in 'Proceedings of the EUROSPEECH', Vol. 5, pp. 2219-2222.
    • (1999) Proceedings of the EUROSPEECH , vol.5 , pp. 2219-2222
    • Sönmez, K.1    Heck, L.2    Weintraub, M.3
  • 86
    • 0027252184
    • Speech segmentation and clustering based on speaker features
    • Sugiyama, M., Murakami, J. and Watanabe, H. (1993), Speech segmentation and clustering based on speaker features, in 'Proceedings of the ICASSP', Vol. 2, pp. 395-398.
    • (1993) Proceedings of the ICASSP , vol.2 , pp. 395-398
    • Sugiyama, M.1    Murakami, J.2    Watanabe, H.3
  • 87
    • 20444498720
    • Detection of unique people in news programs using multimodal shot clustering
    • Taskiran, C., Albiol, A., Torres, L. and Delp, E. (2004), Detection of unique people in news programs using multimodal shot clustering, in 'Proceedings of the ICIP', Vol. 1, pp. 697-700.
    • (2004) Proceedings of the ICIP , vol.1 , pp. 697-700
    • Taskiran, C.1    Albiol, A.2    Torres, L.3    Delp, E.4
  • 89
    • 78650540904
    • Improved speaker segmentation and segments clustering using the Bayesian information criterion
    • Tritschler, A. and Gopinath, R. (1999), Improved speaker segmentation and segments clustering using the Bayesian information criterion, in 'Proceedings of the EUROSPEECH', pp. 679-682.
    • (1999) Proceedings of the EUROSPEECH , pp. 679-682
    • Tritschler, A.1    Gopinath, R.2
  • 90
    • 0036648502
    • Musical genre classification of audio signals
    • Tzanetakis, G. and Cook, P. (2002), 'Musical genre classification of audio signals', IEEE Trans. Speech Audio Process. 10(5), 293-302.
    • (2002) IEEE Trans. Speech Audio Process. , vol.10 , Issue.5 , pp. 293-302
    • Tzanetakis, G.1    Cook, P.2
  • 94
    • 85032751556
    • Multimedia content analysis using both audio and visual clues
    • Wang, Y., Liu, Z. and Huang, J.-C. (2000), 'Multimedia content analysis using both audio and visual clues', IEEE Signal Process. Mag. 17(6), 12-36.
    • (2000) IEEE Signal Process. Mag. , vol.17 , Issue.6 , pp. 12-36
    • Wang, Y.1    Liu, Z.2    Huang, J.-C.3
  • 96
    • 0025517070
    • Automatic recognition of keywords in unconstrained speech using hidden Markov models
    • Wilpon, J., Rabiner, L. and Lee, C.-H. (1990), 'Automatic recognition of keywords in unconstrained speech using hidden Markov models', IEEE Trans. Acoust., Speech, Signal Process. 38, 1870-1878.
    • (1990) IEEE Trans. Acoust., Speech, Signal Process. , vol.38 , pp. 1870-1878
    • Wilpon, J.1    Rabiner, L.2    Lee, C.-H.3
  • 97
    • 0030242072
    • Content-based classification, search, and retrieval of audio
    • Wold, E., Blum, T., Keislar, D. and Wheaton, J. (1996), 'Content-based classification, search, and retrieval of audio', IEEE Multimedia 3(3), 27-36.
    • (1996) IEEE Multimedia , vol.3 , Issue.3 , pp. 27-36
    • Wold, E.1    Blum, T.2    Keislar, D.3    Wheaton, J.4
  • 99
    • 0141478771
    • UBM-based real-time speaker segmentation for broadcasting news
    • Wu, T., Lu, L. and Zhang, H.-J. (2003), UBM-based real-time speaker segmentation for broadcasting news, in 'Proceedings of the ICASSP', Vol. 2, pp. 193-196.
    • (2003) Proceedings of the ICASSP , vol.2 , pp. 193-196
    • Wu, T.1    Lu, L.2    Zhang, H.-J.3
  • 100
    • 0141855132
    • Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification
    • Xiong, Z., Radhakrishnan, R., Divakaran, A. and Huang, T. S. (2003), Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification, in 'Proceedings of the ICASSP', Vol. 5, pp. 628-631.
    • (2003) Proceedings of the ICASSP , vol.5 , pp. 628-631
    • Xiong, Z.1    Radhakrishnan, R.2    Divakaran, A.3    Huang, T.S.4
  • 101
    • 85009160774
    • An improved model-based speaker segmentation system
    • Yu, P., Seide, F., Ma, C. and Chang, E. (2003), An improved model-based speaker segmentation system, in 'Proceedings of the EUROSPEECH', pp. 2025-2028.
    • (2003) Proceedings of the EUROSPEECH , pp. 2025-2028
    • Yu, P.1    Seide, F.2    Ma, C.3    Chang, E.4
  • 102
    • 0032629748
    • Hierarchical classification of audio data for archiving and retrieving
    • Zhang, T. and Kuo, C.-C. (1999a), Hierarchical classification of audio data for archiving and retrieving, in 'Proceedings of the ICASSP', Vol. 6, pp. 3001-3004.
    • (1999) Proceedings of the ICASSP , vol.6 , pp. 3001-3004
    • Zhang, T.1    Kuo, C.-C.2
  • 104
    • 85009275098
    • Speechfind: An experimental on-line spoken document retrieval system for historical audio archives
    • Zhou, B. and Hansen, J. (2002), Speechfind: An experimental on-line spoken document retrieval system for historical audio archives, in 'Proceedings of the ICSLP', Vol. 3, pp. 1969-1972.
    • (2002) Proceedings of the ICSLP , vol.3 , pp. 1969-1972
    • Zhou, B.1    Hansen, J.2
  • 105
    • 85009089453
    • Unsupervised audio stream segmentation and clustering via the Bayesian information criterion
    • Zhou, B. and Hansen, J. H. L. (2000), Unsupervised audio stream segmentation and clustering via the Bayesian information criterion, in 'Proceedings of the ICSLP', Vol. 3, pp. 714-717.
    • (2000) Proceedings of the ICSLP , vol.3 , pp. 714-717
    • Zhou, B.1    Hansen, J.H.L.2
  • 106
    • 0025587109
    • The SUMMIT speech recognition system: Phonological modeling and lexical access
    • Zue, V., Glass, J., Goodine, D., Phillips, M. and Seneff, S. (1990), The SUMMIT speech recognition system: Phonological modeling and lexical access, in 'Proceedings of the ICASSP', Vol. 1, pp. 49-52.
    • (1990) Proceedings of the ICASSP , vol.1 , pp. 49-52
    • Zue, V.1    Glass, J.2    Goodine, D.3    Phillips, M.4    Seneff, S.5


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.