Volume 78, 2010, Pages 71-150

Features for Content-Based Audio Retrieval

Author keywords

Audio Retrieval; Content Based Audio Features; Content Based Retrieval; Feature Extraction; Taxonomy


EID: 85027695706     PISSN: 00652458     EISSN: None     Source Type: Book Series    
DOI: 10.1016/S0065-2458(10)78003-7     Document Type: Chapter
Times cited : (179)

References (204)
  • 1
    • 0004244302
    • Fundamentals of Speech Recognition
    • Prentice-Hall Upper Saddle River, NJ
    • Rabiner, L., Juang, B., Fundamentals of Speech Recognition. 1993, Prentice-Hall, Upper Saddle River, NJ.
    • (1993)
    • Rabiner, L.1    Juang, B.2
  • 2
    • 0037237084
    • Music information retrieval
    • (Chapter 7)
    • Downie, J.S., Music information retrieval. Annu. Rev. Inform. Sci. Technol. 37 (2003), 295–340 (Chapter 7).
    • (2003) Annu. Rev. Inform. Sci. Technol. , vol.37 , pp. 295-340
    • Downie, J.S.1
  • 3
    • 84892200847
    • Signal Processing Methods for Music Transcription
    • Springer New York, NY
    • Klapuri, A., Davy, M., Signal Processing Methods for Music Transcription. 2006, Springer, New York, NY.
    • (2006)
    • Klapuri, A.1    Davy, M.2
  • 4
    • 0042830801
    • Comparison of techniques for environmental sound recognition
    • Cowling, M., Sitte, R., Comparison of techniques for environmental sound recognition. Pattern Recogn. Lett. 24:15 (November 2003), 2895–2907.
    • (2003) Pattern Recogn. Lett. , vol.24 , Issue.15 , pp. 2895-2907
    • Cowling, M.1    Sitte, R.2
  • 5
    • 0016572913
    • A vector space model for automatic indexing
    • Salton, G., Wong, A., Yang, C.S., A vector space model for automatic indexing. Commun. ACM, 18(11), 1975, 613–620.
    • (1975) Commun. ACM , vol.18 , Issue.11 , pp. 613-620
    • Salton, G.1    Wong, A.2    Yang, C.S.3
  • 6
    • 0003947444
    • Principles of Visual Information Retrieval
    • Springer London
    • Lew, M.S., Principles of Visual Information Retrieval. January 2001, Springer, London.
    • (2001)
    • Lew, M.S.1
  • 7
    • 85097913301
    • International Conference on Music Information Retrieval last visited: September 2009
    • ISMIR. International Conference on Music Information Retrieval, 2004 http://ismir2004.ismir.net last visited: September 2009.
    • (2004)
  • 8
    • 84872166874
    • Music Information Retrieval Evaluation Exchange
    • last visited: September 2009
    • MIREX. Music Information Retrieval Evaluation Exchange. 2007 http://www.music-ir.org/mirexwiki last visited: September 2009.
    • (2007)
  • 9
    • 85097889292
    • Bioacoustical Terminology, ANSI S3.20-1995 (R2003)
    • American National Standards Institute New York, NY
    • ANSI. Bioacoustical Terminology, ANSI S3.20-1995 (R2003). 1995, American National Standards Institute, New York, NY.
    • (1995)
  • 10
    • 84952660190
    • Zur Tonhöhenwahrnehmung von Klängen. I. Psychoakustische Grundlagen
    • Terhardt, E., Zur Tonhöhenwahrnehmung von Klängen. I. Psychoakustische Grundlagen. Acustica 26 (1972), 173–186.
    • (1972) Acustica , vol.26 , pp. 173-186
    • Terhardt, E.1
  • 11
    • 0004236521
    • Psychoacoustics: Facts and Models
    • second ed. Springer Berlin
    • Zwicker, E., Fastl, H., Psychoacoustics: Facts and Models. second ed., 1999, Springer, Berlin.
    • (1999)
    • Zwicker, E.1    Fastl, H.2
  • 12
    • 0035790891
    • Musical instrument timbres classification with spectral features
    • Proceedings of the IEEE Workshop on Multimedia Signal Processing, Cannes, France IEEE Piscataway, NJ
    • Agostini, G., Longari, M., Pollastri, E., Musical instrument timbres classification with spectral features., Proceedings of the IEEE Workshop on Multimedia Signal Processing, Cannes, France, October 2001, IEEE, Piscataway, NJ, 97–102.
    • (2001) , pp. 97-102
    • Agostini, G.1    Longari, M.2    Pollastri, E.3
  • 13
    • 34547257372
    • Perceptual distance in timbre space
    • Proceedings of 11th Meeting of the International Conference on Auditory Display, Limerick, Ireland
    • Terasawa, H., Slaney, M., Berger, J., Perceptual distance in timbre space., Proceedings of 11th Meeting of the International Conference on Auditory Display, Limerick, Ireland, July 2005, 61–68.
    • (2005) , pp. 61-68
    • Terasawa, H.1    Slaney, M.2    Berger, J.3
  • 14
    • 0001019347
    • The relation of pitch to intensity
    • Stevens, S.S., The relation of pitch to intensity. J. Acoust. Soc. Am. 6:3 (1935), 150–154.
    • (1935) J. Acoust. Soc. Am. , vol.6 , Issue.3 , pp. 150-154
    • Stevens, S.S.1
  • 15
    • 0036497405
    • Problems of music information retrieval in the real world
    • Byrd, D., Crawford, T., Problems of music information retrieval in the real world. Inform. Process. Manage. 38:2 (March 2002), 249–272.
    • (2002) Inform. Process. Manage. , vol.38 , Issue.2 , pp. 249-272
    • Byrd, D.1    Crawford, T.2
  • 16
    • 0038136880
    • Survey of compressed-domain features used in audio–visual indexing and analysis
    • Wang, A., Divakaran, A., Vetro, A., Chang, S.F., Sun, H., Survey of compressed-domain features used in audio–visual indexing and analysis. J. Vis. Commun. Image Represent. 14:2 (June 2003), 150–183.
    • (2003) J. Vis. Commun. Image Represent. , vol.14 , Issue.2 , pp. 150-183
    • Wang, A.1    Divakaran, A.2    Vetro, A.3    Chang, S.F.4    Sun, H.5
  • 17
    • 85032751556
    • Multimedia content analysis using both audio and visual clues
    • Wang, Y., Liu, Z., Huang, J.C., Multimedia content analysis using both audio and visual clues. IEEE Signal Process. Mag. 17:6 (November 2000), 12–36.
    • (2000) IEEE Signal Process. Mag. , vol.17 , Issue.6 , pp. 12-36
    • Wang, Y.1    Liu, Z.2    Huang, J.C.3
  • 18
    • 2942707401
    • Manipulation, Analysis and Retrieval Systems for Audio Signals
    • Ph.D. Thesis. Computer Science Department, Princeton University
    • Tzanetakis, G., Manipulation, Analysis and Retrieval Systems for Audio Signals. 2002 Ph.D. Thesis. Computer Science Department, Princeton University.
    • (2002)
    • Tzanetakis, G.1
  • 19
    • 0035786658
    • Feature selection for automatic classification of musical instrument sounds
    • JCDL'01: Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries ACM Press New York, NY
    • Liu, M., Wan, C., Feature selection for automatic classification of musical instrument sounds., JCDL'01: Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries, 2001, ACM Press, New York, NY, 247–248.
    • (2001) , pp. 247-248
    • Liu, M.1    Wan, C.2
  • 20
    • 0003801149
    • Content-Based Audio Classification and Retrieval for Audiovisual Data Parsing
    • Kluwer Academic Publishers Boston, MA
    • Zhang, T., Kuo, C.C.J., Content-Based Audio Classification and Retrieval for Audiovisual Data Parsing. 2001, Kluwer Academic Publishers, Boston, MA.
    • (2001)
    • Zhang, T.1    Kuo, C.C.J.2
  • 21
  • 22
    • 0141703354
    • Robust speech recognition using features based on zero crossings with peak amplitudes
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Hong Kong, China IEEE Piscataway, NJ
    • Gajic, B., Paliwal, K.K., Robust speech recognition using features based on zero crossings with peak amplitudes., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Hong Kong, China, April 2003, IEEE, Piscataway, NJ, 64–67.
    • (2003) , pp. 64-67
    • Gajic, B.1    Paliwal, K.K.2
  • 24
    • 4744344335
    • Modulation-scale analysis for content identification
    • Sukittanon, S., Atlas, L.E., Pitton, W.J., Modulation-scale analysis for content identification. IEEE Trans. Signal Process. 52:10 (2004), 3023–3035.
    • (2004) IEEE Trans. Signal Process. , vol.52 , Issue.10 , pp. 3023-3035
    • Sukittanon, S.1    Atlas, L.E.2    Pitton, W.J.3
  • 25
    • 15544385732
    • Automatic feature extraction for classifying audio data
    • Mierswa, I., Morik, K., Automatic feature extraction for classifying audio data. Mach. Learn. J. 58:2–3 (February 2005), 127–149.
    • (2005) Mach. Learn. J. , vol.58 , Issue.2-3 , pp. 127-149
    • Mierswa, I.1    Morik, K.2
  • 26
    • 0002161311
    • The quefrency alanysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum, and saphe-cracking
    • M. Rosenblatt Proceedings of the Symposium on Time Series Analysis Wiley New York, NY
    • Bogert, B., Healy, M., Tukey, J., The quefrency alanysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum, and saphe-cracking. Rosenblatt, M., (eds.), Proceedings of the Symposium on Time Series Analysis, 1963, Wiley, New York, NY, 209–243.
    • (1963) , pp. 209-243
    • Bogert, B.1    Healy, M.2    Tukey, J.3
  • 27
    • 0030711174
    • The modulation spectrogram: in pursuit of an invariant representation of speech
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing IEEE Piscataway, NJ vol. 3
    • Greenberg, S., Kingsbury, B.E.D., The modulation spectrogram: in pursuit of an invariant representation of speech., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, April 1997, IEEE, Piscataway, NJ, 1647–1650 vol. 3.
    • (1997) , pp. 1647-1650
    • Greenberg, S.1    Kingsbury, B.E.D.2
  • 28
    • 0038376759
    • Content-based organization and visualization of music archives
    • Proceedings of the 10th ACM International Conference on Multimedia ACM Press New York, NY
    • Pampalk, E., Rauber, A., Merkl, D., Content-based organization and visualization of music archives., Proceedings of the 10th ACM International Conference on Multimedia, 2002, ACM Press, New York, NY, 570–579.
    • (2002) , pp. 570-579
    • Pampalk, E.1    Rauber, A.2    Merkl, D.3
  • 29
    • 0032136330
    • Robust speech recognition using the modulation spectrogram
    • Kingsbury, B., Morgan, N., Greenberg, S., Robust speech recognition using the modulation spectrogram. Speech Commun. 25 (1998), 117–132.
    • (1998) Speech Commun. , vol.25 , pp. 117-132
    • Kingsbury, B.1    Morgan, N.2    Greenberg, S.3
  • 30
    • 0003782493
    • Analysis of Observed Chaotic Data
    • Springer New York, NY
    • Abarbanel, H., Analysis of Observed Chaotic Data. 1996, Springer, New York, NY.
    • (1996)
    • Abarbanel, H.1
  • 31
    • 0141591552
    • Speech recognition using reconstructed phase space features
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Hong Kong, China IEEE Piscataway, NJ
    • Lindgren, A.C., Johnson, M.T., Povinelli, R.J., Speech recognition using reconstructed phase space features., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Hong Kong, China, April 2003, IEEE, Piscataway, NJ, 60–63.
    • (2003) , pp. 60-63
    • Lindgren, A.C.1    Johnson, M.T.2    Povinelli, R.J.3
  • 32
    • 84953656445
    • Subdivision of the audible frequency range into critical bands (Frequenzgruppen)
    • Zwicker, E., Subdivision of the audible frequency range into critical bands (Frequenzgruppen). J. Acoust. Soc. Am., 33, 1961, 248.
    • (1961) J. Acoust. Soc. Am. , vol.33 , pp. 248
    • Zwicker, E.1
  • 33
    • 0025294553
    • Auditory filter shapes at low center frequencies
    • Moore, C.J., Peters, R.W., Glasberg, B.R., Auditory filter shapes at low center frequencies. J. Acoust. Soc. Am. 88:1 (1990), 132–140.
    • (1990) J. Acoust. Soc. Am. , vol.88 , Issue.1 , pp. 132-140
    • Moore, C.J.1    Peters, R.W.2    Glasberg, B.R.3
  • 34
    • 84955035459
    • A scale for the measurement of the psychological magnitude pitch
    • Stevens, S.S., Volkmann, J., Newman, E.B., A scale for the measurement of the psychological magnitude pitch. J. Acoust. Soc. Am. 8:3 (January 1937), 185–190.
    • (1937) J. Acoust. Soc. Am. , vol.8 , Issue.3 , pp. 185-190
    • Stevens, S.S.1    Volkmann, J.2    Newman, E.B.3
  • 35
    • 0003789815
    • An Introduction to the Psychology of Hearing
    • fifth ed. Academic Press Amsterdam
    • Moore, B.C.J., An Introduction to the Psychology of Hearing. fifth ed., 2004, Academic Press, Amsterdam.
    • (2004)
    • Moore, B.C.J.1
  • 36
    • 0020816083
    • Suggested formulae for calculating auditory-filter bandwidths and excitation patterns
    • Moore, B.C.J., Glasberg, B.R., Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J. Acoust. Soc. Am. 74:3 (September 1983), 750–753.
    • (1983) J. Acoust. Soc. Am. , vol.74 , Issue.3 , pp. 750-753
    • Moore, B.C.J.1    Glasberg, B.R.2
  • 37
    • 0000614795
    • The auditory masking of one pure tone by another and its probable relation to the dynamics of the inner ear
    • Wegel, R.L., Lane, C.E., The auditory masking of one pure tone by another and its probable relation to the dynamics of the inner ear. Phys. Rev. 23 (February 1924), 266–285.
    • (1924) Phys. Rev. , vol.23 , pp. 266-285
    • Wegel, R.L.1    Lane, C.E.2
  • 38
    • 84955013022
    • Loudness, its definition, measurement and calculation
    • Fletcher, H., Munson, W.A., Loudness, its definition, measurement and calculation. J. Acoust. Soc. Am. 5:2 (October 1933), 82–108.
    • (1933) J. Acoust. Soc. Am. , vol.5 , Issue.2 , pp. 82-108
    • Fletcher, H.1    Munson, W.A.2
  • 39
    • 85097862309
    • International Standard 226, Acoustics—Normal Equal-Loudness Level Contours
    • International Organization for Standardization (ISO). International Standard 226, Acoustics—Normal Equal-Loudness Level Contours. 1987.
    • (1987)
  • 40
    • 0032657125
    • The importance of perceptive adaptation of sound features for audio content processing
    • Proceedings SPIE Conferences, Electronic Imaging 1999, Storage and Retrieval for Image and Video Databases VII, San Jose, CA
    • Pfeiffer, S., The importance of perceptive adaptation of sound features for audio content processing., Proceedings SPIE Conferences, Electronic Imaging 1999, Storage and Retrieval for Image and Video Databases VII, San Jose, CA, January 1999, 328–337.
    • (1999) , pp. 328-337
    • Pfeiffer, S.1
  • 41
    • 34447546202
    • On the psychophysical law
    • Stevens, S.S., On the psychophysical law. Psychol. Rev. 64:3 (May 1957), 153–181.
    • (1957) Psychol. Rev. , vol.64 , Issue.3 , pp. 153-181
    • Stevens, S.S.1
  • 42
    • 4243152700
    • Content-based identification of audio material using MPEG-7 low level description
    • Proceedings of the International Symposium of Music Information Retrieval
    • Allamanche, E., Herre, J., Helmuth, O., Fröba, B., Kasten, T., Cremer, M., Content-based identification of audio material using MPEG-7 low level description., Proceedings of the International Symposium of Music Information Retrieval, 2001.
    • (2001)
    • Allamanche, E.1    Herre, J.2    Helmuth, O.3    Fröba, B.4    Kasten, T.5    Cremer, M.6
  • 43
    • 0038444621
    • Distortion discriminant analysis for audio fingerprinting
    • Burges, C.J.C., Platt, J.C., Jana, S., Distortion discriminant analysis for audio fingerprinting. IEEE Trans. Speech Audio Process. 11:3 (May 2003), 165–174.
    • (2003) IEEE Trans. Speech Audio Process. , vol.11 , Issue.3 , pp. 165-174
    • Burges, C.J.C.1    Platt, J.C.2    Jana, S.3
  • 44
    • 85009108066
    • MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition
    • Proceedings of the International Conference on Spoken Language Processing
    • Shannon, B.J., Paliwal, K.K., MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition., Proceedings of the International Conference on Spoken Language Processing, October 2004, 129–132.
    • (2004) , pp. 129-132
    • Shannon, B.J.1    Paliwal, K.K.2
  • 45
    • 22544476848
    • Combination of autocorrelation-based features and projection measure technique for speaker identification
    • Yuo, K.H., Hwang, T.H., Wang, H.C., Combination of autocorrelation-based features and projection measure technique for speaker identification. IEEE Trans. Speech Audio Process. 13:4 (July 2005), 565–574.
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.4 , pp. 565-574
    • Yuo, K.H.1    Hwang, T.H.2    Wang, H.C.3
  • 46
    • 84873540864
    • Audio matching via chroma-based statistical features
    • Proceedings of the 6th International Conference on Music Information Retrieval, London
    • Müller, M., Kurth, F., Clausen, M., Audio matching via chroma-based statistical features., Proceedings of the 6th International Conference on Music Information Retrieval, London, September 2005, 288–295.
    • (2005) , pp. 288-295
    • Müller, M.1    Kurth, F.2    Clausen, M.3
  • 47
    • 1542439119
    • A comparative study on content-based music genre classification
    • SIGIR'03: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Toronto, ON, Canada ACM Press New York, NY
    • Li, T., Ogihara, M., Li, Q., A comparative study on content-based music genre classification., SIGIR'03: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Toronto, ON, Canada, 2003, ACM Press, New York, NY, 282–289.
    • (2003) , pp. 282-289
    • Li, T.1    Ogihara, M.2    Li, Q.3
  • 48
    • 33644626634
    • A large set of audio features for sound description (similarity and classification) in the CUIDADO project
    • Peeters, G., A large set of audio features for sound description (similarity and classification) in the CUIDADO project. Technical Report, 2004.
    • (2004) Technical Report
    • Peeters, G.1
  • 49
    • 0037708486
    • Content-based audio classification and segmentation by using support vector machines
    • Lu, L., Zhang, H.J., Li, S.Z., Content-based audio classification and segmentation by using support vector machines. Multimedia Syst. 8:6 (April 2003), 482–492.
    • (2003) Multimedia Syst. , vol.8 , Issue.6 , pp. 482-492
    • Lu, L.1    Zhang, H.J.2    Li, S.Z.3
  • 50
    • 84873543378
    • Inferring efficient hierarchical taxonomies for MIR tasks, application to musical instruments
    • Proceedings of the International Conference on Music Information Retrieval
    • Essid, S., Richard, G., David, B., Inferring efficient hierarchical taxonomies for MIR tasks, application to musical instruments., Proceedings of the International Conference on Music Information Retrieval, September 2005.
    • (2005)
    • Essid, S.1    Richard, G.2    David, B.3
  • 51
    • 0347387977
    • An experimental automatic word recognition system
    • Joint Speech Research Unit Ruislip, England
    • Bridle, J.S., Brown, M.D., An experimental automatic word recognition system. JSRU Report No. 1003, 1974, Joint Speech Research Unit, Ruislip, England.
    • (1974) JSRU Report No. 1003
    • Bridle, J.S.1    Brown, M.D.2
  • 52
    • 0033705976
    • Speech/music discrimination for multimedia applications
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Istanbul, Turkey IEEE Piscataway, NJ vol. 6
    • El-Maleh, K., Klein, M., Petrucci, G., Kabal, P., Speech/music discrimination for multimedia applications., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Istanbul, Turkey, June 2000, IEEE, Piscataway, NJ, 2445–2448 vol. 6.
    • (2000) , pp. 2445-2448
    • El-Maleh, K.1    Klein, M.2    Petrucci, G.3    Kabal, P.4
  • 53
    • 0022806994
    • Spectral analysis and discrimination by zero-crossings
    • Kedem, B., Spectral analysis and discrimination by zero-crossings. IEEE Proc. 74 (1986), 1477–1493.
    • (1986) IEEE Proc. , vol.74 , pp. 1477-1493
    • Kedem, B.1
  • 54
    • 13144306118
    • A speech/music discriminator based on RMS and zero-crossings
    • Panagiotakis, C., Tziritas, G., A speech/music discriminator based on RMS and zero-crossings. IEEE Trans. Multimedia 7:1 (February 2005), 155–166.
    • (2005) IEEE Trans. Multimedia , vol.7 , Issue.1 , pp. 155-166
    • Panagiotakis, C.1    Tziritas, G.2
  • 55
    • 0029765670
    • Real-time discrimination of broadcast speech/music
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, Atlanta, GA IEEE Piscataway, NJ
    • Saunders, J., Real-time discrimination of broadcast speech/music., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, Atlanta, GA, May 1996, IEEE, Piscataway, NJ, 993–996.
    • (1996) , pp. 993-996
    • Saunders, J.1
  • 56
    • 33646759744
    • Features for audio and music classification
    • Proceedings of the International Conference on Music Information Retrieval
    • McKinney, M.F., Breebaart, J., Features for audio and music classification., Proceedings of the International Conference on Music Information Retrieval, October 2003.
    • (2003)
    • McKinney, M.F.1    Breebaart, J.2
  • 57
    • 33646717803
    • Fusion of audio and motion information on HMM-based highlight extraction for baseball games
    • Cheng, C.C., Hsu, C.T., Fusion of audio and motion information on HMM-based highlight extraction for baseball games. IEEE Trans. Multimedia 8:3 (June 2006), 585–599.
    • (2006) IEEE Trans. Multimedia , vol.8 , Issue.3 , pp. 585-599
    • Cheng, C.C.1    Hsu, C.T.2
  • 58
    • 11244258301
    • Emotion recognition using acoustic features and textual content
    • Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, Taipei, Taiwan IEEE Piscataway, NJ
    • Chuang, Z.J., Wu, C.H., Emotion recognition using acoustic features and textual content., Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, Taipei, Taiwan, June 2004, IEEE, Piscataway, NJ, 53–56.
    • (2004) , pp. 53-56
    • Chuang, Z.J.1    Wu, C.H.2
  • 59
    • 67650108322
    • Automatic singer identification
    • Proceedings of the IEEE International Conference on Multimedia and Expo IEEE Piscataway, NJ vol. 1
    • Zhang, T., Automatic singer identification., Proceedings of the IEEE International Conference on Multimedia and Expo, July 2003, IEEE, Piscataway, NJ, 33–36 vol. 1.
    • (2003) , pp. 33-36
    • Zhang, T.1
  • 60
    • 33846110275
    • A flexible framework for key audio effects detection and auditory context inference
    • Cai, R., Lu, L., Hanjalic, A., Zhang, H.J., Cai, L.H., A flexible framework for key audio effects detection and auditory context inference. IEEE Trans. Speech Audio Process. 14 (May 2006), 1026–1039.
    • (2006) IEEE Trans. Speech Audio Process. , vol.14 , pp. 1026-1039
    • Cai, R.1    Lu, L.2    Hanjalic, A.3    Zhang, H.J.4    Cai, L.H.5
  • 61
    • 0029765806
    • Feature extraction based on zero-crossings with peak amplitudes for robust speech recognition in noisy environments
    • Proceedings of the International Conference on Acoustics, Speech, and Signal Processing IEEE Piscataway, NJ vol. 1
    • Kim, D.-S., Jeong, J.-H., Kim, J.-W., Lee, S.-Y., Feature extraction based on zero-crossings with peak amplitudes for robust speech recognition in noisy environments., Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, October 1996, IEEE, Piscataway, NJ, 61–64 vol. 1.
    • (1996) , pp. 61-64
    • Kim, D.-S.1    Jeong, J.-H.2    Kim, J.-W.3    Lee, S.-Y.4
  • 62
    • 0032785783
    • Auditory processing of speech signals for robust speech recognition in real-world noisy environments
    • Kim, D.S., Lee, S.Y., Kil, R.M., Auditory processing of speech signals for robust speech recognition in real-world noisy environments. IEEE Trans. Speech Audio Process. 7:1 (January 1999), 55–69.
    • (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.1 , pp. 55-69
    • Kim, D.S.1    Lee, S.Y.2    Kil, R.M.3
  • 63
    • 85009118262
    • A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR
    • Proceedings of the International Conference on Spoken Language Processing, Jeju Island, Korea
    • Ghulam, M., Fukuda, T., Horikawa, J., Nitta, T., A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR., Proceedings of the International Conference on Spoken Language Processing, Jeju Island, Korea, October 2004, 133–136.
    • (2004) , pp. 133-136
    • Ghulam, M.1    Fukuda, T.2    Horikawa, J.3    Nitta, T.4
  • 64
    • 33646758174
    • Pitch-synchronous ZCPA (PS-ZCPA)-based feature extraction with auditory masking
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Philadelphia, PA IEEE Piscataway, NJ
    • Ghulam, M., Fukuda, T., Horikawa, J., Nitta, T., Pitch-synchronous ZCPA (PS-ZCPA)-based feature extraction with auditory masking., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Philadelphia, PA, March 2005, IEEE, Piscataway, NJ, 517–520.
    • (2005) , pp. 517-520
    • Ghulam, M.1    Fukuda, T.2    Horikawa, J.3    Nitta, T.4
  • 65
    • 85097901157
    • Information Technology—Multimedia Content Description Interface—Part 4: Audio (Number 15938), ISO/IEC, Moving Pictures Expert Group
    • first ed.
    • ISO-IEC. Information Technology—Multimedia Content Description Interface—Part 4: Audio (Number 15938), ISO/IEC, Moving Pictures Expert Group. first ed., 2002.
    • (2002)
  • 66
    • 34047265156
    • Discrimination and retrieval of animal sounds
    • Proceedings of IEEE Multimedia Modelling Conference, Beijing, China IEEE Piscataway, NJ
    • Mitrovic, D., Zeppelzauer, M., Breiteneder, C., Discrimination and retrieval of animal sounds., Proceedings of IEEE Multimedia Modelling Conference, Beijing, China, January 2006, IEEE, Piscataway, NJ, 339–343.
    • (2006) , pp. 339-343
    • Mitrovic, D.1    Zeppelzauer, M.2    Breiteneder, C.3
  • 67
    • 33845633986
    • Toward semantic indexing and retrieval using hierarchical audio models
    • Chu, W.T., Cheng, W.H., Hsu, J.Y.J., Wu, J.L., Toward semantic indexing and retrieval using hierarchical audio models. Multimedia Syst. 10:6 (May 2005), 570–583.
    • (2005) Multimedia Syst. , vol.10 , Issue.6 , pp. 570-583
    • Chu, W.T.1    Cheng, W.H.2    Hsu, J.Y.J.3    Wu, J.L.4
  • 68
    • 33847295200
    • SVM-based audio scene classification
    • Proceedings of the IEEE International Conference on Natural Language Processing and Knowledge Engineering, Wuhan, China IEEE Piscataway, NJ
    • Jiang, H., Bai, J., Zhang, S., Xu, B., SVM-based audio scene classification., Proceedings of the IEEE International Conference on Natural Language Processing and Knowledge Engineering, Wuhan, China, October 2005, IEEE, Piscataway, NJ, 131–136.
    • (2005) , pp. 131-136
    • Jiang, H.1    Bai, J.2    Zhang, S.3    Xu, B.4
  • 69
    • 0030242072
    • Content-based classification, search, and retrieval of audio
    • Wold, T., Blum, D., Wheaton, J., Content-based classification, search, and retrieval of audio. IEEE Multimedia, 3(3), 1996, 27–36.
    • (1996) IEEE Multimedia , vol.3 , Issue.3 , pp. 27-36
    • Wold, T.1    Blum, D.2    Wheaton, J.3
  • 70
    • 0032181880
    • Audio feature extraction and analysis for scene segmentation and classification
    • Liu, Z., Wang, Y., Chen, T., Audio feature extraction and analysis for scene segmentation and classification. J. VLSI Signal Process. 20:1–2 (October 1998), 61–79.
    • (1998) J. VLSI Signal Process. , vol.20 , Issue.1-2 , pp. 61-79
    • Liu, Z.1    Wang, Y.2    Chen, T.3
  • 71
    • 0003425258
    • Digital Processing of Speech Signals
    • Prentice-Hall Englewood Cliffs, NJ
    • Rabiner, L., Schafer, R., Digital Processing of Speech Signals. 1978, Prentice-Hall, Englewood Cliffs, NJ.
    • (1978)
    • Rabiner, L.1    Schafer, R.2
  • 72
    • 0002884330
    • The government standard linear predictive coding algorithm: LPC-10
    • Tremain, T., The government standard linear predictive coding algorithm: LPC-10. Speech Technol. Mag. 1 (April 1982), 40–49.
    • (1982) Speech Technol. Mag. , vol.1 , pp. 40-49
    • Tremain, T.1
  • 73
    • 20444491279
    • Automatic classification of speech and music using neural networks
    • MMDB'04: Proceedings of the 2nd ACM International Workshop on Multimedia Databases ACM Press New York, NY
    • Khan, M.K.S., Al-Khatib, W.G., Moinuddin, M., Automatic classification of speech and music using neural networks., MMDB'04: Proceedings of the 2nd ACM International Workshop on Multimedia Databases, 2004, ACM Press, New York, NY, 94–99.
    • (2004) , pp. 94-99
    • Khan, M.K.S.1    Al-Khatib, W.G.2    Moinuddin, M.3
  • 74
    • 33746879922
    • Machine-learning based classification of speech and music
    • Khan, M.K.S., Al-Khatib, W.G., Machine-learning based classification of speech and music. Multimedia Syst. 12:1 (August 2006), 55–67.
    • (2006) Multimedia Syst. , vol.12 , Issue.1 , pp. 55-67
    • Khan, M.K.S.1    Al-Khatib, W.G.2
  • 75
    • 0034867981
    • A study on content-based classification and retrieval of audio database
    • Proceedings of the International Symposium on Database Engineering and Applications, Grenoble, France IEEE Computer Society Washington, DC
    • Liu, M., Wan, C., A study on content-based classification and retrieval of audio database., Proceedings of the International Symposium on Database Engineering and Applications, Grenoble, France, July 2001, IEEE Computer Society, Washington, DC, 339–345.
    • (2001) , pp. 339-345
    • Liu, M.1    Wan, C.2
  • 76
    • 18744375187
    • Automatic music classification and summarization
    • Xu, C., Maddage, N.C., Shao, X., Automatic music classification and summarization. IEEE Trans. Speech Audio Process. 13:3 (May 2005), 441–450.
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.3 , pp. 441-450
    • Xu, C.1    Maddage, N.C.2    Shao, X.3
  • 77
    • 0031233424
    • Speaker recognition: A tutorial
    • Campbell, J.P., Speaker recognition: A tutorial. Proc. IEEE 85:9 (September 1997), 1437–1462.
    • (1997) Proc. IEEE , vol.85 , Issue.9 , pp. 1437-1462
    • Campbell, J.P.1
  • 78
    • 4544247190
    • Music instrument recognition: from isolated notes to solo phrases
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, Montreal, QC, Canada IEEE Piscataway, NJ
    • Krishna, A.G., Sreenivas, T.V., Music instrument recognition: from isolated notes to solo phrases., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, Montreal, QC, Canada, May 2004, IEEE, Piscataway, NJ, 265–268.
    • (2004) , pp. 265-268
    • Krishna, A.G.1    Sreenivas, T.V.2
  • 79
    • 0032023735
    • Statistical properties of line spectrum pairs
    • Tourneret, J.Y., Statistical properties of line spectrum pairs. Signal Process. 65:2 (March 1998), 239–255.
    • (1998) Signal Process. , vol.65 , Issue.2 , pp. 239-255
    • Tourneret, J.Y.1
  • 80
    • 17444365032
    • Unsupervised speaker segmentation and tracking in real-time audio content analysis
    • Lu, L., Zhang, H.J., Unsupervised speaker segmentation and tracking in real-time audio content analysis. Multimedia Syst. 10:4 (April 2005), 332–343.
    • (2005) Multimedia Syst. , vol.10 , Issue.4 , pp. 332-343
    • Lu, L.1    Zhang, H.J.2
  • 81
    • 13444256090
    • Music artist style identification by semi-supervised learning from both lyrics and content
    • Proceedings of the 12th Annual ACM International Conference on Multimedia ACM Press New York, NY
    • Li, T., Ogihara, M., Music artist style identification by semi-supervised learning from both lyrics and content., Proceedings of the 12th Annual ACM International Conference on Multimedia, 2004, ACM Press, New York, NY, 364–367.
    • (2004) , pp. 364-367
    • Li, T.1    Ogihara, M.2
  • 82
    • 33646739998
    • Toward intelligent music information retrieval
    • Li, T., Ogihara, M., Toward intelligent music information retrieval. IEEE Trans. Multimedia 8:3 (June 2006), 564–574.
    • (2006) IEEE Trans. Multimedia , vol.8 , Issue.3 , pp. 564-574
    • Li, T.1    Ogihara, M.2
  • 83
    • 16244420091
    • Multigroup classification of audio signals using time-frequency parameters
    • Umapathy, K., Krishnan, S., Jimaa, S., Multigroup classification of audio signals using time-frequency parameters. IEEE Trans. Multimedia 7:2 (April 2005), 308–315.
    • (2005) IEEE Trans. Multimedia , vol.7 , Issue.2 , pp. 308-315
    • Umapathy, K.1    Krishnan, S.2    Jimaa, S.3
  • 84
    • 0033279679
    • Towards robust features for classifying audio in the CueVideo system
    • Proceedings of the 7th ACM International Conference on Multimedia (Part 1) ACM Press New York, NY
    • Srinivasan, S., Petkovic, D., Ponceleon, D., Towards robust features for classifying audio in the CueVideo system., Proceedings of the 7th ACM International Conference on Multimedia (Part 1), 1999, ACM Press, New York, NY, 393–400.
    • (1999) , pp. 393-400
    • Srinivasan, S.1    Petkovic, D.2    Ponceleon, D.3
  • 86
    • 0030648077
    • Construction and evaluation of a robust multi-feature speech/music discriminator
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, Munich, Germany
    • Scheirer, E., Slaney, M., Construction and evaluation of a robust multi-feature speech/music discriminator., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, Munich, Germany, April 1997, 1331–1334.
    • (1997) , pp. 1331-1334
    • Scheirer, E.1    Slaney, M.2
  • 87
    • 0034792569
    • A robust audio classification and segmentation method
    • Proceedings of the 9th ACM International Conference on Multimedia, Ottawa, ON, Canada ACM Press New York, NY
    • Lu, L., Jiang, H., Zhang, H.J., A robust audio classification and segmentation method., Proceedings of the 9th ACM International Conference on Multimedia, Ottawa, ON, Canada, 2001, ACM Press, New York, NY, 203–211.
    • (2001) , pp. 203-211
    • Lu, L.1    Jiang, H.2    Zhang, H.J.3
  • 88
    • 0036648502
    • Musical genre classification of audio signals
    • Tzanetakis, G., Cook, P., Musical genre classification of audio signals. IEEE Trans. Speech Audio Process. 10:5 (July 2002), 293–302.
    • (2002) IEEE Trans. Speech Audio Process. , vol.10 , Issue.5 , pp. 293-302
    • Tzanetakis, G.1    Cook, P.2
  • 89
    • 33746837319
    • Audio-based gender identification using bootstrapping
    • Proceedings of the IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, Victoria, BC, Canada IEEE Piscataway, NJ
    • Tzanetakis, G., Audio-based gender identification using bootstrapping., Proceedings of the IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, Victoria, BC, Canada, August 2005, IEEE, Piscataway, NJ, 432–433.
    • (2005) , pp. 432-433
    • Tzanetakis, G.1
  • 90
    • 11144341364
    • An industrial strength audio search algorithm
    • Proceedings of the International Conference on Music Information Retrieval, Baltimore, MD
    • Wang, A., An industrial strength audio search algorithm., Proceedings of the International Conference on Music Information Retrieval, Baltimore, MD, October 2003, 7–13.
    • (2003) , pp. 7-13
    • Wang, A.1
  • 91
    • 33747199309
    • The Shazam music recognition service
    • Wang, A., The Shazam music recognition service. Commun. ACM 49:8 (August 2006), 44–48.
    • (2006) Commun. ACM , vol.49 , Issue.8 , pp. 44-48
    • Wang, A.1
  • 92
    • 0026923568
    • Significance of group delay functions in spectrum estimation
    • Yegnanarayana, B., Murthy, H.A., Significance of group delay functions in spectrum estimation. IEEE Trans. Signal Process. 40:9 (September 1992), 2281–2289.
    • (1992) IEEE Trans. Signal Process. , vol.40 , Issue.9 , pp. 2281-2289
    • Yegnanarayana, B.1    Murthy, H.A.2
  • 93
    • 0029375490
    • Determination of instants of significant excitation in speech using group delay function
    • Smits, R., Yegnanarayana, B., Determination of instants of significant excitation in speech using group delay function. IEEE Trans. Speech Audio Process. 3:5 (September 1995), 325–333.
    • (1995) IEEE Trans. Speech Audio Process. , vol.3 , Issue.5 , pp. 325-333
    • Smits, R.1    Yegnanarayana, B.2
  • 94
    • 14644411724
    • Beat tracking of musical performances using low-level audio features
    • Sethares, W.A., Morris, R.D., Sethares, J.C., Beat tracking of musical performances using low-level audio features. IEEE Trans. Speech Audio Process. 13:2 (March 2005), 275–285.
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.2 , pp. 275-285
    • Sethares, W.A.1    Morris, R.D.2    Sethares, J.C.3
  • 95
    • 33847165718
    • Evaluation of the modified group delay feature for isolated word recognition
    • Proceedings of the International Symposium on Signal Processing and Its Applications, vol. 2, Sydney, Australia IEEE Piscataway, NJ
    • Alsteris, L.D., Paliwal, K.K., Evaluation of the modified group delay feature for isolated word recognition., Proceedings of the International Symposium on Signal Processing and Its Applications, vol. 2, Sydney, Australia, August 2005, IEEE, Piscataway, NJ, 715–718.
    • (2005) , pp. 715-718
    • Alsteris, L.D.1    Paliwal, K.K.2
  • 96
    • 33845951461
    • Significance of joint features derived from the modified group delay function in speech processing
    • 10.1155/2007/79032
    • Hegde, R.M., Murthy, H.A., Gadde, V.R., Significance of joint features derived from the modified group delay function in speech processing. EURASIP J. Appl. Signal Process. 15:1 (January 2007), 190–202 10.1155/2007/79032.
    • (2007) EURASIP J. Appl. Signal Process. , vol.15 , Issue.1 , pp. 190-202
    • Hegde, R.M.1    Murthy, H.A.2    Gadde, V.R.3
  • 97
    • 4544293687
    • Application of the modified group delay function to speaker identification and discrimination
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Montreal, QC, Canada IEEE Piscataway, NJ
    • Hegde, R.M., Murthy, H.A., Rao, G.V.R., Application of the modified group delay function to speaker identification and discrimination., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Montreal, QC, Canada, May 2004, IEEE, Piscataway, NJ, 517–520.
    • (2004) , pp. 517-520
    • Hegde, R.M.1    Murthy, H.A.2    Rao, G.V.R.3
  • 98
    • 0141480080
    • The modified group delay function and its application to phoneme recognition
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Hong Kong, China IEEE Piscataway, NJ
    • Murthy, H.A., Gadde, V., The modified group delay function and its application to phoneme recognition., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Hong Kong, China, April 2003, IEEE, Piscataway, NJ, 68–71.
    • (2003) , pp. 68-71
    • Murthy, H.A.1    Gadde, V.2
  • 99
    • 14944343016
    • Subband-based group delay segmentation of spontaneous speech into syllable-like units
    • Nagarajan, T., Murthy, H.A., Subband-based group delay segmentation of spontaneous speech into syllable-like units. EURASIP J. Appl. Signal Process. 2004:17 (2004), 2614–2625.
    • (2004) EURASIP J. Appl. Signal Process. , vol.2004 , Issue.17 , pp. 2614-2625
    • Nagarajan, T.1    Murthy, H.A.2
  • 100
    • 0024879901
    • Formant extraction from Fourier transform phase
    • International Conference on Acoustics, Speech, and Signal Processing vol. 1
    • Murthy, H.A., Murthy, K.V.M., Yegnanarayana, B., Formant extraction from Fourier transform phase., International Conference on Acoustics, Speech, and Signal Processing, May 1989, 484–487 vol. 1.
    • (1989) , pp. 484-487
    • Murthy, H.A.1    Murthy, K.V.M.2    Yegnanarayana, B.3
  • 101
    • 62649114038
    • Factors in automatic musical genre classification of audio signals
    • Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY IEEE Piscataway, NJ
    • Li, T., Tzanetakis, G., Factors in automatic musical genre classification of audio signals., Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, October 2003, IEEE, Piscataway, NJ, 143–146.
    • (2003) , pp. 143-146
    • Li, T.1    Tzanetakis, G.2
  • 102
    • 4544304284
    • Harmonicity and dynamics-based features for audio
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, Montreal, QC, Canada IEEE Piscataway, NJ
    • Srinivasan, H., Kankanhalli, M., Harmonicity and dynamics-based features for audio., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, Montreal, QC, Canada, May 2004, IEEE, Piscataway, NJ, 321–324.
    • (2004) , pp. 321-324
    • Srinivasan, H.1    Kankanhalli, M.2
  • 103
    • 33750566007
    • Gaussian mixture modeling using short time Fourier transform features for audio fingerprinting
    • Proceedings of the IEEE International Conference on Multimedia and Expo, Amsterdam, The Netherlands IEEE Piscataway, NJ
    • Ramalingam, A., Krishnan, S., Gaussian mixture modeling using short time Fourier transform features for audio fingerprinting., Proceedings of the IEEE International Conference on Multimedia and Expo, Amsterdam, The Netherlands, July 2005, IEEE, Piscataway, NJ, 1146–1149.
    • (2005) , pp. 1146-1149
    • Ramalingam, A.1    Krishnan, S.2
  • 104
    • 54249104868
    • MPEG-7 Audio and Beyond
    • Wiley West Sussex, England
    • Kim, H., Moreau, N., Sikora, T., MPEG-7 Audio and Beyond. 2005, Wiley, West Sussex, England.
    • (2005)
    • Kim, H.1    Moreau, N.2    Sikora, T.3
  • 105
    • 36549038450
    • How similar do songs sound? Towards modeling human perception of musical similarity
    • Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY IEEE Piscataway, NJ
    • Herre, J., Allamanche, E., Ertel, C., How similar do songs sound? Towards modeling human perception of musical similarity., Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, October 2003, IEEE, Piscataway, NJ, 83–86.
    • (2003) , pp. 83-86
    • Herre, J.1    Allamanche, E.2    Ertel, C.3
  • 106
    • 85013917537
    • Audio feature extraction and analysis for scene classification
    • Proceedings of the IEEE Workshop on Multimedia Signal Processing, Princeton, NJ IEEE Piscataway, NJ
    • Liu, Z., Huang, J., Wang, Y., Chen, T., Audio feature extraction and analysis for scene classification., Proceedings of the IEEE Workshop on Multimedia Signal Processing, Princeton, NJ, June 1997, IEEE, Piscataway, NJ, 343–348.
    • (1997) , pp. 343-348
    • Liu, Z.1    Huang, J.2    Wang, Y.3    Chen, T.4
  • 107
    • 17444399233
    • Musical instrument timbres classification with spectral features
    • Agostini, G., Longari, M., Pollastri, E., Musical instrument timbres classification with spectral features. EURASIP J. Appl. Signal Process. 2003:1 (2003), 5–14.
    • (2003) EURASIP J. Appl. Signal Process. , vol.2003 , Issue.1 , pp. 5-14
    • Agostini, G.1    Longari, M.2    Pollastri, E.3
  • 108
    • 33646532381
    • Music genre classification with taxonomy
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing IEEE Piscataway, NJ vol. 5
    • Li, T., Ogihara, M., Music genre classification with taxonomy., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, March 2005, IEEE, Piscataway, NJ, 197–200 vol. 5.
    • (2005) , pp. 197-200
    • Li, T.1    Ogihara, M.2
  • 109
    • 0003579084
    • Digital Coding of Waveforms: Principles and Applications to Speech and Video
    • Prentice-Hall Englewood Cliffs, NJ
    • Jayant, N.S., Noll, P., Digital Coding of Waveforms: Principles and Applications to Speech and Video. Prentice-Hall Signal Processing Series, 1984, Prentice-Hall, Englewood Cliffs, NJ.
    • (1984) Prentice-Hall Signal Processing Series
    • Jayant, N.S.1    Noll, P.2
  • 110
    • 84948124247
    • Visualization of metre and other rhythm features
    • Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany IEEE Piscataway, NJ
    • Guaus, E., Batlle, E., Visualization of metre and other rhythm features., Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany, December 2003, IEEE, Piscataway, NJ, 282–285.
    • (2003) , pp. 282-285
    • Guaus, E.1    Batlle, E.2
  • 111
    • 11244341096
    • Audio content identification by using perceptual hashing
    • Proceedings of the IEEE International Conference on Multimedia and Expo, Taipei, Taiwan IEEE Piscataway, NJ
    • Lancini, R., Mapelli, F., Pezzano, R., Audio content identification by using perceptual hashing., Proceedings of the IEEE International Conference on Multimedia and Expo, Taipei, Taiwan, June 2004, IEEE, Piscataway, NJ, 739–742.
    • (2004) , pp. 739-742
    • Lancini, R.1    Mapelli, F.2    Pezzano, R.3
  • 112
    • 0035688755
    • Robust matching of audio signals using spectral flatness features
    • Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY IEEE Piscataway, NJ
    • Herre, J., Allamanche, E., Hellmuth, O., Robust matching of audio signals using spectral flatness features., Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, October 2001, IEEE, Piscataway, NJ, 127–130.
    • (2001) , pp. 127-130
    • Herre, J.1    Allamanche, E.2    Hellmuth, O.3
  • 113
    • 4544250678
    • Spectral entropy based feature for robust ASR
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Montreal, QC, Canada IEEE Piscataway, NJ
    • Misra, H., Ikbal, S., Bourlard, H., Hermansky, H., Spectral entropy based feature for robust ASR., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Montreal, QC, Canada, May 2004, IEEE, Piscataway, NJ, 193–196.
    • (2004) , pp. 193-196
    • Misra, H.1    Ikbal, S.2    Bourlard, H.3    Hermansky, H.4
  • 114
    • 33646801180
    • Multi-resolution spectral entropy feature for robust ASR
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Philadelphia, PA IEEE Piscataway, NJ
    • Misra, H., Ikbal, S., Sivadas, S., Bourlard, H., Multi-resolution spectral entropy feature for robust ASR., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Philadelphia, PA, March 2005, IEEE, Piscataway, NJ, 253–256.
    • (2005) , pp. 253-256
    • Misra, H.1    Ikbal, S.2    Sivadas, S.3    Bourlard, H.4
  • 116
    • 0035442477
    • Scene determination based on video and audio features
    • Pfeiffer, S., Lienhart, R., Effelsberg, W., Scene determination based on video and audio features. Multimedia Tools Appl. 15:1 (September 2001), 59–81.
    • (2001) Multimedia Tools Appl. , vol.15 , Issue.1 , pp. 59-81
    • Pfeiffer, S.1    Lienhart, R.2    Effelsberg, W.3
  • 117
    • 0003391579
    • Pitch Determination of Speech Signals: Algorithms and Devices
    • Springer Berlin
    • Hess, W., Pitch Determination of Speech Signals: Algorithms and Devices. 1983, Springer, Berlin.
    • (1983)
    • Hess, W.1
  • 118
    • 84892166605
    • A spectrally mixed excitation (SMX) vocoder with robust parameter determination
    • Proceedings of the International Conference on Acoustics, Speech and Signal Processing vol. 2
    • Cho, Y.D., Kim, M.Y., Kim, S.R., A spectrally mixed excitation (SMX) vocoder with robust parameter determination., Proceedings of the International Conference on Acoustics, Speech and Signal Processing, May 1998, 601–604 vol. 2.
    • (1998) , pp. 601-604
    • Cho, Y.D.1    Kim, M.Y.2    Kim, S.R.3
  • 119
    • 0030846123
    • A unitary model of pitch perception
    • Meddis, R., O'Mard, L., A unitary model of pitch perception. J. Acoust. Soc. Am. 102:3 (September 1997), 1811–1820.
    • (1997) J. Acoust. Soc. Am. , vol.102 , Issue.3 , pp. 1811-1820
    • Meddis, R.1    O'Mard, L.2
  • 120
    • 84953652991
    • Circularity in judgements of relative pitch
    • Shepard, R.N., Circularity in judgements of relative pitch. J. Acoust. Soc. Am. 36 (1964), 2346–2353.
    • (1964) J. Acoust. Soc. Am. , vol.36 , pp. 2346-2353
    • Shepard, R.N.1
  • 121
    • 13144282752
    • Audio thumbnailing of popular music using chroma-based representations
    • Bartsch, M.A., Wakefield, G.H., Audio thumbnailing of popular music using chroma-based representations. IEEE Trans. Multimedia 7:1 (February 2005), 96–104.
    • (2005) IEEE Trans. Multimedia , vol.7 , Issue.1 , pp. 96-104
    • Bartsch, M.A.1    Wakefield, G.H.2
  • 122
    • 0141520565
    • A chorus-section detecting method for musical audio signals
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, Hong Kong, China IEEE Piscataway, NJ
    • Goto, M., A chorus-section detecting method for musical audio signals., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, Hong Kong, China, April 2003, IEEE, Piscataway, NJ, 437–440.
    • (2003) , pp. 437-440
    • Goto, M.1
  • 123
    • 84892355208
    • Information Retrieval for Music and Motion
    • Springer Berlin
    • Müller, M., Information Retrieval for Music and Motion. 2007, Springer, Berlin.
    • (2007)
    • Müller, M.1
  • 124
    • 33646741047
    • Precise pitch profile feature extraction from musical audio for key detection
    • Zhu, Y., Kankanhalli, M.S., Precise pitch profile feature extraction from musical audio for key detection. IEEE Trans. Multimedia 8:3 (June 2006), 575–584.
    • (2006) IEEE Trans. Multimedia , vol.8 , Issue.3 , pp. 575-584
    • Zhu, Y.1    Kankanhalli, M.S.2
  • 125
    • 0034853025
    • Robust singing detection in speech/music discriminator design
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Salt Lake City, UT IEEE Piscataway, NJ
    • Chou, W., Gu, L., Robust singing detection in speech/music discriminator design., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Salt Lake City, UT, May 2001, IEEE, Piscataway, NJ, 865–868.
    • (2001) , pp. 865-868
    • Chou, W.1    Gu, L.2
  • 126
    • 84889344642
    • Instrument description in the context of MPEG-7
    • Proceedings of International Computer Music Conference, Berlin, Germany
    • Peeters, G., McAdams, S., Herrera, P., Instrument description in the context of MPEG-7., Proceedings of International Computer Music Conference, Berlin, Germany, August 2000.
    • (2000)
    • Peeters, G.1    McAdams, S.2    Herrera, P.3
  • 127
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Davis, S., Mermelstein, P., Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 28:4 (August 1980), 357–366.
    • (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.1    Mermelstein, P.2
  • 128
    • 84953653667
    • Short-time spectrum and “cepstrum” techniques for vocal-pitch detection
    • Noll, A.M., Short-time spectrum and “cepstrum” techniques for vocal-pitch detection. J. Acoust. Soc. Am. 36:2 (1964), 296–302.
    • (1964) J. Acoust. Soc. Am. , vol.36 , Issue.2 , pp. 296-302
    • Noll, A.M.1
  • 129
    • 13444270431
    • Audio keyword generation for sports video analysis
    • Proceedings of the ACM International Conference on Multimedia
    • Xu, M., Duan, L., Chia, L., Xu, C., Audio keyword generation for sports video analysis., Proceedings of the ACM International Conference on Multimedia, 2004, 758–759.
    • (2004) , pp. 758-759
    • Xu, M.1    Duan, L.2    Chia, L.3    Xu, C.4
  • 130
    • 85097903590
    • Noise robust Chinese speech recognition using feature vector normalization and higher-order cepstral coefficients
    • Proceedings of the 5th International Conference on Signal Processing
    • Wang, X., Dong, Y., Hakkinen, J., Viikki, O., Noise robust Chinese speech recognition using feature vector normalization and higher-order cepstral coefficients., Proceedings of the 5th International Conference on Signal Processing, August 2000, 738–741.
    • (2000) , pp. 738-741
    • Wang, X.1    Dong, Y.2    Hakkinen, J.3    Viikki, O.4
  • 131
    • 13444292995
    • Content-based music structure analysis with applications to music semantics understanding
    • Proceedings of the ACM International Conference on Multimedia ACM Press New York, NY
    • Maddage, N., Xu, C., Kankanhalli, M., Shao, X., Content-based music structure analysis with applications to music semantics understanding., Proceedings of the ACM International Conference on Multimedia, 2004, ACM Press, New York, NY, 112–119.
    • (2004) , pp. 112-119
    • Maddage, N.1    Xu, C.2    Kankanhalli, M.3    Shao, X.4
  • 132
    • 84868695748
    • On compensating the Mel-frequency cepstral coefficients for noisy speech recognition
    • Proceedings of the Australasian Computer Science Conference, Hobart, Australia Australian Computer Society Darlinghurst, NSW
    • Choi, E.H.C., On compensating the Mel-frequency cepstral coefficients for noisy speech recognition., Proceedings of the Australasian Computer Science Conference, Hobart, Australia, 2006, Australian Computer Society, Darlinghurst, NSW, 49–54.
    • (2006) , pp. 49-54
    • Choi, E.H.C.1
  • 133
    • 85009115888
    • An auditory system-based feature for robust speech recognition
    • Proceedings of the European Conference on Speech Communication and Technology, Aalborg, Denmark International Speech Communication Association Geneva
    • Li, Q., Soong, F.K., Siohan, O., An auditory system-based feature for robust speech recognition., Proceedings of the European Conference on Speech Communication and Technology, Aalborg, Denmark, September 2001, International Speech Communication Association, Geneva, 619–622.
    • (2001) , pp. 619-622
    • Li, Q.1    Soong, F.K.2    Siohan, O.3
  • 134
    • 0026626445
    • Auditory representations of acoustic signals
    • Yang, X., Wang, K., Shamma, S., Auditory representations of acoustic signals. IEEE Trans. Inform. Theory 38:2 (March 1992), 824–839.
    • (1992) IEEE Trans. Inform. Theory , vol.38 , Issue.2 , pp. 824-839
    • Yang, X.1    Wang, K.2    Shamma, S.3
  • 135
    • 0025041264
    • Perceptual linear predictive (PLP) analysis of speech
    • Hermansky, H., Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Am. 87:4 (April 1990), 1738–1752.
    • (1990) J. Acoust. Soc. Am. , vol.87 , Issue.4 , pp. 1738-1752
    • Hermansky, H.1
  • 136
    • 0016067897
    • Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
    • Atal, B.S., Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J. Acoust. Soc. Am. 55:6 (June 1974), 1304–1312.
    • (1974) J. Acoust. Soc. Am. , vol.55 , Issue.6 , pp. 1304-1312
    • Atal, B.S.1
  • 137
    • 0035480380
    • A speaker identification system using a model of artificial neural networks for an elevator application
    • Adami, A., Barone, D., A speaker identification system using a model of artificial neural networks for an elevator application. Inform. Sci. 138:1–4 (October 2001), 1–5.
    • (2001) Inform. Sci. , vol.138 , Issue.1-4 , pp. 1-5
    • Adami, A.1    Barone, D.2
  • 138
    • 0019939342
    • Fluctuation strength and temporal masking patterns of amplitude-modulated broadband noise
    • Fastl, H., Fluctuation strength and temporal masking patterns of amplitude-modulated broadband noise. Hear. Res. 8:1 (September 1982), 59–69.
    • (1982) Hear. Res. , vol.8 , Issue.1 , pp. 59-69
    • Fastl, H.1
  • 139
    • 84873312246
    • A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria
    • Houtgast, T., Steeneken, H.J., A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria. J. Acoust. Soc. Am. 77:3 (March 1985), 1069–1077.
    • (1985) J. Acoust. Soc. Am. , vol.77 , Issue.3 , pp. 1069-1077
    • Houtgast, T.1    Steeneken, H.J.2
  • 140
    • 85143189691
    • Modulation frequency features for audio fingerprinting
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, Orlando, FL IEEE Piscataway, NJ
    • Sukittanon, S., Atlas, L.E., Modulation frequency features for audio fingerprinting., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 2, Orlando, FL, May 2002, IEEE, Piscataway, NJ, 1773–1776.
    • (2002) , pp. 1773-1776
    • Sukittanon, S.1    Atlas, L.E.2
  • 141
    • 0034515662
    • Automatic audio segmentation using a measure of audio novelty
    • Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, New York, NY IEEE Piscataway, NJ
    • Foote, J., Automatic audio segmentation using a measure of audio novelty., Proceedings of the IEEE International Conference on Multimedia and Expo, vol. 1, New York, NY, August 2000, IEEE, Piscataway, NJ, 452–455.
    • (2000) , pp. 452-455
    • Foote, J.1
  • 142
    • 84908266591
    • The beat spectrum: a new approach to rhythm analysis
    • Proceedings of the IEEE International Conference on Multimedia and Expo IEEE Piscataway, NJ
    • Foote, J., Uchihashi, S., The beat spectrum: a new approach to rhythm analysis., Proceedings of the IEEE International Conference on Multimedia and Expo, 2001, IEEE, Piscataway, NJ, 881–884.
    • (2001) , pp. 881-884
    • Foote, J.1    Uchihashi, S.2
  • 143
    • 36549014432
    • The cyclic beat spectrum: tempo-related audio features for time-scale invariant audio identification
    • Proceedings of the 7th International Conference on Music Information Retrieval, Victoria, BC, Canada
    • Kurth, F., Gehrmann, T., Müller, M., The cyclic beat spectrum: tempo-related audio features for time-scale invariant audio identification., Proceedings of the 7th International Conference on Music Information Retrieval, Victoria, BC, Canada, October 2006, 35–40.
    • (2006) , pp. 35-40
    • Kurth, F.1    Gehrmann, T.2    Müller, M.3
  • 144
    • 0031972902
    • Tempo and beat analysis of acoustic musical signals
    • Scheirer, E., Tempo and beat analysis of acoustic musical signals. J. Acoust. Soc. Am. 103:1 (January 1998), 588–601.
    • (1998) J. Acoust. Soc. Am. , vol.103 , Issue.1 , pp. 588-601
    • Scheirer, E.1
  • 145
    • 0003616413
    • Music-Listening Systems
    • Program in Media Arts and Sciences, Ph.D. Thesis, MIT, Cambridge, MA, 2000
    • Scheirer, E., Music-Listening Systems. Ph.D. Thesis, Program in Media Arts and Sciences, MIT, Cambridge, MA, 2000.
    • (2000)
    • Scheirer, E.1
  • 146
    • 0010051198
    • Audio analysis using the discrete wavelet transform
    • Proceedings of the International Conference on Acoustics and Music: Theory and Applications, Malta
    • Tzanetakis, G., Essl, G., Cook, P., Audio analysis using the discrete wavelet transform., Proceedings of the International Conference on Acoustics and Music: Theory and Applications, Malta, September 2001.
    • (2001)
    • Tzanetakis, G.1    Essl, G.2    Cook, P.3
  • 147
    • 84890516600
    • Human perception and computer extraction of musical beat strength
    • Proceedings of the International Conference on Digital Audio Effects, Hamburg, Germany
    • Tzanetakis, G., Essl, G., Cook, P., Human perception and computer extraction of musical beat strength., Proceedings of the International Conference on Digital Audio Effects, Hamburg, Germany, September 2002, 257–261.
    • (2002) , pp. 257-261
    • Tzanetakis, G.1    Essl, G.2    Cook, P.3
  • 148
    • 33744931123
    • A wavelet packet representation of audio signals for music genre classification using different ensemble and feature selection techniques
    • Proceedings of the ACM SIGMM International Workshop on Multimedia Information Retrieval, Berkeley, CA ACM Press New York, NY
    • Grimaldi, M., Cunningham, P., Kokaram, A., A wavelet packet representation of audio signals for music genre classification using different ensemble and feature selection techniques., Proceedings of the ACM SIGMM International Workshop on Multimedia Information Retrieval, Berkeley, CA, 2003, ACM Press, New York, NY, 102–108.
    • (2003) , pp. 102-108
    • Grimaldi, M.1    Cunningham, P.2    Kokaram, A.3
  • 149
    • 0003456805
    • A Wavelet Tour of Signal Processing
    • Academic Press San Diego, CA
    • Mallat, S., A Wavelet Tour of Signal Processing. 1999, Academic Press, San Diego, CA.
    • (1999)
    • Mallat, S.1
  • 150
    • 0038535978
    • Using psycho-acoustic models and self-organizing maps to create a hierarchical structuring of music by sound similarity
    • Proceedings of the International Conference on Music Information Retrieval, Paris, France IRCAM-Centre Pompidou Paris
    • Rauber, A., Pampalk, E., Merkl, D., Using psycho-acoustic models and self-organizing maps to create a hierarchical structuring of music by sound similarity., Proceedings of the International Conference on Music Information Retrieval, Paris, France, October 2002, IRCAM-Centre Pompidou, Paris.
    • (2002)
    • Rauber, A.1    Pampalk, E.2    Merkl, D.3
  • 151
    • 2542463254
    • Audio classification based on MPEG-7 spectral basis representations
    • Kim, H., Moreau, N., Sikora, T., Audio classification based on MPEG-7 spectral basis representations. IEEE Trans. Circuits Syst. Video Technol. 14 (2004), 716–725.
    • (2004) IEEE Trans. Circuits Syst. Video Technol. , vol.14 , pp. 716-725
    • Kim, H.1    Moreau, N.2    Sikora, T.3
  • 152
    • 17444446371
    • Extracting noise-robust features from audio data
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL IEEE Piscataway, NJ
    • Burges, C.J.C., Platt, J.C., Jana, S., Extracting noise-robust features from audio data., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, May 2002, IEEE, Piscataway, NJ, 1021–1024.
    • (2002) , pp. 1021-1024
    • Burges, C.J.C.1    Platt, J.C.2    Jana, S.3
  • 153
    • 0032627304
    • A modulated complex lapped transform and its applications to audio processing
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Phoenix, AZ IEEE Piscataway, NJ
    • Malvar, H., A modulated complex lapped transform and its applications to audio processing., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Phoenix, AZ, March 1999, IEEE, Piscataway, NJ, 1421–1424.
    • (1999) , pp. 1421-1424
    • Malvar, H.1
  • 154
    • 27744493655
    • Nonlinear speech analysis using models for chaotic systems
    • Kokkinos, I., Maragos, P., Nonlinear speech analysis using models for chaotic systems. IEEE Trans. Speech Audio Process. 13:6 (November 2005), 1098–1109.
    • (2005) IEEE Trans. Speech Audio Process. , vol.13 , Issue.6 , pp. 1098-1109
    • Kokkinos, I.1    Maragos, P.2
  • 155
    • 0036289924
    • Speech analysis and feature extraction using chaotic models
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL IEEE Piscataway, NJ
    • Pitsikalis, V., Maragos, P., Speech analysis and feature extraction using chaotic models., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, May 2002, IEEE, Piscataway, NJ, 533–536.
    • (2002) , pp. 533-536
    • Pitsikalis, V.1    Maragos, P.2
  • 156
    • 27944451785
    • Feature analysis and extraction for audio automatic classification
    • Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Big Island, HI IEEE Piscataway, NJ
    • Bai, L., Hu, Y., Lao, S., Chen, J., Wu, L., Feature analysis and extraction for audio automatic classification., Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Big Island, HI, October 2005, IEEE, Piscataway, NJ, 767–772.
    • (2005) , pp. 767-772
    • Bai, L.1    Hu, Y.2    Lao, S.3    Chen, J.4    Wu, L.5
  • 157
    • 33746817948
    • A silence detection and suppression technique design for voice over IP systems
    • Proceedings of the IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, Victoria, BC, Canada IEEE Piscataway, NJ
    • Becker, R., Corsetti, G., Guedes Silveira, J., Balbinot, R., Castello, F., A silence detection and suppression technique design for voice over IP systems., Proceedings of the IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, Victoria, BC, Canada, August 2005, IEEE, Piscataway, NJ, 173–176.
    • (2005) , pp. 173-176
    • Becker, R.1    Corsetti, G.2    Guedes Silveira, J.3    Balbinot, R.4    Castello, F.5
  • 158
    • 0034796139
    • Pause concepts for audio segmentation at different semantic levels
    • Proceedings of the ACM International Conference on Multimedia, Ottawa, ON, Canada ACM Press New York, NY
    • Pfeiffer, S., Pause concepts for audio segmentation at different semantic levels., Proceedings of the ACM International Conference on Multimedia, Ottawa, ON, Canada, 2001, ACM Press, New York, NY, 187–193.
    • (2001) , pp. 187-193
    • Pfeiffer, S.1
  • 159
    • 85009090165
    • High-level feature weighted GMM network for audio stream classification
    • Proceedings of the International Conference on Spoken Language Processing, Jeju Island, Korea
    • Huang, R., Hansen, J.H.L., High-level feature weighted GMM network for audio stream classification., Proceedings of the International Conference on Spoken Language Processing, Jeju Island, Korea, October 2004, 1061–1064.
    • (2004) , pp. 1061-1064
    • Huang, R.1    Hansen, J.H.L.2
  • 160
    • 8344242026
    • Local fuzzy PCA based GMM with dimension reduction on speaker identification
    • Lee, K.Y., Local fuzzy PCA based GMM with dimension reduction on speaker identification. Pattern Recogn. Lett. 25:16 (2004), 1811–1817.
    • (2004) Pattern Recogn. Lett. , vol.25 , Issue.16 , pp. 1811-1817
    • Lee, K.Y.1
  • 161
    • 0036293699
    • Merging segmental and rhythmic features for automatic language identification
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL IEEE Piscataway, NJ
    • Farinas, J., Pellegrino, F.C., Rouas, J.-L., André-Obrecht, R., Merging segmental and rhythmic features for automatic language identification., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, May 2002, IEEE, Piscataway, NJ, 753–756.
    • (2002) , pp. 753-756
    • Farinas, J.1    Pellegrino, F.C.2    Rouas, J.-L.3    André-Obrecht, R.4
  • 162
    • 0141591602
    • Speaker and text independent language identification using predictive error histogram vectors
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, China IEEE Piscataway, NJ
    • Gu, Q.R., Shibata, T., Speaker and text independent language identification using predictive error histogram vectors., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, China, April 2003, IEEE, Piscataway, NJ, 36–39.
    • (2003) , pp. 36-39
    • Gu, Q.R.1    Shibata, T.2
  • 163
    • 0035441593
    • Spoken language recognition—a step toward multilinguality in speech processing
    • Navratil, J., Spoken language recognition—a step toward multilinguality in speech processing. IEEE Trans. Speech Audio Process. 9:6 (September 2001), 678–685.
    • (2001) IEEE Trans. Speech Audio Process. , vol.9 , Issue.6 , pp. 678-685
    • Navratil, J.1
  • 164
    • 85009275225
    • Approaches to language identification using Gaussian mixture models and shifted delta cepstral features
    • Proceedings of the International Conference on Spoken Language Processing, Denver, CO
    • Torres-Carrasquillo, P.A., Singer, E., Kohler, M.A., Greene, R.J., Reynolds, D.A., Deller, J.R. Jr., Approaches to language identification using Gaussian mixture models and shifted delta cepstral features., Proceedings of the International Conference on Spoken Language Processing, Denver, CO, September 2002, 89–92.
    • (2002) , pp. 89-92
    • Torres-Carrasquillo, P.A.1    Singer, E.2    Kohler, M.A.3    Greene, R.J.4    Reynolds, D.A.5    Deller, J.R.6
  • 165
    • 33644609617
    • Emotive alert: HMM-based emotion detection in voicemail messages
    • Proceedings of the International Conference on Intelligent User Interfaces, San Diego, CA ACM Press New York, NY
    • Inanoglu, Z., Caneel, R., Emotive alert: HMM-based emotion detection in voicemail messages., Proceedings of the International Conference on Intelligent User Interfaces, San Diego, CA, 2005, ACM Press, New York, NY, 251–253.
    • (2005) , pp. 251-253
    • Inanoglu, Z.1    Caneel, R.2
  • 166
    • 0141702124
    • Classification of stress in speech using linear and nonlinear features
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, China IEEE Piscataway, NJ
    • Nwe, T.L., Foo, S.W., De Silva, L.C., Classification of stress in speech using linear and nonlinear features., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, China, April 2003, IEEE, Piscataway, NJ, 9–12.
    • (2003) , pp. 9-12
    • Nwe, T.L.1    Foo, S.W.2    De Silva, L.C.3
  • 167
    • 77956269951
    • Towards automatic recognition of emotion in speech
    • Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany IEEE Piscataway, NJ
    • Razak, A.A., Yusof, M.H.M., Komiya, R., Towards automatic recognition of emotion in speech., Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany, December 2003, IEEE, Piscataway, NJ, 548–551.
    • (2003) , pp. 548-551
    • Razak, A.A.1    Yusof, M.H.M.2    Komiya, R.3
  • 168
    • 0036299156
    • Automatic estimation of one's age with his/her speech based upon acoustic modeling techniques of speakers
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Orlando, FL IEEE Piscataway, NJ
    • Minematsu, N., Sekiguchi, M., Hirose, K., Automatic estimation of one's age with his/her speech based upon acoustic modeling techniques of speakers., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, Orlando, FL, May 2002, IEEE, Piscataway, NJ, 137–140.
    • (2002) , pp. 137-140
    • Minematsu, N.1    Sekiguchi, M.2    Hirose, K.3
  • 169
    • 33846961105
    • Comparison of neural networks and support vector machines applied to optimized features extracted from patients’ speech signal for classification of vocal fold inflammation
    • Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Athens, Greece IEEE Piscataway, NJ
    • Behroozmand, R., Almasganj, F., Comparison of neural networks and support vector machines applied to optimized features extracted from patients’ speech signal for classification of vocal fold inflammation., Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Athens, Greece, December 2005, IEEE, Piscataway, NJ, 844–849.
    • (2005) , pp. 844-849
    • Behroozmand, R.1    Almasganj, F.2
  • 170
    • 84948186412
    • Non-negative component parts of sound for classification
    • Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany IEEE Piscataway, NJ
    • Cho, Y.C., Choi, S., Bang, S.Y., Non-negative component parts of sound for classification., Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany, December 2003, IEEE, Piscataway, NJ, 633–636.
    • (2003) , pp. 633-636
    • Cho, Y.C.1    Choi, S.2    Bang, S.Y.3
  • 171
    • 33646787141
    • Use of modulation spectra for representation and classification of acoustic transients from sniper fire
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, Philadelphia, PA IEEE Piscataway, NJ
    • Owsley, L., Atlas, L., Heinemann, C., Use of modulation spectra for representation and classification of acoustic transients from sniper fire., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, Philadelphia, PA, March 2005, IEEE, Piscataway, NJ, 1129–1132.
    • (2005) , pp. 1129-1132
    • Owsley, L.1    Atlas, L.2    Heinemann, C.3
  • 172
    • 33749069115
    • Audio analysis for surveillance applications
    • Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY IEEE Piscataway, NJ
    • Radhakrishnan, R., Divakaran, A., Smaragdis, P., Audio analysis for surveillance applications., Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, October 2003, IEEE, Piscataway, NJ, 158–161.
    • (2003) , pp. 158-161
    • Radhakrishnan, R.1    Divakaran, A.2    Smaragdis, P.3
  • 173
    • 0030396150
    • Automatic audio content analysis
    • Proceedings of the ACM International Conference on Multimedia, Boston, MA ACM Press New York, NY
    • Pfeiffer, S., Fischer, S., Effelsberg, W., Automatic audio content analysis., Proceedings of the ACM International Conference on Multimedia, Boston, MA, 1996, ACM Press, New York, NY, 21–30.
    • (1996) , pp. 21-30
    • Pfeiffer, S.1    Fischer, S.2    Effelsberg, W.3
  • 174
    • 11244271500
    • Robust soccer highlight generation with a novel dominant-speech feature extractor
    • Proceedings of the IEEE International Conference on Multimedia and Expo, Taipei, Taiwan, vol. 1 IEEE Piscataway, NJ
    • Wang, K., Xu, C., Robust soccer highlight generation with a novel dominant-speech feature extractor., Proceedings of the IEEE International Conference on Multimedia and Expo, Taipei, Taiwan, vol. 1, June 2004, IEEE, Piscataway, NJ, 591–594.
    • (2004) , pp. 591-594
    • Wang, K.1    Xu, C.2
  • 175
    • 33745827806
    • Affect-based indexing and retrieval of films
    • Proceedings of the Annual ACM International Conference on Multimedia, Singapore ACM Press Berkeley
    • Chan, C.G., Jones, G.J.F., Affect-based indexing and retrieval of films., Proceedings of the Annual ACM International Conference on Multimedia, Singapore, 2005, ACM Press, Berkeley, 427–430.
    • (2005) , pp. 427-430
    • Chan, C.G.1    Jones, G.J.F.2
  • 176
    • 4544273366
    • Content based audio classification and retrieval using joint time-frequency analysis
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Montreal, QC, Canada IEEE Piscataway, NJ
    • Esmaili, S., Krishnan, S., Raahemifar, K., Content based audio classification and retrieval using joint time-frequency analysis., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Montreal, QC, Canada, May 2004, IEEE, Piscataway, NJ, 665–668.
    • (2004) , pp. 665-668
    • Esmaili, S.1    Krishnan, S.2    Raahemifar, K.3
  • 177
    • 21544467298
    • Content-based recognition of musical instruments
    • Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Rome, Italy IEEE Piscataway, NJ
    • Fanelli, A.M., Caponetti, L., Castellano, G., Buscicchio, C.A., Content-based recognition of musical instruments., Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, Rome, Italy, December 2004, IEEE, Piscataway, NJ, 361–364.
    • (2004) , pp. 361-364
    • Fanelli, A.M.1    Caponetti, L.2    Castellano, G.3    Buscicchio, C.A.4
  • 178
    • 33744926889
    • Discrete wavelet packet transform and ensembles of lazy and eager learners for music genre classification
    • Grimaldi, M., Cunningham, P., Kokaram, A., Discrete wavelet packet transform and ensembles of lazy and eager learners for music genre classification. Multimedia Syst. 11:5 (April 2006), 422–437.
    • (2006) Multimedia Syst. , vol.11 , Issue.5 , pp. 422-437
    • Grimaldi, M.1    Cunningham, P.2    Kokaram, A.3
  • 179
    • 46149102188
    • Singing voice features by time-frequency representations
    • Proceedings of the International Symposium on Image and Signal Processing and Analysis, vol. 1, Rome, Italy IEEE Piscataway, NJ
    • Mesaros, A., Lupu, E., Rusu, C., Singing voice features by time-frequency representations., Proceedings of the International Symposium on Image and Signal Processing and Analysis, vol. 1, Rome, Italy, September 2003, IEEE, Piscataway, NJ, 471–475.
    • (2003) , pp. 471-475
    • Mesaros, A.1    Lupu, E.2    Rusu, C.3
  • 180
    • 29044450290
    • The way it sounds: timbre models for analysis and retrieval of music signals
    • Aucouturier, J.-J., Pachet, F., Sandler, M., The way it sounds: timbre models for analysis and retrieval of music signals. IEEE Trans. Multimedia 7:6 (December 2005), 1028–1035.
    • (2005) IEEE Trans. Multimedia , vol.7 , Issue.6 , pp. 1028-1035
    • Aucouturier, J.-J.1    Pachet, F.2    Sandler, M.3
  • 181
    • 33749077780
    • Hierarchical multi-class self similarities
    • Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY IEEE Piscataway, NJ
    • Jehan, T., Hierarchical multi-class self similarities., Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, October 2005, IEEE, Piscataway, NJ, 311–314.
    • (2005) , pp. 311-314
    • Jehan, T.1
  • 182
    • 4544274781
    • Content-based music similarity search and emotion detection
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, Montreal, QC, Canada IEEE Piscataway, NJ
    • Li, T., Ogihara, M., Content-based music similarity search and emotion detection., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 5, Montreal, QC, Canada, May 2004, IEEE, Piscataway, NJ, 705–708.
    • (2004) , pp. 705-708
    • Li, T.1    Ogihara, M.2
  • 183
    • 4744357961
    • A unified approach to content-based and fault-tolerant music recognition
    • Clausen, M., Kurth, F., A unified approach to content-based and fault-tolerant music recognition. IEEE Trans. Multimedia 6:5 (October 2004), 717–731.
    • (2004) IEEE Trans. Multimedia , vol.6 , Issue.5 , pp. 717-731
    • Clausen, M.1    Kurth, F.2
  • 184
    • 15344342335
    • Repeating pattern discovery and structure analysis from acoustic music data
    • Proceedings of the ACM SIGMM International Workshop on Multimedia Information Retrieval, New York, NY ACM Press New York, NY
    • Lu, L., Wang, M., Zhang, H.J., Repeating pattern discovery and structure analysis from acoustic music data., Proceedings of the ACM SIGMM International Workshop on Multimedia Information Retrieval, New York, NY, 2004, ACM Press, New York, NY, 275–282.
    • (2004) , pp. 275-282
    • Lu, L.1    Wang, M.2    Zhang, H.J.3
  • 185
    • 3042520493
    • Recognition of piano notes with the aid of FRM filters
    • Proceedings of the International Symposium on Control, Communications and Signal Processing, Hammamet, Tunisia IEEE Piscataway, NJ
    • Foo, S.W., Leem, W.T., Recognition of piano notes with the aid of FRM filters., Proceedings of the International Symposium on Control, Communications and Signal Processing, Hammamet, Tunisia, March 2004, IEEE, Piscataway, NJ, 409–413.
    • (2004) , pp. 409-413
    • Foo, S.W.1    Leem, W.T.2
  • 186
    • 84945133945
    • Summarizing popular music via structural similarity analysis
    • Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY IEEE Piscataway, NJ
    • Cooper, M., Foote, J., Summarizing popular music via structural similarity analysis., Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, October 2003, IEEE, Piscataway, NJ, 127–130.
    • (2003) , pp. 127-130
    • Cooper, M.1    Foote, J.2
  • 187
    • 85097907867
    • Cubyhum: a fully operational query by humming system
    • Proceedings of the International Conference on Music Information Retrieval, Paris, France IRCAM-Centre Pompidou Paris
    • Pauws, S., Cubyhum: a fully operational query by humming system., Proceedings of the International Conference on Music Information Retrieval, Paris, France, October 2002, IRCAM-Centre Pompidou, Paris.
    • (2002)
    • Pauws, S.1
  • 188
    • 0037622306
    • Enhancing sonic browsing using audio information retrieval
    • Proceedings of the International Conference on Auditory Display, Kyoto, Japan
    • Brazil, E., Fernström, M., Tzanetakis, G., Cook, P., Enhancing sonic browsing using audio information retrieval., Proceedings of the International Conference on Auditory Display, Kyoto, Japan, July 2002.
    • (2002)
    • Brazil, E.1    Fernström, M.2    Tzanetakis, G.3    Cook, P.4
  • 189
    • 33646767819
    • Improving music genre classification by short time feature integration
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, PA IEEE Piscataway, NJ
    • Meng, A., Ahrendt, P., Larsen, J., Improving music genre classification by short time feature integration., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, PA, March 2005, IEEE, Piscataway, NJ, 497–500.
    • (2005) , pp. 497-500
    • Meng, A.1    Ahrendt, P.2    Larsen, J.3
  • 190
    • 0141743614
    • Musical genre classification using support vector machines
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, China IEEE Piscataway, NJ
    • Changsheng, X., Maddage, N.C., Xi, S., Fang, C., Qi, T., Musical genre classification using support vector machines., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, China, April 2003, IEEE, Piscataway, NJ, 429–432.
    • (2003) , pp. 429-432
    • Changsheng, X.1    Maddage, N.C.2    Xi, S.3    Fang, C.4    Qi, T.5
  • 191
    • 0035685514
    • To catch a chorus: using chroma-based representations for audio thumbnailing
    • Proceedings of the IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics, New Paltz, NY IEEE Piscataway, NJ
    • Bartsch, M.A., Wakefield, G.H., To catch a chorus: using chroma-based representations for audio thumbnailing., Proceedings of the IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, October 2001, IEEE, Piscataway, NJ, 15–18.
    • (2001) , pp. 15-18
    • Bartsch, M.A.1    Wakefield, G.H.2
  • 192
    • 85097899262
    • Comparison of several acoustic modeling techniques and decoding algorithms for embedded speech recognition systems
    • Proceedings of the Workshop on DSP in Mobile and Vehicular Systems, Nagoya, Japan
    • Lévy, C., Linarès, G., Nocera, P., Comparison of several acoustic modeling techniques and decoding algorithms for embedded speech recognition systems., Proceedings of the Workshop on DSP in Mobile and Vehicular Systems, Nagoya, Japan, April 2003.
    • (2003)
    • Lévy, C.1    Linarès, G.2    Nocera, P.3
  • 193
    • 30344446676
    • Pseudo complex cepstrum using discrete cosine transform
    • Muralishankar, R., Ramakrishnan, A.G., Pseudo complex cepstrum using discrete cosine transform. Int. J. Speech Technol. 8:2 (June 2005), 181–191.
    • (2005) Int. J. Speech Technol. , vol.8 , Issue.2 , pp. 181-191
    • Muralishankar, R.1    Ramakrishnan, A.G.2
  • 194
    • 85097870299
    • Speaker adaptive speech recognition using phone pair model
    • Proceedings of the 5th International Conference on Signal Processing, Beijing, China
    • Baojie, L., Hirose, K., Speaker adaptive speech recognition using phone pair model., Proceedings of the 5th International Conference on Signal Processing, Beijing, China, August 2000, 714–717.
    • (2000) , pp. 714-717
    • Baojie, L.1    Hirose, K.2
  • 195
    • 85097903590
    • Noise robust Chinese speech recognition using feature vector normalization and higher-order cepstral coefficients
    • Proceedings of the 5th International Conference on Signal Processing, Beijing, China
    • Wang, X., Dong, Y., Häkkinen, J., Viikki, O., Noise robust Chinese speech recognition using feature vector normalization and higher-order cepstral coefficients., Proceedings of the 5th International Conference on Signal Processing, Beijing, China, August 2000, 738–741.
    • (2000) , pp. 738-741
    • Wang, X.1    Dong, Y.2    Häkkinen, J.3    Viikki, O.4
  • 196
    • 0036298770
    • Modulation features for speech recognition
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL IEEE Piscataway, NJ
    • Dimitriadis, D., Maragos, P., Potamianos, A., Modulation features for speech recognition., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, May 2002, IEEE, Piscataway, NJ, 377–380.
    • (2002) , pp. 377-380
    • Dimitriadis, D.1    Maragos, P.2    Potamianos, A.3
  • 197
    • 33947639038
    • Joint acoustic-modulation frequency for speaker recognition
    • Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France IEEE Piscataway, NJ
    • Kinnunen, T., Joint acoustic-modulation frequency for speaker recognition., Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France, May 2006, IEEE, Piscataway, NJ, 665–668.
    • (2006) , pp. 665-668
    • Kinnunen, T.1
  • 198
    • 33644628732
    • Evaluation of frequently used audio features for classification of music into perceptual categories
    • Proceedings of the 4th International Workshop Content-Based Multimedia Indexing, Riga, Latvia
    • Pohle, T., Pampalk, E., Widmer, G., Evaluation of frequently used audio features for classification of music into perceptual categories., Proceedings of the 4th International Workshop Content-Based Multimedia Indexing, Riga, Latvia, 2005.
    • (2005)
    • Pohle, T.1    Pampalk, E.2    Widmer, G.3
  • 199
    • 85009188340
    • Nonlinear analysis of speech signals: generalized dimensions and Lyapunov exponents
    • Proceedings of the European Conference on Speech Communication and Technology, Geneva, Switzerland
    • Pitsikalis, V., Kokkinos, I., Maragos, P., Nonlinear analysis of speech signals: generalized dimensions and Lyapunov exponents., Proceedings of the European Conference on Speech Communication and Technology, Geneva, Switzerland, September 2003, 817–820.
    • (2003) , pp. 817-820
    • Pitsikalis, V.1    Kokkinos, I.2    Maragos, P.3
  • 200
    • 33750574645
    • Pitch histograms in audio and symbolic music information retrieval
    • Tzanetakis, G., Ermolinskyi, A., Cook, P., Pitch histograms in audio and symbolic music information retrieval. J. New Music Res. 32:2 (June 2003), 143–152.
    • (2003) J. New Music Res. , vol.32 , Issue.2 , pp. 143-152
    • Tzanetakis, G.1    Ermolinskyi, A.2    Cook, P.3
  • 201
    • 11244339730
    • An audio recommendation system based on audio signature description scheme in MPEG-7 audio
    • Proceedings of the IEEE International Conference on Multimedia and Expo, Taipei, Taiwan, vol. 1 IEEE Piscataway, NJ
    • Huang, Y.C., Jenor, S.K., An audio recommendation system based on audio signature description scheme in MPEG-7 audio., Proceedings of the IEEE International Conference on Multimedia and Expo, Taipei, Taiwan, vol. 1, June 2004, IEEE, Piscataway, NJ, 639–642.
    • (2004) , pp. 639-642
    • Huang, Y.C.1    Jenor, S.K.2
  • 202
    • 0035576554
    • Indexing and retrieval of audio: a survey
    • Lu, G., Indexing and retrieval of audio: a survey. Multimedia Tools Appl. 15:3 (December 2001), 269–290.
    • (2001) Multimedia Tools Appl. , vol.15 , Issue.3 , pp. 269-290
    • Lu, G.1
  • 203
    • 3042712303
    • Audio information retrieval: a bibliographical study
    • Davy, M., Godsill, S.J., Audio information retrieval: a bibliographical study. Technical Report, February 2002.
    • (2002) Technical Report
    • Davy, M.1    Godsill, S.J.2
  • 204
    • 84942244978
    • A review of algorithms for audio fingerprinting
    • Proceedings of the IEEE Workshop on Multimedia Signal Processing, St. Thomas, VI IEEE Piscataway, NJ
    • Cano, P., Batlle, E., Kalker, T., Haitsma, J., A review of algorithms for audio fingerprinting., Proceedings of the IEEE Workshop on Multimedia Signal Processing, St. Thomas, VI, December 2002, IEEE, Piscataway, NJ, 169–173.
    • (2002) , pp. 169-173
    • Cano, P.1    Batlle, E.2    Kalker, T.3    Haitsma, J.4


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.