SCOPUS Information Search Platform

Volume 34, Issue 3, 2007, Pages 375-395

A general audio classifier based on human perception motivated model

Author keywords

Audio classification; Content based audio indexing; Gender identification; Highlights detection; Music genre recognition; Perceptually motivated features; Piecewise Gaussian Modelling

Indexed keywords

ACOUSTIC SIGNAL PROCESSING; CLASSIFICATION (OF INFORMATION); MATHEMATICAL MODELS; PIECEWISE LINEAR TECHNIQUES; PROBLEM SOLVING; PSYCHOPHYSIOLOGY;

AUDIO CLASSIFICATION; GENDER IDENTIFICATION; MUSIC GENRE RECOGNITION; PERCEPTUALLY MOTIVATED FEATURES;

AUDIO ACOUSTICS;

EID: 34547284188 PISSN: 13807501 EISSN: 15737721 Source Type: Journal
DOI: 10.1007/s11042-007-0108-9 Document Type: Article

Times cited: 17

References (54)

1
- 0037401304
- Speech/Music discrimination using entropy and dynamism features in a HMM classification framework
- Ajmera J, McCowan I, Bourlard H (2003) Speech/Music discrimination using entropy and dynamism features in a HMM classification framework. Speech Commun 40(3):351-363
- (2003) Speech Commun , vol.40 , Issue.3 , pp. 351-363
- Ajmera, J.¹ McCowan, I.² Bourlard, H.³

2
- 0032638667
- A comparison of features for speech, music discrimination
- Carey M, Parris E, Lloyd-Thomas H (1999) A comparison of features for speech, music discrimination. Proceedings of IEEE ICASSP99, pp149-152
- (1999) Proceedings of IEEE ICASSP99 , pp. 149-152
- Carey, M.¹ Parris, E.² Lloyd-Thomas, H.³

3
- 0029716457
- Integrated image and speech analysis for content-based video indexing
- Chang Y-L, Zeng W, Kamel I, Alonso R (1996) Integrated image and speech analysis for content-based video indexing. Proceedings, the third IEEE international conference on multimedia computing and systems, pp306-313
- (1996) Proceedings, the third IEEE international conference on multimedia computing and systems , pp. 306-313
- Chang, Y.-L.¹ Zeng, W.² Kamel, I.³ Alonso, R.⁴

4
- 0028919718
- Auditory event-related potentials dissociate early and late memory processes
- Elsevier
- Chao L, Nielsen-Bohlman L, Knight R (1995) Auditory event-related potentials dissociate early and late memory processes. Electroencephalogr Clin Neurophysiol 96:157-168, Elsevier
- (1995) Electroencephalogr Clin Neurophysiol , vol.96 , pp. 157-168
- Chao, L.¹ Nielsen-Bohlman, L.² Knight, R.³

5
- 0035791627
- Extraction of TV highlights using multimedia features
- Dagtas S, Abdel-Mottaleb M (2001) Extraction of TV highlights using multimedia features. Proceedings, IEEE 4th workshop on multimedia signal processing
- (2001) Proceedings, IEEE 4th workshop on multimedia signal processing
- Dagtas, S.¹ Abdel-Mottaleb, M.²

6
- 84890539753
- Classifying audio of movies by a multi expert system
- De Santo M et al (2001) Classifying audio of movies by a multi expert system. Proceedings of the IEEE 11th international conference on image analysis and processing, pp386-391
- (2001) Proceedings of the IEEE 11th international conference on image analysis and processing , pp. 386-391
- De Santo, M.¹ et al²

7
- 0035308233
- Classification of general audio data for content-based retrieval
- Elsevier
- Dongge L et al (2001) Classification of general audio data for content-based retrieval. Pattern Recogn Lett 22:533-544, Elsevier
- (2001) Pattern Recogn Lett , vol.22 , pp. 533-544
- Dongge, L.¹

8
- 0033705976
- Speech/music discrimination for multimedia applications
- El-Maleh K, Klein M, Petrucci G, Kabal P (2000) Speech/music discrimination for multimedia applications. Proceedings of IEEE ICASSP00, pp2445-2449
- (2000) Proceedings of IEEE ICASSP00 , pp. 2445-2449
- El-Maleh, K.¹ Klein, M.² Petrucci, G.³ Kabal, P.⁴

9
- 0004740094
- A similarity measure for automatic audio classification
- Stanford March
- Foote J (1997) A similarity measure for automatic audio classification. In Proc. AAAI 1997 spring symposium on intelligent integration and use of text, image, video, and audio corpora. Stanford (March)
- (1997) Proc. AAAI 1997 spring symposium on intelligent integration and use of text, image, video, and audio corpora
- Foote, J.¹

10
- 85128356454
- Partitioning and transcription of broadcast news data
- Gauvain J-L, Lamel L, Adda G (1998) Partitioning and transcription of broadcast news data. Proc. ICSLP'98 5:1335-1338
- (1998) Proc. ICSLP'98 , vol.5 , pp. 1335-1338
- Gauvain, J.-L.¹ Lamel, L.² Adda, G.³

11
- 0141623871
- RWC music database: Popular, classical, and jazz music databases
- Goto M, Hashiguchi H, Nishimura T, Oka R (2002) RWC music database: popular, classical, and jazz music databases. Proceedings, the 3rd international conference on music information retrieval (ISMIR02), pp287-288
- (2002) Proceedings, the 3rd international conference on music information retrieval (ISMIR02) , pp. 287-288
- Goto, M.¹ Hashiguchi, H.² Nishimura, T.³ Oka, R.⁴

12
- 34547368583
- Recognition of music types
- ICASSP
- Hagen S, Tanja S, Martin W (1998) Recognition of music types. Proceedings, the 1998 IEEE international conference on acoustics, speech and signal processing, ICASSP
- (1998) Proceedings, the 1998 IEEE international conference on acoustics, speech and signal processing
- Hagen, S.¹ Tanja, S.² Martin, W.³

13
- 0002751623
- Segment generation and clustering in the HTK broadcast news transcription system
- Hain T, Johnson SE, Tuerk A, Woodland PC, Young SJ (1998) Segment generation and clustering in the HTK broadcast news transcription system. Proc. 1998 DARPA broadcast news transcription and understanding workshop, pp 133-137
- (1998) Proc. 1998 DARPA broadcast news transcription and understanding workshop , pp. 133-137
- Hain, T.¹ Johnson, S.E.² Tuerk, A.³ Woodland, P.C.⁴ Young, S.J.⁵

14
- 84963543360
- User-oriented affective video analysis
- conference
- Hanjalic A, Xu L-Q (2001) User-oriented affective video analysis. Proceedings, IEEE workshop on content-based access of image and video libraries, in conjunction with the IEEE CVPR 2001 conference
- (2001) Proceedings, IEEE workshop on content-based access of image and video libraries, in conjunction with the IEEE CVPR
- Hanjalic, A.¹ Xu, L.-Q.²

15
- 84908593315
- Gender identification using a general audio classifier
- Harb H, Chen L (2003) Gender identification using a general audio classifier. Proceedings, the IEEE international conference on multimedia & expo ICME, pp733-736
- (2003) Proceedings, the IEEE international conference on multimedia & expo ICME , pp. 733-736
- Harb, H.¹ Chen, L.²

16
- 84872720209
- Robust speech/music discrimination using spectrum's first order statistics and neural networks
- Harb H, Chen L (2003) Robust speech/music discrimination using spectrum's first order statistics and neural networks. Proceedings, the IEEE international symposium on signal processing and its applications ISSPA2003, pp125-128
- (2003) Proceedings, the IEEE international symposium on signal processing and its applications ISSPA2003 , pp. 125-128
- Harb, H.¹ Chen, L.²

17
- 0003413187
- Macmillan
- Haykin S (1994) Neural networks: a comprehensive foundation. Macmillan
- (1994) Neural networks: a comprehensive foundation
- Haykin, S.¹

18
- 0026374868
- Improved acoustic modeling with the SPHINX speech recognition system
- ICASSP-91
- Huang XD, Lee KF, Hon HW, Hwang MY (1991) Improved acoustic modeling with the SPHINX speech recognition system. Proceedings of the IEEE ICASSP-91, 1:345-348
- (1991) Proceedings of the IEEE
- Huang, X.D.¹ Lee, K.F.² Hon, H.W.³ Hwang, M.Y.⁴

19
- 84908322734
- Music type classification by spectral contrast features
- Jiang D-N, Lu L, Zhang H-J, Cai L-H, Tao J-H (2002) Music type classification by spectral contrast features. Proceedings, IEEE international conference on multimedia and expo (ICME02)
- (2002) Proceedings, IEEE international conference on multimedia and expo (ICME02)
- Jiang, D.-N.¹ Lu, L.² Zhang, H.-J.³ Cai, L.-H.⁴ Tao, J.-H.⁵

20
- 17444377070
- Jung E, Schwarzbacher A, Lawlor R (2002) Implementation of real-time AMDF pitch-detection for voice gender normalization. Proceedings of the 14th international conference on digital signal processing. DSP 2002 2:827-830

21
- 0003204113
- Acoustic segmentation for audio browsers
- Sydney, Australia July
- Kimber D, Wilcox L (1996) Acoustic segmentation for audio browsers. Proceedings of interface conference, Sydney, Australia (July)
- (1996) Proceedings of interface conference
- Kimber, D.¹ Wilcox, L.²

22
- 34547288647
- Kiranyaz S, Aubazac M, Gabbouj M (2003) Unsupervised segmentation and classification over MP3 and AAC audio bitstreams. In the Proc. of the 4th European workshop on image analysis for multimedia interactive services WIAMIS 03, World Scientific, London UK

23
- 34547336240
- Konig Y, Morgan N (1992) GDNN: a gender dependent neural network for continuous speech recognition. Proceedings, international joint conference on neural networks, IJCNN, Volume: 2, 7-11 2:332-337

24
- 0034273520
- Content-based classification and retrieval of audio using the nearest feature line method
- Li S (2000) Content-based classification and retrieval of audio using the nearest feature line method. IEEE Trans Speech Audio Process 8:619-625
- (2000) IEEE Trans Speech Audio Process , vol.8 , pp. 619-625
- Li, S.¹

25
- 0034515703
- Content-based indexing and retrieval of audio data using wavelets
- Li G, Khokhar A (2000) Content-based indexing and retrieval of audio data using wavelets. Proceedings, the IEEE international conference on multimedia and expo (II), pp885-888
- (2000) Proceedings, the IEEE international conference on multimedia and expo (II) , pp. 885-888
- Li, G.¹ Khokhar, A.²

26
- 0005093211
- Efficient cepstral normalization for robust speech recognition
- March
- Liu F, Stern R, Huang X, Acero A (1993) Efficient cepstral normalization for robust speech recognition. Proceedings of ARPA speech and natural language workshop, pp69-74 (March)
- (1993) Proceedings of ARPA speech and natural language workshop , pp. 69-74
- Liu, F.¹ Stern, R.² Huang, X.³ Acero, A.⁴

27
- 0032181880
- J VLSI Signal Process Syst
- Liu Z, Wang T, Chen T (1998) Audio feature extraction and analysis for multimedia content classification. J VLSI Signal Process Syst
- (1998) Audio feature extraction and analysis for multimedia content classification
- Liu, Z.¹ Wang, T.² Chen, T.³

28
- 20444469135
- Improving accuracy in behaviour identification for content-based retrieval by using audio and video information
- Miyamori H (2002) Improving accuracy in behaviour identification for content-based retrieval by using audio and video information. Proceedings of IEEE ICPR02, 2:826-830
- (2002) Proceedings of IEEE ICPR02 , vol.2 , pp. 826-830
- Miyamori, H.¹

29
- 0034789152
- Affect computing in film through sound energy dynamics
- Moncrieff S, Dorai C, Venkatesh S (2001) Affect computing in film through sound energy dynamics. Proceedings of ACM MM
- (2001) Proceedings of ACM MM
- Moncrieff, S.¹ Dorai, C.² Venkatesh, S.³

30
- 34547322429
- Moore, BCJ (ed) (1995), Hearing. Academic, Toronto

31
- 0030635350
- Phone-context specific gender-dependent acoustic-models for continuous speech recognition
- Neti C, Roukos S (1997) Phone-context specific gender-dependent acoustic-models for continuous speech recognition. Proceedings, IEEE workshop on automatic speech recognition and understanding, 192-198
- (1997) Proceedings, IEEE workshop on automatic speech recognition and understanding , pp. 192-198
- Neti, C.¹ Roukos, S.²

32
- 0036331298
- Elsevier
- Noppeney U, Price CJ (2002) Retrieval of visual, auditory, and abstract semantics. NeuroImage 15:917-926, Elsevier
- (2002) Retrieval of visual, auditory, and abstract semantics. NeuroImage , vol.15 , pp. 917-926
- Noppeney, U.¹ Price, C.J.²

33
- 0029762796
- Language independent gender identification
- Parris ES, Carey MJ (1996) Language independent gender identification. Proceedings of IEEE ICASSP, pp685-688
- (1996) Proceedings of IEEE ICASSP , pp. 685-688
- Parris, E.S.¹ Carey, M.J.²

34
- 0010020774
- Scanning the dial: An exploration of factors in the identification of musical style
- Society for Music Perception and Cognition
- Perrott, D, Gjerdingen, RO Scanning the dial: an exploration of factors in the identification of musical style. Proceedings, the 1999 Society for Music Perception and Cognition
- (1999) Proceedings, the
- Perrott, D.¹ Gjerdingen, R.O.²

35
- 0030396150
- Automatic audio content analysis
- Pfeiffer S, Fischer S, Effelsberg W (1996) Automatic audio content analysis. Proceedings of ACM Multimedia, pp21-30
- (1996) Proceedings of ACM Multimedia , pp. 21-30
- Pfeiffer, S.¹ Fischer, S.² Effelsberg, W.³

36
- 0036288612
- Speech and music classification in audio documents
- ICASSP
- Pinquier J, Sénac C, André-Obrecht R (2002) Speech and music classification in audio documents. Proceedings, the IEEE ICASSP'2002, pp4164-4167
- (2002) Proceedings, the IEEE , pp. 4164-4167
- Pinquier, J.¹ Sénac, C.² André-Obrecht, R.³

37
- 0033693368
- Content-based methods for the management of digital music
- Pye D (2000) Content-based methods for the management of digital music. Proceedings, IEEE international conference on, acoustics, speech, and signal processing, ICASSP'00.volume:4, 4:2437-2440
- (2000) Proceedings, IEEE international conference on, acoustics, speech, and signal processing, ICASSP'00 , vol.4
- Pye, D.¹

38
- 0029209272
- Robust text-independent speaker identification using Gaussian mixture speaker models
- Reynolds DA, Rose RC (1995) Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans Speech Audio Process 3(1):72-83
- (1995) IEEE Trans Speech Audio Process , vol.3 , Issue.1 , pp. 72-83
- Reynolds, D.A.¹ Rose, R.C.²

39
- 34047243933
- Selection, parameter estimation, and discriminative training of hidden Markov models for general audio modeling
- Reyes-Gomez M, Ellis D (2003) Selection, parameter estimation, and discriminative training of hidden Markov models for general audio modeling. Proceedings, the IEEE international conference on multimedia & expo ICME
- (2003) Proceedings, the IEEE international conference on multimedia & expo ICME
- Reyes-Gomez, M.¹ Ellis, D.²

40
- 0030365534
- Robust gender-dependent acoustic-phonetic modelling in continuous speech recognition based on a new automatic male female classification
- 2, Oct
- Rivarol V, Farhal A, O'Shaughnessy D (1996) Robust gender-dependent acoustic-phonetic modelling in continuous speech recognition based on a new automatic male female classification. Proceedings, fourth international conference on spoken language, ICSLP 96, Volume: 2 3-6 2:1081-1084 (Oct)
- (1996) Proceedings, fourth international conference on spoken language, ICSLP , vol.96
- Rivarol, V.¹ Farhal, A.² O'Shaughnessy, D.³

41
- 0029765670
- Real time discrimination of broadcast speech/music
- Saunders J (1996) Real time discrimination of broadcast speech/music, Proc. Of ICASSP96 2: 993-996
- (1996) Proc. Of ICASSP96 , vol.2 , pp. 993-996
- Saunders, J.¹

42
- 0030648077
- Construction and evaluation of a robust multifeature speech/music discriminator
- Munich, Germany April
- Scheirer E, Slaney M (1997) Construction and evaluation of a robust multifeature speech/music discriminator. Proceedings of IEEE ICASSP'97, Munich, Germany (April)
- (1997) Proceedings of IEEE ICASSP'97
- Scheirer, E.¹ Slaney, M.²

43
- 0034843119
- Experiments on speech tracking in audio documents using Gaussian mixture modeling
- Seck M, Magrin-Chagnolleau I, Bimbot F (2001) Experiments on speech tracking in audio documents using Gaussian mixture modeling. Proceedings of IEEE ICASSP01, 1:601-604
- (2001) Proceedings of IEEE ICASSP01 , vol.1 , pp. 601-604
- Seck, M.¹ Magrin-Chagnolleau, I.² Bimbot, F.³

44
- 47749135356
- Mixtures of probability experts for audio retrieval and indexing
- Slaney M (2002) Mixtures of probability experts for audio retrieval and indexing. Proceedings, IEEE international conference on multimedia and expo, ICME 2002, 1:345-348
- (2002) Proceedings, IEEE international conference on multimedia and expo, ICME , vol.1 , pp. 345-348
- Slaney, M.¹

45
- 0031356844
- Automatic gender identification optimised for language independence
- Slomka S, Sridharan S (1997) Automatic gender identification optimised for language independence. Proceeding of IEEE TENCON-speech and image technologies for computing and telecommunications, pp145-148
- (1997) Proceeding of IEEE TENCON-speech and image technologies for computing and telecommunications , pp. 145-148
- Slomka, S.¹ Sridharan, S.²

46
- 0034501286
- Video scene segmentation using video and audio features
- New York July
- Sundaram H, Chang S-F (2000) Video scene segmentation using video and audio features. IEEE international conference on multimedia and expo, New York (July)
- (2000) IEEE international conference on multimedia and expo
- Sundaram, H.¹ Chang, S.-F.²

47
- 0036648502
- Musical genre classification of audio signals
- Tzanetakis G, Cook P (2002) Musical genre classification of audio signals. IEEE Trans Speech Audio Process 10(5):293-302
- (2002) IEEE Trans Speech Audio Process , vol.10 , Issue.5 , pp. 293-302
- Tzanetakis, G.¹ Cook, P.²

48
- 0010053023
- Automatic musical genre classification of audio signals
- ISMIR
- Tzanetakis G, Essl G, Cook P (2001) Automatic musical genre classification of audio signals. Proceedings, international symposium on music information retrieval (ISMIR)
- (2001) Proceedings, international symposium on music information retrieval
- Tzanetakis, G.¹ Essl, G.² Cook, P.³

49
- 85032751556
- Multimedia content analysis using both audio and visual cues
- Wang Y, Liu Z, Huang J-C (2000) Multimedia content analysis using both audio and visual cues. IEEE Signal Process Mag 17(6):12-36
- (2000) IEEE Signal Process Mag , vol.17 , Issue.6 , pp. 12-36
- Wang, Y.¹ Liu, Z.² Huang, J.-C.³

50
- 77958036231
- Speech/music discrimination based on posterior probability features
- Williams G, Ellis D (1999) Speech/music discrimination based on posterior probability features. Proceedings of Eurospeech
- (1999) Proceedings of Eurospeech
- Williams, G.¹ Ellis, D.²

51
- 0030242072
- Content-based classification search and retrieval of audio
- Wold E, Blum T, Keislar D, Wheaton J (1996) Content-based classification search and retrieval of audio. IEEE Multimedia Magazine 3(3):27-36
- (1996) IEEE Multimedia Magazine , vol.3 , Issue.3 , pp. 27-36
- Wold, E.¹ Blum, T.² Keislar, D.³ Wheaton, J.⁴

52
- 0035815506
- Organizing sound sequences in the human brain: The interplay of auditory streaming and temporal integration
- Elsevier
- Yabe H et al (2001) Organizing sound sequences in the human brain: the interplay of auditory streaming and temporal integration. Brain Res 897:222-227, Elsevier
- (2001) Brain Res , vol.897 , pp. 222-227
- Yabe, H.¹

53
- 0035340677
- Audio content analysis for on-line audiovisual data segmentation
- Zhang T, Jay Kuo C-C (2001) Audio content analysis for on-line audiovisual data segmentation. IEEE Trans Speech Audio Process 9(4):441-457
- (2001) IEEE Trans Speech Audio Process , vol.9 , Issue.4 , pp. 441-457
- Zhang, T.¹ Jay Kuo, C.-C.²

54
- 0036888031
- Zhou W, Dao S, Jay Kuo C-C (2002) On line knowledge and rule-based video classification system for video indexing and dissemination. Inf Sys 27:559-586, Elsevier

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.