SCOPUS Information Retrieval Platform

Volume 12, Issue 1, 2006, Pages 55-67

Machine-learning based classification of speech and music

(2) Khan, M. Kashif Saeed ᵃ; Al Khatib, Wasfi G. ᵃ

ᵃ King Fahd University of Petroleum and Minerals (Saudi Arabia)

Author keywords

Audio features; Audio signal processing; Fuzzy c means clustering; Hidden Markov Models; Neural networks; Speech music classification

Indexed keywords

CLASSIFICATION (OF INFORMATION); FUZZY SETS; INFORMATION RETRIEVAL SYSTEMS; LEARNING SYSTEMS; MARKOV PROCESSES; MULTIMEDIA SYSTEMS; NEURAL NETWORKS;

AUDIO FEATURES; AUDIO SIGNAL PROCESSING; FUZZY C-MEANS CLUSTERING; HIDDEN MARKOV MODELS; MEL-FREQUENCY CEPSTRAL COEFFICIENTS; SPEECH MUSIC CLASSIFICATION;

SPEECH RECOGNITION;

EID: 33746879922 PISSN: 09424962 EISSN: None Source Type: Journal
DOI: 10.1007/s00530-006-0034-0 Document Type: Article

Times cited: 42

References (37)

1
- 0030648077
- Construction and evaluation of a robust multifeature speech/music discriminator
- IEEE
- Scheirer, E., Slaney, M.: Construction and evaluation of a robust multifeature speech/music discriminator. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'97, IEEE), Vol. 2, pp. 1331-1334 (1997)
- (1997) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'97) , vol.2 , pp. 1331-1334
- Scheirer, E.¹ Slaney, M.²

2
- 33746879653
- A multifeature speech/music discrimination system
- IEEE
- Saad, E.M., El-Adawy, M.I., Abu-El-Wata, M.E., Wahba, A.A.: A multifeature speech/music discrimination system. In: Proceedings of the 19th National Radio Science Conference (NRSC'02, IEEE), pp. 208-213 (2002)
- (2002) Proceedings of the 19th National Radio Science Conference (NRSC'02) , pp. 208-213
- Saad, E.M.¹ El-Adawy, M.I.² Abu-El-Wata, M.E.³ Wahba, A.A.⁴

3
- 0029765670
- Real-time discrimination of broadcast speech/music
- IEEE
- John Saunders: Real-time discrimination of broadcast speech/music. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'96, IEEE), Vol. 2, pp. 993-996 (1996)
- (1996) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'96) , vol.2 , pp. 993-996
- Saunders, J.¹

4
- 0032638667
- A comparison of features for speech, music discrimination
- IEEE
- Carey, M.J., Parris, E.S., Lloyd-Thomas, H.: A comparison of features for speech, music discrimination. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'99, IEEE), Vol. 1, pp. 149-152 (1999)
- (1999) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'99) , vol.1 , pp. 149-152
- Carey, M.J.¹ Parris, E.S.² Lloyd-Thomas, H.³

5
- 70350279492
- Feature fusion for music detection
- Parris, E.S., Carey, M.J., Lloyd-Thomas, H.: Feature fusion for music detection. In: Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH'99), pp. 2191-2194 (1999)
- (1999) Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH'99) , pp. 2191-2194
- Parris, E.S.¹ Carey, M.J.² Lloyd-Thomas, H.³

6
- 0034853025
- Robust singing detection in speech/music discriminator design
- IEEE
- Chou, W., Gu, L.: Robust singing detection in speech/music discriminator design. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'01, IEEE), Vol. 2, pp. 865-868 (2001)
- (2001) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'01) , vol.2 , pp. 865-868
- Chou, W.¹ Gu, L.²

7
- 0036288612
- Speech and music classification in audio documents
- IEEE
- Pinquier, J., Sénac, C., André-Obrecht, R.: Speech and music classification in audio documents. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'02, IEEE), Vol. 4, pp. 4164-4164 (2002)
- (2002) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'02) , vol.4 , pp. 4164-4164
- Pinquier, J.¹ Sénac, C.² André-Obrecht, R.³

8
- 85009291610
- Robust speech/music classification in audio documents
- Pinquier, J., Rouas, J.-L., André-Obrecht, R.: Robust speech/music classification in audio documents. In: Proceedings of the 7th International Conference on Spoken Language (ICSLP'02), Vol. 3, pp. 2005-2008 (2002)
- (2002) Proceedings of the 7th International Conference on Spoken Language (ICSLP'02) , vol.3 , pp. 2005-2008
- Pinquier, J.¹ Rouas, J.-L.² André-Obrecht, R.³

9
- 0141590391
- A fusion study in speech/music classification
- IEEE
- Pinquier, J., Rouas, J.L., André-Obrecht, R.: A fusion study in speech/music classification. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03, IEEE), Vol. 2, pp. II-17-II-20 (2003)
- (2003) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03) , vol.2
- Pinquier, J.¹ Rouas, J.L.² André-Obrecht, R.³

10
- 84872720209
- Robust speech music discrimination using spectrum's first order statistics and neural networks
- IEEE
- Harb, H., Chen, L.: Robust speech music discrimination using spectrum's first order statistics and neural networks. In: Proceedings of the 7th International Symposium on Signal Processing and its Applications, IEEE, Vol. 2, pp. 125-128 (2003)
- (2003) Proceedings of the 7th International Symposium on Signal Processing and Its Applications , vol.2 , pp. 125-128
- Harb, H.¹ Chen, L.²

11
- 20444469089
- Speech/music/silence and gender detection algorithm
- Harb, H., Chen, L., Auloge, J.Y.: Speech/music/silence and gender detection algorithm. In: Proceedings of the 7th International Conference on Distributed Multimedia Systems (DMS'01), pp. 257-262 (2001)
- (2001) Proceedings of the 7th International Conference on Distributed Multimedia Systems (DMS'01) , pp. 257-262
- Harb, H.¹ Chen, L.² Auloge, J.Y.³

12
- 40849091837
- Discrimination between speech and music based on a low frequency modulation feature
- Karnebäck, S.: Discrimination between speech and music based on a low frequency modulation feature. In: Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH'01), pp. 1891-1894 (2001)
- (2001) Proceedings of the European Conference on Speech Communication and Technology (EUROSPEECH'01) , pp. 1891-1894
- Karnebäck, S.¹

13
- 51449085287
- A fast and robust speech/music discrimination approach
- IEEE
- Wang, W.Q., Gao, W., Ying, D.W.: A fast and robust speech/music discrimination approach. In: Proceedings of the Information, Communications & Signal Processing (ICICS-PCM'03, IEEE), Vol. 3, pp. 1325-1329 (2003)
- (2003) Proceedings of the Information, Communications & Signal Processing (ICICS-PCM'03) , vol.3 , pp. 1325-1329
- Wang, W.Q.¹ Gao, W.² Ying, D.W.³

14
- 0033705976
- Speech/music discrimination for multimedia applications
- IEEE
- El-Maleh, K., Klein, M., Petrucci, G., Kabal, P.: Speech/music discrimination for multimedia applications. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'00, IEEE), Vol. 4, pp. 2445-2448 (2000)
- (2000) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'00) , vol.4 , pp. 2445-2448
- El-Maleh, K.¹ Klein, M.² Petrucci, G.³ Kabal, P.⁴

15
- 20444472966
- A speech/music discriminator based on rms and zero-crossings
- Panagiotakis, C., Tziritas, G.: A speech/music discriminator based on rms and zero-crossings. IEEE Trans. Multimedia (2004)
- (2004) IEEE Trans. Multimedia
- Panagiotakis, C.¹ Tziritas, G.²

16
- 78649537792
- Applying neural network on content-based audio classification
- IEEE
- Shao, X., Xu, C., Kankanhalli, M.S.: Applying neural network on content-based audio classification. In: Proceedings of the Fourth International Conference on Information, Communications and Signal Processing, IEEE, Vol. 3, pp. 1823-1825 (2003)
- (2003) Proceedings of the Fourth International Conference on Information, Communications and Signal Processing , vol.3 , pp. 1823-1825
- Shao, X.¹ Xu, C.² Kankanhalli, M.S.³

17
- 4544345094
- A comparison of human and automatic musical genre classification
- IEEE
- Lippens, S., Martens, J.P., De Mulder, T., Tzanetakis, G.: A comparison of human and automatic musical genre classification. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'04, IEEE), Vol. 4, pp. IV-233-IV-236 (2004)
- (2004) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'04) , vol.4
- Lippens, S.¹ Martens, J.P.² De Mulder, T.³ Tzanetakis, G.⁴

18
- 4544304284
- Harmonicity and dynamics-based features for audio
- IEEE
- Srinivasan, S.H., Kankanhalli, M.: Harmonicity and dynamics-based features for audio. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'04, IEEE), Vol. 4, pp. IV-321-IV-324 (2004)
- (2004) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'04) , vol.4
- Srinivasan, S.H.¹ Kankanhalli, M.²

19
- 4243919627
- Master's thesis, Department of Information Technology, Tampere University of Technology, Finland
- Vesa Peltonen: Computational auditory scene recognition. Master's thesis, Department of Information Technology, Tampere University of Technology, Finland (2001)
- (2001) Computational Auditory Scene Recognition
- Peltonen, V.¹

20
- 0010053023
- Automatic musical genre classification of audio signals
- Tzanetakis, G., Essl, G., Cook, P.: Automatic musical genre classification of audio signals. In: Proceedings of the International Symposium on Music Information Retrieval (ISMIR'01), pp. 205-210 (2001)
- (2001) Proceedings of the International Symposium on Music Information Retrieval (ISMIR'01) , pp. 205-210
- Tzanetakis, G.¹ Essl, G.² Cook, P.³

21
- 0036648502
- Musical genre classification of audio signals
- Tzanetakis, G., Cook, P.: Musical genre classification of audio signals. IEEE Trans. Speech Audio Proc. 10(5), 293-302 (2002)
- (2002) IEEE Trans. Speech Audio Proc. , vol.10 , Issue.5 , pp. 293-302
- Tzanetakis, G.¹ Cook, P.²

22
- 0037708486
- Content-based audio classification and segmentation by using support vector machines
- Lu, L., Zhang, H.-J., Li, S.Z.: Content-based audio classification and segmentation by using support vector machines. ACM Mult. Sys. J. 8(6), 482-492 (2003)
- (2003) ACM Mult. Sys. J. , vol.8 , Issue.6 , pp. 482-492
- Lu, L.¹ Zhang, H.-J.² Li, S.Z.³

23
- 0036556701
- Audio classification in speech and music: A comparison between a statistical and a neural approach
- Bugatti, A., Flammini, A., Migliorati, P.: Audio classification in speech and music: A comparison between a statistical and a neural approach. EURASIP J. Appl. Sig. Proc. 4, 372-378 (2002)
- (2002) EURASIP J. Appl. Sig. Proc. , vol.4 , pp. 372-378
- Bugatti, A.¹ Flammini, A.² Migliorati, P.³

24
- 0034792569
- A robust audio classification and segmentation method
- ACM
- Lu, L., Jiang, H., Zhang, H.-J.: A robust audio classification and segmentation method. In: Proceedings of the 9th ACM International Conference on Multimedia (MM'01, ACM), pp. 203-211 (2001)
- (2001) Proceedings of the 9th ACM International Conference on Multimedia (MM'01) , pp. 203-211
- Lu, L.¹ Jiang, H.² Zhang, H.-J.³

25
- 0036816475
- Content analysis for audio classification and segmentation
- Lu, L., Zhang, H.-J., Jiang, H.: Content analysis for audio classification and segmentation. IEEE Trans. Speech Audio Proc. 10(7), 504-516 (2002)
- (2002) IEEE Trans. Speech Audio Proc. , vol.10 , Issue.7 , pp. 504-516
- Lu, L.¹ Zhang, H.-J.² Jiang, H.³

26
- 10044253046
- Speech music discrimination using class-specific features
- IEEE
- Beierholm, T., Baggenstoss, P.M.: Speech music discrimination using class-specific features. In: Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04, IEEE), Vol. 2, pp. 379-382 (2004)
- (2004) Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04) , vol.2 , pp. 379-382
- Beierholm, T.¹ Baggenstoss, P.M.²

27
- 0028727261
- Detection of human speech in structured noise
- IEEE
- Hoyt, J.D., Wechsler, H.: Detection of human speech in structured noise. In: Proceedings of the International Conference on Neural Networks, IEEE, Vol. 7, pp. 4493-4496 (1994)
- (1994) Proceedings of the International Conference on Neural Networks , vol.7 , pp. 4493-4496
- Hoyt, J.D.¹ Wechsler, H.²

28
- 0035308233
- Classification of general audio data for content-based retrieval
- Li, D., Sethi, I.K., Dimitrova, N., McGee, T.: Classification of general audio data for content-based retrieval. Patt. Recog. Lett. 22(5), 533-544 (2001)
- (2001) Patt. Recog. Lett. , vol.22 , Issue.5 , pp. 533-544
- Li, D.¹ Sethi, I.K.² Dimitrova, N.³ McGee, T.⁴

29
- 84889075620
- A framework for audio analysis based on classification and temporal segmentation
- IEEE
- Tzanetakis, G., Cook, P.: A framework for audio analysis based on classification and temporal segmentation. In: EUROMICRO Workshop on Music Technology and Audio Processing, IEEE, Vol. 2, pp. 61-67 (1999)
- (1999) EUROMICRO Workshop on Music Technology and Audio Processing , vol.2 , pp. 61-67
- Tzanetakis, G.¹ Cook, P.²

30
- 0031624374
- Classification of audio signals using statistical features on time and wavelet transform domains
- IEEE
- Lambrou, T., Kudumakis, P., Speller, R., Sandler, M., Linney, A.: Classification of audio signals using statistical features on time and wavelet transform domains. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'98, IEEE), Vol. 6, pp. 3621-3624 (1998)
- (1998) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'98) , vol.6 , pp. 3621-3624
- Lambrou, T.¹ Kudumakis, P.² Speller, R.³ Sandler, M.⁴ Linney, A.⁵

31
- 0031619927
- Classification of transient time-varying signals using dft and wavelet packet based methods
- IEEE
- Delfs, C., Jondral, R.: Classification of transient time-varying signals using dft and wavelet packet based methods. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'98, IEEE), Vol. 3, pp. 1569-1572 (1998)
- (1998) Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'98) , vol.3 , pp. 1569-1572
- Delfs, C.¹ Jondral, R.²

32
- 0004008854
- Plenum Press, New York
- Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York (1981)
- (1981) Pattern Recognition with Fuzzy Objective Function Algorithms
- Bezdek, J.C.¹

33
- 0003922190
- Wiley, New York
- Duda, R.O., Stork, D.G., Hart, P.E.: Pattern classification, 2nd edn. Wiley, New York (2001)
- (2001) Pattern Classification, 2nd Edn.
- Duda, R.O.¹ Stork, D.G.² Hart, P.E.³

34
- 33746924151
- Master's thesis, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
- Kashif Saeed Khan, M.: Automatic classification of speech and music in digitized audio. Master's thesis, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia (2005)
- (2005) Automatic Classification of Speech and Music in Digitized Audio
- Kashif Saeed Khan, M.¹

35
- 0024861871
- Approximation by superpositions of a sigmoidal function
- Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Con. Sig. Sys. 2(4), 303-314 (1989)
- (1989) Math. Con. Sig. Sys. , vol.2 , Issue.4 , pp. 303-314
- Cybenko, G.¹

36
- 33746883995
- Artificial neural networks for speech and vision
- Mammone, R.J. (ed.): Chapman & Hall, London
- Mammone, R.J. (ed.): Artificial neural networks for speech and vision. Chapman & Hall Neural Computing, 1st edn. Chapman & Hall, London (1994)
- (1994) Chapman & Hall Neural Computing, 1st Edn.

37
- 0022594196
- An introduction to hidden markov models
- Rabiner, L.R., Juang, B.H.: An introduction to hidden markov models. IEEE ASSP Magazine 3(1), 4-16 (1986)
- (1986) IEEE ASSP Magazine , vol.3 , Issue.1 , pp. 4-16
- Rabiner, L.R.¹ Juang, B.H.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.