SCOPUS information retrieval platform

Computer Speech and Language

Volume 24, Issue 2, 2010, Pages 341-357

A wavelet-based parameterization for speech/music discrimination

(4) Didiot, E.ᵃ Illina, I.ᵃ Fohr, D.ᵃ Mella, O.ᵃ

ᵃ INRIA (France)

Author keywords

Dynamic parameters; Long term parameters; Segmentation; Speech music discrimination; Static parameters; Wavelets

Indexed keywords

DYNAMIC PARAMETERS; LONG-TERM PARAMETERS; SEGMENTATION; SPEECH/MUSIC DISCRIMINATION; STATIC PARAMETERS; WAVELETS; FOURIER ANALYSIS; FREQUENCY BANDS; LAW ENFORCEMENT; PARAMETERIZATION; SIGNAL PROCESSING; SPEECH RECOGNITION; WAVELET DECOMPOSITION

EID: 70349238685
PISSN: 08852308
EISSN: 10958363
Source Type: Journal
DOI: 10.1016/j.csl.2009.05.003
Document Type: Article

Times cited : (43)

References (51)

1
- 0037401304
- Speech/music discrimination using entropy and dynamism features in a HMM classification framework
- Ajmera J., McCowan I., and Bourlard H. Speech/music discrimination using entropy and dynamism features in a HMM classification framework. Speech Communication 40 (2003) 351-363
- (2003) Speech Communication , vol.40 , pp. 351-363
- Ajmera, J.¹ McCowan, I.² Bourlard, H.³

2
- 33947319848
- Application of fisher linear discriminant analysis to speech/music classification
- Alexandre-Cortizo, E., Rosa-Zurera, M., Lopez-Ferreras, F., 2005. Application of fisher linear discriminant analysis to speech/music classification. In: IEEE Eurocon, pp. 1666-1669.
- (2005) IEEE Eurocon , pp. 1666-1669
- Alexandre-Cortizo, E.¹ Rosa-Zurera, M.² Lopez-Ferreras, F.³

3
- 0035688686
- Locating singing voice segments within music signals
- Berenzweig, A., Ellis, P.W., 2001. Locating singing voice segments within music signals. In: IEEE Workshop on Apps of Sign. Proc. to Acous. and Audio.
- (2001) IEEE Workshop on Apps of Sign. Proc. to Acous. and Audio
- Berenzweig, A.¹ Ellis, P.W.²

4
- 84987906938
- ANTS: Le système de transcription automatique du LORIA
- Brun, A., Cerisara, C., Fohr, D., Illina, I., Langlois, D., Mella, O., Smaili, K., 2004. ANTS: le système de transcription automatique du LORIA. In: Journées d'Etude sur la Parole - JEP'04.
- (2004) Journées d'Etude sur la Parole - JEP'04
- Brun, A.¹ Cerisara, C.² Fohr, D.³ Illina, I.⁴ Langlois, D.⁵ Mella, O.⁶ Smaili, K.⁷

5
- 0032638667
- A comparison of features for speech, music discrimination
- Carey, M.J., Parris, E.S., Lloyd-Thomas, H., 1999. A comparison of features for speech, music discrimination. In: Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP, pp. 149-152.
- (1999) Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing , pp. 149-152
- Carey, M.J.¹ Parris, E.S.² Lloyd-Thomas, H.³

6
- 0034853025
- Robust singing detection in speech/music discriminator design
- Chou, W., Gu, L., 2001. Robust singing detection in speech/music discriminator design. In: Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP, pp. 865-868.
- (2001) Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing , pp. 865-868
- Chou, W.¹ Gu, L.²

7
- 44949224036
- Ph.D. thesis, Université Henri Poincaré, Nancy, France
- Deviren, M., 2004. Revisiting speech recognition systems: dynamic Bayesian networks and new computational paradigms. Ph.D. thesis, Université Henri Poincaré, Nancy, France.
- (2004) Revisiting speech recognition systems: Dynamic Bayesian networks and new computational paradigms
- Deviren, M.¹

8
- 70349242814
- Deviren, M., Daoudi, K., 2003. Frequency Filtering or Wavelet Filtering? ICANN/ICONIP.
- (2003) Frequency Filtering or Wavelet Filtering? ICANN/ICONIP
- Deviren, M.¹ Daoudi, K.²

9
- 44949087900
- A wavelet-based parametrization for speech/music segmentation
- Didiot, E., Illina, I., Mella, O., Haton, J.-P., Fohr, D., 2006. A wavelet-based parametrization for speech/music segmentation. In: Proc. Int. Conf. on Spoken Language Processing, ICSLP, pp. 653-656.
- (2006) Proc. Int. Conf. on Spoken Language Processing, ICSLP , pp. 653-656
- Didiot, E.¹ Illina, I.² Mella, O.³ Haton, J.-P.⁴ Fohr, D.⁵

10
- 70349241267
- Speech/music discrimination based on wavelets for broadcast programs
- Didiot, E., Illina, I., Mella, O., Haton, J.-P., Fohr, D., 2006. Speech/music discrimination based on wavelets for broadcast programs. In: IEEE International Conference on Signal Processing and Multimedia Applications, pp. 151-156.
- (2006) IEEE International Conference on Signal Processing and Multimedia Applications , pp. 151-156
- Didiot, E.¹ Illina, I.² Mella, O.³ Haton, J.-P.⁴ Fohr, D.⁵

11
- 33745225159
- Auditory teager energy cepstrum coefficients for robust speech recognition
- Dimitriadis, D., Maragos, P., Potamianos, A., 2005. Auditory teager energy cepstrum coefficients for robust speech recognition. Proc. European Conf. on Speech Communication and Technology.
- (2005) Proc. European Conf. on Speech Communication and Technology
- Dimitriadis, D.¹ Maragos, P.² Potamianos, A.³

12
- 33745182925
- Automatic music genre classification using second-order statistical measures for the prescriptive approach
- Ezzaidi, H., Rouat, J., 2005. Automatic music genre classification using second-order statistical measures for the prescriptive approach. In: Proc. European Conf. on Speech Communication and Technology, pp. 141-144.
- (2005) Proc. European Conf. on Speech Communication and Technology , pp. 141-144
- Ezzaidi, H.¹ Rouat, J.²

13
- 70349263032
- Segmentation en macro-classes acoustiques d'émissions radiophoniques dans le cadre d'ESTER
- Fredouille, C., Matrouf, D., Linares, G., Nocera, P., 2004. Segmentation en macro-classes acoustiques d'émissions radiophoniques dans le cadre d'ESTER. In: Journées d'Etude sur la Parole - JEP04.
- (2004) Journées d'Etude sur la Parole
- Fredouille, C.¹ Matrouf, D.² Linares, G.³ Nocera, P.⁴

14
- 3142664255
- Hybrid SVM/HMM architectures for speech recognition
- Ganapathiraju, A., Picone, J., 2000. Hybrid SVM/HMM architectures for speech recognition. In: Neural Information Processing Systems.
- (2000) Neural Information Processing Systems
- Ganapathiraju, A.¹ Picone, J.²

15
- 0036567851
- The LIMSI broadcast news transcription system
- Gauvain J.-L., Lamel L., and Adda G. The LIMSI broadcast news transcription system. Speech Communication 37 1 (2002) 89-108
- (2002) Speech Communication , vol.37 , Issue.1 , pp. 89-108
- Gauvain, J.-L.¹ Lamel, L.² Adda, G.³

16
- 0002725741
- The LIMSI 1998 Hub-4E transcription system
- Gauvain, J.L., Lamel, L., Adda, G., Jardino, M., 1999. The LIMSI 1998 Hub-4E transcription system. In: Proc. DARPA Broadcast News Transcription Workshop, February, pp. 99-104.
- (1999) Proc. DARPA Broadcast News Transcription Workshop , pp. 99-104
- Gauvain, J.L.¹ Lamel, L.² Adda, G.³ Jardino, M.⁴

17
- 0034842305
- Integration of fixed and multiple resolution analysis in a speech recognition system
- Gemello, R., Albesano, D., Moisa, L., De Mori, R., 2001. Integration of fixed and multiple resolution analysis in a speech recognition system. In: Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP, pp. 121-124.
- (2001) Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing , pp. 121-124
- Gemello, R.¹ Albesano, D.² Moisa, L.³ De Mori, R.⁴

18
- 70349230534
- ESTER, une campagne d'évaluation des systèmes d'indexation automatique d'émissions radiophoniques en français
- Gravier, G., Bonastre, J.F., Geoffrois, E., Galliano, S., Mc Tait, K., Choukri, K., 2004. ESTER, une campagne d'évaluation des systèmes d'indexation automatique d'émissions radiophoniques en français. In: Journées d'Etude sur la Parole - JEP04.
- (2004) Journées d'Etude sur la Parole
- Gravier, G.¹ Bonastre, J.F.² Geoffrois, E.³ Galliano, S.⁴ Mc Tait, K.⁵ Choukri, K.⁶

19
- 85071069033
- Segmentation and classification of broadcast news audio
- Hain, T., Woodland, P., 1998. Segmentation and classification of broadcast news audio. In: Proc. Int. Conf. on Spoken Language Processing, ICSLP.
- (1998) Proc. Int. Conf. on Spoken Language Processing, ICSLP
- Hain, T.¹ Woodland, P.²

20
- 0032629671
- The teager energy based feature parameters for robust speech recognition in car noise
- Jabloun, F., Enis Cetin, A., 1999. The teager energy based feature parameters for robust speech recognition in car noise. In: Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP, pp. 273-276.
- (1999) Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing , pp. 273-276
- Jabloun, F.¹ Enis Cetin, A.²

21
- 0003770709
- Kluwer
- Junqua J.-C., and Haton J.-P. Robustness in Automatic Speech Recognition: Problems, Issues, and Solutions (1995), Kluwer
- (1995) Robustness in Automatic Speech Recognition: Problems, Issues, and Solutions
- Junqua, J.-C.¹ Haton, J.-P.²

22
- 20444491279
- Automatic classification of speech and music using neural networks
- Khan, M., Al-Khatib, W., Moinuddin, M., 2004. Automatic classification of speech and music using neural networks. In: Proc. ACM Int. Workshop on Multimedia Databases, pp. 94-99.
- (2004) Proc. ACM Int. Workshop on Multimedia Databases , pp. 94-99
- Khan, M.¹ Al-Khatib, W.² Moinuddin, M.³

23
- 0025635254
- On a simple algorithm to calculate the 'energy' of a signal
- Kaiser, J.F., 1990. On a simple algorithm to calculate the 'energy' of a signal. In: Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP, pp. 381-384.
- (1990) Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing , pp. 381-384
- Kaiser, J.F.¹

24
- 34247239858
- Speech/music discrimination based on spectral peak analysis and multi-layer perceptron
- Keum, J.S., Lee, H.S., 2006. Speech/music discrimination based on spectral peak analysis and multi-layer perceptron. In: International Conference on Hybrid Information Technology, vol. 2, pp. 56-61.
- (2006) International Conference on Hybrid Information Technology , vol.2 , pp. 56-61
- Keum, J.S.¹ Lee, H.S.²

25
- 33746879922
- Machine learning-based classification of speech and music
- Khan M., and Al-Khatib W.G. Machine learning-based classification of speech and music. Multi-Media Systems 12 (2006) 55-67
- (2006) Multi-Media Systems , vol.12 , pp. 55-67
- Khan, M.¹ Al-Khatib, W.G.²

26
- 0034845044
- Kim, I.J., Yang, S.I., Kwon, Y., 2001. Speech Enhancement using Adaptive Wavelet Shrinkage. In: ISIE-2001, 1, pp. 501-504.
- Kim, I.J., Yang, S.I., Kwon, Y., 2001. Speech Enhancement using Adaptive Wavelet Shrinkage. In: ISIE-2001, vol. 1, pp. 501-504.

27
- 84912104934
- A new noise-robust subband front-end and its comparison to PLP
- Kryze, D., Rigazio, L., Junqua, J.-C., 1999. A new noise-robust subband front-end and its comparison to PLP. In: Automatic Speech Recognition and Understanding Workshop.
- (1999) Automatic Speech Recognition and Understanding Workshop
- Kryze, D.¹ Rigazio, L.² Junqua, J.-C.³

28
- 85008016199
- Audio classification and categorization based on wavelets and support vector machine
- Lin C.-C., Chen S.-H., Truong T.-K., and Chang Y. Audio classification and categorization based on wavelets and support vector machine. IEEE Transactions on Speech and Audio Processing 13 (2005) 644-651
- (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , pp. 644-651
- Lin, C.-C.¹ Chen, S.-H.² Truong, T.-K.³ Chang, Y.⁴

29
- 4444295205
- Mel frequency cepstral coefficients for music modeling
- Logan, B., 2000. Mel frequency cepstral coefficients for music modeling. In: Proc. International Symposium on Music Information Retrieval.
- (2000) Proc. International Symposium on Music Information Retrieval
- Logan, B.¹

30
- 0036816475
- Content analysis for audio classification and segmentation
- Lu, L., Zhang, H.-J., Jiang, H., 2002. Content analysis for audio classification and segmentation. In: IEEE Transactions on Speech and Audio Processing, vol. 10(7), pp. 504-516.
- (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.7 , pp. 504-516
- Lu, L.¹ Zhang, H.-J.² Jiang, H.³

31
- 0024700097
- A theory for multiresolution signal decomposition: the wavelet representation
- Mallat S. A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (1989) 674-693
- (1989) IEEE Transactions on Pattern Analysis and Machine Intelligence , vol.11 , pp. 674-693
- Mallat, S.¹

32
- 0003456805
- Academic Press
- Mallat S. A Wavelet Tour of Signal Processing (1998), Academic Press
- (1998) A Wavelet Tour of Signal Processing
- Mallat, S.¹

33
- 77956386426
- Les ondelettes et leurs applications
- Misiti M., Misiti Y., Oppenheim G., and Poggi J.-M. Les ondelettes et leurs applications. Editeur Lavoisier Hermes (2003)
- (2003) Editeur Lavoisier Hermes
- Misiti, M.¹ Misiti, Y.² Oppenheim, G.³ Poggi, J.-M.⁴

34
- 13144306118
- A speech/music discriminator based on RMS and zero-crossings
- Panagiotakis, C., Tziritas, G., 2005. A speech/music discriminator based on RMS and zero-crossings. In: IEEE Transaction on Multimedia, vol. 7(1), pp. 155-166.
- (2005) IEEE Transaction on Multimedia , vol.7 , Issue.1 , pp. 155-166
- Panagiotakis, C.¹ Tziritas, G.²

35
- 0036288612
- Speech and music classification in audio documents
- Pinquier, J., Senac, C., Andre-Obrecht, R., 2002. Speech and music classification in audio documents. In: Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP, pp. 4164-4167.
- (2002) Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing , pp. 4164-4167
- Pinquier, J.¹ Senac, C.² Andre-Obrecht, R.³

36
- 0035278964
- Time-frequency distributions for automatic speech recognition
- Potamianos, A., Maragos, P., 2001. Time-frequency distributions for automatic speech recognition. In: IEEE Transactions on Speech and Audio Processing, pp. 196-200.
- (2001) IEEE Transactions on Speech and Audio Processing , pp. 196-200
- Potamianos, A.¹ Maragos, P.²

37
- 78651558871
- Segmentation Parole/Musique pour la transcription automatique
- Razik, J., Fohr, D., Mella, O., Parlangeau-Vallès, N., 2004. Segmentation Parole/Musique pour la transcription automatique. In: Journées d'Etudes sur la Parole.
- (2004) Journées d'Etudes sur la Parole
- Razik, J.¹ Fohr, D.² Mella, O.³ Parlangeau-Vallès, N.⁴

38
- 70349244394
- Comparison of two speech/music segmentation systems for audio indexing on the web
- Razik, J., Senac, C., Fohr, D., Mella, O., Parlangeau-Valles, N., 2003. Comparison of two speech/music segmentation systems for audio indexing on the web. In: Proc. Multi Conference on Systemics, Cybernetics and Informatics.
- (2003) Proc. Multi Conference on Systemics, Cybernetics and Informatics
- Razik, J.¹ Senac, C.² Fohr, D.³ Mella, O.⁴ Parlangeau-Valles, N.⁵

39
- 27644502441
- Image compression from DCT to wavelets: a review
- Saha S. Image compression from DCT to wavelets: a review. ACM Crossroads 6 3 (2000) 644-651
- (2000) ACM Crossroads , vol.6 , Issue.3 , pp. 644-651
- Saha, S.¹

40
- 0033688848
- High resolution speech feature parameterization for monophone-based stressed speech recognition
- Sarikaya R., and Hansen J.H.L. High resolution speech feature parameterization for monophone-based stressed speech recognition. IEEE Signal Processing Letters 7 7 (2000) 182-185
- (2000) IEEE Signal Processing Letters , vol.7 , Issue.7 , pp. 182-185
- Sarikaya, R.¹ Hansen, J.H.L.²

41
- 0029765670
- Real-time discrimination of broadcast speech/music
- Saunders, J., 1996. Real-time discrimination of broadcast speech/music. In: Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP, pp. 993-996.
- (1996) Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing , pp. 993-996
- Saunders, J.¹

42
- 0030648077
- Construction and evaluation of a robust multifeature speech/music discriminator
- Scheirer, E., Slaney, M., 1997. Construction and evaluation of a robust multifeature speech/music discriminator. In: Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP, pp. 1331-1334.
- (1997) Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing , pp. 1331-1334
- Scheirer, E.¹ Slaney, M.²

43
- 45849121392
- Detection of speech and music based on spectral tracking
- Taniguchi, T., Tohyama, M., Katsuhiko, S., 2008. Detection of speech and music based on spectral tracking. In: Speech Communication, vol. 50, pp. 547-563.
- (2008) Speech Communication , vol.50 , pp. 547-563
- Taniguchi, T.¹ Tohyama, M.² Katsuhiko, S.³

44
- 0002751623
- Segment generation and clustering in the HTK broadcast news transcription system
- Hain, T., Johnson, S.E., Tuerk, A., Woodland, P.C., Young, S.J., 1998. Segment generation and clustering in the HTK broadcast news transcription system. In: Proc. 1998 DARPA Broadcast News Transcription and Understanding Workshop, pp. 133-137.
- (1998) Proc. 1998 DARPA Broadcast News Transcription and Understanding Workshop , pp. 133-137
- Hain, T.¹ Johnson, S.E.² Tuerk, A.³ Woodland, P.C.⁴ Young, S.J.⁵

45
- 84892175859
- Automatic speech recognition based on cepstral coefficients and a mel-based discrete energy operator
- Tolba, H., O'Shaughnessy, D., 1998. Automatic speech recognition based on cepstral coefficients and a mel-based discrete energy operator. In: Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing, ICASSP, pp. 973-976.
- (1998) Proc. IEEE Int. Conf. on Acoustic, Speech and Signal Processing , pp. 973-976
- Tolba, H.¹ O'Shaughnessy, D.²

46
- 0036648502
- Musical genre classification of audio signals
- Tzanetakis G., and Cook P. Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing 10 5 (2002) 293-302
- (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.5 , pp. 293-302
- Tzanetakis, G.¹ Cook, P.²

47
- 16244420091
- Multigroup classification of audio signals using time-frequency parameters
- Umapathy K., Krishnan S., and Jimaa S. Multigroup classification of audio signals using time-frequency parameters. IEEE Transaction on Multimedia 7 2 (2005) 308-315
- (2005) IEEE Transaction on Multimedia , vol.7 , Issue.2 , pp. 308-315
- Umapathy, K.¹ Krishnan, S.² Jimaa, S.³

48
- 77958036231
- Speech/music discrimination based on posterior probability features
- Williams, G., Ellis, D., 1999. Speech/music discrimination based on posterior probability features. In: Proc. European Conf. on Speech Communication and Technology, pp. 687-690.
- (1999) Proc. European Conf. on Speech Communication and Technology , pp. 687-690
- Williams, G.¹ Ellis, D.²

49
- 0343697653
- CRC Press, LLC
- Wold E., Blum T., Keislar D., and Wheater J. Classification, Search and Retrieval of Audio. CRC Handbook of Multimedia Computing (1999), CRC Press, LLC
- (1999) Classification, Search and Retrieval of Audio. CRC Handbook of Multimedia Computing
- Wold, E.¹ Blum, T.² Keislar, D.³ Wheater, J.⁴

50
- 0003822743
- Cambridge, England, Entropic Ltd, Microsoft
- Young, S.J., Kershaw, D., Odell, J., Ollason, D., Valtchev, V., Woodland, P., 1995. The HTK Book. Cambridge, England, Entropic Ltd., Microsoft.
- (1995) The HTK Book
- Young, S.J.¹ Kershaw, D.² Odell, J.³ Ollason, D.⁴ Valtchev, V.⁵ Woodland, P.⁶

51
- 0035340677
- Audio content analysis for online audiovisual data segmentation and classification
- Zhang, T., Kuo, C.-C.J., 2001. Audio content analysis for online audiovisual data segmentation and classification. In: IEEE Transactions on Speech and Audio Processing, vol. 9(4), pp. 441-457.
- (2001) IEEE Transactions on Speech and Audio Processing , vol.9 , Issue.4 , pp. 441-457
- Zhang, T.¹ Kuo, C.-C.J.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.