-
2
-
-
0036288688
-
A new speaker change detection method for two-speaker segmentation
-
Adami, A. G., Kajarekar, S. S. and Hermansky, H. (2002), A new speaker change detection method for two-speaker segmentation, in 'Proceedings of the ICASSP', Vol. 4, pp. 3908-3911.
-
(2002)
Proceedings of the ICASSP
, vol.4
, pp. 3908-3911
-
-
Adami, A.G.1
Kajarekar, S.S.2
Hermansky, H.3
-
4
-
-
70349442041
-
An accurate timbre model for musical instruments and its application to classification
-
Burred, J. J., Röbel, A. and Rodet, X. (2006), An accurate timbre model for musical instruments and its application to classification, in 'Proceedings of the First Workshop on Learning the Semantics of Audio Signals (LSAS)', pp. 22-32.
-
(2006)
Proceedings of the First Workshop on Learning the Semantics of Audio Signals (LSAS)
, pp. 22-32
-
-
Burred, J.J.1
Röbel, A.2
Rodet, X.3
-
5
-
-
0031233424
-
Speaker recognition: A tutorial
-
Campbell, J. P. (1997), 'Speaker recognition: A tutorial', Proc. IEEE 85(9), 1437-1462.
-
(1997)
Proc. IEEE
, vol.85
, Issue.9
, pp. 1437-1462
-
-
Campbell, J.P.1
-
6
-
-
0032638667
-
A comparison of features for speech, music discrimination
-
Carey, M. J., Parris, E. S. and Lloyd-Thomas, H. (1999), A comparison of features for speech, music discrimination, in 'Proceedings of the ICASSP', Vol. 1, pp. 149-152.
-
(1999)
Proceedings of the ICASSP
, vol.1
, pp. 149-152
-
-
Carey, M.J.1
Parris, E.S.2
Lloyd-Thomas, H.3
-
7
-
-
0035364397
-
MPEG-7 sound-recognition tools
-
Casey, M. (2001), 'MPEG-7 sound-recognition tools', IEEE Trans. Circ. Syst. Video Tech. 11(6), 737-747.
-
(2001)
IEEE Trans. Circ. Syst. Video Tech.
, vol.11
, Issue.6
, pp. 737-747
-
-
Casey, M.1
-
8
-
-
33845439374
-
Foafing the music: Bridging the semantic gap in music recommendation
-
of LNCS
-
Celma, O. (2006), Foafing the music: Bridging the semantic gap in music recommendation, in 'Proceedings of the 5th International Semantic Web Conference', Vol. 4273 of LNCS, pp. 927-934.
-
(2006)
Proceedings of the 5th International Semantic Web Conference
, vol.4273
, pp. 927-934
-
-
Celma, O.1
-
12
-
-
85009212151
-
A sequential metric-based audio segmentation method via the Bayesian information criterion
-
Cheng, S.-S. and Wang, H.-M. (2003), A sequential metric-based audio segmentation method via the Bayesian information criterion, in 'Proceedings of the EUROSPEECH', pp. 945-948.
-
(2003)
Proceedings of the EUROSPEECH
, pp. 945-948
-
-
Cheng, S.-S.1
Wang, H.-M.2
-
13
-
-
84892290921
-
-
of LNCS, Springer, New York
-
Coden, A., Brown, E. W. and Srinivasan, S., eds (2001), Proceedings of the ACM SIGIR 2001 Workshop on Information Retrieval Techniques for Speech Applications, Vol. 2273 of LNCS, Springer, New York.
-
(2001)
Proceedings of the ACM SIGIR 2001 Workshop on Information Retrieval Techniques for Speech Applications
, vol.2273
-
-
Coden, A.1
Brown, E.W.2
Srinivasan, S.3
-
15
-
-
0003603515
-
-
Cambridge University Press, Cambridge
-
Cole, R. A., Mariani, J., Uszkoreit, H., Zaenen, A. and Zue, V., eds (1998), Survey of the state of the art in Human Language Technology, Cambridge University Press, Cambridge.
-
(1998)
Survey of the State of the Art in Human Language Technology
-
-
Cole, R.A.1
Mariani, J.2
Uszkoreit, H.3
Zaenen, A.4
Zue, V.5
-
16
-
-
0038368765
-
Combination of similarity measures for effective spoken document retrieval
-
Crestani, F. (2003), 'Combination of similarity measures for effective spoken document retrieval', J. Inform. Sci. 29(2), 87-96.
-
(2003)
J. Inform. Sci.
, vol.29
, Issue.2
, pp. 87-96
-
-
Crestani, F.1
-
17
-
-
34250285795
-
U-statistic hierarchical clustering
-
D'Andrade, R. (1978), U-statistic hierarchical clustering, in 'Psychometrika', Vol. 43, pp. 59-68.
-
(1978)
Psychometrika
, vol.43
, pp. 59-68
-
-
D'Andrade, R.1
-
18
-
-
0019053271
-
Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
-
Davis, S. B. and Mermelstein, P. (1980), 'Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences', IEEE Trans. Acoust., Speech, Signal Process. 28(4), 357-366.
-
(1980)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.28
, Issue.4
, pp. 357-366
-
-
Davis, S.B.1
Mermelstein, P.2
-
19
-
-
0034273195
-
DISTBIC: A speaker-based segmentation for audio data indexing
-
Delacourt, P. and Wellekens, C. J. (2000), 'DISTBIC: A speaker-based segmentation for audio data indexing', Speech Comm. 32(1), 111-126.
-
(2000)
Speech Comm.
, vol.32
, Issue.1
, pp. 111-126
-
-
Delacourt, P.1
Wellekens, C.J.2
-
21
-
-
0003922190
-
-
Wiley Interscience, New York
-
Duda, R. O., Hart, P. E. and Stork, D. G. (2000), Pattern Classification, Wiley Interscience, New York.
-
(2000)
Pattern Classification
-
-
Duda, R.O.1
Hart, P.E.2
Stork, D.G.3
-
23
-
-
84908266591
-
The beat spectrum: A new approach to rhythm analysis
-
Foote, J. (2001), The beat spectrum: A new approach to rhythm analysis, in 'Proceedings of the ICME', pp. 881-884.
-
(2001)
Proceedings of the ICME
, pp. 881-884
-
-
Foote, J.1
-
24
-
-
57649180845
-
Content-based retrieval of music and audio
-
C.-C. Jay Kuo et al., ed.
-
Foote, J. T. (1997), Content-based retrieval of music and audio, in C.-C. Jay Kuo et al., ed., 'Proceedings of the Electronic Imaging', Vol. 3229, pp. 138-147.
-
(1997)
Proceedings of the Electronic Imaging
, vol.3229
, pp. 138-147
-
-
Foote, J.T.1
-
25
-
-
85128356454
-
Partitioning and transcription of broadcast news data
-
Gauvain, J.-L., Lamel, L. and Adda, G. (1998), Partitioning and transcription of broadcast news data, in 'Proceedings of the ICSLP', Vol. 5, pp. 1335-1338.
-
(1998)
Proceedings of the ICSLP
, vol.5
, pp. 1335-1338
-
-
Gauvain, J.-L.1
Lamel, L.2
Adda, G.3
-
26
-
-
0028516097
-
Text-independent speaker identification
-
Gish, H. and Schmidt, M. (1994), 'Text-independent speaker identification', IEEE Signal Process Mag. 11(4), 18-32.
-
(1994)
IEEE Signal Process Mag.
, vol.11
, Issue.4
, pp. 18-32
-
-
Gish, H.1
Schmidt, M.2
-
27
-
-
0026400244
-
Segregation of speakers for speech recognition and speaker identification
-
Gish, H., Siu, M.-H. and Rohlicek, R. (1991), Segregation of speakers for speech recognition and speaker identification, in 'Proceedings of the ICASSP', pp. 873-876.
-
(1991)
Proceedings of the ICASSP
, pp. 873-876
-
-
Gish, H.1
Siu, M.-H.2
Rohlicek, R.3
-
28
-
-
0030372637
-
A probabilistic framework for feature-based speech recognition
-
Glass, J., Chang, J. and McCandless, M. (1996), A probabilistic framework for feature-based speech recognition, in 'Proceedings of the ICSLP', Vol. 4, pp. 2277-2280.
-
(1996)
Proceedings of the ICSLP
, vol.4
, pp. 2277-2280
-
-
Glass, J.1
Chang, J.2
McCandless, M.3
-
29
-
-
84892200966
-
A first approach to speech retrieval
-
Technical Report 238, ETH Zürich
-
Glavitsch, U. (1995), A first approach to speech retrieval, Technical Report 238, ETH Zürich, Institute of Information Systems.
-
(1995)
Institute of Information Systems
-
-
Glavitsch, U.1
-
30
-
-
0027151606
-
Recognition of environmental sounds
-
Goldhor, R. S. (1993), Recognition of environmental sounds, in 'Proceedings of the ICASSP', Vol. 1, pp. 149-152.
-
(1993)
Proceedings of the ICASSP
, vol.1
, pp. 149-152
-
-
Goldhor, R.S.1
-
31
-
-
36549057588
-
-
PhD thesis, Universitat Pompeu Fabra, Barcelona, Spain
-
Gómez, E. (2006), Tonal Description of Music Audio Signals, PhD thesis, Universitat Pompeu Fabra, Barcelona, Spain.
-
(2006)
Tonal Description of Music Audio Signals
-
-
Gómez, E.1
-
32
-
-
33847144778
-
Boosting for content-based audio classification and retrieval: An evaluation
-
Guo, G., Zhang, H.-J. and Li, S. Z. (2001), Boosting for content-based audio classification and retrieval: An evaluation, in 'Proceedings of the ICME', pp. 1200-1203.
-
(2001)
Proceedings of the ICME
, pp. 1200-1203
-
-
Guo, G.1
Zhang, H.-J.2
Li, S.Z.3
-
34
-
-
0033639509
-
Overview of the sixth text retrieval conference (TREC-6)
-
Harman, D. (2000), 'Overview of the sixth text retrieval conference (TREC-6)', Inform. Process. Manag. 36(1), 3-35.
-
(2000)
Inform. Process. Manag.
, vol.36
, Issue.1
, pp. 3-35
-
-
Harman, D.1
-
35
-
-
0025041264
-
Perceptual linear predictive (PLP) analysis of speech
-
Hermansky, H. (1990), 'Perceptual linear predictive (PLP) analysis of speech', J. Acoust. Soc. Am. 87(4), 1738-1752.
-
(1990)
J. Acoust. Soc. Am.
, vol.87
, Issue.4
, pp. 1738-1752
-
-
Hermansky, H.1
-
36
-
-
17444404830
-
Automatic classification of musical instrument sounds
-
Herrera, P., Peeters, G. and Dubnov, S. (2003), 'Automatic classification of musical instrument sounds', J. New. Music Res. 32(1), 3-21.
-
(2003)
J. New. Music Res.
, vol.32
, Issue.1
, pp. 3-21
-
-
Herrera, P.1
Peeters, G.2
Dubnov, S.3
-
37
-
-
85009070699
-
Automatic metric-based speech segmentation for broadcast news via principal component analysis
-
Hung, J.-W., Wang, H.-M. and Lee, L.-S. (2000), Automatic metric-based speech segmentation for broadcast news via principal component analysis, in 'Proceedings of the ICSLP', Vol. 4, pp. 121-124.
-
(2000)
Proceedings of the ICSLP
, vol.4
, pp. 121-124
-
-
Hung, J.-W.1
Wang, H.-M.2
Lee, L.-S.3
-
42
-
-
0014129195
-
Hierarchical clustering schemes
-
Johnson, S. C. (1967), 'Hierarchical clustering schemes', Psychometrika 32(3), 241-254.
-
(1967)
Psychometrika
, vol.32
, Issue.3
, pp. 241-254
-
-
Johnson, S.C.1
-
44
-
-
41149104768
-
Speaker change detection using support vector machines
-
Kartik, V., Satish, D. S. and Sekhar, C. C. (2005), Speaker change detection using support vector machines, in 'Proceedings of the ISCA ITRW on Non-linear Speech Processing', pp. 130-136.
-
(2005)
Proceedings of the ISCA ITRW on Non-linear Speech Processing
, pp. 130-136
-
-
Kartik, V.1
Satish, D.S.2
Sekhar, C.C.3
-
45
-
-
0033692969
-
Strategies for automatic segmentation of audio data
-
Kemp, T., Schmidt, M., Westphal, M. and Waibel, A. (2000), Strategies for automatic segmentation of audio data, in 'Proceedings ICASSP', Vol. 3, pp. 1423-1426.
-
(2000)
Proceedings ICASSP
, vol.3
, pp. 1423-1426
-
-
Kemp, T.1
Schmidt, M.2
Westphal, M.3
Waibel, A.4
-
46
-
-
33646908801
-
The 1995 ABBOT hybrid connectionist-HMM large-vocabulary recognition system
-
Kershaw, D., Robinson, A. and Renals, S. (1996), The 1995 ABBOT hybrid connectionist-HMM large-vocabulary recognition system, in 'Proceedings of the ARPA Speech Recognition Workshop', pp. 93-98.
-
(1996)
Proceedings of the ARPA Speech Recognition Workshop
, pp. 93-98
-
-
Kershaw, D.1
Robinson, A.2
Renals, S.3
-
48
-
-
33646789869
-
Hybrid speaker-based segmentation system using model-level clustering
-
Kim, H.-G., Ertelt, D. and Sikora, T. (2005), Hybrid speaker-based segmentation system using model-level clustering, in 'Proceedings of the ICASSP', Vol. 1, pp. 745-748.
-
(2005)
Proceedings of the ICASSP
, vol.1
, pp. 745-748
-
-
Kim, H.-G.1
Ertelt, D.2
Sikora, T.3
-
49
-
-
84889435599
-
-
John Wiley & Sons, New York
-
Kim, H.-G., Moreau, N. and Sikora, T. (2005), MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval, John Wiley & Sons, New York.
-
(2005)
MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval
-
-
Kim, H.-G.1
Moreau, N.2
Sikora, T.3
-
50
-
-
33748519104
-
Signal processing methods for the transcription of music
-
PhD thesis, Finland
-
Klapuri, A. (2004), Signal Processing Methods for the Transcription of Music, PhD thesis, Tampere University of Technology, Tampere, Finland.
-
(2004)
Tampere University of Technology, Tampere
-
-
Klapuri, A.1
-
51
-
-
33745217037
-
Using syllable-based indexing features and language models to improve German spoken document retrieval
-
Larson, M. and Eickeler, S. (2003), Using syllable-based indexing features and language models to improve German spoken document retrieval, in 'Proceedings of the EUROSPEECH', pp. 1217-1220.
-
(2003)
Proceedings of the EUROSPEECH
, pp. 1217-1220
-
-
Larson, M.1
Eickeler, S.2
-
52
-
-
84892243397
-
-
Kluwer Academic Publishers, chapter Appendix I
-
Lee, K.-F. (1989), Automatic Speech Recognition, Kluwer Academic Publishers, chapter Appendix I.2, p. 147.
-
(1989)
Automatic Speech Recognition
, p. 147
-
-
Lee, K.-F.1
-
53
-
-
85119434191
-
Fast speaker change detection for broadcast news transcription and indexing
-
Liu, D. and Kubala, F. (1999), Fast speaker change detection for broadcast news transcription and indexing, in 'Proceedings of the EUROSPEECH', Vol. 3, pp. 1031-1034.
-
(1999)
Proceedings of the EUROSPEECH
, vol.3
, pp. 1031-1034
-
-
Liu, D.1
Kubala, F.2
-
55
-
-
0032181880
-
Audio feature extraction and analysis for scene segmentation and classification
-
Liu, Z., Wang, Y. and Chen, T. (1998), 'Audio feature extraction and analysis for scene segmentation and classification', J. VLSI Signal Process. 20(1/2), 61-79.
-
(1998)
J. VLSI Signal Process.
, vol.20
, Issue.1-2
, pp. 61-79
-
-
Liu, Z.1
Wang, Y.2
Chen, T.3
-
56
-
-
11244350380
-
Fusion of semantic and acoustic approaches for spoken document retrieval
-
Logan, B., Prasangsit, P. and Moreno, P. (2003), Fusion of semantic and acoustic approaches for spoken document retrieval, in 'Proceedings of the ISCA Workshop on Multilingual Spoken Document Retrieval', pp. 1-6.
-
(2003)
Proceedings of the ISCA Workshop on Multilingual Spoken Document Retrieval
, pp. 1-6
-
-
Logan, B.1
Prasangsit, P.2
Moreno, P.3
-
57
-
-
33645326073
-
Real-time unsupervised speaker change detection
-
Lu, L. and Zhang, H. J. (2002a), Real-time unsupervised speaker change detection, in 'Proceedings of the ICPR', Vol. 2, pp. 358-361.
-
(2002)
Proceedings of the ICPR
, vol.2
, pp. 358-361
-
-
Lu, L.1
Zhang, H.J.2
-
59
-
-
84873533162
-
An investigation of feature models for music genre classification using the support vector classifier
-
Meng, A. and Shawe-Taylor, J. (2005), An investigation of feature models for music genre classification using the support vector classifier, in 'Proceedings of the ISMIR', pp. 604-609.
-
(2005)
Proceedings of the ISMIR
, pp. 604-609
-
-
Meng, A.1
Shawe-Taylor, J.2
-
61
-
-
33745218075
-
Comparison of different phone-based spoken document retrieval methods with text and spoken queries
-
Moreau, N., Jin, S. and Sikora, T. (2005), Comparison of different phone-based spoken document retrieval methods with text and spoken queries, in 'Proceedings of the EUROSPEECH', pp. 641-644.
-
(2005)
Proceedings of the EUROSPEECH
, pp. 641-644
-
-
Moreau, N.1
Jin, S.2
Sikora, T.3
-
62
-
-
0034857759
-
Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition
-
Mori, K. and Nakagawa, S. (2001), Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition, in 'Proceedings of the ICASSP', Vol. 1, pp. 413-416.
-
(2001)
Proceedings of the ICASSP
, vol.1
, pp. 413-416
-
-
Mori, K.1
Nakagawa, S.2
-
63
-
-
0027311597
-
A new speech recognition method based on VQ-distortion and HMM
-
Nakagawa, S. and Suzuki, H. (1993), A new speech recognition method based on VQ-distortion and HMM, in 'Proceedings of the ICASSP', Vol. 2, pp. 676-679.
-
(1993)
Proceedings of the ICASSP
, vol.2
, pp. 676-679
-
-
Nakagawa, S.1
Suzuki, H.2
-
65
-
-
0033692609
-
Information fusion for spoken document retrieval
-
Ng, K. (2000), Information fusion for spoken document retrieval, in 'Proceedings ICASSP', Vol. 6, pp. 2405-2408.
-
(2000)
Proceedings ICASSP
, vol.6
, pp. 2405-2408
-
-
Ng, K.1
-
66
-
-
0031636298
-
Phonetic recognition for spoken document retrieval
-
Ng, K. and Zue, V. W. (1998), Phonetic recognition for spoken document retrieval, in 'Proceedings ICASSP', Vol. 1, pp. 325-328.
-
(1998)
Proceedings ICASSP
, vol.1
, pp. 325-328
-
-
Ng, K.1
Zue, V.W.2
-
67
-
-
25444508285
-
Survey of sparse and non-sparse methods in source separation
-
O'Grady, P. D., Pearlmutter, B. A. and Rickard, S. T. (2005), 'Survey of sparse and non-sparse methods in source separation', Int. J. Imag. Syst. Tech. 15(1), 18-33.
-
(2005)
Int. J. Imag. Syst. Tech.
, vol.15
, Issue.1
, pp. 18-33
-
-
O'Grady, P.D.1
Pearlmutter, B.A.2
Rickard, S.T.3
-
68
-
-
38549108667
-
Musical metadata and knowledge management
-
D. Schwartz, ed., Idea Group
-
Pachet, F. (2005), Musical metadata and knowledge management, in D. Schwartz, ed., 'Encyclopedia of Knowledge Management', Idea Group, pp. 672-677.
-
(2005)
Encyclopedia of Knowledge Management
, pp. 672-677
-
-
Pachet, F.1
-
70
-
-
0030396150
-
Automatic audio content analysis
-
Pfeiffer, S., Fischer, S. and Effelsberg, W. (1996), Automatic audio content analysis, in 'Proceedings 4th ACM International Multimedia Conference', pp. 21-30.
-
(1996)
Proceedings 4th ACM International Multimedia Conference
, pp. 21-30
-
-
Pfeiffer, S.1
Fischer, S.2
Effelsberg, W.3
-
71
-
-
34547541889
-
Applied clustering for automatic speaker-based segmentation of audio material
-
Pietquin, O., Couvreur, L. and Couvreur, P. (2001), Applied clustering for automatic speaker-based segmentation of audio material, in 'JORBEL', Vol. 41, pp. 69-81.
-
(2001)
JORBEL
, vol.41
, pp. 69-81
-
-
Pietquin, O.1
Couvreur, L.2
Couvreur, P.3
-
72
-
-
0034444712
-
Integrating visual, audio and text analysis for news video
-
Qi, W., Gu, L., Jiang, H., Chen, X. and Zhang, H. (2000), Integrating visual, audio and text analysis for news video, in 'Proceedings of the ICIP', Vol. 3, pp. 520-523.
-
(2000)
Proceedings of the ICIP
, vol.3
, pp. 520-523
-
-
Qi, W.1
Gu, L.2
Jiang, H.3
Chen, X.4
Zhang, H.5
-
75
-
-
0029386354
-
Keyword detection in conversational speech utterances using hidden Markov model based continuous speech recognition
-
Rose, R. (1995), 'Keyword detection in conversational speech utterances using hidden Markov model based continuous speech recognition', Comput. Speech Lang. 9(4), 309-333.
-
(1995)
Comput. Speech Lang.
, vol.9
, Issue.4
, pp. 309-333
-
-
Rose, R.1
-
76
-
-
34548201182
-
Video to the rescue of audio: Shot boundary assisted speaker change detection
-
Samour, A., Karaman, M., Goldmann, L. and Sikora, T. (2007), Video to the rescue of audio: Shot boundary assisted speaker change detection, in 'Proceedings of the Electronic Imaging', Vol. 6506.
-
(2007)
Proceedings of the Electronic Imaging
, vol.6506
-
-
Samour, A.1
Karaman, M.2
Goldmann, L.3
Sikora, T.4
-
77
-
-
4544228318
-
Identity verification using speech and face information
-
Sanderson, C. and Paliwal, K. K. (2004), 'Identity verification using speech and face information', Digit. Signal Process. 14(5), 449-480.
-
(2004)
Digit. Signal Process.
, vol.14
, Issue.5
, pp. 449-480
-
-
Sanderson, C.1
Paliwal, K.K.2
-
78
-
-
0029765670
-
Real-time discrimination of broadcast speech/music
-
Saunders, J. (1996), Real-time discrimination of broadcast speech/music, in 'Proceedings of the ICASSP', Vol. 2, pp. 993-996.
-
(1996)
Proceedings of the ICASSP
, vol.2
, pp. 993-996
-
-
Saunders, J.1
-
79
-
-
2642521115
-
Assessing the retrieval effectiveness of a speech retrieval system by simulating recognition errors
-
Schaeuble, P. and Glavitsch, U. (1994), Assessing the retrieval effectiveness of a speech retrieval system by simulating recognition errors, in 'Proceedings Workshop on Human Language Technology', pp. 370-372.
-
(1994)
Proceedings Workshop on Human Language Technology
, pp. 370-372
-
-
Schaeuble, P.1
Glavitsch, U.2
-
80
-
-
0030648077
-
Construction and evaluation of a robust multifeature speech/music discriminator
-
Scheirer, E. and Slaney, M. (1997), Construction and evaluation of a robust multifeature speech/music discriminator, in 'Proceedings of the ICASSP', Vol. 2, pp. 1331-1334.
-
(1997)
Proceedings of the ICASSP
, vol.2
, pp. 1331-1334
-
-
Scheirer, E.1
Slaney, M.2
-
81
-
-
0000120766
-
Estimating the dimension of a model
-
Schwarz, G. (1978), Estimating the dimension of a model, in 'Annals of Statistics', Vol. 6, pp. 461-464.
-
(1978)
Annals of Statistics
, vol.6
, pp. 461-464
-
-
Schwarz, G.1
-
82
-
-
85069154781
-
Musical sound modeling with sinusoids plus noise
-
C. Roads, S. T. Pope, A. Piccialli and G. D. Poli, eds, Swets & Zeitlinger, The Netherlands
-
Serra, X. (1997), Musical sound modeling with sinusoids plus noise, in C. Roads, S. T. Pope, A. Piccialli and G. D. Poli, eds, 'Musical Signal Processing', Swets & Zeitlinger, The Netherlands, pp. 91-122.
-
(1997)
Musical Signal Processing
, pp. 91-122
-
-
Serra, X.1
-
83
-
-
0002782496
-
Automatic segmentation, classification and clustering of broadcast news audio
-
Siegler, M. A., Jain, U., Raj, B. and Stern, R. M. (1997), Automatic segmentation, classification and clustering of broadcast news audio, in 'Proceedings of the DARPA Speech Recognition Workshop', pp. 97-99.
-
(1997)
Proceedings of the DARPA Speech Recognition Workshop
, pp. 97-99
-
-
Siegler, M.A.1
Jain, U.2
Raj, B.3
Stern, R.M.4
-
84
-
-
84945116938
-
Non-negative matrix factorization for polyphonic music transcription
-
Smaragdis, P. and Brown, J. C. (2003), Non-negative matrix factorization for polyphonic music transcription, in 'Proceedings of the WASPAA', pp. 177-180.
-
(2003)
Proceedings of the WASPAA
, pp. 177-180
-
-
Smaragdis, P.1
Brown, J.C.2
-
85
-
-
85031608427
-
Speaker tracking and detection with multiple speakers
-
Sönmez, K., Heck, L. and Weintraub, M. (1999), Speaker tracking and detection with multiple speakers, in 'Proceedings of the EUROSPEECH', Vol. 5, pp. 2219-2222.
-
(1999)
Proceedings of the EUROSPEECH
, vol.5
, pp. 2219-2222
-
-
Sönmez, K.1
Heck, L.2
Weintraub, M.3
-
86
-
-
0027252184
-
Speech segmentation and clustering based on speaker features
-
Sugiyama, M., Murakami, J. and Watanabe, H. (1993), Speech segmentation and clustering based on speaker features, in 'Proceedings of the ICASSP', Vol. 2, pp. 395-398.
-
(1993)
Proceedings of the ICASSP
, vol.2
, pp. 395-398
-
-
Sugiyama, M.1
Murakami, J.2
Watanabe, H.3
-
87
-
-
20444498720
-
Detection of unique people in news programs using multimodal shot clustering
-
Taskiran, C., Albiol, A., Torres, L. and Delp, E. (2004), Detection of unique people in news programs using multimodal shot clustering, in 'Proceedings of the ICIP', Vol. 1, pp. 697-700.
-
(2004)
Proceedings of the ICIP
, vol.1
, pp. 697-700
-
-
Taskiran, C.1
Albiol, A.2
Torres, L.3
Delp, E.4
-
89
-
-
78650540904
-
Improved speaker segmentation and segments clustering using the Bayesian information criterion
-
Tritschler, A. and Gopinath, R. (1999), Improved speaker segmentation and segments clustering using the Bayesian information criterion, in 'Proceedings of the EUROSPEECH', pp. 679-682.
-
(1999)
Proceedings of the EUROSPEECH
, pp. 679-682
-
-
Tritschler, A.1
Gopinath, R.2
-
90
-
-
0036648502
-
Musical genre classification of audio signals
-
Tzanetakis, G. and Cook, P. (2002), 'Musical genre classification of audio signals', IEEE Trans. Speech Audio Process. 10(5), 293-302.
-
(2002)
IEEE Trans. Speech Audio Process.
, vol.10
, Issue.5
, pp. 293-302
-
-
Tzanetakis, G.1
Cook, P.2
-
91
-
-
85032751074
-
Identification of sound recordings
-
Venkatachalam, V., Cazzanti, L., Dhillon, N. and Wells, M. (2004), 'Identification of sound recordings', IEEE Signal Process. Mag. 21(2), 92-99.
-
(2004)
IEEE Signal Process. Mag.
, vol.21
, Issue.2
, pp. 92-99
-
-
Venkatachalam, V.1
Cazzanti, L.2
Dhillon, N.3
Wells, M.4
-
92
-
-
85159475599
-
Transient modeling synthesis: A flexible analysis/synthesis tool for transient signals
-
Verma, T., Levine, S. and Meng, T. (1997), Transient modeling synthesis: A flexible analysis/synthesis tool for transient signals, in 'Proceedings of the International Computer Music Conference (ICMC)', pp. 164-167.
-
(1997)
Proceedings of the International Computer Music Conference (ICMC)
, pp. 164-167
-
-
Verma, T.1
Levine, S.2
Meng, T.3
-
94
-
-
85032751556
-
Multimedia content analysis using both audio and visual clues
-
Wang, Y., Liu, Z. and Huang, J.-C. (2000), 'Multimedia content analysis using both audio and visual clues', IEEE Signal Process. Mag. 17(6), 12-36.
-
(2000)
IEEE Signal Process. Mag.
, vol.17
, Issue.6
, pp. 12-36
-
-
Wang, Y.1
Liu, Z.2
Huang, J.-C.3
-
95
-
-
79952385877
-
Segmentation of speech using speaker identification
-
Wilcox, L., Chen, F., Kimber, D. and Balasubramanian, V. (1994), Segmentation of speech using speaker identification, in 'Proceedings of the ICASSP', pp. 161-164.
-
(1994)
Proceedings of the ICASSP
, pp. 161-164
-
-
Wilcox, L.1
Chen, F.2
Kimber, D.3
Balasubramanian, V.4
-
96
-
-
0025517070
-
Automatic recognition of keywords in unconstrained speech using hidden Markov models
-
Wilpon, J., Rabiner, L. and Lee, C.-H. (1990), 'Automatic recognition of keywords in unconstrained speech using hidden Markov models', IEEE Trans. Acoust., Speech, Signal Process. 38, 1870-1878.
-
(1990)
IEEE Trans. Acoust., Speech, Signal Process.
, vol.38
, pp. 1870-1878
-
-
Wilpon, J.1
Rabiner, L.2
Lee, C.-H.3
-
97
-
-
0030242072
-
Content-based classification, search, and retrieval of audio
-
Wold, E., Blum, T., Keislar, D. and Wheaton, J. (1996), 'Content-based classification, search, and retrieval of audio', IEEE Multimedia 3(3), 27-36.
-
(1996)
IEEE Multimedia
, vol.3
, Issue.3
, pp. 27-36
-
-
Wold, E.1
Blum, T.2
Keislar, D.3
Wheaton, J.4
-
98
-
-
0002452931
-
The HTK large vocabulary recognition system for the 1995 ARPA H3 task
-
Woodland, P., Gales, M., Pye, D. and Valtchev, V. (1996), The HTK large vocabulary recognition system for the 1995 ARPA H3 task, in 'Proceedings of the ARPA Speech Recognition Workshop', pp. 99-104.
-
(1996)
Proceedings of the ARPA Speech Recognition Workshop
, pp. 99-104
-
-
Woodland, P.1
Gales, M.2
Pye, D.3
Valtchev, V.4
-
99
-
-
0141478771
-
UBM-based real-time speaker segmentation for broadcasting news
-
Wu, T., Lu, L. and Zhang, H.-J. (2003), UBM-based real-time speaker segmentation for broadcasting news, in 'Proceedings of the ICASSP', Vol. 2, pp. 193-196.
-
(2003)
Proceedings of the ICASSP
, vol.2
, pp. 193-196
-
-
Wu, T.1
Lu, L.2
Zhang, H.-J.3
-
100
-
-
0141855132
-
Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification
-
Xiong, Z., Radhakrishnan, R., Divakaran, A. and Huang, T. S. (2003), Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification, in 'Proceedings of the ICASSP', Vol. 5, pp. 628-631.
-
(2003)
Proceedings of the ICASSP
, vol.5
, pp. 628-631
-
-
Xiong, Z.1
Radhakrishnan, R.2
Divakaran, A.3
Huang, T.S.4
-
101
-
-
85009160774
-
An improved model-based speaker segmentation system
-
Yu, P., Seide, F., Ma, C. and Chang, E. (2003), An improved model-based speaker segmentation system, in 'Proceedings of the EUROSPEECH', pp. 2025-2028.
-
(2003)
Proceedings of the EUROSPEECH
, pp. 2025-2028
-
-
Yu, P.1
Seide, F.2
Ma, C.3
Chang, E.4
-
102
-
-
0032629748
-
Hierarchical classification of audio data for archiving and retrieving
-
Zhang, T. and Kuo, C.-C. (1999a), Hierarchical classification of audio data for archiving and retrieving, in 'Proceedings of the ICASSP', Vol. 6, pp. 3001-3004.
-
(1999)
Proceedings of the ICASSP
, vol.6
, pp. 3001-3004
-
-
Zhang, T.1
Kuo, C.-C.2
-
103
-
-
0033331933
-
Classification and retrieval of sound effects in audiovisual data management
-
Zhang, T. and Kuo, C.-C. J. (1999b), Classification and retrieval of sound effects in audiovisual data management, in 'Proceedings of the Asilomar Conference on Signals, Systems, and Computers', Vol. 1, pp. 730-734.
-
(1999)
Proceedings of the Asilomar Conference on Signals, Systems, and Computers
, vol.1
, pp. 730-734
-
-
Zhang, T.1
Kuo, C.-C.J.2
-
104
-
-
85009275098
-
Speechfind: An experimental on-line spoken document retrieval system for historical audio archives
-
Zhou, B. and Hansen, J. (2002), Speechfind: An experimental on-line spoken document retrieval system for historical audio archives, in 'Proceedings of the ICSLP', Vol. 3, pp. 1969-1972.
-
(2002)
Proceedings of the ICSLP
, vol.3
, pp. 1969-1972
-
-
Zhou, B.1
Hansen, J.2
-
105
-
-
85009089453
-
Unsupervised audio stream segmentation and clustering via the Bayesian information criterion
-
Zhou, B. and Hansen, J. H. L. (2000), Unsupervised audio stream segmentation and clustering via the Bayesian information criterion, in 'Proceedings of the ICSLP', Vol. 3, pp. 714-717.
-
(2000)
Proceedings of the ICSLP
, vol.3
, pp. 714-717
-
-
Zhou, B.1
Hansen, J.H.L.2
-
106
-
-
0025587109
-
The SUMMIT speech recognition system: Phonological modeling and lexical access
-
Zue, V., Glass, J., Goodine, D., Phillips, M. and Seneff, S. (1990), The SUMMIT speech recognition system: Phonological modeling and lexical access, in 'Proceedings of the ICASSP', Vol. 1, pp. 49-52.
-
(1990)
Proceedings of the ICASSP
, vol.1
, pp. 49-52
-
-
Zue, V.1
Glass, J.2
Goodine, D.3
Phillips, M.4
Seneff, S.5