SCOPUS Information Retrieval Platform

MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval

Volume , Issue , 2006, Pages 1-285

Authors (3): Kim, Hyoung Gook (a); Moreau, Nicolas (b); Sikora, Thomas (b)

(a) Samsung Advanced Institute of Technology (SAIT) (South Korea)

(b) Technische Universität Berlin (Germany)

EID: 84889435599
PISSN: None
EISSN: None
Source Type: Book
DOI: 10.1002/0470093366
Document Type: Book

Times cited: 247

References (215)

1
- 4243152700
- Content-based Identification of Audio Material Using MPEG-7 Low Level Description
- International Symposium Music Information Retrieval, Bloomington, IN, USA, October
- Allamanche E., Herre J., Hellmuth O., Fröba B., Kastner T. and Cremer M. (2001) "Content-based Identification of Audio Material Using MPEG-7 Low Level Description", International Symposium Music Information Retrieval, Bloomington, IN, USA, October.
- (2001)
- Allamanche, E.¹ Herre, J.² Hellmuth, O.³ Fröba, B.⁴ Kastner, T.⁵ Cremer, M.⁶

2
- 84889309871
- Basic Speech Sounds, their Analysis and Features
- in Spoken Dialogues with Computers, Academic Press, London
- Angelini B., Falavigna D., Omologo M. and De Mori R. (1998) "Basic Speech Sounds, their Analysis and Features", in Spoken Dialogues with Computers, pp. 69-121, Academic Press, London.
- (1998) , pp. 69-121
- Angelini, B.¹ Falavigna, D.² Omologo, M.³ De Mori, R.⁴

3
- 0020148958
- Synthesis by Spectral Amplitude and 'Brightness' Matching Analyzed Musical Sounds
- Beauchamp J. W. (1982) "Synthesis by Spectral Amplitude and 'Brightness' Matching Analyzed Musical Sounds", Journal of the Audio Engineering Society, vol. 30, no. 6, pp. 396-406.
- (1982) Journal of the Audio Engineering Society , vol.30 , Issue.6 , pp. 396-406
- Beauchamp, J.W.¹

4
- 84889311325
- A Hierarchical Approach to Automatic Musical Genre Classification
- 6th International Conference on Digital Audio Effects (DAFX), London, UK, September
- Burred J. J. and Lerch A. (2003) "A Hierarchical Approach to Automatic Musical Genre Classification", 6th International Conference on Digital Audio Effects (DAFX), London, UK, September.
- (2003)
- Burred, J.J.¹ Lerch, A.²

5
- 33645801332
- Hierarchical Automatic Audio Signal Classification
- Burred J. J. and Lerch A. (2004) "Hierarchical Automatic Audio Signal Classification", Journal of the Audio Engineering Society, vol. 52, no. 7/8, pp. 724-739.
- (2004) Journal of the Audio Engineering Society , vol.52 , Issue.7-8 , pp. 724-739
- Burred, J.J.¹ Lerch, A.²

6
- 84892166605
- A Spectrally Mixed Excitation (SMX) Vocoder with Robust Parameter Determination
- Seattle, WA, USA, May
- Cho Y. D., Kim M. Y. and Kim S. R. (1998) "A Spectrally Mixed Excitation (SMX) Vocoder with Robust Parameter Determination", ICASSP '98, vol. 2, pp. 601-604, Seattle, WA, USA, May.
- (1998) ICASSP '98 , vol.2 , pp. 601-604
- Cho, Y.D.¹ Kim, M.Y.² Kim, S.R.³

7
- 0019053271
- Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences
- Davis S. B. and Mermelstein P. (1980) "Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 357-365.
- (1980) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.28 , Issue.4 , pp. 357-365
- Davis, S.B.¹ Mermelstein, P.²

8
- 85009074922
- Harmonic Tunnelling: Tracking Non-Stationary Noises during Speech
- Eurospeech 2001, Aalborg, Denmark, September
- Ealey D., Kelleher H. and Pearce D. (2001) "Harmonic Tunnelling: Tracking Non-Stationary Noises during Speech", Eurospeech 2001, Aalborg, Denmark, September.
- (2001)
- Ealey, D.¹ Kelleher, H.² Pearce, D.³

9
- 0003901864
- Speech and Audio Signal Processing: Processing and Perception of Speech and Music
- John Wiley & Sons, Inc., New York
- Gold B. and Morgan N. (1999) Speech and Audio Signal Processing: Processing and Perception of Speech and Music, John Wiley & Sons, Inc., New York.
- (1999)
- Gold, B.¹ Morgan, N.²

10
- 0018139926
- Perceptual Effects of Spectral Modifications on Musical Timbres
- Grey J. M. and Gordon J. W. (1978) "Perceptual Effects of Spectral Modifications on Musical Timbres", Journal of the Acoustical Society of America, vol. 63, no. 5, pp. 1493-1500.
- (1978) Journal of the Acoustical Society of America , vol.63 , Issue.5 , pp. 1493-1500
- Grey, J.M.¹ Gordon, J.W.²

11
- 0003455850
- Information Technology - Multimedia Content Description Interface - Part 4: Audio
- ISO/IEC, FDIS 15938-4:2001(E), June
- ISO/IEC (2001) Information Technology - Multimedia Content Description Interface - Part 4: Audio, FDIS 15938-4:2001(E), June.
- (2001)

12
- 0032671913
- Silence Detection for Multimedia Communication Systems
- Jacobs S., Eleftheriadis A. and Anastassiou D. (1999) "Silence Detection for Multimedia Communication Systems", Multimedia Systems, vol. 7, no. 2, pp. 157-164.
- (1999) Multimedia Systems , vol.7 , Issue.2 , pp. 157-164
- Jacobs, S.¹ Eleftheriadis, A.² Anastassiou, D.³

13
- 4544361760
- Comparison of MPEG-7 Audio Spectrum Projection Features and MFCC Applied to Speaker Recognition, Sound Classification and Audio Segmentation
- ICASSP'2004, Montreal, Canada, May
- Kim H.-G. and Sikora T. (2004) "Comparison of MPEG-7 Audio Spectrum Projection Features and MFCC Applied to Speaker Recognition, Sound Classification and Audio Segmentation", ICASSP'2004, Montreal, Canada, May.
- (2004)
- Kim, H.-G.¹ Sikora, T.²

14
- 0002477067
- Why is musical timbre so hard to understand?
- in Structure and perception of electroacoustic sound and music, Elsevier, Amsterdam
- Krumhansl C. L. (1989) "Why is musical timbre so hard to understand?" in Structure and perception of electroacoustic sound and music, pp. 43-53, Elsevier, Amsterdam.
- (1989) , pp. 43-53
- Krumhansl, C.L.¹

15
- 0034293572
- A Common Perceptual Space for Harmonic and Percussive Timbres
- Lakatos S. (2000) "A Common Perceptual Space for Harmonic and Percussive Timbres", Perception and Psychophysics, vol. 62, no. 7, pp. 1426-1439.
- (2000) Perception and Psychophysics , vol.62 , Issue.7 , pp. 1426-1439
- Lakatos, S.¹

16
- 0035308233
- Classification of General Audio Data for Content-based Retrieval
- Li D., Sethi I. K., Dimitrova N. and McGee T. (2001) "Classification of General Audio Data for Content-based Retrieval", Pattern Recognition Letters, Special Issue on Image/Video Indexing and Retrieval, vol. 22, no. 5.
- (2001) Pattern Recognition Letters, Special Issue on Image/Video Indexing and Retrieval , vol.22 , Issue.5
- Li, D.¹ Sethi, I.K.² Dimitrova, N.³ McGee, T.⁴

17
- 0034273520
- Content-based Audio Classification and Retrieval using the Nearest Feature Line Method
- Li S. Z. (2000) "Content-based Audio Classification and Retrieval using the Nearest Feature Line Method", IEEE Transactions on Speech and Audio Processing, vol. 8, no. 5, pp. 619-625.
- (2000) IEEE Transactions on Speech and Audio Processing , vol.8 , Issue.5 , pp. 619-625
- Li, S.Z.¹

18
- 79955939942
- Mel Frequency Cepstral Coefficients for Music Modeling
- International Symposium on Music Information Retrieval (ISMIR), Plymouth, MA, October
- Logan B. (2000) "Mel Frequency Cepstral Coefficients for Music Modeling", International Symposium on Music Information Retrieval (ISMIR), Plymouth, MA, October.
- (2000)
- Logan, B.¹

19
- 0036816475
- Content Analysis for Audio Classification and Segmentation
- Lu L., Zhang H.-J. and Jiang H. (2002) "Content Analysis for Audio Classification and Segmentation", IEEE Transactions on Speech and Audio Processing, vol. 10, no. 7, pp. 504-516.
- (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.7 , pp. 504-516
- Lu, L.¹ Zhang, H.-J.² Jiang, H.³

20
- 0003769779
- Introduction to MPEG-7
- John Wiley & Sons, Ltd, Chichester
- Manjunath B. S., Salembier P. and Sikora T. (2002) Introduction to MPEG-7, John Wiley & Sons, Ltd, Chichester.
- (2002)
- Manjunath, B.S.¹ Salembier, P.² Sikora, T.³

21
- 0012468695
- Perspectives on the Contribution of Timbre to Musical Structure
- McAdams S. (1999) "Perspectives on the Contribution of Timbre to Musical Structure", Computer Music Journal, vol. 23, no. 3, pp. 85-102.
- (1999) Computer Music Journal , vol.23 , Issue.3 , pp. 85-102
- McAdams, S.¹

22
- 0029442124
- Perceptual Scaling of Synthesized Musical Timbres: Common Dimensions, Specificities, and Latent Subject Classes
- McAdams S., Winsberg S., Donnadieu S., De Soete G. and Krimphoff J. (1995) "Perceptual Scaling of Synthesized Musical Timbres: Common Dimensions, Specificities, and Latent Subject Classes", Psychological Research, vol. 58, pp. 177-192.
- (1995) Psychological Research , vol.58 , pp. 177-192
- McAdams, S.¹ Winsberg, S.² Donnadieu, S.³ De Soete, G.⁴ Krimphoff, J.⁵

23
- 0016113915
- The Optimum Comb Method of Pitch Period Analysis of Continuous Digitized Speech
- Moorer J. (1974) "The Optimum Comb Method of Pitch Period Analysis of Continuous Digitized Speech", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 22, no. 5, pp. 330-338.
- (1974) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.22 , Issue.5 , pp. 330-338
- Moorer, J.¹

24
- 0001628038
- Nonlinear Filtering of Multiplied and Convolved Signals
- Oppenheim A. V., Schafer R. W. and Stockham T. G. (1968) "Nonlinear Filtering of Multiplied and Convolved Signals", Proceedings of the IEEE, vol. 56, no. 8, pp. 1264-1291.
- (1968) Proceedings of the IEEE , vol.56 , Issue.8 , pp. 1264-1291
- Oppenheim, A.V.¹ Schafer, R.W.² Stockham, T.G.³

25
- 33746597668
- Salient Feature Extraction of Musical Instrument Signals
- Thesis for the Degree of Master of Arts in Electro-Acoustic Music, Dartmouth College
- Park T. H. (2000) "Salient Feature Extraction of Musical Instrument Signals", Thesis for the Degree of Master of Arts in Electro-Acoustic Music, Dartmouth College.
- (2000)
- Park, T.H.¹

26
- 84889344642
- Instrument Sound Description in the Context of MPEG-7
- ICMC'2000 International Computer Music Conference, Berlin, Germany, August
- Peeters G., McAdams S. and Herrera P. (2000) "Instrument Sound Description in the Context of MPEG-7", ICMC'2000 International Computer Music Conference, Berlin, Germany, August.
- (2000)
- Peeters, G.¹ McAdams, S.² Herrera, P.³

27
- 0003425258
- Digital Processing of Speech Signals
- Prentice Hall, Englewood Cliffs, NJ
- Rabiner L. R. and Schafer R. W. (1978) Digital Processing of Speech Signals, Prentice Hall, Englewood Cliffs, NJ.
- (1978)
- Rabiner, L.R.¹ Schafer, R.W.²

28
- 0030648077
- Construction and Evaluation of a Robust Multifeature Speech/Music Discriminator
- Munich, Germany, April
- Scheirer E. and Slaney M. (1997) "Construction and Evaluation of a Robust Multifeature Speech/Music Discriminator", ICASSP '97, vol. 2, pp. 1331-1334, Munich, Germany, April.
- (1997) ICASSP '97 , vol.2 , pp. 1331-1334
- Scheirer, E.¹ Slaney, M.²

29
- 0036648502
- Musical Genre Classification of Audio Signals
- Tzanetakis G. and Cook P. (2002) "Musical Genre Classification of Audio Signals", IEEE Transactions on Speech and Audio Processing, vol. 10, no. 5, pp. 293-302.
- (2002) IEEE Transactions on Speech and Audio Processing , vol.10 , Issue.5 , pp. 293-302
- Tzanetakis, G.¹ Cook, P.²

30
- 85032751556
- Multimedia Content Analysis Using Both Audio and Visual Cues
- Wang Y., Liu Z. and Huang J.-C. (2000) "Multimedia Content Analysis Using Both Audio and Visual Cues", IEEE Signal Processing Magazine, vol. 17, no. 6, pp. 12-36.
- (2000) IEEE Signal Processing Magazine , vol.17 , Issue.6 , pp. 12-36
- Wang, Y.¹ Liu, Z.² Huang, J.-C.³

31
- 0030242072
- Content-Based Classification, Search, and Retrieval of Audio
- Wold E., Blum T., Keislar D. and Wheaton J. (1996) "Content-Based Classification, Search, and Retrieval of Audio", IEEE MultiMedia, vol. 3, no. 3, pp. 27-36.
- (1996) IEEE MultiMedia , vol.3 , Issue.3 , pp. 27-36
- Wold, E.¹ Blum, T.² Keislar, D.³ Wheaton, J.⁴

32
- 0141855132
- Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification
- Hong Kong, April
- Xiong Z., Radhakrishnan R., Divakaran A. and Huang T. S. (2003) "Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'03), vol. 5, pp. 628-631, Hong Kong, April.
- (2003) IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'03) , vol.5 , pp. 628-631
- Xiong, Z.¹ Radhakrishnan, R.² Divakaran, A.³ Huang, T.S.⁴

33
- 0035364397
- MPEG-7 Sound Recognition Tools
- Casey M. A. (2001) "MPEG-7 Sound Recognition Tools", IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 6, pp. 737-747.
- (2001) IEEE Transactions on Circuits and Systems for Video Technology , vol.11 , Issue.6 , pp. 737-747
- Casey, M.A.¹

34
- 84948186412
- Non-Negative Component Parts of Sound for Classification
- IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany, December
- Cho Y.-C., Choi S. and Bang S.-Y. (2003) "Non-Negative Component Parts of Sound for Classification", IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany, December.
- (2003)
- Cho, Y.-C.¹ Choi, S.² Bang, S.-Y.³

35
- 34249753618
- Support Vector Networks
- Cortes C. and Vapnik V. (1995) "Support Vector Networks", Machine Learning, vol. 20, pp. 273-297.
- (1995) Machine Learning , vol.20 , pp. 273-297
- Cortes, C.¹ Vapnik, V.²

36
- 0004236492
- Matrix Computations
- Johns Hopkins University Press, Baltimore, MD
- Golub G. H. and Van Loan C. F. (1993) Matrix Computations, Johns Hopkins University Press, Baltimore, MD.
- (1993)
- Golub, G.H.¹ Van Loan, C.F.²

37
- 0004063090
- Neural Networks
- 2nd Edition, Prentice Hall, Englewood Cliffs, NJ
- Haykin S. (1998) Neural Networks, 2nd Edition, Prentice Hall, Englewood Cliffs, NJ.
- (1998)
- Haykin, S.¹

38
- 0032629347
- Fast and Robust Fixed-Point algorithms for Independent Component Analysis
- Hyvärinen A. (1999) "Fast and Robust Fixed-Point Algorithms for Independent Component Analysis", IEEE Transactions on Neural Networks, vol. 10, no. 3, pp. 626-634.
- (1999) IEEE Transactions on Neural Networks , vol.10 , Issue.3 , pp. 626-634
- Hyvärinen, A.¹

39
- 0003905759
- Independent Component Analysis
- John Wiley & Sons, Inc., New York
- Hyvärinen A., Karhunen J. and Oja E. (2001) Independent Component Analysis, John Wiley & Sons, Inc., New York.
- (2001)
- Hyvärinen, A.¹ Karhunen, J.² Oja, E.³

40
- 0003946510
- Principal Component Analysis
- Springer-Verlag, Berlin
- Jolliffe I. T. (1986) Principal Component Analysis, Springer-Verlag, Berlin.
- (1986)
- Jolliffe, I.T.¹

41
- 4544361760
- Comparison of MPEG-7 Audio Spectrum Projection Features and MFCC applied to Speaker Recognition, Sound Classification and Audio Segmentation
- Proceedings IEEE ICASSP 2004, Montreal, Canada, May
- Kim H.-G. and Sikora T. (2004a) "Comparison of MPEG-7 Audio Spectrum Projection Features and MFCC applied to Speaker Recognition, Sound Classification and Audio Segmentation", Proceedings IEEE ICASSP 2004, Montreal, Canada, May.
- (2004)
- Kim, H.-G.¹ Sikora, T.²

42
- 84889408786
- Audio Spectrum Projection Based on Several Basis Decomposition Algorithms Applied to General Sound Recognition and Audio Segmentation
- Proceedings of EURASIP-EUSIPCO 2004, Vienna, Austria, September
- Kim H.-G. and Sikora T. (2004b) "Audio Spectrum Projection Based on Several Basis Decomposition Algorithms Applied to General Sound Recognition and Audio Segmentation", Proceedings of EURASIP-EUSIPCO 2004, Vienna, Austria, September.
- (2004)
- Kim, H.-G.¹ Sikora, T.²

43
- 84889470436
- How Efficient Is MPEG-7 Audio for Sound Classification, Musical Instrument Identification, Speaker Recognition, and Speaker-Based Segmentation?
- IEEE Transactions on Speech and Audio Processing, submitted
- Kim H.-G. and Sikora T. (2004c) "How Efficient Is MPEG-7 Audio for Sound Classification, Musical Instrument Identification, Speaker Recognition, and Speaker-Based Segmentation?", IEEE Transactions on Speech and Audio Processing, submitted.
- (2004)
- Kim, H.-G.¹ Sikora, T.²

44
- 85009168586
- Speaker Recognition Using MPEG-7 Descriptors
- Proceedings EUROSPEECH 2003, Geneva, Switzerland, September
- Kim H.-G., Berdahl E., Moreau N. and Sikora T. (2003) "Speaker Recognition Using MPEG-7 Descriptors", Proceedings EUROSPEECH 2003, Geneva, Switzerland, September.
- (2003)
- Kim, H.-G.¹ Berdahl, E.² Moreau, N.³ Sikora, T.⁴

45
- 84889291047
- How Efficient is MPEG-7 for General Sound Recognition?
- 25th International AES Conference "Metadata for Audio", London, UK, June
- Kim H.-G., Burred J. J. and Sikora T. (2004a) "How Efficient is MPEG-7 for General Sound Recognition?", 25th International AES Conference "Metadata for Audio", London, UK, June.
- (2004)
- Kim, H.-G.¹ Burred, J.J.² Sikora, T.³

46
- 2542463254
- Audio Classification Based on MPEG-7 Spectral Basis Representations
- Kim H.-G., Moreau N. and Sikora T. (2004b) "Audio Classification Based on MPEG-7 Spectral Basis Representations", IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 5, pp. 716-725.
- (2004) IEEE Transactions on Circuits and Systems for Video Technology , vol.14 , Issue.5 , pp. 716-725
- Kim, H.-G.¹ Moreau, N.² Sikora, T.³

47
- 0033592606
- Learning the Parts of Objects by Non-Negative Matrix Factorization
- Lee D. D. and Seung H. S. (1999) "Learning the Parts of Objects by Non-Negative Matrix Factorization", Nature, vol. 401, pp. 788-791.
- (1999) Nature , vol.401 , pp. 788-791
- Lee, D.D.¹ Seung, H.S.²

48
- 84898964201
- Algorithms for Non-Negative Matrix Factorization
- NIPS 2001 Conference, Vancouver, Canada
- Lee D. D. and Seung H. S. (2001) "Algorithms for Non-Negative Matrix Factorization", NIPS 2001 Conference, Vancouver, Canada.
- (2001)
- Lee, D.D.¹ Seung, H.S.²

49
- 0003769779
- Introduction to MPEG-7
- John Wiley & Sons, Ltd, Chichester
- Manjunath B. S., Salembier P. and Sikora T. (2001) Introduction to MPEG-7, John Wiley & Sons, Ltd, Chichester.
- (2001)
- Manjunath, B.S.¹ Salembier, P.² Sikora, T.³

50
- 0004244302
- Fundamentals of Speech Recognition
- Prentice Hall, Englewood Cliffs, NJ
- Rabiner L. R. and Juang B.-H. (1993) Fundamentals of Speech Recognition, Prentice Hall, Englewood Cliffs, NJ.
- (1993)
- Rabiner, L.R.¹ Juang, B.-H.²

51
- 0029355999
- Speaker Identification and Verification Using Gaussian Mixture Speaker Models
- Reynolds D. A. (1995) "Speaker Identification and Verification Using Gaussian Mixture Speaker Models", Speech Communication, pp. 91-108.
- (1995) Speech Communication , pp. 91-108
- Reynolds, D.A.¹

52
- 84945116938
- Non-Negative Matrix Factorization for Polyphonic Music Transcription
- IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, USA, October
- Smaragdis P. and Brown J. C. (2003) "Non-Negative Matrix Factorization for Polyphonic Music Transcription", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, USA, October.
- (2003)
- Smaragdis, P.¹ Brown, J.C.²

53
- 84889309871
- Basic Speech Sounds, their Analysis and Features
- in Spoken Dialogues with Computers, R. De Mori (ed.), Academic Press, London
- Angelini B., Falavigna D., Omologo M. and De Mori R. (1998) "Basic Speech Sounds, their Analysis and Features", in Spoken Dialogues with Computers, pp. 69-121, R. De Mori (ed.), Academic Press, London.
- (1998) , pp. 69-121
- Angelini, B.¹ Falavigna, D.² Omologo, M.³ De Mori, R.⁴

54
- 84889315094
- A System for Searching and Browsing Spoken Communications
- HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval, Boston, MA, USA, May
- Begeja L., Renger B., Saraclar M., Gibbon D., Liu Z. and Shahraray B. (2004) "A System for Searching and Browsing Spoken Communications", HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval, pp. 1-8, Boston, MA, USA, May.
- (2004) , pp. 1-8
- Begeja, L.¹ Renger, B.² Saraclar, M.³ Gibbon, D.⁴ Liu, Z.⁵ Shahraray, B.⁶

55
- 84889328931
- Dublin City University Video Track Experiments for TREC 2002
- NIST, 11th Text Retrieval Conference (TREC 2002), Gaithersburg, MD, USA, November
- Browne P., Czirjek C., Gurrin C., Jarina R., Lee H., Marlow S., McDonald K., Murphy N., O'Connor N. E., Smeaton A. F. and Ye J. (2002) "Dublin City University Video Track Experiments for TREC 2002", NIST, 11th Text Retrieval Conference (TREC 2002), Gaithersburg, MD, USA, November.
- (2002)
- Browne, P.¹ Czirjek, C.² Gurrin, C.³ Jarina, R.⁴ Lee, H.⁵ Marlow, S.⁶ McDonald, K.⁷ Murphy, N.⁸ O'Connor, N.E.⁹ Smeaton, A.F.¹⁰ Ye, J.¹¹

56
- 0004116125
- Implementation of the SMART Information Retrieval System
- Computer Science Department, Cornell University, Report 85-686
- Buckley C. (1985) "Implementation of the SMART Information Retrieval System", Computer Science Department, Cornell University, Report 85-686.
- (1985)
- Buckley, C.¹

57
- 0004119259
- The Sound Pattern of English
- MIT Press, Cambridge, MA
- Chomsky N. and Halle M. (1968) The Sound Pattern of English, MIT Press, Cambridge, MA.
- (1968)
- Chomsky, N.¹ Halle, M.²

58
- 84889399246
- Phonetic Searching vs. LVCSR: How to Find What You Really Want in Audio Archives
- AVIOS 2001, San Jose, CA, USA, April
- Clements M., Cardillo P. S. and Miller M. S. (2001) "Phonetic Searching vs. LVCSR: How to Find What You Really Want in Audio Archives", AVIOS 2001, San Jose, CA, USA, April.
- (2001)
- Clements, M.¹ Cardillo, P.S.² Miller, M.S.³

59
- 78650946218
- Information Retrieval Techniques for Speech Applications
- ACM SIGIR 2001 Workshop "Information Retrieval Techniques for Speech Applications"
- Coden A. R., Brown E. and Srinivasan S. (2001) "Information Retrieval Techniques for Speech Applications", ACM SIGIR 2001 Workshop "Information Retrieval Techniques for Speech Applications".
- (2001)
- Coden, A.R.¹ Brown, E.² Srinivasan, S.³

60
- 84889316721
- A Model for Combining Semantic and Phonetic Term Similarity for Spoken Document and Spoken Query Retrieval
- International Computer Science Institute, Berkeley, CA, tr-99-020, December
- Crestani F. (1999) "A Model for Combining Semantic and Phonetic Term Similarity for Spoken Document and Spoken Query Retrieval", International Computer Science Institute, Berkeley, CA, tr-99-020, December.
- (1999)
- Crestani, F.¹

61
- 84889317644
- Using Semantic and Phonetic Term Similarity for Spoken Document Retrieval and Spoken Query Processing
- in Technologies for Constructing Intelligent Systems, J. G.-R. B. Bouchon-Meunier and R. R. Yager (eds) Springer-Verlag, Heidelberg, Germany
- Crestani F. (2002) "Using Semantic and Phonetic Term Similarity for Spoken Document Retrieval and Spoken Query Processing" in Technologies for Constructing Intelligent Systems, pp. 363-376, J. G.-R. B. Bouchon-Meunier and R. R. Yager (eds) Springer-Verlag, Heidelberg, Germany.
- (2002) , pp. 363-376
- Crestani, F.¹

62
- 0032270571
- "Is This Document Relevant? . . . Probably": A Survey of Probabilistic Models in Information Retrieval
- Crestani F., Lalmas M., van Rijsbergen C. J. and Campbell I. (1998) "'Is This Document Relevant? . . . Probably': A Survey of Probabilistic Models in Information Retrieval", ACM Computing Surveys, vol. 30, no. 4, pp. 528-552.
- (1998) ACM Computing Surveys , vol.30 , Issue.4 , pp. 528-552
- Crestani, F.¹ Lalmas, M.² Van Rijsbergen, C.J.³ Campbell, I.⁴

63
- 0028996879
- Language Modelling by Variable Length Sequences: Theoretical Formulation and Evaluation of Multigrams
- ICASSP'95, Detroit, USA
- Deligne S. and Bimbot F. (1995) "Language Modelling by Variable Length Sequences: Theoretical Formulation and Evaluation of Multigrams", ICASSP'95, pp. 169-172, Detroit, USA.
- (1995) , pp. 169-172
- Deligne, S.¹ Bimbot, F.²

64
- 84889269862
- Phoneme-Level Indexing for Fast and Vocabulary-Independent Voice/Voice Retrieval
- ESCA Tutorial and Research Workshop (ETRW), "Accessing Information in Spoken Audio", Cambridge, UK, April
- Ferrieux A. and Peillon S. (1999) "Phoneme-Level Indexing for Fast and Vocabulary-Independent Voice/Voice Retrieval", ESCA Tutorial and Research Workshop (ETRW), "Accessing Information in Spoken Audio", Cambridge, UK, April.
- (1999)
- Ferrieux, A.¹ Peillon, S.²

65
- 0012577933
- The LIMSI SDR System for TREC-9
- NIST, 9th Text Retrieval Conference (TREC 9), Gaithersburg, MD, USA, November
- Gauvain J.-L., Lamel L., Barras C., Adda G. and de Kercadio Y. (2000) "The LIMSI SDR System for TREC-9", NIST, 9th Text Retrieval Conference (TREC 9), pp. 335-341, Gaithersburg, MD, USA, November.
- (2000) , pp. 335-341
- Gauvain, J.-L.¹ Lamel, L.² Barras, C.³ Adda, G.⁴ De Kercadio, Y.⁵

66
- 0023776395
- Multi-Level Acoustic Segmentation of Continuous Speech
- ICASSP'88, New York, USA, April
- Glass J. and Zue V. W. (1988) "Multi-Level Acoustic Segmentation of Continuous Speech", ICASSP'88, pp. 429-432, New York, USA, April.
- (1988) , pp. 429-432
- Glass, J.¹ Zue, V.W.²

67
- 0030372637
- A Probabilistic Framework for Feature-based Speech Recognition
- Philadelphia, PA, USA, October
- Glass J., Chang J. and McCandless M. (1996) "A Probabilistic Framework for Feature-based Speech Recognition", ICSLP'96, vol. 4, pp. 2277-2280, Philadelphia, PA, USA, October.
- (1996) ICSLP'96 , vol.4 , pp. 2277-2280
- Glass, J.¹ Chang, J.² McCandless, M.³

68
- 0026989462
- A System for Retrieving Speech Documents
- ACM, SIGIR
- Glavitsch U. and Schäuble P. (1992) "A System for Retrieving Speech Documents", ACM, SIGIR, pp. 168-176.
- (1992) , pp. 168-176
- Glavitsch, U.¹ Schäuble, P.²

69
- 0003901864
- Speech and Audio Signal Processing
- John Wiley & Sons, Inc., New York
- Gold B. and Morgan N. (1999) Speech and Audio Signal Processing, John Wiley & Sons, Inc., New York.
- (1999)
- Gold, B.¹ Morgan, N.²

70
- 0003877861
- Heterogeneous acoustic measurements and multiple classifiers for speech recognition
- PhD Thesis, Massachusetts Institute of Technology (MIT), Cambridge, MA
- Halberstadt A. K. (1998) "Heterogeneous acoustic measurements and multiple classifiers for speech recognition", PhD Thesis, Massachusetts Institute of Technology (MIT), Cambridge, MA.
- (1998)
- Halberstadt, A.K.¹

71
- 0004185151
- Clustering Algorithms
- John Wiley & Sons, Inc., New York
- Hartigan J. (1975) Clustering Algorithms, John Wiley & Sons, Inc., New York.
- (1975)
- Hartigan, J.¹

72
- 78649307442
- Audio Hot Spotting and Retrieval using Multiple Features
- HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval, Boston, MA, USA, May
- Hu Q., Goodman F., Boykin S., Fish R. and Greiff W. (2004) "Audio Hot Spotting and Retrieval using Multiple Features", HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval, pp. 13-17, Boston, MA, USA, May.
- (2004) , pp. 13-17
- Hu, Q.¹ Goodman, F.² Boykin, S.³ Fish, R.⁴ Greiff, W.⁵

73
- 0004671920
- The Application of Classical Information Retrieval Techniques to Spoken Documents
- PhD Thesis, University of Cambridge, Speech, Vision and Robotic Group, Cambridge, UK
- James D. A. (1995) "The Application of Classical Information Retrieval Techniques to Spoken Documents", PhD Thesis, University of Cambridge, Speech, Vision and Robotic Group, Cambridge, UK.
- (1995)
- James, D.A.¹

74
- 0003786003
- Statistical Methods for Speech Recognition
- MIT Press, Cambridge, MA
- Jelinek F. (1998) Statistical Methods for Speech Recognition, MIT Press, Cambridge, MA.
- (1998)
- Jelinek, F.¹

75
- 0002623652
- Spoken Document Retrieval for TREC-9 at Cambridge University
- NIST, 9th Text Retrieval Conference (TREC 9), Gaithersburg, MD, USA, November
- Johnson S. E., Jourlin P., Spärck Jones K. and Woodland P. C. (2000) "Spoken Document Retrieval for TREC-9 at Cambridge University", NIST, 9th Text Retrieval Conference (TREC 9), pp. 117-126, Gaithersburg, MD, USA, November.
- (2000) , pp. 117-126
- Johnson, S.E.¹ Jourlin, P.² Spärck Jones, K.³ Woodland, P.C.⁴

76
- 0030379111
- Retrieving Spoken Documents by Combining Multiple Index Sources
- ACM SIGIR'96, Zurich, Switzerland, August
- Jones G. J. F., Foote J. T., Spärck Jones K. and Young S. J. (1996) "Retrieving Spoken Documents by Combining Multiple Index Sources", ACM SIGIR'96, pp. 30-38, Zurich, Switzerland, August.
- (1996) , pp. 30-38
- Jones, G.J.F.¹ Foote, J.T.² Spärck Jones, K.³ Young, S.J.⁴

77
- 0023312404
- Estimation of Probabilities from Sparse Data for the Language Model Component of a Speech Recognizer
- Katz S. M. (1987) "Estimation of Probabilities from Sparse Data for the Language Model Component of a Speech Recognizer", IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 3, pp. 400-401.
- (1987) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.3 , pp. 400-401
- Katz, S.M.¹

78
- 0344139642
- Speech-based Retrieval using Semantic Co-Occurrence Filtering
- ARPA, Human Language Technologies (HLT) Conference, Plainsboro, NJ, USA
- Kupiec J., Kimber D. and Balasubramanian V. (1994) "Speech-based Retrieval using Semantic Co-Occurrence Filtering", ARPA, Human Language Technologies (HLT) Conference, pp. 373-377, Plainsboro, NJ, USA.
- (1994) , pp. 373-377
- Kupiec, J.¹ Kimber, D.² Balasubramanian, V.³

79
- 33745217037
- Using Syllable-based Indexing Features and Language Models to Improve German Spoken Document Retrieval
- ISCA, Eurospeech 2003, Geneva, Switzerland, September
- Larson M. and Eickeler S. (2003) "Using Syllable-based Indexing Features and Language Models to Improve German Spoken Document Retrieval", ISCA, Eurospeech 2003, pp. 1217-1220, Geneva, Switzerland, September.
- (2003) , pp. 1217-1220
- Larson, M.¹ Eickeler, S.²

80
- 85009112218
- Multi-layer Subword Units for Open-Vocabulary Spoken Document Retrieval
- ICSLP'2004, Jeju Island, Korea, October
- Lee S. W., Tanaka K. and Itoh Y. (2004) "Multi-layer Subword Units for Open-Vocabulary Spoken Document Retrieval", ICSLP'2004, Jeju Island, Korea, October.
- (2004)
- Lee, S.W.¹ Tanaka, K.² Itoh, Y.³

81
- 0001116877
- Binary Codes Capable of Correcting Deletions, Insertions and Reversals
- Levenshtein V. I. (1966) "Binary Codes Capable of Correcting Deletions, Insertions and Reversals", Soviet Physics Doklady, vol. 10, no. 8, pp. 707-710.
- (1966) Soviet Physics Doklady , vol.10 , Issue.8 , pp. 707-710
- Levenshtein, V.I.¹

82
- 0034276059
- Representation and linking mechanisms for audio in MPEG-7
- Lindsay A. T., Srinivasan S., Charlesworth J. P. A., Garner P. N. and Kriechbaum W. (2000) "Representation and linking mechanisms for audio in MPEG-7", Signal Processing: Image Communication Journal, Special Issue on MPEG-7, vol. 16, pp. 193-209.
- (2000) Signal Processing: Image Communication Journal, Special Issue on MPEG-7 , vol.16 , pp. 193-209
- Lindsay, A.T.¹ Srinivasan, S.² Charlesworth, J.P.A.³ Garner, P.N.⁴ Kriechbaum, W.⁵

83
- 84889344869
- Word and Sub-word Indexing Approaches for Reducing the Effects of OOV Queries on Spoken Audio
- Human Language Technology Conference (HLT 2002), San Diego, CA, USA, March
- Logan B., Moreno P. J. and Deshmukh O. (2002) "Word and Sub-word Indexing Approaches for Reducing the Effects of OOV Queries on Spoken Audio", Human Language Technology Conference (HLT 2002), San Diego, CA, USA, March.
- (2002)
- Logan, B.¹ Moreno, P.J.² Deshmukh, O.³

84
- 85009154200
- Keyword Recognition and Extraction by Multiple-LVCSRs with 60,000 Words in Speech-driven WEB Retrieval Task
- ICSLP'2004, Jeju Island, Korea, October
- Matsushita M., Nishizaki H., Nakagawa S. and Utsuro T. (2004) "Keyword Recognition and Extraction by Multiple-LVCSRs with 60,000 Words in Speech-driven WEB Retrieval Task", ICSLP'2004, Jeju Island, Korea, October.
- (2004)
- Matsushita, M.¹ Nishizaki, H.² Nakagawa, S.³ Utsuro, T.⁴

85
- 84889470946
- Combination of Phone N-Grams for a MPEG-7-based Spoken Document Retrieval System
- EUSIPCO 2004, Vienna, Austria, September
- Moreau N., Kim H.-G. and Sikora T. (2004a) "Combination of Phone N-Grams for a MPEG-7-based Spoken Document Retrieval System", EUSIPCO 2004, Vienna, Austria, September.
- (2004)
- Moreau, N.¹ Kim, H.-G.² Sikora, T.³

86
- 84889420489
- Phone-based Spoken Document Retrieval in Conformance with the MPEG-7 Standard
- 25th International AES Conference "Metadata for Audio", London, UK, June
- Moreau N., Kim H.-G. and Sikora T. (2004b) "Phone-based Spoken Document Retrieval in Conformance with the MPEG-7 Standard", 25th International AES Conference "Metadata for Audio", London, UK, June.
- (2004)
- Moreau, N.¹ Kim, H.-G.² Sikora, T.³

87
- 33745186799
- Phonetic Confusion Based Document Expansion for Spoken Document Retrieval
- ICSLP Interspeech 2004, Jeju Island, Korea, October
- Moreau N., Kim H.-G. and Sikora T. (2004c) "Phonetic Confusion Based Document Expansion for Spoken Document Retrieval", ICSLP Interspeech 2004, Jeju Island, Korea, October.
- (2004)
- Moreau, N.¹ Kim, H.-G.² Sikora, T.³

88
- 84889403089
- Scoring Algorithms for Wordspotting Systems
- HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval, Boston, MA, USA, May
- Morris R. W., Arrowood J. A., Cardillo P. S. and Clements M. A. (2004) "Scoring Algorithms for Wordspotting Systems", HLT-NAACL 2004 Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval, pp. 18-21, Boston, MA, USA, May.
- (2004) , pp. 18-21
- Morris, R.W.¹ Arrowood, J.A.² Cardillo, P.S.³ Clements, M.A.⁴

89
- 0034274806
- Experiments in Spoken Document Retrieval Using Phoneme N-grams
- Ng C., Wilkinson R. and Zobel J. (2000) "Experiments in Spoken Document Retrieval Using Phoneme N-grams", Speech Communication, vol. 32, no. 1, pp. 61-77.
- (2000) Speech Communication , vol.32 , Issue.1 , pp. 61-77
- Ng, C.¹ Wilkinson, R.² Zobel, J.³

90
- 0038576501
- Towards Robust Methods for Spoken Document Retrieval
- Sydney, Australia, November
- Ng K. (1998) "Towards Robust Methods for Spoken Document Retrieval", ICSLP'98, vol. 3, pp. 939-942, Sydney, Australia, November.
- (1998) ICSLP'98 , vol.3 , pp. 939-942
- Ng, K.¹

91
- 84937320583
- Subword-based Approaches for Spoken Document Retrieval
- PhD Thesis, Massachusetts Institute of Technology (MIT), Cambridge, MA
- Ng K. (2000) "Subword-based Approaches for Spoken Document Retrieval", PhD Thesis, Massachusetts Institute of Technology (MIT), Cambridge, MA.
- (2000)
- Ng, K.¹

92
- 0031636298
- Phonetic Recognition for Spoken Document Retrieval
- ICASSP'98, Seattle, WA, USA
- Ng K. and Zue V. (1998) "Phonetic Recognition for Spoken Document Retrieval", ICASSP'98, pp. 325-328, Seattle, WA, USA.
- (1998) , pp. 325-328
- Ng, K.¹ Zue, V.²

93
- 0034300710
- Subword-based Approaches for Spoken Document Retrieval
- Ng K. and Zue V. W. (2000) "Subword-based Approaches for Spoken Document Retrieval", Speech Communication, vol. 32, no. 3, pp. 157-186.
- (2000) Speech Communication , vol.32 , Issue.3 , pp. 157-186
- Ng, K.¹ Zue, V.W.²

94
- 85017287102
- An Efficient A* Stack Decoder Algorithm for Continuous Speech Recognition with a Stochastic Language Model
- ICASSP'92, San Francisco, USA
- Paul D. B. (1992) "An Efficient A* Stack Decoder Algorithm for Continuous Speech Recognition with a Stochastic Language Model", ICASSP'92, pp. 25-28, San Francisco, USA.
- (1992) , pp. 25-28
- Paul, D.B.¹

95
- 84948481845
- An Algorithm for Suffix Stripping
- Porter M. (1980) "An Algorithm for Suffix Stripping", Program, vol. 14, no. 3, pp. 130-137.
- (1980) Program , vol.14 , Issue.3 , pp. 130-137
- Porter, M.¹

96
- 0024610919
- A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition
- Rabiner L. (1989) "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286.
- (1989) Proceedings of the IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.¹

97
- 0004244302
- Fundamentals of Speech Recognition
- Prentice Hall, Englewood Cliffs, NJ
- Rabiner L. and Juang B.-H. (1993) Fundamentals of Speech Recognition, Prentice Hall, Englewood Cliffs, NJ.
- (1993)
- Rabiner, L.¹ Juang, B.-H.²

98
- 0017630891
- The probability ranking principle in IR
- Robertson S. E. (1977) "The probability ranking principle in IR", Journal of Documentation, vol. 33, no. 4, pp. 294-304.
- (1977) Journal of Documentation , vol.33 , Issue.4 , pp. 294-304
- Robertson, S.E.¹

99
- 0029386354
- Keyword Detection in Conversational Speech Utterances Using Hidden Markov Model Based Continuous Speech Recognition
- Rose R. C. (1995) "Keyword Detection in Conversational Speech Utterances Using Hidden Markov Model Based Continuous Speech Recognition", Computer Speech and Language, vol. 9, no. 4, pp. 309-333.
- (1995) Computer Speech and Language , vol.9 , Issue.4 , pp. 309-333
- Rose, R.C.¹

100
- 45549117987
- Term-Weighting Approaches in Automatic Text Retrieval
- Salton G. and Buckley C. (1988) "Term-Weighting Approaches in Automatic Text Retrieval", Information Processing and Management, vol. 24, no. 5, pp. 513-523.
- (1988) Information Processing and Management , vol.24 , Issue.5 , pp. 513-523
- Salton, G.¹ Buckley, C.²

101
- 0003653039
- Introduction to Modern Information Retrieval
- McGraw-Hill, New York
- Salton G. and McGill M. J. (1983) Introduction to Modern Information Retrieval, McGraw-Hill, New York.
- (1983)
- Salton, G.¹ McGill, M.J.²

102
- 0033658324
- Phonetic Confusion Matrix Based Spoken Document Retrieval
- 23rd Annual ACM Conference on Research and Development in Information Retrieval (SIGIR'00), Athens, Greece, July
- Srinivasan S. and Petkovic D. (2000) "Phonetic Confusion Matrix Based Spoken Document Retrieval", 23rd Annual ACM Conference on Research and Development in Information Retrieval (SIGIR'00), pp. 81-87, Athens, Greece, July.
- (2000) , pp. 81-87
- Srinivasan, S.¹ Petkovic, D.²

103
- 84889282489
- Common Evaluation Measures
- TREC, NIST, 10th Text Retrieval Conference (TREC 2001), Gaithersburg, MD, USA, November
- TREC (2001) "Common Evaluation Measures", NIST, 10th Text Retrieval Conference (TREC 2001), pp. A-14, Gaithersburg, MD, USA, November.
- (2001) , pp. A-14

104
- 0004217877
- Information Retrieval
- Butterworths, London
- van Rijsbergen C. J. (1979) Information Retrieval, Butterworths, London.
- (1979)
- van Rijsbergen, C.J.¹

105
- 0002565067
- Overview of the Seventh Text REtrieval Conference
- NIST, 7th Text Retrieval Conference (TREC-7), Gaithersburg, MD, USA, November
- Voorhees E. and Harman D. K. (1998) "Overview of the Seventh Text REtrieval Conference", NIST, 7th Text Retrieval Conference (TREC-7), pp. 1-24, Gaithersburg, MD, USA, November.
- (1998) , pp. 1-24
- Voorhees, E.¹ Harman, D.K.²

106
- 0002748692
- Okapi at TREC-6 Automatic Ad Hoc, VLC, Routing, Filtering and QSDR
- 6th Text Retrieval Conference (TREC-6), Gaithersburg, MD, USA, November
- Walker S., Robertson S. E., Boughanem M., Jones G. J. F. and Spärck Jones K. (1997) "Okapi at TREC-6 Automatic Ad Hoc, VLC, Routing, Filtering and QSDR", 6th Text Retrieval Conference (TREC-6), pp. 125-136, Gaithersburg, MD, USA, November.
- (1997) , pp. 125-136
- Walker, S.¹ Robertson, S.E.² Boughanem, M.³ Jones, G.J.F.⁴ Spärck Jones, K.⁵

107
- 0010052837
- Spoken Document Retrieval Based on Phoneme Recognition
- PhD Thesis, Swiss Federal Institute of Technology (ETH), Zurich
- Wechsler M. (1998) "Spoken Document Retrieval Based on Phoneme Recognition", PhD Thesis, Swiss Federal Institute of Technology (ETH), Zurich.
- (1998)
- Wechsler, M.¹

108
- 0032282577
- New Techniques for Open-Vocabulary Spoken Document Retrieval
- 21st Annual ACM Conference on Research and Development in Information Retrieval (SIGIR'98), Melbourne, Australia, August
- Wechsler M., Munteanu E. and Schäuble P. (1998) "New Techniques for Open-Vocabulary Spoken Document Retrieval", 21st Annual ACM Conference on Research and Development in Information Retrieval (SIGIR'98), pp. 20-27, Melbourne, Australia, August.
- (1998) , pp. 20-27
- Wechsler, M.¹ Munteanu, E.² Schäuble, P.³

109
- 33745207743
- SAMPA computer readable phonetic alphabet
- in Handbook of Standards and Resources for Spoken Language Systems, D. Gibbon, R. Moore and R. Winski (eds), Mouton de Gruyter, Berlin and New York
- Wells J. C. (1997) "SAMPA computer readable phonetic alphabet", in Handbook of Standards and Resources for Spoken Language Systems, D. Gibbon, R. Moore and R. Winski (eds), Mouton de Gruyter, Berlin and New York.
- (1997)
- Wells, J.C.¹

110
- 0025517070
- Automatic Recognition of Keywords in Unconstrained Speech Using Hidden Markov Models
- Wilpon J. G., Rabiner L. R. and Lee C.-H. (1990) "Automatic Recognition of Keywords in Unconstrained Speech Using Hidden Markov Models", IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, no. 11, pp. 1870-1878.
- (1990) IEEE Transactions on Acoustics, Speech and Signal Processing , vol.38 , Issue.11 , pp. 1870-1878
- Wilpon, J.G.¹ Rabiner, L.R.² Lee, C.-H.³

111
- 84889299571
- Speech Recognition and Information Retrieval: Experiments in Retrieving Spoken Documents
- DARPA Speech Recognition Workshop, Chantilly, VA, USA, February
- Witbrock M. and Hauptmann A. G. (1997) "Speech Recognition and Information Retrieval: Experiments in Retrieving Spoken Documents", DARPA Speech Recognition Workshop, Chantilly, VA, USA, February.
- (1997)
- Witbrock, M.¹ Hauptmann, A.G.²

112
- 85009089367
- A Hybrid Word/Phoneme-Based Approach for Improved Vocabulary-Independent Search in Spontaneous Speech
- ICSLP'2004, Jeju Island, Korea, October
- Yu P. and Seide F. T. B. (2004) "A Hybrid Word/Phoneme-Based Approach for Improved Vocabulary-Independent Search in Spontaneous Speech", ICSLP'2004, Jeju Island, Korea, October.
- (2004)
- Yu, P.¹ Seide, F.T.B.²

113
- 85029488480
- Fast and Practical Approximate String Matching
- Combinatorial Pattern Matching, Third Annual Symposium, Barcelona, Spain
- Baeza-Yates R. (1992) "Fast and Practical Approximate String Matching", Combinatorial Pattern Matching, Third Annual Symposium, pp. 185-192, Barcelona, Spain.
- (1992) , pp. 185-192
- Baeza-Yates, R.¹

114
- 13344269607
- Evaluation of Distance Measures for MPEG-7 Melody Contours
- International Workshop on Multimedia Signal Processing, IEEE Signal Processing Society, Siena, Italy
- Batke J. M., Eisenberg G., Weishaupt P. and Sikora T. (2004a) "Evaluation of Distance Measures for MPEG-7 Melody Contours", International Workshop on Multimedia Signal Processing, IEEE Signal Processing Society, Siena, Italy.
- (2004)
- Batke, J.M.¹ Eisenberg, G.² Weishaupt, P.³ Sikora, T.⁴

115
- 84886074605
- A Query by Humming System Using MPEG-7 Descriptors
- Proceedings of the 116th AES Convention, AES, Berlin, Germany
- Batke J. M., Eisenberg G., Weishaupt P. and Sikora T. (2004b) "A Query by Humming System Using MPEG-7 Descriptors", Proceedings of the 116th AES Convention, AES, Berlin, Germany.
- (2004)
- Batke, J.M.¹ Eisenberg, G.² Weishaupt, P.³ Sikora, T.⁴

116
- 0001835850
- Accurate Short-term Analysis of the Fundamental Frequency and the Harmonics-to-Noise Ratio of a Sampled Sound
- IFA Proceedings 17, Institute of Phonetic Sciences of the University of Amsterdam, the Netherlands
- Boersma P. (1993) "Accurate Short-term Analysis of the Fundamental Frequency and the Harmonics-to-Noise Ratio of a Sampled Sound", IFA Proceedings 17, Institute of Phonetic Sciences of the University of Amsterdam, the Netherlands.
- (1993)
- Boersma, P.¹

117
- 0036031477
- Melody Retrieval on the Web
- Proceedings of ACM/SPIE Conference on Multimedia Computing and Networking, Boston, MA, USA
- Chai W. and Vercoe B. (2002) "Melody Retrieval on the Web", Proceedings of ACM/SPIE Conference on Multimedia Computing and Networking, Boston, MA, USA.
- (2002)
- Chai, W.¹ Vercoe, B.²

118
- 4544373794
- An Auditory Model Based Transcriber of Singing Sequences
- Ghent, Belgium
- Clarisse L. P., Martens J. P., Lesaffre M., Baets B. D., Meyer H. D. and Leman M. (2002) "An Auditory Model Based Transcriber of Singing Sequences", Proceedings of the ISMIR, pp. 116-123, Ghent, Belgium.
- (2002) Proceedings of the ISMIR , pp. 116-123
- Clarisse, L.P.¹ Martens, J.P.² Lesaffre, M.³ Baets, B.D.⁴ Meyer, H.D.⁵ Leman, M.⁶

119
- 13344261703
- BeatBank - An MPEG-7 compliant query by tapping system
- Proceedings of the 116th AES Convention, Berlin, Germany
- Eisenberg G., Batke J. M. and Sikora T. (2004) "BeatBank - An MPEG-7 compliant query by tapping system", Proceedings of the 116th AES Convention, Berlin, Germany.
- (2004)
- Eisenberg, G.¹ Batke, J.M.² Sikora, T.³

120
- 0033677009
- A Robust Predominant-f0 Estimation Method for Real-time Detection of Melody and Bass Lines in CD Recordings
- Proceedings of ICASSP, Tokyo, Japan
- Goto M. (2000) "A Robust Predominant-f0 Estimation Method for Real-time Detection of Melody and Bass Lines in CD Recordings", Proceedings of ICASSP, pp. 757-760, Tokyo, Japan.
- (2000) , pp. 757-760
- Goto, M.¹

121
- 0034848863
- A Predominant-f0 Estimation Method for CD Recordings: Map Estimation Using EM Algorithm for Adaptive Tone Models
- Proceedings of ICASSP, Tokyo, Japan
- Goto M. (2001) "A Predominant-f0 Estimation Method for CD Recordings: Map Estimation Using EM Algorithm for Adaptive Tone Models", Proceedings of ICASSP, pp. V-3365-3368, Tokyo, Japan.
- (2001) , pp. V-3365-3368
- Goto, M.¹

122
- 11844270131
- Techniques for the Automated Analysis of Musical Audio
- PhD Thesis, University of Cambridge, Cambridge, UK
- Hainsworth S. W. (2003) "Techniques for the Automated Analysis of Musical Audio", PhD Thesis, University of Cambridge, Cambridge, UK.
- (2003)
- Hainsworth, S.W.¹

123
- 84889440508
- An Audio Front-End for Query-by-Humming Systems
- 2nd Annual International Symposium on Music Information Retrieval, ISMIR, Bloomington, IN, USA
- Haus G. and Pollastri E. (2001) "An Audio Front-End for Query-by-Humming Systems", 2nd Annual International Symposium on Music Information Retrieval, ISMIR, Bloomington, IN, USA.
- (2001)
- Haus, G.¹ Pollastri, E.²

124
- 84889456437
- GUIDO/MIR-An experimental musical information retrieval system based on Guido music notation
- Proceedings of the Second Annual International Symposium on Music Information Retrieval, Bloomington, IN, USA
- Hoos H. H., Renz K. and Görg M. (2001) "GUIDO/MIR-An experimental musical information retrieval system based on Guido music notation", Proceedings of the Second Annual International Symposium on Music Information Retrieval, Bloomington, IN, USA.
- (2001)
- Hoos, H.H.¹ Renz, K.² Görg, M.³

125
- 0003455850
- Information Technology - Multimedia Content Description Interface - Part 4: Audio
- ISO, 15938-4:2001(E)
- ISO (2001a) Information Technology - Multimedia Content Description Interface - Part 4: Audio, 15938-4:2001(E).
- (2001)

126
- 0012179370
- Information Technology - Multimedia Content Description Interface - Part 5: Multimedia Description Schemes
- ISO, 15938-5:2001(E)
- ISO (2001b) Information Technology - Multimedia Content Description Interface - Part 5: Multimedia Description Schemes, 15938-5:2001(E).
- (2001)

127
- 0005008397
- Analysis of a Contour-based Representation for Melody
- Proceedings of the International Symposium on Music Information Retrieval, Boston, MA, USA
- Kim Y. E., Chai W., Garcia R. and Vercoe B. (2000) "Analysis of a Contour-based Representation for Melody", Proceedings of the International Symposium on Music Information Retrieval, Boston, MA, USA.
- (2000)
- Kim, Y.E.¹ Chai, W.² Garcia, R.³ Vercoe, B.⁴

128
- 84889324368
- Means of Integrating Audio Content Analysis Algorithms
- 110th Audio Engineering Society Convention, Amsterdam, the Netherlands
- Klapuri A. (2001) "Means of Integrating Audio Content Analysis Algorithms", 110th Audio Engineering Society Convention, Amsterdam, the Netherlands.
- (2001)
- Klapuri, A.¹

129
- 33748519104
- Signal Processing Methods for the Automatic Transcription of Music
- PhD Thesis, Tampere University of Technology, Tampere, Finland
- Klapuri A. (2004) "Signal Processing Methods for the Automatic Transcription of Music", PhD Thesis, Tampere University of Technology, Tampere, Finland.
- (2004)
- Klapuri, A.¹

130
- 84948666520
- Efficient Calculation of a Physiologically-motivated Representation for Sound
- IEEE International Conference on Digital Signal Processing, Santorini, Greece
- Klapuri A. P. and Astola J. T. (2002) "Efficient Calculation of a Physiologically-motivated Representation for Sound", IEEE International Conference on Digital Signal Processing, Santorini, Greece.
- (2002)
- Klapuri, A.P.¹ Astola, J.T.²

131
- 0003769779
- Introduction to MPEG-7
- 1st Edition, John Wiley & Sons, Ltd, Chichester
- Manjunath B. S., Salembier P. and Sikora T. (eds) (2002) Introduction to MPEG-7, 1st Edition, John Wiley & Sons, Ltd, Chichester.
- (2002)
- Manjunath, B.S.¹ Salembier, P.² Sikora, T.³

132
- 0037728485
- Signal Processing for Melody Transcription
- Proceedings of the 19th Australasian Computer Science Conference, Waikato, New Zealand
- McNab R. J., Smith L. A. and Witten I. H. (1996a) "Signal Processing for Melody Transcription", Proceedings of the 19th Australasian Computer Science Conference, Waikato, New Zealand.
- (1996)
- McNab, R.J.¹ Smith, L.A.² Witten, I.H.³

133
- 0029695822
- Towards the Digital Music Library: Tune retrieval from acoustic input
- Proceedings of the first ACM International Conference on Digital Libraries, Bethesda, MD, USA
- McNab R. J., Smith L. A., Witten I. H., Henderson C. L. and Cunningham S. J. (1996b) "Towards the Digital Music Library: Tune retrieval from acoustic input", Proceedings of the first ACM International Conference on Digital Libraries, pp. 11-18, Bethesda, MD, USA.
- (1996) , pp. 11-18
- McNab, R.J.¹ Smith, L.A.² Witten, I.H.³ Henderson, C.L.⁴ Cunningham, S.J.⁵

134
- 0025740746
- Virtual Pitch and Phase Sensitivity of a Computer Model of the Auditory Periphery. I: Pitch identification
- Meddis R. and Hewitt M. J. (1991) "Virtual Pitch and Phase Sensitivity of a Computer Model of the Auditory Periphery. I: Pitch identification", Journal of the Acoustical Society of America, vol. 89, no. 6, pp. 2866-2882.
- (1991) Journal of the Acoustical Society of America , vol.89 , Issue.6 , pp. 2866-2882
- Meddis, R.¹ Hewitt, M.J.²

135
- 84889359896
- Die Ganze Musik im Internet
- Musicline, QBH system provided by phononet GmbH
- Musicline (n.d.) "Die Ganze Musik im Internet", QBH system provided by phononet GmbH.

136
- 84889317725
- Musipedia, the open music encyclopedia
- Musipedia
- Musipedia (2004) "Musipedia, the open music encyclopedia", www.musipedia.org.
- (2004)

137
- 84889386344
- Information technology - Multimedia content description interface - Part 4: Audio, AMENDMENT 1: Audio extensions
- N57, Audio Group Text of ISO/IEC 15938-4:2002/FDAM 1
- N57 (2003) Information technology - Multimedia content description interface - Part 4: Audio, AMENDMENT 1: Audio extensions, Audio Group Text of ISO/IEC 15938-4:2002/FDAM 1.
- (2003)

138
- 84959256090
- An Interface for Melody Input
- Prechelt L. and Typke R. (2001) "An Interface for Melody Input", ACM Transactions on Computer-Human Interaction, vol. 8, no. 2, pp. 133-149.
- (2001) ACM Transactions on Computer-Human Interaction , vol.8 , Issue.2 , pp. 133-149
- Prechelt, L.¹ Typke, R.²

139
- 0031972902
- Tempo and Beat Analysis of Acoustic Musical Signals
- Scheirer E. D. (1998) "Tempo and Beat Analysis of Acoustic Musical Signals", Journal of the Acoustical Society of America, vol. 103, no. 1, pp. 588-601.
- (1998) Journal of the Acoustical Society of America , vol.103 , Issue.1 , pp. 588-601
- Scheirer, E.D.¹

140
- 84889269903
- Pitch Detection of the Singing Voice in Musical Audio
- Proceedings of the 114th AES Convention, Amsterdam, the Netherlands
- Shandilya S. and Rao P. (2003) "Pitch Detection of the Singing Voice in Musical Audio", Proceedings of the 114th AES Convention, Amsterdam, the Netherlands.
- (2003)
- Shandilya, S.¹ Rao, P.²

141
- 4744373951
- Music Information Retrieval Technology
- PhD Thesis, Royal Melbourne Institute of Technology, Melbourne, Australia
- Uitdenbogerd A. L. (2002) "Music Information Retrieval Technology", PhD Thesis, Royal Melbourne Institute of Technology, Melbourne, Australia.
- (2002)
- Uitdenbogerd, A.L.¹

142
- 0033279561
- Matching Techniques for Large Music Databases
- Proceedings of the ACM Multimedia Conference (ed. D. Bulterman, K. Jeffay and H. J. Zhang), Orlando, Florida
- Uitdenbogerd A. L. and Zobel J. (1999) "Matching Techniques for Large Music Databases", Proceedings of the ACM Multimedia Conference (ed. D. Bulterman, K. Jeffay and H. J. Zhang), pp. 57-66, Orlando, Florida.
- (1999) , pp. 57-66
- Uitdenbogerd, A.L.¹ Zobel, J.²

143
- 13344250717
- Music Ranking Techniques Evaluated
- Proceedings of the Australasian Computer Science Conference (ed. M. Oudshoorn), Melbourne, Australia
- Uitdenbogerd A. L. and Zobel J. (2002) "Music Ranking Techniques Evaluated", Proceedings of the Australasian Computer Science Conference (ed. M. Oudshoorn), pp. 275-283, Melbourne, Australia.
- (2002) , pp. 275-283
- Uitdenbogerd, A.L.¹ Zobel, J.²

144
- 84866006845
- A Probabilistic Model for the Transcription of Single-voice Melodies
- Finnish Signal Processing Symposium, FINSIG Tampere University of Technology, Tampere, Finland
- Viitaniemi T., Klapuri A. and Eronen A. (2003) "A Probabilistic Model for the Transcription of Single-voice Melodies", Finnish Signal Processing Symposium, FINSIG Tampere University of Technology, Tampere, Finland.
- (2003)
- Viitaniemi, T.¹ Klapuri, A.² Eronen, A.³

145
- 84855721130
- Wikipedia, the free encyclopedia
- Wikipedia
- Wikipedia (2001) "Wikipedia, the free encyclopedia", http://en.wikipedia.org.
- (2001)

146
- 4243152700
- Content-based Identification of Audio Material Using MPEG-7 Low Level Description
- International Symposium on Music Information Retrieval, Bloomington, IN, USA, October
- Allamanche E., Herre J., Hellmuth O., Fröba B., Kasten T. and Cremer M. (2001) "Content-based Identification of Audio Material Using MPEG-7 Low Level Description", International Symposium on Music Information Retrieval, Bloomington, IN, USA, October.
- (2001)
- Allamanche, E.¹ Herre, J.² Hellmuth, O.³ Fröba, B.⁴ Kasten, T.⁵ Cremer, M.⁶

147
- 0005540823
- Modern Information Retrieval
- Addison-Wesley, Reading, MA
- Baeza-Yates R. and Ribeiro-Neto B. (1999) Modern Information Retrieval, Addison-Wesley, Reading, MA.
- (1999)
- Baeza-Yates, R.¹ Ribeiro-Neto, B.²

148
- 29344471330
- Automatic Song Identification in Noisy Broadcast Audio
- International Conference on Signal and Image Processing (SIP 2002), Kauai, HI, USA, August
- Batlle E., Masip J. and Guaus E. (2002) "Automatic Song Identification in Noisy Broadcast Audio", International Conference on Signal and Image Processing (SIP 2002), Kauai, HI, USA, August.
- (2002)
- Batlle, E.¹ Masip, J.² Guaus, E.³

149
- 4243471699
- Method and Article of Manufacture for Content-Based Analysis, Storage, Retrieval and Segmentation of Audio Information
- US Patent 5,918,223
- Blum T., Keislar D., Wheaton J. and Wold E. (1999) "Method and Article of Manufacture for Content-Based Analysis, Storage, Retrieval and Segmentation of Audio Information", US Patent 5,918,223.
- (1999)
- Blum, T.¹ Keislar, D.² Wheaton, J.³ Wold, E.⁴

150
- 17444446371
- Extracting Noise-Robust Features from Audio Data
- ICASSP 2002, Orlando, FL, USA, May
- Burges C., Platt J. and Jana S. (2002) "Extracting Noise-Robust Features from Audio Data", ICASSP 2002, Orlando, FL, USA, May.
- (2002)
- Burges, C.¹ Platt, J.² Jana, S.³

151
- 84889321947
- Statistical Significance in Song-Spotting in Audio
- International Symposium on Music Information Retrieval (MUSIC IR 2001), Bloomington, IN, USA, October
- Cano P., Kaltenbrunner M., Mayor O. and Batlle E. (2001) "Statistical Significance in Song-Spotting in Audio", International Symposium on Music Information Retrieval (MUSIC IR 2001), Bloomington, IN, USA, October.
- (2001)
- Cano, P.¹ Kaltenbrunner, M.² Mayor, O.³ Batlle, E.⁴

152
- 84942244978
- A Review of Algorithms for Audio Fingerprinting
- International Workshop on Multimedia Signal Processing (MMSP 2002), St Thomas, Virgin Islands, December
- Cano P., Batlle E., Kalker T. and Haitsma J. (2002a) "A Review of Algorithms for Audio Fingerprinting", International Workshop on Multimedia Signal Processing (MMSP 2002), St Thomas, Virgin Islands, December.
- (2002)
- Cano, P.¹ Batlle, E.² Kalker, T.³ Haitsma, J.⁴

153
- 84889413586
- Robust Sound Modeling for Song Detection in Broadcast Audio
- AES 112th International Convention, Munich, Germany, May
- Cano P., Batlle E., Mayer H. and Neuschmied H. (2002b) "Robust Sound Modeling for Song Detection in Broadcast Audio", AES 112th International Convention, Munich, Germany, May.
- (2002)
- Cano, P.¹ Batlle, E.² Mayer, H.³ Neuschmied, H.⁴

154
- 84889297121
- Audio Fingerprinting: Concepts and Applications
- International Conference on Fuzzy Systems Knowledge Discovery (FSKD'02), Singapore, November
- Cano P., Gómez E., Batlle E., Gomes L. and Bonnet M. (2002c) "Audio Fingerprinting: Concepts and Applications", International Conference on Fuzzy Systems Knowledge Discovery (FSKD'02), Singapore, November.
- (2002)
- Cano, P.¹ Gómez, E.² Batlle, E.³ Gomes, L.⁴ Bonnet, M.⁵

155
- 0345043999
- Searching in Metric Spaces
- Chávez E., Navarro G., Baeza-Yates R. A. and Marroquín J. L. (2001) "Searching in Metric Spaces", ACM Computing Surveys, vol. 33, no. 3, pp. 273-321.
- (2001) ACM Computing Surveys , vol.33 , Issue.3 , pp. 273-321
- Chávez, E.¹ Navarro, G.² Baeza-Yates, R.A.³ Marroquín, J.L.⁴

156
- 19644372694
- Audio Watermarking and Fingerprinting: For Which Applications?
- Gomes L., Cano P., Gómez E., Bonnet M. and Batlle E. (2003) "Audio Watermarking and Fingerprinting: For Which Applications?", Journal of New Music Research, vol. 32, no. 1, pp. 65-81.
- (2003) Journal of New Music Research , vol.32 , Issue.1 , pp. 65-81
- Gomes, L.¹ Cano, P.² Gómez, E.³ Bonnet, M.⁴ Batlle, E.⁵

157
- 84889342431
- Mixed Watermarking-Fingerprinting Approach for Integrity Verification of Audio Recordings
- International Telecommunications Symposium (ITS 2002), Natal, Brazil, September
- Gómez E., Cano P., Gomes L., Batlle E. and Bonnet M. (2002) "Mixed Watermarking-Fingerprinting Approach for Integrity Verification of Audio Recordings", International Telecommunications Symposium (ITS 2002), Natal, Brazil, September.
- (2002)
- Gómez, E.¹ Cano, P.² Gomes, L.³ Batlle, E.⁴ Bonnet, M.⁵

158
- 33845940056
- A Highly Robust Audio Fingerprinting System
- 3rd International Conference on Music Information Retrieval (ISMIR2002), Paris, France, October
- Haitsma J. and Kalker T. (2002) "A Highly Robust Audio Fingerprinting System", 3rd International Conference on Music Information Retrieval (ISMIR2002), Paris, France, October.
- (2002)
- Haitsma, J.¹ Kalker, T.²

159
- 84942246936
- Scalable Robust Audio Fingerprinting Using MPEG-7 Content Description
- IEEE Workshop on Multimedia Signal Processing (MMSP 2002), Virgin Islands, December
- Herre J., Hellmuth O. and Cremer M. (2002) "Scalable Robust Audio Fingerprinting Using MPEG-7 Content Description", IEEE Workshop on Multimedia Signal Processing (MMSP 2002), Virgin Islands, December.
- (2002)
- Herre, J.¹ Hellmuth, O.² Cremer, M.³

160
- 84889446449
- Applications and Challenges for Audio Fingerprinting
- 111th AES Convention, New York, USA, December
- Kalker T. (2001) "Applications and Challenges for Audio Fingerprinting", 111th AES Convention, New York, USA, December.
- (2001)
- Kalker, T.¹

161
- 84889465844
- Signal Recognition System and Method
- US Patent 5,210,820
- Kenyon S. (1999) "Signal Recognition System and Method", US Patent 5,210,820.
- (1999)
- Kenyon, S.¹

162
- 0034849207
- Very Quick Audio Searching: Introducing Global Pruning to the Time-Series Active Search
- Salt Lake City, UT, USA, May
- Kimura A., Kashino K., Kurozumi T. and Murase H. (2001) "Very Quick Audio Searching: Introducing Global Pruning to the Time-Series Active Search", ICASSP'01, vol. 3, pp. 1429-1432, Salt Lake City, UT, USA, May.
- (2001) ICASSP'01 , vol.3 , pp. 1429-1432
- Kimura, A.¹ Kashino, K.² Kurozumi, T.³ Murase, H.⁴

163
- 84873545221
- Identification of Highly Distorted Audio Material for Querying Large Scale Databases
- 112th AES International Convention, Munich, Germany, May
- Kurth F., Ribbrock A. and Clausen M. (2002) "Identification of Highly Distorted Audio Material for Querying Large Scale Databases", 112th AES International Convention, Munich, Germany, May.
- (2002)
- Kurth, F.¹ Ribbrock, A.² Clausen, M.³

164
- 0018918171
- An Algorithm for Vector Quantizer Design
- Linde Y., Buzo A. and Gray R. M. (1980) "An Algorithm for Vector Quantizer Design", IEEE Transactions on Communications, vol. 28, no. 1, pp. 84-95.
- (1980) IEEE Transactions on Communications , vol.28 , Issue.1 , pp. 84-95
- Linde, Y.¹ Buzo, A.² Gray, R.M.³

165
- 0025489558
- Detecting and Logging Advertisements Using its Sound
- Lourens J. G. (1990) "Detecting and Logging Advertisements Using its Sound", IEEE Transactions on Broadcasting, vol. 36, no. 3, pp. 231-233.
- (1990) IEEE Transactions on Broadcasting , vol.36 , Issue.3 , pp. 231-233
- Lourens, J.G.¹

166
- 20444444996
- A Perceptual Audio Hashing Algorithm: A Tool for Robust Audio Identification and Information Hiding
- 4th Workshop on Information Hiding, Pittsburgh, PA, USA, April
- Mihcak M. K. and Venkatesan R. (2001) "A Perceptual Audio Hashing Algorithm: A Tool for Robust Audio Identification and Information Hiding", 4th Workshop on Information Hiding, Pittsburgh, PA, USA, April.
- (2001)
- Mihcak, M.K.¹ Venkatesan, R.²

167
- 0035099428
- A New Approach to the Automatic Recognition of Musical Recordings
- Papaodysseus C., Roussopoulos G., Fragoulis D. and Alexiou C. (2001) "A New Approach to the Automatic Recognition of Musical Recordings", Journal of the AES, vol. 49, no. 1/2, pp. 23-35.
- (2001) Journal of the AES , vol.49 , Issue.1-2 , pp. 23-35
- Papaodysseus, C.¹ Roussopoulos, G.² Fragoulis, D.³ Alexiou, C.⁴

168
- 4744354885
- Request for Information on Audio Fingerprinting Technologies
- RIAA/IFPI, available at
- RIAA/IFPI (2001) "Request for Information on Audio Fingerprinting Technologies", available at http://www.ifpi.org/site-content/press/20010615.html.
- (2001)

169
- 0034478682
- Short-term Sound Stream Characterization for Reliable, Real-Time Occurrence Monitoring of Given Sound-Prints
- 10th IEEE Mediterranean Electrotechnical Conference (MELECON 2000), Cyprus, May
- Richly G., Varga L., Kovács F. and Hosszú G. (2000) "Short-term Sound Stream Characterization for Reliable, Real-Time Occurrence Monitoring of Given Sound-Prints", 10th IEEE Mediterranean Electrotechnical Conference (MELECON 2000), pp. 29-31, Cyprus, May.
- (2000) , pp. 29-31
- Richly, G.¹ Varga, L.² Kovács, F.³ Hosszú, G.⁴

170
- 0030681105
- Transform-Based Indexing of Audio Data for Multimedia Databases
- IEEE International Conference on Multimedia Computing and Systems (ICMCS '97), Ottawa, Canada, June
- Subramanya S., Simha R., Narahari B. and Youssef A. (1997) "Transform-Based Indexing of Audio Data for Multimedia Databases", IEEE International Conference on Multimedia Computing and Systems (ICMCS '97), pp. 211-218, Ottawa, Canada, June.
- (1997) , pp. 211-218
- Subramanya, S.¹ Simha, R.² Narahari, B.³ Youssef, A.⁴

171
- 85143189691
- Modulation Frequency Features for Audio Fingerprinting
- ICASSP 2002, Orlando, FL, USA, May
- Sukittanon S. and Atlas L. (2002) "Modulation Frequency Features for Audio Fingerprinting", ICASSP 2002, Orlando, FL, USA, May.
- (2002)
- Sukittanon, S.¹ Atlas, L.²

172
- 0004172718
- Pattern Recognition
- Academic Press, San Diego, CA
- Theodoridis S. and Koutroumbas K. (1998) Pattern Recognition, Academic Press, San Diego, CA.
- (1998)
- Theodoridis, S.¹ Koutroumbas, K.²

173
- 84894907010
- Semantic Video Retrieval Using Audio Analysis
- Proceedings CIVR 2002, London, UK, July
- Bakker E. M. and Lew M. S. (2002) "Semantic Video Retrieval Using Audio Analysis", Proceedings CIVR 2002, pp. 271-277, London, UK, July.
- (2002) , pp. 271-277
- Bakker, E.M.¹ Lew, M.S.²

174
- 0031233424
- Speaker Recognition: A Tutorial
- Campbell J. R. (1997) "Speaker Recognition: A Tutorial", Proceedings of the IEEE, vol. 85, no. 9, pp. 1437-1462.
- (1997) Proceedings of the IEEE , vol.85 , Issue.9 , pp. 1437-1462
- Campbell, J.R.¹

175
- 0002595416
- Speaker Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion
- DARPA Broadcast News Transcription and Understanding Workshop 1998, Lansdowne, VA, USA, February
- Chen S. and Gopalakrishnan P. (1998) "Speaker Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion", DARPA Broadcast News Transcription and Understanding Workshop 1998, Lansdowne, VA, USA, February.
- (1998)
- Chen, S.¹ Gopalakrishnan, P.²

176
- 6344242294
- Detection of Soccer Goal Shots Using Joint Multimedia Features and Classification Rules
- Proceedings of the Fourth International Workshop on Multimedia Data Mining (MDM/KDD2003), Washington, DC, USA, August
- Chen S.-C., Shyu M.-L., Zhang C., Luo L. and Chen M. (2003) "Detection of Soccer Goal Shots Using Joint Multimedia Features and Classification Rules", Proceedings of the Fourth International Workshop on Multimedia Data Mining (MDM/KDD2003), pp. 36-44, Washington, DC, USA, August.
- (2003) , pp. 36-44
- Chen, S.-C.¹ Shyu, M.-L.² Zhang, C.³ Luo, L.⁴ Chen, M.⁵

177
- 85009212151
- A Sequential Metric-Based Audio Segmentation Method via the Bayesian Information Criterion
- Proceedings EUROSPEECH 2003, Geneva, Switzerland, September
- Cheng S.-S. and Wang H.-M. (2003) "A Sequential Metric-Based Audio Segmentation Method via the Bayesian Information Criterion", Proceedings EUROSPEECH 2003, Geneva, Switzerland, September.
- (2003)
- Cheng, S.-S.¹ Wang, H.-M.²

178
- 84948186412
- Non-Negative Component Parts of Sound for Classification
- IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany, December
- Cho Y.-C., Choi S. and Bang S.-Y. (2003) "Non-Negative Component Parts of Sound for Classification", IEEE International Symposium on Signal Processing and Information Technology, Darmstadt, Germany, December.
- (2003)
- Cho, Y.-C.¹ Choi, S.² Bang, S.-Y.³

179
- 0030381663
- Unsupervised Speaker Segmentation in Telephone Conversations
- Proceedings, Nineteenth Convention of Electrical and Electronics Engineers, Israel
- Cohen A. and Lapidus V. (1996) "Unsupervised Speaker Segmentation in Telephone Conversations", Proceedings, Nineteenth Convention of Electrical and Electronics Engineers, Israel, pp. 102-105.
- (1996) , pp. 102-105
- Cohen, A.¹ Lapidus, V.²

180
- 0035500783
- Speech Enhancement for Non-Stationary Environments
- Cohen I. and Berdugo B. (2001) "Speech Enhancement for Non-Stationary Environments", Signal Processing, vol. 81, pp. 2403-2418.
- (2001) Signal Processing , vol.81 , pp. 2403-2418
- Cohen, I.¹ Berdugo, B.²

181
- 0034273195
- DISTBIC: A Speaker-Based Segmentation for Audio Data Indexing
- Delacourt P. and Wellekens C. J. (2000) "DISTBIC: A Speaker-Based Segmentation for Audio Data Indexing", Speech Communication, vol. 32, pp. 111-126.
- (2000) Speech Communication , vol.32 , pp. 111-126
- Delacourt, P.¹ Wellekens, C.J.²

182
- 0003578015
- Cluster Analysis
- 3rd Edition, Oxford University Press, New York
- Everitt B. S. (1993) Cluster Analysis, 3rd Edition, Oxford University Press, New York.
- (1993)
- Everitt, B.S.¹

183
- 0019555090
- Cepstral Analysis Technique for Automatic Speaker Verification
- Furui S. (1981) "Cepstral Analysis Technique for Automatic Speaker Verification", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-29, pp. 254-272.
- (1981) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-29 , pp. 254-272
- Furui, S.¹

184
- 0000808717
- Partitioning and Transcription of Broadcast News Data
- Proceedings of ICSLP 1998, Sydney, Australia, November
- Gauvain J. L., Lamel L. and Adda G. (1998) "Partitioning and Transcription of Broadcast News Data", Proceedings of ICSLP 1998, Sydney, Australia, November.
- (1998)
- Gauvain, J.L.¹ Lamel, L.² Adda, G.³

185
- 0028516097
- Text-Independent Speaker Identification
- Gish H. and Schmidt N. (1994) "Text-Independent Speaker Identification", IEEE Signal Processing Magazine, pp. 18-21.
- (1994) IEEE Signal Processing Magazine , pp. 18-21
- Gish, H.¹ Schmidt, N.²

186
- 0026400244
- Segregation of Speaker for Speech Recognition and Speaker Identification
- Proceedings of ICASSP, Toronto, Canada, May
- Gish H., Siu M.-H. and Rohlicek R. (1991) "Segregation of Speaker for Speech Recognition and Speaker Identification", Proceedings of ICASSP, pp. 873-876, Toronto, Canada, May.
- (1991) , pp. 873-876
- Gish, H.¹ Siu, M.-H.² Rohlicek, R.³

187
- 0025041264
- Perceptual Linear Predictive (PLP) Analysis of Speech
- Hermansky H. (1990) "Perceptual Linear Predictive (PLP) Analysis of Speech", Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738-1752.
- (1990) Journal of the Acoustical Society of America , vol.87 , Issue.4 , pp. 1738-1752
- Hermansky, H.¹

188
- 0028517164
- RASTA Processing of Speech
- Hermansky H. and Morgan N. (1994) "RASTA Processing of Speech", IEEE Transactions on Speech and Audio Processing, vol. 2, no. 4, pp. 578-589.
- (1994) IEEE Transactions on Speech and Audio Processing , vol.2 , Issue.4 , pp. 578-589
- Hermansky, H.¹ Morgan, N.²

189
- 0022929809
- The Computation of Line Spectral Frequencies Using Chebyshev Polynomials
- Kabal P. and Ramachandran R. (1986) "The Computation of Line Spectral Frequencies Using Chebyshev Polynomials", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-34, no. 6, pp. 1419-1426.
- (1986) IEEE Transactions on Acoustics, Speech, and Signal Processing , vol.ASSP-34 , Issue.6 , pp. 1419-1426
- Kabal, P.¹ Ramachandran, R.²

190
- 0033692969
- Strategies for Automatic Segmentation of Audio Data
- Proceedings ICASSP 2000, Istanbul, Turkey, June
- Kemp T., Schmidt M., Westphal M. and Waibel A. (2000) "Strategies for Automatic Segmentation of Audio Data", Proceedings ICASSP 2000, Istanbul, Turkey, June.
- (2000)
- Kemp, T.¹ Schmidt, M.² Westphal, M.³ Waibel, A.⁴

191
- 8844234947
- Automatic Segmentation of Speakers in Broadcast Audio Material
- IS&T/SPIE's Electronic Imaging 2004, San Jose, CA, USA, January
- Kim H.-G. and Sikora T. (2004a) "Automatic Segmentation of Speakers in Broadcast Audio Material", IS&T/SPIE's Electronic Imaging 2004, San Jose, CA, USA, January.
- (2004)
- Kim, H.-G.¹ Sikora, T.²

192
- 4544361760
- Comparison of MPEG-7 Audio Spectrum Projection Features and MFCC Applied to Speaker Recognition, Sound Classification and Audio Segmentation
- Proceedings ICASSP 2004, Montreal, Canada, May
- Kim H.-G. and Sikora T. (2004b) "Comparison of MPEG-7 Audio Spectrum Projection Features and MFCC Applied to Speaker Recognition, Sound Classification and Audio Segmentation", Proceedings ICASSP 2004, Montreal, Canada, May.
- (2004)
- Kim, H.-G.¹ Sikora, T.²

193
- 85009101164
- Speech Enhancement based on Smoothing of Spectral Noise Floor
- Proceedings INTERSPEECH 2004 - ICSLP, Jeju Island, South Korea, October
- Kim H.-G. and Sikora T. (2004c) "Speech Enhancement based on Smoothing of Spectral Noise Floor", Proceedings INTERSPEECH 2004 - ICSLP, Jeju Island, South Korea, October.
- (2004)
- Kim, H.-G.¹ Sikora, T.²

194
- 0032181880
- Audio Feature Extraction and Analysis for Scene Segmentation and Classification
- Liu Z., Wang Y. and Chen T. (1998) "Audio Feature Extraction and Analysis for Scene Segmentation and Classification", Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, vol. 20, no. 1/2, pp. 61-80.
- (1998) Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology , vol.20 , Issue.1-2 , pp. 61-80
- Liu, Z.¹ Wang, Y.² Chen, T.³

195
- 84889446323
- Speaker Change Detection and Tracking in Real-time News Broadcasting Analysis
- Proceedings 9th ACM International Conference on Multimedia, 2001, Ottawa, Canada, October
- Lu L. and Zhang H.-J. (2001) "Speaker Change Detection and Tracking in Real-time News Broadcasting Analysis", Proceedings 9th ACM International Conference on Multimedia, 2001, pp. 203-211, Ottawa, Canada, October.
- (2001) , pp. 203-211
- Lu, L.¹ Zhang, H.-J.²

196
- 84889394943
- A Robust Audio Classification and Segmentation Method
- Proceedings 10th ACM International Conference on Multimedia, 2002, Juan les Pins, France, December
- Lu L., Jiang H. and Zhang H.-J. (2002) "A Robust Audio Classification and Segmentation Method", Proceedings 10th ACM International Conference on Multimedia, 2002, Juan les Pins, France, December.
- (2002)
- Lu, L.¹ Jiang, H.² Zhang, H.-J.³

197
- 0032667465
- Tracking Speech-presence Uncertainty to Improve Speech Enhancement in Non-stationary Noise Environments
- Phoenix, AZ, USA, March
- Malah D., Cox R. and Accardi A. (1999) "Tracking Speech-presence Uncertainty to Improve Speech Enhancement in Non-stationary Noise Environments", Proceedings ICASSP 1999, vol. 2, pp. 789-792, Phoenix, AZ, USA, March.
- (1999) Proceedings ICASSP 1999 , vol.2 , pp. 789-792
- Malah, D.¹ Cox, R.² Accardi, A.³

198
- 33745186799
- Phonetic Confusion Based Document Expansion for Spoken Document Retrieval
- ICSLP Interspeech 2004, Jeju Island, Korea, October
- Moreau N., Kim H.-G. and Sikora T. (2004) "Phonetic Confusion Based Document Expansion for Spoken Document Retrieval", ICSLP Interspeech 2004, Jeju Island, Korea, October.
- (2004)
- Moreau, N.¹ Kim, H.-G.² Sikora, T.³

199
- 0003425258
- Digital Processing of Speech Signals
- Prentice Hall (Signal Processing Series), Englewood Cliffs, NJ
- Rabiner L. R. and Schafer R. W. (1978) Digital Processing of Speech Signals, Prentice Hall (Signal Processing Series), Englewood Cliffs, NJ.
- (1978)
- Rabiner, L.R.¹ Schafer, R.W.²

200
- 85128386923
- Blind Clustering of Speech Utterances Based on Speaker and Language Characteristics
- Proceedings ICASSP 1998, Seattle, WA, USA, May
- Reynolds D. A., Singer E., Carlson B. A., McLaughlin J. J., O'Leary G.C. and Zissman M. A. (1998) "Blind Clustering of Speech Utterances Based on Speaker and Language Characteristics", Proceedings ICASSP 1998, Seattle, WA, USA, May.
- (1998)
- Reynolds, D.A.¹ Singer, E.² Carlson, B.A.³ McLaughlin, J.J.⁴ O'Leary, G.C.⁵ Zissman, M.A.⁶

201
- 84889346572
- Automatic Segmentation, Classification and Clustering of Broadcast News Audio
- Proceedings of Speech Recognition Workshop, Chantilly, VA, USA, February
- Siegler M. A., Jain U., Raj B. and Stern R. M. (1997) "Automatic Segmentation, Classification and Clustering of Broadcast News Audio", Proceedings of Speech Recognition Workshop, Chantilly, VA, USA, February.
- (1997)
- Siegler, M.A.¹ Jain, U.² Raj, B.³ Stern, R.M.⁴

202
- 85009265801
- An Unsupervised, Sequential Learning Algorithm for the Segmentation of Speech Waveforms with Multiple Speakers
- Proceedings ICASSP 1992, vol. 2, San Francisco, USA, March
- Siu M.-H., Yu G. and Gish H. (1992) "An Unsupervised, Sequential Learning Algorithm for the Segmentation of Speech Waveforms with Multiple Speakers", Proceedings ICASSP 1992, vol. 2, pp. 189-192, San Francisco, USA, March.
- (1992) , pp. 189-192
- Siu, M.-H.¹ Yu, G.² Gish, H.³

203
- 84889324982
- Speaker Tracking and Detection with Multiple Speakers
- Seattle, WA, USA, May
- Solomonoff A., Mielke A., Schmidt M. and Gish H. (1998) "Speaker Tracking and Detection with Multiple Speakers", Proceedings ICASSP 1998, vol. 2, pp. 757-760, Seattle, WA, USA, May.
- (1998) Proceedings ICASSP 1998 , vol.2 , pp. 757-760
- Solomonoff, A.¹ Mielke, A.² Schmidt, M.³ Gish, H.⁴

204
- 0037521928
- Speaker Tracking and Detection with Multiple Speakers
- Proceedings EUROSPEECH 1999, Budapest, Hungary, September
- Sonmez K., Heck L. and Weintraub M. (1999) "Speaker Tracking and Detection with Multiple Speakers", Proceedings EUROSPEECH 1999, Budapest, Hungary, September.
- (1999)
- Sonmez, K.¹ Heck, L.² Weintraub, M.³

205
- 0033279679
- Towards Robust Features for Classifying Audio in the CueVideo System
- Proceedings 7th ACM International Conference on Multimedia, Ottawa, Canada, October
- Srinivasan S., Petkovic D. and Ponceleon D. (1999) "Towards Robust Features for Classifying Audio in the CueVideo System", Proceedings 7th ACM International Conference on Multimedia, pp. 393-400, Ottawa, Canada, October.
- (1999) , pp. 393-400
- Srinivasan, S.¹ Petkovic, D.² Ponceleon, D.³

206
- 0027252184
- Speech Segmentation and Clustering Based on Speaker Features
- Minneapolis, USA, April
- Sugiyama M., Murakami J. and Watanabe H. (1993) "Speech Segmentation and Clustering Based on Speaker Features", Proceedings ICASSP 1993, vol. 2, pp. 395-398, Minneapolis, USA, April.
- (1993) Proceedings ICASSP 1993 , vol.2 , pp. 395-398
- Sugiyama, M.¹ Murakami, J.² Watanabe, H.³

207
- 0003775661
- Improved Speaker Segmentation and Segments Clustering Using the Bayesian Information Criterion
- Proceedings EUROSPEECH 1999, Budapest, Hungary, September
- Tritschler A. and Gopinath R. (1999) "Improved Speaker Segmentation and Segments Clustering Using the Bayesian Information Criterion", Proceedings EUROSPEECH 1999, Budapest, Hungary, September.
- (1999)
- Tritschler, A.¹ Gopinath, R.²

208
- 11244258944
- Sports Highlight Detection from Keyword Sequences Using HMM
- Proceedings ICME 2004, Taipei, China, June
- Wang J., Xu C., Chng E. S. and Tian Q. (2004) "Sports Highlight Detection from Keyword Sequences Using HMM", Proceedings ICME 2004, Taipei, China, June.
- (2004)
- Wang, J.¹ Xu, C.² Chng, E.S.³ Tian, Q.⁴

209
- 85032751556
- Multimedia Content Analysis Using Audio and Visual Information
- Wang Y., Liu Z. and Huang J. (2000) "Multimedia Content Analysis Using Audio and Visual Information", IEEE Signal Processing Magazine (invited paper), vol. 17, no. 6, pp. 12-36.
- (2000) IEEE Signal Processing Magazine (invited paper) , vol.17 , Issue.6 , pp. 12-36
- Wang, Y.¹ Liu, Z.² Huang, J.³

210
- 79952385877
- Segmentation of Speech Using Speaker Identification
- Proceedings ICASSP 1994, Adelaide, Australia, April
- Wilcox L., Chen F., Kimber D. and Balasubramanian V. (1994) "Segmentation of Speech Using Speaker Identification", Proceedings ICASSP 1994, Adelaide, Australia, April.
- (1994)
- Wilcox, L.¹ Chen, F.² Kimber, D.³ Balasubramanian, V.⁴

211
- 84892177707
- Experiments in Broadcast News Transcription
- Proceedings ICASSP 1998, Seattle, WA, USA, May
- Woodland P. C., Hain T., Johnson S., Niesler T., Tuerk A. and Young S. (1998) "Experiments in Broadcast News Transcription", Proceedings ICASSP 1998, Seattle, WA, USA, May.
- (1998)
- Woodland, P.C.¹ Hain, T.² Johnson, S.³ Niesler, T.⁴ Tuerk, A.⁵ Young, S.⁶

212
- 61949211336
- UBM-Based Real-Time Speaker Segmentation for Broadcasting News
- ICME 2003, Hong Kong, April
- Wu T., Lu L., Chen K. and Zhang H.-J. (2003) "UBM-Based Real-Time Speaker Segmentation for Broadcasting News", ICME 2003, vol. 2, pp. 721-724, Hong Kong, April.
- (2003) , vol.2 , pp. 721-724
- Wu, T.¹ Lu, L.² Chen, K.³ Zhang, H.-J.⁴

213
- 0141743478
- Audio Events Detection Based Highlights Extraction from Baseball, Golf and Soccer Games in a Unified Framework
- Hong Kong, April
- Xiong Z., Radhakrishnan R., Divakaran A. and Huang T. S. (2003) "Audio Events Detection Based Highlights Extraction from Baseball, Golf and Soccer Games in a Unified Framework", Proceedings ICASSP 2003, vol. 5, pp. 632-635, Hong Kong, April.
- (2003) Proceedings ICASSP 2003 , vol.5 , pp. 632-635
- Xiong, Z.¹ Radhakrishnan, R.² Divakaran, A.³ Huang, T.S.⁴

214
- 85009160774
- An Improved Model-Based Speaker Segmentation System
- Proceedings EUROSPEECH 2003, Geneva, Switzerland, September
- Yu P., Seide F., Ma C. and Chang E. (2003) "An Improved Model-Based Speaker Segmentation System", Proceedings EUROSPEECH 2003, Geneva, Switzerland, September.
- (2003)
- Yu, P.¹ Seide, F.² Ma, C.³ Chang, E.⁴

215
- 85009089453
- Unsupervised Audio Stream Segmentation and Clustering via the Bayesian Information Criterion
- Proceedings ICSLP 2000, Beijing, China, October
- Zhou B. W. and Hansen J. H. L. (2000) "Unsupervised Audio Stream Segmentation and Clustering via the Bayesian Information Criterion", Proceedings ICSLP 2000, Beijing, China, October.
- (2000)
- Zhou, B.W.¹ Hansen, J.H.L.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.