SCOPUS Information Retrieval Platform

IEEE Transactions on Audio, Speech and Language Processing

Volume 18, Issue 3, 2010, Pages 688-707

Segmentation, indexing, and retrieval for environmental and natural sounds

(5) Wichern, Gordon (a); Xue, Jiachen (a); Thornburg, Harvey (a); Mechtley, Brandon (a); Spanias, Andreas (a)

(a) ARIZONA STATE UNIVERSITY (United States)

Author keywords

Acoustic signal analysis; Acoustic signal detection; Bayes procedures; Clustering methods; Database query processing

Indexed keywords

ACOUSTIC SIGNAL ANALYSIS; ACOUSTIC SIGNAL DETECTION; BAYES PROCEDURE; CLUSTERING METHODS; DATABASE QUERY PROCESSING;

ACOUSTIC WAVES; AUDIO ACOUSTICS; AUDIO SYSTEMS; BAYESIAN NETWORKS; DATABASE SYSTEMS; HIDDEN MARKOV MODELS; INDEXING (OF INFORMATION); INFERENCE ENGINES; QUERY PROCESSING; SIGNAL ANALYSIS; SIGNAL DETECTION;

CLUSTERING ALGORITHMS;

EID: 76949085351 PISSN: 15587916 EISSN: None Source Type: Journal
DOI: 10.1109/TASL.2010.2041384 Document Type: Article

Times cited : (61)

References (50)

1
- 34548068861
- Rochester VT: Destiny Books
- R. Schafer, The Soundscape. Rochester, VT: Destiny Books, 1968.
- (1968) The Soundscape
- Schafer, R.¹

2
- 0008874464
- Norwood NJ: Ablex Publishing
- B. Truax, Acoustic Communication. Norwood, NJ: Ablex Publishing, 1984.
- (1984) Acoustic Communication
- Truax, B.¹

3
- 0003684441
- Cambridge MA: MIT Press
- A. Bregman, Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, MA: MIT Press, 1990.
- (1990) Auditory Scene Analysis: The Perceptual Organization of Sound
- Bregman, A.¹

4
- 33750550452
- Automatic surveillance of the acoustic activity in our living environment
- Amsterdam, The Netherlands Jul.
- A. Harma, M. F. McKinney, and J. Skowronek, "Automatic surveillance of the acoustic activity in our living environment," in IEEE Int. Conf. Multimedia and Expo, Amsterdam, The Netherlands, Jul. 2005.
- (2005) IEEE Int. Conf. Multimedia and Expo
- Harma, A.¹ McKinney, M.F.² Skowronek, J.³

5
- 0042033931
- Soundscapes in urban and rural planning and design-A brief communication of a research project
- P. Hedfors and P. Grahn, R. Schafer and H. Jarviluoma, Eds., "Soundscapes in urban and rural planning and design-A brief communication of a research project," in Northern Soundscapes: Yearbook of Soundscape Studies, vol.1, pp. 67-82.
- Northern Soundscapes: Yearbook of Soundscape Studies , vol.1 , pp. 67-82
- Hedfors, P.¹ Grahn, P.² Schafer, R.³ Jarviluoma, H.⁴ (Eds.)

6
- 14944367313
- Minimal-impact audio-based personal archives
- New York Oct.
- D. P. W. Ellis and K. Lee, "Minimal-impact audio-based personal archives," in Proc. 1st ACM Workshop Continuous Archiving and Recording of Personal Experiences CARPE-04, New York, Oct. 2004.
- (2004) Proc. 1st ACM Workshop Continuous Archiving and Recording of Personal Experiences CARPE-04
- Ellis, D.P.W.¹ Lee, K.²

7
- 33745204826
- MyLifeBits: A personal database for everything
- J. Gemmell, G. Bell, and R. Lueder, "MyLifeBits: A personal database for everything," Commun. ACM, vol.49, no.1, pp. 88-95, 2006.
- (2006) Commun. ACM , vol.49 , Issue.1 , pp. 88-95
- Gemmell, J.¹ Bell, G.² Lueder, R.³

8
- 0003444613
- Mahwah NJ: Lawrence Erlbaum Associates
- D. F. Rosenthal and H. G. Okuno, Computational Auditory Scene Analysis. Mahwah, NJ: Lawrence Erlbaum Associates, 1998.
- (1998) Computational Auditory Scene Analysis
- Rosenthal, D.F.¹ Okuno, H.G.²

9
- 0023831656
- A new statistical approach for automatic segmentation of continuous speech signals
- Jan.
- R. Andre-Obrecht, "A new statistical approach for automatic segmentation of continuous speech signals," IEEE Trans. Acoust., Speech, Signal Process., vol.36, no.1, pp. 29-40, Jan. 1988.
- (1988) IEEE Trans. Acoust., Speech, Signal Process. , vol.36 , Issue.1 , pp. 29-40
- Andre-Obrecht, R.¹

10
- 31844447985
- Ph.D. dissertation, Radboud Univ. of Nijmegen, Nijmegen, The Netherlands
- A. T. Cemgil, "Bayesian Music Transcription," Ph.D. dissertation, Radboud Univ. of Nijmegen, Nijmegen, The Netherlands, 2004.
- (2004) Bayesian Music Transcription
- Cemgil, A.T.¹

11
- 34250777622
- Ph.D. dissertation, Stanford Univ., Stanford, CA
- H. Thornburg, "Detection and modeling of transient audio signals with prior information," Ph.D. dissertation, Stanford Univ., Stanford, CA, 2005.
- (2005) Detection and Modeling of Transient Audio Signals with Prior Information
- Thornburg, H.¹

12
- 33947685019
- Audio elements based auditory scene segmentation
- Toulouse, France
- L. Lu, R. Cai, and A. Hanjalic, "Audio elements based auditory scene segmentation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Toulouse, France, 2006.
- (2006) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process.
- Lu, L.¹ Cai, R.² Hanjalic, A.³

13
- 46749124289
- Robust multi-feature segmentation and indexing for natural sound environments
- Bordeaux, France
- G. Wichern, H. Thornburg, B. Mechtley, A. Fink, K. Tu, and A. Spanias, "Robust multi-feature segmentation and indexing for natural sound environments," in Proc. IEEE/EURASIP Int. Workshop Content-Based Multimedia Indexing (CBMI), Bordeaux, France, 2007, pp. 69-76.
- (2007) Proc. IEEE/EURASIP Int. Workshop Content-Based Multimedia Indexing (CBMI) , pp. 69-76
- Wichern, G.¹ Thornburg, H.² Mechtley, B.³ Fink, A.⁴ Tu, K.⁵ Spanias, A.⁶

14
- 33846227904
- Automatic meeting segmentation using dynamic Bayesian networks
- Jan.
- A. Dielmann and S. Renals, "Automatic meeting segmentation using dynamic Bayesian networks," IEEE Trans. Multimedia, vol.9, no.1, pp. 25-36, Jan. 2007.
- (2007) IEEE Trans. Multimedia , vol.9 , Issue.1 , pp. 25-36
- Dielmann, A.¹ Renals, S.²

15
- 33646748325
- Modeling individual and group actions in meetings with layered HMMs
- May
- D. Zhang, D. Gatica-Perez, S. Bengio, and I. McCowan, "Modeling individual and group actions in meetings with layered HMMs," IEEE Trans. Multimedia, vol.8, no.3, pp. 509-520, May 2006.
- (2006) IEEE Trans. Multimedia , vol.8 , Issue.3 , pp. 509-520
- Zhang, D.¹ Gatica-Perez, D.² Bengio, S.³ McCowan, I.⁴

16
- 0004217877
- London U.K.: Butterworths
- C. J. V. Rijsbergen, Information Retrieval. London, U.K.: Butterworths, 1979.
- (1979) Information Retrieval
- Rijsbergen, C.J.V.¹

17
- 0001920633
- Melody spotting using hidden Markov models
- Bloomington, IN
- A. S. Durey and M. A. Clements, "Melody spotting using hidden Markov models," in Proc. Int. Symp. Music Inf. Retrieval (ISMIR), Bloomington, IN, 2001.
- (2001) Proc. Int. Symp. Music Inf. Retrieval (ISMIR)
- Durey, A.S.¹ Clements, M.A.²

18
- 0036989244
- HMM-based musical query retrieval
- Portland, OR
- J. Shifrin, B. Pardo, C. Meek, and W. Birmingham, "HMM-based musical query retrieval," in Proc. 2nd ACM/IEEE-CS Joint Conf. Digital Libraries, Portland, OR, 2002.
- (2002) Proc. 2nd ACM/IEEE-CS Joint Conf. Digital Libraries
- Shifrin, J.¹ Pardo, B.² Meek, C.³ Birmingham, W.⁴

19
- 0036991668
- Robust temporal and spectral modeling for query by melody
- Tampere, Finland
- S. Shalev-Shwartz, S. Dubnov, N. Friedman, and Y. Singer, "Robust temporal and spectral modeling for query by melody," in Proc. 25th Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, Tampere, Finland, 2002.
- (2002) Proc. 25th Annu. Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval
- Shalev-Shwartz, S.¹ Dubnov, S.² Friedman, N.³ Singer, Y.⁴

20
- 0030242072
- Content-based classification, search, and retrieval of audio
- E. Wold, T. Blum, D. Keislar, and J. Wheaton, "Content-based classification, search, and retrieval of audio," IEEE Multimedia, vol.3, no.3, pp. 27-36, 1996.
- (1996) IEEE Multimedia , vol.3 , Issue.3 , pp. 27-36
- Wold, E.¹ Blum, T.² Keislar, D.³ Wheaton, J.⁴

21
- 2942720260
- Features for audio and music classification
- Baltimore, MD Oct.
- M. F. McKinney and J. Breebaart, "Features for audio and music classification," in Proc. 4th Int. Conf. Music Inf. Retrieval, Baltimore, MD, Oct. 2003.
- (2003) Proc. 4th Int. Conf. Music Inf. Retrieval
- McKinney, M.F.¹ Breebaart, J.²

22
- 0036648502
- Musical genre classification of audio signals
- Jul.
- G. Tzanetakis and P. Cook, "Musical genre classification of audio signals," IEEE Trans. Speech Audio Process., vol.10, no.5, pp. 293-302, Jul. 2002.
- (2002) IEEE Trans. Speech Audio Process. , vol.10 , Issue.5 , pp. 293-302
- Tzanetakis, G.¹ Cook, P.²

23
- 84889435599
- West Sussex U.K.: Wiley
- H. G. Kim, N. Moreau, and T. Sikora, MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval. West Sussex, U.K.: Wiley, 2005.
- (2005) MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval
- Kim, H.G.¹ Moreau, N.² Sikora, T.³

24
- 0001481529
- Bark and ERB bilinear transforms
- Nov.
- J. O. Smith III and J. S. Abel, "Bark and ERB bilinear transforms," IEEE Trans. Speech Audio Process., vol.7, no.6, pp. 697-708, Nov. 1999.
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.6 , pp. 697-708
- Smith III, J.O.¹ Abel, J.S.²

25
- 0037418225
- Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization
- D. L. Donoho and M. Elad, "Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization," in Proc. National Academy Sci., 2003, vol. 100, no. 5, pp. 2197-2202.
- (2003) Proc. National Academy Sci. , vol.100 , Issue.5 , pp. 2197-2202
- Donoho, D.L.¹ Elad, M.²

26
- 76949109252
- Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Front-End Feature Extraction Algorithm; Compression Algorithms, ETSI ES 201 108 v1.1.3 (2003-2009), 2003, E.T.S.I. standard document
- Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Front-End Feature Extraction Algorithm; Compression Algorithms, ETSI ES 201 108 v1.1.3 (2003-2009), 2003, E.T.S.I. standard document.

27
- 0036214787
- Yin, a fundamental frequency estimator for speech and music
- A. de Cheveigne and H. Kawahara, "Yin, a fundamental frequency estimator for speech and music," J. Acoust. Soc. Amer., vol.111, no.4, pp. 1917-1930, 2002.
- (2002) J. Acoust. Soc. Amer. , vol.111 , Issue.4 , pp. 1917-1930
- De Cheveigne, A.¹ Kawahara, H.²

28
- 0015756315
- An optimum processor theory for the central formation of the pitch of complex tones
- J. L. Goldstein, "An optimum processor theory for the central formation of the pitch of complex tones," J. Acoust. Soc. Amer., vol.54, no.6, pp. 1496-1516, 1973.
- (1973) J. Acoust. Soc. Amer. , vol.54 , Issue.6 , pp. 1496-1516
- Goldstein, J.L.¹

29
- 23944465437
- A new probabilistic spectral pitch estimator: Exact and MCMC-approximate strategies
- U. K. Wiil, Ed. New York: Springer-Verlag
- H. Thornburg and R. J. Leistikow, "A new probabilistic spectral pitch estimator: Exact and MCMC-approximate strategies," in Lecture Notes in Computer Science 3310, U. K. Wiil, Ed. New York: Springer-Verlag, 2005.
- (2005) Lecture Notes in Computer Science 3310
- Thornburg, H.¹ Leistikow, R.J.²

30
- 47649111947
- Melody extraction and musical onset detection via probabilistic models of STFT peak data
- May
- H. Thornburg, R. Leistikow, and J. Berger, "Melody extraction and musical onset detection via probabilistic models of STFT peak data," IEEE Trans. Audio, Speech, Lang. Process., vol.15, no.4, pp. 1257-1272, May 2007.
- (2007) IEEE Trans. Audio, Speech, Lang. Process. , vol.15 , Issue.4 , pp. 1257-1272
- Thornburg, H.¹ Leistikow, R.² Berger, J.³

31
- 0003802315
- New York: Wiley
- F. Gustafsson, Adaptive Filtering and Change Detection. New York: Wiley, 2001.
- (2001) Adaptive Filtering and Change Detection
- Gustafsson, F.¹

32
- 46749144025
- A dynamic Bayesian network approach to tracking learned switching dynamic models
- Pittsburgh, PA
- V. Pavlovic, J. M. Rehg, and T. Cham, "A dynamic Bayesian network approach to tracking learned switching dynamic models," in Proc. Int. Workshop Hybrid Syst., Pittsburgh, PA, 2000.
- (2000) Proc. Int. Workshop Hybrid Syst.
- Pavlovic, V.¹ Rehg, J.M.² Cham, T.³

33
- 0002595416
- Speaker environment and channel change detection and clustering via the Bayesian information criterion
- S. Chen and P. Gopalakrishnan, "Speaker environment and channel change detection and clustering via the Bayesian information criterion," in Proc. DARPA Broadcast News Transcription and Understanding Workshop, 1998.
- (1998) Proc. DARPA Broadcast News Transcription and Understanding Workshop
- Chen, S.¹ Gopalakrishnan, P.²

34
- 33750392431
- Accessing minimal-impact personal audio archives
- Jul.
- D. Ellis and K. Lee, "Accessing minimal-impact personal audio archives," IEEE Multimedia, vol.13, no.4, pp. 30-38, Jul. 2006.
- (2006) IEEE Multimedia , vol.13 , Issue.4 , pp. 30-38
- Ellis, D.¹ Lee, K.²

35
- 70349471166
- Multi-channel audio segmentation for continuous observation and archival of large spaces
- Taipei, Taiwan
- G. Wichern, H. Thornburg, and A. Spanias, "Multi-channel audio segmentation for continuous observation and archival of large spaces," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Taipei, Taiwan, 2009, pp. 237-240.
- (2009) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 237-240
- Wichern, G.¹ Thornburg, H.² Spanias, A.³

36
- 0002165558
- Rao-Blackwellised particle filtering for dynamic Bayesian networks
- Stanford, CA
- A. Doucet, N. de Freitas, K. Murphy, and S. Russell, "Rao-Blackwellised particle filtering for dynamic Bayesian networks," in Proc. Conf. Uncertainty in Artif. Intell., Stanford, CA, 2000.
- (2000) Proc. Conf. Uncertainty in Artif. Intell.
- Doucet, A.¹ De Freitas, N.² Murphy, K.³ Russell, S.⁴

37
- 0016355478
- A new look at the statistical model identification
- Mar.
- H. Akaike, "A new look at the statistical model identification," IEEE Trans. Autom. Control, vol.AC-19, no.6, pp. 716-723, Mar. 1974.
- (1974) IEEE Trans. Autom. Control , vol.AC-19 , Issue.6 , pp. 716-723
- Akaike, H.¹

38
- 0042553279
- Smoothing and differentiation of data by simplified least squares procedures
- A. Savitzky and M. J. Golay, "Smoothing and differentiation of data by simplified least squares procedures," Anal. Chem., vol.36, no.8, pp. 1627-1639, 1964.
- (1964) Anal. Chem. , vol.36 , Issue.8 , pp. 1627-1639
- Savitzky, A.¹ Golay, M.J.²

39
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- Feb.
- L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol.77, no.2, pp. 257-286, Feb. 1989.
- (1989) Proc. IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

40
- 84899013108
- On spectral clustering: Analysis and an algorithm
- Vancouver, BC, Canada
- A. Y. Ng, M. Jordan, and Y. Weiss, "On spectral clustering: Analysis and an algorithm," in Adv. Neural Inf. Process. Syst., Vancouver, BC, Canada, 2002.
- (2002) Adv. Neural Inf. Process. Syst.
- Ng, A.Y.¹ Jordan, M.² Weiss, Y.³

41
- 0022018101
- A probabilistic distance measure for hidden Markov models
- B. H. Huang and L. R. Rabiner, "A probabilistic distance measure for hidden Markov models," AT&T Tech. J., vol.64, no.2, pp. 1251-1270, 1985.
- (1985) AT&T Tech. J. , vol.64 , Issue.2 , pp. 1251-1270
- Huang, B.H.¹ Rabiner, L.R.²

42
- 51449089882
- Fast query-by-example of environmental sounds via robust and efficient cluster-based indexing
- Las Vegas, NV
- J. Xue, G. Wichern, H. Thornburg, and A. S. Spanias, "Fast query-by-example of environmental sounds via robust and efficient cluster-based indexing," in IEEE Int. Conf. Acoust., Speech, Signal Process., Las Vegas, NV, 2008, pp. 5-8.
- (2008) IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 5-8
- Xue, J.¹ Wichern, G.² Thornburg, H.³ Spanias, A.S.⁴

43
- 33644783522
- Self-tuning spectral clustering
- Whistler, BC, Canada
- L. Zelnik-Manor and P. Perona, "Self-tuning spectral clustering," in Adv. Neural Inf. Process. Syst., Whistler, BC, Canada, 2004.
- (2004) Adv. Neural Inf. Process. Syst.
- Zelnik-Manor, L.¹ Perona, P.²

44
- 0000120766
- Estimating the dimension of a model
- G. Schwarz, "Estimating the dimension of a model," Ann. Statist., vol.6, no.2, pp. 461-464, 1978.
- (1978) Ann. Statist. , vol.6 , Issue.2 , pp. 461-464
- Schwarz, G.¹

45
- 0001450951
- The TREC spoken document retrieval track: A success story
- Gaithersburg, MD
- J. S. Garofolo, C. G. P. Auzanne, and E. M. Voorhees, "The TREC spoken document retrieval track: A success story," in Proc 8th Text REtrieval Conf. (TREC), Gaithersburg, MD, 1999.
- (1999) Proc 8th Text REtrieval Conf. (TREC)
- Garofolo, J.S.¹ Auzanne, C.G.P.² Voorhees, E.M.³

46
- 51449095148
- A tempo-insensitive distance measure for cover song identification based on chroma features
- Las Vegas, NV
- J. H. Jensen, M. G. Christensen, D. Ellis, and S. H. Jensen, "A tempo-insensitive distance measure for cover song identification based on chroma features," in IEEE Int. Conf. Acoust., Speech, Signal Process., Las Vegas, NV, 2008.
- (2008) IEEE Int. Conf. Acoust., Speech, Signal Process.
- Jensen, J.H.¹ Christensen, M.G.² Ellis, D.³ Jensen, S.H.⁴

47
- 34547505718
- Audio information retrieval using semantic similarity
- Honolulu, HI
- L. Barrington, A. Chan, D. Turnbull, and G. R. G. Lanckriet, "Audio information retrieval using semantic similarity," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Honolulu, HI, 2007, pp. 725-728.
- (2007) Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. , pp. 725-728
- Barrington, L.¹ Chan, A.² Turnbull, D.³ Lanckriet, G.R.G.⁴

48
- 50249158039
- Distortion-aware query by example for environmental sounds
- New Paltz, NY
- G. Wichern, J. Xue, H. Thornburg, and A. Spanias, "Distortion-aware query by example for environmental sounds," in Proc. IEEE Workshop the Applicat. Signal Process. Audio Acoust. (WASPAA), New Paltz, NY, 2007, pp. 335-338.
- (2007) Proc. IEEE Workshop the Applicat. Signal Process. Audio Acoust. (WASPAA) , pp. 335-338
- Wichern, G.¹ Xue, J.² Thornburg, H.³ Spanias, A.⁴

49
- 8644267670
- Conceptnet: A practical commonsense reasoning toolkit
- H. Liu and P. Singh, "Conceptnet: A practical commonsense reasoning toolkit," BT Technol. J., vol.22, no.4, pp. 211-226, 2004.
- (2004) BT Technol. J. , vol.22 , Issue.4 , pp. 211-226
- Liu, H.¹ Singh, P.²

50
- 76949096922
- Unifying semantic and content-based approaches for retrieval of environmental sounds
- New Paltz, NY
- G. Wichern, H. Thornburg, and A. Spanias, "Unifying semantic and content-based approaches for retrieval of environmental sounds," in Proc. IEEE Workshop Applicat. Signal Process. to Audio Acoust. (WASPAA), New Paltz, NY, 2009, pp. 13-16.
- (2009) Proc. IEEE Workshop Applicat. Signal Process. to Audio Acoust. (WASPAA) , pp. 13-16
- Wichern, G.¹ Thornburg, H.² Spanias, A.³

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS DB.