SCOPUS Information Search Platform

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Volume , Issue , 2013, Pages 2282-2286

A blind segmentation approach to acoustic event detection based on I-vector

(5) Huang, Zhen (a); Cheng, You Chi (a); Li, Kehuang (a); Hautamäki, Ville (a, b); Lee, Chin Hui (a)

(a) GEORGIA INSTITUTE OF TECHNOLOGY (United States)

(b) UNIVERSITY OF EASTERN FINLAND (Finland)

Author keywords

Acoustic event detection; Blind segmentation; I vector; Maximal figure of merit; Support vector machine

Indexed keywords

AUDIO ACOUSTICS; HIDDEN MARKOV MODELS; SUPPORT VECTOR MACHINES; VECTORS;

ACOUSTIC EVENT DETECTIONS; ACOUSTIC EVENTS; AUTOMATIC IMAGE ANNOTATION; BLIND SEGMENTATION; CONVENTIONAL APPROACH; EVENT BOUNDARY; I VECTORS; MAXIMAL FIGURE OF MERIT;

IMAGE RETRIEVAL;

EID: 84906272598 PISSN: 2308457X EISSN: 19909772 Source Type: Conference Proceeding
DOI: None Document Type: Conference Paper

Times cited : (34)

References (32)

1
- 77955558847
- Real-world acoustic event detection
- X. Zhuang, X. Zhou, M. A. Hasegawa-Johnson, and T. S. Huang, "Real-world acoustic event detection, " Pattern Recognition Letters, vol. 31, no. 12, pp. 1543-1551, 2010.
- (2010) Pattern Recognition Letters , vol.31 , Issue.12 , pp. 1543-1551
- Zhuang, X.¹ Zhou, X.² Hasegawa-Johnson, M.A.³ Huang, T.S.⁴

2
- 84905269591
- 2011 Multimedia event detection: Late-fusion approaches to combine multiple audio-visual features
- A. G. A. Perera, S. Oh, M. Leotta, I. Kim, B. Byun, C.-H. Lee, S. McCloskey, J. Liu, B. Miller, Z. F. Huan, A. Vahdat, W. Yang, G. Mori, K. Tang, D. Koller, L. Fei-Fei, K. Li, G. Chen, J. Corso, Y. Fu, and R. Srihari, "2011 Multimedia Event Detection: Late-Fusion Approaches to Combine Multiple Audio-Visual features, " in Proc. NIST TRECVID Workshop, 2011.
- (2011) Proc. NIST TRECVID Workshop
- Perera, A.G.A.¹ Oh, S.² Leotta, M.³ Kim, I.⁴ Byun, B.⁵ Lee, C.-H.⁶ McCloskey, S.⁷ Liu, J.⁸ Miller, B.⁹ Huan, Z.F.¹⁰ Vahdat, A.¹¹ Yang, W.¹² Mori, G.¹³ Tang, K.¹⁴ Koller, D.¹⁵ Fei-Fei, L.¹⁶ Li, K.¹⁷ Chen, G.¹⁸ Corso, J.¹⁹ Fu, Y.²⁰ Srihari, R.²¹ more..

3
- 84905233993
- TokyoTech+ Canon at TRECVID 2011
- N. Inoue, Y. Kamishima, T. Wada, K. Shinoda, and S. Sato, "TokyoTech+ Canon at TRECVID 2011, " in Proc. NIST TRECVID Workshop, 2011.
- (2011) Proc. NIST TRECVID Workshop
- Inoue, N.¹ Kamishima, Y.² Wada, T.³ Shinoda, K.⁴ Sato, S.⁵

4
- 84867614198
- Audio event detection from acoustic unit occurrence patterns
- IEEE
- A. Kumar, P. Dighe, R. Singh, S. Chaudhuri, and B. Raj, "Audio event detection from acoustic unit occurrence patterns, " in Proc. ICASSP. IEEE, 2012, pp. 489-492.
- (2012) Proc. ICASSP , pp. 489-492
- Kumar, A.¹ Dighe, P.² Singh, R.³ Chaudhuri, S.⁴ Raj, B.⁵

5
- 11244272075
- Highlight sound effects detection in audio stream
- IEEE
- R. Cai, L. Lu, H.-J. Zhang, and L.-H. Cai, "Highlight sound effects detection in audio stream, " in Proc. ICME, vol. 3. IEEE, 2003, pp. III-37.
- (2003) Proc. ICME , vol.3
- Cai, R.¹ Lu, L.² Zhang, H.-J.³ Cai, L.-H.⁴

6
- 51449101221
- Feature analysis and selection for acoustic event detection
- X. Zhuang, X. Zhou, T. S. Huang, and M. Hasegawa-Johnson, "Feature analysis and selection for acoustic event detection, " in Proc. ICASSP. IEEE, 2008, pp. 17-20.
- (2008) Proc. ICASSP. IEEE , pp. 17-20
- Zhuang, X.¹ Zhou, X.² Huang, T.S.³ Hasegawa-Johnson, M.⁴

7
- 84878582006
- Consumer-level multimedia event detection through unsupervised audio signal modeling
- B. Byun, I. Kim, S. M. Siniscalchi, and C.-H. Lee, "Consumer-level multimedia event detection through unsupervised audio signal modeling, " in Proc. INTERSPEECH, 2012.
- (2012) Proc. INTERSPEECH
- Byun, B.¹ Kim, I.² Siniscalchi, S.M.³ Lee, C.-H.⁴

8
- 0024610919
- A tutorial on hidden Markov models and selected applications in speech recognition
- L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition, " Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.
- (1989) Proceedings of the IEEE , vol.77 , Issue.2 , pp. 257-286
- Rabiner, L.R.¹

9
- 33947632983
- Automatic image annotation through multi-topic text categorization
- IEEE
- S. Gao, D.-H. Wang, and C.-H. Lee, "Automatic image annotation through multi-topic text categorization, " in Proc. ICASSP, vol. 2. IEEE, 2006, pp. II-II.
- (2006) Proc. ICASSP , vol.2
- Gao, S.¹ Wang, D.-H.² Lee, C.-H.³

10
- 77951957024
- An incremental learning framework combining sample confidence and discrimination with an application to automatic image annotation
- IEEE
- B. Byun and C.-H. Lee, "An incremental learning framework combining sample confidence and discrimination with an application to automatic image annotation, " in Proc. ICIP. IEEE, 2009, pp. 1441-1444.
- (2009) Proc. ICIP , pp. 1441-1444
- Byun, B.¹ Lee, C.-H.²

11
- 70349213510
- A hierarchical grid feature representation framework for automatic image annotation
- IEEE
- I. Kim and C.-H. Lee, "A hierarchical grid feature representation framework for automatic image annotation, " in Proc. ICASSP. IEEE, 2009, pp. 1125-1128.
- (2009) Proc. ICASSP , pp. 1125-1128
- Kim, I.¹ Lee, C.-H.²

12
- 34547502608
- A vector space modeling approach to spoken language identification
- H. Li, B. Ma, and C.-H. Lee, "A vector space modeling approach to spoken language identification, " IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 1, pp. 271-284, 2007.
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.1 , pp. 271-284
- Li, H.¹ Ma, B.² Lee, C.-H.³

13
- 14344255188
- A MFoM learning approach to robust multiclass multi-label text categorization
- ACM
- S. Gao, W. Wu, C.-H. Lee, and T.-S. Chua, "A MFoM learning approach to robust multiclass multi-label text categorization, " in Proc. ICML. ACM, 2004, p. 42.
- (2004) Proc. ICML , p. 42
- Gao, S.¹ Wu, W.² Lee, C.-H.³ Chua, T.-S.⁴

14
- 79956286980
- A regularized maximum figure-of-merit (rMFoM) approach to supervised and semi-supervised learning
- C. Ma and C.-H. Lee, "A Regularized Maximum Figure-of-Merit (rMFoM) Approach to Supervised and Semi-Supervised Learning, " IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 5, pp. 1316-1327, 2011.
- (2011) IEEE Transactions on Audio, Speech, and Language Processing , vol.19 , Issue.5 , pp. 1316-1327
- Ma, C.¹ Lee, C.-H.²

15
- 50249170027
- Joint factor analysis versus eigenchannels in speaker recognition
- P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, "Joint factor analysis versus eigenchannels in speaker recognition, " IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 4, pp. 1435-1447, 2007.
- (2007) IEEE Transactions on Audio, Speech, and Language Processing , vol.15 , Issue.4 , pp. 1435-1447
- Kenny, P.¹ Boulianne, G.² Ouellet, P.³ Dumouchel, P.⁴

16
- 70450180849
- Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification
- N. Dehak, R. Dehak, P. Kenny, N. Brummer, P. Ouellet, and P. Dumouchel, "Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification, " in Proc. Interspeech, 2009, pp. 1559-1562.
- (2009) Proc. Interspeech , pp. 1559-1562
- Dehak, N.¹ Dehak, R.² Kenny, P.³ Brummer, N.⁴ Ouellet, P.⁵ Dumouchel, P.⁶

17
- 85073247582
- Variational Bayes logistic regression as regularized fusion for NIST SRE 2010
- V. Hautamäki, K. A. Lee, A. Larcher, T. Kinnunen, B. Ma, and H. Li, "Variational Bayes logistic regression as regularized fusion for NIST SRE 2010, " in Proc. Odyssey: The Speaker and Language Recognition Workshop, 2012.
- (2012) Proc. Odyssey: The Speaker and Language Recognition Workshop
- Hautamäki, V.¹ Lee, K.A.² Larcher, A.³ Kinnunen, T.⁴ Ma, B.⁵ Li, H.⁶

18
- 79951609039
- Front-end factor analysis for speaker verification
- N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification, " IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 4, pp. 788-798, 2011.
- (2011) IEEE Transactions on Audio, Speech, and Language Processing , vol.19 , Issue.4 , pp. 788-798
- Dehak, N.¹ Kenny, P.J.² Dehak, R.³ Dumouchel, P.⁴ Ouellet, P.⁵

19
- 0038959172
- Probabilistic principal component analysis
- M. E. Tipping and C. M. Bishop, "Probabilistic principal component analysis, " Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 61, no. 3, pp. 611-622, 1999.
- (1999) Journal of the Royal Statistical Society: Series B (Statistical Methodology) , vol.61 , Issue.3 , pp. 611-622
- Tipping, M.E.¹ Bishop, C.M.²

20
- 84865733857
- Analysis of i-vector length normalization in speaker recognition systems
- D. Garcia-Romero and C. Y. Espy-Wilson, "Analysis of i-vector length normalization in speaker recognition systems, " in Proc. International Conference on Speech Communication and Technology, 2011, pp. 249-252.
- (2011) Proc. International Conference on Speech Communication and Technology , pp. 249-252
- Garcia-Romero, D.¹ Espy-Wilson, C.Y.²

21
- 18744386134
- Eigenvoice modeling with sparse training data
- P. Kenny, G. Boulianne, and P. Dumouchel, "Eigenvoice modeling with sparse training data, " IEEE Transactions on Speech and Audio Processing, vol. 13, no. 3, pp. 345-354, 2005.
- (2005) IEEE Transactions on Speech and Audio Processing , vol.13 , Issue.3 , pp. 345-354
- Kenny, P.¹ Boulianne, G.² Dumouchel, P.³

22
- 84906274305
- Springer New York
- C. M. Bishop et al., Pattern recognition and machine learning. Springer New York, 2006, vol. 4, no. 4.
- (2006) Pattern Recognition and Machine Learning , vol.4 , Issue.4
- Bishop, C.M.¹

23
- 0002629270
- Maximum likelihood from incomplete data via the EM algorithm
- Series B (Methodological)
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm, " Journal of the Royal Statistical Society. Series B (Methodological), pp. 1-38, 1977.
- (1977) Journal of the Royal Statistical Society , pp. 1-38
- Dempster, A.P.¹ Laird, N.M.² Rubin, D.B.³

24
- 85084012167
- ALIZE/spkdet: A state-of-the-art open source software for speaker recognition
- J.-F. Bonastre, N. Scheffer, D. Matrouf, C. Fredouille, A. Larcher, A. Preti, G. Pouchoulin, N. Evans, B. Fauve, and J. Mason, "ALIZE/SpkDet: A state-of-the-art open source software for speaker recognition, " in Proc. Odyssey: The Speaker and Language Recognition Workshop, 2008.
- (2008) Proc. Odyssey: The Speaker and Language Recognition Workshop
- Bonastre, J.-F.¹ Scheffer, N.² Matrouf, D.³ Fredouille, C.⁴ Larcher, A.⁵ Preti, A.⁶ Pouchoulin, G.⁷ Evans, N.⁸ Fauve, B.⁹ Mason, J.¹⁰

25
- 34249753618
- Support-vector networks
- C. Cortes and V. Vapnik, "Support-vector networks, " Machine learning, vol. 20, no. 3, pp. 273-297, 1995.
- (1995) Machine Learning , vol.20 , Issue.3 , pp. 273-297
- Cortes, C.¹ Vapnik, V.²

26
- 84906260904
- TRECVID 2012 GENIE: Multimedia event detection and recounting
- A. Vahdat, K. Cannons, H. Hajimirsadeghi, G. Mori, S. McCloskey, B. Miller, S. Venkatesha, P. Davalos, P. Das, C. Xu et al., "TRECVID 2012 GENIE: Multimedia event detection and recounting, " in Proc. NIST TRECVID Workshop, 2012.
- (2012) Proc. NIST TRECVID Workshop
- Vahdat, A.¹ Cannons, K.² Hajimirsadeghi, H.³ Mori, G.⁴ McCloskey, S.⁵ Miller, B.⁶ Venkatesha, S.⁷ Davalos, P.⁸ Das, P.⁹ Xu, C.¹⁰

27
- 69949113988
- An experimental study on discriminative concept classifier combination for TRECVID high-level feature extraction
- B. Byun, C. Ma, and C.-H. Lee, "An experimental study on discriminative concept classifier combination for TRECVID high-level feature extraction, " in Proc. ICIP. IEEE, 2008, pp. 2532-2535.
- (2008) Proc. ICIP. IEEE , pp. 2532-2535
- Byun, B.¹ Ma, C.² Lee, C.-H.³

28
- 79955702502
- LIBSVM: A library for support vector machines
- 27:1-27:27, software available at
- C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines, " ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 27:1-27:27, 2011, software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
- (2011) ACM Transactions on Intelligent Systems and Technology , vol.2
- Chang, C.-C.¹ Lin, C.-J.²

29
- 0003822743
- Cambridge University Engineering Department
- S. Young, G. Evermann, D. Kershaw, G. Moore, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, "The HTK book, " Cambridge University Engineering Department, vol. 3, 2002.
- (2002) The HTK Book , vol.3
- Young, S.¹ Evermann, G.² Kershaw, D.³ Moore, G.⁴ Odell, J.⁵ Ollason, D.⁶ Valtchev, V.⁷ Woodland, P.⁸

30
- 0035506942
- Comparison of different implementations of MFCC
- F. Zheng, G. Zhang, and Z. Song, "Comparison of different implementations of MFCC, " Journal of Computer Science and Technology, vol. 16, no. 6, pp. 582-589, 2001.
- (2001) Journal of Computer Science and Technology , vol.16 , Issue.6 , pp. 582-589
- Zheng, F.¹ Zhang, G.² Song, Z.³

31
- 84866410482
- Searching for sounds: A demonstration of FindSounds.com and FindSounds Palette
- S. V. Rice and S. M. Bailey, "Searching for sounds: A demonstration of FindSounds.com and FindSounds Palette, " in Proc. the International Computer Music Conference, 2004, pp. 215-218.
- (2004) Proc. The International Computer Music Conference , pp. 215-218
- Rice, S.V.¹ Bailey, S.M.²

32
- 84890497765
- The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
- H.-G. Hirsch and D. Pearce, "The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, " in ASR2000-Automatic Speech Recognition: Challenges for the new Millenium ISCA Tutorial and Research Workshop (ITRW), 2000.
- (2000) ASR2000-Automatic Speech Recognition: Challenges for the New Millenium ISCA Tutorial and Research Workshop (ITRW)
- Hirsch, H.-G.¹ Pearce, D.²

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.