



Volume 19, Issue 3, 2016, Pages 631-645

Robust acoustic bird recognition for habitat monitoring with wireless sensor networks

Author keywords

Birdsong recognition; Deep neural network; GTECC; Noise power estimation; QCN; WSN

Indexed keywords

ACOUSTIC NOISE; BIRDS; COMPLEX NETWORKS; ECONOMIC AND SOCIAL EFFECTS; ECOSYSTEMS; ENERGY UTILIZATION; SENSOR NODES;

EID: 84980009672     PISSN: 13812416     EISSN: 15728110     Source Type: Journal    
DOI: 10.1007/s10772-016-9354-4     Document Type: Article
Times cited : (21)

References (59)
  • 2
    • 77955555508
    • Detecting bird sounds in a complex acoustic environment and application to bioacoustic monitoring
    • Bardeli, R., Wolff, D., Kurth, F., Koch, M., Tauchert, K. H., & Frommolt, K. H. (2010). Detecting bird sounds in a complex acoustic environment and application to bioacoustic monitoring. Pattern Recognition Letters, 31(12), 1524–1534.
    • (2010) Pattern Recognition Letters , vol.31 , Issue.12 , pp. 1524-1534
    • Bardeli, R.1    Wolff, D.2    Kurth, F.3    Koch, M.4    Tauchert, K.H.5    Frommolt, K.H.6
  • 4
    • 77955734646
    • Unsupervised equalization of Lombard effect for speech recognition in noisy adverse environments
    • Bořil, H., & Hansen, J. H. (2010). Unsupervised equalization of Lombard effect for speech recognition in noisy adverse environments. Audio, Speech, and Language Processing, IEEE Transactions on, 18(6), 1379–1393.
    • (2010) Audio, Speech, and Language Processing, IEEE Transactions on , vol.18 , Issue.6 , pp. 1379-1393
    • Bořil, H.1    Hansen, J.H.2
  • 5
    • 80051656187
    • UT-Scope: Towards LVCSR under Lombard effect induced by varying types and levels of noisy background
    • Bořil, H., & Hansen, J. H. (2011). UT-Scope: Towards LVCSR under Lombard effect induced by varying types and levels of noisy background. In Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on (pp. 4472–4475). IEEE.
    • (2011) Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on , pp. 4472-4475
    • Bořil, H.1    Hansen, J.H.2
  • 6
    • 2342565130
    • The impact of environmental noise on song amplitude in a territorial bird
    • Brumm, H. (2004). The impact of environmental noise on song amplitude in a territorial bird. Journal of Animal Ecology, 73(3), 434–440.
    • (2004) Journal of Animal Ecology , vol.73 , Issue.3 , pp. 434-440
    • Brumm, H.1
  • 7
    • 84893649633
    • Diurnal activity budget and breeding ecology of the White-headed Duck Oxyura leucocephala at Lake Tonga (North-east Algeria)
    • Chettibi, F., Khelifa, R., Aberkane, M., Bouslama, Z., & Houhamdi, M. (2013). Diurnal activity budget and breeding ecology of the White-headed Duck Oxyura leucocephala at Lake Tonga (North-east Algeria). Zoology and Ecology, 23(3), 183–190.
    • (2013) Zoology and Ecology , vol.23 , Issue.3 , pp. 183-190
    • Chettibi, F.1    Khelifa, R.2    Aberkane, M.3    Bouslama, Z.4    Houhamdi, M.5
  • 8
    • 80051661572
    • Noise robust bird song detection using syllable pattern-based hidden Markov models
    • Chu, W., & Blumstein, D. T. (2011). Noise robust bird song detection using syllable pattern-based hidden Markov models. In Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on (pp. 345–348). IEEE.
    • (2011) Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on , pp. 345-348
    • Chu, W.1    Blumstein, D.T.2
  • 9
    • 84861776914
    • Multi-column deep neural network for traffic sign classification
    • Cireşan, D., Meier, U., Masci, J., & Schmidhuber, J. (2012). Multi-column deep neural network for traffic sign classification. Neural Networks, 32, 333–338.
    • (2012) Neural Networks , vol.32 , pp. 333-338
    • Cireşan, D.1    Meier, U.2    Masci, J.3    Schmidhuber, J.4
  • 10
    • 0041360463
    • Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging
    • Cohen, I. (2003). Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging. Speech and Audio Processing, IEEE Transactions on, 11(5), 466–475.
    • (2003) Speech and Audio Processing, IEEE Transactions on , vol.11 , Issue.5 , pp. 466-475
    • Cohen, I.1
  • 12
    • 0019053271
    • Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
    • Davis, S. B., & Mermelstein, P. (1980). Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. Acoustics, Speech and Signal Processing, IEEE Transactions on, 28(4), 357–366.
    • (1980) Acoustics, Speech and Signal Processing, IEEE Transactions on , vol.28 , Issue.4 , pp. 357-366
    • Davis, S.B.1    Mermelstein, P.2
  • 14
    • 84890526837
    • New types of deep neural network learning for speech recognition and related applications: An overview
    • Deng, L., Hinton, G., & Kingsbury, B. (2013). New types of deep neural network learning for speech recognition and related applications: An overview. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on (pp. 8599–8603). IEEE.
    • (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on , pp. 8599-8603
    • Deng, L.1    Hinton, G.2    Kingsbury, B.3
  • 16
    • 85015555767
    • A nonlinear unsupervised adaptation technique for speech recognition
    • Dharanipragada, S., & Padmanabhan, M. (2000). A nonlinear unsupervised adaptation technique for speech recognition. In INTERSPEECH (pp. 556–559).
    • (2000) INTERSPEECH , pp. 556-559
    • Dharanipragada, S.1    Padmanabhan, M.2
  • 18
    • 0028312802
    • Auditory models and human performance in tasks related to speech coding and speech recognition
    • Ghitza, O. (1994). Auditory models and human performance in tasks related to speech coding and speech recognition. Speech and Audio Processing, IEEE Transactions on, 2(1), 115–132.
    • (1994) Speech and Audio Processing, IEEE Transactions on , vol.2 , Issue.1 , pp. 115-132
    • Ghitza, O.1
  • 19
    • 0025110885
    • Derivation of auditory filter shapes from notched-noise data
    • Glasberg, B. R., & Moore, B. C. (1990). Derivation of auditory filter shapes from notched-noise data. Hearing Research, 47(1), 103–138.
    • (1990) Hearing Research , vol.47 , Issue.1 , pp. 103-138
    • Glasberg, B.R.1    Moore, B.C.2
  • 21
    • 34047249084
    • Quantile based histogram equalization for noise robust large vocabulary speech recognition
    • Hilger, F., & Ney, H. (2006). Quantile based histogram equalization for noise robust large vocabulary speech recognition. Audio, Speech, and Language Processing, IEEE Transactions on, 14(3), 845–854.
    • (2006) Audio, Speech, and Language Processing, IEEE Transactions on , vol.14 , Issue.3 , pp. 845-854
    • Hilger, F.1    Ney, H.2
  • 22
    • 84980638102
    • A wireless embedded sensor architecture for system-level optimization
    • Hill, J., & Culler, D. (2002). A wireless embedded sensor architecture for system-level optimization. UC Berkeley Technical Report.
    • (2002) UC Berkeley Technical Report
    • Hill, J.1    Culler, D.2
  • 23
    • 85032751458
    • Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
    • Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A. R., Jaitly, N., et al. (2012). Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6), 82–97.
    • (2012) IEEE Signal Processing Magazine , vol.29 , Issue.6 , pp. 82-97
    • Hinton, G.1    Deng, L.2    Yu, D.3    Dahl, G.E.4    Mohamed, A.R.5    Jaitly, N.6    Kingsbury, B.7
  • 25
    • 79959196015
    • Automatic detection and recognition of tonal bird sounds in noisy environments
    • Jančovič, P., & Köküer, M. (2011). Automatic detection and recognition of tonal bird sounds in noisy environments. EURASIP Journal on Advances in Signal Processing, 2011(1), 982936.
    • (2011) EURASIP Journal on Advances in Signal Processing , vol.2011 , Issue.1 , pp. 982936
    • Jančovič, P.1    Köküer, M.2
  • 26
    • 0025635254
    • On a simple algorithm to calculate the 'energy' of a signal
    • Kaiser, J. F. (1990). On a simple algorithm to calculate the 'energy' of a signal. In Acoustics, Speech, and Signal Processing, 1988. ICASSP-88., 1988 International Conference on (pp. 381–384).
    • (1990) Acoustics, Speech, and Signal Processing, 1988. ICASSP-88., 1988 International Conference on , pp. 381-384
    • Kaiser, J.F.1
  • 27
    • 84867201503
    • Robust signal-to-noise ratio estimation based on waveform amplitude distribution analysis
    • Kim, C., & Stern, R. M. (2008). Robust signal-to-noise ratio estimation based on waveform amplitude distribution analysis. In INTERSPEECH (pp. 2598–2601).
    • (2008) INTERSPEECH , pp. 2598-2601
    • Kim, C.1    Stern, R.M.2
  • 28
    • 84867608537
    • Power-normalized cepstral coefficients (PNCC) for robust speech recognition
    • Kim, C., & Stern, R. M. (2012). Power-normalized cepstral coefficients (PNCC) for robust speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on (pp. 4101–4104). IEEE.
    • (2012) Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on , pp. 4101-4104
    • Kim, C.1    Stern, R.M.2
  • 35
    • 84856061530
    • Novel variable length Teager energy based features for person recognition from their hum
    • Patil, H., & Parhi, K. K. (2010). Novel variable length Teager energy based features for person recognition from their hum. In Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on (pp. 4526–4529). IEEE.
    • (2010) Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on , pp. 4526-4529
    • Patil, H.1    Parhi, K.K.2
  • 37
    • 84890507079
    • Methods for classification of nocturnal migratory bird vocalizations using Pseudo Wigner-Ville Transform
    • Patti, A., & Williamson, G. (2013). Methods for classification of nocturnal migratory bird vocalizations using Pseudo Wigner-Ville Transform. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on (pp. 758–762). IEEE.
    • (2013) Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on , pp. 758-762
    • Patti, A.1    Williamson, G.2
  • 38
    • 84921500946
    • Unsupervised dictionary extraction of bird vocalisations and new tools on assessing and visualising bird activity
    • Potamitis, I. (2015). Unsupervised dictionary extraction of bird vocalisations and new tools on assessing and visualising bird activity. Ecological Informatics, 26, 6–17.
    • (2015) Ecological Informatics , vol.26 , pp. 6-17
    • Potamitis, I.1
  • 39
    • 84893699916
    • Improved cepstral mean and variance normalization using Bayesian framework
    • Prasad, N. V., & Umesh, S. (2013). Improved cepstral mean and variance normalization using Bayesian framework. In Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on (pp. 156–161). IEEE.
    • (2013) Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on , pp. 156-161
    • Prasad, N.V.1    Umesh, S.2
  • 40
    • 85011421605
    • Automatic recognition of bird individuals on an open set using as-is recordings
    • Ptacek, L., Machlica, L., Linhart, P., Jaska, P., & Muller, L. (2015). Automatic recognition of bird individuals on an open set using as-is recordings. Bioacoustics, 25(1), 1–19.
    • (2015) Bioacoustics , vol.25 , Issue.1 , pp. 1-19
    • Ptacek, L.1    Machlica, L.2    Linhart, P.3    Jaska, P.4    Muller, L.5
  • 41
    • 29444448046
    • A noise-estimation algorithm for highly non-stationary environments
    • Rangachari, S., & Loizou, P. C. (2006). A noise-estimation algorithm for highly non-stationary environments. Speech Communication, 48(2), 220–231.
    • (2006) Speech Communication , vol.48 , Issue.2 , pp. 220-231
    • Rangachari, S.1    Loizou, P.C.2
  • 42
    • 84867605860
    • A comparison of front-end compensation strategies for robust LVCSR under room reverberation and increased vocal effort
    • Sadjadi, S. O., Bořil, H., & Hansen, J. H. (2012). A comparison of front-end compensation strategies for robust LVCSR under room reverberation and increased vocal effort. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on (pp. 4701–4704). IEEE.
    • (2012) Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on , pp. 4701-4704
    • Sadjadi, S.O.1    Bořil, H.2    Hansen, J.H.3
  • 44
    • 84875405186
    • Exploiting deep neural networks for detection-based speech recognition
    • Siniscalchi, S. M., Yu, D., Deng, L., & Lee, C. H. (2013). Exploiting deep neural networks for detection-based speech recognition. Neurocomputing, 106, 148–157.
    • (2013) Neurocomputing , vol.106 , pp. 148-157
    • Siniscalchi, S.M.1    Yu, D.2    Deng, L.3    Lee, C.H.4
  • 45
    • 0004213132
    • Auditory toolbox
    • Slaney, M. (1998). Auditory toolbox. Interval Research Corporation, Technical Report (Vol. 10).
    • (1998) Interval Research Corporation, Technical Report , vol.10
    • Slaney, M.1
  • 46
    • 84980702947
    • Contributions à l’étude des réseaux sociaux: propagation, fouille, collecte de données
    • Stattner, E. (2012). Contributions à l’étude des réseaux sociaux: propagation, fouille, collecte de données (Doctoral dissertation, Université des Antilles-Guyane).
    • (2012) Doctoral dissertation, Université des Antilles-Guyane
    • Stattner, E.1
  • 48
    • 84905690760
    • Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning
    • Stowell, D., & Plumbley, M. D. (2014). Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning. PeerJ, 2, e488.
    • (2014) PeerJ , vol.2 , pp. e488
    • Stowell, D.1    Plumbley, M.D.2
  • 52
    • 0037445322
    • Preprocessing in a Tiered Sensor Network for Habitat Monitoring
    • Wang, H., Estrin, D., & Girod, L. (2003). Preprocessing in a Tiered Sensor Network for Habitat Monitoring. EURASIP Journal on Applied Signal Processing, 4, 392–401.
    • (2003) EURASIP Journal on Applied Signal Processing , vol.4 , pp. 392-401
    • Wang, H.1    Estrin, D.2    Girod, L.3
  • 53
    • 80051605016
    • Audio recognition in the wild: Static and dynamic classification on a real-world database of animal vocalizations
    • Weninger, F., & Schuller, B. (2011). Audio recognition in the wild: Static and dynamic classification on a real-world database of animal vocalizations. In Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on (pp. 337–340). IEEE.
    • (2011) Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on , pp. 337-340
    • Weninger, F.1    Schuller, B.2
  • 54
    • 37649022051
    • A new perceptually motivated MVDR-based acoustic front-end (PMVDR) for robust automatic speech recognition
    • Yapanel, U. H., & Hansen, J. H. (2008). A new perceptually motivated MVDR-based acoustic front-end (PMVDR) for robust automatic speech recognition. Speech Communication, 50(2), 142–152.
    • (2008) Speech Communication , vol.50 , Issue.2 , pp. 142-152
    • Yapanel, U.H.1    Hansen, J.H.2
  • 55
    • 84957659835
    • Noise estimation based on soft decisions and conditional smoothing for speech enhancement
    • Yong, P. C., Nordholm, S., & Dam, H. H. (2012). Noise estimation based on soft decisions and conditional smoothing for speech enhancement. In Acoustic Signal Enhancement; Proceedings of IWAENC 2012; International Workshop on (pp. 1–4). VDE.
    • (2012) Acoustic Signal Enhancement; Proceedings of IWAENC 2012; International Workshop on , pp. 1-4
    • Yong, P.C.1    Nordholm, S.2    Dam, H.H.3
  • 57
    • 70349225970
    • A low-complexity noise estimation algorithm based on smoothing of noise power estimation and estimation bias correction
    • Yu, R. (2009). A low-complexity noise estimation algorithm based on smoothing of noise power estimation and estimation bias correction. In Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on (pp. 4421–4424). IEEE.
    • (2009) Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on , pp. 4421-4424
    • Yu, R.1
  • 58
    • 84980658479
    • U.S. Patent No. 9,235,799
    • Yu, D., Deng, L., Seide, F. T. B., & Li, G. (2016). U.S. Patent No. 9,235,799. Washington, DC: U.S. Patent and Trademark Office.
    • (2016) Washington, DC: U.S. Patent and Trademark Office
    • Yu, D.1    Deng, L.2    Seide, F.T.B.3    Li, G.4
  • 59
    • 84922887558
    • Adaptive energy detection for bird sound detection in complex environments
    • Zhang, X., & Li, Y. (2015). Adaptive energy detection for bird sound detection in complex environments. Neurocomputing, 155, 108–116.
    • (2015) Neurocomputing , vol.155 , pp. 108-116
    • Zhang, X.1    Li, Y.2


* This information was extracted by KISTI through analysis of Elsevier's SCOPUS database.