-
1
-
-
84967316016
-
A review of physical and perceptual feature extraction techniques for speech, music and environmental sounds
-
Alías, F., Socoró, J. C., and Sevillano, X. (2016). "A review of physical and perceptual feature extraction techniques for speech, music and environmental sounds," Appl. Sci. 6 (5), 143. 10.3390/app6050143
-
(2016)
Appl. Sci.
, vol.6
, Issue.5
, pp. 143
-
-
Alías, F.1
Socoró, J.C.2
Sevillano, X.3
-
2
-
-
43049174575
-
SURF: Speeded up robust features
-
Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L. (2008). "SURF: Speeded up robust features," Comput. Vis. Image Understand. 110 (3), 346-359. 10.1016/j.cviu.2007.09.014
-
(2008)
Comput. Vis. Image Understand.
, vol.110
, Issue.3
, pp. 346-359
-
-
Bay, H.1
Ess, A.2
Tuytelaars, T.3
Van Gool, L.4
-
3
-
-
0026471285
-
An efficient algorithm for the calculation of a constant Q transform
-
Brown, J. C., and Puckette, M. S. (1992). "An efficient algorithm for the calculation of a constant Q transform," J. Acoust. Soc. Am. 92 (5), 2698-2701. 10.1121/1.404385
-
(1992)
J. Acoust. Soc. Am.
, vol.92
, Issue.5
, pp. 2698-2701
-
-
Brown, J.C.1
Puckette, M.S.2
-
4
-
-
84982950203
-
Annotating multimedia/multi-modal resources with ELAN
-
developed at Max Planck Institute for Psycholinguistics, The Language Archive, Nijmegen, NL, (Last viewed September 26, 2017)
-
Brugman, H., and Russel, A. (2004). "Annotating multimedia/multi-modal resources with ELAN," in Proceedings of LREC 2004, Fourth International Conference on Language Resources and Evaluation, developed at Max Planck Institute for Psycholinguistics, The Language Archive, Nijmegen, NL, https://tla.mpi.nl/tools/tla-tools/elan/ (Last viewed September 26, 2017).
-
(2004)
Proceedings of LREC 2004, Fourth International Conference on Language Resources and Evaluation
-
-
Brugman, H.1
Russel, A.2
-
5
-
-
33645801332
-
Hierarchical automatic audio signal classification
-
Burred, J. J., and Lerch, A. (2004). "Hierarchical automatic audio signal classification," J. Audio Eng. Soc. 52 (7/8), 724-738, available at http://www.aes.org/e-lib/browse.cfm?elib=13015.
-
(2004)
J. Audio Eng. Soc.
, vol.52
, Issue.7-8
, pp. 724-738
-
-
Burred, J.J.1
Lerch, A.2
-
6
-
-
4644319109
-
The reliability and sensitivity to change of acoustic measures of voice quality
-
Carding, P. N., Steen, I. N., Webb, A., Mackenzie, K., Deary, I. J., and Wilson, J. A. (2004). "The reliability and sensitivity to change of acoustic measures of voice quality," Clin. Otolaryngol. 29 (5), 538-544. 10.1111/j.1365-2273.2004.00846.x
-
(2004)
Clin. Otolaryngol.
, vol.29
, Issue.5
, pp. 538-544
-
-
Carding, P.N.1
Steen, I.N.2
Webb, A.3
Mackenzie, K.4
Deary, I.J.5
Wilson, J.A.6
-
7
-
-
79955702502
-
LIBSVM: A library for support vector machines
-
Chang, C. C., and Lin, C. J. (2011). "LIBSVM: A library for support vector machines," ACM Trans. Intell. Syst. Tech. (TIST) 2 (3), 1-39. 10.1145/1961189.1961199
-
(2011)
ACM Trans. Intell. Syst. Tech. (TIST)
, vol.2
, Issue.3
, pp. 1-39
-
-
Chang, C.C.1
Lin, C.J.2
-
8
-
-
0036214787
-
YIN, a fundamental frequency estimator for speech and music
-
de Cheveigné, A., and Kawahara, H. (2002). "YIN, a fundamental frequency estimator for speech and music," J. Acoust. Soc. Am. 111 (4), 1917-1930. 10.1121/1.1458024
-
(2002)
J. Acoust. Soc. Am.
, vol.111
, Issue.4
, pp. 1917-1930
-
-
de Cheveigné, A.1
Kawahara, H.2
-
9
-
-
0030691985
-
Modeling auditory processing of amplitude modulation, I. Detection and masking with narrow-band carriers
-
Dau, T., Kollmeier, B., and Kohlrausch, A. (1997). "Modeling auditory processing of amplitude modulation, I. Detection and masking with narrow-band carriers," J. Acoust. Soc. Am. 102 (5), 2892-2905. 10.1121/1.420344
-
(1997)
J. Acoust. Soc. Am.
, vol.102
, Issue.5
, pp. 2892-2905
-
-
Dau, T.1
Kollmeier, B.2
Kohlrausch, A.3
-
11
-
-
84934898408
-
Modeling the perception of tempo
-
Elowsson, A., and Friberg, A. (2015). "Modeling the perception of tempo," J. Acoust. Soc. Am. 137, 3163-3177. 10.1121/1.4919306
-
(2015)
J. Acoust. Soc. Am.
, vol.137
, pp. 3163-3177
-
-
Elowsson, A.1
Friberg, A.2
-
12
-
-
85016561050
-
Predicting the perception of performed dynamics in music audio with ensemble learning
-
Elowsson, A., and Friberg, A. (2017). "Predicting the perception of performed dynamics in music audio with ensemble learning," J. Acoust. Soc. Am. 141, 2224-2242. 10.1121/1.4978245
-
(2017)
J. Acoust. Soc. Am.
, vol.141
, pp. 2224-2242
-
-
Elowsson, A.1
Friberg, A.2
-
13
-
-
84907866137
-
Modelling the speed of music using features from harmonic/percussive separated audio
-
Elowsson, A., Friberg, A., Madison, G., and Paulin, J. (2013). "Modelling the speed of music using features from harmonic/percussive separated audio," in Proceedings of the International Symposium on Music Information Retrieval (ISMIR 2013), pp. 481-486.
-
(2013)
Proceedings of the International Symposium on Music Information Retrieval (ISMIR 2013)
, pp. 481-486
-
-
Elowsson, A.1
Friberg, A.2
Madison, G.3
Paulin, J.4
-
14
-
-
85139270599
-
Harmonic/percussive separation using median filtering
-
Graz, Austria (September 6-10)
-
FitzGerald, D. (2010). "Harmonic/percussive separation using median filtering," in Proceedings of DAFx-10, Graz, Austria (September 6-10).
-
(2010)
Proceedings of DAFx-10
-
-
FitzGerald, D.1
-
15
-
-
34250887482
-
CUEX: An algorithm for extracting expressive tone variables from audio recordings
-
Friberg, A., Schoonderwaldt, E., and Juslin, P. N. (2007). "CUEX: An algorithm for extracting expressive tone variables from audio recordings," Acta Acust. united Acust. 93, 411-420, available at https://www.ingentaconnect.com/contentone/dav/aaua/2007/00000093/00000003/art00010.
-
(2007)
Acta Acust. United Acust.
, vol.93
, pp. 411-420
-
-
Friberg, A.1
Schoonderwaldt, E.2
Juslin, P.N.3
-
16
-
-
11144325691
-
Partial least-squares regression: A tutorial
-
Geladi, P., and Kowalski, B. R. (1986). "Partial least-squares regression: A tutorial," Anal. Chim. Acta. 185, 1-17. 10.1016/0003-2670(86)80028-9
-
(1986)
Anal. Chim. Acta.
, vol.185
, pp. 1-17
-
-
Geladi, P.1
Kowalski, B.R.2
-
17
-
-
33646159992
-
Acoustic-perceptual correlates of voice quality in elderly men and women
-
Gorham-Rowan, M. M., and Laures-Gore, J. (2006). "Acoustic-perceptual correlates of voice quality in elderly men and women," J. Commun. Disorders 39 (3), 171-184. 10.1016/j.jcomdis.2005.11.005
-
(2006)
J. Commun. Disorders
, vol.39
, Issue.3
, pp. 171-184
-
-
Gorham-Rowan, M.M.1
Laures-Gore, J.2
-
19
-
-
0036512599
-
The relationship between cepstral peak prominence and selected parameters of dysphonia
-
Heman-Ackah, Y. D., Michael, D. D., and Goding, G. S. (2002). "The relationship between cepstral peak prominence and selected parameters of dysphonia," J. Voice 16 (1), 20-27. 10.1016/S0892-1997(02)00067-X
-
(2002)
J. Voice
, vol.16
, Issue.1
, pp. 20-27
-
-
Heman-Ackah, Y.D.1
Michael, D.D.2
Goding, G.S.3
-
20
-
-
0028015597
-
Acoustic correlates of breathy vocal quality
-
Hillenbrand, J., Cleveland, R. A., and Erickson, R. L. (1994). "Acoustic correlates of breathy vocal quality," J. Speech Lang. Hear. Res. 37 (4), 769-778. 10.1044/jshr.3704.769
-
(1994)
J. Speech Lang. Hear. Res.
, vol.37
, Issue.4
, pp. 769-778
-
-
Hillenbrand, J.1
Cleveland, R.A.2
Erickson, R.L.3
-
21
-
-
0030124559
-
Acoustic correlates of breathy vocal quality: Dysphonic voices and continuous speech
-
Hillenbrand, J., and Houde, R. A. (1996). "Acoustic correlates of breathy vocal quality: Dysphonic voices and continuous speech," J. Speech Lang. Hear. Res. 39 (2), 311-321. 10.1044/jshr.3902.311
-
(1996)
J. Speech Lang. Hear. Res.
, vol.39
, Issue.2
, pp. 311-321
-
-
Hillenbrand, J.1
Houde, R.A.2
-
24
-
-
77950502144
-
Listener expertise and sound identification influence the categorization of environmental sounds
-
Lemaitre, G., Houix, O., Misdariis, N., and Susini, P. (2010). "Listener expertise and sound identification influence the categorization of environmental sounds," J. Exp. Psychol.: Appl. 16 (1), 16-32. 10.1037/a0018762
-
(2010)
J. Exp. Psychol.: Appl.
, vol.16
, Issue.1
, pp. 16-32
-
-
Lemaitre, G.1
Houix, O.2
Misdariis, N.3
Susini, P.4
-
25
-
-
85029340764
-
Vocal imitations of non-vocal sounds
-
Lemaitre, G., Houix, O., Voisin, F., Misdariis, N., and Susini, P. (2016a). "Vocal imitations of non-vocal sounds," PLoS One 11 (12), e0168167. 10.1371/journal.pone.0168167
-
(2016)
PLoS One
, vol.11
, Issue.12
, pp. e0168167
-
-
Lemaitre, G.1
Houix, O.2
Voisin, F.3
Misdariis, N.4
Susini, P.5
-
26
-
-
84954521435
-
Vocal imitations of basic auditory features
-
Lemaitre, G., Jabbari, A., Misdariis, N., Houix, O., and Susini, P. (2016b). "Vocal imitations of basic auditory features," J. Acoust. Soc. Am. 139 (1), 290-300. 10.1121/1.4939738
-
(2016)
J. Acoust. Soc. Am.
, vol.139
, Issue.1
, pp. 290-300
-
-
Lemaitre, G.1
Jabbari, A.2
Misdariis, N.3
Houix, O.4
Susini, P.5
-
27
-
-
85026540671
-
Rising tones and rustling noises: Metaphors in gestural depictions of sounds
-
Lemaitre, G., Scurto, H., Françoise, J., Bevilacqua, F., Houix, O., and Susini, P. (2017). "Rising tones and rustling noises: Metaphors in gestural depictions of sounds," PLoS One 12 (7), e0181786. 10.1371/journal.pone.0181786
-
(2017)
PLoS One
, vol.12
, Issue.7
, pp. e0181786
-
-
Lemaitre, G.1
Scurto, H.2
Françoise, J.3
Bevilacqua, F.4
Houix, O.5
Susini, P.6
-
28
-
-
85014539377
-
-
Deliverable 4.4.1 in the EC-project Sketching Audio Technologies using Vocalizations and Gestures (SkAT-VG), (Last viewed September 5, 2018)
-
Lemaitre, G., Voisin, F., Scurto, H., Houix, O., Susini, P., Misdariis, N., and Bevilacqua, F. (2015). "A large set of vocal and gestural imitations," Deliverable 4.4.1 in the EC-project Sketching Audio Technologies using Vocalizations and Gestures (SkAT-VG), http://skatvg.iuav.it/wp-content/uploads/2015/11/SkATVGDeliverableD4.4.1.pdf (Last viewed September 5, 2018).
-
(2015)
A Large Set of Vocal and Gestural Imitations
-
-
Lemaitre, G.1
Voisin, F.2
Scurto, H.3
Houix, O.4
Susini, P.5
Misdariis, N.6
Bevilacqua, F.7
-
29
-
-
84926628005
-
Idealized computational models for auditory receptive fields
-
Lindeberg, T., and Friberg, A. (2015a). "Idealized computational models for auditory receptive fields," PLoS One 10 (3), e0119032. 10.1371/journal.pone.0119032
-
(2015)
PLoS One
, vol.10
, Issue.3
, pp. e0119032
-
-
Lindeberg, T.1
Friberg, A.2
-
30
-
-
84931078597
-
Scale-space theory for auditory signals
-
Springer Lecture Notes in Computer Science
-
Lindeberg, T., and Friberg, A. (2015b). "Scale-space theory for auditory signals," in Proceedings of Scale Space and Variational Methods in Computer Vision (SSVM 2015), Vol. 9087 of Springer Lecture Notes in Computer Science, pp. 3-15.
-
(2015)
Proceedings of Scale Space and Variational Methods in Computer Vision (SSVM 2015)
, vol.9087
, pp. 3-15
-
-
Lindeberg, T.1
Friberg, A.2
-
31
-
-
70449382199
-
Acoustic measurement of overall voice quality: A meta-analysis
-
Maryn, Y., Roy, N., De Bodt, M., Van Cauwenberge, P., and Corthals, P. (2009). "Acoustic measurement of overall voice quality: A meta-analysis," J. Acoust. Soc. Am. 126 (5), 2619-2634. 10.1121/1.3224706
-
(2009)
J. Acoust. Soc. Am.
, vol.126
, Issue.5
, pp. 2619-2634
-
-
Maryn, Y.1
Roy, N.2
De Bodt, M.3
Van Cauwenberge, P.4
Corthals, P.5
-
32
-
-
84879000794
-
-
Ph.D. thesis, University of Victoria, Department of Linguistics, Canada
-
Moisik, S. R. (2013). "The epilarynx in speech," Ph.D. thesis, University of Victoria, Department of Linguistics, Canada.
-
(2013)
The Epilarynx in Speech
-
-
Moisik, S.R.1
-
33
-
-
77950441135
-
A high-speed laryngoscopic investigation of aryepiglottic trilling
-
Moisik, S. R., Esling, J. H., and Crevier-Buchman, L. (2010). "A high-speed laryngoscopic investigation of aryepiglottic trilling," J. Acoust. Soc. Am. 127 (3), 1548-1558. 10.1121/1.3299203
-
(2010)
J. Acoust. Soc. Am.
, vol.127
, Issue.3
, pp. 1548-1558
-
-
Moisik, S.R.1
Esling, J.H.2
Crevier-Buchman, L.3
-
34
-
-
81355164049
-
The timbre toolbox: Extracting audio descriptors from musical signals
-
Peeters, G., Giordano, B. L., Susini, P., Misdariis, N., and McAdams, S. (2011). "The timbre toolbox: Extracting audio descriptors from musical signals," J. Acoust. Soc. Am. 130, 2902-2916. 10.1121/1.3642604
-
(2011)
J. Acoust. Soc. Am.
, vol.130
, pp. 2902-2916
-
-
Peeters, G.1
Giordano, B.L.2
Susini, P.3
Misdariis, N.4
McAdams, S.5
-
35
-
-
33748611921
-
Ensemble based systems in decision making
-
Polikar, R. (2006). "Ensemble based systems in decision making," IEEE Circ. Syst. Mag. 6 (3), 21-45. 10.1109/MCAS.2006.1688199
-
(2006)
IEEE Circ. Syst. Mag.
, vol.6
, Issue.3
, pp. 21-45
-
-
Polikar, R.1
-
36
-
-
0027997572
-
Measurements of the vibrato rate of ten singers
-
Prame, E. (1994). "Measurements of the vibrato rate of ten singers," J. Acoust. Soc. Am. 96, 1979-1984. 10.1121/1.410141
-
(1994)
J. Acoust. Soc. Am.
, vol.96
, pp. 1979-1984
-
-
Prame, E.1
-
38
-
-
4043137356
-
A tutorial on support vector regression
-
Smola, A. J., and Schölkopf, B. (2004). "A tutorial on support vector regression," Stat. Comput. 14 (3), 199-222. 10.1023/B:STCO.0000035301.49549.88
-
(2004)
Stat. Comput.
, vol.14
, Issue.3
, pp. 199-222
-
-
Smola, A.J.1
Schölkopf, B.2
-
39
-
-
85053896970
-
-
Deliverable D2.2.2 in the EC-project Sketching Audio Technologies using Vocalizations and Gestures (SkAT-VG), (Last viewed September 5, 2018)
-
Ternström, S., and Mauro, D. A. (2015). "Extensive set of recorded imitations," Deliverable D2.2.2 in the EC-project Sketching Audio Technologies using Vocalizations and Gestures (SkAT-VG), http://skatvg.iuav.it/wp-content/uploads/2015/01/SkATVGDeliverableD2.2.2.pdf (Last viewed September 5, 2018).
-
(2015)
Extensive Set of Recorded Imitations
-
-
Ternström, S.1
Mauro, D.A.2