-
1
-
-
34047193159
-
Sinusoidal model based on instantaneous frequency attractors
-
Abe T., and Honda M. Sinusoidal model based on instantaneous frequency attractors. IEEE Trans. Audio, Speech Language Process. 14 4 (2006) 1292-1300
-
(2006)
IEEE Trans. Audio, Speech Language Process.
, vol.14
, Issue.4
, pp. 1292-1300
-
-
Abe, T.1
Honda, M.2
-
2
-
-
0030371135
-
-
Abe, T., Kobayashi, T., Imai, S., 1996. Robust pitch estimation with harmonics enhancement in noisy environments based on instantaneous frequency. In: Proc. ICSLP 9, Vol. 2, pp. 1277-1280.
-
-
-
-
4
-
-
85143191655
-
-
Chou, W., Gu, L., May 2001. Robust singing detection in speech/music discriminator design. In: Proc. ICASSP 2001, Vol. II, pp. 865-868.
-
-
-
-
-
5
-
-
0002629270
-
Maximum likelihood from incomplete data via the EM algorithm
-
Dempster A., Laird N., and Rubin D. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 1 (1977) 1-38
-
(1977)
J. Roy. Statist. Soc. Ser. B
, vol.39
, Issue.1
, pp. 1-38
-
-
Dempster, A.1
Laird, N.2
Rubin, D.3
-
6
-
-
0027154415
-
-
Depalle, P., Garcia, G., Rodet, X., 1993. Tracking of partials for additive sound synthesis using hidden Markov models. In: ICASSP-93, Vol. 1, pp. 225-228.
-
-
-
-
7
-
-
0028831004
-
Temporal envelope and fine structure cues for speech intelligibility
-
Drullman R. Temporal envelope and fine structure cues for speech intelligibility. J. Acoust. Soc. Amer. 97 (1995) 585-592
-
(1995)
J. Acoust. Soc. Amer.
, vol.97
, pp. 585-592
-
-
Drullman, R.1
-
8
-
-
45849089726
-
-
Goto, M., Hashiguchi, H., Nishimura, T., Oka, R., 2002. RWC music database: popular, classical, and jazz music databases. In: Proc. 3rd Internat. Conf. on Music Information Retrieval (ISMIR 2002), pp. 287-288.
-
-
-
-
-
9
-
-
45849095873
-
-
Goto, M., Hashiguchi, H., Nishimura, T., Oka, R., 2003. RWC music database: music genre database and musical instrument sound database. In: Proc. 4th Internat. Conf. on Music Information Retrieval (ISMIR 2003), pp. 229-230.
-
-
-
-
-
11
-
-
0032644224
-
JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research
-
Itou K., Yamamoto M., Takeda K., Takezawa T., Matsuoka T., Kobayashi T., Shikano K., and Itahashi S. JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research. J. Acoust. Soc. Jpn. (E) 20 3 (1999) 199-206
-
(1999)
J. Acoust. Soc. Jpn. (E)
, vol.20
, Issue.3
, pp. 199-206
-
-
Itou, K.1
Yamamoto, M.2
Takeda, K.3
Takezawa, T.4
Matsuoka, T.5
Kobayashi, T.6
Shikano, K.7
Itahashi, S.8
-
13
-
-
0037347128
-
Signal representation including waveform envelope by clustered line-spectrum modeling
-
Kazama M., Yoshida K., and Tohyama M. Signal representation including waveform envelope by clustered line-spectrum modeling. J. Audio Eng. Soc. 51 3 (2003) 123-137
-
(2003)
J. Audio Eng. Soc.
, vol.51
, Issue.3
, pp. 123-137
-
-
Kazama, M.1
Yoshida, K.2
Tohyama, M.3
-
14
-
-
45849084436
-
-
Kim, H., Burred, J., Sikora, T., 2004. How efficient is MPEG-7 for general sound recognition? In: 25th Internat. Audio Engineering Society Conference Metadata For Audio.
-
-
-
-
-
15
-
-
85037142779
-
-
Maekawa, K., Koiso, H., Furui, S., Isahara, H., 2000. Spontaneous speech corpus of Japanese. In: Proc. 2nd Internat. Conf. Language Resources and Evaluation (LREC2000), pp. 947-952.
-
-
-
-
-
16
-
-
29844444090
-
-
Marks, S., Gonzalez, R., 2005. Techniques for improving the accuracy of sinusoidal tracking. In: Proc. Internet and Multimedia Systems and Applications 2005, pp. 299-304.
-
-
-
-
-
17
-
-
84863772450
-
Speech analysis/synthesis based on a sinusoidal representation
-
McAulay R., and Quatieri T. Speech analysis/synthesis based on a sinusoidal representation. IEEE Trans. ASSP ASSP-34 4 (1986) 744-754
-
(1986)
IEEE Trans. ASSP
, vol.ASSP-34
, Issue.4
, pp. 744-754
-
-
McAulay, R.1
Quatieri, T.2
-
18
-
-
84904439718
-
-
Melih, K., Gonzalez, R., 1999. Audio source type segmentation using a perceptually based representation. In: Proc. 5th Internat. Symposium on Signal Processing and Its Applications, 1999, ISSPA'99, Vol. 1, pp. 51-54.
-
-
-
-
-
19
-
-
0034502302
-
-
Melih, K., Gonzalez, R., 2000. Source segmentation for structured audio. In: IEEE Internat. Conf. on Multimedia and Expo, ICME 2000, Vol. 2, pp. 811-814.
-
-
-
-
-
21
-
-
45849098098
-
-
Nawab, S.H., Espy-Wilson, C.Y., Mani, R., Bitar, N.N., 1998. Knowledge-based analysis of speech mixed with sporadic environmental sounds. In: Computational Auditory Scene Analysis. Lawrence Erlbaum Associates, pp. 177-194 (Chapter 12).
-
-
-
-
22
-
-
45849084435
-
-
Plante, F., Meyer, G., Ainsworth, W.A., 1995. A pitch extraction reference database. In: EUROSPEECH'95, pp. 837-840.
-
-
-
-
24
-
-
45849148672
-
-
Sakakibara, K.-I., Osaka, N., 1998. On concatenation of musical sounds using a sinusoidal model. In: Technical Report of IEICE, Vol. SP97-108, pp. 1-6 (in Japanese).
-
-
-
-
-
25
-
-
0029765670
-
-
Saunders, J., 1996. Real-time discrimination of broadcast speech/music. In: Proc. ICASSP'96, Vol. 2, pp. 993-996.
-
-
-
-
-
26
-
-
0030648077
-
-
Scheirer, E., Slaney, M., 1997. Construction and evaluation of a robust multifeature speech/music discriminator. In: Proc. ICASSP'97, Vol. II, pp. 1331-1334.
-
-
-
-
-
27
-
-
45849137620
-
-
Takeuchi, S., Yamashita, M., Uchida, T., Sugiyama, M., 2001. Optimization of voice/music detection in sound data. In: Consistent and Reliable Acoustic Cues for Sound Analysis (CRAC Workshop).
-
-
-
-
28
-
-
33745184907
-
-
Taniguchi, T., Adachi, A., Okawa, S., Honda, M., Shirai, K., 2005. Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signals. In: Proc. Interspeech2005, pp. 589-592.
-
-
-
-
-
29
-
-
44449143006
-
-
Taniguchi, T., Tohyama, M., Shirai, K., 2006. Spectral frequency tracking for classifying audio signals. In: IEEE Internat. Symposium on Signal Processing and Information Technology, 2006, pp. 300-303.
-
-
-
-
-
30
-
-
45849099961
-
-
Torkkola, K., 1999. Blind separation for audio signals - are we there yet? In: Proc. Internat. Workshop on Independent Component Analysis and Signal Separation (ICA'99).
-
-
-
-
-
31
-
-
45849129969
-
-
Virtanen, T., 2003. Sound source separation using sparse coding with temporal continuity objective. In: Proc. ICMC, pp. 231-234.
-
-
-
-
-
32
-
-
0033707902
-
-
Virtanen, T., Klapuri, A., 2000. Separation of harmonic sound sources using sinusoidal modeling. In: Proc. IEEE Internat. Conf. on Acoust. Speech Signal Process, ICASSP'00, Vol. 2, pp. 765-768.
-
-
-
-
-
33
-
-
45849143454
-
-
Xiong, Z., Radhakrishnan, R., Divakaran, A., Huang, T., 2003. Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification. In: Proc. Internat. Conf. on Multimedia and Expo, 2003, ICME'03, Vol. 3, pp. 397-400.
-
-
-
-