Volume 131, Issue 2, 2012, Pages 1515-1528

Acoustic hole filling for sparse enrollment data using a cohort universal corpus for speaker recognition

Author keywords

[No Author keywords available]

Indexed keywords

ACOUSTIC MODEL; AUDIO STREAM; EIGENVOICES; GAUSSIAN MIXTURE MODEL; HOLE FILLING; HUMAN LISTENERS; MACHINE PERFORMANCE; ORIGINAL SYSTEMS; SIMILARITY MEASUREMENTS; SPEAKER MODEL; SPEAKER MODELING; SPEAKER RECOGNITION; TEST DURATION; TEST MATERIALS

EID: 84857425027     PISSN: 00014966     EISSN: None     Source Type: Journal    
DOI: 10.1121/1.3672707     Document Type: Article
Times cited: 5

References (34)
  • 1
    • 50249182472
    • Discriminative in-set/out-of-set speaker recognition
    • 10.1109/TASL.2006.881689
    • Angkititrakul, P., and Hansen, J. H. L. (2007). Discriminative in-set/out-of-set speaker recognition, IEEE Trans. Audio, Speech, Lang. Process. 15, 498-508. 10.1109/TASL.2006.881689
    • (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , pp. 498-508
    • Angkititrakul, P.1    Hansen, J.H.L.2
  • 2
    • 27144489164
    • A tutorial on support vector machines for pattern recognition
    • 10.1023/A:1009715923555
    • Burges, C. (1998). A tutorial on support vector machines for pattern recognition, Data Min. Knowl. Discov. 2, 121-167. 10.1023/A:1009715923555
    • (1998) Data Min. Knowl. Discov , vol.2 , pp. 121-167
    • Burges, C.1
  • 5
    • 0028419019
    • Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
    • 10.1109/89.279278
    • Gauvain, J., and Lee, C. (1994). Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, IEEE Trans. Speech Audio Process. 2, 291-298. 10.1109/89.279278
    • (1994) IEEE Trans. Speech Audio Process , vol.2 , pp. 291-298
    • Gauvain, J.1    Lee, C.2
  • 6
    • 0028516097
    • Text-independent speaker identification
    • 10.1109/79.317924
    • Gish, H., and Schmidt, M. (1994). Text-independent speaker identification, IEEE Signal Process. Mag. 11, 18-32. 10.1109/79.317924
    • (1994) IEEE Signal Process. Mag , vol.11 , pp. 18-32
    • Gish, H.1    Schmidt, M.2
  • 9
    • 85009152939
    • CU-move: Robust speech processing for in-vehicle speech systems
    • Beijing, China
    • Hansen, J. H. L., Plucienkowski, J., Gallant, S., Pellom, B., and Ward, W. (2000). CU-move: Robust speech processing for in-vehicle speech systems, in ICSLP 2000, Beijing, China, pp. 524-527.
    • (2000) ICSLP 2000 , pp. 524-527
    • Hansen, J.H.L.1    Plucienkowski, J.2    Gallant, S.3    Pellom, B.4    Ward, W.5
  • 10
    • 0000375621
    • A robust version of the probability ratio test
    • 10.1214/aoms/1177699803
    • Huber, P. (1965). A robust version of the probability ratio test, Ann. Math. Stat. 36, 1753-1758. 10.1214/aoms/1177699803
    • (1965) Ann. Math. Stat , vol.36 , pp. 1753-1758
    • Huber, P.1
  • 11
  • 12
    • 43249091937
    • Speaker and session variability in GMM-based speaker verification
    • 10.1109/TASL.2007.894527
    • Kenny, P., Boulianne, G., Ouellet, P., and Dumouchel, P. (2007). Speaker and session variability in GMM-based speaker verification, IEEE Trans. Audio, Speech, Lang. Process. 15, 1448-1460. 10.1109/TASL.2007.894527
    • (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , pp. 1448-1460
    • Kenny, P.1    Boulianne, G.2    Ouellet, P.3    Dumouchel, P.4
  • 15
    • 0001927585
    • On information and sufficiency
    • 10.1214/aoms/1177729694
    • Kullback, S., and Leibler, R. (1951). On information and sufficiency, Ann. Math. Stat. 22, 79-86. 10.1214/aoms/1177729694
    • (1951) Ann. Math. Stat , vol.22 , pp. 79-86
    • Kullback, S.1    Leibler, R.2
  • 16
    • 85135371588
    • High performance speaker-independent phone recognition using CDHMM
    • Berlin, Germany
    • Lamel, L., and Gauvain, J. (1993). High performance speaker-independent phone recognition using CDHMM, in EUROSPEECH 1993, Berlin, Germany, pp. 121-124.
    • (1993) EUROSPEECH 1993 , pp. 121-124
    • Lamel, L.1    Gauvain, J.2
  • 17
    • 0024768209
    • Speaker-independent phone recognition using hidden Markov models
    • 10.1109/29.46546
    • Lee, K., and Hon, H. (1989). Speaker-independent phone recognition using hidden Markov models, IEEE Trans. Acoust., Speech, Signal Process. 37, 1641-1648. 10.1109/29.46546
    • (1989) IEEE Trans. Acoust., Speech, Signal Process , vol.37 , pp. 1641-1648
    • Lee, K.1    Hon, H.2
  • 19
    • 44949143337
    • Speaker cluster based GMM tokenization for speaker recognition
    • Pittsburgh, PA
    • Ma, B., Zhu, D., Tong, R., and Li, H. (2006). Speaker cluster based GMM tokenization for speaker recognition, in INTERSPEECH 2006, Pittsburgh, PA, Vol. 1, pp. 505-508.
    • (2006) INTERSPEECH 2006 , vol.1 , pp. 505-508
    • Ma, B.1    Zhu, D.2    Tong, R.3    Li, H.4
  • 20
    • 33947670488
    • A comparison of various adaptation methods for speaker verification with limited enrollment data
    • Toulouse, France
    • Mak, M., Hsiao, R., and Mak, B. (2006). A comparison of various adaptation methods for speaker verification with limited enrollment data, in ICASSP 2006, Toulouse, France, Vol. 1, pp. 929-932.
    • (2006) ICASSP 2006 , vol.1 , pp. 929-932
    • Mak, M.1    Hsiao, R.2    Mak, B.3
  • 21
    • 84857407484
    • Structural linear model-space transformations for speaker adaptation
    • Geneva, Switzerland
    • Matrouf, D., Bellot, O., Nocera, P., Linares, G., and Bonastre, J. (2003). Structural linear model-space transformations for speaker adaptation, in EUROSPEECH 2003, Geneva, Switzerland, pp. 1625-1628.
    • (2003) EUROSPEECH 2003 , pp. 1625-1628
    • Matrouf, D.1    Bellot, O.2    Nocera, P.3    Linares, G.4    Bonastre, J.5
  • 22
    • 0017992187
    • Frequency of occurrence of phonemes in conversational English
    • Mines, M., Hanson, B., and Shoup, J. (1978). Frequency of occurrence of phonemes in conversational English. Lang. Speech 21, 221-241.
    • (1978) Lang. Speech , vol.21 , pp. 221-241
    • Mines, M.1    Hanson, B.2    Shoup, J.3
  • 23
    • 84867213267
    • Language and genre detection in audio content analysis
    • Brisbane, Australia
    • Mitra, V., Garcia-Romero, D., and Espy-Wilson, C. (2008). Language and genre detection in audio content analysis, INTERSPEECH-2008, Brisbane, Australia, pp. 2506-2509.
    • (2008) INTERSPEECH-2008 , pp. 2506-2509
    • Mitra, V.1    Garcia-Romero, D.2    Espy-Wilson, C.3
  • 24
    • 32644450332
    • Feature warping for robust speaker verification
    • Pelecanos, J., and Sridharan, S. (2001). Feature warping for robust speaker verification, Proc. Speaker Odyssey 13, 1-5.
    • (2001) Proc. Speaker Odyssey , vol.13 , pp. 1-5
    • Pelecanos, J.1    Sridharan, S.2
  • 25
    • 63049100830
    • In-set/out-of-set speaker recognition under sparse enrollment
    • 10.1109/TASL.2007.902058
    • Prakash, V., and Hansen, J. H. L. (2007). In-set/out-of-set speaker recognition under sparse enrollment, IEEE Trans. Audio, Speech, Lang. Process. 15, 2044-2052. 10.1109/TASL.2007.902058
    • (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , pp. 2044-2052
    • Prakash, V.1    Hansen, J.H.L.2
  • 26
    • 0033884858
    • Speaker verification using adapted Gaussian mixture models
    • 10.1006/dspr.1999.0361
    • Reynolds, D., Quatieri, T., and Dunn, R. (2000). Speaker verification using adapted Gaussian mixture models, Digit. Signal Process. 10, 19-41. 10.1006/dspr.1999.0361
    • (2000) Digit. Signal Process , vol.10 , pp. 19-41
    • Reynolds, D.1    Quatieri, T.2    Dunn, R.3
  • 27
    • 44949231596
    • A multiclass framework for speaker verification within an acoustic event sequence system
    • Pittsburgh, PA
    • Scheffer, N., and Bonastre, J. (2006). A multiclass framework for speaker verification within an acoustic event sequence system, in INTERSPEECH 2006, Pittsburgh, PA, pp. 501-504.
    • (2006) INTERSPEECH 2006 , pp. 501-504
    • Scheffer, N.1    Bonastre, J.2
  • 28
    • 0033889739
    • Speaker verification by human listeners: Experiments comparing human and machine performance using the NIST 1998 Speaker Evaluation Data
    • 10.1006/dspr.1999.0356
    • Schmidt-Nielsen, A., and Crystal, T. (2000). Speaker verification by human listeners: Experiments comparing human and machine performance using the NIST 1998 Speaker Evaluation Data, Digit. Signal Process. 10, 249-266. 10.1006/dspr.1999.0356
    • (2000) Digit. Signal Process , vol.10 , pp. 249-266
    • Schmidt-Nielsen, A.1    Crystal, T.2
  • 30
    • 85009207995
    • Score normalisation applied to open-set, text-independent speaker identification
    • Geneva, Switzerland
    • Sivakumaran, P., Fortuna, J., and Ariyaeeinia, A. (2003). Score normalisation applied to open-set, text-independent speaker identification, in EUROSPEECH, Geneva, Switzerland, pp. 2669-2672.
    • (2003) EUROSPEECH , pp. 2669-2672
    • Sivakumaran, P.1    Fortuna, J.2    Ariyaeeinia, A.3
  • 31
    • 0022229052
    • A vector quantization approach to speaker recognition
    • FL
    • Soong, F., Rosenberg, A., Rabiner, L., and Juang, L. (1985). A vector quantization approach to speaker recognition, in ICASSP 1985, FL, pp. 387-390.
    • (1985) ICASSP 1985 , pp. 387-390
    • Soong, F.1    Rosenberg, A.2    Rabiner, L.3    Juang, L.4
  • 32
    • 85009275225
    • Approaches to language identification using Gaussian mixture models and shifted delta cepstral features
    • Tampa, FL
    • Torres-Carrasquillo, P., Singer, E., Kohler, M., Greene, R., Reynolds, D., and Deller, Jr., J. (2002). Approaches to language identification using Gaussian mixture models and shifted delta cepstral features, in ICSLP 2002, Tampa, FL, pp. 89-92.
    • (2002) ICSLP 2002 , pp. 89-92
    • Torres-Carrasquillo, P.1    Singer, E.2    Kohler, M.3    Greene, R.4    Reynolds, D.5    Deller Jr., J.6
  • 33
    • 64549092742
    • A cohort-based speaker model synthesis for mismatched channels in speaker verification
    • 10.1109/TASL.2007.899297
    • Wu, W., Zheng, T., Xu, M., and Soong, F. (2007). A cohort-based speaker model synthesis for mismatched channels in speaker verification, IEEE Trans. Audio, Speech, Lang. Process. 15, 1893-1903. 10.1109/TASL.2007.899297
    • (2007) IEEE Trans. Audio, Speech, Lang. Process , vol.15 , pp. 1893-1903
    • Wu, W.1    Zheng, T.2    Xu, M.3    Soong, F.4
  • 34
    • 0041360472
    • Efficient text-independent speaker verification with structural Gaussian mixture models and neural network
    • 10.1109/TSA.2003.815822
    • Xiang, B., and Berger, T. (2003). Efficient text-independent speaker verification with structural Gaussian mixture models and neural network, IEEE Trans. Audio, Speech, Lang. Process. 11, 447-456. 10.1109/TSA.2003.815822
    • (2003) IEEE Trans. Audio, Speech, Lang. Process , vol.11 , pp. 447-456
    • Xiang, B.1    Berger, T.2


* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.