Volume 85, Issue 9, 1997, Pages 1437-1462

Speaker recognition: A tutorial

Author keywords

Access control; Authentication; Biomedical measurements; Biomedical signal processing; Biomedical transducers; Biometric; Communication system security; Computer network security; Computer security; Corpus; Data bases; Identification of persons; Public safety

Indexed keywords

BIOMEDICAL ENGINEERING; COMPUTER NETWORKS; DATABASE SYSTEMS; PATTERN RECOGNITION SYSTEMS; SECURITY OF DATA; SPEECH PROCESSING;

EID: 0031233424     PISSN: 00189219     EISSN: None     Source Type: Journal    
DOI: 10.1109/5.628714     Document Type: Article
Times cited : (1294)

References (63)
  • 1
    • 0016067897
    • Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification
    • B. S. Atal, "Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification," J. Acoust. Soc. Amer., vol. 55, no. 6, pp. 1304-1312, 1974.
    • (1974) J. Acoust. Soc. Amer. , vol.55 , Issue.6 , pp. 1304-1312
    • Atal, B.S.1
  • 2
    • 0016939145
    • Automatic recognition of speakers from their voices
    • B. S. Atal, "Automatic recognition of speakers from their voices," Proc. IEEE, vol. 64, pp. 460-475, 1976.
    • (1976) Proc. IEEE , vol.64 , pp. 460-475
  • 4
    • 0024925404
    • Distance measures for signal processing and pattern recognition
    • M. Basseville, "Distance measures for signal processing and pattern recognition," Signal Process., vol. 18, pp. 349-369, 1989.
    • (1989) Signal Process. , vol.18 , pp. 349-369
    • Basseville, M.1
  • 7
    • 44949275238
    • The Federal Standard 1016 4800 bps CELP voice coder
    • J. P. Campbell, Jr., T. E. Tremain, and V. C. Welch, "The Federal Standard 1016 4800 bps CELP voice coder," Digital Signal Processing, vol. 1, no. 3, pp. 145-155, 1991.
    • (1991) Digital Signal Processing , vol.1 , Issue.3 , pp. 145-155
    • Campbell Jr., J.P.1    Tremain, T.E.2    Welch, V.C.3
  • 9
    • 0003101253
    • Speaker recognition using HMM with experiments on the YOHO database
    • Madrid, Spain
    • C. Che and Q. Lin, "Speaker recognition using HMM with experiments on the YOHO database," in Proc. EUROSPEECH, Madrid, Spain, pp. 625-628, 1995.
    • (1995) Proc. EUROSPEECH , pp. 625-628
    • Che, C.1    Lin, Q.2
  • 11
    • 0015958387
    • On a new class of bounds on Bayes risk in multihypothesis pattern recognition
    • P. A. Devijver, "On a new class of bounds on Bayes risk in multihypothesis pattern recognition," IEEE Trans. Comput., vol. C-23, no. 1, pp. 70-80, 1974.
    • (1974) IEEE Trans. Comput. , vol.C-23 , Issue.1 , pp. 70-80
    • Devijver, P.A.1
  • 12
    • 0022150488
    • Speaker recognition-Identifying people by their voices
    • Nov.
    • G. R. Doddington, "Speaker recognition-Identifying people by their voices," Proc. IEEE, vol. 73, pp. 1651-1664, Nov. 1985.
    • (1985) Proc. IEEE , vol.73 , pp. 1651-1664
    • Doddington, G.R.1
  • 15
    • 0012143979
    • Introduction to statistical pattern recognition
    • W. Rheinboldt and D. Siewiorek, Eds. San Diego, CA: Academic
    • K. Fukunaga, "Introduction to statistical pattern recognition," in Computer Science and Scientific Computing, 2nd ed., W. Rheinboldt and D. Siewiorek, Eds. San Diego, CA: Academic, 1990.
    • (1990) Computer Science and Scientific Computing, 2nd Ed.
    • Fukunaga, K.1
  • 16
    • 0019555090
    • Cepstral analysis technique for automatic speaker verification
    • S. Furui, "Cepstral analysis technique for automatic speaker verification," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 254-272, 1981.
    • (1981) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-29 , pp. 254-272
    • Furui, S.1
  • 17
    • 0026402417
    • Speaker-dependent-feature extraction, recognition and processing techniques
    • S. Furui, "Speaker-dependent-feature extraction, recognition and processing techniques," Speech Commun., vol. 10, pp. 505-520, 1991.
    • (1991) Speech Commun. , vol.10 , pp. 505-520
  • 18
    • 0028516097
    • Text-independent speaker identification
    • H. Gish and M. Schmidt, "Text-independent speaker identification," IEEE Signal Processing Mag., vol. 11, pp. 18-32, 1994.
    • (1994) IEEE Signal Processing Mag. , vol.11 , pp. 18-32
    • Gish, H.1    Schmidt, M.2
  • 19
    • 84972517864
    • Discriminant analysis and clustering
    • R. Gnanadesikan and J. R. Kettenring, "Discriminant analysis and clustering," Statistical Sci., vol. 4, no. 1, pp. 34-69, 1989.
    • (1989) Statistical Sci. , vol.4 , Issue.1 , pp. 34-69
    • Gnanadesikan, R.1    Kettenring, J.R.2
  • 20
    • 0017851927
    • On the use of windows for harmonic analysis with the DFT
    • F. J. Harris, "On the use of windows for harmonic analysis with the DFT," Proc. IEEE, vol. 66, pp. 51-83, 1978.
    • (1978) Proc. IEEE , vol.66 , pp. 51-83
    • Harris, F.J.1
  • 21
    • 33646938211
    • YOHO speaker verification
    • Baltimore, MD
    • A. Higgins, "YOHO speaker verification," presented at the Speech Research Symp., Baltimore, MD, 1990.
    • (1990) Speech Research Symp.
    • Higgins, A.1
  • 22
    • 0003019863
    • Speaker verification using randomized phrase prompting
    • A. Higgins, L. Bahler, and J. Porter, "Speaker verification using randomized phrase prompting," Digital Signal Processing, vol. 1, no. 2, pp. 89-106, 1991.
    • (1991) Digital Signal Processing , vol.1 , Issue.2 , pp. 89-106
    • Higgins, A.1    Bahler, L.2    Porter, J.3
  • 26
    • 65249157560
    • The divergence and Bhattacharyya distance measures in signal selection
    • T. Kailath, "The divergence and Bhattacharyya distance measures in signal selection," IEEE Trans. Commun. Technol., vol. COM-15, no. 1, pp. 52-60, 1967.
    • (1967) IEEE Trans. Commun. Technol. , vol.COM-15 , Issue.1 , pp. 52-60
    • Kailath, T.1
  • 27
    • 0005511704
    • Low Bit Rate Speech Encoder Based on Line-Spectrum-Frequency
    • Naval Research Laboratory, Washington, D.C.
    • G. Kang and L. Fransen, "Low Bit Rate Speech Encoder Based on Line-Spectrum-Frequency," Naval Research Laboratory, Washington, D.C., NRL Rep. 8857, 1985.
    • (1985) NRL Rep. , vol.8857
    • Kang, G.1    Fransen, L.2
  • 29
    • 0001927585
    • On information and sufficiency
    • S. Kullback and R. Leibler, "On information and sufficiency," Annals Math. Statist., vol. 22, pp. 79-86, 1951.
    • (1951) Annals Math. Statist. , vol.22 , pp. 79-86
    • Kullback, S.1    Leibler, R.2
  • 30
    • 0026103147
    • Information-theoretic distortion measures for speech recognition
    • Y.-T. Lee, "Information-theoretic distortion measures for speech recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. 39, pp. 330-335, 1991.
    • (1991) IEEE Trans. Acoust., Speech, Signal Processing , vol.39 , pp. 330-335
    • Lee, Y.-T.1
  • 32
    • 0016495091
    • Linear prediction: A tutorial review
    • J. Makhoul, "Linear prediction: A tutorial review," Proc. IEEE, vol. 63, pp. 561-580, 1975.
    • (1975) Proc. IEEE , vol.63 , pp. 561-580
    • Makhoul, J.1
  • 33
    • 0030247355
    • Robust speaker recognition-A feature-based approach
    • R. Mammone, X. Zhang, and R. Ramachandran, "Robust speaker recognition-A feature-based approach," IEEE Signal Processing Mag., vol. 13, no. 5, pp. 58-71, 1996.
    • (1996) IEEE Signal Processing Mag. , vol.13 , Issue.5 , pp. 58-71
    • Mammone, R.1    Zhang, X.2    Ramachandran, R.3
  • 34
    • 0018331352
    • Text-independent speaker recognition from a large linguistically unconstrained time-spaced data base
    • J. D. Markel and S. B. Davis, "Text-independent speaker recognition from a large linguistically unconstrained time-spaced data base," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-27, no. 1, pp. 74-82, 1979.
    • (1979) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-27 , Issue.1 , pp. 74-82
    • Markel, J.D.1    Davis, S.B.2
  • 35
    • 33646913195
    • 1997 speaker recognition evaluation
    • sect. 2, A. Martin, Ed., Maritime Institute of Technology, Linthicum Heights, MD, June 25-26, See also the NIST Spoken Natural Language Processing Group's FTP server. Available
    • A. Martin and M. Przybocki, "1997 speaker recognition evaluation," in Proc. Speaker Recognition Workshop, sect. 2, A. Martin, Ed., Maritime Institute of Technology, Linthicum Heights, MD, June 25-26, 1997. (See also the NIST Spoken Natural Language Processing Group's FTP server. Available: ftp://jaguar.ncsl.nist.gov/speaker/).
    • (1997) Proc. Speaker Recognition Workshop
    • Martin, A.1    Przybocki, M.2
  • 36
    • 0025244699
    • Speaker verification: A tutorial
    • Jan.
    • J. Naik, "Speaker verification: A tutorial," IEEE Commun. Mag., vol. 28, pp. 42-48, Jan. 1990.
    • (1990) IEEE Commun. Mag. , vol.28 , pp. 42-48
    • Naik, J.1
  • 37
    • 33646905029
    • Speech communication, human and machine
    • Reading, MA: Addison-Wesley
    • D. O'Shaughnessy, "Speech communication, human and machine," Digital Signal Processing. Reading, MA: Addison-Wesley, 1987.
    • (1987) Digital Signal Processing
    • O'Shaughnessy, D.1
  • 39
    • 33646937929
    • Commensurability among biometric systems: How to know when three apples probably equals seven oranges
    • J. Campbell, Ed., Crystal City, VA, Apr. 8-9, See also the Biometric Consortium's web site. Available
    • G. Papcun, "Commensurability among biometric systems: How to know when three apples probably equals seven oranges," in Proc. Biometric Consortium, 9th Meeting, J. Campbell, Ed., Crystal City, VA, Apr. 8-9, 1997. (See also the Biometric Consortium's web site. Available: http://www.biometrics.org:8080/).
    • (1997) Proc. Biometric Consortium, 9th Meeting
    • Papcun, G.1
  • 43
    • 0022594196
    • An introduction to hidden Markov models
    • Jan.
    • L. R. Rabiner and B.-H. Juang, "An introduction to hidden Markov models," IEEE ASSP Mag., vol. 3, pp. 4-16, Jan. 1986.
    • (1986) IEEE ASSP Mag. , vol.3 , pp. 4-16
    • Rabiner, L.R.1    Juang, B.-H.2
  • 45
    • 0024610919
    • A tutorial on hidden Markov models and selected applications in speech recognition
    • Feb.
    • L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, pp. 257-286, Feb. 1989.
    • (1989) Proc. IEEE , vol.77 , pp. 257-286
    • Rabiner, L.R.1
  • 47
    • 0029355999
    • Speaker identification and verification using Gaussian mixture speaker models
    • D. Reynolds, "Speaker identification and verification using Gaussian mixture speaker models," Speech Commun., vol. 17, pp. 91-108, 1995.
    • (1995) Speech Commun. , vol.17 , pp. 91-108
    • Reynolds, D.1
  • 48
    • 0012097212
    • Text-dependent speaker verification using decoupled and integrated speaker and speech recognizers
    • Madrid, Spain
    • D. Reynolds and B. Carlson, "Text-dependent speaker verification using decoupled and integrated speaker and speech recognizers," in Proc. EUROSPEECH, Madrid, Spain, 1995, pp. 647-650.
    • (1995) Proc. EUROSPEECH , pp. 647-650
    • Reynolds, D.1    Carlson, B.2
  • 49
    • 0029209272
    • Robust text-independent speaker identification using Gaussian mixture speaker models
    • D. Reynolds and R. Rose, "Robust text-independent speaker identification using Gaussian mixture speaker models," IEEE Trans. Speech Audio Processing, vol. 3, no. 1, pp. 72-83, 1995.
    • (1995) IEEE Trans. Speech Audio Processing , vol.3 , Issue.1 , pp. 72-83
    • Reynolds, D.1    Rose, R.2
  • 50
    • 33744663691
    • M.I.T. Lincoln Laboratory site presentation
    • A. Martin, Ed., sect. 5, Maritime Institute of Technology, Linthicum Heights, MD, Mar. 27-28, See also the NIST Spoken Natural Language Processing Group's FTP server. Available
    • D. Reynolds, "M.I.T. Lincoln Laboratory site presentation," in Speaker Recognition Workshop, A. Martin, Ed., sect. 5, Maritime Institute of Technology, Linthicum Heights, MD, Mar. 27-28, 1996. (See also the NIST Spoken Natural Language Processing Group's FTP server. Available: ftp://jaguar.ncsl.nist.gov/speaker/).
    • (1996) Speaker Recognition Workshop
    • Reynolds, D.1
  • 51
    • 0016939165
    • Automatic speaker verification: A review
    • Apr.
    • A. Rosenberg, "Automatic speaker verification: A review," Proc. IEEE, vol. 64, pp. 475-487, Apr. 1976.
    • (1976) Proc. IEEE , vol.64 , pp. 475-487
    • Rosenberg, A.1
  • 52
    • 0001941052
    • Recent research in automatic speaker recognition
    • S. Furui and M. M. Sondhi, Eds. New York: Marcel Dekker
    • A. E. Rosenberg and F. K. Soong, "Recent research in automatic speaker recognition," in Advances in Speech Signal Processing, S. Furui and M. M. Sondhi, Eds. New York: Marcel Dekker, 1992, pp. 701-738.
    • (1992) Advances in Speech Signal Processing , pp. 701-738
    • Rosenberg, A.E.1    Soong, F.K.2
  • 55
    • 0017930815
    • Dynamic programming algorithm optimization for spoken word recognition
    • H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-26, no. 1, pp. 43-49, 1978.
    • (1978) IEEE Trans. Acoust., Speech, Signal Processing , vol.ASSP-26 , Issue.1 , pp. 43-49
    • Sakoe, H.1    Chiba, S.2
  • 56
    • 84944832124
    • The application of probability density estimation to text independent speaker identification
    • Paris, France
    • R. Schwartz, S. Roucos, and M. Berouti, "The application of probability density estimation to text independent speaker identification," in Proc. Int. Conf. Acoustics, Speech, and Signal Processing, Paris, France, 1982, pp. 1649-1652.
    • (1982) Proc. Int. Conf. Acoustics, Speech, and Signal Processing , pp. 1649-1652
    • Schwartz, R.1    Roucos, S.2    Berouti, M.3
  • 58
    • 0023314827
    • A vector quantization approach to speaker recognition
    • _, "A vector quantization approach to speaker recognition," AT&T Tech. J., vol. 66, no. 2, pp. 14-26, 1987.
    • (1987) AT&T Tech. J. , vol.66 , Issue.2 , pp. 14-26
  • 59
    • 33646944976
    • Speaker verification
    • M. Jack and J. Laver, Eds. Edinburgh, Scotland: Edinburgh Univ. Press
    • A. Sutherland and M. Jack, "Speaker verification," in Aspects of Speech Technology, M. Jack and J. Laver, Eds. Edinburgh, Scotland: Edinburgh Univ. Press, 1988, pp. 185-215.
    • (1988) Aspects of Speech Technology , pp. 185-215
    • Sutherland, A.1    Jack, M.2
  • 60
    • 0026117640
    • On the application of mixture AR hidden Markov models to text independent speaker recognition
    • N. Z. Tishby, "On the application of mixture AR hidden Markov models to text independent speaker recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. 39, no. 3, pp. 563-570, 1991.
    • (1991) IEEE Trans. Acoust., Speech, Signal Processing , vol.39 , Issue.3 , pp. 563-570
    • Tishby, N.Z.1
  • 61
    • 0344741225
    • Pattern recognition principles
    • R. Kalaba, Ed. Reading, MA: Addison-Wesley
    • J. Tou and R. Gonzalez, "Pattern recognition principles," in Applied Mathematics and Computation, R. Kalaba, Ed. Reading, MA: Addison-Wesley, 1974.
    • (1974) Applied Mathematics and Computation
    • Tou, J.1    Gonzalez, R.2
  • 62
    • 0009061528
    • Some approaches to optimum feature extraction
    • J. Tou, Ed. New York: Academic
    • J. Tou and P. Heydorn, "Some approaches to optimum feature extraction," in Computer and Information Sciences-II, J. Tou, Ed. New York: Academic, pp. 57-89, 1967.
    • (1967) Computer and Information Sciences-II , pp. 57-89
    • Tou, J.1    Heydorn, P.2


* This information was extracted by KISTI through its analysis of Elsevier's SCOPUS database.